Workshop Program

Invited speaker: George Forman, "Feature Selection: Reframing the Problem"

FSDM'08 Keynote
George Forman
Hewlett-Packard Labs, Palo Alto, CA, USA    

Feature selection is often thought of as a means to improve accuracy or reduce computing demands for machine learning or content analysis.  It is usually characterized as removing irrelevant or redundant features, or identifying the best features for some purpose.  As dataset sizes increase and the potential feature space grows to infinity with rich data types, so grows the importance of feature selection.

For all the great research papers on feature selection that demonstrate consistent and substantial improvements on benchmark datasets, one might think this sub-field in data mining is mature.  Not so.  Feature selection in practice still involves much trial-and-error and is prone to failures.  In my talk I'll reveal samples of some challenging industrial datasets and problem definitions, and autopsy an example failure in feature selection.  Based on these experiences, I will deliver some do's and don'ts to address frequent shortcomings in our research field, and set out some valuable areas for future work, such as feature selection knowledge transfer (aka meta-learning).

Bio: George Forman is a senior research scientist at Hewlett-Packard Labs.  His research interests stem from practical issues that arise in the application of machine learning to industrial problems, e.g. feature selection, robustness, small training sets, and novel problem formulations.  He received his Ph.D. in Computer Science & Engineering from the University of Washington, Seattle, in 1996.

Final program (September 15th 2008)

08.55h Opening (Organizing committee)
09.00h John Lee and Michel Verleysen
Quality assessment of nonlinear dimensionality reduction based on K-ary neighborhoods
09.30h Taiji Suzuki, Masashi Sugiyama, Jun Sese and Takafumi Kanamori
A Least-squares Approach to Mutual Information Estimation with Application in Variable Selection
10.00h Andreas Janecek and Wilfried Gansterer
A Comparison of Classification Accuracy Achieved with Wrappers, Filters and PCA
10.20h Coffee Break
10.40h Peter Antal, Andras Millinghoffer, Gábor Hullám, Csaba Szalai and András Falus
A Bayesian View of Challenges in Feature Selection: Feature Aggregation, Multiple Targets, Redundancy and Interaction
11.10h Vân Anh Huynh-Thu, Louis Wehenkel and Pierre Geurts
Exploiting tree-based variable importances to selectively identify relevant variables
11.40h Zheng Zhao and Huan Liu
Multi-Source Feature Selection via Geometry-Dependent Covariance Analysis
12.10h Roberto Ruiz, José C. Riquelme and Jesús S. Aguilar-Ruiz
Best Agglomerative Ranked Subset for Feature Selection
12.30h Lunch Break
13.45h Invited Talk: George Forman (Hewlett-Packard Labs, Palo Alto, CA, USA)
Feature Selection: Reframing the Problem
14.45h Svetlana Kiritchenko and Mikhail Jiline
Keyword Optimization in Sponsored Search via Feature Selection
15.05h Victor Eruhimov, Vladimir Martyanov and Aleksey Polovinkin
Transferring Knowledge by Prior Feature Sampling
15.25h Coffee Break
15.40h Marine Campedel, Ivan Kyrgyzov and Henri Maitre
Unsupervised feature selection applied to SPOT5 satellite images indexing
16.10h Andreas Köhler, Matthias Ohrnberger, Carsten Riggelsen and Frank Scherbaum
Unsupervised Feature Selection for Pattern Discovery in Seismic Wavefields
16.30h Sébastien Guérif
Unsupervised Variable Selection: when random rankings sound as irrelevancy
16.50h Workshop closing