Invited speaker: George Forman "Feature Selection: Reframing the Problem"
FSDM'08 Keynote
George Forman
Hewlett-Packard Labs,
Palo Alto, CA, USA
Feature selection is often thought of as a means to improve accuracy or reduce
computing demands for machine learning or content analysis. It is usually
characterized as removing irrelevant or redundant features, or identifying the
best features for some purpose. As dataset sizes increase and the potential
feature space grows to infinity with rich data types, so grows the importance of
feature selection.
For all the great research papers on feature selection that demonstrate
consistent and substantial improvements on benchmark datasets, one might think
this sub-field in data mining is mature. Not so. Feature selection in practice
still involves much trial-and-error and is prone to failures. In my talk I'll reveal
samples of some challenging industrial datasets and problem definitions, and
autopsy an example failure in feature selection. Based on these experiences, I
will deliver some do's and don'ts to address frequent shortcomings in our
research field, and set out some valuable areas for future work, such as feature
selection knowledge transfer (aka meta-learning).
Bio: George Forman is a senior research scientist at
Hewlett-Packard Labs. His research interests stem from practical issues that
arise in the application of machine learning to industrial problems, e.g.
feature selection, robustness, small training sets, and novel problem
formulations. His Ph.D. in Computer Science & Engineering is from the
University of Washington, Seattle, 1996.
Final program (September 15th, 2008)
| 08.55h | Opening (Organizing committee) |
| 09.00h | John Lee and Michel Verleysen, "Quality assessment of nonlinear dimensionality reduction based on K-ary neighborhoods" |
| 09.30h | Taiji Suzuki, Masashi Sugiyama, Jun Sese and Takafumi Kanamori, "A Least-squares Approach to Mutual Information Estimation with Application in Variable Selection" |
| 10.00h | Andreas Janecek and Wilfried Gansterer, "A Comparison of Classification Accuracy Achieved with Wrappers, Filters and PCA" |
| 10.20h | Coffee Break |
| 10.40h | Peter Antal, Andras Millinghoffer, Gábor Hullám, Csaba Szalai and András Falus, "A Bayesian View of Challenges in Feature Selection: Feature Aggregation, Multiple Targets, Redundancy and Interaction" |
| 11.10h | Vân Anh Huynh-Thu, Louis Wehenkel and Pierre Geurts, "Exploiting tree-based variable importances to selectively identify relevant variables" |
| 11.40h | Zheng Zhao and Huan Liu, "Multi-Source Feature Selection via Geometry-Dependent Covariance Analysis" |
| 12.10h | Roberto Ruiz, José C. Riquelme and Jesús S. Aguilar-Ruiz, "Best Agglomerative Ranked Subset for Feature Selection" |
| 12.30h | Lunch Break |
| 13.45h | Invited Talk: George Forman (Hewlett-Packard Labs, Palo Alto, CA, USA), "Feature Selection: Reframing the Problem" |
| 14.45h | Svetlana Kiritchenko and Mikhail Jiline, "Keyword Optimization in Sponsored Search via Feature Selection" |
| 15.05h | Victor Eruhimov, Vladimir Martyanov and Aleksey Polovinkin, "Transferring Knowledge by Prior Feature Sampling" |
| 15.25h | Coffee Break |
| 15.40h | Marine Campedel, Ivan Kyrgyzov and Henri Maitre, "Unsupervised feature selection applied to SPOT5 satellite images indexing" |
| 16.10h | Andreas Köhler, Matthias Ohrnberger, Carsten Riggelsen and Frank Scherbaum, "Unsupervised Feature Selection for Pattern Discovery in Seismic Wavefields" |
| 16.30h | Sébastien Guérif, "Unsupervised Variable Selection: when random rankings sound as irrelevancy" |
| 16.50h | Workshop closing |