Comparing different strategies for variable selection in large dimensions

Abstract

In this talk I will present joint work with Anis Ben Ishak. In statistical learning, we use data to model the relation between an output variable and a set of explanatory variables. In classification, the output variable is discrete with two or more levels (binary or multiclass). As usual, we wish to learn the model generating the data, under the constraint that the sample size is small relative to the number of explanatory variables. We will show how to select the most important variables within the learning task, and suggest some extensions using multiclass support vector machines.