| Summary: | A method is introduced for variable selection and prediction in linear regression problems where the number of predictors can be much larger than the number of observations. The methodology involves minimising a penalised Euclidean distance, where the penalty is the geometric mean of the $\ell_1$ and $\ell_2$ norms of the regression coefficients. This formulation exhibits a grouping effect, which is useful for model selection in high-dimensional problems. A key theoretical result is a model consistency theorem that does not require an estimate of the noise standard deviation. An algorithm for estimation is described, which involves thresholding to obtain a sparse solution. The practical performance of variable selection and prediction is evaluated through simulation studies and the analysis of real datasets.
|
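
The criterion described in the summary can be illustrated with a minimal sketch. It assumes the objective has the form $\|y - X\beta\|_2 + \lambda\sqrt{\|\beta\|_1 \|\beta\|_2}$, that is, an unsquared Euclidean distance plus the geometric mean of the two norms; the summary does not state the exact formula, so this form, the hard-thresholding rule, and the names `ped_objective`, `hard_threshold`, `lam` and `delta` are all assumptions made for illustration.

```python
import numpy as np


def ped_objective(beta, X, y, lam):
    """Hypothetical helper: penalised Euclidean distance objective.

    Assumes the form ||y - X beta||_2 + lam * sqrt(||beta||_1 * ||beta||_2);
    the summary does not give the exact formula.
    """
    beta = np.asarray(beta, dtype=float)
    residual = np.asarray(y, dtype=float) - np.asarray(X, dtype=float) @ beta
    distance = np.linalg.norm(residual)   # unsquared Euclidean distance
    l1 = np.sum(np.abs(beta))             # l1 norm of the coefficients
    l2 = np.linalg.norm(beta)             # l2 norm of the coefficients
    return distance + lam * np.sqrt(l1 * l2)


def hard_threshold(beta, delta):
    """Hypothetical helper: zero out coefficients with |beta_j| <= delta."""
    beta = np.asarray(beta, dtype=float)
    return np.where(np.abs(beta) > delta, beta, 0.0)
```

With these helpers, a candidate coefficient vector can be scored by `ped_objective` and then passed through `hard_threshold` to obtain a sparse solution; the actual estimation algorithm and the choice of threshold are not specified in the summary.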