Bias-corrected quantile regression forests for high-dimensional data

Nguyen Thanh Tung, Joshua Zhexue Huang, Thuy Thi Nguyen, Imran Khan

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

6 Citations (Scopus)

Abstract

The Quantile Regression Forest (QRF), a nonparametric regression method based on random forests, has been shown to perform well in terms of prediction accuracy, especially for non-Gaussian conditional distributions. However, the method may suffer from two kinds of bias when solving regression problems: bias in the feature selection stage and bias in solving the regression problem. In this paper, we propose a new bias-correction algorithm based on the QRF. To correct the first kind of bias, we propose a new feature sampling scheme that selects good features for growing trees; the first-level QRF is built with this scheme. For the second kind of bias, the residual of the first-level QRF model is used as the response feature to train a second-level QRF model for bias correction. The second-level model is then used to compute bias-corrected predictions. In our experiments, the proposed algorithm dramatically reduces prediction errors and outperforms most existing regression random forest models on both synthetic and well-known real-world data sets.
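
The two-level residual correction described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the base learner `KnnMedian` is a hypothetical stand-in (a nearest-neighbour median, not a quantile regression forest), the paper's feature sampling scheme is not reproduced, and computing training residuals in a leave-one-out fashion is an assumption, since the abstract does not say how residuals are obtained. The helper name `fit_two_level` is likewise invented for illustration.

```python
import random
import statistics


class KnnMedian:
    """Stand-in learner (NOT a quantile regression forest): predicts the
    median response of the k nearest training points."""

    def __init__(self, k=5):
        self.k = k

    def fit(self, X, y):
        self.X, self.y = list(X), list(y)
        return self

    def predict(self, X, exclude_self=False):
        out = []
        for x in X:
            # Rank training points by squared Euclidean distance to x.
            order = sorted(
                range(len(self.X)),
                key=lambda i: sum((a - b) ** 2 for a, b in zip(self.X[i], x)),
            )
            if exclude_self:
                # Drop the closest point (the query itself when predicting
                # on training data), a leave-one-out style estimate.
                order = order[1:]
            out.append(statistics.median(self.y[i] for i in order[: self.k]))
        return out


def fit_two_level(X, y, make_model):
    """Fit a first-level model, then a second-level model on its residuals."""
    level1 = make_model().fit(X, y)
    # Assumption: residuals come from leave-one-out predictions so that the
    # first-level model is not scored on points it has memorised.
    fitted = level1.predict(X, exclude_self=True)
    residuals = [yi - fi for yi, fi in zip(y, fitted)]
    # The residual is the response variable of the second-level model.
    level2 = make_model().fit(X, residuals)

    def predict(X_new):
        base = level1.predict(X_new)
        bias = level2.predict(X_new)
        # Bias-corrected prediction = first-level prediction + estimated residual.
        return [b + r for b, r in zip(base, bias)]

    return level1, level2, predict


# Synthetic one-feature example with a fixed seed for reproducibility.
rng = random.Random(0)
X_train = [(rng.uniform(0, 1),) for _ in range(200)]
y_train = [x[0] ** 2 + rng.gauss(0, 0.05) for x in X_train]
X_test = [(0.0,), (0.5,), (1.0,)]

level1, level2, predict = fit_two_level(X_train, y_train, lambda: KnnMedian(k=15))
corrected = predict(X_test)
```

Because `fit_two_level` only relies on `fit` and `predict`, the stand-in could in principle be swapped for a quantile regression forest wrapper with the same interface; whether the correction reduces error on a given data set has to be checked empirically.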

Original language: English
Title of host publication: Proceedings of 2014 International Conference on Machine Learning and Cybernetics, ICMLC 2014
Publisher: IEEE Computer Society
Pages: 1-6
Number of pages: 6
ISBN (Electronic): 9781479942169
Publication status: Published - Jan 13 2014
Externally published: Yes
Event: 13th International Conference on Machine Learning and Cybernetics, ICMLC 2014 - Lanzhou, China
Duration: Jul 13 2014 to Jul 16 2014

Publication series

Name: Proceedings - International Conference on Machine Learning and Cybernetics
Volume: 1
ISSN (Print): 2160-133X
ISSN (Electronic): 2160-1348

Conference

Conference: 13th International Conference on Machine Learning and Cybernetics, ICMLC 2014
Country/Territory: China
City: Lanzhou
Period: 7/13/14 to 7/16/14

Keywords

  • Bias Correction
  • Data mining
  • High-Dimensional Data
  • Quantile Regression Forests
  • Random Forests

ASJC Scopus subject areas

  • Artificial Intelligence
  • Computational Theory and Mathematics
  • Computer Networks and Communications
  • Human-Computer Interaction
