TY - GEN
T1 - Bias-corrected quantile regression forests for high-dimensional data
AU - Tung, Nguyen Thanh
AU - Huang, Joshua Zhexue
AU - Nguyen, Thuy Thi
AU - Khan, Imran
N1 - Publisher Copyright:
© 2014 IEEE.
PY - 2014/1/13
Y1 - 2014/1/13
N2 - The Quantile Regression Forest (QRF), a nonparametric regression method based on random forests, has been shown to perform well in terms of prediction accuracy, especially for non-Gaussian conditional distributions. However, the method may suffer from two kinds of bias when solving regression problems: bias in the feature selection stage and bias in solving the regression problem. In this paper, we propose a new bias-correction algorithm based on the QRF. To correct the first kind of bias, we propose a new feature sampling scheme that selects good features for growing trees. The first-level QRF is built on this scheme. For the second kind of bias, the residual term of the first-level QRF model is used as the response feature to train the second-level QRF model for bias correction. The second-level model is then used to compute bias-corrected predictions. In our experiments, the proposed algorithm dramatically reduces prediction errors and outperforms most existing regression random forest models on both synthetic and well-known real-world data sets.
AB - The Quantile Regression Forest (QRF), a nonparametric regression method based on random forests, has been shown to perform well in terms of prediction accuracy, especially for non-Gaussian conditional distributions. However, the method may suffer from two kinds of bias when solving regression problems: bias in the feature selection stage and bias in solving the regression problem. In this paper, we propose a new bias-correction algorithm based on the QRF. To correct the first kind of bias, we propose a new feature sampling scheme that selects good features for growing trees. The first-level QRF is built on this scheme. For the second kind of bias, the residual term of the first-level QRF model is used as the response feature to train the second-level QRF model for bias correction. The second-level model is then used to compute bias-corrected predictions. In our experiments, the proposed algorithm dramatically reduces prediction errors and outperforms most existing regression random forest models on both synthetic and well-known real-world data sets.
KW - Bias Correction
KW - Data mining
KW - High-Dimensional Data
KW - Quantile Regression Forests
KW - Random Forests
UR - http://www.scopus.com/inward/record.url?scp=84921459050&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84921459050&partnerID=8YFLogxK
U2 - 10.1109/ICMLC.2014.7009082
DO - 10.1109/ICMLC.2014.7009082
M3 - Conference contribution
AN - SCOPUS:84921459050
T3 - Proceedings - International Conference on Machine Learning and Cybernetics
SP - 1
EP - 6
BT - Proceedings of 2014 International Conference on Machine Learning and Cybernetics, ICMLC 2014
PB - IEEE Computer Society
T2 - 13th International Conference on Machine Learning and Cybernetics, ICMLC 2014
Y2 - 13 July 2014 through 16 July 2014
ER -