TY - JOUR
T1 - Data envelopment analysis and big data
AU - Khezrimotlagh, Dariush
AU - Zhu, Joe
AU - Cook, Wade D.
AU - Toloo, Mehdi
N1 - Funding Information:
We appreciate the valuable comments from three anonymous reviewers as they aided in significantly improving the clarity of our paper. This research is supported by Pennsylvania State University (No. 0206020 AC6ET), National Natural Science Funds of China (No. 71828101) and the Czech Science Foundation (GAČR 16-17810S). All support is greatly acknowledged.
Publisher Copyright:
© 2018 Elsevier B.V.
PY - 2019/5/1
Y1 - 2019/5/1
N2 - In the traditional data envelopment analysis (DEA) approach for a set of n Decision Making Units (DMUs), a standard DEA model is solved n times, one for each DMU. As the number of DMUs increases, the running-time to solve the standard model sharply rises. In this study, a new framework is proposed to significantly decrease the required DEA calculation time in comparison with the existing methodologies when a large set of DMUs (e.g., 20,000 DMUs or more) is present. The framework includes five steps: (i) selecting a subsample of DMUs using a proposed algorithm, (ii) finding the best-practice DMUs in the selected subsample, (iii) finding the exterior DMUs to the hull of the selected subsample, (iv) identifying the set of all efficient DMUs, and (v) measuring the performance scores of DMUs as those arising from the traditional DEA approach. The variable returns to scale technology is assumed and several simulation experiments are designed to estimate the running-time for applying the proposed method for big data. The obtained results in this study point out that the running-time is decreased up to 99.9% in comparison with the existing techniques. In addition, we illustrate the essential computation time for applying the proposed method as a function of the number of DMUs (cardinality), number of inputs and outputs (dimension), and the proportion of efficient DMUs (density). The methods are also compared on a real data set consisting of 30,099 electric power plants in the United States from 1996 to 2016.
AB - In the traditional data envelopment analysis (DEA) approach for a set of n Decision Making Units (DMUs), a standard DEA model is solved n times, one for each DMU. As the number of DMUs increases, the running-time to solve the standard model sharply rises. In this study, a new framework is proposed to significantly decrease the required DEA calculation time in comparison with the existing methodologies when a large set of DMUs (e.g., 20,000 DMUs or more) is present. The framework includes five steps: (i) selecting a subsample of DMUs using a proposed algorithm, (ii) finding the best-practice DMUs in the selected subsample, (iii) finding the exterior DMUs to the hull of the selected subsample, (iv) identifying the set of all efficient DMUs, and (v) measuring the performance scores of DMUs as those arising from the traditional DEA approach. The variable returns to scale technology is assumed and several simulation experiments are designed to estimate the running-time for applying the proposed method for big data. The obtained results in this study point out that the running-time is decreased up to 99.9% in comparison with the existing techniques. In addition, we illustrate the essential computation time for applying the proposed method as a function of the number of DMUs (cardinality), number of inputs and outputs (dimension), and the proportion of efficient DMUs (density). The methods are also compared on a real data set consisting of 30,099 electric power plants in the United States from 1996 to 2016.
KW - Big data
KW - Data envelopment analysis (DEA)
KW - Performance evaluation
KW - Simulation
UR - http://www.scopus.com/inward/record.url?scp=85056389811&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85056389811&partnerID=8YFLogxK
U2 - 10.1016/j.ejor.2018.10.044
DO - 10.1016/j.ejor.2018.10.044
M3 - Article
AN - SCOPUS:85056389811
SN - 0377-2217
VL - 274
SP - 1047
EP - 1054
JO - European Journal of Operational Research
JF - European Journal of Operational Research
IS - 3
ER -