Data envelopment analysis and big data

Dariush Khezrimotlagh, Joe Zhu*, Wade D. Cook, Mehdi Toloo

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

60 Citations (Scopus)


In the traditional data envelopment analysis (DEA) approach for a set of n Decision Making Units (DMUs), a standard DEA model is solved n times, once for each DMU. As the number of DMUs increases, the running time required to solve the standard model rises sharply. In this study, a new framework is proposed that significantly decreases the required DEA calculation time, in comparison with existing methodologies, when a large set of DMUs (e.g., 20,000 DMUs or more) is present. The framework includes five steps: (i) selecting a subsample of DMUs using a proposed algorithm, (ii) finding the best-practice DMUs in the selected subsample, (iii) finding the DMUs exterior to the hull of the selected subsample, (iv) identifying the set of all efficient DMUs, and (v) measuring the performance scores of all DMUs so that they match those arising from the traditional DEA approach. The variable returns to scale technology is assumed, and several simulation experiments are designed to estimate the running time of the proposed method for big data. The results show that the running time decreases by up to 99.9% in comparison with the existing techniques. In addition, we illustrate the computation time required by the proposed method as a function of the number of DMUs (cardinality), the number of inputs and outputs (dimension), and the proportion of efficient DMUs (density). The methods are also compared on a real data set consisting of 30,099 electric power plants in the United States from 1996 to 2016.
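The five steps above can be sketched in code. The following is a minimal illustration under several assumptions, not the authors' implementation: the abstract does not state the model orientation or the subsample-selection algorithm, so the sketch assumes an input-oriented variable returns to scale (BCC) envelopment model and substitutes plain random sampling for step (i). The names `vrs_score`, `efficient_subset`, and `big_data_dea` are hypothetical, and the linear programs are solved with `scipy.optimize.linprog`.

```python
import numpy as np
from scipy.optimize import linprog


def vrs_score(x_o, y_o, X_ref, Y_ref):
    """Input-oriented VRS score of one DMU against a reference set.

    Assumed model: min theta s.t. sum_j lam_j x_j <= theta * x_o,
    sum_j lam_j y_j >= y_o, sum_j lam_j = 1, lam >= 0.
    Returns np.inf when the LP is infeasible (the DMU lies outside the hull).
    """
    k = X_ref.shape[0]
    c = np.r_[1.0, np.zeros(k)]                      # variables: [theta, lam_1..lam_k]
    A_in = np.c_[-x_o, X_ref.T]                      # input constraints
    A_out = np.c_[np.zeros(len(y_o)), -Y_ref.T]      # output constraints
    A_ub = np.r_[A_in, A_out]
    b_ub = np.r_[np.zeros(len(x_o)), -y_o]
    A_eq = np.r_[0.0, np.ones(k)].reshape(1, -1)     # convexity (VRS) constraint
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=[1.0],
                  bounds=[(0, None)] * (k + 1), method="highs")
    return res.fun if res.status == 0 else np.inf


def efficient_subset(X, Y, idx, tol=1e-6):
    """Indices in idx whose score is 1 (radially efficient) relative to idx."""
    return [j for j in idx
            if vrs_score(X[j], Y[j], X[idx], Y[idx]) >= 1 - tol]


def big_data_dea(X, Y, sample_size, seed=0, tol=1e-6):
    n = X.shape[0]
    rng = np.random.default_rng(seed)
    # (i) subsample; random sampling is an assumption standing in for
    #     the selection algorithm proposed in the paper
    sample = rng.choice(n, size=min(sample_size, n), replace=False)
    # (ii) best-practice DMUs within the subsample
    best = efficient_subset(X, Y, sample)
    # (iii) DMUs outside the subsample that are exterior to its hull
    in_sample = set(sample.tolist())
    exterior = [j for j in range(n) if j not in in_sample
                and vrs_score(X[j], Y[j], X[best], Y[best]) > 1 + tol]
    # (iv) efficient DMUs among best-practice and exterior candidates
    efficient = efficient_subset(X, Y, best + exterior)
    # (v) score every DMU against the efficient set only
    scores = np.array([vrs_score(X[j], Y[j], X[efficient], Y[efficient])
                       for j in range(n)])
    return scores, efficient
```

With `X` an n-by-m array of inputs and `Y` an n-by-s array of outputs, `big_data_dea(X, Y, sample_size=...)` returns one score per DMU together with the indices of the efficient DMUs it found. The saving comes from steps (iii) and (v), where each linear program involves only the best-practice or efficient DMUs rather than all n. This sketch treats any DMU with a radial score of 1 as efficient and does not screen out weakly efficient units.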

Original language: English
Pages (from-to): 1047-1054
Number of pages: 8
Journal: European Journal of Operational Research
Issue number: 3
Publication status: Published - May 1 2019
Externally published: Yes


  • Big data
  • Data envelopment analysis (DEA)
  • Performance evaluation
  • Simulation

ASJC Scopus subject areas

  • Computer Science(all)
  • Modelling and Simulation
  • Management Science and Operations Research
  • Information Systems and Management
