Incremental density-based ensemble clustering over evolving data streams

Imran Khan*, Joshua Z. Huang, Kamen Ivanov

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

57 Citations (Scopus)

Abstract

Recent advances in smart meter technology have made it possible to collect information about customer power consumption in real time. The measurements are generated continuously, and in some cases, e.g. in industrial smart metering, the data exchange rates fluctuate strongly. Storing, querying, and mining such smart meter streaming data, which contain a large number of missing and sparse values, are computationally challenging tasks. To address these challenges, we propose a new method called incremental density-based ensemble clustering (IDEStream) for the incremental segmentation of various kinds of factories based on their electricity consumption data. It exploits a gamma mixture model to suppress the influence of sparse data units in the data streams that arrive sequentially within a time window, and then generates a clustering from the processed data of that window. IDEStream uses a unique incremental ensemble approach to aggregate the clusterings of subsequent time windows. Experimental results on data streams collected by smart meters from manufacturing factories in Guangdong province, China, show that the proposed algorithm outperforms several state-of-the-art data stream clustering algorithms. The resulting segmentation has numerous applications, for example defining customer rates flexibly.
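The window-by-window flow described in the abstract can be illustrated with a small sketch. It is not the IDEStream algorithm: a simple non-zero-fraction filter stands in for the gamma mixture model, binning by mean consumption stands in for the per-window clustering, and a generic co-association (evidence accumulation) matrix stands in for the paper's ensemble step. All names, thresholds, and sample readings below are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def drop_sparse(window, min_nonzero_fraction=0.5):
    """Keep units whose share of non-zero readings reaches the threshold.

    Assumption: a plain filter standing in for the gamma mixture model step.
    """
    kept = {}
    for unit, readings in window.items():
        nonzero = sum(1 for r in readings if r)  # 0 and None count as missing
        if readings and nonzero / len(readings) >= min_nonzero_fraction:
            kept[unit] = readings
    return kept


def cluster_window(window, cut_points=(10.0, 100.0)):
    """Toy per-window clustering: bin each unit by its mean non-missing reading."""
    labels = {}
    for unit, readings in window.items():
        values = [r for r in readings if r]
        mean = sum(values) / len(values)
        labels[unit] = sum(mean >= c for c in cut_points)
    return labels


class CoAssociationEnsemble:
    """Accumulates, window by window, how often two units share a cluster."""

    def __init__(self):
        self.units = set()
        self.together = defaultdict(int)  # pair -> windows where both shared a cluster
        self.seen = defaultdict(int)      # pair -> windows where both were clustered

    def update(self, labels):
        self.units.update(labels)
        for a, b in combinations(sorted(labels), 2):
            self.seen[(a, b)] += 1
            if labels[a] == labels[b]:
                self.together[(a, b)] += 1

    def consensus(self, threshold=0.5):
        """Merge units whose co-association ratio reaches the threshold (union-find)."""
        parent = {u: u for u in self.units}

        def find(x):
            while parent[x] != x:
                parent[x] = parent[parent[x]]  # path compression
                x = parent[x]
            return x

        for (a, b), n in self.seen.items():
            if self.together[(a, b)] / n >= threshold:
                parent[find(a)] = find(b)
        groups = defaultdict(set)
        for u in self.units:
            groups[find(u)].add(u)
        return sorted(sorted(g) for g in groups.values())


# Hypothetical readings for four factories over three time windows.
windows = [
    {"f1": [5, 6, 0, 7], "f2": [4, 0, 5, 6], "f3": [200, 210, 190, 0], "f4": [0, 0, 0, 3]},
    {"f1": [6, 6, 5, 7], "f2": [12, 11, 13, 0], "f3": [220, 0, 230, 210], "f4": [0, 0, 0, 0]},
    {"f1": [5, 0, 6, 6], "f2": [5, 6, 0, 4], "f3": [205, 215, 0, 225], "f4": [0, 2, 0, 0]},
]

ensemble = CoAssociationEnsemble()
for window in windows:
    ensemble.update(cluster_window(drop_sparse(window)))  # one clustering per window

result = ensemble.consensus()
print(result)  # [['f1', 'f2'], ['f3']]
```

In this toy run the mostly empty unit `f4` is filtered out of every window, and `f1` and `f2` end up in the same segment because they share a cluster in two of the three windows. Only the pair counts are kept between windows, so earlier readings do not need to be stored.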

Original language: English
Pages (from-to): 34-43
Number of pages: 10
Journal: Neurocomputing
Volume: 191
Digital Object Identifiers (DOIs)
Publication status: Published - 26 May 2016
Externally published: Yes

ASJC Scopus subject areas

  • Computer Science Applications
  • Cognitive Neuroscience
  • Artificial Intelligence
