Nonnegative matrix factorization based consensus for clusterings with a variable number of clusters

Imran Khan, Zongwei Luo*

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

9 Citations (Scopus)

Abstract

Consensus clustering aggregates a set of base clusterings into a single ensemble clustering that is better than the individual base clusterings. It is beneficial for determining clusters in heterogeneous data. This paper presents a new approach that generates a set of good-quality base clusterings and aggregates them into a single clustering solution. The approach consists of two phases. In the first phase, we present a new tree-based $k$-means algorithm to build different base clusterings. The algorithm builds a cluster-tree, which gives one base clustering. The tree generation process uses two stopping criteria that are based on the underlying data distribution of the data set. By changing the value of the input parameter of the tree generation algorithm, we produce multiple cluster-trees, each of which gives a base clustering with a variable number of clusters. In the second phase, we propose a new nonnegative matrix factorization-based consensus method that combines the base clusterings into a final clustering. We also investigate the quality and diversity of the base clusterings, which often have a large influence on the performance of consensus clustering. Experimental results on various real-world and synthetic data sets demonstrate that the proposed algorithm outperforms well-known algorithms in terms of clustering accuracy.
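
The abstract does not spell out the factorization, so the following is only an illustrative sketch of the general idea behind the second phase, not the authors' method. It assumes a co-association matrix built from base clusterings with different numbers of clusters, and swaps in a generic symmetric NMF with damped multiplicative updates as the consensus step. The function names `co_association` and `symnmf_consensus`, the restart strategy, and the toy labelings are all hypothetical; the tree-based $k$-means phase is not reproduced.

```python
import numpy as np

def co_association(labelings):
    """Fraction of base clusterings that put each pair of points together."""
    labelings = np.asarray(labelings)
    # (m, n, 1) == (m, 1, n) broadcasts to an (m, n, n) boolean array
    same = labelings[:, :, None] == labelings[:, None, :]
    return same.mean(axis=0)

def symnmf_consensus(A, k, n_iter=500, n_restarts=5, seed=0):
    """Approximate A by H @ H.T with H >= 0 (assumed stand-in for the
    paper's consensus step) and label each point by its largest entry."""
    rng = np.random.default_rng(seed)
    best_H, best_err = None, np.inf
    for _restart in range(n_restarts):
        H = rng.uniform(0.1, 1.0, size=(A.shape[0], k))
        for _step in range(n_iter):
            numer = A @ H
            denom = H @ (H.T @ H) + 1e-12  # guard against division by zero
            H *= 0.5 + 0.5 * numer / denom  # damped multiplicative update
        err = np.linalg.norm(A - H @ H.T)
        if err < best_err:  # keep the restart with the lowest error
            best_H, best_err = H, err
    return best_H.argmax(axis=1), best_H

# Toy base clusterings of six points with 2, 4, and 4 clusters (made up).
base = [
    [0, 0, 0, 1, 1, 1],
    [0, 0, 1, 2, 2, 3],
    [0, 1, 1, 2, 3, 3],
]
A = co_association(base)
labels, H = symnmf_consensus(A, k=2)
print(labels)
```

Because the co-association matrix depends only on which points share a cluster, the base clusterings may use different label sets and different numbers of clusters without any label alignment step.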

Original language: English
Article number: 8542938
Pages (from-to): 73158-73169
Number of pages: 12
Journal: IEEE Access
Volume: 6
DOIs
Publication status: Published - 2018
Externally published: Yes

ASJC Scopus subject areas

  • General Computer Science
  • General Materials Science
  • General Engineering
