Ensemble clustering of high dimensional data with fastmap projection

Imran Khan*, Joshua Zhexue Huang, Nguyen Thanh Tung, Graham Williams

*Corresponding author for this work

Research output: Conference contribution

11 Citations (Scopus)

Abstract

In this paper, we propose an ensemble clustering method for high dimensional data which uses FastMap projection to generate subspace component data sets. In comparison with popular random sampling and random projection, FastMap projection preserves the clustering structure of the original data in the component data sets so that the performance of ensemble clustering is improved significantly. We present two methods to measure preservation of clustering structure of generated component data sets. The comparison results have shown that FastMap preserved the clustering structure better than random sampling and random projection. Experiments on three real data sets were conducted with three data generation methods and three consensus functions. The results have shown that the ensemble clustering with FastMap projection outperformed the ensemble clusterings with random sampling and random projection.
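The projection step named in the abstract can be illustrated with a minimal sketch of the general FastMap technique for Euclidean data. This is not the authors' implementation: the function name `fastmap`, its parameters, and the pivot-selection heuristic are assumptions made for illustration, and the abstract does not say how the component data sets are built from the projection.

```python
import numpy as np

def fastmap(X, k, rng=None):
    """Project the rows of X onto k FastMap coordinates (illustrative sketch)."""
    rng = np.random.default_rng(rng)
    n = X.shape[0]
    Y = np.zeros((n, k))

    def sq_dist_to(p, col):
        # Residual squared distance from every object to object p,
        # after removing the coordinates already computed.
        d2 = np.sum((X - X[p]) ** 2, axis=1)
        d2 -= np.sum((Y[:, :col] - Y[p, :col]) ** 2, axis=1)
        return np.maximum(d2, 0.0)  # guard against tiny negative round-off

    for col in range(k):
        # Pivot heuristic (an assumption): start from a random object,
        # take the farthest object a, then the object b farthest from a.
        b = int(rng.integers(n))
        a = int(np.argmax(sq_dist_to(b, col)))
        b = int(np.argmax(sq_dist_to(a, col)))
        d2_a = sq_dist_to(a, col)
        d2_b = sq_dist_to(b, col)
        d2_ab = d2_a[b]
        if d2_ab <= 1e-12:
            break  # no residual distance left; remaining columns stay zero
        # Cosine-law projection of every object onto the line through a and b.
        Y[:, col] = (d2_a + d2_ab - d2_b) / (2.0 * np.sqrt(d2_ab))
    return Y
```

Because the first pivot is chosen at random, calling `fastmap` with different seeds can give different low-dimensional views of the same data. Using such views as ensemble components is an assumption here, not a detail stated in the abstract.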

Original language: English
Title of host publication: Trends and Applications in Knowledge Discovery and Data Mining - PAKDD 2014 International Workshops
Subtitle of host publication: DANTH, BDM, MobiSocial, BigEC, CloudSD, MSMV-MBI, SDA, DMDA-Health, ALSIP, SocNet, DMBIH, BigPMA, Revised Selected Papers
Editors: Wen-Chih Peng, Haixun Wang, Zhi-Hua Zhou, Tu Bao Ho, Vincent S. Tseng, Arbee L.P. Chen, James Bailey
Publisher: Springer Verlag
Pages: 483-493
Number of pages: 11
ISBN (Electronic): 9783319131856
DOIs:
Publication status: Published - 2014
Externally published: Yes
Event: International Workshops on Data Mining and Decision Analytics for Public Health, Biologically Inspired Data Mining Techniques, Mobile Data Management, Mining, and Computing on Social Networks, Big Data Science and Engineering on E-Commerce, Cloud Service Discovery, MSMV-MBI, Scalable Data Analytics, Data Mining and Decision Analytics for Public Health and Wellness, Algorithms for Large-Scale Information Processing in Knowledge Discovery, Data Mining in Social Networks, Data Mining in Biomedical Informatics and Healthcare, Pattern Mining and Application of Big Data in conjunction with 18th Pacific-Asia Conference on Knowledge Discovery and Data Mining, PAKDD 2014 - Tainan, Taiwan, Province of China
Duration: May 13, 2014 to May 16, 2014

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 8643
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349


ASJC Scopus subject areas

  • Theoretical Computer Science
  • General Computer Science
