The Data Mining Forum
This forum is about data mining, data science and big data: algorithms, source code, datasets, implementations, optimizations, etc. You are welcome to post calls for papers, data mining job ads, links to source code of data mining algorithms, or anything else related to data mining. The forum is hosted by P. Fournier-Viger. No registration is required to use this forum!
Using the clustering algorithm
Posted by: qinqinzhou
Date: December 15, 2021 06:47PM

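Dear Mr. philfv,
I am using the k-means clustering algorithm from the SPMF library. My data consists of 2436 records with 67 dimensions, and I intend to group them into 5 clusters using the Euclidean distance. However, the clustering result is not very good, and the output text file does not show a clear separation between the clusters either. Is this because the data is sparse and has too many dimensions? Do you have any suggestions for clustering this kind of data? Thank you very much!
Parameter settings and visualization results:

Input data:

Output text results: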

Re: Using the clustering algorithm
Posted by: qinqinzhou
Date: December 15, 2021 06:51PM




Re: Using the clustering algorithm
Date: December 23, 2021 09:05PM

Good afternoon,

I am sorry for the delay in answering. I have been very busy.

I think it is likely because the data is very sparse and has so many dimensions.

In the visualization, the results may look strange, but I think this is because we can only visualize two dimensions at a time.

By looking more closely at your data, you may find some other reasons.

Best regards,

Philippe
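
One way to check the two explanations suggested above (sparsity and high dimensionality) is to measure them directly on the input data. The following is a minimal Python sketch, not part of SPMF: the function names `sparsity` and `relative_contrast` and the toy data are hypothetical and chosen only for illustration. It computes the fraction of zero entries and how much the pairwise Euclidean distances differ from each other. When the contrast is close to zero, all points look almost equally far apart, and a distance-based method such as k-means has little to separate.

```python
import math

def sparsity(rows):
    """Return the fraction of entries that are exactly zero."""
    total = sum(len(r) for r in rows)
    zeros = sum(1 for r in rows for v in r if v == 0)
    return zeros / total

def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def relative_contrast(rows):
    """Return (max - min) / min over all non-zero pairwise distances.

    Values near zero mean the points are almost equidistant.
    """
    dists = [euclidean(rows[i], rows[j])
             for i in range(len(rows))
             for j in range(i + 1, len(rows))]
    nonzero = [d for d in dists if d > 0]
    if not nonzero:
        return 0.0
    return (max(nonzero) - min(nonzero)) / min(nonzero)

# Hypothetical toy data: 3 records with 4 dimensions, mostly zeros.
rows = [
    [0.0, 0.0, 1.0, 0.0],
    [0.0, 2.0, 0.0, 0.0],
    [3.0, 0.0, 0.0, 0.0],
]

print("sparsity:", sparsity(rows))
print("relative contrast:", relative_contrast(rows))
```

On a real dataset, the rows would be read from the same file given to the clustering algorithm. Computing all pairs is quadratic in the number of records, which should still be manageable for a few thousand records.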

Re: Using the clustering algorithm
Posted by: 琴琴周
Date: December 24, 2021 11:50PM

Thank you very much! I am trying to reduce the dimensions, fill in the missing values, and read papers. I think I will be able to solve the problem.
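
The two preprocessing steps mentioned above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the poster's actual procedure and not an SPMF feature: missing values are assumed to be represented as `None`, they are filled with the column mean, and the "dimension reduction" shown here is a simple variance filter that drops constant columns (a stand-in for more capable methods such as PCA). The function names and the toy data are hypothetical.

```python
from statistics import mean, pvariance

def fill_missing_with_mean(rows):
    """Replace None entries with the mean of the observed values in that column."""
    n_cols = len(rows[0])
    col_means = []
    for j in range(n_cols):
        observed = [r[j] for r in rows if r[j] is not None]
        # Assumption: a column with no observed value at all is filled with 0.0.
        col_means.append(mean(observed) if observed else 0.0)
    return [[col_means[j] if r[j] is None else r[j] for j in range(n_cols)]
            for r in rows]

def drop_low_variance_columns(rows, threshold=0.0):
    """Keep only the columns whose population variance exceeds the threshold.

    Returns the reduced rows and the indices of the kept columns.
    """
    n_cols = len(rows[0])
    keep = [j for j in range(n_cols)
            if pvariance([r[j] for r in rows]) > threshold]
    return [[r[j] for j in keep] for r in rows], keep

# Hypothetical toy data: None marks a missing value.
rows = [
    [1.0, None, 5.0],
    [3.0, 0.0, 5.0],
    [None, 0.0, 5.0],
]

filled = fill_missing_with_mean(rows)
reduced, kept = drop_low_variance_columns(filled)
print("kept columns:", kept)
print("reduced data:", reduced)
```

Columns that never vary cannot help separate clusters under the Euclidean distance, so removing them is a cheap first step before trying a projection-based method.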



This forum is powered by Phorum and provided by P. Fournier-Viger (© 2012).