This is my example data:
maybe add a higher-level description
min of spare daemons
data in the appropriate order
the compiled max daemons
an iovec to store the trailer sent after the file
data in the wrong order
an iovec to store the headers sent before the file
return err maybe add a higher-level desc
if a user manually creates a data file
I want to apply a clustering method that automatically puts these sentences into categories based on the same word appearing in the sentence, so what I am trying to achieve is like this:
add
maybe add a higher-level description
return err maybe add a higher-level desc
daemons
min of spare daemons
the compiled max daemons
iovec
an iovec to store the headers sent before the file
an iovec to store the trailer sent after the file
data
data in the wrong order
data in the appropriate order
if a user manually creates a data file
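Can somebody give me some help? Thanks a lot!
Best answer
It sounds as if you want to find the most common words?
That isn't hard to do (and it isn't "clustering" either, just counting and grouping by frequently occurring words). What have you tried, and where are you stuck?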
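For illustration, the counting-and-grouping idea could be sketched roughly like this in Python. This is only a minimal sketch, not a definitive solution: the function name `group_by_shared_words`, the tiny `STOP_WORDS` set, and the `min_sentences` threshold are all assumptions made up for this example.

```python
import re
from collections import defaultdict

# Hypothetical stop-word list for this example only; a real one would be longer.
STOP_WORDS = {"a", "an", "the", "of", "in", "to", "if", "after", "before"}


def group_by_shared_words(sentences, min_sentences=2):
    """Map each non-stop word to the sentences that contain it.

    Only words found in at least `min_sentences` different sentences are kept.
    """
    groups = defaultdict(list)
    for sentence in sentences:
        # Use a set so a word repeated inside one sentence is counted once.
        words = set(re.findall(r"[a-z]+", sentence.lower()))
        for word in words - STOP_WORDS:
            groups[word].append(sentence)
    return {w: s for w, s in groups.items() if len(s) >= min_sentences}


sentences = [
    "maybe add a higher-level description",
    "min of spare daemons",
    "data in the appropriate order",
    "the compiled max daemons",
    "an iovec to store the trailer sent after the file",
    "data in the wrong order",
    "an iovec to store the headers sent before the file",
    "return err maybe add a higher-level desc",
    "if a user manually creates a data file",
]

groups = group_by_shared_words(sentences)

# Print the most widely shared words first, ties broken alphabetically.
for word in sorted(groups, key=lambda w: (-len(groups[w]), w)):
    print(word)
    for s in groups[word]:
        print("    " + s)
```

Note that a sentence lands under every shared word it contains, so besides add, daemons, iovec and data you will also see groups such as file and order. Getting exactly one label per sentence, as in the desired output, needs an extra rule for choosing among the candidate words, which this sketch does not attempt.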
Regarding machine-learning - How to cluster sentences based on the same words?, we found a similar question on Stack Overflow: https://stackoverflow.com/questions/20308149/