I'm training an unsupervised learning model that needs to cluster datapoints. For validation purposes, I have the average of each class's datapoints, and I need each of these averages to be assigned to a different cluster.
Let's say I have 4 classes, with class averages A, B, C, D and centroids 1, 2, 3, 4. I want the assignment to look like this:
A -> 3
B -> 2
C -> 1
D -> 4
In a situation where two averages land on the same centroid, like this:
A -> 3
B -> 2
C -> 1
D -> 1
I'd like to be able to retrain the model while keeping centroids 2 and 3 as they are, since they don't need correction.
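To make the check concrete, here is a minimal sketch of how the collision above could be detected. The helper `find_collisions`, the toy data, and the variable names are all made up for illustration, and the indices are zero-based (so centroid 1 in my example is index 0 here).

```python
import numpy as np
from sklearn.cluster import KMeans

def find_collisions(kmeans, class_means):
    """Assign each class mean to its nearest centroid and report conflicts.

    Hypothetical helper: returns the assignment, the centroid indices that
    received more than one class mean, and the ones that received none.
    """
    assignment = kmeans.predict(class_means)
    counts = np.bincount(assignment, minlength=kmeans.n_clusters)
    collided = np.flatnonzero(counts > 1)
    unused = np.flatnonzero(counts == 0)
    return assignment, collided, unused

# Toy data (an assumption for the demo): four well-separated 2-D blobs.
rng = np.random.default_rng(0)
centers = np.array([[0.0, 0.0], [10.0, 0.0], [0.0, 10.0], [10.0, 10.0]])
X = np.vstack([c + 0.1 * rng.standard_normal((20, 2)) for c in centers])

# Explicit init so the centroid order is predictable in this demo.
kmeans = KMeans(n_clusters=4, init=centers, n_init=1).fit(X)

# Class means A, B, C, D; D deliberately lands on the same centroid as C.
class_means = np.array([[0.0, 10.0], [10.0, 0.0], [0.0, 0.0], [0.5, 0.5]])
assignment, collided, unused = find_collisions(kmeans, class_means)
```

Any centroid index not listed in `collided` or `unused` is one I would like to keep as it is.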
Does sklearn's KMeans allow for that?
EDIT: I'd like to do this because the KMeans class's k-means++ random initialization performs very well for my purposes, and reimplementing it from scratch would require significantly more effort.
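The closest thing I have found so far is a warm start, sketched below under some assumptions. It keeps the good centroids as starting positions, draws fresh seeds for the bad ones with `sklearn.cluster.kmeans_plusplus`, and passes the combined array to `init`. The helper `reseed_and_refit` and the toy data are hypothetical. Two caveats: as far as I can tell, an array passed to `init` only sets the starting positions, so the kept centroids can still move during fitting, and the fresh seeds are drawn without taking the kept centroids into account.

```python
import numpy as np
from sklearn.cluster import KMeans, kmeans_plusplus

def reseed_and_refit(X, old_centers, bad_idx, random_state=0):
    """Refit KMeans starting from old centroids, re-seeding only the bad ones.

    Hypothetical helper. Note: the kept centroids are only initial values;
    KMeans still updates every centroid while fitting.
    """
    init_centers = old_centers.copy()
    # Draw replacement seeds with k-means++ (these ignore the kept centroids).
    seeds, _ = kmeans_plusplus(X, n_clusters=len(bad_idx), random_state=random_state)
    init_centers[bad_idx] = seeds
    # n_init=1 because an explicit init array leaves nothing to restart from.
    refit = KMeans(n_clusters=len(init_centers), init=init_centers, n_init=1).fit(X)
    return refit, init_centers

# Toy data (an assumption for the demo): four well-separated 2-D blobs.
rng = np.random.default_rng(0)
centers = np.array([[0.0, 0.0], [10.0, 0.0], [0.0, 10.0], [10.0, 10.0]])
X = np.vstack([c + 0.1 * rng.standard_normal((20, 2)) for c in centers])

old = KMeans(n_clusters=4, init=centers, n_init=1).fit(X)

# Pretend centroids 0 and 3 need correction; 1 and 2 are kept as starting points.
refit, init_centers = reseed_and_refit(X, old.cluster_centers_, bad_idx=[0, 3])
```

This reuses the library's k-means++ seeding instead of reimplementing it, but it does not actually freeze centroids 2 and 3, which is what I'm really after.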