
r - How to predict cluster membership with cmeans?

Reposted. Author: 行者123. Updated: 2023-12-04 00:39:45

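I am using cmeans from the R package e1071 to cluster my data. I would like to predict cluster membership for new data, but I don't know how to write the predict function. Predicting hard cluster membership is easy (just assign each point to the nearest cluster center), but I don't know how to compute membership values like the ones given in cl$membership: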

cl <- cmeans( train, centers= 10, m= 1.08 )
# cl$membership contains the "soft" cluster membership
# the following line does not work, unfortunately
cl.new <- predict( cl, test )

# getting the hard cluster assignments is easy
predict.fclust <- function( cl, x ) {
  which.cl <- function( xx )
    which.min( apply( cl$centers, 1, function( y ) sum( ( y - xx )^2 ) ) )
  ret <- apply( x, 1, which.cl )
  names( ret ) <- rownames( x )
  ret
}
# this works, but only predicts hard clustering
cl.new <- predict( cl, test )

Best answer

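The membership of sample x_i in cluster j, with cluster centers c_1, ..., c_c and fuzzifier m, is defined as (Wikipedia):

$$ w_{ij} = \frac{1}{\sum_{k=1}^{c} \left( \frac{\lVert x_i - c_j \rVert}{\lVert x_i - c_k \rVert} \right)^{\frac{2}{m-1}}} $$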

Consider this example from the cmeans help page:

library("e1071")
set.seed(1)
x <- rbind(matrix(rnorm(100,sd=0.3), ncol=2),
           matrix(rnorm(100,mean=1,sd=0.3), ncol=2))
cl <- cmeans(x, 2, 20, verbose=TRUE, method="cmeans", m=2)

The membership values can then be computed as follows:

## compute distances between samples and cluster centers for default setting
## dist="euclidean"; use absolute values for dist="manhattan"
cc <- cl$centers
## dm[j, i] is the distance from sample i to cluster center j
dm <- sapply(seq_len(nrow(x)),
             function(i) apply(cc, 1, function(v) sqrt(sum((x[i, ]-v)^2))))

m <- 2
## compute cluster membership values
ms <- t(apply(dm, 2,
              function(d) {   # d: distances from one sample to all centers
                tmp <- 1/((d/sum(d))^(2/(m-1)))  # membership formula
                tmp/sum(tmp)  # normalization
              }))
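
To see why this code agrees with the membership definition, it may help to write out the two steps. The notation below is an assumption made for this note: d_{ij} is the distance from sample i to center j, and t_{ij} stands for `tmp`. The scaling by the sum of distances cancels in the normalization step, which leaves the usual fuzzy c-means membership:

```latex
\begin{align*}
t_{ij} &= \left( \frac{d_{ij}}{\sum_{l} d_{il}} \right)^{-\frac{2}{m-1}}, \\
\frac{t_{ij}}{\sum_{k} t_{ik}}
  &= \frac{d_{ij}^{-2/(m-1)}}{\sum_{k} d_{ik}^{-2/(m-1)}}
   = \frac{1}{\sum_{k} \left( \frac{d_{ij}}{d_{ik}} \right)^{\frac{2}{m-1}}}
\end{align*}
```

So dividing by `sum(d)` is not strictly needed for the result; it only rescales the distances before they are raised to a power.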

Compare:

R> head(cl$membership)
           1      2
[1,] 0.02669 0.9733
[2,] 0.01786 0.9821
[3,] 0.03622 0.9638
[4,] 0.13481 0.8652
[5,] 0.13708 0.8629
[6,] 0.20024 0.7998

R> head(ms)
           1      2
[1,] 0.02669 0.9733
[2,] 0.01786 0.9821
[3,] 0.03622 0.9638
[4,] 0.13481 0.8652
[5,] 0.13708 0.8629
[6,] 0.20024 0.7998

R> all.equal(ms, cl$membership, tolerance=1e-15)
[1] TRUE
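
The same computation can be applied to new data, which is what the question asks for. The following is only a minimal sketch, not something e1071 provides: `predict_membership` is a hypothetical helper name, it assumes Euclidean distance (the cmeans default), and it assumes you pass the same `m` that was used for fitting. Points that coincide exactly with a center get full membership in that center, which is a choice made here to avoid dividing by zero.

```r
# Hypothetical helper, not part of e1071: soft memberships of new points
# given fixed cluster centers. Assumes Euclidean distance (cmeans default).
predict_membership <- function(centers, newdata, m = 2) {
  centers <- as.matrix(centers)
  newdata <- as.matrix(newdata)
  stopifnot(ncol(centers) == ncol(newdata), m > 1)

  # d[i, j]: Euclidean distance from new point i to center j
  d <- apply(centers, 1, function(v) sqrt(rowSums(sweep(newdata, 2, v)^2)))
  d <- matrix(d, nrow = nrow(newdata), ncol = nrow(centers))

  # unnormalized weights d^(-2/(m-1)); Inf where a point sits on a center
  w <- d^(-2 / (m - 1))
  on_center <- is.infinite(w)
  hit <- rowSums(on_center) > 0
  if (any(hit)) {
    # such points get full membership in the center(s) they coincide with
    w[hit, ] <- on_center[hit, , drop = FALSE] * 1
  }

  u <- w / rowSums(w)  # normalize so that each row sums to 1
  dimnames(u) <- list(rownames(newdata), rownames(centers))
  u
}

# Toy data (made up for illustration): two centers, three new points
centers <- rbind(c(0, 0), c(1, 1))
newdata <- rbind(c(0, 0), c(0.5, 0.5), c(0.9, 0.9))
u <- predict_membership(centers, newdata, m = 2)

# Hard assignment, if needed: index of the largest membership per row
hard <- max.col(u, ties.method = "first")
```

With a fitted model from the question, the call would presumably look like `predict_membership(cl$centers, test, m = 1.08)`. For `m` very close to 1 the exponent becomes large, so extreme distances may overflow or underflow; rescaling the distances per row first (as the answer's code does with `sum(d)`) is one way to reduce that risk.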

This post is based on a similar question found on Stack Overflow, "r - How to predict cluster membership with cmeans?": https://stackoverflow.com/questions/20243460/
