
r - Help with hierarchical cluster analysis: the dendrogram

Reposted. Author: 行者123. Updated: 2023-12-05 01:23:13

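I wrote code that uses the hclust function to generate a dendrogram, as you can see in the image. I would like help interpreting this dendrogram. Note that the points are located close to each other. What does the resulting dendrogram mean? I would really like a more complete analysis of the generated output.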

library(geosphere)

Points_properties<-structure(list(Propertie=c(1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29), Latitude = c(-24.781624, -24.775017, -24.769196,
-24.761741, -24.752019, -24.748008, -24.737312, -24.744718, -24.751996,
-24.724589, -24.8004, -24.796899, -24.795041, -24.780501, -24.763376,
-24.801715, -24.728005, -24.737845, -24.743485, -24.742601, -24.766422,
-24.767525, -24.775631, -24.792703, -24.790994, -24.787275, -24.795902,
-24.785587, -24.787558), Longitude = c(-49.937369,
-49.950576, -49.927608, -49.92762, -49.920608, -49.927707, -49.922095,
-49.915438, -49.910843, -49.899478, -49.901775, -49.89364, -49.925657,
-49.893193, -49.94081, -49.911967, -49.893358, -49.903904, -49.906435,
-49.927951, -49.939603, -49.941541, -49.94455, -49.929797, -49.92141,
-49.915141, -49.91042, -49.904772, -49.894034)), row.names = c(NA, -29L), class = c("tbl_df", "tbl",
"data.frame"))

# Keep only the coordinate columns; they are passed below in (longitude, latitude) order.
coordinates <- subset(Points_properties, select = c("Latitude", "Longitude"))
plot(coordinates[, 2:1])
text(x = Points_properties$Longitude, y = Points_properties$Latitude,
     labels = Points_properties$Propertie, pos = 2)

[Image: scatter plot of the 29 points, labeled by property number]

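# Pairwise great-circle distances in meters (distm defaults to distHaversine).
d <- distm(coordinates[, 2:1])
d <- as.dist(d)
# Average-linkage hierarchical clustering.
fit.average <- hclust(d, method = "average")
plot(fit.average, hang = -1, cex = .8, main = "")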

[Image: dendrogram produced by the code above]
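The heights in the dendrogram are in the units of the distance matrix, so it helps to see what one entry of that matrix looks like. The following is a minimal sketch of the haversine formula in base R; `haversine_m` is a hypothetical helper written for illustration, not part of geosphere, and the radius 6378137 m is assumed to match the default used by `distHaversine`.

```r
# Hypothetical helper: great-circle distance in meters between two lon/lat points.
# r = 6378137 is assumed to match geosphere's default radius.
haversine_m <- function(lon1, lat1, lon2, lat2, r = 6378137) {
  to_rad <- pi / 180
  dlat <- (lat2 - lat1) * to_rad
  dlon <- (lon2 - lon1) * to_rad
  a <- sin(dlat / 2)^2 +
    cos(lat1 * to_rad) * cos(lat2 * to_rad) * sin(dlon / 2)^2
  2 * r * asin(pmin(1, sqrt(a)))
}

# Points 21 and 22 from the question's data.
d_21_22 <- haversine_m(-49.939603, -24.766422, -49.941541, -24.767525)
print(d_21_22)
```

The result is a few hundred meters, which is the scale to expect at the bottom of the dendrogram's y axis.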

Best answer

You chose to perform hierarchical clustering with the average method.

According to ?hclust:

This function performs a hierarchical cluster analysis using a set of dissimilarities for the n objects being clustered. Initially, each object is assigned to its own cluster and then the algorithm proceeds iteratively, at each stage joining the two most similar clusters, continuing until there is just a single cluster. At each stage distances between clusters are recomputed

You can follow what happens using the merge component:

Row i of merge describes the merging of clusters at step i of the clustering. If an element j in the row is negative, then observation −j was merged at this stage. If j is positive then the merge was with the cluster formed at the (earlier) stage j of the algorithm

fit.average$merge
     [,1] [,2]
[1,]  -21  -22
[2,]  -15    1
[3,]  -13  -24
[4,]   -6  -20
[5,]   -2  -23
[6,]  -16  -27
...

This is what you see in the dendrogram:
[Image: dendrogram]

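The height on the y axis of the dendrogram is the dissimilarity at which two clusters are joined. Because you use method = "average", that is the mean of all pairwise distances between the members of the two clusters (not the distance to a cluster center), in meters here since distm returns meters.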
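How a merge height arises under average linkage can be checked by hand. This is a small sketch on made-up data: the five points in `pts` are hypothetical (not the question's coordinates), and base `dist` is used instead of `distm` so that no extra package is needed.

```r
# Hypothetical toy data: five points on a line (not the question's data).
pts <- matrix(c(0, 1, 5, 7, 20), ncol = 1)
d <- dist(pts)
fit <- hclust(d, method = "average")

# For this data the last merge joins the cluster {1, 2, 3, 4} with point 5.
# Under average linkage its height is the mean of the pairwise distances
# between the two sides of the merge.
dm <- as.matrix(d)
mean_pairwise <- mean(dm[1:4, 5])
top_height <- max(fit$height)

print(c(mean_pairwise = mean_pairwise, top_height = top_height))
```

Both values agree, which is a quick way to confirm how to read the y axis for a given linkage method.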

  1. Points 21 and 22 (the closest pair) are merged to form cluster 1
  2. Cluster 1 is merged with point 15 to form cluster 2
  3. ...
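The steps above can be read off the merge matrix mechanically. Here is a sketch of a helper that does so; `describe_merges` is a hypothetical name introduced for illustration, and the toy data is again made up rather than taken from the question.

```r
# Hypothetical helper: turn each row of an hclust merge matrix into a sentence.
describe_merges <- function(fit) {
  # Negative entries are single observations, positive entries are earlier steps.
  label <- function(j) {
    if (j < 0) paste("point", -j) else paste("cluster", j)
  }
  vapply(seq_len(nrow(fit$merge)), function(i) {
    sprintf("step %d: %s + %s -> cluster %d (height %.2f)",
            i, label(fit$merge[i, 1]), label(fit$merge[i, 2]),
            i, fit$height[i])
  }, character(1))
}

# Toy data (an assumption, not the question's coordinates).
pts <- matrix(c(0, 1, 5, 7, 20), ncol = 1)
fit <- hclust(dist(pts), method = "average")
steps <- describe_merges(fit)
writeLines(steps)
```

Calling `describe_merges(fit.average)` on the fitted object from the question should list all 28 merges in the same form.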

You can then call rect.hclust, which accepts various arguments, such as the number of groups k you want:

rect.hclust(fit.average, k=3)

[Image: dendrogram with rectangles drawn around the 3 clusters]
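If only the group memberships are needed, without drawing on an existing plot, `cutree` from base R's stats package is an alternative to consider. The sketch below uses hypothetical toy data and shows cutting both by number of groups and by height.

```r
# Toy data (an assumption, not the question's coordinates).
pts <- matrix(c(0, 1, 5, 7, 20), ncol = 1)
fit <- hclust(dist(pts), method = "average")

# Cut into k groups: returns one cluster id per observation, in data order.
by_k <- cutree(fit, k = 3)

# Cut at a height instead (same units as the dissimilarities).
by_h <- cutree(fit, h = 3)

print(by_k)
print(table(by_k))
```

Cutting by height can be easier to justify when the distances have a physical meaning, such as meters between properties.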

You can also use the output of rect.hclust to color the original points:

groups <- rect.hclust(fit.average, k=3)
groups

#[[1]]
# [1] 5 6 7 8 9 10 17 18 19 20

#[[2]]
# [1] 1 2 3 4 15 21 22 23

#[[3]]
# [1] 11 12 13 14 16 24 25 26 27 28 29

# One id per group, repeated once per member, then reordered to match the point order.
colors <- rep(seq_along(groups), lengths(groups))
colors <- colors[order(unlist(groups))]

plot(coordinates[, 2:1], col = colors)

[Image: scatter plot of the points colored by cluster]
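The reordering with `order(unlist(groups))` is easy to misread, so here is a small worked example. The `groups` list is hypothetical; it only mimics the list-of-indices shape that rect.hclust returns.

```r
# Hypothetical groups in the same shape as rect.hclust's return value.
groups <- list(c(2, 5), c(1, 3), c(4))

# Group id repeated once per member, in list order: 1 1 2 2 3.
ids <- rep(seq_along(groups), lengths(groups))

# Point index that each of those ids belongs to: 2 5 1 3 4.
point_order <- unlist(groups)

# Reorder so that entry i holds the group id of point i.
colors <- ids[order(point_order)]

# Equivalent formulation: assign each id directly to its point's slot.
colors_alt <- integer(length(point_order))
colors_alt[point_order] <- ids

print(colors)
```

The second formulation may read more directly: each group id is written into the position of the point it belongs to.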

Regarding "r - Help with hierarchical cluster analysis: the dendrogram", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/72730592/
