
algorithm - DBSCAN cluster size smaller than MinPts

Reposted. Author: 行者123. Last updated: 2023-11-30 09:03:19

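I was just thinking about some special cases of DBSCAN. One such case is shown in the figure here. Assume eps equals the radius of the circles. With MinPts=3, p and r are core points. It is unclear whether q belongs to the cluster of p or to the cluster of r. If a recursive implementation is used and the algorithm examines r first, q becomes part of r's cluster. As a result, p would define a cluster with only two elements. The original paper states: "Note that a cluster wrt. Eps and MinPts contains at least MinPts points [...]" Am I missing something here, or was this special case simply not considered?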

Best answer

In your example, q is also a core point: its circle contains three points, namely p, q, and r. You would need minPts=4 for this example.
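
The core-point check can be sketched as follows. The figure from the question is not reproduced here, so the coordinates, the extra points a, b, c, d, and the helper names `neighborhood` and `core_points` are assumptions made purely for illustration. The layout places q within eps of both p and r, and a point counts itself as part of its own neighborhood.

```python
from math import dist

# Hypothetical coordinates (not taken from the original figure).
points = {
    "p": (0.0, 0.0), "q": (0.95, 0.0), "r": (1.8, 0.0),
    "a": (-0.6, 0.7), "b": (-0.6, -0.7),   # extra neighbours of p
    "c": (2.4, 0.7), "d": (2.4, -0.7),     # extra neighbours of r
}
EPS = 1.0


def neighborhood(name):
    """Names of all points within EPS of `name`, including `name` itself."""
    center = points[name]
    return sorted(n for n, xy in points.items() if dist(center, xy) <= EPS)


def core_points(min_pts):
    """Points whose eps-neighborhood holds at least `min_pts` points."""
    return sorted(n for n in points if len(neighborhood(n)) >= min_pts)


print(neighborhood("q"))   # q sees p, q and r
print(core_points(3))      # q qualifies as a core point
print(core_points(4))      # q drops to a border point
```

In this layout q is a core point for minPts=3, so p, q, and r end up in one cluster and the ambiguity does not arise. Only with minPts=4 does q become a border point shared by two clusters.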

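You need to distinguish the theoretical definition of density-based clusters from the output of the efficient algorithm, which for good reason gives only "almost" the theoretical result: in the theoretical model, q would be part of both clusters. But that would be inconvenient and surprising for users.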
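
A minimal sketch of this order dependence, assuming a plain queue-based DBSCAN in which a border point stays with whichever cluster reaches it first. The function `dbscan`, its `order` parameter, the helper `sizes`, and the point coordinates are all hypothetical and written only for this illustration; they are not the code of any particular library.

```python
from math import dist

# Hypothetical layout: q lies within eps of both core points p and r.
points = {
    "p": (0.0, 0.0), "q": (0.95, 0.0), "r": (1.8, 0.0),
    "a": (-0.6, 0.7), "b": (-0.6, -0.7),
    "c": (2.4, 0.7), "d": (2.4, -0.7),
}


def dbscan(points, eps, min_pts, order):
    """DBSCAN with first-come assignment of border points.

    `order` is the sequence in which unlabeled points are tried as seeds.
    Returns {name: cluster id}; noise is labeled -1.
    """
    def neighbors(n):
        return [m for m in points if dist(points[n], points[m]) <= eps]

    labels = {}
    cluster = -1
    for seed in order:
        if seed in labels:
            continue
        if len(neighbors(seed)) < min_pts:
            labels[seed] = -1          # noise for now, may become border later
            continue
        cluster += 1
        labels[seed] = cluster
        queue = neighbors(seed)
        while queue:
            n = queue.pop()
            if labels.get(n) == -1:
                labels[n] = cluster    # former noise becomes a border point
            if n in labels:
                continue               # already claimed: never reassigned
            labels[n] = cluster
            nb = neighbors(n)
            if len(nb) >= min_pts:
                queue.extend(nb)       # only core points expand the cluster
    return labels


def sizes(labels):
    """Cluster id -> member count, ignoring noise."""
    out = {}
    for cid in labels.values():
        if cid != -1:
            out[cid] = out.get(cid, 0) + 1
    return out


r_first = dbscan(points, 1.0, 4, order=["r", "p", "q", "a", "b", "c", "d"])
p_first = dbscan(points, 1.0, 4, order=["p", "r", "q", "a", "b", "c", "d"])
print(sizes(r_first), r_first["q"] == r_first["r"])
print(sizes(p_first), p_first["q"] == p_first["p"])
```

With minPts=4, whichever core point is processed first keeps q, and the other cluster is left with three points, one fewer than minPts.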

You are not the first to notice this. Even Wikipedia is aware of it:

While minPts intuitively is the minimum cluster size, in some cases DBSCAN can produce smaller clusters.[5] A DBSCAN cluster consists of at least one core point.[5] As other points may be border points to more than one cluster, there is no guarantee that at least minPts points are included in every cluster.

The cited reference [5] is the article

Schubert, Erich; Sander, Jörg; Ester, Martin; Kriegel, Hans Peter; Xu, Xiaowei (July 2017). "DBSCAN Revisited, Revisited: Why and How You Should (Still) Use DBSCAN". ACM Trans. Database Syst. 42 (3): 19:1–19:21. doi:10.1145/3068335. ISSN 0362-5915.

which contains the footnote:

Note that this can, in rare cases, lead to a cluster with fewer than minPts points, if too many border points are reachable by different clusters, and have previously been assigned to other clusters. Every cluster will at least have one core point. Multi-assignment to exactly represent the theoretical model—or an assignment by shortest distance—can be implemented easily.
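
The two alternatives named in the footnote could look roughly like the sketch below. It is an illustration under assumptions, not the implementation from the article: the functions `core_clusters` and `assign_borders`, the `multi` flag, and the point layout are all hypothetical. Core points are first grouped into clusters, and each border point is then attached either to every cluster with a core point within eps (multi-assignment) or only to the cluster of its nearest core point.

```python
from math import dist

# Hypothetical layout: q is a border point within eps of core points p and r.
points = {
    "p": (0.0, 0.0), "q": (0.95, 0.0), "r": (1.8, 0.0),
    "a": (-0.6, 0.7), "b": (-0.6, -0.7),
    "c": (2.4, 0.7), "d": (2.4, -0.7),
}


def core_clusters(points, eps, min_pts):
    """Map each core point to a cluster id (core points linked within eps)."""
    def neighbors(n):
        return [m for m in points if dist(points[n], points[m]) <= eps]

    cores = {n for n in points if len(neighbors(n)) >= min_pts}
    cluster_of = {}
    next_id = 0
    for start in points:               # dict order keeps ids deterministic
        if start not in cores or start in cluster_of:
            continue
        cluster_of[start] = next_id
        stack = [start]
        while stack:
            n = stack.pop()
            for m in neighbors(n):
                if m in cores and m not in cluster_of:
                    cluster_of[m] = next_id
                    stack.append(m)
        next_id += 1
    return cluster_of


def assign_borders(points, eps, cluster_of, multi=True):
    """Return {name: set of cluster ids}; an empty set marks noise."""
    result = {n: {cid} for n, cid in cluster_of.items()}
    for n in points:
        if n in cluster_of:
            continue
        near = [(dist(points[n], points[c]), cid)
                for c, cid in cluster_of.items()
                if dist(points[n], points[c]) <= eps]
        if not near:
            result[n] = set()
        elif multi:
            result[n] = {cid for _, cid in near}   # member of every cluster
        else:
            result[n] = {min(near)[1]}             # nearest core point wins
    return result


cluster_of = core_clusters(points, 1.0, 4)
multi = assign_borders(points, 1.0, cluster_of, multi=True)
nearest = assign_borders(points, 1.0, cluster_of, multi=False)
print(multi["q"], nearest["q"])
```

In this layout, multi-assignment places q in both clusters, so each has four members and matches the theoretical model. Assignment by shortest distance removes the dependence on processing order, but the cluster that loses q still has only three points.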

Regarding "algorithm - DBSCAN cluster size smaller than MinPts", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/59143239/
