
python - Increasing the max_components variable in the dedupe library

Reposted. Author: 太空宇宙. Updated: 2023-11-04 05:02:02
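How can I increase the default value of the max_components variable?

By default, max_components is set to 30000. I need to raise this limit because every time I run deduplication (on the same dataset) I get different results.

I think the total number of clusters in my data is greater than 30000.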

Best answer

Answer from GitHub

Issue on the dedupe GitHub repository: "Increase max_components = 30000"

If you are getting different results using the same saved settings file, then what you are reporting is a bug. If you are getting different results from different training data (or even the same training data), that's expected, as at various points dedupe uses a random sample to learn good rules.
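To illustrate why random sampling alone can change results between runs, here is a minimal sketch using only the Python standard library. It is not dedupe's code: `records` and `draw_training_sample` are hypothetical names made up for this illustration, and the sketch does not claim that seeding makes dedupe itself deterministic.

```python
import random

# Stand-in for record IDs; this is not real dedupe data.
records = list(range(1000))

def draw_training_sample(data, k, seed=None):
    """Draw k items without replacement.

    With seed=None the generator is seeded from system entropy, so each
    call can return a different sample. A fixed seed repeats the same draw.
    """
    rng = random.Random(seed)
    return rng.sample(data, k)

# Two unseeded draws will almost certainly differ from each other.
unseeded_a = draw_training_sample(records, 10)
unseeded_b = draw_training_sample(records, 10)

# Two draws with the same seed are identical.
seeded_a = draw_training_sample(records, 10, seed=42)
seeded_b = draw_training_sample(records, 10, seed=42)

print(seeded_a == seeded_b)  # True
```

Anything learned from an unseeded sample (such as matching rules) can differ from run to run, which would be consistent with the behavior described in the answer.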

In either case, I doubt that max_components is related. But, if you want to change it, fork the code and change it.
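If you do want to try a different value, the general pattern of overriding a keyword default can be sketched as follows. The `cluster` function below is a hypothetical stand-in written for this example, and its signature is an assumption; it is not confirmed against the dedupe source, so check where max_components is actually defined in your installed version.

```python
import functools

def cluster(pairs, threshold=0.5, max_components=30000):
    """Hypothetical stand-in for a library function with a keyword default.

    It only echoes its arguments so the effect of the override is visible.
    """
    return {
        "n_pairs": len(pairs),
        "threshold": threshold,
        "max_components": max_components,
    }

# Bind a larger value without editing the function body.
cluster_large = functools.partial(cluster, max_components=100000)

result = cluster_large([("a", "b"), ("b", "c")])
print(result["max_components"])  # 100000
```

A wrapper like this only helps when your own code calls the function directly. If the library calls it internally without passing the argument, the default has to be changed in the source, which is why the answer suggests forking.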

Regarding "python - Increasing the max_components variable in the dedupe library", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/45480818/
