
python - How do I determine the similarity between a source string and multiple strings in Python?

Reposted. Author: 行者123. Updated: 2023-12-01 05:37:00

Suppose I have the following source string:

Humpty dumpty <span id="1">sat</span> on a wall, humpty dumpty had a great fall. All of <span id="two">the kings</span> horses and all the kings men.

And a list of other strings, each separated by a newline:

Humpty dumpty sat on a wall, humpty dumpty had a great fall. All of the kings horses and all the kings men.

Humpty dumpty sat on the wall, all of the kings horses and all the kings men.

There is a humpty dumpty who had sat on the wall, and all of the kings horses and all the kings men.

Humpty dumpty sat on some wall, humpty dumpty had a great fall. All of the kings horses and all the kings men couldn't put him together again.

Humpty dumpty this is a completely related sentence.

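Given these target strings, I would like to use Python to find out which of the "other strings in the list" best matches the source string. Is there a good way to compute a "score" for each source/target pair and, based on some criterion, determine which string matches the source string most closely? (In this case the most similar string should be the first one, because it is the source string without any "<span id="1"></span>" markup.)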

Best answer

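You can use the PyLevenshtein module to compute the Levenshtein distance (the minimum number of single-character insertions, deletions, and substitutions needed to turn one string into another) and use it as a measure of similarity between strings: the smaller the distance, the closer the match.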

https://code.google.com/p/pylevenshtein/
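To make the idea concrete, here is a minimal sketch that does not use the PyLevenshtein API at all: it implements the Levenshtein distance in pure Python so that it runs without extra packages. The helper names (levenshtein, strip_tags, similarity, rank) are made up for illustration, the regex-based tag stripping is a naive assumption that only suits simple markup like the spans in the question, and normalizing the distance by the longer string's length is just one possible scoring choice.

```python
import re


def levenshtein(a: str, b: str) -> int:
    """Edit distance computed with the classic two-row dynamic programme."""
    if len(a) < len(b):
        a, b = b, a  # keep the inner row as short as possible
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # drop a character from a
                current[j - 1] + 1,      # drop a character from b
                previous[j - 1] + cost,  # match or substitute
            ))
        previous = current
    return previous[-1]


def strip_tags(text: str) -> str:
    """Naively remove anything that looks like an HTML tag (assumption)."""
    return re.sub(r"<[^>]+>", "", text)


def similarity(a: str, b: str) -> float:
    """Score in [0, 1]; 1.0 means the strings are identical."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    return 1.0 - levenshtein(a, b) / longest


def rank(source: str, candidates: list) -> list:
    """Return (score, candidate) pairs, best match first."""
    clean = strip_tags(source)
    scored = [(similarity(clean, c), c) for c in candidates]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)


source = ('Humpty dumpty <span id="1">sat</span> on a wall, humpty dumpty '
          'had a great fall. All of <span id="two">the kings</span> horses '
          'and all the kings men.')

candidates = [
    "Humpty dumpty sat on a wall, humpty dumpty had a great fall. "
    "All of the kings horses and all the kings men.",
    "Humpty dumpty sat on the wall, all of the kings horses and all "
    "the kings men.",
    "There is a humpty dumpty who had sat on the wall, and all of the "
    "kings horses and all the kings men.",
    "Humpty dumpty sat on some wall, humpty dumpty had a great fall. "
    "All of the kings horses and all the kings men couldn't put him "
    "together again.",
    "Humpty dumpty this is a completely related sentence.",
]

for score, text in rank(source, candidates):
    print(f"{score:.3f}  {text}")
```

With the tags stripped first, the first candidate is identical to the source and scores 1.0, which is the result the question expects. If the tags were left in, every candidate would be penalized for the markup characters, so whether to strip them depends on what "similar" should mean for your data.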

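This post, "python - How do I determine the similarity between a source string and multiple strings in Python?", is based on a similar question found on Stack Overflow: https://stackoverflow.com/questions/18710942/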
