Twitter Trending Topics: Combine different spellings

Reposted. Author: 行者123. Updated: 2023-12-02 16:10:59

Twitter's trending topics often consist of more than one word. For compound terms, however, there are usually several spellings, for example:

"Half-Blood Prince" / "Half Blood Prince"

To find every update that mentions a trending topic, you need all of its spellings. This is how Twitter does it:

(Screenshot: Twitter's Trending Topics Admin)

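The topic name is on the left and its different spellings are on the right. Do you think this is done manually or automatically? Can it be automated? If so, how?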

I hope you can help me. Thanks in advance!

Best answer

What you basically want is to find the similarity between two strings.

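I think the Soundex algorithm is what you are looking for. It can be used to compare strings based on how they are pronounced. Or, as Wikipedia describes it: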

Soundex is a phonetic algorithm for indexing names by sound, as pronounced in English. The goal is for homophones to be encoded to the same representation so that they can be matched despite minor differences in spelling.

And:

Using this algorithm [EDIT: that is, "rating" words by a letter and three digits], both "Robert" and "Rupert" return the same string "R163" while "Rubin" yields "R150". "Ashcraft" yields "A261".
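To make the quoted description concrete, here is a minimal sketch of American Soundex in Python. The function name `soundex` and the table `CODES` are my own choices for illustration, not part of any library, and the sketch assumes plain English letters. It reproduces the values quoted above.

```python
# Letter groups and their Soundex digits (vowels, y, h and w have no digit).
CODES = {}
for letters, digit in (("bfpv", "1"), ("cgjkqsxz", "2"), ("dt", "3"),
                       ("l", "4"), ("mn", "5"), ("r", "6")):
    for ch in letters:
        CODES[ch] = digit


def soundex(word):
    """Return the 4-character Soundex code of word (hypothetical helper)."""
    letters = [c for c in word.lower() if c.isalpha()]
    if not letters:
        return ""
    first = letters[0]
    digits = []
    prev = CODES.get(first, "")  # the first letter's code also suppresses repeats
    for ch in letters[1:]:
        if ch in "hw":
            continue  # h and w do not separate letters with the same code
        code = CODES.get(ch, "")
        if code and code != prev:
            digits.append(code)
        prev = code  # a vowel resets prev, so a repeated code after it counts again
    # Keep the first letter plus three digits, padding with zeros if needed.
    return (first.upper() + "".join(digits) + "000")[:4]


print(soundex("Robert"), soundex("Rupert"), soundex("Rubin"), soundex("Ashcraft"))
# prints: R163 R163 R150 A261
```

Note that Soundex was designed for single names, so for multi-word topics you would probably want to encode each word separately and compare the resulting lists of codes.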

There is also the Levenshtein distance.
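The Levenshtein distance counts the minimum number of single-character insertions, deletions, and substitutions needed to turn one string into another, so spellings that differ only by a hyphen or a space end up very close. A rough sketch using the standard dynamic-programming approach follows; the function name `levenshtein` and the example strings are my own, chosen for illustration.

```python
def levenshtein(a, b):
    """Return the edit distance between strings a and b (hypothetical helper)."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string in the inner loop to save memory
    # previous[j] holds the distance between the processed prefix of a and b[:j].
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]  # distance from a[:i] to the empty string
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution or match
        previous = current
    return previous[-1]


print(levenshtein("Half-Blood Prince", "Half Blood Prince"))  # prints: 1
print(levenshtein("kitten", "sitting"))                       # prints: 3
```

In practice you would lowercase both strings first and pick a threshold, for example treating two topic names as the same when the distance is small relative to their length. The right threshold is an assumption you would have to tune on real data.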

Good luck.

Regarding "Twitter Trending Topics: Combine different spellings", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/1203497/
