
python - Trouble computing scores for pairs of lists iteratively?

Reposted. Author: 行者123. Updated: 2023-11-30 23:11:42

Suppose I have the following lists (in reality they have many more sublists):

list_1 = [['Hi my name is anon'],
['Hi I like #hokey']]


list_2 = [['Hi my name is anon_2'],
['Hi I like #Basketball']]

I want to compute the distance for every possible pair without repetition (combinations without replacement, or a product?). For example:

distance between: ['Hi my name is anon'] and ['Hi my name is anon_2']
distance between: ['Hi my name is anon'] and ['Hi I like #Basketball']
distance between: ['Hi I like #hokey'] and ['Hi my name is anon_2']
distance between: ['Hi I like #hokey'] and ['Hi I like #Basketball']

and put the scores in a list like this:

[distance_1,distance_2,distance_3,distance_4]

For this I was thinking of using itertools product or combinations. This is what I tried:

strings_1 = [i[0] for i in list_1]
strings_2 = [i[0] for i in list_2]

import itertools

scores_list = [dis.jaccard(i,j) for i,j in zip(itertools.combinations(strings_1, strings_2))]

The problem is that I get this traceback:

    scores_list = [dis.jaccard(i,j) for i,j in zip(itertools.combinations(strings_1, strings_2))]
TypeError: an integer is required

How can I do this efficiently, and how do I compute this product-like combination of pairs?

Best answer

You need to use itertools.product to get the Cartesian product, like this

from itertools import product
[dis.jaccard(string1, string2) for string1, string2 in product(list_1, list_2)]

product pairs up the items, as shown below

>>> from pprint import pprint
>>> pprint(list(product(list_1, list_2)))
[(['Hi my name is anon'], ['Hi my name is anon_2']),
 (['Hi my name is anon'], ['Hi I like #Basketball']),
 (['Hi I like #hokey'], ['Hi my name is anon_2']),
 (['Hi I like #hokey'], ['Hi I like #Basketball'])]
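
To see why the original attempt failed, here is a minimal sketch that contrasts the two functions. It relies only on the documented signatures: combinations(iterable, r) takes one iterable plus an integer length r, so a second list in the r position raises TypeError, whereas product accepts several iterables. The variable names raised and pairs are illustrative only.

```python
from itertools import combinations, product

list_1 = [['Hi my name is anon'], ['Hi I like #hokey']]
list_2 = [['Hi my name is anon_2'], ['Hi I like #Basketball']]

# combinations(iterable, r) expects an integer r as its second argument,
# so handing it a second list fails before any pair is produced.
try:
    combinations(list_1, list_2)
    raised = False
except TypeError:
    raised = True

# product(a, b) pairs every item of a with every item of b,
# giving len(a) * len(b) tuples in total.
pairs = list(product(list_1, list_2))

print(raised)
print(len(pairs))
```

Note that combinations draws pairs from within a single iterable, which is a different operation from pairing items across two lists.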

If you only want to apply the jaccard function to the strings inside the lists, you may need to preprocess the lists like this

>>> list_11 = [item for items in list_1 for item in items]
>>> list_21 = [item for items in list_2 for item in items]
>>> pprint([str1 + " " + str2 for str1, str2 in product(list_11, list_21)])
['Hi my name is anon Hi my name is anon_2',
 'Hi my name is anon Hi I like #Basketball',
 'Hi I like #hokey Hi my name is anon_2',
 'Hi I like #hokey Hi I like #Basketball']
>>> pprint([dis.jaccard(str1, str2) for str1, str2 in product(list_11, list_21)])
...
...
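
Since dis is not defined anywhere in the question, the following is only a sketch of the flatten-then-score step with a stand-in. jaccard_distance is a hypothetical helper, not the dis.jaccard API, and it assumes a Jaccard distance over whitespace-separated tokens. The real dis.jaccard may tokenize differently and return different numbers.

```python
from itertools import product


def jaccard_distance(a, b):
    """Hypothetical stand-in for dis.jaccard.

    Computes 1 - |A & B| / |A | B| over whitespace-separated tokens.
    """
    set_a, set_b = set(a.split()), set(b.split())
    union = set_a | set_b
    if not union:
        # Two empty strings are treated as identical.
        return 0.0
    return 1.0 - len(set_a & set_b) / len(union)


list_1 = [['Hi my name is anon'], ['Hi I like #hokey']]
list_2 = [['Hi my name is anon_2'], ['Hi I like #Basketball']]

# Flatten the one-element sublists into plain lists of strings.
list_11 = [item for items in list_1 for item in items]
list_21 = [item for items in list_2 for item in items]

# One score per pair, in the same order product yields the pairs.
scores_list = [jaccard_distance(s1, s2) for s1, s2 in product(list_11, list_21)]
print(scores_list)
```

The resulting list has len(list_11) * len(list_21) entries, which matches the [distance_1, distance_2, distance_3, distance_4] layout asked for in the question.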

As Ashwini suggested in the comments, for your case you can use itertools.starmap directly, like this

>>> from itertools import product, starmap
>>> list(starmap(dis.jaccard, product(list_11, list_21)))

For example,

>>> list_1 = ["a1", "a2", "a3"]
>>> list_2 = ["b1", "b2", "b3"]
>>> from itertools import product, starmap
>>> list(starmap(lambda x, y: x + " " + y, product(list_1, list_2)))
['a1 b1', 'a1 b2', 'a1 b3', 'a2 b1', 'a2 b2', 'a2 b3', 'a3 b1', 'a3 b2', 'a3 b3']
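
As a rough check that starmap is just another way of writing the comprehension, the sketch below runs both forms with the same two-argument function. char_overlap is a hypothetical scoring function invented for this illustration and is not part of any library.

```python
from itertools import product, starmap


def char_overlap(x, y):
    # Hypothetical score: how many distinct characters the two strings share.
    return len(set(x) & set(y))


list_1 = ["a1", "a2", "a3"]
list_2 = ["b1", "b2", "b3"]

# starmap unpacks each (x, y) tuple from product into char_overlap(x, y).
via_starmap = list(starmap(char_overlap, product(list_1, list_2)))

# The equivalent list comprehension unpacks the tuple explicitly.
via_comprehension = [char_overlap(x, y) for x, y in product(list_1, list_2)]

print(via_starmap == via_comprehension)
```

starmap returns a lazy iterator, so wrapping it in list() is only needed when all scores are wanted at once.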

Regarding "python - Trouble computing scores for pairs of lists iteratively?", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/30107612/
