
Python: How do I get rid of nested loops?

Reposted. Author: 太空宇宙. Updated: 2023-11-03 14:39:42

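I have two for loops, one nested inside the other, and I would like to get rid of them somehow to speed up my code. My pandas dataframe looks like this (the column headers are different companies, the rows are different users; 1 means the user visited that company, 0 otherwise):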

   100  200  300  400
0    1    1    0    1
1    1    1    1    0

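I want to compare every pair of companies in my dataset. To do that, I created a list containing all the company IDs. The code takes the first company in the list (the base) and pairs it with every other company (the peers), hence the second for loop. My code is as follows: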

def calculate_scores():
    df_matrix = create_the_matrix(df)
    print(df_matrix)
    for base in list_of_companies:
        counter = 0
        for peer in list_of_companies:
            counter += 1
            if base == peer:
                pass  # same company: nothing to compare
            else:
                # Calculate the denominator first: slice the big matrix
                # down to the rows (users) that accessed the base firm
                denominator_df = df_matrix.loc[(df_matrix[base] == 1)]
                denominator = denominator_df.sum(axis=1).values.tolist()
                denominator = sum(denominator) - len(denominator)

                # Calculate the numerator afterwards, by filtering the slice
                # above down to the rows that accessed both the base and
                # the peer firm
                numerator_df = denominator_df.loc[(denominator_df[base] == 1) & (denominator_df[peer] == 1)]
                numerator = len(numerator_df.index)
                annual_search_fraction = numerator / denominator
                print("Base: {} and Peer: {} ==> {}".format(base, peer, annual_search_fraction))

Edit 1 (adding an explanation of the code):

The metric is as follows:

[Image: formula for the metric; not preserved in this copy]

1) The metric I am trying to calculate should tell me how often two companies are searched together, relative to all other searches.

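2) The code first selects all users (rows) who accessed the base firm (denominator_df = df_matrix.loc[(df_matrix[base] == 1)]). It then calculates the denominator, which counts the unique combinations between the base firm and any other firm those users searched. Since I can count the number of firms each user accessed, I can subtract 1 to get the number of unique links between the base firm and the other firms.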

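3) Next, the code filters denominator_df down to the rows that accessed both the base firm and the peer firm. Since I need the number of users who accessed both, I use numerator = len(numerator_df.index) to count the rows, which gives me the numerator.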

The expected output for the dataframe at the top is:

Base: 100 and Peer: 200 ==> 0.5
Base: 100 and Peer: 300 ==> 0.25
Base: 100 and Peer: 400 ==> 0.25
Base: 200 and Peer: 100 ==> 0.5
Base: 200 and Peer: 300 ==> 0.25
Base: 200 and Peer: 400 ==> 0.25
Base: 300 and Peer: 100 ==> 0.5
Base: 300 and Peer: 200 ==> 0.5
Base: 300 and Peer: 400 ==> 0.0
Base: 400 and Peer: 100 ==> 0.5
Base: 400 and Peer: 200 ==> 0.5
Base: 400 and Peer: 300 ==> 0.0

4) To check that the code gives the correct result: the metrics between one base firm and all of its peer firms must sum to 1. They do in the code I posted.

Any suggestions or hints on how to proceed would be much appreciated!

Best answer

You are probably looking for itertools.product(). Here is an example similar to what you appear to want to do:

import itertools

a = [ 'one', 'two', 'three' ]

for b in itertools.product(a, a):
    print(b)

The output of the snippet above is:

('one', 'one')
('one', 'two')
('one', 'three')
('two', 'one')
('two', 'two')
('two', 'three')
('three', 'one')
('three', 'two')
('three', 'three')
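
Note that itertools.product(a, a) also yields pairs of an item with itself, such as ('one', 'one'), whereas the loop in the question skips the case base == peer. A minimal sketch of one way to avoid that check, assuming a hypothetical list of company IDs named companies (not from the question), is itertools.permutations with a length of 2, which yields only ordered pairs of distinct items:

```python
import itertools

# Hypothetical company IDs, chosen to mirror the question's column headers.
companies = [100, 200, 300, 400]

# permutations(..., 2) yields every ordered pair of distinct positions,
# so no `if base == peer` check is needed (assuming the IDs are unique).
pairs = list(itertools.permutations(companies, 2))

for base, peer in pairs:
    print("Base: {} and Peer: {}".format(base, peer))
```

This only flattens the two loops into one; the per-pair work is unchanged, so by itself it should not be expected to make the code noticeably faster.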

Alternatively, you can unpack each pair directly in the loop:

for u, v in itertools.product(a, a):
    print("%s %s" % (u, v))

The output is then:

one one
one two
one three
two one
two two
two three
three one
three two
three three

If you want a list, you can do this:

alist = list(itertools.product(a, a))

print( alist )

The output is:

[('one', 'one'), ('one', 'two'), ('one', 'three'), ('two', 'one'), ('two', 'two'), ('two', 'three'), ('three', 'one'), ('three', 'two'), ('three', 'three')]
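
Since the goal in the question is speed, it may also be worth replacing the per-pair dataframe slicing with matrix operations. The following is a sketch, not the asker's method: it assumes the 0/1 matrix from the question is already available as a dataframe (rebuilt by hand here as df_matrix), and the names X, cooccurrence, denominator and scores are introduced only for illustration. The numerator for every (base, peer) pair is the number of users who visited both, which is the matrix product X.T @ X; the denominator for each base is the sum of (companies visited - 1) over the users who visited that base.

```python
import numpy as np
import pandas as pd

# The example matrix from the question, rebuilt by hand (assumption:
# rows are users, columns are company IDs, values are 0 or 1).
df_matrix = pd.DataFrame(
    [[1, 1, 0, 1],
     [1, 1, 1, 0]],
    columns=[100, 200, 300, 400],
)

X = df_matrix.to_numpy()

# cooccurrence[i, j] = number of users who visited both company i and j.
cooccurrence = X.T @ X

# For each base company: sum of (companies visited - 1) over its users.
denominator = X.T @ (X.sum(axis=1) - 1)

# Divide each row (base) by its denominator; the diagonal pairs a
# company with itself, so it is blanked out.
ratio = cooccurrence / denominator[:, None]
np.fill_diagonal(ratio, np.nan)

scores = pd.DataFrame(ratio, index=df_matrix.columns, columns=df_matrix.columns)
print(scores)
```

Here scores.loc[base, peer] holds the value the original loop prints for that pair. One difference to be aware of: if a base company has a denominator of 0, the original code raises ZeroDivisionError, while NumPy emits a warning and produces nan or inf instead.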

Regarding "Python: How do I get rid of nested loops?", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/54368202/
