
python - How do I count how many times combinations occur in a binary table in Python?

Reposted. Author: 行者123. Updated: 2023-12-04 07:18:53

I need to create a Pandas DataFrame with two columns:

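  • Combinations - contains tuples describing product combinations from the binary table (for example, ("bread", "eggs"))
  • Count - contains the number of times that combination occurs in the binary table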

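The binary table I keep referring to looks like this. A 1 simply means that the product is present in that particular row (in other words, it is part of the combination); otherwise the value is 0.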
          bread  cheese  eggs  flour  jam
    1000      0       0     1      0    0
    1001      1       0     0      0    0
    1002      1       0     1      1    0
    1003      1       0     1      0    1
    1004      0       0     1      0    0
    ...     ...     ...   ...    ...  ...
    1495      1       0     1      1    0
    1496      1       1     1      0    0
    1497      0       0     0      0    1
    1498      1       0     0      0    0
    1499      1       0     1      0    0

    500 rows × 5 columns
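To experiment with a table of this shape, the ten visible rows can be rebuilt as a small DataFrame. This is only a sketch: the names `rows` and `df` are assumptions made for illustration, and the 490 elided rows are not reproduced.

```python
import pandas as pd

products = ["bread", "cheese", "eggs", "flour", "jam"]

# Hypothetical reconstruction of the ten rows shown in the question;
# the rows hidden behind "..." are unknown and therefore omitted.
rows = {
    1000: [0, 0, 1, 0, 0],
    1001: [1, 0, 0, 0, 0],
    1002: [1, 0, 1, 1, 0],
    1003: [1, 0, 1, 0, 1],
    1004: [0, 0, 1, 0, 0],
    1495: [1, 0, 1, 1, 0],
    1496: [1, 1, 1, 0, 0],
    1497: [0, 0, 0, 0, 1],
    1498: [1, 0, 0, 0, 0],
    1499: [1, 0, 1, 0, 0],
}

# Each dictionary key becomes a row label, each list a row of 0/1 flags.
df = pd.DataFrame.from_dict(rows, orient="index", columns=products)
```
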
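I have already figured out how to create the Combinations column; I just don't know how to create the Count column from the data in the binary table. Here is my code so far: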
    import pandas as pd
    import itertools

    combinations_list = []
    products = ["bread", "cheese", "eggs", "flour", "jam"]

    for p in range(2, len(products) + 1):
        for subset in itertools.combinations(products, p):
            combinations_list.append(str(subset))

    # code for count column here

    report = pd.DataFrame(combinations_list, columns=['Combinations'])
    report
This is what the code produces, but I still need to add the Count column.
    >>
    Combinations
    0 ('bread', 'cheese')
    1 ('bread', 'eggs')
    2 ('bread', 'flour')
    3 ('bread', 'jam')
    4 ('cheese', 'eggs')
    5 ('cheese', 'flour')
    6 ('cheese', 'jam')
    7 ('eggs', 'flour')
    8 ('eggs', 'jam')
    9 ('flour', 'jam')
    10 ('bread', 'cheese', 'eggs')
    11 ('bread', 'cheese', 'flour')
    12 ('bread', 'cheese', 'jam')
    13 ('bread', 'eggs', 'flour')
    14 ('bread', 'eggs', 'jam')
    15 ('bread', 'flour', 'jam')
    16 ('cheese', 'eggs', 'flour')
    17 ('cheese', 'eggs', 'jam')
    18 ('cheese', 'flour', 'jam')
    19 ('eggs', 'flour', 'jam')
    20 ('bread', 'cheese', 'eggs', 'flour')
    21 ('bread', 'cheese', 'eggs', 'jam')
    22 ('bread', 'cheese', 'flour', 'jam')
    23 ('bread', 'eggs', 'flour', 'jam')
    24 ('cheese', 'eggs', 'flour', 'jam')
    25 ('bread', 'cheese', 'eggs', 'flour', 'jam')
Can anyone help me? Thanks!

Best answer

Here is one solution:

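    import itertools
    import pandas as pd

    # df is the binary table from the question.
    # Start every combination of two or more products at a count of zero.
    d = {}
    for x in range(2, len(df.columns) + 1):
        for y in itertools.combinations(df.columns, x):
            d[y] = 0

    # For each combination size and each row, collect the products present
    # in the row and increment every combination of that size among them.
    for x in range(2, len(df.columns) + 1):
        for i in df.index:
            s = [k for k in df.columns if df.loc[i, k] == 1]
            p = [j for j in itertools.combinations(s, x)]
            for w in p:
                d[w] += 1

    res = pd.DataFrame({'comb': list(d.keys()), 'count': list(d.values())})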
For the visible part of df (the rows you provided in the question), this code returns:
    >>>print(res)
    comb count
    0 (bread, cheese) 1
    1 (bread, eggs) 5
    2 (bread, flour) 2
    3 (bread, jam) 1
    4 (cheese, eggs) 1
    5 (cheese, flour) 0
    6 (cheese, jam) 0
    7 (eggs, flour) 2
    8 (eggs, jam) 1
    9 (flour, jam) 0
    10 (bread, cheese, eggs) 1
    11 (bread, cheese, flour) 0
    12 (bread, cheese, jam) 0
    13 (bread, eggs, flour) 2
    14 (bread, eggs, jam) 1
    15 (bread, flour, jam) 0
    16 (cheese, eggs, flour) 0
    17 (cheese, eggs, jam) 0
    18 (cheese, flour, jam) 0
    19 (eggs, flour, jam) 0
    20 (bread, cheese, eggs, flour) 0
    21 (bread, cheese, eggs, jam) 0
    22 (bread, cheese, flour, jam) 0
    23 (bread, eggs, flour, jam) 0
    24 (cheese, eggs, flour, jam) 0
    25 (bread, cheese, eggs, flour, jam) 0
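The same counts can also be computed per combination rather than per row: a row contains a combination exactly when all of that combination's columns equal 1. The following is a minimal sketch of that alternative, not the answer's method; the helper name `count_combinations` is hypothetical, and the sample `df` holds only the ten rows visible in the question.

```python
import itertools
import pandas as pd

products = ["bread", "cheese", "eggs", "flour", "jam"]

# Hypothetical sample: only the ten rows visible in the question.
df = pd.DataFrame(
    [
        [0, 0, 1, 0, 0],
        [1, 0, 0, 0, 0],
        [1, 0, 1, 1, 0],
        [1, 0, 1, 0, 1],
        [0, 0, 1, 0, 0],
        [1, 0, 1, 1, 0],
        [1, 1, 1, 0, 0],
        [0, 0, 0, 0, 1],
        [1, 0, 0, 0, 0],
        [1, 0, 1, 0, 0],
    ],
    index=[1000, 1001, 1002, 1003, 1004, 1495, 1496, 1497, 1498, 1499],
    columns=products,
)


def count_combinations(table):
    """For every combination of 2+ columns, count rows where all are 1."""
    records = []
    for size in range(2, len(table.columns) + 1):
        for combo in itertools.combinations(table.columns, size):
            # Select the combination's columns, test each cell against 1,
            # and keep the rows where every selected column matched.
            count = int(table[list(combo)].eq(1).all(axis=1).sum())
            records.append({"comb": combo, "count": count})
    return pd.DataFrame(records)


res = count_combinations(df)
```

On these ten rows the sketch gives 5 for (bread, eggs) and 2 for (bread, eggs, flour), matching the output above. It evaluates one boolean column selection per combination, which may be more convenient than cell-by-cell lookups when the table has many rows and few columns.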

Regarding "python - How do I count how many times combinations occur in a binary table in Python?", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/68623807/
