
python - Joining two unique combinations from a single DataFrame and converting them into column names

Reposted. Author: 行者123. Updated: 2023-12-04 14:58:22

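I have an interesting problem that I tried to solve without success. I have a time-series DataFrame with 4 columns: Source, Target, timestamp, and values.

Each timestamp has multiple sources, targets, and values, as in the code below: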

import pandas as pd

# Each row is [Source, Target, timestamp, value]
data = [
    ['a','None','01.01.2020',20], ['a','None','02.01.2020',15], ['a','None','03.01.2020',11],
    ['a','b','01.01.2020',100], ['a','b','02.01.2020',105], ['a','b','03.01.2020',101],
    ['c','d','01.01.2020',0], ['c','d','02.01.2020',0], ['c','d','03.01.2020',1],
    ['b','c','01.01.2020',50.45], ['b','c','02.01.2020',10.5], ['b','c','03.01.2020',500],
    ['a','d','01.01.2020',5000], ['a','d','02.01.2020',1500], ['a','d','03.01.2020',25],
    ['c','a','01.01.2020',2.2538], ['c','a','02.01.2020',105], ['c','a','03.01.2020',110],
]

df = pd.DataFrame(data, columns=['Source', 'Target', 'timestamp', 'values'])

I want to return the data in a new format, as in this DataFrame:

resultdata = [['01.01.2020',20,100,0,50.45,5000,2.2538], ['02.01.2020',15,105,0,10.5, 1500,105],
['03.01.2020',11,101,1,500,25,110]]
result = pd.DataFrame(resultdata, columns = ['timestamp', 'aNone', 'ab', 'cd', 'bc', 'ad', 'ca'])

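To do this, I tried concatenating the string columns and dropping the duplicate timestamps, then iterating, but I only get the data from the last iteration, in dictionary form.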

# Build one label per (Source, Target) pair, then collect the unique labels and timestamps
df['Source Target'] = df['Source'] + ' ' + df['Target']
st = df['Source Target'].drop_duplicates(keep='first').reset_index(drop=True)
timestamp = df['timestamp'].drop_duplicates(keep='first').reset_index(drop=True)

d = {}
for j in range(len(timestamp)):
    Time = timestamp[j]
    for k in range(len(st)):
        Column = st[k]
        for i in range(len(df)):
            time = df['timestamp'][i]
            columnname = df['Source Target'][i]
            if time == Time and columnname == Column:
                # d[Column] is overwritten for every timestamp, so only the last one survives
                d[Column] = (time, df['values'][i])

Best answer

Let's try pivot_table instead:

import pandas as pd

data = [['a', 'None', '01.01.2020', 20], ['a', 'None', '02.01.2020', 15],
        ['a', 'None', '03.01.2020', 11], ['a', 'b', '01.01.2020', 100],
        ['a', 'b', '02.01.2020', 105], ['a', 'b', '03.01.2020', 101],
        ['c', 'd', '01.01.2020', 0], ['c', 'd', '02.01.2020', 0],
        ['c', 'd', '03.01.2020', 1], ['b', 'c', '01.01.2020', 50.45],
        ['b', 'c', '02.01.2020', 10.5], ['b', 'c', '03.01.2020', 500],
        ['a', 'd', '01.01.2020', 5000], ['a', 'd', '02.01.2020', 1500],
        ['a', 'd', '03.01.2020', 25], ['c', 'a', '01.01.2020', 2.2538],
        ['c', 'a', '02.01.2020', 105], ['c', 'a', '03.01.2020', 110]]

df = pd.DataFrame(data, columns=['Source', 'Target', 'timestamp', 'values'])

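# Create pivot table: one row per timestamp, one column per (Source, Target) pair
df = df.pivot_table(index='timestamp',
                    columns=['Source', 'Target'],
                    values='values').reset_index()

# Reduce multi-index columns to flat names, e.g. ('a', 'b') -> 'ab'
df.columns = df.columns.map(''.join)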

# Fix dtypes
df = df.convert_dtypes()

# For Display
print(df.to_string())

df:

    timestamp  aNone   ab    ad     bc      ca  cd
0  01.01.2020     20  100  5000  50.45  2.2538   0
1  02.01.2020     15  105  1500   10.5   105.0   0
2  03.01.2020     11  101    25  500.0   110.0   1
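
Note that the columns above come back sorted alphabetically (aNone, ab, ad, bc, ca, cd), while the layout requested in the question follows the order in which each pair first appears. If that order matters, one possible variation is sketched below. It is only a sketch on a shortened copy of the data: the names `pair`, `pair_order`, and `wide` are made up for illustration, and it swaps `pivot_table` for `pivot`, which raises an error on duplicate (timestamp, pair) combinations, whereas `pivot_table` averages them by default.

```python
import pandas as pd

# Shortened copy of the question's data (assumption: two timestamps are enough to illustrate)
data = [['a', 'None', '01.01.2020', 20], ['a', 'None', '02.01.2020', 15],
        ['a', 'b', '01.01.2020', 100], ['a', 'b', '02.01.2020', 105],
        ['c', 'd', '01.01.2020', 0], ['c', 'd', '02.01.2020', 0],
        ['b', 'c', '01.01.2020', 50.45], ['b', 'c', '02.01.2020', 10.5]]
df = pd.DataFrame(data, columns=['Source', 'Target', 'timestamp', 'values'])

# Combine the two key columns into one label, e.g. 'a' + 'b' -> 'ab'
df['pair'] = df['Source'] + df['Target']

# Remember the order in which each pair first appears in the input
pair_order = df['pair'].drop_duplicates().tolist()

# pivot() sorts the new columns; it raises if a (timestamp, pair) combination repeats
wide = df.pivot(index='timestamp', columns='pair', values='values')

# Restore first-appearance order, then turn the timestamp index back into a column
wide = wide[pair_order].reset_index()
wide.columns.name = None

print(wide.to_string())
```

The same reordering step could probably be applied after `pivot_table` as well, by indexing with the flattened column names in the desired order.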

Regarding "python - Joining two unique combinations from a single DataFrame and converting them into column names", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/67452590/
