
python - Pandas DataFrame: create a new column after every x rows

Reposted. Author: 太空宇宙. Updated: 2023-11-04 02:13:52

I am trying to create a new DataFrame from some data in a CSV file.

My data has the following form:

1, 81.99525117808678
2, 78.79210736916842
3, 69.33703048261454
4, 53.12612416937101
5, 48.8442549498639
6, 48.8442549498639
7, 38.96011640562207
8, 33.66251691693962
9, 29.202159649144907
10, 27.77726568480279
1, 81.99525117808678
2, 78.79210736916842
3, 69.33703048261454
4, 53.12612416937101
5, 48.8442549498639
6, 48.8442549498639
7, 38.96011640562207
8, 33.66251691693962
9, 29.202159649144907
10, 27.77726568480279

The first number is the index and the second is the value. I want to create a new column for each separate run. For example:

Index:       Run 1:             Run 2:
1, 81.99525117808678, 81.99525117808678
2, 78.79210736916842, 78.79210736916842
3, 69.33703048261454, 69.33703048261454
4, 53.12612416937101, 53.12612416937101
5, 48.8442549498639, 48.8442549498639
6, 48.8442549498639, 48.8442549498639
7, 38.96011640562207, 38.96011640562207
8, 33.66251691693962, 33.66251691693962
9, 29.202159649144907, 29.202159649144907
10, 27.77726568480279, 27.77726568480279

So far I have the following:

df = pd.read_csv(path, header=None, names=['Generation', 'Fitness'], index_col=0)

This produces:

0   
1 81.995251
2 78.792107
3 69.337030
4 53.126124
5 48.844255
6 48.844255
7 38.960116
8 33.662517
9 29.202160
10 27.777266
1 81.995251
2 78.792107
3 69.337030
4 53.126124
5 48.844255
6 48.844255
7 38.960116
8 33.662517
9 29.202160
10 27.777266

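Best answer

You can create a reader iterable by passing chunksize=10 to read_csv (see the docs for details), then concatenate the chunks along axis=1. Because every chunk has the same Generation index, the chunks line up side by side: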

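import pandas as pd

# Read the file in chunks of 10 rows; each chunk is one run
reader = pd.read_csv('data.csv', sep=',', chunksize=10,
                     index_col=0, header=None, names=['Generation', 'Fitness'])

# Place the chunks side by side, aligned on the Generation index
my_df = pd.concat((chunk for chunk in reader), axis=1)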

>>> my_df
Fitness Fitness
Generation
1 81.995251 81.995251
2 78.792107 78.792107
3 69.337030 69.337030
4 53.126124 53.126124
5 48.844255 48.844255
6 48.844255 48.844255
7 38.960116 38.960116
8 33.662517 33.662517
9 29.202160 29.202160
10 27.777266 27.777266
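The chunked read above relies on every run having exactly 10 rows. As a hedged alternative for when the run length is not known in advance, the sketch below numbers the runs with groupby(...).cumcount() and then pivots. It is a different technique from the answer's, and it assumes each Generation value appears exactly once per run. The names csv_text, df_long and wide are illustrative, and the data is a shortened three-row sample of the question's values read from an in-memory string.

```python
import io

import pandas as pd

# Shortened sample: two runs of three generations each (illustrative only)
csv_text = (
    "1, 81.99525117808678\n"
    "2, 78.79210736916842\n"
    "3, 69.33703048261454\n"
    "1, 81.99525117808678\n"
    "2, 78.79210736916842\n"
    "3, 69.33703048261454\n"
)

df_long = pd.read_csv(io.StringIO(csv_text), header=None,
                      names=['Generation', 'Fitness'],
                      skipinitialspace=True)

# The n-th occurrence of a Generation value belongs to run n
# (assumes each Generation appears exactly once per run)
df_long['Run'] = df_long.groupby('Generation').cumcount() + 1

# One row per Generation, one column per run
wide = df_long.pivot(index='Generation', columns='Run', values='Fitness')
wide = wide.add_prefix('Run ')

print(wide)
```

This trades the streaming read for a single in-memory pass, so it suits files that fit in memory.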

If you need column names, you can rename them with a list comprehension:

# Python 3.6 or above (f-strings)
my_df.columns = [f'Run {i}' for i, _ in enumerate(my_df.columns, 1)]
# Or, with str.format:
my_df.columns = ['Run {}'.format(i) for i, _ in enumerate(my_df.columns, 1)]
# Or, number the columns of my_df first and then add a prefix:
my_df.columns = range(1, len(my_df.columns) + 1)
my_df = my_df.add_prefix('Run ')


>>> my_df
Run 1 Run 2
Generation
1 81.995251 81.995251
2 78.792107 78.792107
3 69.337030 69.337030
4 53.126124 53.126124
5 48.844255 48.844255
6 48.844255 48.844255
7 38.960116 38.960116
8 33.662517 33.662517
9 29.202160 29.202160
10 27.777266 27.777266
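The run labels can also be assigned during concatenation instead of afterwards. The following is a minimal sketch, not part of the original answer: it selects the Fitness column of each chunk as a Series and passes the labels through the keys argument of pd.concat. The names csv_text, runs, labels and named_df are illustrative, and the in-memory sample uses a chunk size of 3 rather than 10 to stay short.

```python
import io

import pandas as pd

# Shortened sample: two runs of three generations each (illustrative only)
csv_text = (
    "1, 81.99525117808678\n"
    "2, 78.79210736916842\n"
    "3, 69.33703048261454\n"
    "1, 81.99525117808678\n"
    "2, 78.79210736916842\n"
    "3, 69.33703048261454\n"
)

reader = pd.read_csv(io.StringIO(csv_text), header=None,
                     names=['Generation', 'Fitness'],
                     index_col=0, chunksize=3)

# Keep each run as a Series so that keys become plain column names
runs = [chunk['Fitness'] for chunk in reader]
labels = [f'Run {i}' for i in range(1, len(runs) + 1)]

named_df = pd.concat(runs, axis=1, keys=labels)

print(named_df)
```

Collecting the chunks into a list first is needed here because the number of labels depends on how many chunks the reader yields.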

This question and answer (python - Pandas DataFrame: create a new column after every x rows) comes from a similar question on Stack Overflow: https://stackoverflow.com/questions/53087524/
