
Python Pandas : Create dataframe from Excel file with multi (merged cell) headers

Reposted. Author: 行者123. Updated: 2023-12-04 20:26:39

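I'm fairly new to Python (pandas) and would like to use it to automate Excel tasks and be more productive :)

Right now I'm working with the Excel sales report below, in which each "Year" is a merged cell.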

           |               2018                          |              2019                      |
| Product | January | February | March | April | January | February | March | April |
| A | 8 | 10 | 65 | 50 | 8 | 10 | 65 | 50 |
| B | 9 | 10 | 65 | 50 | 8 | 63 | 65 | 50 |
| C | 7 | 10 | 65 | 50 | 8 | 10 | 65 | 50 |
| D | 8 | 10 | 65 | 50 | 8 | 10 | 65 | 50 |

Now I would like to reshape the report into a stacked (long) format that I can write back to Excel and use for further analysis:
Product  |  Year  |  Month  |  Values
A | 2018 | January | 8
B | 2018 | January | 9

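My idea was to create a dataframe and use pd.melt().

Unfortunately, I already fail at the first step, creating the dataframe.

Each year appears in only one column header (two in total); all the other headers show up as "Unnamed: x".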
import pandas as pd

# widen the console output so all columns are printed
desired_width = 320
pd.set_option("display.width", desired_width)
pd.set_option("display.max_columns", 30)

# Read the Excel file and create a dataframe
df = pd.read_excel("Stackoverflow_example.xlsx")

print(df)




   Unnamed: 0     2018  Unnamed: 2  Unnamed: 3  Unnamed: 4     2019  Unnamed: 6  Unnamed: 7  Unnamed: 8
0     Product  January    February       March       April  January    February       March       April
1           A        8          10          65          50        8          10          65          50
2           B        9          10          65          50        8          63          65          50
3           C        7          10          65          50        8          10          65          50
4           D        8          10          65          50        8          10          65          50

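It would be great if someone could help me with this.

Many thanks in advance.

Edit:

Adding header=[0,1], index_col=[0] works, but I'm still struggling to find a way to convert the result into the stacked format: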
import pandas as pd

desired_width = 320
pd.set_option("display.width", desired_width)
pd.set_option("display.max_columns", 30)

df = pd.read_excel("Stackoverflow_example.xlsx", header=[0,1], index_col=[0])

print(df)

----------------------------------------------------------------------

           2018                         2019
Product January February March April January February March April
A             8       10    65    50       8       10    65    50
B             9       10    65    50       8       63    65    50
C             7       10    65    50       8       10    65    50
D             8       10    65    50       8       10    65    50


Stacking works, but it also messes up the column header names (level_0 appears, and "Product" ends up as the header of the month column):

import pandas as pd

desired_width = 320
pd.set_option("display.width", desired_width)
pd.set_option("display.max_columns", 30)

df = pd.read_excel("Stackoverflow_example.xlsx", header=[0,1], index_col=[0])
df = df.stack().reset_index()

print(df)

-----------------------------------------------------------------------------
    level_0   Product  2018  2019
0         A     April    50    50
1         A  February    10    10
2         A   January     8     8
3         A     March    65    65
4         B     April    50    50
5         B  February    10    63
6         B   January     9     8
7         B     March    65    65
8         C     April    50    50
9         C  February    10    10
10        C   January     7     8
11        C     March    65    65
12        D     April    50    50
13        D  February    10    10
14        D   January     8     8
15        D     March    65    65
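
A likely reason for these labels: stack() moves only the innermost column level (the months) into the index, so the years stay behind as columns. The row index has no name, so reset_index() falls back to level_0, and the month level carries the name "Product" because that is the header text pandas picked up for the second column level. The sketch below assumes a hand-built stand-in for the sheet (the names months, df, named and half_long are illustrative, not from the post) and shows that naming the axes before stacking gives readable labels:

```python
import pandas as pd

months = ["January", "February", "March", "April"]

# Stand-in for read_excel(..., header=[0, 1], index_col=[0]); the level names
# [None, "Product"] mirror the printed output in the question (assumption).
columns = pd.MultiIndex.from_product([[2018, 2019], months], names=[None, "Product"])
df = pd.DataFrame(
    [
        [8, 10, 65, 50, 8, 10, 65, 50],
        [9, 10, 65, 50, 8, 63, 65, 50],
        [7, 10, 65, 50, 8, 10, 65, 50],
        [8, 10, 65, 50, 8, 10, 65, 50],
    ],
    index=["A", "B", "C", "D"],
    columns=columns,
)

# Name the row index and both column levels, then stack only the month level.
named = df.rename_axis(index="Product", columns=["Year", "Month"])
half_long = named.stack("Month").reset_index()

print(half_long.head())
```

The years are still separate columns here, so this only fixes the labels; the frame is not yet in the fully stacked Product / Year / Month / Value layout.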


I tried renaming the columns and setting the index to "Product", which leaves the "cells" below "2018 2019" empty:
import pandas as pd

desired_width = 320
pd.set_option("display.width", desired_width)
pd.set_option("display.max_columns", 30)

df = pd.read_excel("Stackoverflow_example.xlsx", header=[0,1], index_col=[0])
df = df.stack().reset_index()

df.columns = ["Product", "Month", "2018", "2019"]
df = df.set_index("Product")

print(df)

----------------------------------------------------------

            Month  2018  2019
Product
A           April    50    50
A        February    10    10
A         January     8     8
A           March    65    65
B           April    50    50
B        February    10    63
B         January     9     8
B           March    65    65
C           April    50    50
C        February    10    10
C         January     7     8
C           March    65    65
D           April    50    50
D        February    10    10
D         January     8     8
D           March    65    65

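Best answer

First, pass header=[0,1] so that the two header rows are read as a MultiIndex in the columns, and pass index_col=[0] so that the first column becomes the index instead of being pulled into that MultiIndex: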

df = pd.read_excel("Stackoverflow_example.xlsx", header=[0,1], index_col=[0])
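
To see what these two arguments produce without opening the workbook, here is a minimal sketch that builds a comparable frame by hand. It assumes the level names shown in the question's printout (top level unnamed, second level called "Product"); the variable names are illustrative.

```python
import pandas as pd

months = ["January", "February", "March", "April"]

# Hand-built stand-in for the frame returned by
# read_excel(..., header=[0, 1], index_col=[0]) (assumption, not read from a file).
columns = pd.MultiIndex.from_product([[2018, 2019], months], names=[None, "Product"])
df = pd.DataFrame(
    [
        [8, 10, 65, 50, 8, 10, 65, 50],
        [9, 10, 65, 50, 8, 63, 65, 50],
        [7, 10, 65, 50, 8, 10, 65, 50],
        [8, 10, 65, 50, 8, 10, 65, 50],
    ],
    index=["A", "B", "C", "D"],
    columns=columns,
)

print(df.columns.nlevels)                # number of header levels
print(list(df.columns.names))            # names of the two column levels
print(df[2018])                          # selecting a year returns its month columns
print(df.loc["B", (2019, "February")])   # a single cell is addressed by (year, month)
```

Each column is now identified by a (year, month) pair, which is what makes the reshaping step possible.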

Then reshape with DataFrame.unstack, set the index level names with Series.rename_axis, and finally turn the Series into a DataFrame with Series.reset_index:
df = df.unstack().rename_axis(('Year','Month','Product')).reset_index(name='Value')

# if the order of the columns matters, select them in that order
df = df[['Product','Year','Month','Value']]
print(df.head())

   Product  Year     Month  Value
0        A  2018   January      8
1        B  2018   January      9
2        C  2018   January      7
3        D  2018   January      8
4        A  2018  February     10
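
The chain can also be tried without the Excel file. The following is a sketch under the assumption that the sheet is replaced by an in-memory frame with the same values; the names wide and long are hypothetical, and rename_axis is given a list rather than a tuple, which should behave the same way.

```python
import pandas as pd

months = ["January", "February", "March", "April"]

# In-memory stand-in for the sales report (assumption: same values as the sheet).
wide = pd.DataFrame(
    [
        [8, 10, 65, 50, 8, 10, 65, 50],
        [9, 10, 65, 50, 8, 63, 65, 50],
        [7, 10, 65, 50, 8, 10, 65, 50],
        [8, 10, 65, 50, 8, 10, 65, 50],
    ],
    index=["A", "B", "C", "D"],
    columns=pd.MultiIndex.from_product([[2018, 2019], months]),
)

# unstack() on a frame with a plain row index returns a Series whose index is
# (column level 0, column level 1, row label), i.e. (year, month, product).
long = (
    wide.unstack()
    .rename_axis(["Year", "Month", "Product"])
    .reset_index(name="Value")
)
long = long[["Product", "Year", "Month", "Value"]]

print(long.head())
```

From here, long.to_excel("stacked.xlsx", index=False) would write the result back to Excel, provided an Excel writer engine such as openpyxl is installed; the output file name is only an example.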

Regarding "Python Pandas : Create dataframe from Excel file with multi (merged cell) headers", the original question is on Stack Overflow: https://stackoverflow.com/questions/58914730/
