
python - How do I add rows to a DataFrame from a list in pandas?

Reposted. Author: 太空宇宙. Updated: 2023-11-04 05:14:36

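I have yearly information (COUNT) for countries stored in a DataFrame. However, some countries are missing in certain years.

If I have a complete list of countries, what is the best way to add them under the corresponding years and fill the missing COUNT values with 0?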

     DATE         COUNTRY  COUNTRY_ID  COUNT
  0  1980   United States         840     42
 42  1980  Czech Republic         203      2
 95  1980         Hungary         348      1
 96  1980   Great Britain         826      1
 97  1980    South Africa         710      1
 98  1982   United States         840     42
140  1982        Paraguay         600      2
.
.
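
For experimenting, the visible rows of the table can be rebuilt as a small DataFrame. This is only a sketch: it contains the seven rows shown above and none of the elided ones, and the variable name `df` is an assumption chosen to match the code in the answer.

```python
import pandas as pd

# Only the seven rows visible in the question; the elided rows are not reproduced.
df = pd.DataFrame(
    {
        "DATE": [1980, 1980, 1980, 1980, 1980, 1982, 1982],
        "COUNTRY": [
            "United States", "Czech Republic", "Hungary", "Great Britain",
            "South Africa", "United States", "Paraguay",
        ],
        "COUNTRY_ID": [840, 203, 348, 826, 710, 840, 600],
        "COUNT": [42, 2, 1, 1, 1, 42, 2],
    },
    # Keep the original (non-contiguous) row labels from the question.
    index=[0, 42, 95, 96, 97, 98, 140],
)

print(df)
```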

Best answer

One approach is to build every DATE, COUNTRY combination, reindex the DataFrame against that full set, and then fill in the missing values.

import pandas as pd

# Assume we want every year in the range, not just the years that appear in the data
years = range(df['DATE'].min(), df['DATE'].max()+1)

# get all combinations
idx = pd.MultiIndex.from_product([years, df['COUNTRY'].unique()], names=['DATE', 'COUNTRY'])

# reindex by first putting DATE and COUNTRY into the index
df1 = df.set_index(['DATE', 'COUNTRY']).reindex(idx).reset_index()

# Fill the missing IDs back in from a lookup with one row per country
country_map = df.drop_duplicates('COUNTRY').set_index('COUNTRY')['COUNTRY_ID']
df1['COUNTRY_ID'] = df1['COUNTRY'].map(country_map)

# fill in 0 for COUNT and convert back to int
df1['COUNT'] = df1['COUNT'].fillna(0).astype(int)

    DATE         COUNTRY  COUNTRY_ID  COUNT
 0  1980   United States         840     42
 1  1980  Czech Republic         203      2
 2  1980         Hungary         348      1
 3  1980   Great Britain         826      1
 4  1980    South Africa         710      1
 5  1980        Paraguay         600      0
 6  1981   United States         840      0
 7  1981  Czech Republic         203      0
 8  1981         Hungary         348      0
 9  1981   Great Britain         826      0
10  1981    South Africa         710      0
11  1981        Paraguay         600      0
12  1982   United States         840     42
13  1982  Czech Republic         203      0
14  1982         Hungary         348      0
15  1982   Great Britain         826      0
16  1982    South Africa         710      0
17  1982        Paraguay         600      2
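
The answer derives the countries from the data itself, while the question mentions having a complete list of countries. A minimal sketch of that variant follows, under some assumptions: `country_ids` is a hypothetical name-to-ID lookup standing in for the complete list, the "Canada" entry is an illustrative addition that does not appear in the original data, and only the years already present in the data are expanded (not every year in the range). It also assumes each (DATE, COUNTRY) pair occurs at most once, since `reindex` needs a unique index.

```python
import pandas as pd

# Sample rows taken from the question (elided rows are not reproduced).
df = pd.DataFrame({
    "DATE": [1980, 1980, 1980, 1980, 1980, 1982, 1982],
    "COUNTRY": ["United States", "Czech Republic", "Hungary", "Great Britain",
                "South Africa", "United States", "Paraguay"],
    "COUNTRY_ID": [840, 203, 348, 826, 710, 840, 600],
    "COUNT": [42, 2, 1, 1, 1, 42, 2],
})

# Hypothetical complete country list (name -> ID). "Canada" is an assumption
# added for illustration; it never occurs in the original data.
country_ids = {
    "United States": 840, "Czech Republic": 203, "Hungary": 348,
    "Great Britain": 826, "South Africa": 710, "Paraguay": 600,
    "Canada": 124,
}

# Only the years that already occur in the data.
years = sorted(df["DATE"].unique())

# Every (year, country) pair built from the supplied list.
idx = pd.MultiIndex.from_product([years, list(country_ids)],
                                 names=["DATE", "COUNTRY"])

# Reindex the COUNT column; fill_value=0 avoids NaN, so COUNT stays integer.
result = (
    df.set_index(["DATE", "COUNTRY"])["COUNT"]
      .reindex(idx, fill_value=0)
      .reset_index()
)

# Attach IDs from the supplied lookup rather than from the data.
result["COUNTRY_ID"] = result["COUNTRY"].map(country_ids)

print(result)
```

Passing `fill_value=0` to `reindex` fills the new rows directly, so no separate `fillna(0).astype(int)` step is needed for COUNT in this variant.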

Regarding "python - How do I add rows to a DataFrame from a list in pandas?", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/42101376/
