
python - Concatenate two Pandas DataFrames and reorder the columns

Reposted. Author: 行者123. Updated: 2023-11-28 18:56:34

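I have two DataFrames (df1 and df2, shown below) whose columns differ in both order and count. I need to append both DataFrames to an Excel file, and the column order must follow the Col_list given below.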

df1 is:

 durable_medical_equipment    pcp  specialist  diagnostic  imaging  generic  formulary_brand  non_preferred_generic  emergency_room  inpatient_facility  medical_deductible_single  medical_deductible_family  maximum_out_of_pocket_limit_single  maximum_out_of_pocket_limit_family plan_name      pdf_name
0 False False False False False False False False False False False False False False ABCBCBC adjnajdn.pdf

df2 is:

   pcp  specialist  generic  formulary_brand  emergency_room  urgent_care  inpatient_facility  durable_medical_equipment  medical_deductible_single  medical_deductible_family  maximum_out_of_pocket_limit_single  maximum_out_of_pocket_limit_family plan_name      pdf_name
0 True True False False True True True True True True True True ABCBCBC adjnajdn.pdf

I am creating a list of columns in the same order as the columns in the Excel file.

Col_list = ['durable_medical_equipment', 'pcp', 'specialist', 'diagnostic',
'imaging', 'generic', 'formulary_brand', 'non_preferred_generic',
'emergency_room', 'inpatient_facility', 'medical_deductible_single',
'medical_deductible_family', 'maximum_out_of_pocket_limit_single', 'maximum_out_of_pocket_limit_family',
'urgent_care', 'plan_name', 'pdf_name']

I am trying to use concat() to reorder my DataFrame according to Col_list. For columns that do not exist in a DataFrame, the value can be NaN.

result = pd.concat([df, pd.DataFrame(columns=list(Col_list))])

This does not work correctly. How can I achieve this reordering?

I tried the following:

result = pd.concat([df_repo, pd.DataFrame(columns=list(Col_list))], sort=False, ignore_index=True)
print(result.to_string())

The output I get is:

 durable_medical_equipment    pcp specialist diagnostic imaging generic formulary_brand non_preferred_generic emergency_room inpatient_facility medical_deductible_single medical_deductible_family maximum_out_of_pocket_limit_single maximum_out_of_pocket_limit_family plan_name      pdf_name urgent_care
0 False False False False False False False False False False False False False False ABCBCBC adjnajdn.pdf NaN
pcp specialist generic formulary_brand emergency_room urgent_care inpatient_facility durable_medical_equipment medical_deductible_single medical_deductible_family maximum_out_of_pocket_limit_single maximum_out_of_pocket_limit_family plan_name pdf_name diagnostic imaging non_preferred_generic
0 True True False False True True True True True True True True ABCBCBC adjnajdn.pdf NaN NaN NaN

Best answer

If you need to change the column order to match the values in the list, call DataFrame.reindex on each DataFrame and pass the results to concat:

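# Reindex each frame to the Col_list order (missing columns are filled with NaN), then stack the rows
df = pd.concat([df1.reindex(Col_list, axis=1),
                df2.reindex(Col_list, axis=1)], sort=False, ignore_index=True)
print(df)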
durable_medical_equipment pcp specialist diagnostic imaging generic \
0 False False False 0.0 0.0 False
1 True True True NaN NaN False

formulary_brand non_preferred_generic emergency_room inpatient_facility \
0 False 0.0 False False
1 False NaN True True

medical_deductible_single medical_deductible_family \
0 False False
1 True True

maximum_out_of_pocket_limit_single maximum_out_of_pocket_limit_family \
0 False False
1 True True

urgent_care plan_name pdf_name
0 NaN ABCBCBC adjnajdn.pdf
1 1.0 ABCBCBC adjnajdn.pdf
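
To try the approach without the original data, here is a minimal self-contained sketch. The two small frames and the shortened `col_list` are hypothetical stand-ins for df1, df2 and Col_list, not the asker's real data. It shows the reindex-then-concat form from the answer, plus an equivalent variant (not part of the original answer) that concatenates first and reindexes once.

```python
import pandas as pd

# Hypothetical, shortened target column order (stand-in for Col_list).
col_list = ["durable_medical_equipment", "pcp", "diagnostic",
            "urgent_care", "plan_name"]

# Hypothetical stand-ins for df1 and df2: different column order and count.
df1 = pd.DataFrame({
    "durable_medical_equipment": [False],
    "pcp": [False],
    "diagnostic": [False],
    "plan_name": ["ABCBCBC"],
})
df2 = pd.DataFrame({
    "pcp": [True],
    "urgent_care": [True],
    "durable_medical_equipment": [True],
    "plan_name": ["ABCBCBC"],
})

# Form used in the answer: align each frame to col_list, then stack the rows.
# Columns a frame lacks are created and filled with NaN.
frames = [frame.reindex(columns=col_list) for frame in (df1, df2)]
result = pd.concat(frames, ignore_index=True)

# Variant: stack the rows first, then put the columns in order once.
result_alt = pd.concat([df1, df2], ignore_index=True).reindex(columns=col_list)

print(result.columns.tolist())
print(result_alt.columns.tolist())
```

Both forms end with the columns in `col_list` order. The resulting dtype of columns that mix booleans with NaN can differ between pandas versions, which is why values may print as 0.0/1.0 rather than False/True.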

For "python - Concatenate two Pandas DataFrames and reorder the columns", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/57625316/
