
python pandas: column dtype = object causes merge to fail: DtypeWarning: Columns have mixed types

Reposted. Author: 太空狗. Updated: 2023-10-30 01:05:26

I am trying to merge two DataFrames, df1 and df2, on the Customer_ID column. Customer_ID appears to have the same dtype (object) in both.

df1:

Customer_ID | Flag
12345       | A

df2:

Customer_ID | Transaction_Value
12345       | 258478

When I merge the two tables:

new_df = df2.merge(df1, on='Customer_ID', how='left')

It works for some Customer_ID values but not for others. For this example, I get this result:

Customer_ID | Transaction_Value | Flag
12345       | 258478            | NaN

I checked the dtypes, and they are the same:

df1.info()
<class 'pandas.core.frame.DataFrame'>
Int64Index: 873353 entries, 0 to 873352
Data columns (total 2 columns):
Customer_ID 873353 non-null object
Flag 873353 non-null object
dtypes: object(2)
memory usage: 20.0+ MB

df2.info()
<class 'pandas.core.frame.DataFrame'>
Int64Index: 873353 entries, 0 to 873352
Data columns (total 2 columns):
Customer_ID 873353 non-null object
Transaction_Value 873353 non-null int64
dtypes: int64(1), object(1)
memory usage: 20.0+ MB

When I loaded df1, I did get this message:

C:\Users\xxx\AppData\Local\Continuum\Anaconda2\lib\site-packages\IPython\core\interactiveshell.py:2717: DtypeWarning: Columns (1) have mixed types. Specify dtype option on import or set low_memory=False.
interactivity=interactivity, compiler=compiler, result=result)

When I wanted to check whether a given customer ID exists, I realized I had to specify it differently in the two DataFrames:

df1.loc[df1['Customer_ID'] == 12345]

df2.loc[df2['Customer_ID'] == '12345']

Best answer

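Customer_ID has dtype == object in both cases, but that does not mean the individual elements are the same type. An object column can hold any Python objects: here df1 holds integers and df2 holds strings, so 12345 never equals '12345' and the merge finds no match. You need to make both columns either str or int.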
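One way to confirm the mismatch is to look at the Python type of each element instead of the column dtype. The sketch below uses small made-up frames (the IDs, flags, and values are assumptions standing in for the real data) and counts element types with `map(type)`.

```python
import pandas as pd

# Hypothetical stand-ins for the frames in the question:
# df1 stores integer IDs and df2 stores string IDs, both in object columns.
df1 = pd.DataFrame({
    "Customer_ID": pd.Series([12345, 67890], dtype=object),
    "Flag": ["A", "B"],
})
df2 = pd.DataFrame({
    "Customer_ID": pd.Series(["12345", "67890"], dtype=object),
    "Transaction_Value": [258478, 1000],
})

# Both columns report the same dtype...
print(df1["Customer_ID"].dtype, df2["Customer_ID"].dtype)

# ...but the Python type of the elements differs between the frames.
types_df1 = df1["Customer_ID"].map(type).value_counts()
types_df2 = df2["Customer_ID"].map(type).value_counts()
print(types_df1)
print(types_df2)

# Without conversion no key matches, so Flag is missing in every row.
unmatched = df2.merge(df1, on="Customer_ID", how="left")
print(unmatched)
```

If a single column reports more than one element type, that column itself is mixed, which is what the DtypeWarning in the question hints at.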


Using int

dtype = dict(Customer_ID=int)

df1.astype(dtype).merge(df2.astype(dtype), on='Customer_ID', how='left')

Customer_ID Flag Transaction_Value
0 12345 A 258478

Using str

dtype = dict(Customer_ID=str)

df1.astype(dtype).merge(df2.astype(dtype), on='Customer_ID', how='left')

Customer_ID Flag Transaction_Value
0 12345 A 258478
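The DtypeWarning in the question itself suggests specifying the dtype option on import, which avoids the mixed column in the first place. The following is a minimal sketch of that idea, assuming the data comes from CSV files; the in-memory CSV contents are hypothetical, since the question does not show the real files.

```python
import io

import pandas as pd

# Hypothetical CSV contents standing in for the real files.
csv1 = io.StringIO("Customer_ID,Flag\n12345,A\n00042,B\n")
csv2 = io.StringIO("Customer_ID,Transaction_Value\n12345,258478\n00042,10\n")

# Forcing str at load time makes every ID a string in both frames
# and keeps leading zeros that an int conversion would drop.
df1 = pd.read_csv(csv1, dtype={"Customer_ID": str})
df2 = pd.read_csv(csv2, dtype={"Customer_ID": str})

# indicator=True adds a _merge column that shows which rows found a match.
new_df = df2.merge(df1, on="Customer_ID", how="left", indicator=True)
print(new_df)
```

Reading the IDs as str is usually the safer choice when they are identifiers and not quantities; converting to int would fail on missing or non-numeric values.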

Regarding "python pandas: column dtype = object causes merge to fail: DtypeWarning: Columns have mixed types", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/44639772/
