
python - Pandas SettingWithCopyWarning with no obvious cause

Reposted. Author: 太空宇宙. Updated: 2023-11-04 04:06:28

Consider the following example code:

import pandas as pd
import numpy as np

pd.set_option('display.expand_frame_repr', False)
foo = pd.read_csv("foo2.csv", skipinitialspace=True, index_col='Index')
foo.loc[:, 'Date'] = pd.to_datetime(foo.Date)

for i in range(0, len(foo)-1):
    if foo.at[i, 'Type'] == 'Reservation':
        for j in range(i+1, len(foo)):
            if foo.at[j, 'Type'] == 'Payout':
                foo.at[j, 'Nights'] = foo.at[i, 'Nights']
                break

mask = (foo['Date'] >= '2018-03-31') & (foo['Date'] <= '2019-03-31')
foo2019 = foo.loc[mask]
foopayouts2019 = foo2019.loc[foo2019['Type'] == 'Payout']
foopayouts2019.loc[:, 'Nights'] = foopayouts2019['Nights'].apply(np.int64)
# foopayouts2019.loc[:, 'Nights'] = foopayouts2019['Nights'].astype(np.int64, copy=False)

foo2.csv contains:

Index,Date,Type,Nights,Amount,Payout
0,03/07/2018,Reservation,2.0,1000.00,
1,03/07/2018,Payout,,,1000.00
2,09/11/2018,Reservation,3.0,1500.00,
3,09/11/2018,Payout,,,1500.00
4,02/16/2019,Reservation,2.0,2000.00,
5,02/16/2019,Payout,,,2000.00
6,04/25/2019,Reservation,7.0,1200.00,
7,04/25/2019,Payout,,,1200.00

This gives the following warning:

/usr/lib/python2.7/dist-packages/pandas/core/indexing.py:543: SettingWithCopyWarning: 
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-view-versus-copy
self.obj[item] = s

The warning does not mention a line number in my script, but it appears to come from the following line:

foopayouts2019.loc[:, 'Nights'] = foopayouts2019['Nights'].apply(np.int64)

At least, if I comment out that line, the warning goes away. So, I have two questions.

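  1. What is causing this warning? I have been trying to use .loc wherever appropriate, including on the line the warning (probably) comes from. If the problem actually lies earlier in the code, where is it?
  2. Second, which is better, .apply or .astype, as in the following lines of code?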

    foopayouts2019.loc[:, 'Nights'] = foopayouts2019['Nights'].apply(np.int64)
    # foopayouts2019.loc[:, 'Nights'] = foopayouts2019['Nights'].astype(np.int64, copy=False)

    Apart from that warning, they both seem to work.

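Best answer

I would change a few things in the code:

We check whether the current row is a Reservation and the next row is a Payout by using shift(), then use np.where() to take the ffill-ed (forward-filled) Nights values on the rows that need them: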

foo.Date = pd.to_datetime(foo.Date)  # convert to datetime
c = foo.Type.eq('Reservation') & foo.Type.shift(-1).eq('Payout')
foo.Nights = np.where(~c, foo.Nights.ffill(), foo.Nights)  # np.where replaces the if/else loop

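Or, checking from the Payout row instead (previous row is a Reservation, current row is a Payout):

c = foo.Type.shift().eq('Reservation') & foo.Type.eq('Payout')
foo.Nights = np.where(c, foo.Nights.ffill(), foo.Nights)  # assign the result back to the column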
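
As a minimal sketch, the fill step can be run end to end on the sample data from the question. It inlines foo2.csv through io.StringIO instead of reading the file, reads Index as an ordinary column, and uses a hypothetical name, is_paired_payout, for the condition; none of these choices come from the original answer.

```python
import io

import numpy as np
import pandas as pd

# Sample data copied from foo2.csv in the question.
CSV = """Index,Date,Type,Nights,Amount,Payout
0,03/07/2018,Reservation,2.0,1000.00,
1,03/07/2018,Payout,,,1000.00
2,09/11/2018,Reservation,3.0,1500.00,
3,09/11/2018,Payout,,,1500.00
4,02/16/2019,Reservation,2.0,2000.00,
5,02/16/2019,Payout,,,2000.00
6,04/25/2019,Reservation,7.0,1200.00,
7,04/25/2019,Payout,,,1200.00
"""

foo = pd.read_csv(io.StringIO(CSV), skipinitialspace=True)
foo["Date"] = pd.to_datetime(foo["Date"], format="%m/%d/%Y")

# True on a Payout row whose previous row is a Reservation
# (is_paired_payout is a hypothetical name used only in this sketch).
is_paired_payout = foo["Type"].shift().eq("Reservation") & foo["Type"].eq("Payout")

# On those rows take the forward-filled Nights; elsewhere keep the original value.
foo["Nights"] = np.where(is_paired_payout, foo["Nights"].ffill(), foo["Nights"])

print(foo[["Type", "Nights"]])
```

Note that this only pairs a Payout with a Reservation on the row directly above it, which matches the sample data. The nested loop in the question would also reach a Payout further down.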

Then use Series.between() to check whether the date falls between the two dates:

foo2019 = foo[foo.Date.between('2018-03-31', '2019-03-31')].copy()  # changed: between() plus .copy()
foopayouts2019 = foo2019[foo2019['Type'] == 'Payout'].copy()  # changed: .copy() added

Or directly:

foopayouts2019 = foo[foo.Date.between('2018-03-31', '2019-03-31') & foo.Type.eq('Payout')].copy()

foopayouts2019.loc[:, 'Nights'] = foopayouts2019['Nights'].apply(np.int64)  # or .astype(int)

   Index       Date    Type  Nights  Amount  Payout
3      3 2018-09-11  Payout       3     NaN  1500.0
5      5 2019-02-16  Payout       2     NaN  2000.0
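
The .copy() calls are what address the warning. In the question, foopayouts2019 was obtained by indexing foo2019, which was itself obtained by indexing foo, so pandas could not tell whether the later assignment was meant to reach the original frame. The following is a small sketch of that idea and of the .apply versus .astype comparison. The four-row frame is hypothetical stand-in data, and the names in_range, payouts, via_apply and via_astype are made up for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for foo after the Nights column has been filled.
foo = pd.DataFrame(
    {
        "Date": pd.to_datetime(
            ["2018-03-07", "2018-09-11", "2019-02-16", "2019-04-25"]
        ),
        "Type": ["Payout"] * 4,
        "Nights": [2.0, 3.0, 2.0, 7.0],
    }
)

# between() is inclusive on both ends by default.
in_range = foo["Date"].between("2018-03-31", "2019-03-31")

# .copy() makes the subset an independent DataFrame, so assigning to it
# is unambiguous and never touches foo.
payouts = foo[in_range & foo["Type"].eq("Payout")].copy()

# Both conversions give the same integers; astype converts the whole
# column at once, while apply calls np.int64 on each element.
via_apply = payouts["Nights"].apply(np.int64)
via_astype = payouts["Nights"].astype(np.int64)

# Replace the column outright so the new integer dtype is kept.
payouts["Nights"] = via_astype

print(payouts.dtypes)
print(foo.dtypes)
```

For a plain dtype conversion, .astype is usually the more direct choice. The sketch assigns with payouts["Nights"] = ... rather than .loc[:, "Nights"] = ... because, as far as I know, recent pandas versions may write .loc[:, col] assignments into the existing column and keep its float dtype; treat that as an assumption to check against the version in use.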

This question and answer come from Stack Overflow, "python - Pandas SettingWithCopyWarning with no obvious cause": https://stackoverflow.com/questions/57270418/
