
python - pandas conditional statement problem

Repost. Author: 行者123. Updated: 2023-12-01 03:17:14

Hi, I am processing my data.

I want to extract data using conditional statements.

Here is my code.

# -*- coding: utf-8 -*-
import pandas as pd
import numpy as np
import os

join_file = r'D:\handling data\complete data\조인\after_join.csv'
pwd = os.getcwd()
os.chdir(os.path.dirname(join_file))
join_data = pd.read_csv(os.path.basename(join_file), sep=',', encoding='utf-8')

print(join_data.head())

[Image: output of join_data.head()]

join_data['cluster_z'] = 4 # both declining
join_data['cluster_z'][((join_data['cluster_x'] == 3 | join_data['cluster_x'] == 2 | join_data['cluster_x'] == 4 )
& (join_data['cluster_y'] == 3 | join_data['cluster_y'] == 1))] = 1 # both rising

join_data['cluster_z'][((join_data['cluster_x'] == 1 | join_data['cluster_x'] == 5)
& (join_data['cluster_y'] == 3 | join_data['cluster_y'] == 1))] = 2 # overall declining, per-store rising

join_data['cluster_z'][((join_data['cluster_x'] == 3 | join_data['cluster_x'] == 2 | join_data['cluster_x'] == 4 )
& (join_data['cluster_y'] == 2 | join_data['cluster_y'] == 4))] = 3 # overall rising, per-store declining

print(join_data.head())

After the second print(join_data.head()) runs, I get the error shown in the image.

[Image: screenshot of the error]

How can I fix it? Thanks in advance.

Best answer

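It looks like you omitted many parentheses between the conditions. Each comparison needs its own pair, because | binds more tightly than ==. It is also better to use loc: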

Original:

join_data['cluster_z'][((join_data['cluster_x'] == 3 |
join_data['cluster_x'] == 2 |
join_data['cluster_x'] == 4 ) &
(join_data['cluster_y'] == 3 |
join_data['cluster_y'] == 1))] = 1

Change to:

join_data.loc[
((join_data['cluster_x'] == 3) |
(join_data['cluster_x'] == 2) |
(join_data['cluster_x'] == 4) ) &
((join_data['cluster_y'] == 3) |
(join_data['cluster_y'] == 1)), 'cluster_z'] = 1
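
To see what goes wrong without the parentheses, the following minimal sketch may help. It assumes a small made-up integer Series `s`, which is not part of the original question. Python parses `s == 3 | s == 2` as the chained comparison `s == (3 | s) == 2`, which needs the truth value of a whole Series and so raises a ValueError:

```python
import pandas as pd

s = pd.Series([3, 2, 5, 3])  # hypothetical data, for illustration only

# Without parentheses this is parsed as s == (3 | s) == 2, i.e.
# (s == (3 | s)) and ((3 | s) == 2), which calls bool() on a Series.
try:
    s == 3 | s == 2
    error = None
except ValueError as exc:
    error = exc

print(type(error).__name__)  # ValueError

# With parentheses, each comparison is evaluated first and the
# resulting boolean Series are then combined element-wise.
mask = (s == 3) | (s == 2)
print(mask.tolist())  # [True, True, False, True]
```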

Or, better, use isin:

join_data.loc[
(join_data['cluster_x'].isin([3,2,4])) &
(join_data['cluster_y'].isin([3,1])), 'cluster_z'] = 1
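
As a quick check that isin selects the same rows as the chain of == comparisons joined with |, here is a small sketch. The frame `df` is made-up sample data and is an assumption, not the asker's CSV:

```python
import pandas as pd

# Hypothetical sample frame, for illustration only
df = pd.DataFrame({'cluster_x': [3, 2, 5, 3],
                   'cluster_y': [3, 0, 1, 2]})

# Explicit comparisons, each wrapped in its own parentheses
chained = ((df['cluster_x'] == 3) |
           (df['cluster_x'] == 2) |
           (df['cluster_x'] == 4))

# The same membership test written with isin
with_isin = df['cluster_x'].isin([3, 2, 4])

print(chained.equals(with_isin))  # True
```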

All together:

join_data = pd.DataFrame({'cluster_x':[3,2,5,3],
'cluster_y':[3,0,1,2]})

print (join_data)
   cluster_x  cluster_y
0          3          3
1          2          0
2          5          1
3          3          2

join_data['cluster_z'] = 4

join_data.loc[
(join_data['cluster_x'].isin([3,2,4])) &
(join_data['cluster_y'].isin([3,1])), 'cluster_z'] = 1

join_data.loc[
(join_data['cluster_x'].isin([1,5])) &
(join_data['cluster_y'].isin([3,1])), 'cluster_z'] = 2

join_data.loc[
(join_data['cluster_x'].isin([3,2,4])) &
(join_data['cluster_y'].isin([2,4])), 'cluster_z'] = 3

print (join_data)
   cluster_x  cluster_y  cluster_z
0          3          3          1
1          2          0          4
2          5          1          2
3          3          2          3

Or, more readably:

mask1 = join_data['cluster_x'].isin([3,2,4])
mask2 = join_data['cluster_y'].isin([3,1])
mask3 = join_data['cluster_x'].isin([1,5])
mask4 = join_data['cluster_y'].isin([2,4])

join_data['cluster_z'] = 4
join_data.loc[mask1 & mask2 , 'cluster_z'] = 1
join_data.loc[mask3 & mask2 , 'cluster_z'] = 2
join_data.loc[mask1 & mask4 , 'cluster_z'] = 3

print (join_data)
   cluster_x  cluster_y  cluster_z
0          3          3          1
1          2          0          4
2          5          1          2
3          3          2          3

A solution with multiple numpy.where calls:

mask1 = join_data['cluster_x'].isin([3,2,4])
mask2 = join_data['cluster_y'].isin([3,1])
mask3 = join_data['cluster_x'].isin([1,5])
mask4 = join_data['cluster_y'].isin([2,4])

join_data['cluster_z'] = np.where(mask1 & mask2, 1,
np.where(mask3 & mask2, 2,
np.where(mask1 & mask4, 3, 4)))

print (join_data)
   cluster_x  cluster_y  cluster_z
0          3          3          1
1          2          0          4
2          5          1          2
3          3          2          3
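
The nested numpy.where calls can get hard to read as more cases are added. One possible alternative, not part of the original answer, is numpy.select, which takes a list of conditions and a list of matching values and uses the first condition that is true for each row. A sketch on the same sample data:

```python
import numpy as np
import pandas as pd

join_data = pd.DataFrame({'cluster_x': [3, 2, 5, 3],
                          'cluster_y': [3, 0, 1, 2]})

mask1 = join_data['cluster_x'].isin([3, 2, 4])
mask2 = join_data['cluster_y'].isin([3, 1])
mask3 = join_data['cluster_x'].isin([1, 5])
mask4 = join_data['cluster_y'].isin([2, 4])

# Conditions are checked in order; rows matching none get the default.
conditions = [mask1 & mask2, mask3 & mask2, mask1 & mask4]
choices = [1, 2, 3]
join_data['cluster_z'] = np.select(conditions, choices, default=4)

print(join_data['cluster_z'].tolist())  # [1, 4, 2, 3]
```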

Regarding python - pandas conditional statement problem, we found a similar question on Stack Overflow: https://stackoverflow.com/questions/42362348/
