
python - Pandas .loc or .iloc to select columns from a dataset

Reposted. Author: 太空宇宙. Updated: 2023-11-03 12:37:16

I have been trying to select a specific set of columns from a dataset, for all rows. I tried something like the following.

train_features = train_df.loc[,[0,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18]]

To be clear, all rows should be included, but only the columns at the listed positions. Is there a better way to approach this?

Sample data:

age  job        marital   education    default   housing   loan   equities   contact     duration   campaign   pdays   previous   poutcome   emp.var.rate   cons.price.idx   cons.conf.idx   euribor3m     nr.employed   y
56 housemaid married basic.4y 1 1 1 1 0 261 1 999 0 2 1.1 93.994 -36.4 3.299552287 5191 1
37 services married high.school 1 0 1 1 0 226 1 999 0 2 1.1 93.994 -36.4 0.743751247 5191 1
56 services married high.school 1 1 0 1 0 307 1 999 0 2 1.1 93.994 -36.4 1.28265179 5191 1

I am trying to exclude the job, marital, education, and y columns from the dataset. The y column is the target variable.

Best answer

If you need to select by position, use iloc. The row selector must be given explicitly, so use : to keep all rows:

train_features = train_df.iloc[:, [0,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18]]
print(train_features)
age default housing loan equities contact duration campaign pdays \
0 56 1 1 1 1 0 261 1 999
1 37 1 0 1 1 0 226 1 999
2 56 1 1 0 1 0 307 1 999

previous poutcome emp.var.rate cons.price.idx cons.conf.idx euribor3m \
0 0 2 1.1 93.994 -36.4 3.299552
1 0 2 1.1 93.994 -36.4 0.743751
2 0 2 1.1 93.994 -36.4 1.282652

nr.employed
0 5191
1 5191
2 5191
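
The difference between position-based and label-based selection can be sketched as follows. This is a minimal illustration, not part of the original answer: it rebuilds the three sample rows by hand, uses np.r_ (an assumption added here for convenience) to build the position list 0, 4..18 without typing every number, and shows that .loc gives the same result when passed column names instead of positions.

```python
import numpy as np
import pandas as pd

# Rebuild the sample data from the question (three rows, twenty columns).
columns = ["age", "job", "marital", "education", "default", "housing", "loan",
           "equities", "contact", "duration", "campaign", "pdays", "previous",
           "poutcome", "emp.var.rate", "cons.price.idx", "cons.conf.idx",
           "euribor3m", "nr.employed", "y"]
rows = [
    [56, "housemaid", "married", "basic.4y", 1, 1, 1, 1, 0, 261, 1, 999, 0, 2,
     1.1, 93.994, -36.4, 3.299552287, 5191, 1],
    [37, "services", "married", "high.school", 1, 0, 1, 1, 0, 226, 1, 999, 0, 2,
     1.1, 93.994, -36.4, 0.743751247, 5191, 1],
    [56, "services", "married", "high.school", 1, 1, 0, 1, 0, 307, 1, 999, 0, 2,
     1.1, 93.994, -36.4, 1.28265179, 5191, 1],
]
train_df = pd.DataFrame(rows, columns=columns)

# Position-based: np.r_ concatenates the scalar 0 and the range 4..18.
positions = np.r_[0, 4:19]
by_position = train_df.iloc[:, positions]

# Label-based: .loc expects column names, not integer positions.
excluded = ("job", "marital", "education", "y")
labels = [c for c in train_df.columns if c not in excluded]
by_label = train_df.loc[:, labels]

print(by_position.equals(by_label))
```

Selecting by label tends to be more robust than hard-coded positions if the column order of the file ever changes.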

Another solution is to drop the unnecessary columns:

cols = ['job', 'marital', 'education', 'y']
train_features = train_df.drop(cols, axis=1)
print(train_features)
age default housing loan equities contact duration campaign pdays \
0 56 1 1 1 1 0 261 1 999
1 37 1 0 1 1 0 226 1 999
2 56 1 1 0 1 0 307 1 999

previous poutcome emp.var.rate cons.price.idx cons.conf.idx euribor3m \
0 0 2 1.1 93.994 -36.4 3.299552
1 0 2 1.1 93.994 -36.4 0.743751
2 0 2 1.1 93.994 -36.4 1.282652

nr.employed
0 5191
1 5191
2 5191
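
Since y is the target variable, the drop approach pairs naturally with keeping the target in its own Series. The sketch below is an illustration under assumptions: it uses a reduced, hand-built DataFrame with only a few of the sample columns, the name train_target is hypothetical, and it uses the columns= keyword of drop, which is equivalent to passing axis=1.

```python
import pandas as pd

# Reduced stand-in for the question's data (only a few columns, for brevity).
train_df = pd.DataFrame({
    "age": [56, 37, 56],
    "job": ["housemaid", "services", "services"],
    "marital": ["married", "married", "married"],
    "education": ["basic.4y", "high.school", "high.school"],
    "default": [1, 1, 1],
    "duration": [261, 226, 307],
    "y": [1, 1, 1],
})

cols = ["job", "marital", "education", "y"]

# drop returns a new DataFrame; train_df itself is left unchanged.
train_features = train_df.drop(columns=cols)

# Keep the target separately (train_target is a hypothetical name).
train_target = train_df["y"]

print(train_features.columns.tolist())
print(train_target.tolist())
```

Because drop does not modify train_df in place by default, the y column is still available afterwards for building the target.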

Regarding python - Pandas .loc or .iloc to select columns from a dataset, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/43464015/
