
python - Find and replace in pandas?

Reposted. Author: 行者123. Updated: 2023-11-30 22:17:45

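I am running a min-max scaler on a data frame with numeric columns, but if any cell in those columns contains a string or a null value, I get an exception. To avoid this, I am thinking of converting string or empty cells to 0. How can I do that? My function: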

import pandas as pd
from sklearn import preprocessing


def min_max_scaler(df_sub, col_names):
    """
    Scale every column of df_sub with MinMaxScaler.

    df_sub : a subset of the data frame in which every column should be numeric
             (it contains all the columns on which you want to perform the operation)
             example : df_subset = df.filter(['latitude','longitude','order.id'], axis=1)
    col_names : all column names of the subset
    """
    scaler = preprocessing.MinMaxScaler()
    # fit_transform returns a NumPy array, so rebuild the data frame afterwards
    scaled_df = scaler.fit_transform(df_sub)
    scaled_df = pd.DataFrame(scaled_df, columns=col_names)

    return scaled_df

Dataset:

day  phone_calls  received
7    180          NaN
8    8            NaN
9    -240         qbb

How can I validate the data before running this function? Please help.

Best answer

I would do it this way:

Find the columns with object dtype:

obj_cols = df[col_names].columns[df[col_names].dtypes.eq('object')]

Convert them to a numeric dtype, turning anything that cannot be parsed into NaN and then replacing NaN with 0 (zero):

df[obj_cols] = df[obj_cols].apply(pd.to_numeric, errors='coerce').fillna(0)
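To see what the coercion step does to individual cells, here is a minimal sketch on a made-up column. The values in `raw` are hypothetical and are not taken from the question's data; the column is created with `dtype=object` explicitly, since newer pandas versions may otherwise infer a dedicated string dtype.

```python
import pandas as pd

# Hypothetical object-dtype column mixing numeric strings, text, a missing value and a float
raw = pd.Series(["12", "qbb", None, 3.5], dtype=object)

# errors="coerce" turns every value that cannot be parsed as a number into NaN
coerced = pd.to_numeric(raw, errors="coerce")

# NaN (both the original missing value and the coerced text) is then replaced with 0
cleaned = coerced.fillna(0)
```

Note that the numeric string "12" survives as 12.0, while "qbb" and the missing value both end up as 0, so after this step you can no longer tell a real zero from a replaced cell.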

Scale them (here scaler is a preprocessing.MinMaxScaler() instance):

df[obj_cols] = scaler.fit_transform(df[obj_cols])
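For reference, with its default range of 0 to 1, min-max scaling maps each value x in a column to (x - min) / (max - min). The sketch below reproduces that arithmetic in plain pandas on the question's data after cleaning. `min_max_scale` is a hypothetical helper written for illustration, not sklearn's implementation, and mapping a constant column to 0 is an assumption of this sketch.

```python
import pandas as pd


def min_max_scale(frame):
    """Hand-rolled min-max scaling to [0, 1]; an illustrative sketch only."""
    col_min = frame.min()
    col_range = frame.max() - col_min
    # A constant column has range 0; divide by 1 instead so it maps to 0
    col_range = col_range.replace(0, 1)
    return (frame - col_min) / col_range


# The question's data, with the 'received' column already coerced and filled with 0
df = pd.DataFrame({
    "day": [7, 8, 9],
    "phone_calls": [180, 8, -240],
    "received": [0.0, 0.0, 0.0],
})

scaled = min_max_scale(df)
```

Here `day` becomes 0, 0.5 and 1, and the middle `phone_calls` value becomes (8 + 240) / 420.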

As a function:

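def min_max_scaler(df_sub, col_names):
    scaler = preprocessing.MinMaxScaler()
    # Columns stored as object dtype (strings or mixed values)
    obj_cols = df_sub[col_names].columns[df_sub[col_names].dtypes.eq('object')]
    # Coerce non-numeric values to NaN, then replace NaN with 0
    df_sub[obj_cols] = df_sub[obj_cols].apply(pd.to_numeric, errors='coerce').fillna(0)
    # Scale the cleaned columns, as in the step above (skipped when there are none)
    if len(obj_cols):
        df_sub[obj_cols] = scaler.fit_transform(df_sub[obj_cols])

    return df_sub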
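If you would rather inspect the problem cells before overwriting them with 0, one possible approach is to build a boolean mask first. This is only a sketch: `non_numeric_mask` is a hypothetical helper name, and the data frame reproduces the question's dataset with the `received` column forced to object dtype.

```python
import numpy as np
import pandas as pd


def non_numeric_mask(df, col_names):
    """Hypothetical helper: True where a cell is missing or cannot be parsed as a number."""
    converted = df[col_names].apply(pd.to_numeric, errors="coerce")
    return converted.isna()


df = pd.DataFrame({
    "day": [7, 8, 9],
    "phone_calls": [180, 8, -240],
    "received": pd.Series([np.nan, np.nan, "qbb"], dtype=object),
})

mask = non_numeric_mask(df, ["day", "phone_calls", "received"])

# Number of cells per column that would be replaced with 0
bad_counts = mask.sum()
```

A column whose count equals the number of rows, like `received` here, carries no numeric information at all, so dropping it may make more sense than scaling a column of zeros.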

Regarding "python - Find and replace in pandas?", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/49553618/
