
python - How to make FunctionTransformer work inside DataFrameMapper

Reposted. Author: 行者123. Last updated: 2023-12-01 06:35:56

I have a column in my pandas DataFrame that looks like this:

df = pd.DataFrame([
    ['26.6 km'],
    ['19.67 km'],
    ['18.2 km'],
    ['20.77 km'],
    ['15.2 km'],
], columns=['Mileage'])

I have a function that removes "km" from the column:

def remove_words(column):
    # Expects a pandas Series; keeps the text before the first space
    return column.str.split(' ').str[0]

When I put it into my DataFrameMapper:

mapper = DataFrameMapper([
    ('Mileage', [FunctionTransformer(remove_words)]),
], df_out=True)

...it returns the error "'numpy.ndarray' object has no attribute 'str'".

Help!

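Best answer

Use str.extract:

df['Mileage'] = df['Mileage'].str.extract(r'(\d*\.?\d*)', expand=False).astype(float)

Or use str.replace:

# regex=True is required on pandas 2.0+, where str.replace matches literally by default
df['Mileage'] = df['Mileage'].str.replace(r'[^\d.]', '', regex=True).astype(float)

Here is an example: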

>>> import pandas as pd
>>> df = pd.DataFrame([
...     ['26.6 km'],
...     ['19.67 km'],
...     ['18.2 km'],
...     ['20.77 km'],
...     ['15.2 km'],
... ], columns=['Mileage'])
>>> df['Mileage'].str.extract(r'(\d*\.?\d*)', expand=False).astype(float)
0 26.60
1 19.67
2 18.20
3 20.77
4 15.20
Name: Mileage, dtype: float64
>>> df['Mileage'].str.replace(r'[^\d.]', '', regex=True).astype(float)
0 26.60
1 19.67
2 18.20
3 20.77
4 15.20
Name: Mileage, dtype: float64
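
The two patterns above behave differently on messier input, which is easy to check on single strings with the standard re module. This is only a sketch: the helper names and STRICT_PATTERN are hypothetical and not part of the original answer, and it assumes the pandas string methods follow the same regular-expression semantics as re.

```python
import re

EXTRACT_PATTERN = r'(\d*\.?\d*)'      # pattern passed to str.extract above
STRIP_PATTERN = r'[^\d.]'             # pattern passed to str.replace above
STRICT_PATTERN = r'(\d+(?:\.\d+)?)'   # hypothetical stricter alternative


def first_number(text, pattern):
    """Return group 1 of the first match, or None if nothing matches."""
    match = re.search(pattern, text)
    return match.group(1) if match else None


def strip_non_numeric(text):
    """Delete every character that is neither a digit nor a dot."""
    return re.sub(STRIP_PATTERN, '', text)


# Both approaches agree on the clean values from the question.
print(first_number('26.6 km', EXTRACT_PATTERN))
print(strip_non_numeric('26.6 km'))

# Every part of EXTRACT_PATTERN is optional, so it can match an empty
# string at position 0 when the text does not start with a digit.
print(repr(first_number('approx 26.6 km', EXTRACT_PATTERN)))

# STRIP_PATTERN keeps every dot, including trailing punctuation.
print(strip_non_numeric('26.6 km.'))

# The stricter pattern requires at least one digit before it matches.
print(first_number('approx 26.6 km', STRICT_PATTERN))
```

For the data in the question both original patterns are fine; the stricter one only matters if the column may contain leading text or stray punctuation.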

Alternatively, if you want to use DataFrameMapper and FunctionTransformer from sklearn_pandas:

from sklearn_pandas import DataFrameMapper, FunctionTransformer

def remove_words(val):
    # val is a single string here, not a whole column
    return val.split(' ')[0]

mapper = DataFrameMapper([
    ('Mileage', [FunctionTransformer(remove_words)]),
], df_out=True)

print(mapper.fit_transform(df))

Mileage
0 26.6
1 19.67
2 18.2
3 20.77
4 15.2

For sklearn.preprocessing.FunctionTransformer:

from sklearn_pandas import DataFrameMapper
from sklearn.preprocessing import FunctionTransformer
import numpy as np

def remove_words(vals):
    # vals is a 2-D array with one column; v[0] is the string in each row
    return np.array([v[0].split(' ')[0] for v in vals])

mapper = DataFrameMapper([
    (['Mileage'], [FunctionTransformer(remove_words, validate=False)]),
], df_out=True)

print(mapper.fit_transform(df))

Mileage
0 26.6
1 19.67
2 18.2
3 20.77
4 15.2
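
The error in the question comes from the transformer receiving a NumPy array rather than a pandas Series, and arrays have no .str accessor. The sketch below reproduces that with a hand-built array standing in for what the mapper passes along (an assumption made for illustration; the mapper itself is not involved), then shows that wrapping the array in a Series makes the original function usable again.

```python
import numpy as np
import pandas as pd


def remove_words(column):
    # Function from the question: relies on the pandas .str accessor.
    return column.str.split(' ').str[0]


# Stand-in for the transformer's input: a plain NumPy array, not a Series.
values = np.array(['26.6 km', '19.67 km'], dtype=object)

try:
    remove_words(values)
except AttributeError as exc:
    # Same kind of error as the one reported in the question.
    error_message = str(exc)
    print(error_message)

# Wrapping the array in a Series restores the .str accessor.
cleaned = remove_words(pd.Series(values)).to_numpy()
print(cleaned)
```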

Or use numpy.vectorize:

from sklearn_pandas import DataFrameMapper
from sklearn.preprocessing import FunctionTransformer
import numpy as np

# Apply the per-string split to every element of the array
func = np.vectorize(lambda x: x.split(' ')[0])

def remove_words(vals):
    return func(vals)

mapper = DataFrameMapper([
    (['Mileage'], [FunctionTransformer(remove_words, validate=False)]),
], df_out=True)

print(mapper.fit_transform(df))

Mileage
0 26.6
1 19.67
2 18.2
3 20.77
4 15.2
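
The remove_words variants above return the numeric part as a string. If floats are wanted, as in the str.extract and str.replace versions, one possible variation is to convert inside the vectorized function. This is a sketch only: mileage_to_float and to_km are hypothetical names, and the sample array is a hand-built stand-in for the single-column input.

```python
import numpy as np

# otypes=[float] asks np.vectorize to build a float64 result array.
to_km = np.vectorize(lambda text: float(text.split(' ')[0]), otypes=[float])


def mileage_to_float(vals):
    # Hypothetical replacement for remove_words that also converts to float.
    return to_km(vals)


# One column of strings, shaped like the input in the examples above.
sample = np.array([['26.6 km'], ['19.67 km']], dtype=object)
result = mileage_to_float(sample)
print(result.dtype, result.shape)
```

Passing mileage_to_float to FunctionTransformer in place of remove_words should then be a drop-in change, though that has not been verified against every sklearn_pandas version.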

Regarding "python - How to make FunctionTransformer work inside DataFrameMapper", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/59670335/
