
python - How to standardize only the numeric columns in a machine learning pipeline?

Reposted. Author: 太空宇宙. Updated: 2023-11-04 02:32:43

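I have data with both numeric and categorical features, and I only want to standardize the numeric ones. The numeric columns are captured in X_num_cols, but I'm not sure how to work that into the pipeline code; for example, make_pipeline(preprocessing.StandardScaler(columns=X_num_cols)) does not work. I found this on Stack Overflow, but the answers don't fit my code layout/purpose.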

from sklearn import preprocessing
from sklearn.pipeline import make_pipeline
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split,GridSearchCV
import pandas as pd
import numpy as np

# Separate target from training features
y = df['MED']
X = df.drop('MED', axis=1)

# Retain only the needed predictors
X = X.filter(['age', 'gender', 'ccis'])

# Find the numerical columns, exclude categorical columns
X_num_cols = X.columns[X.dtypes.apply(lambda c: np.issubdtype(c, np.number))]

# Split data into train and test sets
X_train, X_test, y_train, y_test = train_test_split(X, y,
                                                    test_size=0.5,
                                                    random_state=1234,
                                                    stratify=y)

# Pipeline
pipeline = make_pipeline(preprocessing.StandardScaler(),
                         LogisticRegression(penalty='l2'))

# Declare hyperparameters
hyperparameters = {'logisticregression__C': [0.01, 0.1, 1.0, 10.0, 100.0],
                   'logisticregression__multi_class': ['ovr'],
                   'logisticregression__class_weight': ['balanced']
                   }

# scikit-learn cross-validation with the pipeline
clf = GridSearchCV(pipeline, hyperparameters, cv=10)

Sample data looks like this:

Age    Gender    CCIS
13     M         5
24     F         8
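
As a small illustration of the column-selection step, the two sample rows can be loaded into a DataFrame and split into numeric and non-numeric columns. This is only a sketch: the names `sample`, `num_cols`, and `cat_cols` are made up here, and it uses pandas' `select_dtypes` rather than the `np.issubdtype` test from the question.

```python
import pandas as pd

# The two sample rows shown above; a real dataset would have many more rows.
sample = pd.DataFrame({"Age": [13, 24], "Gender": ["M", "F"], "CCIS": [5, 8]})

# Numeric columns (integers and floats), in their original order.
num_cols = sample.select_dtypes(include="number").columns.tolist()

# Everything else is treated as categorical in this sketch.
cat_cols = sample.select_dtypes(exclude="number").columns.tolist()

print(num_cols)
print(cat_cols)
```

Only the columns in `num_cols` would then be handed to the scaler; `Gender` has to be handled separately because StandardScaler cannot process strings.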

Best answer

Your pipeline should look like this:

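from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import StandardScaler, FunctionTransformer
from sklearn.pipeline import Pipeline, FeatureUnion

# cat_indices and num_indices are not defined in this answer; they are assumed
# to be lists of integer column positions for the categorical and numeric
# columns of a NumPy array (data[:, idx] does not work on a pandas DataFrame).
rg = LogisticRegression(class_weight={0: 1, 1: 10}, random_state=42, solver='saga',
                        max_iter=100, n_jobs=-1, intercept_scaling=1)

pipeline = Pipeline(steps=[
    ('feature_processing', FeatureUnion(transformer_list=[
        # Categorical columns are passed through unchanged
        ('categorical', FunctionTransformer(lambda data: data[:, cat_indices])),

        # Numeric columns are selected, then standardized
        ('numeric', Pipeline(steps=[
            ('select', FunctionTransformer(lambda data: data[:, num_indices])),
            ('scale', StandardScaler())
        ]))
    ])),
    ('clf', rg)
])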
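
A different way to get the same effect, not the one used in the answer above, is scikit-learn's ColumnTransformer, which accepts DataFrame column names directly. The sketch below is illustrative only: the toy data is invented, the step names (`preprocess`, `numeric`, `categorical`) are arbitrary, and the OneHotEncoder for the categorical column is an added assumption, since LogisticRegression needs numeric input. The final step is named `logisticregression` so that the question's `logisticregression__C` grid key still applies.

```python
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Hypothetical toy data shaped like the question's sample (not real results).
X = pd.DataFrame({
    "age": [13, 24, 35, 46, 57, 68, 21, 32, 43, 54, 65, 76],
    "gender": ["M", "F"] * 6,
    "ccis": [5, 8, 3, 9, 4, 7, 6, 2, 8, 5, 9, 1],
})
y = pd.Series([0, 1] * 6, name="MED")

# Split column names by dtype.
num_cols = X.select_dtypes(include="number").columns.tolist()
cat_cols = X.select_dtypes(exclude="number").columns.tolist()

# Scale only the numeric columns; one-hot encode the categorical ones.
preprocess = ColumnTransformer(transformers=[
    ("numeric", StandardScaler(), num_cols),
    ("categorical", OneHotEncoder(handle_unknown="ignore"), cat_cols),
])

pipeline = Pipeline(steps=[
    ("preprocess", preprocess),
    ("logisticregression", LogisticRegression(class_weight="balanced", max_iter=1000)),
])

hyperparameters = {"logisticregression__C": [0.01, 0.1, 1.0, 10.0, 100.0]}

# cv=3 only because the toy data is tiny; the question uses cv=10.
clf = GridSearchCV(pipeline, hyperparameters, cv=3)
clf.fit(X, y)

# Transformed features from the refit best pipeline: 2 scaled + 2 one-hot columns.
scaled = clf.best_estimator_.named_steps["preprocess"].transform(X)
print(scaled.shape)
```

Because the scaler lives inside the pipeline, it is refit on each training fold during the grid search, so the held-out fold never influences the scaling statistics.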

Regarding "python - How to standardize only the numeric columns in a machine learning pipeline?", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/48813968/
