python - Using Pipeline with custom classes in sklearn

Reposted. Author: 行者123. Updated: 2023-11-30 09:00:11

I have a problem during prediction inside a pipeline in which every step is a custom class.

    class MyFeatureSelector():
        def __init__(self, features=5, method='pca'):
            self.features = features
            self.method = method

        def fit(self, X, Y):
            return self

        def transform(self, X, Y=None):
            try:
                if self.features < X.shape[1]:
                    if self.method == 'pca':
                        selector = PCA(n_components=self.features)
                    elif self.method == 'rfe':
                        selector = RFE(estimator=LinearRegression(n_jobs=-1),
                                       n_features_to_select=self.features,
                                       step=1)
                    selector.fit(X, Y)
                    return selector.transform(X)
            except Exception as err:
                print('MyFeatureSelector.transform(): {}'.format(err))
            return X

        def fit_transform(self, X, Y=None):
            self.fit(X, Y)
            return self.transform(X, Y)


    model = Pipeline([
        ("DATA_CLEANER", MyDataCleaner(demo='', mode='strict')),
        ("DATA_ENCODING", MyEncoder(encoder_name='code')),
        ("FEATURE_SELECTION", MyFeatureSelector(features=15, method='rfe')),
        ("HUBER_MODELLING", HuberRegressor())
    ])

With the code above, this call works fine:

    model.fit(X, _Y)

But this one raises an error:

    prediction = model.predict(XT)

ERROR: shapes (672,107) and (15,) not aligned: 107 (dim 1) != 15 (dim 0)

Debugging points to this line: selector.fit(X, Y). A new instance of MyFeatureSelector is created during the predict() step, and Y does not exist at that point.

What am I doing wrong?
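
The failure described above can be reproduced in isolation. The sketch below is an illustration only: the `select` helper and the synthetic data are assumptions, not part of the question. It mimics a `transform()` that builds and fits a fresh `RFE` on every call. With Y the features are reduced; without Y the fit raises, the exception is swallowed, and the full-width X is returned, which is what later breaks the regressor.

```python
import numpy as np
from sklearn.feature_selection import RFE
from sklearn.linear_model import LinearRegression

# Synthetic data, assumed for illustration only.
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 8))
y = X[:, 0] * 2.0 + rng.normal(scale=0.1, size=50)


def select(X, y=None, features=3):
    """Hypothetical helper: builds and fits a new RFE on every call."""
    try:
        selector = RFE(estimator=LinearRegression(),
                       n_features_to_select=features,
                       step=1)
        selector.fit(X, y)  # raises when y is None
        return selector.transform(X)
    except Exception as err:
        print('select(): {}'.format(err))
    return X  # fallback: all columns pass through unchanged


train_out = select(X, y)   # training-style call, y available
predict_out = select(X)    # predict-style call, no y
print(train_out.shape, predict_out.shape)
```

The two shapes differ, (50, 3) against (50, 8), in the same way the question's regressor was trained on 15 columns and then handed 107.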

Best answer

A working version is posted below:

    from sklearn.decomposition import PCA
    from sklearn.feature_selection import RFE
    from sklearn.linear_model import LinearRegression


    class MyFeatureSelector():
        def __init__(self, features=5, method='pca'):
            self.features = features
            self.method = method
            self.selector = None
            self.init_selector()

        def init_selector(self):
            # Build the selector once so its fitted state survives between calls.
            if self.method == 'pca':
                self.selector = PCA(n_components=self.features)
            elif self.method == 'rfe':
                self.selector = RFE(estimator=LinearRegression(n_jobs=-1),
                                    n_features_to_select=self.features,
                                    step=1)

        def fit(self, X, Y):
            return self

        def transform(self, X, Y=None):
            try:
                if self.features < X.shape[1]:
                    # Y is only passed during training; fit the selector then.
                    if Y is not None:
                        self.selector.fit(X, Y)
                    # At predict time Y is None and the fitted selector is reused.
                    return self.selector.transform(X)
            except Exception as err:
                print('MyFeatureSelector.transform(): {}'.format(err))
            return X

        def fit_transform(self, X, Y=None):
            self.fit(X, Y)
            return self.transform(X, Y)
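
For comparison, a more conventional arrangement is to do all the learning in `fit()` and keep `transform()` free of Y. The sketch below is not the answer's code: the class name `FittedFeatureSelector`, the `selector_` attribute, and the synthetic data are assumptions made for illustration. The cleaner and encoder steps are left out because their definitions are not shown.

```python
import numpy as np
from sklearn.base import BaseEstimator, TransformerMixin
from sklearn.decomposition import PCA
from sklearn.feature_selection import RFE
from sklearn.linear_model import HuberRegressor, LinearRegression
from sklearn.pipeline import Pipeline


class FittedFeatureSelector(BaseEstimator, TransformerMixin):
    """Hypothetical variant: learn in fit(), only apply in transform()."""

    def __init__(self, features=5, method='pca'):
        self.features = features
        self.method = method

    def fit(self, X, y=None):
        # selector_ stays None when there is nothing to reduce.
        self.selector_ = None
        if self.features < X.shape[1]:
            if self.method == 'pca':
                self.selector_ = PCA(n_components=self.features)
            elif self.method == 'rfe':
                self.selector_ = RFE(estimator=LinearRegression(),
                                     n_features_to_select=self.features,
                                     step=1)
            else:
                raise ValueError('unknown method: {}'.format(self.method))
            self.selector_.fit(X, y)
        return self

    def transform(self, X):
        # Reuse whatever fit() learned; never refit here.
        if self.selector_ is None:
            return X
        return self.selector_.transform(X)


# Synthetic data, assumed for illustration only.
rng = np.random.default_rng(0)
X_train = rng.normal(size=(80, 10))
y_train = 3.0 * X_train[:, 0] - X_train[:, 1] + rng.normal(scale=0.1, size=80)
X_test = rng.normal(size=(20, 10))

model = Pipeline([
    ("FEATURE_SELECTION", FittedFeatureSelector(features=4, method='rfe')),
    ("HUBER_MODELLING", HuberRegressor()),
])
model.fit(X_train, y_train)
prediction = model.predict(X_test)
print(prediction.shape)
```

`TransformerMixin` supplies `fit_transform()`, so the pipeline passes Y to `fit()` during training and calls only `transform()` during `predict()`. Errors also surface directly instead of being printed and masked by an unreduced X.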

About python - Using Pipeline with custom classes in sklearn: a similar question was found on Stack Overflow: https://stackoverflow.com/questions/43232506/
