
python - scikit-learn: Is there any way to debug Pipelines?

Reposted. Author: 太空狗. Updated: 2023-10-29 18:34:04

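I have created some pipelines for a classification task, and I want to inspect the information present/stored at each stage (e.g. text_stats, ngram_tfidf). How can I do this?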

pipeline = Pipeline([
    ('features', FeatureUnion([
        ('text_stats', Pipeline([
            ('length', TextStats()),
            ('vect', DictVectorizer())
        ])),
        ('ngram_tfidf', Pipeline([
            ('count_vect', CountVectorizer(tokenizer=tokenize_bigram_stem, stop_words=stopwords)),
            ('tfidf', TfidfTransformer())
        ]))
    ])),
    ('classifier', MultinomialNB(alpha=0.1))
])

Best answer

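I find it useful at times to temporarily add a debug step that prints out the information you are interested in. Building on an example from the sklearn documentation, you can do this to, for example, print the first 5 rows, the shape, or whatever else you need to see just before the classifier is called: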

import pandas as pd
from sklearn import svm
from sklearn.base import BaseEstimator, TransformerMixin
# The samples_generator module is no longer importable in recent scikit-learn
# releases; make_classification is available directly from sklearn.datasets.
from sklearn.datasets import make_classification
from sklearn.feature_selection import SelectKBest
from sklearn.feature_selection import f_regression
from sklearn.pipeline import Pipeline


class Debug(BaseEstimator, TransformerMixin):
    """Pass-through step that prints what the next step will receive."""

    def fit(self, X, y=None, **fit_params):
        # Nothing to learn; this step only observes the data.
        return self

    def transform(self, X):
        print(pd.DataFrame(X).head())  # first 5 rows
        print(X.shape)
        return X  # hand the data on unchanged


X, y = make_classification(n_informative=5, n_redundant=0, random_state=42)
anova_filter = SelectKBest(f_regression, k=5)
clf = svm.SVC(kernel='linear')
anova_svm = Pipeline([('anova', anova_filter), ('dbg', Debug()), ('svc', clf)])
anova_svm.set_params(anova__k=10, svc__C=.1).fit(X, y)

prediction = anova_svm.predict(X)
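
If you would rather not modify the pipeline at all, a fitted pipeline can also be inspected from the outside. The sketch below is only an illustration under some assumptions: the question's TextStats, tokenize_bigram_stem and stopwords are not shown, so the two branches here (named 'kbest' and 'pca') are stand-ins chosen to make the example runnable, not the asker's actual steps. It relies on Pipeline slicing (scikit-learn 0.21 or later), named_steps, and the transformer_list attribute of FeatureUnion.

```python
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.pipeline import FeatureUnion, Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic data: 100 samples, 20 features (the make_classification defaults).
X, y = make_classification(n_informative=5, n_redundant=0, random_state=42)

# Stand-in for the nested layout in the question; branch names are hypothetical.
pipeline = Pipeline([
    ('features', FeatureUnion([
        ('kbest', Pipeline([
            ('scale', StandardScaler()),
            ('select', SelectKBest(f_classif, k=5)),
        ])),
        ('pca', PCA(n_components=3)),
    ])),
    ('classifier', SVC(kernel='linear')),
])
pipeline.fit(X, y)

# Everything up to, but not including, the classifier.
features = pipeline[:-1].transform(X)
print(features.shape)

# Each branch of the FeatureUnion, looked up by name and applied on its own.
branches = dict(pipeline.named_steps['features'].transformer_list)
branch_shapes = {name: branch.transform(X).shape for name, branch in branches.items()}
print(branch_shapes)
```

With the pipeline from the question, the same pattern would use 'text_stats' and 'ngram_tfidf' as the branch names. Note that CountVectorizer and TfidfTransformer return SciPy sparse matrices, so printing the shape (or converting a few rows with .toarray()) is likely a safer check there than wrapping the output in a DataFrame.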

Regarding "python - scikit-learn: Is there any way to debug Pipelines?", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/34802465/
