
python - Exporting an SVM model to PMML with Python

Reposted. Author: 行者123. Updated: 2023-12-01 03:45:33

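I am using Scikit-Learn to apply the SVM algorithm to predict whether a customer will take out a home loan. I would like to export the model to PMML format. The features and label in the dataset are as follows: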

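Features
1. Frequency of visits
2. Response to offers
3. Usage of online banking facility
4. Number of savings accounts
5. Number of checking accounts
6. Number of checks written
7. Number of EFTs done
8. Property acquired
9. Other loans behaviour
10. Income

Label
IsHouseLoan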

The model is built correctly, but it cannot be exported to PMML. The code is pasted below:

Code:

from sklearn.decomposition import PCA

from sklearn2pmml.decoration import ContinuousDomain

import pandas
import sklearn_pandas
from sklearn.svm import SVC



home_loan = pandas.read_csv('home-loan-dataset.csv')

home_loan = home_loan.drop(['CustID'], axis=1)

home_loan_df = pandas.concat((pandas.DataFrame(home_loan[:], columns = ['Frequencyofvisits','Responsetooffers','UsageofOnlineBankingFacility','Numberofsavingsaccount','Numberofcheckingaccount','Numberofcheckswritten','NumberofEFTsdone','PropertyAcquired','OtherLoansBehaviour','Income']), pandas.DataFrame(home_loan['IsHouseLoan'], columns = ["IsHouseLoan"])), axis = 1)

home_loan_mapper = sklearn_pandas.DataFrameMapper([
(['Frequencyofvisits','Responsetooffers','UsageofOnlineBankingFacility','Numberofsavingsaccount','Numberofcheckingaccount','Numberofcheckswritten','NumberofEFTsdone','PropertyAcquired','OtherLoansBehaviour','Income'], [ContinuousDomain(), PCA(n_components = 3)]),
("IsHouseLoan", None)
])


home_loan = home_loan_df


home_loan_X = home_loan[['Frequencyofvisits','Responsetooffers','UsageofOnlineBankingFacility','Numberofsavingsaccount','Numberofcheckingaccount','Numberofcheckswritten','NumberofEFTsdone','PropertyAcquired','OtherLoansBehaviour','Income']]

home_loan_y = home_loan[['IsHouseLoan']]

# Classify using SVM

home_loan_classifier = SVC()



home_loan_classifier.fit(home_loan_X, home_loan_y.values.ravel())

SVC(C=1.0, cache_size=200, class_weight=None, coef0=0.0,
decision_function_shape=None, degree=3, gamma='auto', kernel='rbf',
max_iter=-1, probability=False, random_state=None, shrinking=True,
tol=0.001, verbose=False)



#
# Conversion to PMML
#

from sklearn2pmml import sklearn2pmml

sklearn2pmml(home_loan_classifier, home_loan_mapper, "SVMHomeLoan.pmml", with_repr = True)


The following error is shown during conversion to PMML:

Error:

C:\Python27\python.exe C:/Users/Admin/PycharmProjects/ML-Programs/Bank-Customer-Segmentation/svm-pmml.py
Aug 17, 2016 11:35:01 AM org.jpmml.sklearn.Main run
INFO: Parsing DataFrameMapper PKL..
Aug 17, 2016 11:35:01 AM org.jpmml.sklearn.Main run
INFO: Parsed DataFrameMapper PKL in 30 ms.
Aug 17, 2016 11:35:01 AM org.jpmml.sklearn.Main run
INFO: Converting DataFrameMapper..
Aug 17, 2016 11:35:01 AM org.jpmml.sklearn.Main run
SEVERE: Failed to convert DataFrameMapper
java.lang.IllegalArgumentException: The value of the sklearn2pmml.decoration.ContinuousDomain.data_min_ attribute (null) is not a supported array type
at org.jpmml.sklearn.ClassDictUtil.getArray(ClassDictUtil.java:51)
at sklearn2pmml.decoration.ContinuousDomain.getDataMin(ContinuousDomain.java:111)
at sklearn2pmml.decoration.ContinuousDomain.encodeFeatures(ContinuousDomain.java:50)
at sklearn_pandas.DataFrameMapper.encodeFeatures(DataFrameMapper.java:70)
at org.jpmml.sklearn.Main.run(Main.java:146)
at org.jpmml.sklearn.Main.main(Main.java:107)

Exception in thread "main" java.lang.IllegalArgumentException: The value of the sklearn2pmml.decoration.ContinuousDomain.data_min_ attribute (null) is not a supported array type
at org.jpmml.sklearn.ClassDictUtil.getArray(ClassDictUtil.java:51)
at sklearn2pmml.decoration.ContinuousDomain.getDataMin(ContinuousDomain.java:111)
at sklearn2pmml.decoration.ContinuousDomain.encodeFeatures(ContinuousDomain.java:50)
at sklearn_pandas.DataFrameMapper.encodeFeatures(DataFrameMapper.java:70)
at org.jpmml.sklearn.Main.run(Main.java:146)
at org.jpmml.sklearn.Main.main(Main.java:107)
Traceback (most recent call last):
File "C:/Users/Admin/PycharmProjects/ML-Programs/Bank-Customer-Segmentation/svm-pmml.py", line 52, in <module>
sklearn2pmml(home_loan_classifier, home_loan_mapper, "SVMHomeLoan.pmml", with_repr = True)
File "C:\Python27\lib\site-packages\sklearn2pmml\__init__.py", line 56, in sklearn2pmml
subprocess.check_call(cmd)
File "C:\Python27\lib\subprocess.py", line 540, in check_call
raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command '['java', '-cp', 'C:\\Python27\\lib\\site-packages\\sklearn2pmml\\resources\\guava-19.0.jar;C:\\Python27\\lib\\site-packages\\sklearn2pmml\\resources\\istack-commons-runtime-2.21.jar;C:\\Python27\\lib\\site-packages\\sklearn2pmml\\resources\\jaxb-core-2.2.11.jar;C:\\Python27\\lib\\site-packages\\sklearn2pmml\\resources\\jaxb-runtime-2.2.11.jar;C:\\Python27\\lib\\site-packages\\sklearn2pmml\\resources\\jcommander-1.48.jar;C:\\Python27\\lib\\site-packages\\sklearn2pmml\\resources\\jpmml-converter-1.0.7.jar;C:\\Python27\\lib\\site-packages\\sklearn2pmml\\resources\\jpmml-sklearn-1.0-SNAPSHOT.jar;C:\\Python27\\lib\\site-packages\\sklearn2pmml\\resources\\jpmml-xgboost-1.0.5.jar;C:\\Python27\\lib\\site-packages\\sklearn2pmml\\resources\\pmml-agent-1.2.16.jar;C:\\Python27\\lib\\site-packages\\sklearn2pmml\\resources\\pmml-model-1.2.16.jar;C:\\Python27\\lib\\site-packages\\sklearn2pmml\\resources\\pmml-model-metro-1.2.16.jar;C:\\Python27\\lib\\site-packages\\sklearn2pmml\\resources\\pmml-schema-1.2.16.jar;C:\\Python27\\lib\\site-packages\\sklearn2pmml\\resources\\pyrolite-4.12.jar;C:\\Python27\\lib\\site-packages\\sklearn2pmml\\resources\\serpent-1.12.jar;C:\\Python27\\lib\\site-packages\\sklearn2pmml\\resources\\slf4j-api-1.7.21.jar;C:\\Python27\\lib\\site-packages\\sklearn2pmml\\resources\\slf4j-jdk14-1.7.21.jar', 'org.jpmml.sklearn.Main', '--pkl-estimator-input', 'c:\\users\\Admin\\appdata\\local\\temp\\tmplgmrjq.pkl', '--repr-estimator', "SVC(C=1.0, cache_size=200, class_weight=None, coef0=0.0,\n decision_function_shape=None, degree=3, gamma='auto', kernel='rbf',\n max_iter=-1, probability=False, random_state=None, shrinking=True,\n tol=0.001, verbose=False)", '--pkl-mapper-input', 'c:\\users\\Admin\\appdata\\local\\temp\\tmpobahse.pkl', '--repr-mapper', "DataFrameMapper(features=[(['Frequencyofvisits', 'Responsetooffers', 'UsageofOnlineBankingFacility', 'Numberofsavingsaccount', 'Numberofcheckingaccount', 
'Numberofcheckswritten', 'NumberofEFTsdone', 'PropertyAcquired', 'OtherLoansBehavior', 'Income100000'], TransformerPipeline(steps=[('continuousdomain', ContinuousDomain(invalid_value_treatment='return_invalid')), ('pca', PCA(copy=True, n_components=3, whiten=False))])), ('IsHouseLoan', None)],\n sparse=False)", '--pmml-output', 'SVMHomeLoan.pmml']' returned non-zero exit status 1


What could be the reason?

Best answer

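Apparently, one of your data columns does not meet the expectations of the sklearn2pmml.decoration.ContinuousDomain transformation. Without seeing the data it is impossible to say which column it is, or the exact nature of the problem (for example, a categorical operational type instead of continuous, the wrong numeric data type, or a column that contains NA values).

You have two options here:

  1. Find the misbehaving column and fix the data problem so that the ContinuousDomain transformation works.
  2. Exclude ContinuousDomain from the list of transformations.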
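
For the first option, a quick per-column check can help narrow down the culprit. The sketch below is not part of the original answer. The helper name diagnose_columns and the three-row sample frame are made up for illustration, and it only looks for the kinds of problems listed above (non-numeric data type, NA values), plus columns that are missing from the frame altogether. It may not catch every condition that ContinuousDomain objects to.

```python
import pandas as pd


def diagnose_columns(df, columns):
    """Return {column name: [issues]} for columns that look unsuitable
    for a continuous-valued transformation. Hypothetical helper."""
    report = {}
    for name in columns:
        # A misspelled or renamed column never reaches the transformer intact.
        if name not in df.columns:
            report[name] = ["column missing from DataFrame"]
            continue
        col = df[name]
        issues = []
        # Strings and other non-numeric values suggest a categorical feature.
        if not pd.api.types.is_numeric_dtype(col):
            issues.append("non-numeric dtype: %s" % col.dtype)
        # NA values are one of the problems the answer mentions.
        n_missing = int(col.isna().sum())
        if n_missing > 0:
            issues.append("%d missing value(s)" % n_missing)
        if issues:
            report[name] = issues
    return report


# Made-up sample data, only to show the shape of the report.
sample = pd.DataFrame({
    "Income": [52000.0, None, 61000.0],
    "PropertyAcquired": ["yes", "no", "yes"],
    "Frequencyofvisits": [3, 5, 2],
})
wanted = ["Income", "PropertyAcquired", "Frequencyofvisits",
          "Numberofcheckswritten"]

report = diagnose_columns(sample, wanted)
for name, issues in report.items():
    print(name, "->", "; ".join(issues))
```

On the real data, the same call would take the frame returned by pandas.read_csv('home-loan-dataset.csv') and the ten feature names from the question. Any column that shows up in the report is a candidate for cleaning before it is passed to ContinuousDomain.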

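You are currently using data pre-processing logic copied directly from the sklearn2pmml README.md file. Please rework it to match your data: the [ContinuousDomain(), PCA(n_components = 3)] transformation is unlikely to be the right solution for your use case.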

Also, this problem is specific to the sklearn2pmml package. You may get better and faster answers if you raise it in the sklearn2pmml issue tracker.

Regarding python - Exporting an SVM model to PMML with Python, a similar question was found on Stack Overflow: https://stackoverflow.com/questions/38989272/
