
java - evaluate method takes a long time - PMML model using Jpmml

Reposted. Author: 行者123. Updated: 2023-12-01 08:55:12

I am currently using Jpmml to load a PMML model in my code, but the `evaluate` method takes a long time. This is the code I have working today:

    String modelPath = "....";
    ModelEvaluatorFactory factory = ModelEvaluatorFactory.newInstance();
    InputStream in = new ByteArrayInputStream(modelPath.getBytes("UTF-8"));

    PMML pmmlModel = JAXBUtil.unmarshalPMML(new StreamSource(in));
    ModelEvaluator<?> evaluator = factory.newModelManager(pmmlModel);
    List<FieldName> activeFields = evaluator.getActiveFields();

    Map<FieldName, FieldValue> defaultFeatures = new HashMap<>();

    // After 'defaultFeatures' has been filled, the line below takes a long time
    Map<FieldName, ?> results = evaluator.evaluate(defaultFeatures);

PMML example:

    <?xml version="1.0" encoding="UTF-8" standalone="yes"?>
    <PMML xmlns="http://www.dmg.org/PMML-4_2" version="4.2">
      <Header>
        <Application name="JPMML-SkLearn" version="1.0-SNAPSHOT"/>
        <Timestamp>2017-01-22T14:18:05Z</Timestamp>
      </Header>
      <DataDictionary>
        <DataField name="GENDER" optype="categorical" dataType="string">
          <Value value="0"/>
          <Value value="1"/>
        </DataField>
        <DataField name="1GA_" optype="continuous" dataType="double"/>
        <!-- 67000 rows of DataField -->
      </DataDictionary>
      <TransformationDictionary>
        <DefineFunction name="logit" optype="continuous" dataType="double">
          <ParameterField name="value" optype="continuous" dataType="double"/>
          <Apply function="/">
            <Constant dataType="double">1</Constant>
            <Apply function="+">
              <Constant dataType="double">1</Constant>
              <Apply function="exp">
                <Apply function="*">
                  <Constant dataType="double">-1</Constant>
                  <FieldRef field="value"/>
                </Apply>
              </Apply>
            </Apply>
          </Apply>
        </DefineFunction>
      </TransformationDictionary>
      <MiningModel functionName="classification">
        <MiningSchema>
          <MiningField name="GENDER" usageType="target"/>
          <MiningField name="1GA_"/>
          <!-- 67000 rows of MiningField -->
        </MiningSchema>
        <Output>
          <OutputField name="probability_0" feature="probability" value="0"/>
          <OutputField name="probability_1" feature="probability" value="1"/>
        </Output>
        <LocalTransformations>
          <DerivedField name="x1" optype="continuous" dataType="double">
            <FieldRef field="1GA_"/>
          </DerivedField>
          <!-- 100000 rows -->
        </LocalTransformations>
        <Segmentation multipleModelMethod="modelChain">
          <Segment id="1">
            <True/>
            <RegressionModel functionName="regression">
              <MiningSchema>
                <MiningField name="1GA_"/>
              </MiningSchema>
              <Output>
                <OutputField name="decisionFunction_1" feature="predictedValue"/>
                <OutputField name="logitDecisionFunction_1" optype="continuous" dataType="double" feature="transformedValue">
                  <Apply function="logit">
                    <FieldRef field="decisionFunction_1"/>
                  </Apply>
                </OutputField>
              </Output>
              <RegressionTable intercept="-5.303370169392045">
                <NumericPredictor name="x1" coefficient="0.18476274186559316"/>
                <!-- 100000 rows of NumericPredictor -->
              </RegressionTable>
            </RegressionModel>
          </Segment>
          <Segment id="2">
            <True/>
            <RegressionModel functionName="regression">
              <MiningSchema>
                <MiningField name="logitDecisionFunction_1"/>
              </MiningSchema>
              <Output>
                <OutputField name="logitDecisionFunction_0" feature="predictedValue"/>
              </Output>
              <RegressionTable intercept="1.0">
                <NumericPredictor name="logitDecisionFunction_1" coefficient="-1.0"/>
              </RegressionTable>
            </RegressionModel>
          </Segment>
          <Segment id="3">
            <True/>
            <RegressionModel functionName="classification">
              <MiningSchema>
                <MiningField name="GENDER" usageType="target"/>
                <MiningField name="logitDecisionFunction_1"/>
                <MiningField name="logitDecisionFunction_0"/>
              </MiningSchema>
              <RegressionTable intercept="0.0" targetCategory="1">
                <NumericPredictor name="logitDecisionFunction_1" coefficient="1.0"/>
              </RegressionTable>
              <RegressionTable intercept="0.0" targetCategory="0">
                <NumericPredictor name="logitDecisionFunction_0" coefficient="1.0"/>
              </RegressionTable>
            </RegressionModel>
          </Segment>
        </Segmentation>
      </MiningModel>
    </PMML>
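
To make it easier to see what this model chain computes per record, here is a minimal hand-written sketch of the three segments in plain Java. It is not how Jpmml evaluates the document. It assumes only the single `x1` predictor shown above (the other NumericPredictor rows are elided in the example), and it assumes segment 3 passes its inputs through unchanged, since no normalization method appears in the snippet. The class and method names are made up for illustration. Note that the function named `logit` in the PMML is the logistic function 1 / (1 + exp(-value)).

```java
/** Hand-computed version of the three-segment model chain in the PMML example. */
final class LogitChainSketch {

    // Values copied from the PMML example; all other predictors are elided there.
    static final double INTERCEPT = -5.303370169392045;
    static final double X1_COEFFICIENT = 0.18476274186559316;

    /** The "logit" DefineFunction from the PMML: 1 / (1 + exp(-1 * value)). */
    static double logit(double value) {
        return 1.0 / (1.0 + Math.exp(-1.0 * value));
    }

    /** Returns {probability_0, probability_1} for a single x1 input. */
    static double[] probabilities(double x1) {
        // Segment 1: linear regression, followed by the logit output transformation.
        double decisionFunction1 = INTERCEPT + X1_COEFFICIENT * x1;
        double logitDecisionFunction1 = logit(decisionFunction1);

        // Segment 2: regression with intercept 1.0 and coefficient -1.0.
        double logitDecisionFunction0 = 1.0 + (-1.0) * logitDecisionFunction1;

        // Segment 3: each RegressionTable has intercept 0.0 and coefficient 1.0,
        // so (by assumption, with no normalization) the values pass through.
        return new double[] {logitDecisionFunction0, logitDecisionFunction1};
    }

    public static void main(String[] args) {
        double[] p = probabilities(10.0); // 10.0 is an arbitrary illustrative input
        System.out.printf("probability_0=%.6f probability_1=%.6f%n", p[0], p[1]);
    }
}
```

With 100000 predictors, segment 1 is a single weighted sum over the inputs, so the arithmetic itself is linear in the number of fields.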

I am considering trying MLlib instead of Jpmml. Any ideas? Thanks.

Best answer

What do you mean by "load"? Is it "parsing the PMML document into an in-memory data structure" or "executing the PMML document"?

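Your code appears to aim for the latter. But it is certain to fail, because the JAXBUtil#unmarshalPMML(Source) method is invoked with a byte array that does not contain a valid PMML document (no XML parser will accept "....".getBytes("UTF-8")).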
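
The difference between the bytes of a path string and the bytes of the file it names can be shown with the JDK alone. The sketch below is only an illustration: the class name, the helper methods, and the temporary file are hypothetical, and a plain SAX parser stands in for the JAXB unmarshaller. The stream returned by `fileContent` is the kind of stream one would presumably wrap in a `StreamSource` instead.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

/** Contrasts "bytes of the path string" with "bytes of the file at that path". */
final class PathVersusContentSketch {

    /** What the question's code does: encodes the path text itself. */
    static InputStream pathBytes(Path modelPath) {
        return new ByteArrayInputStream(modelPath.toString().getBytes(StandardCharsets.UTF_8));
    }

    /** Opens the file that the path points to. */
    static InputStream fileContent(Path modelPath) throws IOException {
        return Files.newInputStream(modelPath);
    }

    /** Returns true if the stream holds well-formed XML, false if the parser rejects it. */
    static boolean parsesAsXml(InputStream in) throws IOException, ParserConfigurationException {
        try (InputStream stream = in) {
            SAXParserFactory.newInstance().newSAXParser().parse(stream, new DefaultHandler());
            return true;
        } catch (SAXException e) {
            return false;
        }
    }

    public static void main(String[] args) throws Exception {
        // Hypothetical stand-in for a real PMML file.
        Path modelPath = Files.createTempFile("model", ".pmml");
        Files.write(modelPath,
                "<PMML xmlns=\"http://www.dmg.org/PMML-4_2\" version=\"4.2\"/>"
                        .getBytes(StandardCharsets.UTF_8));

        System.out.println("path string parses as XML:  " + parsesAsXml(pathBytes(modelPath)));
        System.out.println("file content parses as XML: " + parsesAsXml(fileContent(modelPath)));
        Files.delete(modelPath);
    }
}
```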

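Also, what do you mean by "takes a long time"? The JAXB framework has a one-time initialization cost of about 1 second. After that, it can unmarshal roughly 200 to 500 MB (i.e. megabytes) of PMML content per second. How much more do you need?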
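
One way to answer that question with numbers is to time the first call separately from later calls, so that a one-time initialization cost is not mistaken for the steady-state cost. The following is a rough sketch using only `System.nanoTime`; the class name, the helpers, and the stand-in workload are hypothetical, and it treats 1 MB as 1,000,000 bytes. It is not a substitute for a proper benchmarking harness.

```java
/** Separates a one-time start-up cost from the steady-state cost of a repeated call. */
final class WarmUpTimingSketch {

    /** Runs the task once and returns the elapsed wall-clock time in milliseconds. */
    static double timeMillis(Runnable task) {
        long start = System.nanoTime();
        task.run();
        return (System.nanoTime() - start) / 1_000_000.0;
    }

    /** Runs the task 'iterations' times and returns the mean time per call in milliseconds. */
    static double averageMillis(Runnable task, int iterations) {
        if (iterations <= 0) {
            throw new IllegalArgumentException("iterations must be positive");
        }
        long start = System.nanoTime();
        for (int i = 0; i < iterations; i++) {
            task.run();
        }
        return (System.nanoTime() - start) / 1_000_000.0 / iterations;
    }

    /** Throughput in MB/s, taking 1 MB as 1,000,000 bytes (an assumption of this sketch). */
    static double megabytesPerSecond(long bytes, double millis) {
        return (bytes / 1_000_000.0) / (millis / 1_000.0);
    }

    public static void main(String[] args) {
        // Stand-in workload; replace it with the real unmarshal or evaluate call.
        Runnable task = () -> String.valueOf(Math.random()).hashCode();

        double firstCall = timeMillis(task);          // includes any one-time set-up
        double laterCalls = averageMillis(task, 100); // steady-state cost per call
        System.out.printf("first call: %.3f ms, later calls: %.3f ms each%n",
                firstCall, laterCalls);
    }
}
```

Reporting the two figures separately, together with the size of the PMML file, makes it possible to tell whether the time goes into parsing the document or into `evaluate` itself.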

Regarding "java - evaluate method takes a long time - PMML model using Jpmml", a similar question was found on Stack Overflow: https://stackoverflow.com/questions/42074491/
