
Java Spark MLlib: "ERROR OWLQN: Failure! Resetting history: breeze.optimize.NaNHistory:" when running Logistic Regression in the ml library

Reposted. Author: 行者123. Last updated: 2023-12-01 10:37:57

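I am trying to run Logistic Regression with the Apache Spark ml library, but every attempt produces an error message like this:

"ERROR OWLQN: Failure! Resetting history: breeze.optimize.NaNHistory:"

A sample of the data set used for the logistic regression: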

+-----+---------+---------+---------+--------+-------------+
|state|dayOfWeek|hourOfDay|minOfHour|secOfMin|     features|
+-----+---------+---------+---------+--------+-------------+
|  1.0|      7.0|      0.0|      0.0|     0.0|(4,[0],[7.0])|

The code for the logistic regression:
// Data set: one label column ("state") and four numeric feature columns
StructType schema = new StructType(
    new StructField[]{
        new StructField("state", DataTypes.DoubleType, false, Metadata.empty()),
        new StructField("dayOfWeek", DataTypes.DoubleType, false, Metadata.empty()),
        new StructField("hourOfDay", DataTypes.DoubleType, false, Metadata.empty()),
        new StructField("minOfHour", DataTypes.DoubleType, false, Metadata.empty()),
        new StructField("secOfMin", DataTypes.DoubleType, false, Metadata.empty())
    });
// Turn each labeled point into a Row: the label first, then the four feature values
List<Row> dataFromRDD = bucketsForMLs.map(p -> {
    return RowFactory.create(p.label(), p.features().apply(0), p.features().apply(1), p.features().apply(2), p.features().apply(3));
}).collect();

Dataset<Row> stateDF = sparkSession.createDataFrame(dataFromRDD, schema);

// Assemble the four feature columns into a single "features" vector column
String[] featureCols = new String[]{"dayOfWeek", "hourOfDay", "minOfHour", "secOfMin"};
VectorAssembler vectorAssembler = new VectorAssembler().setInputCols(featureCols).setOutputCol("features");
Dataset<Row> stateDFWithFeatures = vectorAssembler.transform(stateDF);

// Index the "state" column into the "label" column
StringIndexer labelIndexer = new StringIndexer().setInputCol("state").setOutputCol("label");
Dataset<Row> stateDFWithLabelAndFeatures = labelIndexer.fit(stateDFWithFeatures).transform(stateDFWithFeatures);

MLRExecutionForDF mlrExe = new MLRExecutionForDF(javaSparkContext);
mlrExe.execute(stateDFWithLabelAndFeatures);

// Logistic Regression part
LogisticRegressionModel lrModel = new LogisticRegression()
    .setMaxIter(maxItr)
    .setRegParam(regParam)
    .setElasticNetParam(elasticNetParam)
    // The error is reported during this call
    .fit(stateDFWithLabelAndFeatures);

Best answer

I just ran into the same error. It comes from breeze, the ScalaNLP package that Spark imports. It says that the derivative somehow could not be produced.

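I am not sure exactly what that means, but with my data set I can say that the less data I use, the more often the error is thrown. In other words, the higher the proportion of missing features for the classes being trained, the more often the error occurs. I think it has to do with the optimizer failing to work properly because information for some classes is missing.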
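One way to probe that hypothesis before fitting is to count, for each class, how many rows actually carry a non-zero value in each feature column. The following is only a minimal plain-Java sketch of that idea, not part of the original answer: the class name `ClassCoverageCheck`, the method `nonZeroCountsPerClass`, and every sample row except the first (which is taken from the question's table) are hypothetical.

```java
import java.util.Arrays;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper (not from the original post): reports how much
// feature information each class has in the training data.
final class ClassCoverageCheck {

    // For each label, counts the rows that have a non-zero value in each feature column.
    static Map<Double, int[]> nonZeroCountsPerClass(double[] labels, double[][] features) {
        Map<Double, int[]> counts = new TreeMap<>();
        for (int i = 0; i < labels.length; i++) {
            final double[] row = features[i];
            int[] perFeature = counts.computeIfAbsent(labels[i], k -> new int[row.length]);
            for (int j = 0; j < row.length; j++) {
                if (row[j] != 0.0) {
                    perFeature[j]++;
                }
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        // First row mirrors the question's sample; the others are made up for illustration.
        double[] labels = {1.0, 1.0, 0.0};
        double[][] features = {
            {7.0, 0.0, 0.0, 0.0},
            {3.0, 12.0, 0.0, 0.0},
            {0.0, 0.0, 0.0, 0.0}
        };
        // Columns: dayOfWeek, hourOfDay, minOfHour, secOfMin
        nonZeroCountsPerClass(labels, features).forEach((label, perFeature) ->
            System.out.println("label " + label + " -> " + Arrays.toString(perFeature)));
    }
}
```

A class whose counts are all zero, or a column that is zero for every class, would be a candidate for the kind of missing information described above. Whether that is what actually triggers the error here remains an assumption.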

Nevertheless, the error does not seem to stop the code from running.
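Because the run continues after the message, it may be worth confirming that the fitted values are still usable. The sketch below is an assumption on my part rather than something the answer verified: it is a plain-Java check that an array of doubles contains no NaN or infinite entries. The class name `FiniteCheck` and the sample arrays are hypothetical; with Spark, the array could come from `lrModel.coefficients().toArray()` for a binomial model or `lrModel.coefficientMatrix().toArray()` for a multinomial one.

```java
// Hypothetical helper (not from the original post): checks fitted values for NaN/Infinity.
final class FiniteCheck {

    // Returns true only if every value is a finite number (neither NaN nor +/-Infinity).
    static boolean allFinite(double[] values) {
        for (double v : values) {
            if (!Double.isFinite(v)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // Made-up coefficient arrays for illustration only.
        double[] healthy = {0.42, -1.3, 0.0, 2.5};
        double[] broken = {0.42, Double.NaN, 0.0, 2.5};
        System.out.println("healthy finite: " + allFinite(healthy));
        System.out.println("broken finite: " + allFinite(broken));
    }
}
```

If the check fails, the model is probably not trustworthy even though `fit` returned normally.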

For more on this error, see the similar question on Stack Overflow: https://stackoverflow.com/questions/45381403/
