
java - How to get the p-values of a logistic regression in Spark MLlib using Java

Reposted. Author: 行者123. Last updated: 2023-11-30 09:55:07

How can I get the p-values of a logistic regression in Spark MLlib using Java? And how can I obtain the probability of the predicted class? Here is the code I have tried:

import org.apache.spark.SparkConf;
import org.apache.spark.SparkContext;
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.mllib.classification.LogisticRegressionWithLBFGS;
import org.apache.spark.mllib.linalg.Vector;
import org.apache.spark.mllib.linalg.Vectors;
import org.apache.spark.mllib.regression.LabeledPoint;
import org.apache.spark.mllib.util.MLUtils;
import scala.Tuple2;

SparkConf sparkConf = new SparkConf().setAppName("GRP").setMaster("local[*]");
SparkContext ctx = new SparkContext(sparkConf);

LabeledPoint pos = new LabeledPoint(1.0, Vectors.dense(1.0, 0.0, 3.0));
String path = "dataSetnew.txt";

JavaRDD<LabeledPoint> data = MLUtils.loadLibSVMFile(ctx, path).toJavaRDD();
JavaRDD<LabeledPoint>[] splits = data.randomSplit(new double[] {0.6, 0.4}, 11L);
JavaRDD<LabeledPoint> training = splits[0].cache();
JavaRDD<LabeledPoint> test = splits[1];

final org.apache.spark.mllib.classification.LogisticRegressionModel model =
        new LogisticRegressionWithLBFGS()
                .setNumClasses(2)
                .setIntercept(true)
                .run(training.rdd());

JavaRDD<Tuple2<Object, Object>> predictionAndLabels = test.map(
        new org.apache.spark.api.java.function.Function<LabeledPoint, Tuple2<Object, Object>>() {
            public Tuple2<Object, Object> call(LabeledPoint p) {
                Double prediction = model.predict(p.features());
                // System.out.println("prediction : " + prediction);
                return new Tuple2<Object, Object>(prediction, p.label());
            }
        }
);

Vector denseVecnew = Vectors.dense(112,110,110,0,0,0,0,0,0,0,0);
Double prediction = model.predict(denseVecnew);
Vector weightVector = model.weights();
System.out.println("weights : "+weightVector);
System.out.println("intercept : "+model.intercept());
System.out.println("forecast”+ prediction);
ctx.stop();

Best Answer

For binary classification you can use the LogisticRegressionModel.clearThreshold method. Once it has been called, predict returns the raw logistic score

1 / (1 + exp(-(weights · features + intercept)))

instead of a label. These scores fall in the [0, 1] range and can be interpreted as probabilities.

See the clearThreshold docs.
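As a minimal sketch of how this could look in the code from the question (assuming the same model and denseVecnew variables; clearThreshold() and setThreshold() are existing methods of the mllib LogisticRegressionModel):

// Drop the decision threshold so predict() returns the raw score
// in [0, 1] (interpretable as P(label = 1 | features)) instead of a 0/1 label.
model.clearThreshold();
double probability = model.predict(denseVecnew);
System.out.println("P(label = 1) : " + probability);

// Restore a threshold to get hard 0/1 label predictions again.
model.setThreshold(0.5);
System.out.println("predicted label : " + model.predict(denseVecnew));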

Regarding "java - How to get the p-values of a logistic regression in Spark MLlib using Java", a similar question was found on Stack Overflow: https://stackoverflow.com/questions/34528828/
