gpt4 book ai didi

apache-spark - 使用 PySpark 进行 ALS 训练会引发 StackOverflowError

转载 作者:行者123 更新时间:2023-11-30 08:35:10 25 4
gpt4 key购买 nike

当尝试在 Windows 上的 Spark MLLib (1.4) 中使用 ALS 训练机器学习模型时,Pyspark 总是以 StackoverflowError 终止。我尝试添加检查点,如 https://stackoverflow.com/a/31484461/36130 中所述。 -- 似乎没有帮助(虽然每次运行都会创建一个新目录,但它总是空的)。

这是训练代码和堆栈跟踪:

ranks = [8, 12]
lambdas = [0.1, 10.0]
numIters = [10, 20]
bestModel = None
bestValidationRmse = float("inf")
bestRank = 0
bestLambda = -1.0
bestNumIter = -1

for rank, lmbda, numIter in itertools.product(ranks, lambdas, numIters):
ALS.checkpointInterval = 2
model = ALS.train(training, rank, numIter, lmbda)
validationRmse = computeRmse(model, validation, numValidation)

if (validationRmse < bestValidationRmse):
bestModel = model
bestValidationRmse = validationRmse
bestRank = rank
bestLambda = lmbda
bestNumIter = numIter

testRmse = computeRmse(bestModel, test, numTest)

堆栈跟踪:

15/08/27 02:02:58 ERROR Executor: Exception in task 3.0 in stage 56.0 (TID 127)
java.lang.StackOverflowError
at java.io.ObjectInputStream$BlockDataInputStream.readInt(Unknown Source)
at java.io.ObjectInputStream.readHandle(Unknown Source)
at java.io.ObjectInputStream.readClassDesc(Unknown Source)
at java.io.ObjectInputStream.readOrdinaryObject(Unknown Source)
at java.io.ObjectInputStream.readObject0(Unknown Source)
at java.io.ObjectInputStream.defaultReadFields(Unknown Source)
at java.io.ObjectInputStream.readSerialData(Unknown Source)
at java.io.ObjectInputStream.readOrdinaryObject(Unknown Source)
at java.io.ObjectInputStream.readObject0(Unknown Source)
at java.io.ObjectInputStream.defaultReadFields(Unknown Source)
at java.io.ObjectInputStream.readSerialData(Unknown Source)
at java.io.ObjectInputStream.readOrdinaryObject(Unknown Source)
at java.io.ObjectInputStream.readObject0(Unknown Source)
at java.io.ObjectInputStream.defaultReadFields(Unknown Source)
at java.io.ObjectInputStream.readSerialData(Unknown Source)
at java.io.ObjectInputStream.readOrdinaryObject(Unknown Source)
at java.io.ObjectInputStream.readObject0(Unknown Source)
at java.io.ObjectInputStream.readObject(Unknown Source)
at scala.collection.immutable.$colon$colon.readObject(List.scala:362)
at sun.reflect.GeneratedMethodAccessor11.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
at java.lang.reflect.Method.invoke(Unknown Source)
at java.io.ObjectStreamClass.invokeReadObject(Unknown Source)
at java.io.ObjectInputStream.readSerialData(Unknown Source)

最佳答案

尝试设置检查点目录

sc.setCheckpointDir("/check_point_dir")

关于apache-spark - 使用 PySpark 进行 ALS 训练会引发 StackOverflowError,我们在Stack Overflow上找到一个类似的问题: https://stackoverflow.com/questions/32245185/

25 4 0
Copyright 2021 - 2024 cfsdn All Rights Reserved 蜀ICP备2022000587号
广告合作:1813099741@qq.com 6ren.com