python - Writing random files to HDFS - PySpark

Reposted. Author: 行者123. Updated: 2023-12-01 03:05:13

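I haven't seen any examples of how to do this. I'm using PySpark 2.0 in a Python 3 environment. I have random data: binary data, .jpg data, random strings. I just need to write the data back to the underlying storage.

For example: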

import os
with open(os.path.join(base_dir, 'RF_model.txt'), "w") as file1:
    toFile = raw_input(RF_model.toDebugString())
    file1.write(toFile)

(The above does not work.)

Thanks!

EDIT: output of RF_model.toDebugString()

  Tree 0:
    If (feature 0 <= 64.0)
     If (feature 2 <= 212.0)
      If (feature 3 <= 0.0)
       If (feature 2 <= 154.0)
        Predict: 1.0
       Else (feature 2 > 154.0)
        Predict: 1.0
      Else (feature 3 > 0.0)
       If (feature 2 <= 147.0)
        Predict: 0.0
       Else (feature 2 > 147.0)
        Predict: 0.0
     Else (feature 2 > 212.0)
      If (feature 2 <= 375.0)
       If (feature 3 <= 0.0)
        Predict: 0.0
       Else (feature 3 > 0.0)
        Predict: 0.0
      Else (feature 2 > 375.0)
       If (feature 0 <= 22.0)
        Predict: 0.0
       Else (feature 0 > 22.0)
        Predict: 0.0
    Else (feature 0 > 64.0)
     If (feature 2 <= 239.0)
      If (feature 3 <= 0.0)
       If (feature 2 <= 200.0)
        Predict: 0.0
       Else (feature 2 > 200.0)
        Predict: 0.0
      Else (feature 3 > 0.0)
       If (feature 2 <= 124.0)
        Predict: 0.0
       Else (feature 2 > 124.0)
        Predict: 0.0
     Else (feature 2 > 239.0)
      If (feature 2 <= 375.0)
       If (feature 1 <= 67.0)
        Predict: 0.0
       Else (feature 1 > 67.0)
        Predict: 0.0
      Else (feature 2 > 375.0)
       If (feature 1 <= 63.0)
        Predict: 0.0
       Else (feature 1 > 63.0)
        Predict: 0.0
  Tree 1:
    If (feature 0 <= 64.0)
     If (feature 2 <= 224.0)
      If (feature 3 <= 0.0)
       If (feature 2 <= 170.0)
        Predict: 1.0
       Else (feature 2 > 170.0)
        Predict: 1.0
      Else (feature 3 > 0.0)
       If (feature 2 <= 158.0)
        Predict: 0.0
       Else (feature 2 > 158.0)
        Predict: 0.0
     Else (feature 2 > 224.0)
      If (feature 2 <= 375.0)
       If (feature 3 <= 0.0)
        Predict: 0.0
       Else (feature 3 > 0.0)
        Predict: 0.0

Best answer

I hope I'm right in assuming that you want to write the output of .toDebugString() to a text file.

In PySpark, you can use .saveAsTextFile to save any parallelized data as a text file:

# Important step: first parallelize the data that you need to save.
# A single partition (numSlices=1) avoids a scatter of empty part files.
rdd = sc.parallelize([str(RF_model.toDebugString())], 1)

# Then save it as a text file; use this form if the underlying storage is HDFS.
rdd.saveAsTextFile('hdfs://' + base_dir + '/RF_model.txt')
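
saveAsTextFile is oriented toward text records, so it is not a natural fit for the binary or .jpg data mentioned in the question. One possible alternative, sketched here under the assumption that the `hdfs` command-line client is installed and on the PATH of the machine running the driver, is to pipe raw bytes to `hdfs dfs -put`, which reads from standard input when the source is given as `-`. The helper names `build_put_command` and `put_bytes` are hypothetical, not part of PySpark.

```python
import subprocess


def build_put_command(hdfs_path, overwrite=True):
    """Build the argv for `hdfs dfs -put`, reading the payload from stdin ("-")."""
    cmd = ["hdfs", "dfs", "-put"]
    if overwrite:
        cmd.append("-f")  # overwrite the destination if it already exists
    cmd += ["-", hdfs_path]
    return cmd


def put_bytes(data, hdfs_path, overwrite=True):
    """Write raw bytes to hdfs_path. Assumes the `hdfs` CLI is available."""
    # check=True raises CalledProcessError if the command exits non-zero.
    subprocess.run(build_put_command(hdfs_path, overwrite), input=data, check=True)
```

With a reachable cluster, a call such as `put_bytes(image_bytes, "/some/hdfs/dir/photo.jpg")` would upload the bytes as a single file; the destination path here is only a placeholder.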

Or, if you just want to save it to the local file system:

rdd.saveAsTextFile("file:///"+base_dir+"/RF_model.txt")
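
Note that saveAsTextFile creates a directory named RF_model.txt containing part files, not a single file. If one local file on the driver is enough, plain Python file I/O should also work. The snippet in the question most likely fails because raw_input does not exist in Python 3, and in Python 2 it would prompt for input rather than return the string passed to it. A minimal sketch, where `write_text` is a hypothetical helper, `debug_text` stands in for RF_model.toDebugString(), and a temporary directory stands in for base_dir:

```python
import os
import tempfile


def write_text(base_dir, filename, text):
    """Write text to base_dir/filename on the local (driver) file system."""
    path = os.path.join(base_dir, filename)
    with open(path, "w") as f:
        f.write(text)
    return path


# Stand-in for RF_model.toDebugString(), which returns a plain string.
debug_text = "Tree 0:\n  If (feature 0 <= 64.0)\n    Predict: 1.0\n"
base_dir = tempfile.mkdtemp()  # placeholder directory for this sketch
saved_path = write_text(base_dir, "RF_model.txt", debug_text)
```

For binary data, the same pattern applies with mode "wb" and a bytes payload.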

Regarding "python - Writing random files to HDFS - PySpark", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/43504716/
