
python - You must build Spark with Hive (Spark 1.5.0)

Reposted. Author: 太空狗. Updated: 2023-10-29 21:58:21

I downloaded the Spark 1.5.0 pre-built binary and ran this simple code through pyspark:

from pyspark.sql import Row
l = [('Alice', 1)]
sqlContext.createDataFrame(l).collect()

It produces this error:

15/09/30 06:48:48 INFO Datastore: The class "org.apache.hadoop.hive.metastore.model.MResourceUri" is tagged as "embedded-only" so does not have its own datastore table.
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "c:\bigdata\spark-1.5\spark-1.5.0\python\pyspark\sql\context.py", line 408, in createDataFrame
    jdf = self._ssql_ctx.applySchemaToPythonRDD(jrdd.rdd(), schema.json())
  File "c:\bigdata\spark-1.5\spark-1.5.0\python\pyspark\sql\context.py", line 660, in _ssql_ctx
    "build/sbt assembly", e)
Exception: ("You must build Spark with Hive. Export 'SPARK_HIVE=true' and run build/sbt assembly", Py4JJavaError(u'An error occurred while calling None.org.apache.spark.sql.hive.HiveContext.\n', JavaObject id=o28))

So I tried building it myself:

c:\bigdata\spark-1.5\spark-1.5.0>.\build\apache-maven-3.3.3\bin\mvn -Phadoop-2.4 -Dhadoop.version=2.4.0 -DskipTests -Phive -Phive-thriftserver clean package

But I still get the same error with the compiled build.

Any suggestions?

Best answer

Add these lines after your import:

from pyspark import SparkContext
from pyspark.sql import SQLContext

sc = SparkContext('local', 'pyspark')
sqlContext = SQLContext(sc)
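Putting the answer together with the original snippet, a minimal complete script might look like the sketch below. The idea is that the pyspark shell's default `sqlContext` tries to instantiate a `HiveContext`, which fails on a build without Hive support; constructing a plain `SQLContext` explicitly sidesteps that. (This assumes Spark 1.5.x with `pyspark` on the Python path.)

```python
# Sketch: use a plain SQLContext instead of the shell's default
# HiveContext, which requires a Hive-enabled Spark build.
from pyspark import SparkContext
from pyspark.sql import SQLContext, Row

sc = SparkContext('local', 'pyspark')
sqlContext = SQLContext(sc)

l = [('Alice', 1)]
# collect() returns a list of Row objects built from the tuples
print(sqlContext.createDataFrame(l).collect())
```

Run it with `bin\spark-submit script.py` (or paste it into the pyspark shell); since `SQLContext` does not touch the Hive metastore, the "You must build Spark with Hive" exception no longer applies.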

Regarding "python - You must build Spark with Hive (Spark 1.5.0)", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/32858385/
