
java - Only one SparkContext may be running in this JVM - [SPARK]

Reposted. Author: 塔克拉玛干. Updated: 2023-11-02 08:03:39

I am trying to run the following code to get Twitter information in real time:

import org.apache.spark._
import org.apache.spark.streaming._
import org.apache.spark.streaming.twitter._
import org.apache.spark.streaming.StreamingContext._
import twitter4j.auth.Authorization
import twitter4j.Status
import twitter4j.auth.AuthorizationFactory
import twitter4j.conf.ConfigurationBuilder
import org.apache.spark.streaming.api.java.JavaStreamingContext

import org.apache.spark.rdd.RDD
import org.apache.spark.SparkContext
import org.apache.spark.mllib.feature.HashingTF
import org.apache.spark.mllib.linalg.Vector
import org.apache.spark.SparkConf
import org.apache.spark.api.java.JavaSparkContext
import org.apache.spark.api.java.function.Function
import org.apache.spark.streaming.Duration
import org.apache.spark.streaming.api.java.JavaDStream
import org.apache.spark.streaming.api.java.JavaReceiverInputDStream

val consumerKey = "xxx"
val consumerSecret = "xxx"
val accessToken = "xxx"
val accessTokenSecret = "xxx"
val url = "https://stream.twitter.com/1.1/statuses/filter.json"

val sparkConf = new SparkConf().setAppName("Twitter Streaming")
val sc = new SparkContext(sparkConf)

val documents: RDD[Seq[String]] = sc.textFile("").map(_.split(" ").toSeq)


// Twitter Streaming
val ssc = new JavaStreamingContext(sc,Seconds(2))

val conf = new ConfigurationBuilder()
conf.setOAuthAccessToken(accessToken)
conf.setOAuthAccessTokenSecret(accessTokenSecret)
conf.setOAuthConsumerKey(consumerKey)
conf.setOAuthConsumerSecret(consumerSecret)
conf.setStreamBaseURL(url)
conf.setSiteStreamBaseURL(url)

val filter = Array("Twitter", "Hadoop", "Big Data")

val auth = AuthorizationFactory.getInstance(conf.build())
val tweets : JavaReceiverInputDStream[twitter4j.Status] = TwitterUtils.createStream(ssc, auth, filter)

val statuses = tweets.dstream.map(status => status.getText)
statuses.print()
ssc.start()

But when it reaches the command val sc = new SparkContext(sparkConf), the following error appears:

17/05/09 09:08:35 WARN SparkContext: Multiple running SparkContexts detected in the same JVM! org.apache.spark.SparkException: Only one SparkContext may be running in this JVM (see SPARK-2243). To ignore this error, set spark.driver.allowMultipleContexts = true.

I have already tried adding the following parameter to the sparkConf value, but I still get the error:

val sparkConf = new SparkConf().setAppName("Twitter Streaming").setMaster("local[4]").set("spark.driver.allowMultipleContexts", "true")

If I ignore the error and keep running the commands, I get another error:

17/05/09 09:15:44 WARN ReceiverSupervisorImpl: Restarting receiver with delay 2000 ms: Error receiving tweets 401:Authentication credentials (https://dev.twitter.com/pages/auth) were missing or incorrect. Ensure that you have set valid consumer key/secret, access token/secret, and the system clock is in sync. \n\n\nError 401 Unauthorized HTTP ERROR: 401

Problem accessing '/1.1/statuses/filter.json'. Reason:Unauthorized

Any kind of contribution is appreciated. Regards, and have a nice day.

Best answer

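spark-shell already prepares a SparkSession or SparkContext for you to use, so you do not need to, and cannot, initialize a new one. Usually, at the end of the spark-shell startup output, there is a line telling you which variable it is available under. allowMultipleContexts exists only for testing certain Spark features and should not be used in most cases.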
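As a minimal sketch of what this means in practice, the snippet below reuses the shell's existing context instead of constructing a second one. It assumes it is pasted into spark-shell, where the variable is conventionally named `sc`, and that the Spark Streaming module is on the classpath. It uses the Scala `StreamingContext` rather than the `JavaStreamingContext` from the question, which is a substitution on my part and not something the answer prescribes.

```scala
// Assumption: this runs inside spark-shell, which has already created `sc`.
import org.apache.spark.SparkContext
import org.apache.spark.streaming.{Seconds, StreamingContext}

// Build the streaming context on top of the shell's SparkContext.
// No `new SparkContext(...)` call is made, so no second context is started.
val ssc = new StreamingContext(sc, Seconds(2))

// getOrCreate() hands back the already-running context when one exists,
// so code that may run both inside and outside the shell can use it
// instead of the SparkContext constructor.
val active = SparkContext.getOrCreate()
```

With this in place, the line `val sc = new SparkContext(sparkConf)` from the question can simply be dropped when working in the shell. The 401 error is a separate matter of Twitter credentials and is not affected by this change.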

Regarding java - Only one SparkContext may be running in this JVM - [SPARK], we found a similar question on Stack Overflow: https://stackoverflow.com/questions/43890060/
