
python - Spark/PySpark: An error occurred while trying to connect to the Java server (127.0.0.1:39543)


Good afternoon,

Over the past two days I have been running into a lot of connection problems with the Java server. It is a bit unusual, because the error does not always occur, only sometimes...

I am using PySpark together with a Jupyter Notebook. Everything runs on a VM instance in Google Cloud. I am using this machine type in Google Cloud:

custom (8 vCPUs, 200 GB) 

These are the other settings:

import pyspark

conf = pyspark.SparkConf().setAppName("App")
conf = (conf.setMaster('local[*]')
        .set('spark.executor.memory', '180G')
        .set('spark.driver.memory', '180G')
        .set('spark.driver.maxResultSize', '180G'))

sc = pyspark.SparkContext(conf=conf)
sq = pyspark.sql.SQLContext(sc)

I trained a random forest model and made predictions:

model = rf.fit(train)
predictions = model.transform(test)
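
(The question does not show how rf, train and test were created; presumably something along these lines, using pyspark.ml. The "features" and "label" column names and the numTrees value are assumptions:)

from pyspark.ml.classification import RandomForestClassifier

# Assumed setup -- not shown in the question. `data` stands for a DataFrame
# with a "features" vector column and a numeric "label" column.
rf = RandomForestClassifier(featuresCol="features", labelCol="label", numTrees=100)
train, test = data.randomSplit([0.8, 0.2], seed=42)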

Then I created the ROC curve and calculated the AUC value.
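
(For reference, a minimal sketch of that step with pyspark.ml; the question does not include this code, so the evaluator settings are assumptions:)

from pyspark.ml.evaluation import BinaryClassificationEvaluator

# Assumed AUC computation -- not shown in the question.
evaluator = BinaryClassificationEvaluator(labelCol="label",
                                          rawPredictionCol="rawPrediction",
                                          metricName="areaUnderROC")
auc = evaluator.evaluate(predictions)
print("AUC = {:.4f}".format(auc))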

Then I wanted to look at the confusion matrix:

confusion_mat = metrics.confusionMatrix().toArray()
print(confusion_mat)
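
(metrics is not defined in the snippet above; presumably it is a pyspark.mllib MulticlassMetrics object built from the predictions, roughly like this sketch, with the column names being assumptions:)

from pyspark.mllib.evaluation import MulticlassMetrics

# Assumed construction of `metrics` -- not shown in the question.
# MulticlassMetrics expects an RDD of (prediction, label) pairs of floats.
prediction_and_labels = predictions.select("prediction", "label").rdd \
    .map(lambda row: (float(row[0]), float(row[1])))
metrics = MulticlassMetrics(prediction_and_labels)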

And this is where the error occurs:

Traceback (most recent call last):
  File "/usr/lib/python2.7/SocketServer.py", line 290, in _handle_request_noblock
    self.process_request(request, client_address)
  File "/usr/lib/python2.7/SocketServer.py", line 318, in process_request
    self.finish_request(request, client_address)
  File "/usr/lib/python2.7/SocketServer.py", line 331, in finish_request
    self.RequestHandlerClass(request, client_address, self)
  File "/usr/lib/python2.7/SocketServer.py", line 652, in __init__
    self.handle()
  File "/usr/local/lib/python2.7/dist-packages/pyspark/accumulators.py", line 235, in handle
    num_updates = read_int(self.rfile)
  File "/usr/local/lib/python2.7/dist-packages/pyspark/serializers.py", line 577, in read_int
    raise EOFError
EOFError
ERROR:root:Exception while sending command.
Traceback (most recent call last):
  File "/usr/local/lib/python2.7/dist-packages/py4j/java_gateway.py", line 883, in send_command
    response = connection.send_command(command)
  File "/usr/local/lib/python2.7/dist-packages/py4j/java_gateway.py", line 1040, in send_command
    "Error while receiving", e, proto.ERROR_ON_RECEIVE)
Py4JNetworkError: Error while receiving
ERROR:py4j.java_gateway:An error occurred while trying to connect to the Java server (127.0.0.1:39543)
Traceback (most recent call last):
  File "/usr/local/lib/python2.7/dist-packages/py4j/java_gateway.py", line 963, in start
    self.socket.connect((self.address, self.port))
  File "/usr/lib/python2.7/socket.py", line 228, in meth
    return getattr(self._sock,name)(*args)
error: [Errno 111] Connection refused

This is the console output:

OpenJDK 64-Bit Server VM warning: INFO: os::commit_memory(0x00007f4998300000, 603979776, 0) failed; error='Cannot allocate memory' (errno=12)
#
# There is insufficient memory for the Java Runtime Environment to continue.
# Native memory allocation (mmap) failed to map 603979776 bytes for committing reserved memory.

The log file:

#
# There is insufficient memory for the Java Runtime Environment to continue.
# Native memory allocation (mmap) failed to map 603979776 bytes for committing reserved memory.
# Possible reasons:
# The system is out of physical RAM or swap space
# In 32 bit mode, the process size limit was hit
# Possible solutions:
# Reduce memory load on the system
# Increase physical memory or swap space
# Check if swap backing store is full
# Use 64 bit Java on a 64 bit OS
# Decrease Java heap size (-Xmx/-Xms)
# Decrease number of Java threads
# Decrease Java thread stack sizes (-Xss)
# Set larger code cache with -XX:ReservedCodeCacheSize=
# This output file may be truncated or incomplete.
#
# Out of Memory Error (os_linux.cpp:2643), pid=2377, tid=0x00007f1c94fac700
#
# JRE version: OpenJDK Runtime Environment (8.0_151-b12) (build 1.8.0_151-8u151-b12-0ubuntu0.16.04.2-b12)
# Java VM: OpenJDK 64-Bit Server VM (25.151-b12 mixed mode linux-amd64 )
# Failed to write core dump. Core dumps have been disabled. To enable core dumping, try "ulimit -c unlimited" before starting Java again
#

--------------- S Y S T E M ---------------

OS:DISTRIB_ID=Ubuntu
DISTRIB_RELEASE=16.04
DISTRIB_CODENAME=xenial
DISTRIB_DESCRIPTION="Ubuntu 16.04.3 LTS"

uname:Linux 4.13.0-1008-gcp #11-Ubuntu SMP Thu Jan 25 11:08:44 UTC 2018 x86_64
libc:glibc 2.23 NPTL 2.23
rlimit: STACK 8192k, CORE 0k, NPROC 805983, NOFILE 1048576, AS infinity
load average:7.69 4.51 3.57

/proc/meminfo:
MemTotal: 206348252 kB
MemFree: 1298460 kB
MemAvailable: 250308 kB
Buffers: 6812 kB
Cached: 438232 kB
SwapCached: 0 kB
Active: 203906416 kB
Inactive: 339540 kB
Active(anon): 203804300 kB
Inactive(anon): 8392 kB
Active(file): 102116 kB
Inactive(file): 331148 kB
Unevictable: 3652 kB
Mlocked: 3652 kB
SwapTotal: 0 kB
SwapFree: 0 kB
Dirty: 4688 kB
Writeback: 0 kB
AnonPages: 203805168 kB
Mapped: 23076 kB
Shmem: 8776 kB
Slab: 114476 kB
SReclaimable: 50640 kB
SUnreclaim: 63836 kB
KernelStack: 4752 kB
PageTables: 404292 kB
NFS_Unstable: 0 kB
Bounce: 0 kB
WritebackTmp: 0 kB
CommitLimit: 103174124 kB
Committed_AS: 205956256 kB
VmallocTotal: 34359738367 kB
VmallocUsed: 0 kB
VmallocChunk: 0 kB
HardwareCorrupted: 0 kB
AnonHugePages: 0 kB
ShmemHugePages: 0 kB
ShmemPmdMapped: 0 kB
CmaTotal: 0 kB
CmaFree: 0 kB
HugePages_Total: 0
HugePages_Free: 0
HugePages_Rsvd: 0
HugePages_Surp: 0
Hugepagesize: 2048 kB
DirectMap4k: 71628 kB
DirectMap2M: 4122624 kB
DirectMap1G: 207618048 kB


CPU:total 8 (initial active 8) (4 cores per cpu, 2 threads per core) family 6 model 85 stepping 3, cmov, cx8, fxsr, mmx, sse, sse2, sse3, ssse3, sse4.1, sse4.2, popcnt, avx, avx2, aes, clmul, erms, rtm, 3dnowpref, lzcnt, ht, tsc, tscinvbit, bmi1, bmi2, adx

Does anyone have an idea what the problem might be and how I can solve it? I'm desperate. :(

// I think the Java Runtime Environment does not have enough memory to continue... but what should I do?

Thank you very much!

Best answer

If you are

using this one in Google Cloud:

custom (8 vCPUs, 200 GB)

then you are significantly oversubscribing memory. Ignoring spark.executor.memory, which has no effect in local mode, spark.driver.memory accounts only for the JVM heap and does not include:

  • PySpark worker memory.
  • PySpark driver memory.

Even within the JVM, only a fraction of the memory is available for data processing (see the Memory Management Overview), so setting spark.driver.maxResultSize equal to the total allocated memory makes no sense.
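
As an illustration only (the exact numbers below are not from the original answer), a far more conservative local-mode configuration on a 200 GB machine would leave most of the RAM to the Python workers and the operating system, for example:

import pyspark

# Illustrative local-mode settings for a 200 GB machine -- the exact numbers
# are assumptions, not part of the original answer. In local[*] mode only the
# driver JVM exists, so spark.executor.memory is omitted entirely.
conf = (pyspark.SparkConf()
        .setAppName("App")
        .setMaster("local[*]")
        .set("spark.driver.memory", "64g")          # JVM heap only
        .set("spark.driver.maxResultSize", "16g"))  # well below the heap

sc = pyspark.SparkContext(conf=conf)

Note that spark.driver.memory only takes effect if it is set before the driver JVM is launched, i.e. before the first SparkContext is created.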

Regarding python - Spark/PySpark: An error occurred while trying to connect to the Java server (127.0.0.1:39543), we found a similar question on Stack Overflow: https://stackoverflow.com/questions/48523629/
