
apache-spark - PySpark Fix/Remove Console Progress Bar


As shown below, the Spark console progress bars clutter the output. Is there a configuration setting or flag that turns off the stage progress bars? Or better yet, how can I fix the console logging so that the progress bars disappear once a stage completes? This might just be a PySpark bug, but I'm not sure.

(CID, (v1 / n1, v2 / n2))
[Stage 46:============================================> (19 + 4) / 24]('1', (0.020000000000000035, 4.805))
('5', (6.301249999999998, 0.125))
('10', (21.78000000000001, 3.125))
('7', (0.005000000000000009, 0.6049999999999996))

(CID, sqrt(v1 / n1 + v2 / n2))
('1', 2.19658826364888)
('5', 2.5350049309616733)
('10', 4.990490957811667)
('7', 0.7810249675906652)

(CID, (AD_MEAN, NCI_MEAN))
('7', (1.0, 5.5))
('5', (7.75, 5.3))
('10', (13.5, 6.0))
('1', (3.0, 5.0))

(CID, (AD_MEAN - NCI_MEAN))
('7', -4.5)
('5', 2.45)
('1', -2.0)
('10', 7.5)

(CID, (NUMER, DENOM))
[Stage 100:===================================================> (30 + 2) / 32]('10', (7.5, 4.990490957811667))
('5', (2.45, 2.5350049309616733))
('7', (-4.5, 0.7810249675906652))
('1', (-2.0, 2.19658826364888))

Sometimes it is even worse (scroll right):
$ spark-submit main.py 
17/04/28 11:36:23 WARN Utils: Your hostname, Pandora resolves to a loopback address: 127.0.1.1; using 146.95.36.193 instead (on interface wlp3s0)
17/04/28 11:36:23 WARN Utils: Set SPARK_LOCAL_IP if you need to bind to another address
17/04/28 11:36:24 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
[Stage 0:> (0 + 2 [Stage 32:=============================> (4 + 4[Stage 37:> (0 + 0[Stage 35:=====> (4 + 2) / 12][Stage 37:> (0 + 0[Stage 35:===========> (8 + 4) / 12][Stage 37:> (0 + 0[Stage 37:=======> (1 + 3[Stage 37:=============================> (4 + 0[Stage 36:========> (13 + 4) / 24][Stage 37:=========> (4 + 0[Stage 36:==============> (21 + 3) / 24][Stage 37:=========> (4 + 1[Stage 37:====================================> (5 + 3[Stage 38:===================================> (20 + 4)[Stage 38:====================================================> (30 + 2) SORTED (t-value, CID)
[(-5.761659596980321, '7'), (-0.9105029072119708, '1'), (0.9664675480810896, '5'), (1.5028581483070664, '10')]

Best Answer

You can disable it by:

  • setting spark.ui.showConsoleProgress = false
  • or raising the logging level in log4j.properties above INFO, i.e. to ERROR

Related Spark JIRAs:

  • https://issues.apache.org/jira/browse/SPARK-4017
  • https://issues.apache.org/jira/browse/SPARK-18719

spark.ui.showConsoleProgress has been part of Spark since version 1.2, but it is only documented starting with Spark 2.2.

    Example code:
    spark.conf.set('spark.ui.showConsoleProgress', False)
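
For a standalone script it is usually easier to set the flag when the SparkSession is created, before any job runs. Below is a minimal sketch of that approach; the app name and master URL are placeholders, not taken from the original question:

    from pyspark.sql import SparkSession

    # Sketch: disable the console progress bar at session creation time.
    # "progress-bar-demo" and "local[*]" are placeholder values.
    spark = (
        SparkSession.builder
        .appName("progress-bar-demo")
        .master("local[*]")
        .config("spark.ui.showConsoleProgress", "false")
        .getOrCreate()
    )

    # ... run your job here ...

    spark.stop()

The same setting can also be passed on the command line, e.g. spark-submit --conf spark.ui.showConsoleProgress=false main.py.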

Regarding apache-spark - PySpark Fix/Remove Console Progress Bar, a similar question was found on Stack Overflow: https://stackoverflow.com/questions/43597703/
