
java - Scalding job fails with VerifyError on EMR release 4.2.0

Reposted · Author: 可可西里 · Updated: 2023-11-01 14:57:51

We have a Scalding job that I would like to run on AWS Elastic MapReduce using release label 4.2.0.

This job ran successfully on AMI 2.4.2. When we upgraded to AMI 3.7.0, we hit a java.lang.VerifyError caused by incompatible jars. Our project uses version 1.5 of the commons-codec library, but an earlier, incompatible version ships with the AMI. Similarly, our project uses Scala 2.10, but version 2.11 ships with the AMI. We solved this by adding a bootstrap script that deleted every file on the cluster matching `commons-codec-1.[234].jar` or `scala-library-2.11.*.jar`.
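The jar-purging bootstrap action described above can be sketched roughly as follows (a minimal sketch, assuming the two glob patterns from the text; the demo runs against a scratch directory, whereas on EMR it would run against `/` with sudo, and the jar names created here are purely illustrative):

```shell
#!/bin/bash
# purge_jars ROOT: delete jars under ROOT matching the two patterns that
# conflicted with the job's bundled dependencies on AMI 3.7.0.
purge_jars() {
  find "$1" -name 'commons-codec-1.[234].jar' -delete
  find "$1" -name 'scala-library-2.11.*.jar' -delete
}

# Demo on a scratch directory (illustrative jar names, not real cluster paths):
root=$(mktemp -d)
touch "$root/commons-codec-1.4.jar" \
      "$root/commons-codec-1.5.jar" \
      "$root/scala-library-2.11.7.jar" \
      "$root/scala-library-2.10.5.jar"
purge_jars "$root"
ls "$root" | sort   # only the 1.5 codec and 2.10 Scala library remain
```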

Now we are upgrading again, to 4.2.0, and we are getting another VerifyError:

```
Exception in thread "main" java.lang.reflect.InvocationTargetException
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
    at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
    at com.twitter.scalding.Job$.apply(Job.scala:47)
    at com.twitter.scalding.Tool.getJob(Tool.scala:48)
    at com.twitter.scalding.Tool.run(Tool.scala:68)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
    at com.snowplowanalytics.snowplow.enrich.hadoop.JobRunner$.main(JobRunner.scala:33)
    at com.snowplowanalytics.snowplow.enrich.hadoop.JobRunner.main(JobRunner.scala)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
Caused by: java.lang.VerifyError: Bad type on operand stack
Exception Details:
  Location:
    com/snowplowanalytics/snowplow/enrich/common/utils/ConversionUtils$.decodeBase64Url(Ljava/lang/String;Ljava/lang/String;)Lscalaz/Validation; @5: invokevirtual
  Reason:
    Type 'org/apache/commons/codec/binary/Base64' (current frame, stack[0]) is not assignable to 'org/apache/commons/codec/binary/BaseNCodec'
  Current Frame:
    bci: @5
    flags: { }
    locals: { 'com/snowplowanalytics/snowplow/enrich/common/utils/ConversionUtils$', 'java/lang/String', 'java/lang/String' }
    stack: { 'org/apache/commons/codec/binary/Base64', 'java/lang/String' }
  Bytecode:
    0000000: 2ab7 008a 2cb6 0090 3a04 bb00 5459 1904
    0000010: b200 96b7 0099 3a05 b200 9e19 05b9 00a4
    0000020: 0200 b900 aa01 00a7 003e 4eb2 009e bb00
    0000030: ac59 b200 4112 aeb6 00b1 b700 b4b2 0041
    0000040: 06bd 0004 5903 2b53 5904 2c53 5905 2db6
    0000050: 00b9 53b6 00bf b900 c502 00b9 00a4 0200
    0000060: b900 c801 00b0
  Exception Handler Table:
    bci [0, 42] => handler: 42
  Stackmap Table:
    same_locals_1_stack_item_frame(@42,Object[#182])
    same_locals_1_stack_item_frame(@101,Object[#206])

    at com.snowplowanalytics.snowplow.enrich.hadoop.EtlJobConfig$.com$snowplowanalytics$snowplow$enrich$hadoop$EtlJobConfig$$base64ToJsonNode(EtlJobConfig.scala:224)
    at com.snowplowanalytics.snowplow.enrich.hadoop.EtlJobConfig$.loadConfigAndFilesToCache(EtlJobConfig.scala:126)
    at com.snowplowanalytics.snowplow.enrich.hadoop.EtlJob.<init>(EtlJob.scala:139)
    ... 16 more
```

Exploring which jars remain on the cluster following the purge:

```
$ sudo find / -name "*scala-*"
/usr/share/aws/emr/emrfs/cli/lib/scala-library-2.10.5.jar
/usr/share/aws/emr/emrfs/cli/lib/scala-reflect-2.10.4.jar
/usr/share/aws/emr/emrfs/cli/lib/scala-logging-api_2.10-2.1.2.jar
/usr/share/aws/emr/emrfs/cli/lib/nscala-time_2.10-1.2.0.jar
/usr/share/aws/emr/emrfs/cli/lib/scala-logging-slf4j_2.10-2.1.2.jar
$ sudo find / -name "*commons-codec*"
/usr/share/aws/emr/node-provisioner/lib/commons-codec-1.9.jar
/usr/share/aws/emr/emr-metrics/lib/commons-codec-1.6.jar
/usr/share/aws/emr/emr-metrics-client/lib/commons-codec-1.6.jar
/usr/share/aws/emr/emrfs/lib/commons-codec-1.9.jar
/usr/share/aws/emr/hadoop-state-pusher/lib/commons-codec-1.8.jar
/usr/lib/hbase/lib/commons-codec-1.7.jar
/usr/lib/mahout/lib/commons-codec-1.7.jar
```

The same error occurs on release 4.1.0. The `Reason` line suggests that an older commons-codec is still being loaded at runtime: in versions before 1.5, `Base64` does not yet extend `BaseNCodec`, which is exactly the assignability check that fails. What changed between 3.7.0 and the 4.x.x releases to cause this, and how can I work around it?

Best answer

In the end I added the following logic to a bootstrap step:

```
wget 'http://central.maven.org/maven2/commons-codec/commons-codec/1.5/commons-codec-1.5.jar'
sudo mkdir -p /usr/lib/hadoop/lib
sudo cp commons-codec-1.5.jar /usr/lib/hadoop/lib/remedial-commons-codec-1.5.jar
rm commons-codec-1.5.jar
```

This downloads the correct version of the jar from Maven and places it at the start of the classpath for the failing job step, where it takes precedence over the other versions of the jar.
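The mechanism this fix relies on can be illustrated with a small simulation (an assumption for illustration, not anything EMR-specific): a classloader serves a class from the first classpath entry that contains it, so a copy that appears earlier shadows every later copy. Here two temp directories stand in for two jars holding the same class, and a shell loop stands in for the classloader's first-match search:

```shell
#!/bin/bash
# Two "classpath entries", each containing the same resource name but
# different contents (standing in for commons-codec 1.5 vs. an older copy).
first=$(mktemp -d)
second=$(mktemp -d)
echo "codec 1.5" > "$first/Base64.marker"
echo "codec 1.4" > "$second/Base64.marker"

# First match wins, exactly like classpath lookup order: whichever entry
# is listed first supplies the class, and later copies are never consulted.
for entry in "$first" "$second"; do
  if [ -f "$entry/Base64.marker" ]; then
    cat "$entry/Base64.marker"   # -> codec 1.5
    break
  fi
done
```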

Is there a cleaner solution?
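One possibly cleaner direction, offered as an untested assumption for this EMR release rather than a verified fix: Hadoop 2 has a standard MapReduce setting, `mapreduce.job.user.classpath.first`, that tells task classloaders to prefer classes bundled in the job jar over the cluster's copies. The sketch below is a non-runnable invocation fragment; the jar and class names are hypothetical placeholders:

```
# Configuration sketch (assumption, not verified on EMR 4.2.0):
# prefer the job jar's classes over the cluster-provided jars.
hadoop jar your-assembly.jar com.example.YourScaldingJob \
  -D mapreduce.job.user.classpath.first=true \
  <job arguments>
```

This avoids mutating the cluster's filesystem in a bootstrap action, but whether every code path of the step honors the property would need to be tested.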

Regarding "java - Scalding job fails with VerifyError on EMR release 4.2.0", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/33870107/
