
scala - In the Java, Scala, or Kotlin ecosystem, is there a reliable way to repackage library dependencies to avoid version conflicts?

Reposted. Author: 行者123. Updated: 2023-12-05 05:33:46

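This may be an old question, but it is still waiting for a solution. The whole question originates from a small detail in the development of Apache Spark, one of the largest open-source projects in history.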

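During the delivery and release of Spark 1.x and 2.x, a key library dependency (Apache Hive 1.x) was found to pull in too many outdated transitive dependencies, which easily conflicted with YARN/HDFS deployments. Realizing that it would not have enough resources to enforce the mono-repo principle (that is, ensuring that every library in the dependency tree has exactly one version), the team made, compiled, and published a hard fork of Apache Hive: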

https://github.com/JoshRosen/hive

https://mvnrepository.com/artifact/org.spark-project.hive/hive-common/1.2.1.spark2

The only difference from official Apache Hive is that all source-code references to the package "org.apache.hive" are replaced with "org.spark-project.hive".

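This is obviously a poor way to use another project: either the new code cannot keep up with the development of the Apache Hive community, or trivial, repetitive work is needed to keep it up to date. It also opens the door to a dangerous exploit, in which an unsigned jar can be used to swap out the relocated jar (also unsigned) in an Apache Spark installation. As a result, the relocated project was discontinued after Spark 3.0: with sufficient resources available, a new, original Apache Hive 2.x was introduced and most of the outdated dependencies were upgraded.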

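One would hope that, 5 years after the release of Apache Spark 2.0 and with all the improvements in build tools and plugins, such a process would be largely automated. Specifically, there are 2 plugins (the maven shade plugin and the gradle shadow plugin) designed specifically for relocating dependency packages, which should be able to generate relocated Apache Hive bytecode directly from canonical Hive. But a quick experiment soon showed that neither of them can accomplish such a simple task: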

https://github.com/tribbloid/autoshade

The project contains 2 subprojects that are used only for repackaging, one written in maven and the other in gradle.

The maven subproject uses the maven shade plugin to relocate json4s to repacked.test1.org.json4s:


<dependencies>
    <dependency>
        <groupId>org.json4s</groupId>
        <artifactId>json4s-jackson_${vs.scalaBinaryV}</artifactId>
        <version>4.0.4</version>
    </dependency>
</dependencies>

<build>
    <plugins>

        <plugin>
            <groupId>org.apache.maven.plugins</groupId>
            <artifactId>maven-shade-plugin</artifactId>
            <version>3.2.4</version>
            <executions>
                <execution>
                    <phase>package</phase>
                    <goals>
                        <goal>shade</goal>
                    </goals>

                    <configuration>
                        <!-- <createSourcesJar>true</createSourcesJar>-->

                        <createDependencyReducedPom>true</createDependencyReducedPom>
                        <dependencyReducedPomLocation>${project.build.directory}/dependency-reduced-pom.xml</dependencyReducedPomLocation>
                        <!-- <generateUniqueDependencyReducedPom>true</generateUniqueDependencyReducedPom>-->

                        <keepDependenciesWithProvidedScope>false</keepDependenciesWithProvidedScope>
                        <promoteTransitiveDependencies>false</promoteTransitiveDependencies>

                        <!-- <shadedClassifierName>${spark.classifier}</shadedClassifierName>-->
                        <relocations>
                            <relocation>
                                <pattern>org.json4s</pattern>
                                <shadedPattern>repacked.test1.org.json4s</shadedPattern>
                            </relocation>
                        </relocations>

                        <filters>
                            <filter>
                                <artifact>*:*</artifact>
                                <excludes>
                                    <exclude>META-INF/*.SF</exclude>
                                    <exclude>META-INF/*.DSA</exclude>
                                </excludes>
                            </filter>
                        </filters>
                    </configuration>
                </execution>
            </executions>
        </plugin>
    </plugins>
</build>

The gradle subproject uses the shadow plugin to relocate json4s to repacked.test2.org.json4s:

dependencies {
    api("org.json4s:json4s-jackson_${vs.scalaBinaryV}:4.0.4")
}

tasks {
    shadowJar {
        exclude("META-INF/*.SF")
        exclude("META-INF/*.DSA")

        relocate("org.json4s", "repacked.test2.org.json4s")
    }
}
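Both configurations above ask for the same kind of prefix relocation. As a rough illustration of what that rule means for a class name and for its path inside the jar, here is a minimal Java sketch. It is not the code of either plugin: the class `PrefixRelocationSketch` and its methods are hypothetical, and the strict package-boundary check is an assumption made for clarity.

```java
// Hypothetical illustration of prefix relocation; not taken from maven-shade-plugin or the shadow plugin.
final class PrefixRelocationSketch {

    // Rewrites a dotted class name such as org.json4s.Formats.
    static String relocateClass(String name, String pattern, String shaded) {
        // Assumption: only match on a package boundary, so "org.json4sx" is left alone.
        if (name.equals(pattern) || name.startsWith(pattern + ".")) {
            return shaded + name.substring(pattern.length());
        }
        return name;
    }

    // Rewrites a jar entry path such as org/json4s/Formats.class.
    static String relocatePath(String path, String pattern, String shaded) {
        String from = pattern.replace('.', '/');
        String to = shaded.replace('.', '/');
        if (path.equals(from) || path.startsWith(from + "/")) {
            return to + path.substring(from.length());
        }
        return path;
    }

    public static void main(String[] args) {
        String pattern = "org.json4s";
        String shaded = "repacked.test1.org.json4s";
        System.out.println(relocateClass("org.json4s.Formats", pattern, shaded));
        System.out.println(relocatePath("org/json4s/Formats.class", pattern, shaded));
    }
}
```

A relocator has to apply this mapping consistently to jar entry paths and to every reference stored inside the class files, including any metadata a compiler reads later; missing one of those places leaves the jar inconsistent.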

After that, a third project (in gradle, but that does not matter) declares both as dependencies and uses Scala to access the newly relocated classes:


dependencies {
    api(project(":repack:gradle", configuration = "shadow"))

    api("com.tribbloids.autoshade:repack-maven:0.0.1")
}

class Json4sTest {

  classOf[test1.org.json4s.Formats]

  classOf[test2.org.json4s.Formats]
}

Surprisingly, it fails to compile:

[Error] /home/peng/git-proto/autoshade/main/src/main/scala/com/tribbloids/spookystuff/Json4sTest.scala:7:11: Symbol 'term org.json4s' is missing from the classpath.
This symbol is required by ' <none>'.
Make sure that term json4s is in your classpath and check for conflicting dependencies with `-Ylog-classpath`.
A full rebuild may help if 'package.class' was compiled against an incompatible version of org.
[Error] /home/peng/git-proto/autoshade/main/src/main/scala/com/tribbloids/spookystuff/Json4sTest.scala:7:28: type Formats is not a member of package repacked.test1.org.json4s
[Error] /home/peng/git-proto/autoshade/main/src/main/scala/com/tribbloids/spookystuff/Json4sTest.scala:10:11: Symbol 'term org.json4s' is missing from the classpath.
This symbol is required by ' <none>'.
Make sure that term json4s is in your classpath and check for conflicting dependencies with `-Ylog-classpath`.
A full rebuild may help if 'package.class' was compiled against an incompatible version of org.
[Error] /home/peng/git-proto/autoshade/main/src/main/scala/com/tribbloids/spookystuff/Json4sTest.scala:10:28: type Formats is not a member of package repacked.test2.org.json4s

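If a nonexistent class had been referenced, the first and third error messages would not appear. It can therefore be inferred that the package relocation is incomplete and inconsistent, and of no use at all compared with the Apache Spark team's earlier manual relocation of the source code.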
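The source does not identify the cause. One plausible explanation, stated here only as an assumption, is that Scala 2 class files carry compiler metadata in a `scala.reflect.ScalaSignature` annotation that still names the original package, and that a relocator which rewrites only ordinary bytecode references leaves that metadata untouched, so scalac still looks for `org.json4s`. The hypothetical Java sketch below checks whether a class file references that annotation by searching the raw bytes for its descriptor. It is a heuristic probe for investigating the hypothesis, not a parser, and it does not prove that the signature is stale.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;

// Hypothetical helper; the class and method names are not from the source.
final class ScalaSigProbe {

    // Descriptor under which the annotation type would appear in the constant pool.
    static final String MARKER = "Lscala/reflect/ScalaSignature;";

    // Heuristic: ISO-8859-1 maps every byte to one char, so an ASCII substring search is exact.
    static boolean mentionsScalaSignature(byte[] classBytes) {
        String text = new String(classBytes, StandardCharsets.ISO_8859_1);
        return text.contains(MARKER);
    }

    public static void main(String[] args) throws IOException {
        // Usage: java ScalaSigProbe path/to/Some.class (a class file extracted from the shaded jar)
        byte[] bytes = Files.readAllBytes(Paths.get(args[0]));
        System.out.println(mentionsScalaSignature(bytes));
    }
}
```

If relocated classes report `true`, the next step would be to decode the signature and see which package names it still contains; that decoding is outside the scope of this sketch.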

So why is such a simple task so hard to automate? What extra steps are needed in maven or gradle to make it work?

Best answer

At present (October 13, 2022), the only working solution is through sbt. The following build file is used in https://github.com/tribbloid/autoshade/blob/main/repack/sbt/build.sbt; it invokes AssemblyPlugin to publish a shaded assembly jar:

project
  .in(file("."))
  .settings(commonSettings)
  .settings(
    scalacOptions += "-Ymacro-annotations",
    libraryDependencies ++= Seq(
      "org.json4s" %% "json4s-jackson" % "4.0.4"
    ),
    addArtifact(
      Artifact("repack-sbt", "assembly"),
      sbtassembly.AssemblyKeys.assembly
    ),
    ThisBuild / assemblyMergeStrategy := {
      case PathList("module-info.class") => MergeStrategy.discard
      case x if x.endsWith("/module-info.class") => MergeStrategy.discard
      case x =>
        val oldStrategy = (ThisBuild / assemblyMergeStrategy).value
        oldStrategy(x)
    },
    artifact in (Compile, assembly) := {
      val art = (artifact in (Compile, assembly)).value
      art.withClassifier(Some("assembly"))
    },
    ThisBuild / assemblyJarName := {
      s"${name.value}-${scalaBinaryVersion.value}-${version.value}-assembly.jar"
    },
    ThisBuild / assemblyShadeRules := Seq(
      ShadeRule.rename("org.json4s.**" -> "repacked.test3.org.json4s.@1").inAll
    )
  )
  .enablePlugins(AssemblyPlugin)
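The shade rule in the build file above uses a wildcard pattern (`org.json4s.**`) and a back-reference (`@1`). To make its intent concrete, here is a small Java sketch of how such a rename rule can be read. It is a simplified model, not sbt-assembly's implementation: the class `ShadeRuleSketch` is hypothetical, and the sketch assumes `**` matches any non-empty remainder, `*` matches a single package segment, and `@n` is replaced by the n-th wildcard match.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical, simplified model of a wildcard rename rule; not sbt-assembly's code.
final class ShadeRuleSketch {

    // Translates the wildcard pattern into a regular expression with one group per wildcard.
    static Pattern compile(String pattern) {
        StringBuilder regex = new StringBuilder();
        int i = 0;
        while (i < pattern.length()) {
            if (pattern.startsWith("**", i)) {
                regex.append("(.+)");      // assumption: any remainder, dots included
                i += 2;
            } else if (pattern.charAt(i) == '*') {
                regex.append("([^.]+)");   // assumption: exactly one package segment
                i += 1;
            } else {
                regex.append(Pattern.quote(String.valueOf(pattern.charAt(i))));
                i += 1;
            }
        }
        return Pattern.compile(regex.toString());
    }

    // Returns the renamed class name, or the input unchanged when the pattern does not match.
    static String rename(String className, String pattern, String result) {
        Matcher m = compile(pattern).matcher(className);
        if (!m.matches()) {
            return className;
        }
        String out = result;
        // Substitute higher-numbered references first so "@1" never clobbers "@10".
        for (int g = m.groupCount(); g >= 1; g--) {
            out = out.replace("@" + g, m.group(g));
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(rename("org.json4s.jackson.JsonMethods",
                "org.json4s.**", "repacked.test3.org.json4s.@1"));
    }
}
```

Under these assumptions the rule behaves like the prefix relocations in the maven and gradle subprojects, so the difference in outcome is unlikely to come from the rule syntax itself.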

After publishing:

sbt "clean;publishM2"
...
[success] Total time: 0 s, completed Oct. 13, 2022, 4:19:49 p.m.
[info] Wrote /home/peng/git-proto/autoshade/repack/sbt/target/scala-2.13/repack-sbt_2.13-0.0.1-SNAPSHOT.pom
[info] Strategy 'discard' was applied to 9 files (Run the task at debug level to see details)
[info] Strategy 'rename' was applied to 4 files (Run the task at debug level to see details)
[info] published repack-sbt_2.13 to file:/home/peng/.m2/repository/com/tribbloids/autoshade/repack-sbt_2.13/0.0.1-SNAPSHOT/repack-sbt_2.13-0.0.1-SNAPSHOT-sources.jar
[info] published repack-sbt_2.13 to file:/home/peng/.m2/repository/com/tribbloids/autoshade/repack-sbt_2.13/0.0.1-SNAPSHOT/repack-sbt_2.13-0.0.1-SNAPSHOT-javadoc.jar
[info] published repack-sbt_2.13 to file:/home/peng/.m2/repository/com/tribbloids/autoshade/repack-sbt_2.13/0.0.1-SNAPSHOT/repack-sbt_2.13-0.0.1-SNAPSHOT.jar
[info] published repack-sbt_2.13 to file:/home/peng/.m2/repository/com/tribbloids/autoshade/repack-sbt_2.13/0.0.1-SNAPSHOT/repack-sbt_2.13-0.0.1-SNAPSHOT.pom
[info] published repack-sbt_2.13 to file:/home/peng/.m2/repository/com/tribbloids/autoshade/repack-sbt_2.13/0.0.1-SNAPSHOT/repack-sbt_2.13-0.0.1-SNAPSHOT-assembly.jar
[success] Total time: 3 s, completed Oct. 13, 2022, 4:19:53 p.m.
...

Any class in the assembly jar can be referenced under the new package repacked.test3.org.json4s.
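One way to sanity-check a shaded jar such as this one is to list its entries and count the class files under the relocated package and under the original one. The Java sketch below does that; `RelocationCheck` and its methods are hypothetical names, and the jar path is taken from the command line rather than assumed. It inspects entry paths only, so it cannot detect stale package names inside class metadata.

```java
import java.io.IOException;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;
import java.util.jar.JarFile;
import java.util.stream.Collectors;

// Hypothetical helper for inspecting a shaded jar; names are not from the source.
final class RelocationCheck {

    // Reads every entry name from the jar.
    static List<String> entryNames(Path jar) throws IOException {
        try (JarFile jarFile = new JarFile(jar.toFile())) {
            return jarFile.stream().map(e -> e.getName()).collect(Collectors.toList());
        }
    }

    // Selects the class files that live under the given dotted package.
    static List<String> classesUnder(Collection<String> names, String dottedPackage) {
        String prefix = dottedPackage.replace('.', '/') + "/";
        List<String> hits = new ArrayList<>();
        for (String name : names) {
            if (name.startsWith(prefix) && name.endsWith(".class")) {
                hits.add(name);
            }
        }
        return hits;
    }

    public static void main(String[] args) throws IOException {
        // Usage: java RelocationCheck path/to/assembly.jar
        List<String> names = entryNames(Paths.get(args[0]));
        System.out.println("relocated: " + classesUnder(names, "repacked.test3.org.json4s").size());
        System.out.println("left behind: " + classesUnder(names, "org.json4s").size());
    }
}
```

A correctly shaded jar would be expected to report zero classes left behind under `org.json4s`.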

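It is not yet clear which part of the sbt plugin makes this work correctly. Once that is figured out, the same routine should ideally be ported to both maven-shade-plugin and gradle-shadow-plugin.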

A similar question can be found on Stack Overflow: https://stackoverflow.com/questions/73768130/
