scala - Mocking spark.implicits in Spark Scala for unit testing

Reposted. Author: 行者123. Updated: 2023-12-05 04:57:14

While trying to simplify unit testing with Spark and Scala, I am using scala-test and mockito-scala (with mockito sugar). This lets you do things like:

val sparkSessionMock = mock[SparkSession]

After that, you can usually do everything you need with "when" and "verify".

But if the implementation under test needs to import

import spark.implicits._

in its code, the simplicity of unit testing seems to disappear (or at least I have not yet found the most appropriate way to solve this).

I end up with this error:

org.mockito.exceptions.verification.SmartNullPointerException: 
You have a NullPointerException here:
-> at ...
because this method call was *not* stubbed correctly:
-> at scala.Option.orElse(Option.scala:289)
sparkSession.implicits();

Simply mocking the call to the "implicits" object in SparkSession does not help, because of a typing problem:

val implicitsMock = mock[SQLImplicits]
when(sparkSessionMock.implicits).thenReturn(implicitsMock)

does not compile, because the compiler requires the type of the object that belongs to your mock:

require: sparkSessionMock.implicits.type
found: implicitsMock.type

Please don't tell me that I should use SparkSession.builder.getOrCreate() instead, because then this is no longer a unit test but a more heavyweight integration test.

(Edit): here is a complete reproducible example:

import org.apache.spark.sql._
import org.mockito.Mockito.when
import org.scalatest.{ FlatSpec, Matchers }
import org.scalatestplus.mockito.MockitoSugar

case class MyData(key: String, value: String)

class ClassToTest()(implicit spark: SparkSession) {
  import spark.implicits._

  def read(path: String): Dataset[MyData] =
    spark.read.parquet(path).as[MyData]
}

class SparkMock extends FlatSpec with Matchers with MockitoSugar {

  it should "be able to mock spark.implicits" in {
    implicit val sparkMock: SparkSession = mock[SparkSession]
    val implicitsMock = mock[SQLImplicits]
    when(sparkMock.implicits).thenReturn(implicitsMock)
    val readerMock = mock[DataFrameReader]
    when(sparkMock.read).thenReturn(readerMock)
    val dataFrameMock = mock[DataFrame]
    when(readerMock.parquet("/some/path")).thenReturn(dataFrameMock)
    val dataSetMock = mock[Dataset[MyData]]
    implicit val testEncoder: Encoder[MyData] = Encoders.product[MyData]
    when(dataFrameMock.as[MyData]).thenReturn(dataSetMock)

    new ClassToTest().read("/some/path/") shouldBe dataSetMock
  }
}

Best answer

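You can't mock implicits. Implicits are resolved at compile time, whereas mocking happens at runtime (runtime reflection and bytecode manipulation via Byte Buddy). You cannot import, at compile time, implicits that will only be mocked at runtime. You have to resolve the implicits manually. (In principle, you could resolve implicits at runtime by launching the compiler again at runtime, but that would be much harder.)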

Try

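// Imports assume ScalaTest 3.1+ (AnyFlatSpec) and scalatestplus-mockito.
import org.apache.spark.sql._
import org.mockito.Mockito.when
import org.scalatest.flatspec.AnyFlatSpec
import org.scalatest.matchers.should.Matchers
import org.scalatestplus.mockito.MockitoSugar

// Same case class as in the question.
case class MyData(key: String, value: String)

// The Encoder is now an ordinary implicit parameter, so spark.implicits._ is not imported.
class ClassToTest()(implicit spark: SparkSession, encoder: Encoder[MyData]) {
  def read(path: String): Dataset[MyData] =
    spark.read.parquet(path).as[MyData]
}

class SparkMock extends AnyFlatSpec with Matchers with MockitoSugar {

  it should "be able to mock spark.implicits" in {
    implicit val sparkMock: SparkSession = mock[SparkSession]
    val readerMock = mock[DataFrameReader]
    when(sparkMock.read).thenReturn(readerMock)
    val dataFrameMock = mock[DataFrame]
    when(readerMock.parquet("/some/path")).thenReturn(dataFrameMock)
    val dataSetMock = mock[Dataset[MyData]]
    // Supplied explicitly by the test instead of coming from spark.implicits.
    implicit val testEncoder: Encoder[MyData] = Encoders.product[MyData]
    when(dataFrameMock.as[MyData]).thenReturn(dataSetMock)

    new ClassToTest().read("/some/path") shouldBe dataSetMock
  }
}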

//[info] SparkMock:
//[info] - should be able to mock spark.implicits
//[info] Run completed in 2 seconds, 727 milliseconds.
//[info] Total number of tests run: 1
//[info] Suites: completed 1, aborted 0
//[info] Tests: succeeded 1, failed 0, canceled 0, ignored 0, pending 0
//[info] All tests passed.

Note that "/some/path" must be the same in both places. In your code snippet, the two strings differ.
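
The compile-time point can also be illustrated without Spark or Mockito. The following is a minimal sketch under stated assumptions: `Codec`, `Session`, `NeedsImport`, `NeedsCodec` and `Demo` are hypothetical stand-ins invented for this illustration (roughly playing the roles of `Encoder`, `SparkSession` and the two versions of `ClassToTest`); they are not part of Spark's API.

```scala
// Hypothetical stand-ins for Spark's Encoder and SparkSession; not Spark's API.
trait Codec[A] { def name: String }

class Session {
  // A nested object, like SparkSession.implicits: its type is the singleton
  // type `someSession.implicits.type`, tied to one particular Session instance.
  object implicits {
    implicit val stringCodec: Codec[String] =
      new Codec[String] { val name = "from-session" }
  }
}

// Variant 1: the compiler resolves the implicit against `session.implicits`,
// so the class needs a working Session (and its nested object) at runtime.
class NeedsImport(session: Session) {
  import session.implicits._
  def codecName: String = implicitly[Codec[String]].name
}

// Variant 2: the Codec is an ordinary implicit parameter, so a test can
// supply one directly and Session.implicits is never touched.
class NeedsCodec()(implicit codec: Codec[String]) {
  def codecName: String = codec.name
}

object Demo {
  val testCodec: Codec[String] = new Codec[String] { val name = "from-test" }

  def main(args: Array[String]): Unit = {
    implicit val codec: Codec[String] = testCodec
    println(new NeedsImport(new Session).codecName) // prints "from-session"
    println(new NeedsCodec().codecName)             // prints "from-test"
  }
}
```

In the first variant the choice of `stringCodec` is fixed when `NeedsImport` is compiled, so replacing `Session` with a mock at runtime cannot substitute a different implicit; it can only break the access to the nested object, which is consistent with the SmartNullPointerException in the question. The second variant mirrors the approach in the answer: the dependency becomes a parameter that the test controls.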

Regarding scala - Mocking spark.implicits in Spark Scala for unit testing, we found a similar question on Stack Overflow: https://stackoverflow.com/questions/64539412/
