
apache-spark - SBT test does not work for Spark tests


I have a simple Spark function to test DataFrame windowing:

    import org.apache.spark.sql.{DataFrame, SparkSession}

    object ScratchPad {

      def main(args: Array[String]): Unit = {
        val spark = SparkSession.builder().master("local[*]").getOrCreate()
        spark.sparkContext.setLogLevel("ERROR")
        get_data_frame(spark).show()
      }

      def get_data_frame(spark: SparkSession): DataFrame = {
        import spark.sqlContext.implicits._
        val hr = spark.sparkContext.parallelize(List(
          ("Steinbeck", "Sales", 100),
          ("Woolf", "IT", 99),
          ("Wodehouse", "Sales", 250),
          ("Hemingway", "IT", 349)
        )).toDF("emp", "dept", "sal")

        import org.apache.spark.sql.expressions.Window
        import org.apache.spark.sql.functions._

        val windowSpec = Window.partitionBy($"dept").orderBy($"sal".desc)

        hr.withColumn("rank", row_number().over(windowSpec))
      }
    }

I wrote a test for it like this:
    import com.holdenkarau.spark.testing.DataFrameSuiteBase
    import org.apache.spark.sql.Row
    import org.apache.spark.sql.types._
    import org.scalatest.FunSuite

    class TestDF extends FunSuite with DataFrameSuiteBase {

      test("DFs equal") {
        val expected = sc.parallelize(List(
          Row("Wodehouse", "Sales", 250, 1),
          Row("Steinbeck", "Sales", 100, 2),
          Row("Hemingway", "IT", 349, 1),
          Row("Woolf", "IT", 99, 2)
        ))

        val schema = StructType(List(
          StructField("emp", StringType, true),
          StructField("dept", StringType, true),
          StructField("sal", IntegerType, false),
          StructField("rank", IntegerType, true)
        ))

        val e2 = sqlContext.createDataFrame(expected, schema)
        val actual = ScratchPad.get_data_frame(sqlContext.sparkSession)
        assertDataFrameEquals(e2, actual)
      }
    }

When I right-click the class in IntelliJ and click "Run", it works fine.
When I run the same test with `sbt test`, it fails with the following:
    java.security.AccessControlException: access denied
      org.apache.derby.security.SystemPermission( "engine", "usederbyinternals" )
      at java.security.AccessControlContext.checkPermission(AccessControlContext.java:472)
      at java.security.AccessController.checkPermission(AccessController.java:884)
      at org.apache.derby.iapi.security.SecurityUtil.checkDerbyInternalsPrivilege(Unknown Source)
      ...

Here is my SBT build file. There is nothing fancy in it except the Hive dependency, without which the test will not compile:
    name := "WindowingTest"

    version := "0.1"

    scalaVersion := "2.11.5"

    libraryDependencies += "org.apache.spark" %% "spark-core" % "2.2.1"
    libraryDependencies += "org.apache.spark" %% "spark-sql" % "2.2.1"
    libraryDependencies += "org.apache.spark" %% "spark-hive" % "2.2.1"
    libraryDependencies += "com.holdenkarau" %% "spark-testing-base" % "2.2.0_0.8.0" % "test"

Googling pointed me to DERBY-6648 (https://db.apache.org/derby/releases/release-10.12.1.1.cgi), which says:

Application changes required:
Users who run Derby under a SecurityManager must edit their policy files and grant the following additional permission to derby.jar, derbynet.jar, and derbyoptionaltools.jar:

    permission org.apache.derby.security.SystemPermission "engine", "usederbyinternals";

Since I never installed Derby explicitly (it is presumably used internally by Spark), how do I do this?
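For reference, the grant that the Derby release note describes would look roughly like this in a Java policy file. This is only a sketch: the `codeBase` path is hypothetical, and in practice you would have to point it at the Derby jar that spark-hive pulls into your ivy/maven cache, then pass the policy file to the test JVM (e.g. via `-Djava.security.policy=...`):

```
// Hypothetical location of the Derby jar on the test classpath --
// replace with the actual path in your local dependency cache.
grant codeBase "file:/path/to/derby.jar" {
  permission org.apache.derby.security.SystemPermission "engine", "usederbyinternals";
};
```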

Best Answer

Solved the problem with a quick and dirty hack:

    System.setSecurityManager(null)

Run before the Spark session (and with it the Derby-backed Hive metastore) is initialized, this clears the SecurityManager that the sbt process installs; IntelliJ does not install one, which is why the test passes there. Anyway, since this only affects automated tests, maybe it is not such a big problem after all ;)
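An alternative that avoids touching the SecurityManager from test code is to fork the test JVM in sbt, so the tests run in a fresh JVM that never inherits sbt's own SecurityManager. This is a sketch against the build file shown above (`fork in Test` is the sbt 0.13-era syntax matching its style):

```
// In build.sbt: run tests in a separate JVM so they do not inherit
// the SecurityManager installed by the sbt process itself.
fork in Test := true
```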

Regarding apache-spark - SBT test does not work for Spark tests, we found a similar question on Stack Overflow: https://stackoverflow.com/questions/48008343/
