In a Scala/Spark application I have two different DataFrames. My task is to create a single Excel file (xlsx) with one sheet per DataFrame. For this I decided to use the spark-excel library.
// Write the first DataFrame to Sheet1 of the workbook
df1.coalesce(1).write
  .format("com.crealytics.spark.excel")
  .option("dataAddress", "'Sheet1'!A1:Z100000")
  .option("useHeader", "true")
  .option("dateFormat", "yy-mmm-d")
  .option("timestampFormat", "mm-dd-yyyy hh:mm:ss")
  .mode("append")
  .save("/temp/excel.xlsx")

// Append the second DataFrame to the same workbook as Sheet2
df2.coalesce(1).write
  .format("com.crealytics.spark.excel")
  .option("dataAddress", "'Sheet2'!A1:Z100000")
  .option("useHeader", "true")
  .option("dateFormat", "yy-mmm-d")
  .option("timestampFormat", "mm-dd-yyyy hh:mm:ss")
  .mode("append")
  .save("/temp/excel.xlsx")
2019-09-05 00:09:51 ERROR TaskSetManager:70 - Task 2923 in stage 12.0 failed 4 times; aborting job
Exception in thread "main" org.apache.spark.SparkException: Job aborted due to stage failure: Task 2923 in stage 12.0 failed 4 times, most recent failure: Lost task 2923.3 in stage 12.0 (TID 33582, ignt51.company.local, executor 0): java.io.EOFException: Unexpected end of input stream
at org.apache.hadoop.io.compress.DecompressorStream.decompress(DecompressorStream.java:145)
at org.apache.hadoop.io.compress.DecompressorStream.read(DecompressorStream.java:85)
at java.io.InputStream.read(InputStream.java:101)
at org.apache.hadoop.util.LineReader.fillBuffer(LineReader.java:180)
at org.apache.hadoop.util.LineReader.readDefaultLine(LineReader.java:216)
at org.apache.hadoop.util.LineReader.readLine(LineReader.java:174)
at org.apache.hadoop.mapreduce.lib.input.LineRecordReader.skipUtfByteOrderMark(LineRecordReader.java:144)
at org.apache.hadoop.mapreduce.lib.input.LineRecordReader.nextKeyValue(LineRecordReader.java:184)
at org.apache.spark.sql.execution.datasources.RecordReaderIterator.hasNext(RecordReaderIterator.scala:39)
at org.apache.spark.sql.execution.datasources.HadoopFileLinesReader.hasNext(HadoopFileLinesReader.scala:50)
at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:408)
at scala.collection.Iterator$$anon$13.hasNext(Iterator.scala:461)
at scala.collection.Iterator$$anon$12.hasNext(Iterator.scala:439)
at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:408)
at org.apache.spark.sql.execution.datasources.FileScanRDD$$anon$1.hasNext(FileScanRDD.scala:109)
at org.apache.spark.sql.execution.datasources.FileScanRDD$$anon$1.nextIterator(FileScanRDD.scala:190)
at org.apache.spark.sql.execution.datasources.FileScanRDD$$anon$1.hasNext(FileScanRDD.scala:109)
at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIteratorForCodegenStage1.processNext(Unknown Source)
at org.apache.spark.sql.execution.BufferedRowIterator.hasNext(BufferedRowIterator.java:43)
at org.apache.spark.sql.execution.WholeStageCodegenExec$$anonfun$10$$anon$1.hasNext(WholeStageCodegenExec.scala:619)
at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:408)
at org.apache.spark.shuffle.sort.UnsafeShuffleWriter.write(UnsafeShuffleWriter.java:187)
at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:96)
at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:53)
at org.apache.spark.scheduler.Task.run(Task.scala:109)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:345)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Driver stacktrace:
at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$failJobAndIndependentStages(DAGScheduler.scala:1661)
at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1649)
at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1648)
at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:48)
at org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:1648)
at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:831)
at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:831)
at scala.Option.foreach(Option.scala:257)
at org.apache.spark.scheduler.DAGScheduler.handleTaskSetFailed(DAGScheduler.scala:831)
at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.doOnReceive(DAGScheduler.scala:1882)
at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1831)
at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1820)
at org.apache.spark.util.EventLoop$$anon$1.run(EventLoop.scala:48)
at org.apache.spark.scheduler.DAGScheduler.runJob(DAGScheduler.scala:642)
at org.apache.spark.SparkContext.runJob(SparkContext.scala:2034)
at org.apache.spark.SparkContext.runJob(SparkContext.scala:2055)
at org.apache.spark.SparkContext.runJob(SparkContext.scala:2074)
at org.apache.spark.rdd.RDD$$anonfun$toLocalIterator$1.org$apache$spark$rdd$RDD$$anonfun$$collectPartition$1(RDD.scala:960)
at org.apache.spark.rdd.RDD$$anonfun$toLocalIterator$1$$anonfun$apply$30.apply(RDD.scala:962)
at org.apache.spark.rdd.RDD$$anonfun$toLocalIterator$1$$anonfun$apply$30.apply(RDD.scala:962)
at scala.collection.Iterator$$anon$12.nextCur(Iterator.scala:434)
at scala.collection.Iterator$$anon$12.hasNext(Iterator.scala:440)
at scala.collection.Iterator$$anon$12.hasNext(Iterator.scala:439)
at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:408)
at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:408)
at scala.collection.Iterator$JoinIterator.hasNext(Iterator.scala:216)
at scala.collection.Iterator$$anon$19.hasNext(Iterator.scala:800)
at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:408)
at scala.collection.Iterator$class.foreach(Iterator.scala:893)
at scala.collection.AbstractIterator.foreach(Iterator.scala:1336)
at scala.collection.generic.Growable$class.$plus$plus$eq(Growable.scala:59)
at scala.collection.mutable.ListBuffer.$plus$plus$eq(ListBuffer.scala:183)
at scala.collection.mutable.ListBuffer.$plus$plus$eq(ListBuffer.scala:45)
at scala.collection.TraversableOnce$class.to(TraversableOnce.scala:310)
at scala.collection.AbstractIterator.to(Iterator.scala:1336)
at scala.collection.TraversableOnce$class.toList(TraversableOnce.scala:294)
at scala.collection.AbstractIterator.toList(Iterator.scala:1336)
at com.crealytics.spark.excel.AreaDataLocator$class.toSheet(DataLocator.scala:111)
at com.crealytics.spark.excel.CellRangeAddressDataLocator.toSheet(DataLocator.scala:135)
at com.crealytics.spark.excel.ExcelFileSaver.sheet$1(ExcelFileSaver.scala:36)
at com.crealytics.spark.excel.ExcelFileSaver.writeToWorkbook$1(ExcelFileSaver.scala:40)
at com.crealytics.spark.excel.ExcelFileSaver.save(ExcelFileSaver.scala:56)
at com.crealytics.spark.excel.DefaultSource.createRelation(DefaultSource.scala:60)
at org.apache.spark.sql.execution.datasources.SaveIntoDataSourceCommand.run(SaveIntoDataSourceCommand.scala:45)
at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult$lzycompute(commands.scala:70)
at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult(commands.scala:68)
at org.apache.spark.sql.execution.command.ExecutedCommandExec.doExecute(commands.scala:86)
at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:131)
at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:127)
at org.apache.spark.sql.execution.SparkPlan$$anonfun$executeQuery$1.apply(SparkPlan.scala:155)
at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
at org.apache.spark.sql.execution.SparkPlan.executeQuery(SparkPlan.scala:152)
at org.apache.spark.sql.execution.SparkPlan.execute(SparkPlan.scala:127)
at org.apache.spark.sql.execution.QueryExecution.toRdd$lzycompute(QueryExecution.scala:80)
at org.apache.spark.sql.execution.QueryExecution.toRdd(QueryExecution.scala:80)
at org.apache.spark.sql.DataFrameWriter$$anonfun$runCommand$1.apply(DataFrameWriter.scala:656)
at org.apache.spark.sql.DataFrameWriter$$anonfun$runCommand$1.apply(DataFrameWriter.scala:656)
at org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:77)
at org.apache.spark.sql.DataFrameWriter.runCommand(DataFrameWriter.scala:656)
at org.apache.spark.sql.DataFrameWriter.saveToV1Source(DataFrameWriter.scala:273)
at org.apache.spark.sql.DataFrameWriter.save(DataFrameWriter.scala:267)
at org.apache.spark.sql.DataFrameWriter.save(DataFrameWriter.scala:225)
at report.Call.MSISDN(Call.scala:156)
at Controller.createReport(Controller.scala:36)
at MainApp$.main(MainApp.scala:76)
at MainApp.main(MainApp.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.spark.deploy.JavaMainApplication.start(SparkApplication.scala:52)
at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:894)
at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:198)
at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:228)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:137)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
2019-09-05 00:09:51 ERROR TransportRequestHandler:210 - Error while invoking RpcHandler#receive() for one-way message.
org.apache.spark.SparkException: Could not find CoarseGrainedScheduler.
at org.apache.spark.rpc.netty.Dispatcher.postMessage(Dispatcher.scala:160)
at org.apache.spark.rpc.netty.Dispatcher.postOneWayMessage(Dispatcher.scala:140)
at org.apache.spark.rpc.netty.NettyRpcHandler.receive(NettyRpcEnv.scala:655)
at org.apache.spark.network.server.TransportRequestHandler.processOneWayMessage(TransportRequestHandler.java:208)
at org.apache.spark.network.server.TransportRequestHandler.handle(TransportRequestHandler.java:113)
at org.apache.spark.network.server.TransportChannelHandler.channelRead(TransportChannelHandler.java:118)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348)
at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:340)
at io.netty.handler.timeout.IdleStateHandler.channelRead(IdleStateHandler.java:286)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348)
at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:340)
at io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:102)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348)
at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:340)
at org.apache.spark.network.util.TransportFrameDecoder.channelRead(TransportFrameDecoder.java:85)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348)
at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:340)
at io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1359)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348)
at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:935)
at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:138)
at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:645)
at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:580)
at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:497)
at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:459)
at io.netty.util.concurrent.SingleThreadEventExecutor$5.run(SingleThreadEventExecutor.java:858)
at io.netty.util.concurrent.DefaultThreadFactory$DefaultRunnableDecorator.run(DefaultThreadFactory.java:138)
at java.lang.Thread.run(Thread.java:745)
Best answer
Some of the input files are corrupted; you can skip them by setting the following option:
spark.sqlContext.setConf("spark.sql.files.ignoreCorruptFiles", "true")
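For context, here is a minimal sketch of where this setting fits, assuming `spark` is the active SparkSession and using hypothetical input paths. The option has to be in effect before the DataFrames are read, because the EOFException in the stack trace is thrown while Spark scans the compressed source files (in DecompressorStream/LineRecordReader), not while the workbook is written:

// Minimal sketch (not the original job): enable skipping of corrupted
// input splits before any file-based read runs. spark.conf.set is the
// SparkSession-level equivalent of the sqlContext call above.
spark.conf.set("spark.sql.files.ignoreCorruptFiles", "true")

// Hypothetical source paths, for illustration only.
val df1 = spark.read.option("header", "true").csv("/data/input1")
val df2 = spark.read.option("header", "true").csv("/data/input2")

Keep in mind that with ignoreCorruptFiles enabled, Spark silently skips the unreadable parts of the input, so the resulting sheets may be missing rows; compare row counts against the source if completeness matters.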
Regarding "excel - Spark: Unexpected end of input stream", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/57797342/