
java - Creating a Hadoop sequence file

Reposted. Author: 可可西里. Updated: 2023-11-01 15:49:19

I am trying to create a Hadoop sequence file.

I can create the sequence file in HDFS, but when I try to read it back I get a "not a SequenceFile" error. I also checked that the created sequence file exists in HDFS.


Here is my source code, which writes a sequence file to HDFS and then reads it back.

package us.qi.hdfs;

import java.io.IOException;
import java.net.URI;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.ArrayFile;
import org.apache.hadoop.io.IOUtils;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.SequenceFile;
import org.apache.hadoop.io.Text;

public class SequenceFileText {
    public static void main(String args[]) throws IOException {

        /** Get Hadoop HDFS command and Hadoop Configuration */
        HDFS_Configuration conf = new HDFS_Configuration();
        HDFS_Test hdfs = new HDFS_Test();

        String uri = "hdfs://slave02:9000/user/hadoop/test.seq";

        /** Get Configuration from HDFS_Configuration object by using get_conf() */
        Configuration config = conf.get_conf();

        SequenceFile.Writer writer = null;
        SequenceFile.Reader reader = null;

        try {
            Path path = new Path(uri);

            IntWritable key = new IntWritable();
            Text value = new Text();

            writer = SequenceFile.createWriter(config, SequenceFile.Writer.file(path), SequenceFile.Writer.keyClass(key.getClass()),
                    ArrayFile.Writer.valueClass(value.getClass()));
            reader = new SequenceFile.Reader(config, SequenceFile.Reader.file(path));

            writer.append(new IntWritable(11), new Text("test"));
            writer.append(new IntWritable(12), new Text("test2"));
            writer.close();

            while (reader.next(key, value)) {
                System.out.println(key + "\t" + value);
            }
            reader.close();
        } catch (IOException e) {
            e.printStackTrace();
        } finally {
            IOUtils.closeStream(writer);
            IOUtils.closeStream(reader);
        }
    }
}

Running it produces this error:

2018-09-17 17:15:34,267 WARN [main] util.NativeCodeLoader (NativeCodeLoader.java:<clinit>(62)) - Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
2018-09-17 17:15:38,870 INFO [main] compress.CodecPool (CodecPool.java:getCompressor(153)) - Got brand-new compressor [.deflate]
java.io.EOFException: hdfs://slave02:9000/user/hadoop/test.seq not a SequenceFile
    at org.apache.hadoop.io.SequenceFile$Reader.init(SequenceFile.java:1933)
    at org.apache.hadoop.io.SequenceFile$Reader.initialize(SequenceFile.java:1892)
    at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1841)
    at us.qi.hdfs.SequenceFileText.main(SequenceFileText.java:36)
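
Editorial note on the message above: a SequenceFile starts with the three magic bytes `SEQ` followed by a version byte, and an `EOFException` with "not a SequenceFile" suggests the reader reached the end of the file before it could read that header, as would happen with a file that is still empty. The following is a minimal sketch, using only the JDK, that checks whether a local copy of a file (for example one fetched with `hdfs dfs -get`) begins with that header. The class name `SeqHeaderCheck` and the method `looksLikeSequenceFile` are hypothetical and are not part of Hadoop.

```java
import java.io.DataInputStream;
import java.io.EOFException;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

public class SeqHeaderCheck {
    /** Magic bytes at the start of a SequenceFile: 'S', 'E', 'Q'. */
    private static final byte[] MAGIC = {'S', 'E', 'Q'};

    /**
     * Hypothetical helper: returns true if the local file starts with the
     * SEQ magic bytes plus one version byte, false otherwise.
     */
    public static boolean looksLikeSequenceFile(Path localFile) throws IOException {
        try (DataInputStream in = new DataInputStream(Files.newInputStream(localFile))) {
            byte[] header = new byte[MAGIC.length + 1]; // magic + version byte
            in.readFully(header);
            for (int i = 0; i < MAGIC.length; i++) {
                if (header[i] != MAGIC[i]) {
                    return false;
                }
            }
            return true;
        } catch (EOFException e) {
            // File is shorter than the header, e.g. an empty file.
            return false;
        }
    }

    public static void main(String[] args) throws IOException {
        Path empty = Files.createTempFile("empty", ".seq");
        Path withHeader = Files.createTempFile("header", ".seq");
        Files.write(withHeader, new byte[] {'S', 'E', 'Q', 6});

        System.out.println(looksLikeSequenceFile(empty));      // false
        System.out.println(looksLikeSequenceFile(withHeader)); // true
    }
}
```

An empty file fails the check, which mirrors what the reader in the question most likely saw when it was opened before the writer had been closed.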

Best answer

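It was my own mistake. I changed the source code as follows.

First, I check whether the file already exists in HDFS. If it does not, I create the writer object and write the records.

Once the writer has finished and been closed, I open the sequence file for reading. Because the reader is now created only after the write has completed, the file is read successfully.

Here is my code. Thanks!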

// Assumes `fs` (a FileSystem for this URI, e.g. FileSystem.get(URI.create(uri), config))
// and `logger` are defined earlier in the class; they are not shown in this fragment.
try {
    Path path = new Path(uri);

    IntWritable key = new IntWritable();
    Text value = new Text();

    /** First, check whether the file already exists.
     * If it does not exist in HDFS, create the writer and write the records.
     */
    if (!fs.exists(path)) {
        writer = SequenceFile.createWriter(config, SequenceFile.Writer.file(path), SequenceFile.Writer.keyClass(key.getClass()),
                ArrayFile.Writer.valueClass(value.getClass()));

        writer.append(new IntWritable(11), new Text("test"));
        writer.append(new IntWritable(12), new Text("test2"));
        // Close the writer before any reader is opened on the file.
        writer.close();
    } else {
        logger.info(path + " already exists.");
    }

    /** Create the SequenceFile reader only after the write has finished. */
    reader = new SequenceFile.Reader(config, SequenceFile.Reader.file(path));

    while (reader.next(key, value)) {
        System.out.println(key + "\t" + value);
    }

    reader.close();
} catch (IOException e) {
    e.printStackTrace();
} finally {
    IOUtils.closeStream(writer);
    IOUtils.closeStream(reader);
}
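
The same ordering (close the writer first, open the reader second) can also be sketched with try-with-resources, which makes it hard to open the reader too early. This is only an illustration under a few assumptions: the class name `SequenceFileRoundTrip` and the method `writeThenRead` are hypothetical, the `fs.exists` check from the answer is left out for brevity, `SequenceFile.Writer.valueClass` is used in place of the inherited `ArrayFile.Writer.valueClass`, and Hadoop's client libraries are assumed to be on the classpath.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.SequenceFile;
import org.apache.hadoop.io.Text;

public class SequenceFileRoundTrip {

    /** Writes two records to path, closes the writer, then reads the records back. */
    public static List<String> writeThenRead(Configuration conf, Path path) throws IOException {
        // The writer is closed when this block exits, before any reader is created.
        try (SequenceFile.Writer writer = SequenceFile.createWriter(conf,
                SequenceFile.Writer.file(path),
                SequenceFile.Writer.keyClass(IntWritable.class),
                SequenceFile.Writer.valueClass(Text.class))) {
            writer.append(new IntWritable(11), new Text("test"));
            writer.append(new IntWritable(12), new Text("test2"));
        }

        List<String> records = new ArrayList<>();
        IntWritable key = new IntWritable();
        Text value = new Text();
        // Only now open the reader; the file header and records are already on disk.
        try (SequenceFile.Reader reader =
                new SequenceFile.Reader(conf, SequenceFile.Reader.file(path))) {
            while (reader.next(key, value)) {
                records.add(key + "\t" + value);
            }
        }
        return records;
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical default path; pass e.g. the HDFS URI from the question as the first argument.
        Path path = new Path(args.length > 0 ? args[0] : "test.seq");
        for (String record : writeThenRead(new Configuration(), path)) {
            System.out.println(record);
        }
    }
}
```

Because each resource lives in its own block, the explicit `close()` calls and the `finally` clause from the answer are no longer needed in this variant.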

Regarding "java - Creating a Hadoop sequence file", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/52377293/
