
hadoop - How do I load multiple text files from a folder in Pig using the LOAD command?

Reposted. Author: 可可西里. Updated: 2023-11-01 16:22:56

I have been using this statement to load a single text file:

A = LOAD '1try.txt' USING PigStorage(' ') as (c1:chararray,c2:chararray,c3:chararray,c4:chararray);

Best answer

You can pass a folder name instead of a file name, like this:

A = LOAD 'myfolder' USING PigStorage(' ') 
AS (c1:chararray,c2:chararray,c3:chararray,c4:chararray);

Pig will load all files in the specified folder, as described in Programming Pig:

When specifying a “file” to read from HDFS, you can specify directories. In this case, Pig will find all files under the directory you specify and use them as input for that load statement. So, if you had a directory input with two datafiles today and yesterday under it, and you specified input as your file to load, Pig will read both today and yesterday as input. If the directory you specify has other directories, files in those directories will be included as well.
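
If only some of the files in the folder should be read, a Hadoop-style glob in the path is one option. The following is a minimal sketch, not part of the original answer: it assumes the files of interest end in .txt and that the Pig version in use supports the '-tagFile' option of PigStorage, which prepends the source file name to each record. The alias B and the field name filename are chosen for illustration only.

```pig
-- Sketch only: 'myfolder' and the .txt suffix are assumed names.
-- Load only the .txt files in the folder using a glob pattern.
A = LOAD 'myfolder/*.txt' USING PigStorage(' ')
    AS (c1:chararray, c2:chararray, c3:chararray, c4:chararray);

-- Load the whole folder and keep track of which file each row came from.
-- '-tagFile' adds the file name as the first field (assumes a Pig version
-- that supports this option).
B = LOAD 'myfolder' USING PigStorage(' ', '-tagFile')
    AS (filename:chararray, c1:chararray, c2:chararray, c3:chararray, c4:chararray);

DUMP B;
```

Tagging rows with the file name can help when the files in the folder need to be told apart later, for example the today and yesterday files mentioned in the quote above.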

For "hadoop - How do I load multiple text files from a folder in Pig using the LOAD command?", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/23622288/
