hadoop - Loading a file with a datetime column using Pig

Reposted. Author: 可可西里. Updated: 2023-11-01 16:47:31

I have a dataset in the following form.

ravi,savings,avinash,2,char,33,F,22,44,12,13,33,44,22,11,10,22,2006-01-23
avinash,current,sandeep,3,char,44,M,33,11,10,12,33,22,39,12,23,19,2001-02-12
supreeth,savings,prabhash,4,char,55,F,22,12,23,12,44,56,7,88,34,23,1995-03-11
lavi,current,nirmesh,5,char,33,M,11,10,33,34,56,78,54,23,445,66,1999-06-15
Venkat,savings,bunny,6,char,11,F,99,12,34,55,33,23,45,66,23,23,2016-05-18

The last column (for example, 2006-01-23) is a date. I am trying to load this data with Pig. Here is the code I use to load the file.

file = LOAD 'FI_USER_CREDS_TBL_T.txt' 
USING PigStorage(',') AS (USER_ID:chararray,
ROLE_ID:chararray,
USER_PW:chararray,
NUM_PWD_HISTORY:int,
PWD_HISTORY:chararray,
PWD_LAST_MOD_TIME:int,
NUM_PWD_ATTEMPTS:int,
NEW_USER_FLG:chararray,
LOGIN_TIME_LOW:int,
LOGIN_TIME_HIGH:int,
DISABLED_FROM_DATE:int,
DISABLED_UPTO_DATE:int,
PW_EXPY_DATE:int,
ACCT_EXPY_DATE:int,
ACCT_INACTIVE_DAYS:int,
LAST_ACCESS_TIME:int,
TS_CNT:int,
DTL__CAPXTIMESTAMP:int,
ETL_INSERT_DATE:datetime);

But it does not read the date column. After dumping the file relation, I get the following output instead.

(ravi,savings,avinash,2,char,33,,22,44,12,13,33,44,22,11,10,22,,)
(avinash,current,sandeep,3,char,44,,33,11,10,12,33,22,39,12,23,19,,)
(supreeth,savings,prabhash,4,char,55,,22,12,23,12,44,56,7,88,34,23,,)
(lavi,current,nirmesh,5,char,33,,11,10,33,34,56,78,54,23,445,66,,)
(Venkat,savings,bunny,6,char,11,,99,12,34,55,33,23,45,66,23,23,,)

How can I read the date column?

Please help me with this.

Thanks.

Best answer

Load the date as a chararray, then convert it to a datetime.

For example:

file2 = FOREACH file GENERATE ToDate(date, 'yyyy-MM-dd') AS date,....

See these links for reference: http://pig.apache.org/docs/r0.11.0/api/org/apache/pig/builtin/ToDate.html or http://docs.oracle.com/javase/7/docs/api/java/text/SimpleDateFormat.html
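
As a minimal sketch of the approach in this answer, the script below reads the last column as text and converts it with ToDate. It rests on assumptions that are not in the original post: fields are addressed by position rather than by the question's field names, because each sample row shown has 18 comma-separated values while the schema in the question declares 19 fields; the pattern 'yyyy-MM-dd' is chosen to match sample values such as 2006-01-23; and the aliases raw and with_date are only illustrative.

```pig
-- Load without a schema; every field arrives as an untyped bytearray.
raw = LOAD 'FI_USER_CREDS_TBL_T.txt' USING PigStorage(',');

-- Assumption: the date is the 18th value in each row, i.e. $17 (zero-based).
-- Cast it to chararray first, then parse it into a datetime.
with_date = FOREACH raw GENERATE
    (chararray)$0 AS USER_ID,
    ToDate((chararray)$17, 'yyyy-MM-dd') AS ETL_INSERT_DATE;

DUMP with_date;
```

If the conversion works, the second position of each dumped tuple should hold a datetime rather than an empty field. The remaining columns can be added to the GENERATE list in the same way, each with the cast that fits its content.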

Regarding hadoop - Loading a file with a datetime column using Pig, a similar question was found on Stack Overflow: https://stackoverflow.com/questions/35767273/
