
hadoop - Apache Pig does not parse strings as int/long

Reposted. Author: 可可西里. Updated: 2023-11-01 15:18:25

I'm new to Pig and am trying to run some basic analysis on a file containing events like the following:

1345477765  2012-08-20  08:49:24    servername  12.34.56.78 192.168.1.4 joebloggs   ManageSystem    Here's your message

I tried loading the file as follows:

logs = LOAD '/path/to/file' using PigStorage AS (loggedtime:long, serverdate:chararray, servertime:chararray, servername:chararray, externalip:chararray, internalip:chararray, username:chararray, systemtype:chararray,  message:chararray);

When I ILLUSTRATE logs, everything looks fine:

     Illustrate logs
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
| logs | loggedtime:long | serverdate:chararray | servertime:chararray | servername:chararray | externalip:chararray | internalip:chararray | username:chararray | systemtype:chararray | message:chararray |
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
| | 1345477765 | 2012-08-20 | 08:49:24 | servername | 12.34.56.78 | 192.168.1.4 | joebloggs | ManageSystem | Here's your message |
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

Likewise, when I DESCRIBE logs, the schema is exactly what I expect:

logs: {loggedtime: long,serverdate: chararray,servertime: chararray,servername: chararray,externalip: chararray,internalip: chararray,username: chararray,systemtype: chararray,message: chararray}

However, when I DUMP logs, the loggedtime field is empty:

dump logs;
(,2012-08-20,08:49:24,servername,12.34.56.78,192.168.1.4,joebloggs,ManageSystem,Here's your message)

Presumably as a result, my filter returns no events:

specificlog = FILTER logs BY loggedtime == 1345477765;

Hopefully I'm just missing something simple here.

Best answer

I eventually figured this out myself. To get the value parsed as a long, I had to put an "L" at the end of the number.

For example, after changing my source data to the following, I was able to get it working:

1345477765L  2012-08-20  08:49:24    servername  12.34.56.78 192.168.1.4 joebloggs   ManageSystem    Here's your message

Hopefully this helps anyone who runs into the same problem.
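If editing the source data is not practical, one possible way to see what Pig actually reads in the first column is to load it as a chararray, inspect it, and then cast it explicitly. The following is only a sketch under stated assumptions: the path and field names are taken from the question, the file is assumed to be tab-delimited, and the relation names raw, inspected, typed, and specificlog_typed are made up for illustration.

```pig
-- Sketch only. Assumes a tab-delimited file (PigStorage's default delimiter).
-- raw, inspected, typed and specificlog_typed are illustrative names.
raw = LOAD '/path/to/file' USING PigStorage('\t') AS (
    loggedtime:chararray, serverdate:chararray, servertime:chararray,
    servername:chararray, externalip:chararray, internalip:chararray,
    username:chararray, systemtype:chararray, message:chararray);

-- Show the raw first field and its length in characters; a length other
-- than 10 for a value like 1345477765 would suggest stray characters.
inspected = FOREACH raw GENERATE loggedtime, SIZE(loggedtime);
DUMP inspected;

-- Cast explicitly; a value that cannot be parsed should come out as null.
typed = FOREACH raw GENERATE (long)loggedtime AS loggedtime, serverdate,
    servertime, servername, externalip, internalip, username, systemtype,
    message;

-- Long constants in Pig Latin are written with an L suffix.
specificlog_typed = FILTER typed BY loggedtime == 1345477765L;
DUMP specificlog_typed;
```

If the first DUMP shows the timestamp with an unexpected length, the field probably contains characters that are not visible in the sample line, which would explain why the typed load produced an empty value.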

Regarding "hadoop - Apache Pig does not parse strings as int/long", a similar question was found on Stack Overflow: https://stackoverflow.com/questions/12041923/
