
hadoop - How to use Hive to count the number of words in each column separated by the "|" delimiter?

Reposted. Author: 可可西里. Updated: 2023-11-01 14:48:21

The input data is

+----------------------+--------------------------------+
| movie_name           | Genres                         |
+----------------------+--------------------------------+
| digimon              | Adventure|Animation|Children's |
| Slumber_Party_Massac | Horror                         |
+----------------------+--------------------------------+

I need output like this

+----------------------+--------------------------------+-----------------+
| movie_name           | Genres                         | count_of_genres |
+----------------------+--------------------------------+-----------------+
| digimon              | Adventure|Animation|Children's | 3               |
| Slumber_Party_Massac | Horror                         | 1               |
+----------------------+--------------------------------+-----------------+

Best answer

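-- Split on the tokens themselves (runs of characters that are neither '|' nor whitespace);
-- n tokens leave n+1 pieces around them, so the genre count is size - 1.
select  *
       ,size(split(coalesce(Genres,''),'[^|\\s]+'))-1 as count_of_genres
from    mytable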

This solution covers a variety of use cases, including:

  • NULL values
  • Empty strings
  • Empty tokens (e.g. Adventure||Animation or Adventure| |Animation)
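
To see why the "size minus one" arithmetic handles those cases, here is a minimal Java sketch of the same logic. It assumes that Hive's split() behaves like Java's String.split(regex, -1), i.e. that trailing empty strings are kept. The class name GenreCount and the method countGenres are hypothetical names used only for illustration.

```java
import java.util.Arrays;
import java.util.List;

class GenreCount {
    // Mirrors size(split(coalesce(Genres,''), '[^|\\s]+')) - 1 from the answer.
    // Assumption: Hive's split() keeps trailing empty strings, like split(regex, -1) here.
    static int countGenres(String genres) {
        // Equivalent of coalesce(Genres, ''): treat NULL as an empty string.
        String value = (genres == null) ? "" : genres;
        // Split on the tokens themselves: runs of characters that are
        // neither '|' nor whitespace. n tokens leave n + 1 pieces.
        String[] pieces = value.split("[^|\\s]+", -1);
        return pieces.length - 1;
    }

    public static void main(String[] args) {
        List<String> samples = Arrays.asList(
            "Adventure|Animation|Children's", "Horror", "", null,
            "Adventure||Animation", "Adventure| |Animation");
        for (String s : samples) {
            System.out.println(s + " -> " + countGenres(s));
        }
    }
}
```

Because whitespace is excluded from the token pattern, a genre name that contains a space would be counted as two tokens by this regex.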

Regarding "hadoop - How to use Hive to count the number of words in each column separated by the "|" delimiter?", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/43573144/
