gpt4 book ai didi

java - 如何使用 2 列合并的字段执行查询?

转载 作者:行者123 更新时间:2023-11-30 06:34:03 26 4
gpt4 key购买 nike

我正在使用 Java Spark 库构建一系列分布分析。这是我用来从 JSON 文件获取数据并保存输出的实际代码。

Dataset<Row> dataset = spark.read().json("local/foods.json");
dataset.createOrReplaceTempView("cs_food");

List<GenericAnalyticsEntry> menu_distribution= spark
.sql(" ****REQUESTED QUERY ****")
.toJavaRDD()
.map(row -> Triple.of( row.getString(0), BigDecimal.valueOf(row.getLong(1)), BigDecimal.valueOf(row.getLong(2))))
.map(GenericAnalyticsEntry::of)
.collect();

writeObjectAsJsonToHDFS(fs, "/local/output/menu_distribution_new.json", menu_distribution);

我正在寻找的查询基于以下结构:

+------------+-------------+------------+------------+
| FIRST_FOOD | SECOND_FOOD | DATE | IS_SPECIAL |
+------------+-------------+------------+------------+
| Pizza | Spaghetti | 11/02/2017 | TRUE |
+------------+-------------+------------+------------+
| Lasagna | Pizza | 12/02/2017 | TRUE |
+------------+-------------+------------+------------+
| Spaghetti | Spaghetti | 13/02/2017 | FALSE |
+------------+-------------+------------+------------+
| Pizza | Spaghetti | 14/02/2017 | TRUE |
+------------+-------------+------------+------------+
| Spaghetti | Lasagna | 15/02/2017 | FALSE |
+------------+-------------+------------+------------+
| Pork | Mozzarella | 16/02/2017 | FALSE |
+------------+-------------+------------+------------+
| Lasagna | Mozzarella | 17/02/2017 | FALSE |
+------------+-------------+------------+------------+

如何从上面编写的代码中实现此(如下所示)输出?

+------------+--------------------+----------------------+
| FOODS | occurrences(First) | occurrences (Second) |
+------------+--------------------+----------------------+
| Pizza | 2 | 1 |
+------------+--------------------+----------------------+
| Lasagna | 2 | 1 |
+------------+--------------------+----------------------+
| Spaghetti | 2 | 3 |
+------------+--------------------+----------------------+
| Mozzarella | 0 | 2 |
+------------+--------------------+----------------------+
| Pork | 1 | 0 |
+------------+--------------------+----------------------+

我当然尝试过自己找出解决方案,但我的尝试没有成功,我可能是错的,但我需要这样的东西:

"SELECT (first_food + second_food) as menu, COUNT(first_food), COUNT(second_food) from cs_food GROUP BY menu"

最佳答案

从示例数据来看,这看起来会产生您想要的输出:

select
foods,
first_count,
second_count
from
(select first_food as food from menus
union select second_food from menus) as f
left join (
select first_food, count(*) as first_count from menus
group by first_food
) as ff on ff.first_food=f.food
left join (
select second_food, count(*) as second_count from menus
group by second_food
) as sf on sf.second_food=f.food
;

关于java - 如何使用 2 列合并的字段执行查询?,我们在Stack Overflow上找到一个类似的问题: https://stackoverflow.com/questions/45508331/

26 4 0
Copyright 2021 - 2024 cfsdn All Rights Reserved 蜀ICP备2022000587号
广告合作:1813099741@qq.com 6ren.com