Java 8 Streams multiple grouping by - 6ren

Reposted. Author: 塔克拉玛干. Updated: 2023-11-03 03:52:32

I have temperature records like this:

dt        |AverageTemperature |AverageTemperatureUncertainty|City   |Country |Latitude|Longitude
----------+-------------------+-----------------------------+-------+--------+--------+---------
1963-01-01|-5.417000000000002 |0.5                          |Karachi|Pakistan|57.05N  |10.33E  
1963-02-01|-4.7650000000000015|0.328                        |Karachi|Pakistan|57.05N  |10.33E  
1964-01-01|-5.417000000000002 |0.5                          |Karachi|Pakistan|57.05N  |10.33E  
1964-02-01|-4.7650000000000015|0.328                        |Karachi|Pakistan|57.05N  |10.33E  
1965-01-01|11.417000000000002 |0.5                          |Karachi|Pakistan|57.05N  |10.33E 
1965-02-01|12.7650000000000015|0.328                        |Karachi|Pakistan|57.05N  |10.33E

I have to parse them into a POJO and calculate the average delta according to the following problem statement:

Use the Streams API to calculate the average annual temperature delta for each country. To calculate delta the average temperature in 1900 would be subtracted from the average temperature in 1901 to obtain the delta from 1900 to 1901 for a particular city. The average of all these deltas is the average annual temperature delta for a city. The average of all cities in a country is the average of a country.
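To make the arithmetic in the problem statement concrete, here is a minimal sketch for a single city. It assumes the yearly averages are already computed, sorted by year, and cover consecutive years; the class and method names are hypothetical, and the sample readings are rounded to three decimals. Under those assumptions the consecutive differences telescope, so the average delta is also (last - first) / (n - 1).

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.IntStream;

class DeltaWorkedExample {

    // Mean of the year-over-year differences; NaN when fewer than two years are given.
    static double averageDelta(List<Double> yearlyAverages) {
        return IntStream.range(0, yearlyAverages.size() - 1)
                .mapToDouble(i -> yearlyAverages.get(i + 1) - yearlyAverages.get(i))
                .average()
                .orElse(Double.NaN);
    }

    // Same quantity via the telescoping sum: (last - first) / (n - 1).
    static double averageDeltaTelescoped(List<Double> yearlyAverages) {
        int n = yearlyAverages.size();
        if (n < 2) {
            return Double.NaN;
        }
        return (yearlyAverages.get(n - 1) - yearlyAverages.get(0)) / (n - 1);
    }

    public static void main(String[] args) {
        // Yearly averages from the sample rows (two monthly readings per year, rounded).
        List<Double> karachi = Arrays.asList(
                (-5.417 + -4.765) / 2,   // 1963
                (-5.417 + -4.765) / 2,   // 1964
                (11.417 + 12.765) / 2);  // 1965
        System.out.println(averageDelta(karachi));
        System.out.println(averageDeltaTelescoped(karachi));
    }
}
```

For the sample rows the deltas are 0 (1963 to 1964) and roughly 17.182 (1964 to 1965), so both methods print a value of about 8.591, up to floating-point rounding.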

My Temperature POJO looks like the following, with getters and setters:

public class Temperature {
    private java.util.Date date;
    private double averageTemperature;
    private double averageTemperatureUncertainty;
    private String city;
    private String country;
    private String latitude;
    private String longitude;
}

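I maintain a list of temperatures, because this problem is to be solved with streams.

To calculate the deltas I tried the stream below, but I still cannot compute the actual delta. Because I have to compute the average delta per country, I have grouped by country, city and date (year).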

Map<String, Map<String, Map<Integer, Double>>> countriesMap = this.getTemperatures().stream()
        .sorted(Comparator.comparing(Temperature::getDate))
        .collect(Collectors.groupingBy(Temperature::getCountry,
                Collectors.groupingBy(Temperature::getCity,
                        Collectors.groupingBy(
                                t -> {
                                    Calendar calendar = Calendar.getInstance();
                                    calendar.setTime(t.getDate());
                                    return calendar.get(Calendar.YEAR);
                                },
                                Collectors.averagingDouble(Temperature::getAverageTemperature)))));
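One possible variation on the grouping above, sketched under assumptions: the two-argument `Collectors.groupingBy` makes no promise about the key order of the map it returns, so sorting the stream beforehand does not by itself give a year-ordered inner map. Passing `TreeMap::new` as the map factory keeps the years in ascending order. The trimmed-down `Temperature` class and the `yearOf`, `group` and `date` helpers below are hypothetical stand-ins for illustration, and the year is read in the system default time zone.

```java
import java.time.LocalDate;
import java.time.ZoneId;
import java.util.Arrays;
import java.util.Date;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

class GroupByYearSketch {

    // Trimmed-down stand-in for the Temperature POJO (only the fields used here).
    static class Temperature {
        private final Date date;
        private final double averageTemperature;
        private final String city;
        private final String country;

        Temperature(Date date, double averageTemperature, String city, String country) {
            this.date = date;
            this.averageTemperature = averageTemperature;
            this.city = city;
            this.country = country;
        }

        Date getDate() { return date; }
        double getAverageTemperature() { return averageTemperature; }
        String getCity() { return city; }
        String getCountry() { return country; }
    }

    // Hypothetical helper: calendar year of a reading in the system default time zone.
    static int yearOf(Temperature t) {
        return t.getDate().toInstant().atZone(ZoneId.systemDefault()).getYear();
    }

    // country -> city -> (year -> average temperature), with years in ascending order.
    static Map<String, Map<String, Map<Integer, Double>>> group(List<Temperature> temperatures) {
        return temperatures.stream()
                .collect(Collectors.groupingBy(Temperature::getCountry,
                        Collectors.groupingBy(Temperature::getCity,
                                Collectors.groupingBy(GroupByYearSketch::yearOf,
                                        TreeMap::new, // sorted by year, no stream sort needed
                                        Collectors.averagingDouble(Temperature::getAverageTemperature)))));
    }

    // Hypothetical helper that builds a java.util.Date for the sample data.
    static Date date(int year, int month, int day) {
        return Date.from(LocalDate.of(year, month, day).atStartOfDay(ZoneId.systemDefault()).toInstant());
    }

    public static void main(String[] args) {
        List<Temperature> sample = Arrays.asList(
                new Temperature(date(1965, 1, 1), 11.417, "Karachi", "Pakistan"),
                new Temperature(date(1963, 1, 1), -5.417, "Karachi", "Pakistan"),
                new Temperature(date(1963, 2, 1), -4.765, "Karachi", "Pakistan"));
        System.out.println(group(sample));
    }
}
```

With a sorted inner map, the later delta step can read the values in year order directly instead of re-sorting each map.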

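To calculate the deltas, we have to compute the differences between consecutive values of each Map<Integer, Double>.

To compute the differences I came up with the following code, but I could not connect it with the code above: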

Stream.of(10d, 20d, 10d) // sample data like the values I get in a `Map<Integer, Double>` of countriesMap
        .map(new Function<Double, Optional<Double>>() {
            Optional<Double> previousValue = Optional.empty();
            @Override
            public Optional<Double> apply(Double current) {
                Optional<Double> value = previousValue.map(previous -> current - previous);
                previousValue = Optional.of(current);
                return value;
            }
        })
        .filter(Optional::isPresent)
        .map(Optional::get)
        .forEach(System.out::println);

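How can I compute the deltas with streams in a single pass, or how can I perform stream operations on countriesMap to compute the deltas and fulfil the problem statement above?

Best answer

To break the problem statement into smaller blocks, another approach you could look at is to work through the temperatures year by year, compute the deltas between them, and then average those deltas. This has to be done for every value of type Map<Integer, Double> in the innermost Map in your question. It would look like: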

// innermost map you've attained ('yearToAverageTemperature' map); an empty placeholder here
Map<Integer, Double> unitOfWork = new HashMap<>();
// copy into a LinkedHashMap so that iteration order is ascending by year
unitOfWork = unitOfWork.entrySet()
        .stream()
        .sorted(Map.Entry.comparingByKey())
        .collect(Collectors.toMap(Map.Entry::getKey, Map.Entry::getValue, (e1, e2) -> e1, LinkedHashMap::new));
// the values, ordered by year, taken from the sorted map
List<Double> srtedVal = new ArrayList<>(unitOfWork.values());
// average of the deltas between consecutive years (NaN if there are fewer than two years)
double avg = IntStream.range(0, srtedVal.size() - 1)
        .mapToDouble(i -> (srtedVal.get(i + 1) - srtedVal.get(i)))
        .average().orElse(Double.NaN);

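Note further that this is only the average for one City's <Year, AverageTemperature> records; you will have to iterate over the whole City key set, and similarly over the whole Country key set, to work out all such averages.

Moving this unit of work into a method and iterating over the complete map of maps, this could be accomplished as: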

// The average of all cities in a country is the average of a country.
AtomicReference<Double> countryValAvg = new AtomicReference<>(0.0);
countriesMap.forEach((country, cityMap) -> {
    // The average of all these deltas is the average annual temperature delta for a city.
    AtomicReference<Double> cityAvgTemp = new AtomicReference<>((double) 0);
    cityMap.forEach((city, yearMap) -> cityAvgTemp.set(cityAvgTemp.get() + averagePerCity(yearMap)));
    double avgAnnualTempDeltaPerCity = cityAvgTemp.get() / cityMap.size();

    countryValAvg.set(countryValAvg.get() + avgAnnualTempDeltaPerCity);
});
System.out.println(countryValAvg.get() / countriesMap.size());

where averagePerCity is a method that does the following:

double averagePerCity(Map<Integer, Double> unitOfWork) {
    unitOfWork = unitOfWork.entrySet()
            .stream()
            .sorted(Map.Entry.comparingByKey())
            .collect(Collectors.toMap(Map.Entry::getKey, Map.Entry::getValue, (e1, e2) -> e1, LinkedHashMap::new));
    List<Double> srtedVal = new ArrayList<>(unitOfWork.values());
    return IntStream.range(0, srtedVal.size() - 1)
            .mapToDouble(i -> (srtedVal.get(i + 1) - srtedVal.get(i)))
            .average().orElse(Double.NaN);
}

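Note: the code above may lack validation; it is only meant to give an idea of how to break the complete problem into smaller parts and then solve them.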
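One validation gap worth illustrating: averagePerCity returns Double.NaN for a city with fewer than two years of data, and a single NaN makes any average that includes it NaN as well. The following is a minimal sketch of one way to guard against that, assuming such cities should simply be skipped (the problem statement does not say how to treat them). The class name `ValidationSketch` and the method `averagePerCountry` are hypothetical.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.IntStream;

class ValidationSketch {

    // Same idea as averagePerCity above: mean year-over-year delta, NaN if fewer than two years.
    static double averagePerCity(Map<Integer, Double> yearToAverage) {
        // TreeMap orders the entries by year, so values() comes back in year order.
        List<Double> sorted = new ArrayList<>(new TreeMap<>(yearToAverage).values());
        return IntStream.range(0, sorted.size() - 1)
                .mapToDouble(i -> sorted.get(i + 1) - sorted.get(i))
                .average()
                .orElse(Double.NaN);
    }

    // Averages the city deltas of one country, skipping cities whose delta is NaN.
    static double averagePerCountry(Map<String, Map<Integer, Double>> cityMap) {
        return cityMap.values().stream()
                .mapToDouble(ValidationSketch::averagePerCity)
                .filter(delta -> !Double.isNaN(delta))
                .average()
                .orElse(Double.NaN);
    }

    public static void main(String[] args) {
        Map<Integer, Double> twoYears = new HashMap<>();
        twoYears.put(1964, 10.0);
        twoYears.put(1965, 12.0);
        Map<Integer, Double> oneYear = new HashMap<>();
        oneYear.put(1965, 20.0);

        Map<String, Map<Integer, Double>> cityMap = new HashMap<>();
        cityMap.put("CityA", twoYears); // delta of 2.0
        cityMap.put("CityB", oneYear);  // no delta available, skipped
        System.out.println(averagePerCountry(cityMap));
    }
}
```

Whether to skip, reject, or report such cities is a design decision that depends on the data; the filter only shows where that decision would plug in.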

Edit 1: which could be improved further as:

// The average of all cities in a country is the average of a country.
AtomicReference<Double> countryValAvg = new AtomicReference<>(0.0);
countriesMap.forEach((country, cityMap) -> {
    // The average of all these deltas is the average annual temperature delta for a city.
    double avgAnnualTempDeltaPerCity = cityMap.values()
            .stream()
            .mapToDouble(Quick::averagePerCity) // Quick is my class name
            .average()
            .orElse(Double.NaN);
    countryValAvg.set(countryValAvg.get() + avgAnnualTempDeltaPerCity);
});
System.out.println(countryValAvg.get() / countriesMap.size());

Edit 2: and further:

double avgAnnualTempDeltaPerCity = countriesMap.values().stream()
        .mapToDouble(cityMap -> cityMap.values()
                .stream()
                .mapToDouble(Quick::averagePerCity) // Quick is my class name
                .average()
                .orElse(Double.NaN))
        .average().orElse(Double.NaN);
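The snippets above reduce everything to one number averaged across all countries, while the problem statement asks for a delta for each country. A minimal sketch of keeping one value per country follows; it assumes the same country -> city -> (year -> average) map shape as countriesMap, and the class `PerCountrySketch` and method `averageDeltaPerCountry` are hypothetical names.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

class PerCountrySketch {

    // Mean year-over-year delta for one city; NaN if fewer than two years are present.
    static double averagePerCity(Map<Integer, Double> yearToAverage) {
        List<Double> sorted = new ArrayList<>(new TreeMap<>(yearToAverage).values());
        return IntStream.range(0, sorted.size() - 1)
                .mapToDouble(i -> sorted.get(i + 1) - sorted.get(i))
                .average()
                .orElse(Double.NaN);
    }

    // country -> average of its cities' average annual deltas.
    static Map<String, Double> averageDeltaPerCountry(
            Map<String, Map<String, Map<Integer, Double>>> countriesMap) {
        return countriesMap.entrySet().stream()
                .collect(Collectors.toMap(
                        Map.Entry::getKey,
                        entry -> entry.getValue().values().stream()
                                .mapToDouble(PerCountrySketch::averagePerCity)
                                .average()
                                .orElse(Double.NaN)));
    }

    public static void main(String[] args) {
        Map<Integer, Double> years = new HashMap<>();
        years.put(1963, 1.0);
        years.put(1964, 2.0);
        years.put(1965, 5.0);

        Map<String, Map<Integer, Double>> cities = new HashMap<>();
        cities.put("Karachi", years);

        Map<String, Map<String, Map<Integer, Double>>> countries = new HashMap<>();
        countries.put("Pakistan", cities);

        // Deltas are 1.0 and 3.0, so Pakistan maps to 2.0.
        System.out.println(averageDeltaPerCountry(countries));
    }
}
```

The result is a Map<String, Double> keyed by country, which can then be printed or averaged further if a single overall figure is also wanted.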

Regarding Java 8 Streams multiple grouping by, we found a similar question on Stack Overflow: https://stackoverflow.com/questions/54132104/
