
java - Grouping a dataset into separate sub-datasets by value

Reposted. Author: 太空宇宙. Updated: 2023-11-04 11:39:51

I want to write a program that works on a dataset made up of several columns, like this:

+-----------+---------------+-------------------+-----------------------+
|Item_ID |Product_Name |Manufacturer_Name |Product_Description |
+-----------+---------------+-------------------+-----------------------+
|12345 |Pen |Cello |Ball Pen Soft Nib... |
|12346 |Pencil |Nataraja |Pencil HB Extra D... |
|42345 |Ruler |Nataraja |Scale No.1103 15c... |
|12677 |Sharpener |Nataraja |Pencil Shraperner... |
|12987 |Pen |Reynolds |Dot Pen Extra Gr... |
|44326 |Pen |Reynolds |Gel Pen German T... |
|13456 |Pen |Cello |Dot Pen 0.5mm Nib... |
|19876 |Eraser |Cello |Dust free Eraser ... |
|43246 |Ink Pen |Hero |Ink Pen Smooth Ha... |
+-----------+---------------+-------------------+-----------------------+

I want to group the dataset by Manufacturer_Name, like this:

Manufacturer = Cello
+-----------+---------------+-------------------+-----------------------+
|Item_ID |Product_Name |Manufacturer_Name |Product_Description |
+-----------+---------------+-------------------+-----------------------+
|12345 |Pen |Cello |Ball Pen Soft Nib... |
|13456 |Pen |Cello |Dot Pen 0.5mm Nib... |
|19876 |Eraser |Cello |Dust free Eraser ... |
+-----------+---------------+-------------------+-----------------------+

Manufacturer = Nataraja
+-----------+---------------+-------------------+-----------------------+
|Item_ID |Product_Name |Manufacturer_Name |Product_Description |
+-----------+---------------+-------------------+-----------------------+
|12346 |Pencil |Nataraja |Pencil HB Extra D... |
|42345 |Ruler |Nataraja |Scale No.1103 15c... |
|12677 |Sharpener |Nataraja |Pencil Shraperner... |
+-----------+---------------+-------------------+-----------------------+

Manufacturer = Reynolds
+-----------+---------------+-------------------+-----------------------+
|Item_ID |Product_Name |Manufacturer_Name |Product_Description |
+-----------+---------------+-------------------+-----------------------+
|12987 |Pen |Reynolds |Dot Pen Extra Gr... |
|44326 |Pen |Reynolds |Gel Pen German T... |
+-----------+---------------+-------------------+-----------------------+

Manufacturer = Hero
+-----------+---------------+-------------------+-----------------------+
|Item_ID |Product_Name |Manufacturer_Name |Product_Description |
+-----------+---------------+-------------------+-----------------------+
|43246 |Ink Pen |Hero |Ink Pen Smooth Ha... |
+-----------+---------------+-------------------+-----------------------+

I tried the following code, but it did not produce good results. Please help me improve this program. This is the code I used:

Dataset<Row> countsBy = src.select("Manufacturer_Name").distinct();
List<Row> lsts = countsBy.collectAsList();
for (Row lst : lsts) {
    String man = lst.toString();
    System.out.println("Records of " + man + " only");
    Dataset<Row> mandataset = src.filter("Manufacturer_Name='" + man + "'");
    mandataset.show();
}

Best answer

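Maybe you could try building a map of datasets keyed by a string (the Manufacturer_Name). On each iteration you read the Manufacturer_Name, check whether it is already in the map (creating the entry if it is not), and finally add the row to the right dataset.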

You would end up with something like this:

// ShopItem stands for whatever type holds one row of the dataset
Map<String, ArrayList<ShopItem>> dic = new HashMap<String, ArrayList<ShopItem>>();
for (/*...*/)
{
    String Manufacturer_Name = /* you get the name */;
    if (/* the Manufacturer_Name is not in dic */)
    {
        dic.put(Manufacturer_Name, new ArrayList<ShopItem>());
    }
    dic.get(Manufacturer_Name).add(/* what you want to add */);
}

You then need a second loop, but only to print the data.
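
For reference, here is a minimal runnable sketch of the two loops in plain Java. It rests on some assumptions: the `ShopItem` class, its fields, the `sampleItems()` helper and the class name `ManufacturerGrouping` are hypothetical and only mirror the columns from the question, and the rows are held in an ordinary `List` rather than a Spark `Dataset<Row>`. It uses `computeIfAbsent` in place of the explicit "is it already in the map" check, and a `LinkedHashMap` so the groups come out in first-seen order.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class ManufacturerGrouping {

    // Hypothetical row type; the fields mirror three of the question's columns.
    static final class ShopItem {
        final int itemId;
        final String productName;
        final String manufacturerName;

        ShopItem(int itemId, String productName, String manufacturerName) {
            this.itemId = itemId;
            this.productName = productName;
            this.manufacturerName = manufacturerName;
        }

        @Override
        public String toString() {
            return itemId + " | " + productName + " | " + manufacturerName;
        }
    }

    // First loop: add each row to the list kept for its manufacturer.
    static Map<String, List<ShopItem>> groupByManufacturer(List<ShopItem> items) {
        // LinkedHashMap keeps manufacturers in the order they first appear.
        Map<String, List<ShopItem>> groups = new LinkedHashMap<>();
        for (ShopItem item : items) {
            // Creates the list on first sight of a manufacturer, then appends.
            groups.computeIfAbsent(item.manufacturerName, k -> new ArrayList<>()).add(item);
        }
        return groups;
    }

    // Sample rows copied from the table in the question (descriptions omitted).
    static List<ShopItem> sampleItems() {
        List<ShopItem> items = new ArrayList<>();
        items.add(new ShopItem(12345, "Pen", "Cello"));
        items.add(new ShopItem(12346, "Pencil", "Nataraja"));
        items.add(new ShopItem(42345, "Ruler", "Nataraja"));
        items.add(new ShopItem(12677, "Sharpener", "Nataraja"));
        items.add(new ShopItem(12987, "Pen", "Reynolds"));
        items.add(new ShopItem(44326, "Pen", "Reynolds"));
        items.add(new ShopItem(13456, "Pen", "Cello"));
        items.add(new ShopItem(19876, "Eraser", "Cello"));
        items.add(new ShopItem(43246, "Ink Pen", "Hero"));
        return items;
    }

    public static void main(String[] args) {
        Map<String, List<ShopItem>> groups = groupByManufacturer(sampleItems());
        // Second loop: only prints what the first loop collected.
        for (Map.Entry<String, List<ShopItem>> entry : groups.entrySet()) {
            System.out.println("Manufacturer = " + entry.getKey());
            for (ShopItem item : entry.getValue()) {
                System.out.println("  " + item);
            }
        }
    }
}
```

Note that this gathers every row into memory on one machine, so it only suits data small enough for that.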

Hope this solves your problem!

Edit: replaced dictionary with map (sorry) and added a link

How do you create a dictionary in Java?

Edit: changed the code to match the new idea

Regarding java - Grouping a dataset into separate sub-datasets by value, we found a similar question on Stack Overflow: https://stackoverflow.com/questions/42931596/
