
c# - Find duplicates in a DataTable and then compare the duplicates

Reposted. Author: 太空宇宙. Updated: 2023-11-03 11:57:52

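I have a DataTable that contains duplicate rows. I need to find the duplicates and compare the duplicate rows to get the best value in certain columns.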

DataTable dt = new DataTable();

dt.Rows.Add(1, "Test1", "584", 12);
dt.Rows.Add(2, "Test2", "32", 123);
dt.Rows.Add(3, "Test3", "425", 54);
dt.Rows.Add(4, "Test1", "4", 755);
dt.Rows.Add(5, "Test5", "854", 879);
dt.Rows.Add(6, "Test2", "1", null);
dt.Rows.Add(7, "Test2", "999", 3);

Note that Test1 and Test2 have duplicates.

(1, "Test1", "584", 12)
(4, "Test1", "4", 755)

(2, "Test2", "32", 123)
(6, "Test2", "1", null)
(7, "Test2", "999", 3)

Now that I have the duplicates, I need to build a single row from each group that holds the best values. The new DataTable should show:

Test1 = "Test1", "584", 755
Test2 = "Test2", "999", 123
Test3 = "Test3", "425", 54
Test5 = "Test5", "854", 879

Best answer

// GroupBy(x => x[1]): group the rows by the second column
// Where(x => x.Count() > 1): keep only groups with more than one row, i.e. the duplicates
var duplicates = dt.Rows.OfType<DataRow>().GroupBy(x => x[1]).Where(x => x.Count() > 1).ToList();

// Enumerate all duplicate groups
foreach (var duplicate in duplicates)
{
    // Enumerate each row of the duplicate group
    foreach (var dataRow in duplicate)
    {
        // do something…
        // I don't know your rules for deciding why one row is better than another, so you have to figure that part out yourself or extend your question
    }
}

Perhaps this is what you are looking for:

DataTable dt = new DataTable();
dt.Columns.Add("Id", typeof(int));
dt.Columns.Add("Text", typeof(string));
dt.Columns.Add("Value1", typeof(string));
dt.Columns.Add("Value2", typeof(int));

dt.Rows.Add(1, "Test1", "584", 12);
dt.Rows.Add(2, "Test2", "32", 123);
dt.Rows.Add(3, "Test3", "425", 54);
dt.Rows.Add(4, "Test1", "4", 755);
dt.Rows.Add(5, "Test5", "854", 879);
dt.Rows.Add(6, "Test2", "1", null);
dt.Rows.Add(7, "Test2", "999", 3);

var duplicates = dt.Rows.OfType<DataRow>().GroupBy(x => x[1]).Where(x => x.Count() > 1).ToList();

// Get the current highest Id (first column) so that each merged row added below receives the next available Id
var highestId = dt.Rows.OfType<DataRow>().Max(x => (int)x[0]);

// Enumerate all duplicate groups
foreach (var duplicate in duplicates)
{
    // Get the highest value of each column
    var newId = ++highestId;
    var newText = duplicate.Key;
    var newValue1 = duplicate.Max(x => x[2]); // This is a string comparison, not a numeric one: for example, "2" is considered greater than "10"

    // Use this instead if you need a numeric comparison; Max() throws if no value in the group parses as an int
    var newValue1AsNumeric = duplicate.Select(x =>
    {
        if (int.TryParse(Convert.ToString(x[2]), out var value))
            return value;

        return (int?)null;
    }).OfType<int>().Max();

    // A null cell is stored as DBNull, which OfType<int>() filters out
    var newValue2 = duplicate.Select(x => x[3]).OfType<int>().Max();

    // Remove every row of the duplicate group
    foreach (var dataRow in duplicate)
        dt.Rows.Remove(dataRow);

    // Add the merged row
    dt.Rows.Add(newId, newText, newValue1, newValue2);
}

You can see the code running here: https://dotnetfiddle.net/rp1DUc
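The answer edits the table in place and appends the merged rows with new Ids, while the question asks for a new table without the Id column. A minimal sketch of that variant follows, assuming the column names "Text", "Value1" and "Value2" from the answer, assuming "Text" is never null, and assuming "best" means the numerically largest value. `DuplicateMerger` and `MergeByText` are hypothetical names for illustration, not part of the original answer.

```csharp
using System;
using System.Data;
using System.Linq;

public static class DuplicateMerger
{
    // Hypothetical helper: returns a new table with one row per distinct
    // "Text" value, holding the per-column maxima of that group.
    public static DataTable MergeByText(DataTable source)
    {
        var result = new DataTable();
        result.Columns.Add("Text", typeof(string));
        result.Columns.Add("Value1", typeof(string));
        result.Columns.Add("Value2", typeof(int));

        // Group every row, not only the duplicates, so unique rows are kept too.
        var groups = source.Rows.Cast<DataRow>().GroupBy(row => (string)row["Text"]);

        foreach (var group in groups)
        {
            // Value1 is stored as text; parse it so that "10" beats "2".
            // Values that do not parse as an int are ignored.
            int? bestValue1 = group
                .Select(row => int.TryParse(Convert.ToString(row["Value1"]), out var n) ? n : (int?)null)
                .Max();

            // Value2 may be DBNull; OfType<int>() drops those cells.
            int? bestValue2 = group
                .Select(row => row["Value2"])
                .OfType<int>()
                .Select(n => (int?)n)
                .Max();

            // Max() over int? yields null for an all-null group, stored as DBNull.
            result.Rows.Add(
                group.Key,
                bestValue1.HasValue ? (object)bestValue1.Value.ToString() : DBNull.Value,
                bestValue2.HasValue ? (object)bestValue2.Value : DBNull.Value);
        }

        return result;
    }
}
```

Calling `DuplicateMerger.MergeByText(dt)` on the sample table leaves `dt` unchanged. Because the maxima are computed over nullable ints, a group whose values are all missing yields a DBNull cell instead of an exception.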

Regarding c# - Find duplicates in a DataTable and then compare the duplicates, we found a similar question on Stack Overflow: https://stackoverflow.com/questions/58835299/
