c# - Removing duplicates from an IList of ILists without Linq

Reposted. Author: 塔克拉玛干. Updated: 2023-11-03 05:42:47
What is the most efficient way to remove duplicates from an IList in C# without Linq?

I have the following code from another SO question [1]:

IList<IList<int>> output = new List<IList<int>>();
var lists = output;
for (int i = 0; i < lists.Count; ++i)
{
    // Since we want to compare sequences, ensure the items are in the same order
    var item = lists[i].OrderBy(x => x).ToArray();
    // Walk backwards so RemoveAt does not shift the indices still to be visited
    for (int j = lists.Count - 1; j > i; --j)
    {
        if (item.SequenceEqual(lists[j].OrderBy(x => x)))
            lists.RemoveAt(j);
    }
}

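I am using this in a larger coding challenge, without Linq or syntactic sugar, and I would like to see whether there is an elegant or fast solution.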

I am thinking of just using a hash, but I am not sure which hash function to use to determine that a list is already present.

To be clearer, for an input like

{{1,2,4, 4}, {3,4,5}, {4,2,1,4} }

the intermediate output is [sorted input/output is fine]

{{1,2,4,4}, {3,4,5}, {1,2,4,4} }

Output:

{{1,2,4,4}, {3,4,5}}
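
For illustration, the sort-then-compare steps in this example could be sketched without Linq roughly as follows. This is only a minimal sketch under assumptions: the names SortedDedupe and RemoveDuplicates are hypothetical, and using the comma-joined sorted values as a string key is a swapped-in technique, not the code from [1].

```csharp
using System.Collections.Generic;

// Hypothetical helper, for illustration only.
public static class SortedDedupe
{
    // Returns a new list holding the first occurrence of each distinct multiset.
    // Every kept inner list is a sorted copy; the caller's lists are not modified.
    public static IList<IList<int>> RemoveDuplicates(IList<IList<int>> lists)
    {
        var seen = new HashSet<string>();
        var result = new List<IList<int>>();

        foreach (IList<int> inner in lists)
        {
            // Copy before sorting so the input stays untouched.
            var sorted = new List<int>(inner);
            sorted.Sort();

            // Assumption: the comma-joined sorted values serve as a canonical key.
            // Two lists share a key exactly when they hold the same values
            // with the same multiplicities.
            string key = string.Join(",", sorted);

            // HashSet<T>.Add returns false when the key is already present.
            if (seen.Add(key))
                result.Add(sorted);
        }

        return result;
    }
}
```

Called with the input above, this keeps {1,2,4,4} and {3,4,5} and drops the second {1,2,4,4}. Building a string per list is simple but allocates; a custom comparer avoids that cost.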

Best answer

I used a modified version of the internals of Microsoft's CollectionAssert.AreEquivalent:

using System.Collections.Generic;

public class Program
{
    public static void Main()
    {
        var lists = new List<List<int>>
        {
            new List<int> {1, 4, 2},
            new List<int> {3, 4, 5},
            new List<int> {1, 2, 4}
        };

        // The HashSet drops lists the comparer considers equal; copy the survivors back into a List
        var dedupe =
            new List<List<int>>(new HashSet<List<int>>(lists, new MultiSetComparer<int>()));
    }

    // Equal if both sequences contain the same elements, each the same number of times, in any order
    public class MultiSetComparer<T> : IEqualityComparer<IEnumerable<T>>
    {
        public bool Equals(IEnumerable<T> first, IEnumerable<T> second)
        {
            if (first == null)
                return second == null;

            if (second == null)
                return false;

            if (ReferenceEquals(first, second))
                return true;

            // Shortcut when we can cheaply look at counts
            var firstCollection = first as ICollection<T>;
            var secondCollection = second as ICollection<T>;
            if (firstCollection != null && secondCollection != null)
            {
                if (firstCollection.Count != secondCollection.Count)
                    return false;

                if (firstCollection.Count == 0)
                    return true;
            }

            // Now compare elements
            return !HaveMismatchedElement(first, second);
        }

        private static bool HaveMismatchedElement(IEnumerable<T> first, IEnumerable<T> second)
        {
            int firstNullCount;
            int secondNullCount;

            // Create dictionary of unique elements with their counts
            var firstElementCounts = GetElementCounts(first, out firstNullCount);
            var secondElementCounts = GetElementCounts(second, out secondNullCount);

            if (firstNullCount != secondNullCount || firstElementCounts.Count != secondElementCounts.Count)
                return true;

            // Make sure the counts for each element are equal, exiting early as soon as they differ
            foreach (var kvp in firstElementCounts)
            {
                var firstElementCount = kvp.Value;
                int secondElementCount;
                // A missing key leaves secondElementCount at 0, which counts as a mismatch
                secondElementCounts.TryGetValue(kvp.Key, out secondElementCount);

                if (firstElementCount != secondElementCount)
                    return true;
            }

            return false;
        }

        private static Dictionary<T, int> GetElementCounts(IEnumerable<T> enumerable, out int nullCount)
        {
            var dictionary = new Dictionary<T, int>();
            nullCount = 0;

            foreach (T element in enumerable)
            {
                // Dictionary keys cannot be null, so nulls are counted separately
                if (element == null)
                {
                    nullCount++;
                }
                else
                {
                    int num;
                    dictionary.TryGetValue(element, out num);
                    num++;
                    dictionary[element] = num;
                }
            }

            return dictionary;
        }

        public int GetHashCode(IEnumerable<T> enumerable)
        {
            int hash = 17;
            // Create and sort list in-place, rather than OrderBy(x=>x), because linq is forbidden in this question.
            // List<T>.Sort() uses the default comparer, so T must be comparable (true for int).
            var list = new List<T>(enumerable);
            list.Sort();
            foreach (T val in list)
                hash = unchecked(hash * 23 + (val == null ? 42 : val.GetHashCode()));

            return hash;
        }
    }
}

This uses HashSet<T>; adding to this collection automatically ignores duplicates.
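
To make the "ignores duplicates" behavior concrete, here is a minimal sketch that relies on HashSet<T>.Add returning false for an item the comparer already considers present. The names SortedListComparer and HashSetDedupe are hypothetical, and the comparer is a simplified int-only stand-in chosen for brevity, not the comparer from the answer. Its hash sums the elements, which is one order-independent choice that needs no sorting.

```csharp
using System.Collections.Generic;

// Hypothetical comparer, for illustration only:
// two int lists are equal when their sorted copies match element by element.
public sealed class SortedListComparer : IEqualityComparer<IList<int>>
{
    public bool Equals(IList<int> a, IList<int> b)
    {
        if (ReferenceEquals(a, b)) return true;
        if (a == null || b == null) return false;
        if (a.Count != b.Count) return false;

        // Sort copies so the inputs are not modified.
        var x = new List<int>(a);
        var y = new List<int>(b);
        x.Sort();
        y.Sort();

        for (int i = 0; i < x.Count; i++)
        {
            if (x[i] != y[i]) return false;
        }
        return true;
    }

    public int GetHashCode(IList<int> list)
    {
        if (list == null) return 0;

        // Addition is commutative, so equal multisets always get the same hash.
        int hash = 0;
        unchecked
        {
            foreach (int value in list) hash += value;
        }
        return hash;
    }
}

public static class HashSetDedupe
{
    // Keeps the first occurrence of each multiset, preserving input order.
    public static List<IList<int>> Dedupe(IList<IList<int>> lists)
    {
        var seen = new HashSet<IList<int>>(new SortedListComparer());
        var result = new List<IList<int>>();

        foreach (IList<int> list in lists)
        {
            // Add returns false if an equal list is already in the set.
            if (seen.Add(list)) result.Add(list);
        }
        return result;
    }
}
```

A summed hash collides more often than a sorted, multiplied one (for example {1,4} and {2,3} both hash to 5), but collisions only cost extra Equals calls and never affect correctness.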

The last line could be written like this:

var dedupe = new HashSet<List<int>>(lists, new MultiSetComparer<int>()).ToList();

Technically this uses the System.Linq namespace, but I do not think that is your concern with Linq.

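I will echo what Eric Lippert said. You are asking us to show you the raw workings of Linq and the framework internals, but it is not a closed box. Also, if you think that looking at the source code of these methods will reveal obvious inefficiencies and optimization opportunities, I find those are usually not easy to spot; you are better off reading the documentation and measuring.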

Regarding c# - Removing duplicates from an IList of ILists without Linq, we found a similar question on Stack Overflow: https://stackoverflow.com/questions/41686501/