gpt4 book ai didi

vba - 过滤列表的算法

转载 作者:塔克拉玛干 更新时间:2023-11-03 02:59:19 26 4
gpt4 key购买 nike

我已经实现了我认为在 VBA 中过滤 System.Collections.ArrayList 的非常垃圾的方法。该代码采用一个列表和一个要过滤掉的项目/比较值。它遍历列表并删除匹配项。然后它重新启动循环(因为你不能 For Each.Remove 同时)

Public Sub Filter(ByVal testValue As Object, ByVal dataSet As ArrayList)
'testValue and the items in `dataSet` all Implement IComparable from mscorlib.dll
'This allows comparing objects for equality
'i.e. obj1.CompareTo(obj2) = 0 is equivalent to obj1 = obj2
Dim item As IComparable
Dim repeat As Boolean
repeat = False
For Each item In dataSet
If item.CompareTo(testValue) = 0 Then 'or equiv; If item = testValue
dataSet.Remove item
repeat = True
Exit For
End If
Next item
If repeat Then Filter testValue, dataSet
End Sub

为什么垃圾

假设列表有 X 个元素长,包含 Y 个符合过滤条件的项目,X>Y。据我所知,当所有 Y 都在开始时聚集时,最佳情况下的性能是 O(X)。最坏的情况是所有 Y 都聚集在最后。在那种情况下,算法需要 (X-Y)*Y 查找操作,当 Y=X/2 时最大,所以 O(X^2)

与简单的 O(X) 算法相比,这是很差的,该算法在您到达匹配项时逐步执行并删除,但不会中断循环。但是我不知道如何实现它。 有没有办法提高这个过滤器的性能?

最佳答案

你能不能不做下面这样的事情,我相信这是 O(n):

Option Explicit

Public Sub RemItems()

Const TARGET_VALUE As String = "dd"
Dim myList As Object
Set myList = CreateObject("System.Collections.ArrayList")

myList.Add "a"
myList.Add "dd"
myList.Add "a"
myList.Add "a"
myList.Add "a"
myList.Add "dd"
myList.Add "a"
myList.Add "a"
myList.Add "dd"
myList.Add "a"
myList.Add "a"

Dim i As Long
For i = myList.Count - 1 To 0 Step -1
If myList(i) = TARGET_VALUE Then myList.Remove myList(i)
Next i

End Sub

有关复杂性的信息,请参阅此讨论:

Asymptotic complexity of .NET collection classes

如果this可以相信(.NET-Big-O-Algorithm-Complexity-Cheat-Sheet):

enter image description here

注意:我用 https://htmledit.squarefree.com/ 渲染了 HTML

编辑:

警告 - 我不是 CS 毕业生。这是在玩。我确信对于正在处理的数据类型、分布等存在争论......欢迎改进

上面的 .Net 表显示从 HashTable 中删除平均为 O(1),而 ArrayList 为 O(n) ,所以我从值 {"a","b","c"} 中随机生成了 100,000 行。然后我将其用作我的固定测试集​​以获得以下结果。

Runs

Test set proportions

测试运行代码(请轻柔!)

Option Explicit

Private Declare PtrSafe Function getFrequency Lib "kernel32" _
Alias "QueryPerformanceFrequency" (cyFrequency As Currency) As Long
Private Declare PtrSafe Function getTickCount Lib "kernel32" _
Alias "QueryPerformanceCounter" (cyTickCount As Currency) As Long

Public Sub TestingArrayList()
Const TARGET_VALUE = "a"
Dim aList As Object
Set aList = CreateObject("System.Collections.ArrayList")

Dim arr()
arr = ThisWorkbook.Worksheets("Sheet1").Range("A1").CurrentRegion.Value '<== Reads in 100000 value

Dim i As Long
For i = 1 To UBound(arr, 1) '50000
aList.Add arr(i, 2)
Next i

Debug.Print aList.Contains(TARGET_VALUE)

Dim StartTime As Double

StartTime = MicroTimer()

For i = aList.Count - 1 To 0 Step -1
If aList(i) = TARGET_VALUE Then aList.Remove aList(i)
Next i

Debug.Print "Removal from array list took: " & Round(MicroTimer - StartTime, 3) & " seconds"
Debug.Print aList.Contains(TARGET_VALUE)

End Sub

Public Sub TestingHashTable()
Const TARGET_VALUE = "a"
Dim hTable As Object
Set hTable = CreateObject("System.Collections.HashTable")

Dim arr()
arr = ThisWorkbook.Worksheets("Sheet1").Range("A1").CurrentRegion.Value '<== Reads in 100000 value

Dim i As Long
For i = 1 To UBound(arr, 1) '50000
hTable.Add i, arr(i, 2)
Next i

Debug.Print hTable.ContainsValue(TARGET_VALUE)

Dim StartTime As Double

StartTime = MicroTimer()

For i = hTable.Count To 1 Step -1
If hTable(i) = TARGET_VALUE Then hTable.Remove i
Next i

Debug.Print "Removal from hash table took: " & Round(MicroTimer - StartTime, 3) & " seconds"
Debug.Print hTable.ContainsValue(TARGET_VALUE)

End Sub

Public Function MicroTimer() As Double

Dim cyTicks1 As Currency
Static cyFrequency As Currency

MicroTimer = 0

If cyFrequency = 0 Then getFrequency cyFrequency

getTickCount cyTicks1

If cyFrequency Then MicroTimer = cyTicks1 / cyFrequency
End Function

上面的内容似乎是 0(1)。

简单地看一下删除过程(删除其他因素),结果不太确定,但同样,我的编码可能是一个因素!

Deletion run

修改后的代码(去除其他因素):

Option Explicit

Public Sub TestingComparison()

Const RUN_COUNT As Long = 4

Dim hTable As Object
Dim aList As Object
Dim i As Long, j As Long, k As Long, rowCount As Long
Dim results() As Double

Set hTable = CreateObject("System.Collections.HashTable")
Set aList = CreateObject("System.Collections.ArrayList")

Dim testSizes()
testSizes = Array(100, 1000, 10000, 100000) ', 1000000)
ReDim results(0 To RUN_COUNT * (UBound(testSizes) + 1) - 1, 0 To 4)

Application.ScreenUpdating = False

With ThisWorkbook.Worksheets("Sheet5")

For i = LBound(testSizes) To UBound(testSizes)

For k = 1 To RUN_COUNT

For j = 1 To testSizes(i)
hTable.Add j, 1
aList.Add 1
Next j

Dim StartTime As Double, completionTime As Double

StartTime = MicroTimer()

For j = hTable.Count To 1 Step -1
hTable.Remove j
Next j

results(rowCount, 3) = Round(MicroTimer - StartTime, 3)
results(rowCount, 0) = testSizes(i)
results(rowCount, 1) = k

StartTime = MicroTimer()

For j = aList.Count - 1 To 0 Step -1
aList.Remove aList(j)
Next j

results(rowCount, 2) = Round(MicroTimer - StartTime, 3)

hTable.Clear
aList.Clear
rowCount = rowCount + 1
Next k

Next i

.Range("A2").Resize(UBound(results, 1) + 1, UBound(results, 2)) = results

End With

Application.ScreenUpdating = True
End Sub

关于vba - 过滤列表的算法,我们在Stack Overflow上找到一个类似的问题: https://stackoverflow.com/questions/50304460/

26 4 0
Copyright 2021 - 2024 cfsdn All Rights Reserved 蜀ICP备2022000587号
广告合作:1813099741@qq.com 6ren.com