gpt4 book ai didi

.net - Sort() 和 BinarySearch() 与整数/字符串的性能比较

转载 作者:行者123 更新时间:2023-12-01 08:40:04 25 4
gpt4 key购买 nike

最初我想问对整数进行排序是否比对字符串进行排序更快。但我自己已经回答了这个问题,我对巨大的差异感到惊讶。为什么排序和 BinarySearch 整数与字符串相比要快得多?

使用 1.000.000 Int32/Strings 的 (VB.Net) 测试:

Private Function CheckIntBinarySearch() As TimeSpan
Dim watch As New System.Diagnostics.Stopwatch()
Dim rnd As New Random(Date.Now.Millisecond)
Dim intCol1 As New List(Of Int32)
Dim intCol2 As New List(Of Int32)
Dim contains As Int32
For i As Int32 = 1 To 1000000
intCol1.Add(rnd.Next(1, 1000000))
Next
For i As Int32 = 1 To 1000000
intCol2.Add(rnd.Next(1, 1000000))
Next
Me.output.WriteLine("Integers sorting ...")
watch.Start()
intCol1.Sort()
watch.Stop()
Me.output.WriteLine("Sorting finished: " & watch.Elapsed.TotalSeconds & " seconds elapsed.")

Me.output.WriteLine("Integers BinarySearch ...")
watch.Start()
For Each Val As Int32 In intCol2
If intCol1.BinarySearch(Val) > -1 Then contains += 1
Next
watch.Stop()
Me.output.WriteLine("BinarySearch finished(contains " & contains & "): " & watch.Elapsed.TotalSeconds & " seconds elapsed.")
Return watch.Elapsed
End Function

Private Function CheckStringBinarySearch() As TimeSpan
Dim watch As New System.Diagnostics.Stopwatch()
Dim rnd As New Random(Date.Now.Millisecond)
Dim stringCol1 As New List(Of String)
Dim stringCol2 As New List(Of String)
Dim contains As Int32
For i As Int32 = 1 To 1000000
stringCol1.Add(rnd.Next(1, 1000000).ToString)
Next
For i As Int32 = 1 To 1000000
stringCol2.Add(rnd.Next(1, 1000000).ToString)
Next
Me.output.WriteLine("Strings sorting ...")
watch.Start()
stringCol1.Sort()
watch.Stop()
Me.output.WriteLine("Sorting finished: " & watch.Elapsed.TotalSeconds & " seconds elapsed.")
Me.output.WriteLine("Strings BinarySearch ...")
watch.Start()
For Each Val As String In stringCol2
If stringCol1.BinarySearch(Val) > -1 Then contains += 1
Next
watch.Stop()
Me.output.WriteLine("BinarySearch finished(contains " & contains & "): " & watch.Elapsed.TotalSeconds & " seconds elapsed.")
Return watch.Elapsed
End Function

检查性能 5 次:

For i As Int32 = 1 To 5
intChecks.Add(CheckIntBinarySearch())
Next
For i As Int32 = 1 To 5
stringChecks.Add(CheckStringBinarySearch())
Next

输出:

    1.)Integers sorting ...
Sorting finished: 0,2292863 seconds elapsed.
Integers BinarySearch ...
BinarySearch finished(contains 630857): 0,9365744 seconds elapsed.
2.)Integers sorting ...
Sorting finished: 0,2287382 seconds elapsed.
Integers BinarySearch ...
BinarySearch finished(contains 632600): 0,9053134 seconds elapsed.
3.)Integers sorting ...
Sorting finished: 0,2318829 seconds elapsed.
Integers BinarySearch ...
BinarySearch finished(contains 631475): 0,9038594 seconds elapsed.
4.)Integers sorting ...
Sorting finished: 0,2308994 seconds elapsed.
Integers BinarySearch ...
BinarySearch finished(contains 632346): 0,9011047 seconds elapsed.
5.)Integers sorting ...
Sorting finished: 0,2266423 seconds elapsed.
Integers BinarySearch ...
BinarySearch finished(contains 632982): 0,893541 seconds elapsed.
1.)Strings sorting ...
Sorting finished: 6,5661916 seconds elapsed.
Strings BinarySearch ...
BinarySearch finished(contains 632579): 12,9545657 seconds elapsed.
2.)Strings sorting ...
Sorting finished: 6,5641975 seconds elapsed.
Strings BinarySearch ...
BinarySearch finished(contains 631478): 13,0184132 seconds elapsed.
3.)Strings sorting ...
Sorting finished: 6,4281382 seconds elapsed.
Strings BinarySearch ...
BinarySearch finished(contains 631775): 12,7684214 seconds elapsed.
4.)Strings sorting ...
Sorting finished: 6,9455087 seconds elapsed.
Strings BinarySearch ...
BinarySearch finished(contains 631478): 13,7057234 seconds elapsed.
5.)Strings sorting ...
Sorting finished: 6,6707111 seconds elapsed.
Strings BinarySearch ...
BinarySearch finished(contains 632346): 13,0493649 seconds elapsed.
  • Int32 平均排序:0,22948982
  • String 平均排序:6,63494942
  • Int32 BinarySearch 平均:0,90807858
  • String BinarySearch 平均:13,09929772

结论:

  • 排序整数比排序字符串快 29
  • BinarySearch Integers 比 BinarySearch Strings 快 14,4

为什么?考虑拥有大量“字符串整数”(“1”,“2”,“3”,...)。在对它们进行排序和搜索之前将它们解析为整数会更好吗?将字符串解析为整数的成本是多少?好的,我想这是另一个问题。

最佳答案

字符串比较存在许多整数比较转义的问题。

  1. 考虑一下如何编写字符串比较。首先,您可以检查引用是否为空,然后您可以检查两个字符串的长度是否匹配,然后您可以一次比较每个数字。这至少是 2 次比较,其中整数只使用一次。
  2. 考虑比较“111112”和“111113”。对于字符串,在得到结果之前需要进行 6 次比较(每个字符一次)。对于整数,它只是一种比较。
  3. 可以优化整数比较和数组以使用特定于寄存器的指令,因此可能会直接在 CPU 缓存中进行大量排序/搜索 - 对于字符串,可能需要调用多个方法来访问字符、检查长度等。<
  4. 如果您需要特定于语言环境的比较,或者可以处理“ß”与“ss”或“æ”与“ae”相同的比较,您甚至不能使用长度优化。
  5. 字符串是引用类型,整数是值类型,因此可能存在取消引用。

此列表并不详尽,但至少可以了解问题。

顺便说一句,不影响性能 - 你应该知道比较字符串时“12”<“2”(这是按字典顺序发生的),所以上面的代码可能不是你想要的。

关于.net - Sort() 和 BinarySearch() 与整数/字符串的性能比较,我们在Stack Overflow上找到一个类似的问题: https://stackoverflow.com/questions/4183200/

25 4 0
Copyright 2021 - 2024 cfsdn All Rights Reserved 蜀ICP备2022000587号
广告合作:1813099741@qq.com 6ren.com