performance - Data.Map 是二叉搜索树的最佳数据类型吗？-6ren

performance - Data.Map 是二叉搜索树的最佳数据类型吗？

转载作者：行者123 更新时间：2023-12-05 05:54:06

我正在编写一些对性能敏感的代码，需要在二叉搜索树中进行查找。树本身将在每次运行开始时生成一次，然后永远不会修改。查找将经常执行，很容易以数百万次执行，因此我只关心查找性能。

树的键是Char s 并且可以廉价地转换为 Int ，所以我假设 Data.Map表亲将是最佳选择:我使用的是严格的 IntMap这里。但是，我得到的性能明显低于我对二叉搜索树的期望。我在这里用一个相当人为的例子来演示。

import qualified Data.IntMap.Strict as IM


divideByTen :: Int -> Int
divideByTen n = maybe 0 snd $ IM.lookupLE n divideByTenMap

divideByTenMap :: IM.IntMap Int
divideByTenMap = IM.fromList $ zip [0,10..1000] [0..]

我的数据不会均匀分布，因此某些子集的性能将比整体更重要。因此，为这些常用输入引入一些快捷方式是有意义的。

divideByTenFaster :: Int -> Int
divideByTenFaster n
  | n < 10  = 0
  | n < 20  = 1
  | n < 30  = 2
  | n < 40  = 3
  | n < 50  = 4
  | n < 60  = 5
  | n < 70  = 6
  | n < 80  = 7
  | n < 90  = 8
  | n < 100 = 9
  | otherwise = divideByTen n

这个IntMap有100个元素，因此二分查找最多应该进行 7 次比较。因此我希望 divideByTenFaster比 divideByTen 快对于 n < 70 , 之后更慢。相反，我看到的是 divideByTenFaster是 IntMap 的两倍查找，即使是 n < 100 .运行以下基准测试代码:

import Criterion.Main

main :: IO ()
main = do
    defaultMainWith defaultConfig $
      [ bench "divideByTen"       $ nf divideByTen 99
      , bench "divideByTenFaster" $ nf divideByTenFaster 99
      ]

我们得到

benchmarking divideByTen
time                 18.50 ns   (18.47 ns .. 18.54 ns)
                     1.000 R²   (1.000 R² .. 1.000 R²)
mean                 18.49 ns   (18.47 ns .. 18.51 ns)
std dev              79.42 ps   (61.51 ps .. 110.3 ps)

benchmarking divideByTenFaster
time                 7.164 ns   (7.154 ns .. 7.176 ns)
                     1.000 R²   (1.000 R² .. 1.000 R²)
mean                 7.161 ns   (7.145 ns .. 7.176 ns)
std dev              49.60 ps   (36.84 ps .. 68.32 ps)

在我的实际问题中，我看到了更显着的差异:IntMap查找速度要慢一个数量级。我发现自己编写了冗长且难以维护的快捷方式，本质上是手动构建一个不平衡的二叉搜索树，这在常见情况下显着提高了性能，并略微降低了一般情况。

我可以转向基于数组的解决方案，但我认为这是 Map 的理想用例.是否缺少一些优化，或者另一种数据类型是更好的选择？

对于完整上下文，我正在处理的问题是查找由 Unicode 规范确定的 Unicode 字符宽度。查找树有大约 700 到 1100 个元素，具体取决于上下文。 ASCII 子集的性能非常重要，除此之外，广泛使用的脚本(拉丁文、汉表意文字、阿拉伯文、天城文等)应该相当快，而较少使用的脚本的性能可能会有所牺牲。

编辑:

鉴于评论中的许多建议，我已经对解决此问题的几种不同方法进行了基准测试。对于纯粹的查找性能，没有什么比一个巨大的数组更好，尽管这需要不平凡的内存和构造成本，但其他方法的性能都相似，除了在这里直接实现 Data.Map.Strict 的结构(感谢@dfeuer 下面的评论) .

{-# LANGUAGE BangPatterns     #-}
{-# LANGUAGE FlexibleContexts #-}


import qualified Data.IntMap.Strict as IM
import qualified Data.Map.Strict as M
import qualified Data.Map.Internal as MInt
import qualified Data.Vector as V
import qualified Data.Vector.Generic as G
import qualified Data.Vector.Unboxed as VU

bstSize :: Int
bstSize = 800

domainSize :: Int
domainSize = 1100000

divideByTenShortcut :: Int -> Int
divideByTenShortcut n
  | n < 10 = 0
  | n < 20 = 1
  | n < 30 = 2
  | n < 40 = 3
  | n < 50 = 4
  | n < 60 = 5
  | n < 70 = 6
  | n < 80 = 7
  | n < 90 = 8
  | n < 100 = 9
  | otherwise = divideByTenIntMapLookup n

divideByTenIntMapLookup :: Int -> Int
divideByTenIntMapLookup n = maybe 0 snd $ IM.lookupLE n divideByTenIntMap

divideByTenIntMap :: IM.IntMap Int
divideByTenIntMap = IM.fromList $ zip [0,10..bstSize*10] [0..]

divideByTenMapLookup :: Int -> Int
divideByTenMapLookup n = maybe 0 snd $ M.lookupLE n divideByTenMap

divideByTenMap :: M.Map Int Int
divideByTenMap = M.fromList $ zip [0,10..bstSize*10] [0..]

arrayBinarySearch :: G.Vector v Int => v Int -> Int -> Int
arrayBinarySearch vs search = loop 0 (G.length vs)
  where
    loop !low !high
       | high <= low = low
       | otherwise = case compare midval search of
            LT -> loop midpoint high
            GT -> loop low (midpoint - 1)
            EQ -> midpoint
      where
        midpoint = (low + high + 1) `quot` 2
        midval = G.unsafeIndex vs midpoint

divideByTenArrayLookup :: Int -> Int
divideByTenArrayLookup n = let i = arrayBinarySearch divideByTenArray n in V.unsafeIndex divideByTenArraySols i

divideByTenArray :: V.Vector Int
divideByTenArray = V.fromList [0,10..bstSize*10]

divideByTenArraySols :: V.Vector Int
divideByTenArraySols = V.fromList [0..bstSize]

divideByTenArrayUnboxedLookup :: Int -> Int
divideByTenArrayUnboxedLookup n = let i = arrayBinarySearch divideByTenArrayUnboxed n in VU.unsafeIndex divideByTenArrayUnboxedSols i

divideByTenArrayUnboxed :: VU.Vector Int
divideByTenArrayUnboxed = VU.fromList [0,10..bstSize*10]

divideByTenArrayUnboxedSols :: VU.Vector Int
divideByTenArrayUnboxedSols = VU.fromList [0..bstSize]

divideByTenFullArrayLookup :: Int -> Int
divideByTenFullArrayLookup = V.unsafeIndex divideByTenFullArray

divideByTenFullArray :: V.Vector Int
divideByTenFullArray = V.fromList $ map (`quot` 10) [0..domainSize]

divideByTenFullArrayUnboxedLookup :: Int -> Int
divideByTenFullArrayUnboxedLookup = VU.unsafeIndex divideByTenFullArrayUnboxed

divideByTenFullArrayUnboxed :: VU.Vector Int
divideByTenFullArrayUnboxed = VU.fromList $ map (`quot` 10) [0..domainSize]

divideByTenCharMapLookup :: Int -> Int
divideByTenCharMapLookup n = lookupLE n divideByTenCharMap

divideByTenCharMap :: CharMap
divideByTenCharMap = fromList $ zip [0,10..bstSize*10] [0..]

data CharMap = Bin {-# UNPACK #-} !Size !Int !Int !CharMap !CharMap
             | Tip
type Size     = Int

lookupLE :: Int -> CharMap -> Int
lookupLE = goNothing
  where
    goNothing !_ Tip = 0
    goNothing k (Bin _ kx x l r) = case compare k kx of LT -> goNothing k l
                                                        EQ -> x
                                                        GT -> goJust k kx x r

    goJust !_ !_ x' Tip = x'
    goJust k kx' x' (Bin _ kx x l r) = case compare k kx of LT -> goJust k kx' x' l
                                                            EQ -> x
                                                            GT -> goJust k kx x r
{-# INLINABLE lookupLE #-}

fromList :: [(Int, Int)] -> CharMap
fromList = repack . M.fromList
  where
    repack MInt.Tip = Tip
    repack (MInt.Bin s k v l r) = Bin s k v (repack l) (repack r)



import Control.Exception (evaluate)
import Criterion.Main

main :: IO ()
main = do
    evaluate divideByTenIntMap
    evaluate divideByTenMap
    evaluate divideByTenCharMap
    evaluate divideByTenArray
    evaluate divideByTenArrayUnboxed
    evaluate divideByTenFullArray
    evaluate divideByTenFullArrayUnboxed

    defaultMainWith defaultConfig
      [ bench "divideByTenIntMap"           $ nf divideByTenIntMapLookup 99
      , bench "divideByTenShortcut"         $ nf divideByTenShortcut 99
      , bench "divideByTenMap"              $ nf divideByTenMapLookup 99
      , bench "divideByTenCharMap"          $ nf divideByTenCharMapLookup 99

      , bench "divideByTenArray"            $ nf divideByTenArrayLookup 99
      , bench "divideByTenArrayUnboxed"     $ nf divideByTenArrayUnboxedLookup 99
      , bench "divideByTenFullArray"        $ nf divideByTenFullArrayLookup 99
      , bench "divideByTenFullArrayUnboxed" $ nf divideByTenFullArrayUnboxedLookup 99
      ]

Build profile: -w ghc-8.10.7 -O1

benchmarking divideByTenIntMap
time                 21.31 ns   (21.26 ns .. 21.36 ns)
                     1.000 R²   (1.000 R² .. 1.000 R²)
mean                 21.31 ns   (21.27 ns .. 21.36 ns)
std dev              137.3 ps   (104.7 ps .. 190.5 ps)

benchmarking divideByTenShortcut
time                 6.881 ns   (6.868 ns .. 6.893 ns)
                     1.000 R²   (1.000 R² .. 1.000 R²)
mean                 6.896 ns   (6.873 ns .. 6.947 ns)
std dev              112.5 ps   (40.52 ps .. 184.2 ps)
variance introduced by outliers: 23% (moderately inflated)

benchmarking divideByTenMap
time                 23.31 ns   (22.94 ns .. 23.68 ns)
                     0.999 R²   (0.998 R² .. 1.000 R²)
mean                 23.02 ns   (22.91 ns .. 23.26 ns)
std dev              500.1 ps   (245.4 ps .. 865.6 ps)
variance introduced by outliers: 33% (moderately inflated)

benchmarking divideByTenCharMap
time                 15.89 ns   (15.88 ns .. 15.92 ns)
                     1.000 R²   (1.000 R² .. 1.000 R²)
mean                 15.90 ns   (15.88 ns .. 15.92 ns)
std dev              68.89 ps   (44.90 ps .. 113.1 ps)

benchmarking divideByTenArray
time                 27.07 ns   (27.01 ns .. 27.13 ns)
                     1.000 R²   (1.000 R² .. 1.000 R²)
mean                 27.03 ns   (26.92 ns .. 27.13 ns)
std dev              334.5 ps   (248.0 ps .. 474.7 ps)
variance introduced by outliers: 14% (moderately inflated)

benchmarking divideByTenArrayUnboxed
time                 22.89 ns   (22.82 ns .. 22.99 ns)
                     1.000 R²   (1.000 R² .. 1.000 R²)
mean                 22.94 ns   (22.84 ns .. 23.10 ns)
std dev              418.6 ps   (301.6 ps .. 555.1 ps)
variance introduced by outliers: 26% (moderately inflated)

benchmarking divideByTenFullArray
time                 6.639 ns   (6.626 ns .. 6.651 ns)
                     1.000 R²   (1.000 R² .. 1.000 R²)
mean                 6.635 ns   (6.626 ns .. 6.644 ns)
std dev              28.87 ps   (24.14 ps .. 34.83 ps)

benchmarking divideByTenFullArrayUnboxed
time                 6.029 ns   (6.017 ns .. 6.041 ns)
                     1.000 R²   (1.000 R² .. 1.000 R²)
mean                 6.021 ns   (6.013 ns .. 6.029 ns)
std dev              26.09 ps   (21.68 ps .. 31.51 ps)

Benchmark doclayout-bench: FINISH

所以问题似乎不在于IntMap 本身。检查 -ddump-simpl 的输出还显示值似乎被适本地拆箱。

最佳答案

您可能想看看这个包:hashtables .它实现了多个可变哈希表，从描述来看，您似乎可以使用一些 cabal 标志来尝试加快速度，例如，sse42 来加速缓存 -布谷鸟哈希的行搜索。

关于performance - Data.Map 是二叉搜索树的最佳数据类型吗？，我们在Stack Overflow上找到一个类似的问题： https://stackoverflow.com/questions/69716515/

文章推荐： vuejs3 - Vue 3 组合 API 中的参数是引用还是 react ？

文章推荐： macos - MAMP PRO 在启动时崩溃 (Monterey 12.0.1)

haskell - 类型家庭类型黑客
我正在尝试编写一个相当多态的库。我遇到了一种更容易表现出来却很难说出来的情况。它看起来有点像这样: {-# LANGUAGE ScopedTypeVariables #-} {-# LANGUAGE
javascript - 这是如何运作的？类型 = 类型 || 'any' ;
谁能解释一下这个表达式是如何工作的？ type = type || 'any'; 这是否意味着如果类型未定义则使用“任意”？最佳答案如果 type 为“falsy”(即 false，或 undef
f# - 类型 'obj' 不是接口(interface)类型
我有一个界面，在IAnimal.fs中， namespace Kingdom type IAnimal = abstract member Eat : Food -> unit 以及另一个成功
c++ - 类型(变量)与(类型)变量
这个问题在这里已经有了答案: 关闭 10 年前。 Possible Duplicate: What is the difference between (type)value and type(va
c# - 默认(可空(类型))与默认(类型)
在 C# 中，default(Nullable) 之间有区别吗？ (或 default(long?) )和 default(long) ？ Long只是一个例子，它可以是任何其他struct类型。最
scala - 如何定义一个 HList 类型，但基于另一个 HList 类型
假设我有一个案例类: case class Foo(num: Int, str: String, bool: Boolean) 现在我还有一个简单的包装器: sealed trait Wrapper[
c# - 如何在运行时定义委托(delegate)类型(即动态委托(delegate)类型)
这个问题在这里已经有了答案: Create C# delegate type with ref parameter at runtime (1 个回答) 关闭 2 年前。为了即时创建委托(dele
python - dct 中的断言失败(类型 == CV_32FC1 || 类型 == CV_64FC1)
我正在尝试获取图像的 dct。一开始我遇到了错误 The function/feature is not implemented (Odd-size DCT's are not implemented
ios - PList 类型？应用程序/x-plist 类型
我正在尝试使用 AFNetworking 的 AFPropertyListRequestOperation，但是当我尝试下载它时，出现错误预期的内容类型{( “应用程序/x-plist” )}, 得
javascript - 元素隐式具有 'any' 类型，因为索引表达式不是 'number' 类型
我在下面收到错误。我知道这段代码的意思，但我不知道界面应该是什么样子: Element implicitly has an 'any' type because index expression is
swift2 - 类型 'Error' 约束为非协议(protocol)类型，即使类型是协议(protocol)
我尝试将 SignalType 从 ReactiveCocoa 扩展为自定义 ErrorType，代码如下所示 enum MyError: ErrorType { // .. cases }
scala - 如何使用 Scala 的 this 类型、抽象类型等来实现 Self 类型？
我无法在任何其他问题中找到答案。假设我有一个抽象父类(super class) Abstract0，它有两个子类 Concrete1 和 Concrete1。我希望能够在 Abstract0 中定义类
日期时间字段上的 MySQL 索引不是 RANGE 类型，而是使用 INDEX 类型
我想知道为什么这个索引没有用在 RANGE 类型中，而是用在 INDEX 中: 索引: CREATE INDEX myindex ON orders(order_date); 查询: EXPLAIN
java - IncompleteClassChangeError ...原本应该是 direct 类型，但结果却发现是 virtual 类型
我正在使用 RxJava，现在我尝试通过提供 lambda 来订阅可观察对象: observableProvider.stringForKey(CURRENT_DELETED_ID) .sub
javascript - MIME 类型 ('text/html' ) 不是受支持的样式表 MIME 类型
我已经尝试了几乎所有解决问题的方法，其中包括。为提供类型使用app.use(express.static('public'))还有更多，但我似乎无法为此找到解决方案。 index.js : imp
css - 哪个更快？输入[类型 ="submit"] 或 [类型 ="submit"]
以下哪个 CSS 选择器更快？ input[type="submit"] { /* styles */ } 或 [type="submit"] { /* styles */ } 只是好
java - 通过构造函数参数表达的不满足的依赖关系，索引为 0 类型 -> 没有合格的 bean 类型
我不知道这个设置有什么问题，我在 IDEA 中获得了所有注释(@Controller、@Repository、@Service)，它在行号左侧显示 bean，然后转到该 bean。这是错误: 14-
c - jni 回调适用于 java 类型，但不适用于 c 类型
我听从了建议 registering java function as a callback in C function并且可以使用“简单”类型(例如整数和字符串)进行回调，例如: jstring j
java - 将 java 类型 string[] 映射到 oracle 类型
有一些 java 类，加载到 Oracle 数据库(版本 11g)和 pl/sql 函数包装器: create or replace function getDataFromJava( in_uLis
javascript - 元素隐式具有 'any' 类型，因为索引表达式不是 'number' 类型 [7015]
我已经从 David Walsh 的 css 动画回调中获取代码并将其修改为 TypeScript。但是，我收到一个错误，我不知道为什么: interface IBrowserPrefix { [

行者123

个人简介

我是一名优秀的程序员,十分优秀！

作者热门文章

滴滴打车优惠券免费领取

全站热门文章

首页

博学

6Ren·AI

商城

performance - Data.Map 是二叉搜索树的最佳数据类型吗？