arrays - Haskell:我可以将整数直接读入数组吗？-6ren

arrays - Haskell:我可以将整数直接读入数组吗？

转载作者：行者123 更新时间：2023-12-04 18:11:01

在 this编程问题，输入是 n × m整数矩阵。通常，n ≈ 105 和 m ≈ 10. 官方solution (1606D，Tutorial)非常必要:它涉及一些矩阵操作、预计算和聚合。为了好玩，我把它当作一个 STUArray 实现练习。
问题
我已经设法使用 STUArray 实现它，但程序仍然需要 way more memory超过允许的 (256MB)。即使在本地运行，最大驻留集大小也是 >400 MB。在分析时，从标准输入读取似乎主导了内存占用:

功能 readv和 readv.readInt ，负责解析整数并将它们保存到二维列表中，是大约 50-70 MB，而不是大约 16 MB =(106 个整数)×(每个整数 8 个字节 + 每个链接 8 个字节)。
有希望吗总内存低于 256 MB？ 我已经在使用 Text输入包。也许我应该完全避免列表和直接从标准输入读取整数到数组。我们怎么能做到这一点？ 或者，是其他地方的问题吗？
代码

{-# OPTIONS_GHC -O2 #-}
module CF1606D where
import qualified Data.Text as T
import qualified Data.Text.IO as TI
import qualified Data.Text.Read as TR
import Control.Monad
import qualified Data.List as DL
import qualified Data.IntSet as DS
import Control.Monad.ST
import Data.Array.ST.Safe
import Data.Int (Int32)
import Data.Array.Unboxed

solve :: IO ()
solve =  do
  ~[n,m] <- readv      
  -- 2D list
  input <- {-# SCC input #-} replicateM (fromIntegral n) readv     
  let
      ints = [1..]
      sorted = DL.sortOn (head.fst) (zip input ints)
      (rows,indices) = {-# SCC rows_inds #-} unzip sorted    
      -- 2D list converted into matrix:
      matrix = mat (fromIntegral n) (fromIntegral m) rows           
      infinite = 10^7
      asc x y = [x,x+1..y]
      desc x y = [y,y-1..x]    
      -- Four prefix-matrices:
      tlMax = runSTUArray $ prefixMat max 0 asc asc (subtract 1) (subtract 1) =<< matrix
      blMin = runSTUArray $ prefixMat min infinite desc asc (+1) (subtract 1) =<< matrix
      trMin = runSTUArray $ prefixMat min infinite asc desc (subtract 1) (+1) =<< matrix
      brMax = runSTUArray $ prefixMat max 0 desc desc (+1) (+1) =<< matrix    
      good _ (i,j)
        | tlMax!(i,j) < blMin!(i+1,j) && brMax!(i+1,j+1) < trMin!(i,j+1) = Left (i,j)
        | otherwise = Right ()
      {-# INLINABLE good #-}
      nearAns = foldM good () [(i,j)|i<-[1..n-1],j<-[1..m-1]]
      ans = either (\(i,j)-> "YES\n" ++ color n (take i indices) ++ " " ++ show j) (const "NO") nearAns
  putStrLn ans

type I = Int32
type S s = (STUArray s (Int, Int) I)
type R = Int -> Int -> [Int]
type F = Int -> Int

mat :: Int -> Int -> [[I]] -> ST s (S s)
mat n m rows = newListArray ((1,1),(n,m)) $ concat rows

prefixMat :: (I->I->I) -> I -> R -> R -> F -> F -> S s -> ST s (S s)
prefixMat opt worst ordi ordj previ prevj mat = do
  ((ilo,jlo),(ihi,jhi)) <- getBounds mat
  pre <- newArray ((ilo-1,jlo-1),(ihi+1,jhi+1)) worst
  forM_ (ordi ilo ihi) $ \i-> do
    forM_ (ordj jlo jhi) $ \j -> do
      matij <- readArray mat (i,j)
      prei <- readArray pre (previ i,j)
      prej <- readArray pre (i, prevj j)
      writeArray pre (i,j) (opt (opt prei prej) matij)
  return pre

color :: Int -> [Int] -> String
color n inds = let
  temp = DS.fromList inds
  colors = [if DS.member i temp then 'B' else 'R' | i<-[1..n]]
  in colors

readv :: Integral t => IO [t]
readv = map readInt . T.words <$> TI.getLine where
  readInt = fromIntegral . either (const 0) fst . TR.signed TR.decimal
{-# INLINABLE readv #-}

main :: IO ()
main = do
  ~[n] <- readv
  replicateM_ n solve

上面代码的简要说明:

阅读 n每行有 m整数。

按行的第一个元素对行进行排序。

现在计算四个“前缀矩阵”，每个角一个。对于左上角和右下角，它是前缀最大值，对于其他两个角，它是我们需要计算的前缀最小值。

找到这些前缀矩阵满足以下条件的单元格 [i,j]:top_left [i,j] < bottom_left [i,j] and top_right [i,j] > bottom_right [i,j]

对于第 1 行到第 i 行，将它们的原始索引(即在未排序的输入矩阵中的位置)标记为蓝色。将其余标记为红色。

示例输入和命令
样本输入: inp3.txt .
命令:

> stack ghc -- -main-is CF1606D.main -with-rtsopts="-s -h -p -P" -rtsopts -prof -fprof-auto CF1606D
> gtime -v ./CF1606D < inp3.txt > outp
    ...
    ...
    MUT     time    2.990s  (  3.744s elapsed)    #    RTS -s output
    GC      time    4.525s  (  6.231s elapsed)    #    RTS -s output
    ...
    ...
    Maximum resident set size (kbytes): 408532    #    >256 MB (gtime output)

> stack exec -- hp2ps -t0.1 -e8in -c CF1606D.hp && open CF1606D.ps

GC问题:如上面 +RTS -s 输出所示，GC 似乎比实际逻辑执行花费的时间更长。这是正常的吗？有没有办法随着时间的推移可视化 GC 事件？我尝试使矩阵严格，但这没有任何影响。
可能这根本不是一个功能友好的问题(尽管我很高兴在这方面被反驳)。例如， Java 也使用 GC，但有很多成功的 Java 提交。 不过，我想看看我能推多远。谢谢!

最佳答案

与普遍看法相反，Haskell 对此类问题非常友好。真正的问题是 array GHC 附带的库完全是垃圾。另一个大问题是每个人都在 Haskell 中被教导在应该使用数组的地方使用列表，这通常是慢代码和内存膨胀程序的主要来源之一。因此，GC 需要很长时间也就不足为奇了，这是因为分配的东西太多了。以下是针对下面提供的解决方案提供的输入的运行:

   1,483,547,096 bytes allocated in the heap
         566,448 bytes copied during GC
      18,703,640 bytes maximum residency (3 sample(s))
       1,223,400 bytes maximum slop
              32 MiB total memory in use (0 MB lost due to fragmentation)

                                     Tot time (elapsed)  Avg pause  Max pause
  Gen  0      1399 colls,     0 par    0.009s   0.009s     0.0000s    0.0011s
  Gen  1         3 colls,     0 par    0.002s   0.002s     0.0006s    0.0016s

  TASKS: 4 (1 bound, 3 peak workers (3 total), using -N1)

  SPARKS: 0 (0 converted, 0 overflowed, 0 dud, 0 GC'd, 0 fizzled)

  INIT    time    0.001s  (  0.001s elapsed)
  MUT     time    0.484s  (  0.517s elapsed)
  GC      time    0.011s  (  0.011s elapsed)
  EXIT    time    0.001s  (  0.002s elapsed)
  Total   time    0.496s  (  0.530s elapsed)

下面提供的解决方案使用数组库 massiv ，这使得无法提交给 codeforces。然而，希望目标是在 Haskell 上做得更好，而不是在某些网站上获得积分。
红蓝矩阵可以分为两个阶段:读取和求解
读
阅读尺寸
在 main函数我们只读取数组的总数和每个数组的维度。我们也打印结果。这里没有什么令人兴奋的。 (请注意，链接文件 inp3.txt 的数组大于问题中定义的限制: n*m <= 10^6)

import Control.Monad.ST
import Control.Monad
import qualified Data.ByteString as BS
import Data.Massiv.Array as A hiding (B)
import Data.Massiv.Array.Mutable.Algorithms (quicksortByM_)
import Control.Scheduler (trivialScheduler_)

main :: IO ()
main = do
  t <- Prelude.read <$> getLine
  when (t < 1 || t > 1000) $ error $ "Invalid t: " ++ show t
  replicateM_ t $ do
    dimsStr <- getLine
    case Prelude.map Prelude.read (words dimsStr) of
      -- Test file fails this check: && n * m <= 10 ^ (6 :: Int) -> do
      [n, m] | n >= 2 && m > 0 && m <= 5 * 10 ^ (5 :: Int) -> do
        mat <- readMatrix n m
        case solve mat of
          Nothing -> putStrLn "NO"
          Just (ix, cs) -> do
            putStrLn "YES"
            putStr $ foldMap show cs
            putStr " "
            print ix
      _ -> putStrLn $ "Unexpected dimensions: " ++ show dimsStr

读取数组
将输入加载到数组中是原始问题的主要问题来源:

没有必要依赖text , ascii 字符是问题预期的唯一有效输入。

输入被读入列表列表。该列表列表是内存开销的真正来源。

对列表进行排序非常慢并且需要大量内存。

通常在这种情况下，最好使用类似 conduit 的方式以流方式读取输入。 .特别是，将输入读取为字节流并将这些字节解析为数字将是最佳解决方案。话虽如此，在问题描述中对每个数组的宽度有严格要求，因此我们可以将输入逐行读取为 ByteString然后解析每行中的数字(为简单起见假设无符号)并将这些数字同时写入数组。这确保在这个阶段我们将只分配结果数组和单行作为字节序列。这可以通过像 attoparsec 这样的解析库来完成。，但问题很简单，只需临时执行即可。

type Val = Word

readMatrix :: Int -> Int -> IO (Matrix P Val)
readMatrix n m = createArrayS_ (Sz2 n m) readMMatrix

readMMatrix :: MMatrix RealWorld P Val -> IO ()
readMMatrix mat =
  loopM_ 0 (< n) (+ 1) $ \i -> do
    line <- BS.getLine
    --- ^ reads at most 10Mb because it is known that input will be at most
    -- 5*10^5 Words: 19 digits max per Word and one for space: 5*10^5 * 20bytes
    loopM 0 (< m) (+ 1) line $ \j bs ->
      let (word, bs') = parseWord bs
       in bs' <$ write_ mat (i :. j) word
  where
    Sz2 n m = sizeOfMArray mat
    isSpace = (== 32)
    isDigit w8 = w8 >= 48 && w8 <= 57
    parseWord bs =
      case BS.uncons bs of
        Just (w8, bs')
          | isDigit w8 -> parseWordLoop (fromIntegral (w8 - 48)) bs'
          | otherwise -> error $ "Unexpected byte: " ++ show w8
        Nothing -> error "Unexpected end of input"
    parseWordLoop !acc bs =
      case BS.uncons bs of
        Nothing -> (acc, bs)
        Just (w8, bs')
          | isSpace w8 -> (acc, bs')
          | isDigit w8 -> parseWordLoop (acc * 10 + fromIntegral (w8 - 48)) bs'
          | otherwise -> error $ "Unexpected byte: " ++ show w8

解决
这是我们实现实际解决方案的步骤。我没有尝试修复这个 SO 问题中提供的解决方案，而是继续翻译 C++ solution而是在问题中链接。我走这条路的原因是双重的:

C++ 解决方案是非常命令式的，我想证明命令式数组操作对 Haskell 来说并不是那么陌生，所以我尝试创建一个尽可能接近的翻译。

我知道解决方案有效

请注意，应该可以用 array 重写下面的解决方案包，因为最后只需要 read , write和 allocate操作。

computeSortBy ::
     (Load r Ix1 e, Manifest r' e)
  => (e -> e -> Ordering)
  -> Vector r e
  -> Vector r' e
computeSortBy f vec = 
  withLoadMArrayST_ vec $ quicksortByM_ (\x y -> pure $ f x y) trivialScheduler_

solve :: Matrix P Val -> Maybe (Int, [Color])
solve a = runST $ do
  let sz@(Sz2 n m) = size a
      ord :: Vector P Int
      ord = computeSortBy 
              (\x y -> compare (a ! (y :. 0)) (a ! (x :. 0))) (0 ..: n)
  mxl <- newMArray @P sz minBound
  loopM_ (n - 1) (>= 0) (subtract 1) $ \ i ->
    loopM_ 0 (< m) (+ 1) $ \j -> do
      writeM mxl (i :. j) (a ! ((ord ! i) :. j))
      when (i < n - 1) $
        writeM mxl (i :. j) 
          =<< max <$> readM mxl (i :. j) <*> readM mxl (i + 1 :. j)
      when (j > 0) $
        writeM mxl (i :. j) 
          =<< max <$> readM mxl (i :. j) <*> readM mxl (i :. j - 1)
  mnr <- newMArray @P sz maxBound
  loopM_ (n - 1) (>= 0) (subtract 1) $ \ i ->
    loopM_ (m - 1) (>= 0) (subtract 1) $ \ j -> do
      writeM mnr (i :. j) (a ! ((ord ! i) :. j))
      when (i < n - 1) $
        writeM mnr (i :. j) 
          =<< min <$> readM mnr (i :. j) <*> readM mnr (i + 1 :. j)
      when (j < m - 1) $
        writeM mnr (i :. j) 
          =<< min <$> readM mnr (i :. j) <*> readM mnr (i :. j + 1)
  mnl <- newMArray @P (Sz m) maxBound
  mxr <- newMArray @P (Sz m) minBound
  let goI i
        | i < n - 1 = do
          loopM_ 0 (< m) (+ 1) $ \j -> do
            val <- min (a ! ((ord ! i) :. j)) <$> readM mnl j
            writeM mnl j val
            when (j > 0) $
              writeM mnl j . min val =<< readM mnl (j - 1)
          loopM_ (m - 1) (>= 0) (subtract 1) $ \j -> do
            val <- max (a ! ((ord ! i) :. j)) <$> readM mxr j
            writeM mxr j val
            when (j < m - 1) $
              writeM mxr j . max val =<< readM mxr (j + 1)
          let goJ j
                | j < m - 1 = do
                    mnlVal <- readM mnl j
                    mxlVal <- readM mxl (i + 1 :. j)
                    mxrVal <- readM mxr (j + 1)
                    mnrVal <- readM mnr ((i + 1) :. (j + 1))
                    if mnlVal > mxlVal && mxrVal < mnrVal
                      then pure $ Just (i, j)
                      else goJ (j + 1)
                | otherwise = pure Nothing
          goJ 0 >>= \case
            Nothing -> goI (i + 1)
            Just pair -> pure $ Just pair
        | otherwise = pure Nothing
  mAns <- goI 0
  Control.Monad.forM mAns $ \ (ansFirst, ansSecond) -> do
    resVec <- createArrayS_ @BL (Sz n) $ \res ->
        iforM_ ord $ \i ordIx -> do
          writeM res ordIx $! if i <= ansFirst then R else B
    pure (ansSecond + 1, A.toList resVec)

关于arrays - Haskell:我可以将整数直接读入数组吗？，我们在Stack Overflow上找到一个类似的问题： https://stackoverflow.com/questions/70143678/

文章推荐： haskell - list monad 不是一个免费的 monad，但是……

文章推荐： list - Do-notation 和列表单子(monad)

文章推荐： haskell - GHC 中函数参数的内联

文章推荐： haskell - GHC 9 中的 `forall {..}`

arrays - 为什么是 &array != &array[0]？
在 C 中: int a[10]; printf("%p\n", a); printf("%p\n", &a[0]); 产量: 0x7fff5606c600 0x7fff5606c600 这是我所期望
arrays - 替换 Array of Array 中元素的位置
我一直在尝试运行此循环来更改基于数组的元素的位置，但出现以下错误。不太确定哪里出了问题。任何想法或想法!谢谢。 var population = [[98, 8, 45, 34, 56], [9, 1
arrays - Ruby Array of Arrays 按值分组和计数
我正在尝试获取一个 Ruby 数组数组并将其分组以计算其值。数组有一个月份和一个 bool 值: array = [["June", false], ["June", false], ["June"
javascript - array.split ("stop here") array to array of arrays in javascript 数组
所以我们的目标是在遇到某个元素时将数组分割成子数组下面的示例 array.split("stop here") ["haii", "keep", "these in the same array bu
java - Arrays.stream(array) 与 Arrays.asList(array).stream()
在this问题已经回答了两个表达式是相等的，但在这种情况下它们会产生不同的结果。对于给定的 int[] 分数，为什么会这样: Arrays.stream(scores) .forEac
arrays - 我如何制作 Perl "array of arrays of hashes"？
我认为我需要的是哈希数组的数组，但我不知道如何制作它。 Perl 能做到吗？如果是这样，代码会是什么样子？最佳答案 perldoc perldsc是了解 Perl 数据结构的好文档。关于arra
android - GSON 解析 Array in array in array
我遇到了这个问题，从 API 中我得到一个扩展 JSON，其中包含一个名为坐标的对象，该对象是一个包含数组 o 数组的数组。为了更清楚地看这个例子: "coordinates": [
arrays - Postgres JSONB : where clause for arrays of arrays
postgres 中有(v 9.5，如果重要的话): create table json_test( id varchar NOT NULL, data jsonb NOT NULL, PRIM
arrays - bash中echo "${array[@]}"echo "${array[*]}"有什么区别
我用 echo "${array[@]}" 和 echo "${array[*]}" 得到了相同的结果。如果我这样做: mkdir 假音乐； touch fakemusic/{Beatles,Sto
arrays - Array of arrays of typealias 表达式类型不明确，没有更多上下文
我正在尝试创建 typealias 对象的数组数组 - 但我收到“表达式类型不明确，没有更多上下文”编译错误。这是我的代码: typealias TestClosure = ((message: St
python - array.array 与 numpy.array
如果您在 Python 中创建一维数组，使用 NumPy 包有什么好处吗？最佳答案这完全取决于您打算如何处理数组。如果您所做的只是创建简单数据类型的数组并进行 I/O，array模块就可以了。另
arrays - Perl6 : Pushing an array to an array of arrays with one element seems not to work as expected
当我将数组推送到只有一个数组作为其唯一元素的数组数组时，为什么会得到这种数据结构？ use v6; my @d = ( [ 1 .. 3 ] ); @d.push( [ 4 .. 6 ] ); @d.
arrays - 如何从 Array{Array{Int64,2},1} 转换为 Array{Int64,2}
在 Julia 中，我想将定义为二维数组向量的数据转换为二维矩阵数组。如下例所述，我想把数据s转换成数据t，但是至今没有成功。我该如何处理这个案子？ julia> s = [[1 2 3], [4
c - 1[&array] 是 &array[sizeof(array)/sizeof(array[0])] 的合适替代品还是太混淆了？
C 没有elementsof 关键字来获取数组的元素数。所以这通常由计算 sizeof(Array)/sizeof(Array[0]) 代替但这需要重复数组变量名。1[&Array] 是指向数组后第一
arrays - 为什么我可以调用 array.some() 而不是 array.every() 联合数组类型？
所以，假设我有一个像这样的(愚蠢的)函数: function doSomething(input: number|string): boolean { if (input === 42 || in
arrays - 需要的公式 : Sort array to array -"zig-zag"
我有以下数组: a = [1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16] 我将它用于一些像这样的视觉内容: 1 2 3 4 5 6 7 8 9 10
arrays - Scala:array.toList 与 array.to[List]
我想知道数组中的 .toList 与 .to[List] 之间有什么区别。我在spark-shell中做了这个测试，结果没有区别，但我不知道用什么更好。任何意见？ scala> val l = Arr
arrays - Array.Find和IndexOf用于完全相同对象的多个元素
我很难获得完全相同对象的多个元素的当前元素索引: $b = "A","D","B","D","C","E","D","F" $b | ? { $_ -contains "D" } 替代版本: $b =
arrays - Vuetify : How to do a v-select search in array of arrays
我正在尝试使用来自我的 API 的 v-select 执行 options，我将数据放在数组数组中。 Array which I got from API 它应该是一个带有搜索的 select，因为它
arrays - char *array 和 char array[] 之间的内存区别是什么？
这个问题在这里已经有了答案: String literals: pointer vs. char array (1 个回答) 4 个月前关闭。当我执行下一个代码时 int main() {

行者123

个人简介

我是一名优秀的程序员,十分优秀！

作者热门文章

滴滴打车优惠券免费领取

全站热门文章

首页

博学

6Ren·AI

商城

arrays - Haskell:我可以将整数直接读入数组吗？