gpt4 book ai didi

performance - 提高字符串操作的速度

转载 作者:行者123 更新时间:2023-12-04 14:46:20 25 4
gpt4 key购买 nike

这是一个后续问题,有点像这个问题:Write an efficient string replacement function? .

在(尽管遥远的) future ,我希望能够进行自然语言处理。当然,因此字符串操作的速度很重要。偶然地,我偶然发现了这个测试:http://raid6.com.au/~onlyjob/posts/arena/ - 所有测试都有偏差,这次也不异常(exception)。然而,它对我提出了重要的问题。所以我写了一些测试来看看我是怎么做的:

这是我的第一次尝试(我称之为#A):

#A

(defun test ()
(declare (optimize (debug 0) (safety 0) (speed 3)))
(loop with addidtion = (concatenate 'string "abcdefgh" "efghefgh")
and initial = (get-internal-real-time)
for i from 0 below (+ (* (/ 1024 (length addidtion)) 1024 4) 1000)
for ln = (* (length addidtion) i)
for accumulated = addidtion
then (loop with concatenated = (concatenate 'string accumulated addidtion)
for start = (search "efgh" concatenated)
while start do (replace concatenated "____" :start1 start)
finally (return concatenated))
when (zerop (mod ln (* 1024 256))) do
(format t "~&~f s | ~d Kb" (/ (- (get-internal-real-time) initial) 1000) (/ ln 1024)))
(values))

(test)

对结果感到困惑,我尝试使用 cl-ppcre - 我不知道我希望什么,但结果非常糟糕......这是我用于测试的代码:

#B
(ql:quickload "cl-ppcre")

(defun test ()
(declare (optimize (debug 0) (safety 0) (speed 3)))
(loop with addidtion = (concatenate 'string "abcdefgh" "efghefgh")
and initial = (get-internal-real-time)
for i from 0 below (+ (* (/ 1024 (length addidtion)) 1024 4) 1000)
for ln = (* (length addidtion) i)
for accumulated = addidtion
then (cl-ppcre:regex-replace-all "efgh" (concatenate 'string accumulated addidtion) "____")
when (zerop (mod ln (* 1024 256))) do
(format t "~&~f s | ~d Kb" (/ (- (get-internal-real-time) initial) 1000) (/ ln 1024)))
(values))

(test)

那么,为了避免一些概括,我决定自己写一个,虽然有点天真:

#C
(defun replace-all (input match replacement)
(declare (type string input match replacement)
(optimize (debug 0) (safety 0) (speed 3)))
(loop with pattern fixnum = (1- (length match))
with i fixnum = pattern
with j fixnum = i
with len fixnum = (length input) do
(cond
((>= i len) (return input))
((zerop j)
(loop do
(setf (aref input i) (aref replacement j) i (1+ i))
(if (= j pattern)
(progn (incf i pattern) (return))
(incf j))))
((char= (aref input i) (aref match j))
(decf i) (decf j))
(t (setf i (+ i 1 (- pattern j)) j pattern)))))

(defun test ()
(declare (optimize (debug 0) (safety 0) (speed 3)))
(loop with addidtion string = (concatenate 'string "abcdefgh" "efghefgh")
and initial = (get-internal-real-time)
for i fixnum from 0 below (+ (* (/ 1024 (length addidtion)) 1024 4) 1000)
for ln fixnum = (* (length addidtion) i)
for accumulated string = addidtion
then (replace-all (concatenate 'string accumulated addidtion) "efgh" "____")
when (zerop (mod ln (* 1024 256))) do
(format t "~&~f s | ~d Kb" (/ (- (get-internal-real-time) initial) 1000) (/ ln 1024)))
(values))

(test)

几乎和 cl-ppcre 一样慢!现在,这太不可思议了!我在这里找不到任何会导致如此糟糕性能的东西......而且它仍然很糟糕:(

意识到到目前为止标准函数的性能最好,我查看了 SBCL 源代码,经过一些阅读,我想出了这个:

#D
(defun replace-all (input match replacement &key (start 0))
(declare (type simple-string input match replacement)
(type fixnum start)
(optimize (debug 0) (safety 0) (speed 3)))
(loop with input-length fixnum = (length input)
and match-length fixnum = (length match)
for i fixnum from 0 below (ceiling (the fixnum (- input-length start)) match-length) do
(loop with prefix fixnum = (+ start (the fixnum (* i match-length)))
for j fixnum from 0 below match-length do
(when (<= (the fixnum (+ prefix j match-length)) input-length)
(loop for k fixnum from (+ prefix j) below (the fixnum (+ prefix j match-length))
for n fixnum from 0 do
(unless (char= (aref input k) (aref match n)) (return))
finally
(loop for m fixnum from (- k match-length) below k
for o fixnum from 0 do
(setf (aref input m) (aref replacement o))
finally
(return-from replace-all
(replace-all input match replacement :start k))))))
finally (return input)))

(defun test ()
(declare (optimize (debug 0) (safety 0) (speed 3)))
(loop with addidtion string = (concatenate 'string "abcdefgh" "efghefgh")
and initial = (get-internal-real-time)
for i fixnum from 0 below (+ (* (/ 1024 (length addidtion)) 1024 4) 1000)
for ln fixnum = (* (length addidtion) i)
for accumulated string = addidtion
then (replace-all (concatenate 'string accumulated addidtion) "efgh" "____")
when (zerop (mod ln (* 1024 256))) do
(format t "~&~f s | ~d Kb" (/ (- (get-internal-real-time) initial) 1000) (/ ln 1024)))
(values))

(test)

最后,我可以获胜,尽管对标准库的性能只有一小部分 - 但与几乎所有其他东西相比,它仍然非常非常糟糕......

以下是结果表:
| SBCL #A   | SBCL #B   | SBCL #C    | SBCL #D   | C gcc 4 -O3 | String size |
|-----------+-----------+------------+-----------+-------------+-------------|
| 17.463 s | 166.254 s | 28.924 s | 16.46 s | 1 s | 256 Kb |
| 68.484 s | 674.629 s | 116.55 s | 63.318 s | 4 s | 512 Kb |
| 153.99 s | gave up | 264.927 s | 141.04 s | 10 s | 768 Kb |
| 275.204 s | . . . . . | 474.151 s | 251.315 s | 17 s | 1024 Kb |
| 431.768 s | . . . . . | 745.737 s | 391.534 s | 27 s | 1280 Kb |
| 624.559 s | . . . . . | 1079.903 s | 567.827 s | 38 s | 1536 Kb |

现在,问题是:我做错了什么?这是 Lisp 字符串固有的东西吗?这可能可以通过……什么来缓解?

从长远来看,我什至会考虑编写一个专门的字符串处理库。如果问题不是我的错误代码,而是实现。这样做有意义吗?如果是,你会建议用什么语言来做?

编辑:只是为了记录,我现在正在尝试使用这个库: https://github.com/Ramarren/ropes处理字符串连接。不幸的是,它没有替换功能,并且进行多次替换并不是一件容易的事。但是当我有东西时,我会更新这篇文章。

我试图稍微改变怀远的变体以使用数组的填充指针而不是字符串连接(以实现类似于 Paulo Madeira 建议的 StringBuilder 的东西。它可能可以进一步优化,但我不确定类型/哪个方法会更快/是否值得为 *+ 重新定义类型以让它们只在 fixnumsigned-byte 上运行。无论如何,这是代码和基准:
(defun test/e ()
(declare (optimize speed))
(labels ((min-power-of-two (num)
(declare (type fixnum num))
(decf num)
(1+
(progn
(loop for i fixnum = 1 then (the (unsigned-byte 32) (ash i 1))
while (< i 17) do
(setf num
(logior
(the fixnum
(ash num (the (signed-byte 32)
(+ 1 (the (signed-byte 32)
(lognot i)))))) num))) num)))
(join (x y)
(let ((capacity (array-dimension x 0))
(desired-length (+ (length x) (length y)))
(x-copy x))
(declare (type fixnum capacity desired-length)
(type (vector character) x y x-copy))
(when (< capacity desired-length)
(setf x (make-array
(min-power-of-two desired-length)
:element-type 'character
:fill-pointer desired-length))
(replace x x-copy))
(replace x y :start1 (length x))
(setf (fill-pointer x) desired-length) x))
(seek (old str pos)
(let ((q (position (aref old 0) str :start pos)))
(and q (search old str :start2 q))))
(subs (str old new)
(loop for p = (seek old str 0) then (seek old str p)
while p do (replace str new :start1 p))
str))
(declare (inline min-power-of-two join seek subs)
(ftype (function (fixnum) fixnum) min-power-of-two))
(let* ((builder
(make-array 16 :element-type 'character
:initial-contents "abcdefghefghefgh"
:fill-pointer 16))
(ini (get-internal-real-time)))
(declare (type (vector character) builder))
(loop for i fixnum below (+ 1000 (* 4 1024 1024 (/ (length builder))))
for j = builder then
(subs (join j builder) "efgh" "____")
for k fixnum = (* (length builder) i)
when (= 0 (mod k (* 1024 256)))
do (format t "~&~8,2F sec ~8D kB"
(/ (- (get-internal-real-time) ini) 1000)
(/ k 1024))))))
    1.68 sec      256 kB
6.63 sec 512 kB
14.84 sec 768 kB
26.35 sec 1024 kB
41.01 sec 1280 kB
59.55 sec 1536 kB
82.85 sec 1792 kB
110.03 sec 2048 kB

最佳答案

瓶颈是search功能,这可能未在 SBCL 中优化。以下版本使用position帮助它跳过不可能的区域,并且速度大约是我机器上的 #A 版本的 10 倍:

(defun test/e ()
(declare (optimize speed))
(labels ((join (x y)
(concatenate 'simple-base-string x y))
(seek (old str pos)
(let ((q (position (char old 0) str :start pos)))
(and q (search old str :start2 q))))
(subs (str old new)
(loop for p = (seek old str 0) then (seek old str p)
while p do (replace str new :start1 p))
str))
(declare (inline join seek subs))
(let* ((str (join "abcdefgh" "efghefgh"))
(ini (get-internal-real-time)))
(loop for i below (+ 1000 (* 4 1024 1024 (/ (length str))))
for j = str then (subs (join j str) "efgh" "____")
for k = (* (length str) i)
when (= 0 (mod k (* 1024 256)))
do (format t "~&~8,2F sec ~8D kB"
(/ (- (get-internal-real-time) ini) 1000)
(/ k 1024))))))

关于performance - 提高字符串操作的速度,我们在Stack Overflow上找到一个类似的问题: https://stackoverflow.com/questions/17629896/

25 4 0
Copyright 2021 - 2024 cfsdn All Rights Reserved 蜀ICP备2022000587号
广告合作:1813099741@qq.com 6ren.com