
windows - Batch split a text file

Reposted. Author: 可可西里. Updated: 2023-11-01 10:56:00

I have this batch file to split a txt file:

@echo off
for /f "tokens=1*delims=:" %%a in ('findstr /n "^" "PASSWORD.txt"') do for /f "delims=~" %%c in ("%%~b") do >"text%%a.txt" echo(%%c
pause

It works, but it splits the file line by line. How can I make it split every 5000 lines instead? Thanks in advance.

Edit:

I have just tried this:

@echo off
setlocal ENABLEDELAYEDEXPANSION
REM Edit this value to change the name of the file that needs splitting. Include the extension.
SET BFN=passwordAll.txt
REM Edit this value to change the number of lines per file.
SET LPF=50000
REM Edit this value to change the name of each short file. It will be followed by a number indicating where it is in the list.
SET SFN=SplitFile

REM Do not change beyond this line.

SET SFX=%BFN:~-3%

SET /A LineNum=0
SET /A FileNum=1

For /F "delims==" %%l in (%BFN%) Do (
SET /A LineNum+=1

echo %%l >> %SFN%!FileNum!.%SFX%

if !LineNum! EQU !LPF! (
SET /A LineNum=0
SET /A FileNum+=1
)

)
endlocal
Pause
exit

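But I get an error message: Not enough storage is available to process this command.

Best answer

This will give you a basic skeleton. Adapt it as needed.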

@echo off
setlocal enableextensions disabledelayedexpansion

set "nLines=5000"
set "line=0"

for /f "usebackq delims=" %%a in ("passwords.txt") do (
    set /a "file=line/%nLines%", "line+=1"
    setlocal enabledelayedexpansion
    for %%b in (!file!) do (
        endlocal
        >>"passwords_%%b.txt" echo(%%a
    )
)

endlocal
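
The grouping in the skeleton above relies on integer division in set /a: the running line counter divided by nLines gives the index of the output file. A minimal sketch of just that arithmetic follows. The sample counter values are chosen for illustration and are not part of the original answer.

```batch
@echo off
setlocal enabledelayedexpansion

rem Same chunk size as in the answer
set "nLines=5000"

rem Illustrative counter values (assumed for this sketch); the counter starts at 0
for %%n in (0 4999 5000 9999 10000) do (
    rem set /a performs integer division, so the result is the file index
    set /a "file=%%n/nLines"
    echo(counter %%n goes to passwords_!file!.txt
)

endlocal
```

Counters 0 to 4999 map to index 0, 5000 to 9999 map to index 1, and so on, which is why each output file receives 5000 lines.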

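Edited

As the comments indicate, a 4.3 GB file is hard to manage. for /f needs to load the whole file into memory, and because the file is converted to Unicode in memory, the buffer it requires is twice the size of the file.

This is a completely ad hoc solution. I have not tested it on a file that large, but at least in theory it should work (unless 5000 lines need a lot of memory, which depends on the line length).

Also, with a file like that, it will be slow.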

@echo off
setlocal enableextensions disabledelayedexpansion

set "line=0"
set "tempFile=%temp%\passwords.tmp"

findstr /n "^" passwords.txt > "%tempFile%"
for /f %%a in ('type passwords.txt ^| find /c /v "" ') do set /a "nFiles=%%a/5000"

for /l %%a in (0 1 %nFiles%) do (
    set /a "e1=%%a*5", "e2=e1+1", "e3=e2+1", "e4=e3+1", "e5=e4+1"
    setlocal enabledelayedexpansion
    if %%a equ 0 (
        set "e=/c:"[1-9]:" /c:"[1-9][0-9]:" /c:"[1-9][0-9][0-9]:" /c:"!e2![0-9][0-9][0-9]:" /c:"!e3![0-9][0-9][0-9]:" /c:"!e4![0-9][0-9][0-9]:" /c:"!e5![0-9][0-9][0-9]:" "
    ) else (
        set "e=/c:"!e1![0-9][0-9][0-9]:" /c:"!e2![0-9][0-9][0-9]:" /c:"!e3![0-9][0-9][0-9]:" /c:"!e4![0-9][0-9][0-9]:" /c:"!e5![0-9][0-9][0-9]:" "
    )
    for /f "delims=" %%e in ("!e!") do (
        endlocal & (for /f "tokens=1,* delims=:" %%b in ('findstr /r /b %%e "%tempFile%"') do @echo(%%c)>passwords_%%a.txt
    )
)

del "%tempFile%" >nul 2>nul

endlocal
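
The code above selects each chunk by matching the line-number prefix that findstr /n adds, one thousand-block at a time. As a rough illustration, the sketch below prints the five patterns built for one output file. The variable fileIndex and its value are assumptions made for this example only.

```batch
@echo off
setlocal enableextensions disabledelayedexpansion

rem Hypothetical file index chosen for illustration
set "fileIndex=1"

rem Same arithmetic as in the answer: five consecutive thousand-prefixes
set /a "e1=fileIndex*5", "e2=e1+1", "e3=e2+1", "e4=e3+1", "e5=e4+1"

rem Each pattern matches a line number with that prefix plus three more digits
for %%p in (%e1% %e2% %e3% %e4% %e5%) do echo(%%p[0-9][0-9][0-9]:

endlocal
```

For index 1 the prefixes are 5 through 9, so the patterns cover line numbers 5000 to 9999. Index 0 is handled separately in the answer because line numbers below 1000 have fewer digits. Its patterns cover line numbers 1 to 4999, so the first output file ends up one line shorter than the rest.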

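Edited, again: the previous code will not work correctly for lines that start with a colon, because the colon is used as the delimiter in the for command to separate the line number from the data.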
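
A small sketch of that colon problem, using a made-up numbered line of the form findstr /n would produce for the data ":secret" (the sample value and the variable name numbered are assumptions for illustration):

```batch
@echo off
setlocal enableextensions disabledelayedexpansion

rem Hypothetical numbered line: line number 12, then the data ":secret"
set "numbered=12::secret"

rem With delims=: the run of two colons is treated as a single delimiter,
rem so the colon that belongs to the data is dropped from %%c
for /f "tokens=1,* delims=:" %%b in ("%numbered%") do echo(line %%b: [%%c]

endlocal
```

The text between the brackets comes back without its leading colon, which is the data loss described above.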

As an alternative, still pure batch and still slow:

@echo off
setlocal enableextensions disabledelayedexpansion

set "nLines=5000"
set "line=0"
for /f %%a in ('type passwords.txt^|find /c /v ""') do set "fileLines=%%a"

< "passwords.txt" (for /l %%a in (1 1 %fileLines%) do (
    rem Clear data first, because an empty input line leaves the variable unchanged
    set "data="
    set /p "data="
    set /a "file=line/%nLines%", "line+=1"
    setlocal enabledelayedexpansion
    >>"passwords_!file!.txt" echo(!data!
    endlocal
))

endlocal

Regarding windows - Batch split a text file, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/23593556/
