
file - Reading a very large file

Reposted. Author: IT王子. Updated: 2023-10-29 01:33:37

I am trying to read a file with more than 200 columns and more than 1,000 rows. I am using the following code:

var result []string

file, err := os.Open("t8.txt")
if err != nil {
	fmt.Println(err)
	return // do not continue with a nil file
}
defer file.Close()

scan := bufio.NewScanner(file)
for scan.Scan() {
	result = append(result, scan.Text())
}

fmt.Println(scan.Err()) // token too long

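However, when I print the result, all I get is the first line, because the scanner reports that the token is too long. When I try it on a smaller file, it works fine. Is there a way to scan a large file in Go?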

Best answer

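As @Dave C pointed out in the comments, you are hitting bufio.MaxScanTokenSize = 64 * 1024, the default maximum size of a single token (here, one line) that a bufio.Scanner will buffer.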

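To get around the limit, use bufio.Reader, whose ReadString(delim byte) method fits your case: it returns the data up to and including the delimiter and has no fixed cap on line length.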

From the Scanner Go doc (note the last sentence in particular):

Scanner provides a convenient interface for reading data such as a file of newline-delimited lines of text. Successive calls to the Scan method will step through the 'tokens' of a file, skipping the bytes between the tokens. The specification of a token is defined by a split function of type SplitFunc; the default split function breaks the input into lines with line termination stripped. Split functions are defined in this package for scanning a file into lines, bytes, UTF-8-encoded runes, and space-delimited words. The client may instead provide a custom split function.

Scanning stops unrecoverably at EOF, the first I/O error, or a token too large to fit in the buffer. When a scan stops, the reader may have advanced arbitrarily far past the last token. Programs that need more control over error handling or large tokens, or must run sequential scans on a reader, should use bufio.Reader instead.

Regarding "file - Reading a very large file", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/29442006/
