
go - Why does reading a file line by line need more memory?

Repost. Author: 行者123. Updated: 2023-12-02 15:24:52

I am trying to read a large file in which every line has this format:

a string key, followed by 200 comma-separated values

and to store the keys in a map.

I wrote this code:

package main

import (
	"bufio"
	"fmt"
	"log"
	"os"
	"runtime"
	"strings"
	"unsafe"
)

func main() {
	file, err := os.Open("file_address.txt")
	if err != nil {
		log.Fatal(err)
	}
	defer file.Close()

	mp := make(map[string]float32)
	var total_size int64 = 0
	scanner := bufio.NewScanner(file)
	var counter int64 = 0

	for scanner.Scan() {
		counter++
		// Split the line on commas; only the first field (the key) is used.
		sliced := strings.Split(scanner.Text(), ",")
		mp[sliced[0]] = 2.2
		// Assumption: this accumulation line is missing from this copy of the
		// code. Counting the 16-byte string header per key (64-bit platform)
		// reproduces the "8 Mb" shown in the output below.
		total_size += int64(unsafe.Sizeof(sliced[0]))
	}

	if err := scanner.Err(); err != nil {
		log.Fatal(err)
	}
	fmt.Printf("loaded: %d. Took %d Mb of memory.", counter, total_size/1024/1024)
	fmt.Println("Loading finished. Now waiting...")

	// Report the heap statistics after loading.
	var ms runtime.MemStats
	runtime.ReadMemStats(&ms)

	fmt.Printf("\n")
	fmt.Printf("Alloc: %d MB, TotalAlloc: %d MB, Sys: %d MB\n",
		ms.Alloc/1024/1024, ms.TotalAlloc/1024/1024, ms.Sys/1024/1024)
	fmt.Printf("Mallocs: %d, Frees: %d\n",
		ms.Mallocs, ms.Frees)
	fmt.Printf("HeapAlloc: %d MB, HeapSys: %d MB, HeapIdle: %d MB\n",
		ms.HeapAlloc/1024/1024, ms.HeapSys/1024/1024, ms.HeapIdle/1024/1024)
	fmt.Printf("HeapObjects: %d\n", ms.HeapObjects)
	fmt.Printf("\n")
}

This is the output:

loaded: 544594. Took 8 Mb of memory.Loading finished. Now waiting...

Alloc: 2667 MB, TotalAlloc: 3973 MB, Sys: 2831 MB
Mallocs: 1108463, Frees: 401665
HeapAlloc: 2667 MB, HeapSys: 2687 MB, HeapIdle: 11 MB
HeapObjects: 706798

Done!

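Although the keys take only about 8 MB, the program uses about 2.7 GB of memory! It seems that sliced is never removed from the heap. I tried setting sliced = nil at the end of the for loop, but that did not help. I have read that I could avoid this problem by loading the whole file into memory and then splitting it, but I have to read the file line by line because I do not have enough memory to load some of the larger files.

Why is the memory being held? How can I release it after each line has been processed?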

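Best answer

To use CPU and memory efficiently, take the key from the scanner's byte buffer and copy only the key into a new string: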

key := string(bytes.SplitN(scanner.Bytes(), []byte(","), 2)[0])
mp[key] = 2.2

Regarding "go - Why does reading a file line by line need more memory?", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/58122019/
