
Golang: Unzipping files in Go gives character encoding problems in the file names when the archive was zipped on Windows

Reposted. Author: 行者123. Updated: 2023-12-04 03:31:00

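I am trying to unzip files in Go (Golang) using the zip library. The problem is that when the zip file was created on Windows, all special characters in the file names come out garbled. Windows probably uses the windows-1252 character encoding. I just can't figure out how to unzip these files. I have already tried golang.org/x/text/encoding/charmap and golang.org/x/text/transform, without success. I would expect the zip library to offer some way to change the charmap.

Another problem: the application sometimes unzips files that were zipped on Windows and sometimes files zipped on a different operating system, so it needs to detect the character encoding.

Here is the code (thanks to https://golangcode.com/unzip-files-in-go/):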

package main

import (
	"archive/zip"
	"fmt"
	"io"
	"log"
	"os"
	"path/filepath"
	"strings"
)

func main() {
	files, err := Unzip("Edificações e Instalações Operacionais - 08.03 a 12.03.2021.zip", "output-folder")
	if err != nil {
		log.Fatal(err)
	}

	fmt.Println("Unzipped:\n" + strings.Join(files, "\n"))
}

// Unzip will decompress a zip archive, moving all files and folders
// within the zip file (parameter 1) to an output directory (parameter 2).
func Unzip(src string, dest string) ([]string, error) {
	var filenames []string

	r, err := zip.OpenReader(src)
	if err != nil {
		return filenames, err
	}
	defer r.Close()

	for _, f := range r.File {
		// Store filename/path for returning and using later on
		fpath := filepath.Join(dest, f.Name)

		// Reject entries whose path would land outside dest
		if !strings.HasPrefix(fpath, filepath.Clean(dest)+string(os.PathSeparator)) {
			return filenames, fmt.Errorf("%s: illegal file path", fpath)
		}

		filenames = append(filenames, fpath)

		if f.FileInfo().IsDir() {
			// Make folder
			if err := os.MkdirAll(fpath, os.ModePerm); err != nil {
				return filenames, err
			}
			continue
		}

		// Make file
		if err = os.MkdirAll(filepath.Dir(fpath), os.ModePerm); err != nil {
			return filenames, err
		}

		outFile, err := os.OpenFile(fpath, os.O_WRONLY|os.O_CREATE|os.O_TRUNC, f.Mode())
		if err != nil {
			return filenames, err
		}

		rc, err := f.Open()
		if err != nil {
			outFile.Close() // do not leak the handle when the entry cannot be opened
			return filenames, err
		}

		_, err = io.Copy(outFile, rc)

		// Close the file without defer to close before next iteration of loop
		outFile.Close()
		rc.Close()

		if err != nil {
			return filenames, err
		}
	}
	return filenames, nil
}

This is the output: [screenshot not preserved]

Best answer

If we print only the first zip entry:

package main

import "archive/zip"

func main() {
	s := "Edificações_e_Instalações_Operacionais_08_03_a_12_03_2021.zip"
	f, e := zip.OpenReader(s)
	if e != nil {
		panic(e)
	}
	defer f.Close()
	println(f.File[0].Name)
}

We get this result:

Edifica��es e Instala��es Operacionais - 08.03 a 12.03.2021/
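Each accented letter shows up as one replacement character, which suggests it was stored as a single byte that is not valid UTF-8. Dumping the raw bytes of the name is a quick way to check this before picking a code page. The sketch below is only an illustration: describeName is a hypothetical helper, and the sample string assumes the bytes that code page 850 would produce for "çõ" (0x87 and 0xE4), since the actual archive was not inspected here.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// describeName (hypothetical helper) reports whether name is valid UTF-8
// and returns its bytes as space-separated hex.
func describeName(name string) (valid bool, hexBytes string) {
	return utf8.ValidString(name), fmt.Sprintf("% x", name)
}

func main() {
	// Assumption: 0x87 and 0xE4 are the CP850 bytes for "ç" and "õ".
	raw := "Edifica\x87\xe4es"

	valid, hexBytes := describeName(raw)
	fmt.Println("valid UTF-8:", valid)
	fmt.Println("bytes:", hexBytes)
}
```

In a real program the same helper could be called with f.File[0].Name. Bytes at 0x80 or above that do not form valid UTF-8 sequences are the ones to look up in the candidate code page tables.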

According to this page:

In Brazil, however, the most widespread codepage, and that which DOS in Brazilian Portuguese used by default, was code page 850.

https://wikipedia.org/wiki/Code_page_860

So we can modify the code to handle this:

package main

import (
	"archive/zip"

	"golang.org/x/text/encoding/charmap"
)

func main() {
	z := "Edificações_e_Instalações_Operacionais_08_03_a_12_03_2021.zip"
	f, e := zip.OpenReader(z)
	if e != nil {
		panic(e)
	}
	defer f.Close()
	s, e := charmap.CodePage850.NewDecoder().String(f.File[0].Name)
	if e != nil {
		panic(e)
	}
	println(s)
}

We get the correct result:

Edificações e Instalações Operacionais - 08.03 a 12.03.2021/
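To use this inside the Unzip function from the question, the name has to be decoded before it is joined to the destination directory. The following is a minimal sketch of that step, not a definitive implementation. targetPath and toyCP850 are hypothetical names. toyCP850 only covers the two bytes needed for this example so that the sketch runs with the standard library alone; in real code one would presumably pass charmap.CodePage850.NewDecoder().String instead, which has the same func(string) (string, error) shape.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"strings"
	"unicode/utf8"
)

// toyCP850 is a hypothetical stand-in for a real code page 850 decoder.
// It passes ASCII through and maps only two bytes (0x87 -> ç, 0xE4 -> õ).
func toyCP850(s string) (string, error) {
	var b strings.Builder
	for i := 0; i < len(s); i++ {
		switch c := s[i]; {
		case c < 0x80:
			b.WriteByte(c)
		case c == 0x87:
			b.WriteString("ç")
		case c == 0xE4:
			b.WriteString("õ")
		default:
			return "", fmt.Errorf("byte 0x%02x is not in the toy table", c)
		}
	}
	return b.String(), nil
}

// targetPath decodes rawName when it is not valid UTF-8, joins it to dest,
// and rejects paths that would escape dest (the same check Unzip performs).
func targetPath(dest, rawName string, decode func(string) (string, error)) (string, error) {
	name := rawName
	if !utf8.ValidString(rawName) {
		var err error
		if name, err = decode(rawName); err != nil {
			return "", err
		}
	}
	fpath := filepath.Join(dest, name)
	if !strings.HasPrefix(fpath, filepath.Clean(dest)+string(os.PathSeparator)) {
		return "", fmt.Errorf("%s: illegal file path", fpath)
	}
	return fpath, nil
}

func main() {
	p, err := targetPath("output-folder", "Edifica\x87\xe4es.txt", toyCP850)
	fmt.Println(p, err)
}
```

In Unzip, the line that builds fpath from f.Name would call targetPath(dest, f.Name, decoder) instead. Names that are already valid UTF-8 pass through untouched, so archives created on other systems are not re-decoded.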

https://pkg.go.dev/golang.org/x/text/encoding/charmap
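The question also asks how to tell archives with legacy-encoded names apart from the rest. One possible approach is to look at the NonUTF8 field that archive/zip sets on each FileHeader when reading, combined with utf8.ValidString. The sketch below builds a small archive in memory to show the idea; buildZip and legacyNames are hypothetical helpers. Note that this only detects that a name is not UTF-8. Which legacy code page to decode with (850, 1252, or another) remains an assumption about where the archive came from.

```go
package main

import (
	"archive/zip"
	"bytes"
	"fmt"
	"unicode/utf8"
)

// buildZip (hypothetical helper) returns an in-memory archive with one empty
// entry per name. Names that are not valid UTF-8 are written with NonUTF8
// set, so their bytes are stored as given, without the UTF-8 flag.
func buildZip(names ...string) ([]byte, error) {
	var buf bytes.Buffer
	w := zip.NewWriter(&buf)
	for _, n := range names {
		h := &zip.FileHeader{Name: n, NonUTF8: !utf8.ValidString(n)}
		if _, err := w.CreateHeader(h); err != nil {
			return nil, err
		}
	}
	if err := w.Close(); err != nil {
		return nil, err
	}
	return buf.Bytes(), nil
}

// legacyNames (hypothetical helper) lists the entries whose names are flagged
// as non-UTF-8 and are in fact not valid UTF-8, i.e. the ones to decode.
func legacyNames(data []byte) ([]string, error) {
	r, err := zip.NewReader(bytes.NewReader(data), int64(len(data)))
	if err != nil {
		return nil, err
	}
	var out []string
	for _, f := range r.File {
		if f.NonUTF8 && !utf8.ValidString(f.Name) {
			out = append(out, f.Name)
		}
	}
	return out, nil
}

func main() {
	// Assumption: 0x87 and 0xE4 are the CP850 bytes for "ç" and "õ".
	data, err := buildZip("plain.txt", "Edificações.txt", "Edifica\x87\xe4es.txt")
	if err != nil {
		panic(err)
	}
	names, err := legacyNames(data)
	if err != nil {
		panic(err)
	}
	fmt.Printf("%q\n", names)
}
```

The extra utf8.ValidString check is deliberate: some zip writers store UTF-8 names without setting the UTF-8 flag, and decoding those as a legacy code page would corrupt them.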

Regarding "Golang: Unzipping files in Go gives character encoding problems in the file names when the archive was zipped on Windows", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/66800673/
