c# - In C#, how can I save a web page to a file without breaking the encoding?

Reposted. Author: 太空狗. Updated: 2023-10-30 01:12:29

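Here is what I have so far (it does not work). At the moment I believe my target page is ANSI-encoded, but I really do not want to have to know that in advance. My browser seems able to work out which encoding to use; how can I do the same?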

static void GetUrl(Uri uri, string localFileName)
{
    HttpWebRequest request = (HttpWebRequest)WebRequest.Create(uri);
    HttpWebResponse response;

    response = (HttpWebResponse)request.GetResponse();

    // Save the stream to file
    Stream responseStream = response.GetResponseStream();
    StreamReader reader = new StreamReader(responseStream, Encoding.Default);
    Stream fileStream = File.OpenWrite(localFileName);
    using (StreamWriter sw = new StreamWriter(fileStream, Encoding.Default))
    {
        sw.Write(reader.ReadToEnd());
        sw.Flush();
        sw.Close();
    }
}

After the answers (so far tested only on UTF-8 sites):

static void GetUrl(Uri uri, string localFileName)
{
    HttpWebRequest request = (HttpWebRequest)WebRequest.Create(uri);
    HttpWebResponse response = (HttpWebResponse)request.GetResponse();
    try
    {
        // Hope GetEncoding() knows how to parse the CharacterSet
        Encoding encoding = Encoding.GetEncoding(response.CharacterSet);
        StreamReader reader = new StreamReader(response.GetResponseStream(), encoding);
        using (StreamWriter sw = new StreamWriter(localFileName, false, encoding))
        {
            sw.Write(reader.ReadToEnd());
            sw.Flush();
            sw.Close();
        }
    }
    finally
    {
        response.Close();
    }
}
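
As a side note to the revised code above: if the page only needs to be stored and not inspected, one option is to skip decoding entirely and copy the response bytes to disk unchanged, so no encoding guess can corrupt them. The following is a minimal sketch of that idea, not part of the original post; `PageSaver`, `SaveRaw` and `GetUrlRaw` are hypothetical names chosen for illustration.

```csharp
using System;
using System.IO;
using System.Net;

static class PageSaver
{
    // Copies the source stream to the file byte for byte.
    // File.Create truncates an existing file, so no stale bytes remain.
    public static void SaveRaw(Stream source, string localFileName)
    {
        using (FileStream file = File.Create(localFileName))
        {
            source.CopyTo(file);
        }
    }

    // Hypothetical variant of GetUrl that never decodes the body.
    public static void GetUrlRaw(Uri uri, string localFileName)
    {
        HttpWebRequest request = (HttpWebRequest)WebRequest.Create(uri);
        using (HttpWebResponse response = (HttpWebResponse)request.GetResponse())
        using (Stream body = response.GetResponseStream())
        {
            SaveRaw(body, localFileName);
        }
    }
}
```

The saved file then has exactly the encoding the server sent; any program that opens it later still has to determine that encoding, for example from the page's own meta tag.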

Best answer

Web browsers try to detect the character encoding in three ways.

Looking for a meta tag (in the case of HTML):

<meta http-equiv="Content-Type" content="text/html; charset=US-ASCII">

or an XML declaration (for XHTML):

<?xml version="1.0" encoding="ISO-8859-1"?>

or, sometimes, a charset specified in the HTTP header:

Content-Type: text/html; charset=ISO-8859-1
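
The three lookups above could be combined roughly as follows. This is only a sketch under stated assumptions: `CharsetSniffer` and `Detect` are hypothetical names, the order (HTTP header first, then XML declaration, then meta tag) and the UTF-8 fallback are choices made here rather than anything the answer specifies, and only the first 1024 bytes of the body are inspected.

```csharp
using System;
using System.Text;
using System.Text.RegularExpressions;

static class CharsetSniffer
{
    // Matches charset=NAME in a Content-Type header value or an HTML meta tag.
    static readonly Regex CharsetPattern = new Regex(
        @"charset\s*=\s*[""']?\s*([A-Za-z0-9_\-:.]+)", RegexOptions.IgnoreCase);

    // Matches encoding="NAME" inside an XML declaration.
    static readonly Regex XmlDeclPattern = new Regex(
        @"<\?xml[^>]*encoding\s*=\s*[""']([A-Za-z0-9_\-:.]+)[""']", RegexOptions.IgnoreCase);

    static string FirstGroup(Regex pattern, string text)
    {
        if (string.IsNullOrEmpty(text)) return null;
        Match m = pattern.Match(text);
        return m.Success ? m.Groups[1].Value : null;
    }

    // contentTypeHeader may be null; body holds the raw, undecoded response bytes.
    public static Encoding Detect(string contentTypeHeader, byte[] body)
    {
        string name = FirstGroup(CharsetPattern, contentTypeHeader);
        if (name == null)
        {
            // The declarations themselves are plain ASCII, so an ASCII decode
            // of the first bytes is enough to search for them.
            int count = Math.Min(body.Length, 1024);
            string head = Encoding.ASCII.GetString(body, 0, count);
            name = FirstGroup(XmlDeclPattern, head) ?? FirstGroup(CharsetPattern, head);
        }
        if (name == null) return Encoding.UTF8; // assumed fallback
        try
        {
            // On .NET Core / .NET 5+, legacy code pages such as windows-1252
            // are only available after registering CodePagesEncodingProvider.
            return Encoding.GetEncoding(name);
        }
        catch (ArgumentException)
        {
            return Encoding.UTF8; // unknown charset name: assumed fallback
        }
    }
}
```

With the question's code, this would be called as `CharsetSniffer.Detect(response.ContentType, bytes)` after reading the response body into a byte array, and the returned `Encoding` used to decode it.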

Regarding "c# - In C#, how can I save a web page to a file without breaking the encoding?", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/293760/
