java - How does a program determine the encoding of an XML file?

Reposted. Author: 数据小太阳. Updated: 2023-10-29 02:10:09

I have a question about XML encoding when processing (decoding) an XML file. We specify the file's encoding at the top of the file, as shown below.

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>

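My question is this: once the program has read this line, it knows that the content that follows is encoded in UTF-8. But in order to read that first line, how does the program determine that it is encoded in UTF-8? In other words, when reading the byte stream, how does the program know which encoding to use for the first line?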

Regards, 马 Jade 兰

Best answer

This is described in section F.1 of the XML specification:

F.1 Detection Without External Encoding Information

Because each XML entity not accompanied by external encoding information and not in UTF-8 or UTF-16 encoding must begin with an XML encoding declaration, in which the first characters must be <?xml, any conforming processor can detect, after two to four octets of input, which of the following cases apply. In reading this list, it may help to know that in UCS-4, < is #x0000003C and ? is #x0000003F, and the Byte Order Mark required of UTF-16 data streams is #xFEFF. The notation ## is used to denote any byte value except that two consecutive ##s cannot be both 00.

Basically, there are two cases:

  1. There is a byte order mark (BOM).
  2. There is no BOM.
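
The first case can be sketched in Java as follows. This is only an illustration of the BOM table in section F.1, not how any particular parser is implemented; the class name `BomSniffer` and its methods are hypothetical, and the returned strings are descriptive labels rather than Java charset names.

```java
import java.util.Optional;

// Hypothetical helper: classifies the first octets of an XML entity by BOM.
class BomSniffer {

    /** Returns the encoding family implied by a leading BOM, or empty if there is none. */
    static Optional<String> detectBom(byte[] head) {
        // Check the four-octet UCS-4 marks first: some share a prefix with the UTF-16 marks.
        if (startsWith(head, 0x00, 0x00, 0xFE, 0xFF)) return Optional.of("UCS-4 big-endian");
        if (startsWith(head, 0xFF, 0xFE, 0x00, 0x00)) return Optional.of("UCS-4 little-endian");
        if (startsWith(head, 0x00, 0x00, 0xFF, 0xFE)
                || startsWith(head, 0xFE, 0xFF, 0x00, 0x00)) {
            return Optional.of("UCS-4 unusual octet order");
        }
        if (startsWith(head, 0xFE, 0xFF)) return Optional.of("UTF-16 big-endian");
        if (startsWith(head, 0xFF, 0xFE)) return Optional.of("UTF-16 little-endian");
        if (startsWith(head, 0xEF, 0xBB, 0xBF)) return Optional.of("UTF-8");
        return Optional.empty(); // no BOM: fall back to looking for "<?xml"
    }

    /** True if data begins with the given unsigned octet values. */
    static boolean startsWith(byte[] data, int... prefix) {
        if (data.length < prefix.length) return false;
        for (int i = 0; i < prefix.length; i++) {
            if ((data[i] & 0xFF) != prefix[i]) return false;
        }
        return true;
    }

    public static void main(String[] args) {
        byte[] utf8WithBom = {(byte) 0xEF, (byte) 0xBB, (byte) 0xBF, 0x3C};
        System.out.println(detectBom(utf8WithBom));
    }
}
```

The order of the checks matters here: `FF FE 00 00` must be tested before `FF FE`, otherwise a UCS-4 stream would be misread as UTF-16.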

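The specification then provides tables of the specific octet sequences that a processor should check for in each case. These let it determine the encoding family well enough to read the encoding declaration, which then names the exact encoding.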
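
For the case without a BOM, a minimal Java sketch might look like the one below. It assumes the caller has already buffered the first few hundred bytes of the file; the class `XmlDeclSniffer`, its methods, and the regular expression are hypothetical and simplified for illustration. It is not a conforming implementation (the rarer UCS-4 octet orders are omitted, for example).

```java
import java.nio.charset.StandardCharsets;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: guesses the encoding of an XML entity that has no BOM.
class XmlDeclSniffer {

    // Simplified pattern for the encoding pseudo-attribute of the XML declaration.
    private static final Pattern ENCODING = Pattern.compile(
            "^<\\?xml\\s[^>]*?encoding\\s*=\\s*[\"']([A-Za-z][A-Za-z0-9._\\-]*)[\"']");

    static String sniff(byte[] head) {
        // How "<?xml" looks in each encoding family (section F.1, no-BOM table).
        if (startsWith(head, 0x00, 0x00, 0x00, 0x3C)) return "UCS-4 big-endian";
        if (startsWith(head, 0x3C, 0x00, 0x00, 0x00)) return "UCS-4 little-endian";
        if (startsWith(head, 0x00, 0x3C, 0x00, 0x3F)) return "UTF-16BE";
        if (startsWith(head, 0x3C, 0x00, 0x3F, 0x00)) return "UTF-16LE";
        if (startsWith(head, 0x4C, 0x6F, 0xA7, 0x94)) return "EBCDIC";
        if (startsWith(head, 0x3C, 0x3F, 0x78, 0x6D)) {
            // ASCII-compatible family: the declaration itself uses only ASCII
            // characters, so it can be decoded safely as ISO-8859-1 and searched.
            int len = Math.min(head.length, 256);
            String text = new String(head, 0, len, StandardCharsets.ISO_8859_1);
            Matcher m = ENCODING.matcher(text);
            if (m.find()) return m.group(1);
        }
        // No declaration, or none naming an encoding: UTF-8 is the default.
        return "UTF-8";
    }

    /** True if data begins with the given unsigned octet values. */
    static boolean startsWith(byte[] data, int... prefix) {
        if (data.length < prefix.length) return false;
        for (int i = 0; i < prefix.length; i++) {
            if ((data[i] & 0xFF) != prefix[i]) return false;
        }
        return true;
    }

    public static void main(String[] args) {
        String doc = "<?xml version=\"1.0\" encoding=\"ISO-8859-1\"?><a/>";
        System.out.println(sniff(doc.getBytes(StandardCharsets.US_ASCII)));
    }
}
```

In practice a Java XML parser performs this detection itself when it is handed a byte stream, for example via `DocumentBuilder.parse(InputStream)`, so it is usually enough to pass it an `InputStream` rather than a `Reader` that has already been decoded.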

Regarding "java - How does a program determine the encoding of an XML file?", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/36493898/
