javascript - WebSockets and text encoding

Reposted. Author: 塔克拉玛干. Updated: 2023-11-02 20:30:53

I read:

The WebSocket API accepts a DOMString object, which is encoded as UTF-8 on the wire, or one of ArrayBuffer, ArrayBufferView, or Blob objects for binary transfers.

A DOMString is a UTF-16 encoded string. So is it correct that UTF-8 encoding is used over the network?

Best answer

Yes, that is correct.

Whether UTF-16 is used in memory is an implementation detail of whatever framework you are using. In the case of JavaScript, strings are UTF-16.
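To make the difference between the in-memory and on-the-wire forms concrete, here is a minimal sketch that compares the number of UTF-16 code units in a JavaScript string with the number of UTF-8 bytes it encodes to. The helper name `measure` is hypothetical and only serves this illustration; `TextEncoder` is the standard encoding API and always produces UTF-8.

```javascript
// TextEncoder always encodes to UTF-8, the encoding used for WebSocket text.
const encoder = new TextEncoder();

// Hypothetical helper: report both sizes for the same string.
function measure(text) {
  return {
    utf16CodeUnits: text.length,            // String length counts UTF-16 code units
    utf8Bytes: encoder.encode(text).length, // bytes a UTF-8 encoder would emit
  };
}

console.log(measure("abc"));        // { utf16CodeUnits: 3, utf8Bytes: 3 }
console.log(measure("\u00e9"));     // { utf16CodeUnits: 1, utf8Bytes: 2 }
console.log(measure("\u20ac"));     // { utf16CodeUnits: 1, utf8Bytes: 3 }
console.log(measure("\u{1F600}"));  // { utf16CodeUnits: 2, utf8Bytes: 4 }
```

The last case is a character outside the Basic Multilingual Plane: it takes a surrogate pair (two code units) in UTF-16 but a single four-byte sequence in UTF-8.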

For WebSocket communication, text data must be transmitted over the network as UTF-8 (most Internet protocols use UTF-8 nowadays). This is dictated by the WebSocket protocol specification:

After a successful handshake, clients and servers transfer data back and forth in conceptual units referred to in this specification as "messages". On the wire, a message is composed of one or more frames. The WebSocket message does not necessarily correspond to a particular network layer framing, as a fragmented message may be coalesced or split by an intermediary.

A frame has an associated type. Each frame belonging to the same message contains the same type of data. Broadly speaking, there are types for textual data (which is interpreted as UTF-8 [RFC3629] text), binary data (whose interpretation is left up to the application), and control frames (which are not intended to carry data for the application but instead for protocol-level signaling, such as to signal that the connection should be closed). This version of the protocol defines six frame types and leaves ten reserved for future use.

...

Data frames (e.g., non-control frames) are identified by opcodes where the most significant bit of the opcode is 0. Currently defined opcodes for data frames include 0x1 (Text), 0x2 (Binary). Opcodes 0x3-0x7 are reserved for further non-control frames yet to be defined.

Data frames carry application-layer and/or extension-layer data. The opcode determines the interpretation of the data:

Text

The "Payload data" is text data encoded as UTF-8. Note that a particular text frame might include a partial UTF-8 sequence; however, the whole message MUST contain valid UTF-8. Invalid UTF-8 in reassembled messages is handled as described in Section 8.1.

Binary

The "Payload data" is arbitrary binary data whose interpretation is solely up to the application layer.
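The rule that a single text frame may end in the middle of a UTF-8 sequence, while the whole message must be valid, can be sketched with a streaming decoder. This is only an illustration under stated assumptions: `decodeMessage` is a hypothetical helper, and the fragments are assumed to be the already unmasked payloads of one message, in order. It is not how any particular WebSocket library is implemented.

```javascript
// Opcodes for data frames, as listed in the quoted specification text.
const OPCODE_TEXT = 0x1;
const OPCODE_BINARY = 0x2;

// Hypothetical helper: turn the payload fragments (Uint8Array) of one
// message into a string (Text) or a single Uint8Array (Binary).
function decodeMessage(opcode, fragments) {
  if (opcode === OPCODE_BINARY) {
    // Binary: concatenate the bytes and leave interpretation to the application.
    const total = fragments.reduce((sum, f) => sum + f.length, 0);
    const bytes = new Uint8Array(total);
    let offset = 0;
    for (const f of fragments) {
      bytes.set(f, offset);
      offset += f.length;
    }
    return bytes;
  }
  if (opcode !== OPCODE_TEXT) {
    throw new RangeError("not a data frame opcode handled by this sketch");
  }
  // fatal: true makes the decoder throw a TypeError on invalid UTF-8
  // instead of silently substituting U+FFFD.
  const decoder = new TextDecoder("utf-8", { fatal: true });
  let text = "";
  for (const f of fragments) {
    // stream: true buffers an incomplete trailing sequence until the next call.
    text += decoder.decode(f, { stream: true });
  }
  // Final call without stream: flushes the decoder at the end of the message.
  text += decoder.decode();
  return text;
}

// U+20AC is E2 82 AC in UTF-8; here it is split across two fragments.
const parts = [new Uint8Array([0xe2, 0x82]), new Uint8Array([0xac])];
console.log(decodeMessage(OPCODE_TEXT, parts) === "\u20ac"); // true
```

Decoding each fragment independently would fail on the first fragment above, which is why the validity check applies to the reassembled message rather than to individual frames.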

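Converting from UTF-16 to UTF-8 and back to UTF-16 incurs some overhead, but it is small on modern machines, and conversion between the UTF encodings is lossless for well-formed strings (strings without unpaired surrogates).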
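The round trip can be checked with the standard `TextEncoder` and `TextDecoder` APIs. The sketch below uses a hypothetical helper, `roundTrip`, and also shows the one case worth knowing about: a JavaScript string may contain an unpaired surrogate, which is not valid Unicode text, and a UTF-8 encoder replaces it with U+FFFD.

```javascript
const encoder = new TextEncoder();                           // UTF-16 string -> UTF-8 bytes
const decoder = new TextDecoder("utf-8", { fatal: true });   // UTF-8 bytes -> UTF-16 string

// Hypothetical helper: encode to UTF-8 and decode straight back.
function roundTrip(text) {
  return decoder.decode(encoder.encode(text));
}

const wellFormed = "h\u00e9llo \u20ac \u{1F600}";
console.log(roundTrip(wellFormed) === wellFormed); // true

// "\uD83D" is a high surrogate with no matching low surrogate.
const loneSurrogate = "a\uD83Db";
console.log(roundTrip(loneSurrogate) === "a\uFFFDb"); // true
```

For ordinary text the strings compare equal after the round trip; only the ill-formed string comes back changed.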

Regarding javascript - WebSockets and text encoding, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/43529031/
