java - Encodings that differ from ASCII, even for letters

Reposted by: 行者123 · Updated: 2023-11-29 04:47:11


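Are there any character encodings that are reasonably common on consumer devices (as opposed to mainframes) and that map the characters A-Za-z0-9 to byte values different from ASCII?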

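I am currently thinking about Java applications, so I would like to know whether there is any chance that a casual user of some Java software, in some country, could end up with a defaultCharset for which "AZaz09".getBytes() returns something different from "AZaz09".getBytes("UTF-8"). I am trying to work out whether I have to address compatibility problems that could result from differing behaviour in this respect.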
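To make the concern concrete, here is a minimal sketch that compares the bytes produced under the platform default charset with the bytes produced under UTF-8. The class name `DefaultCharsetCheck` and the helper `isAsciiCompatible` are made up for illustration. Note that since JDK 18 (JEP 400) the default charset is UTF-8 unless it is explicitly overridden, so on a recent JDK this check would normally be expected to pass.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

class DefaultCharsetCheck {
    // Hypothetical helper: true if cs encodes the sample exactly as UTF-8 does.
    // For pure ASCII input, UTF-8 output is identical to ASCII output.
    static boolean isAsciiCompatible(Charset cs, String sample) {
        try {
            return Arrays.equals(sample.getBytes(cs),
                                 sample.getBytes(StandardCharsets.UTF_8));
        } catch (UnsupportedOperationException e) {
            // Decode-only charsets cannot encode at all.
            return false;
        }
    }

    public static void main(String[] args) {
        Charset def = Charset.defaultCharset();
        System.out.println(def.name() + " matches UTF-8 for AZaz09: "
                + isAsciiCompatible(def, "AZaz09"));
    }
}
```

Running this on a user's machine reports which charset the JVM picked up and whether the no-argument getBytes() would differ from the explicit UTF-8 call for that sample.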

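I know that, historically, EBCDIC is the prime example of an ASCII-incompatible encoding. But is it used on any recent consumer device, or only on IBM mainframes and vintage computers? Does the legacy of EBCDIC live on in an encoding that is commonly used in some countries?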
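As an illustration of how far EBCDIC diverges from ASCII, the following sketch prints the byte value of a few characters under IBM037, one of the EBCDIC code pages. The class `EbcdicSample` and its helper `firstByte` are hypothetical names. IBM037 is an extended charset and may be absent from a trimmed-down runtime, so the code checks for it first.

```java
import java.nio.charset.Charset;

class EbcdicSample {
    // Hypothetical helper: unsigned value of the first byte of ch encoded in cs.
    static int firstByte(Charset cs, char ch) {
        return String.valueOf(ch).getBytes(cs)[0] & 0xFF;
    }

    public static void main(String[] args) {
        // Not every runtime ships the extended charsets.
        if (!Charset.isSupported("IBM037")) {
            System.out.println("IBM037 is not available in this runtime");
            return;
        }
        Charset ebcdic = Charset.forName("IBM037");
        for (char ch : new char[] {'A', 'a', '0'}) {
            // In ASCII the byte value equals the code point for these characters.
            System.out.printf("%c: ASCII 0x%02X, IBM037 0x%02X%n",
                    ch, (int) ch, firstByte(ebcdic, ch));
        }
    }
}
```

Where IBM037 is available, 'A' comes out as 0xC1 rather than 0x41, which is exactly the kind of mismatch the question is worried about.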

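I also know that UTF-16 is not ASCII-compatible, and that it is fairly common to encode files this way on Windows. But as far as I know, that always concerns file content only, not the default application locale. Can users configure their Windows machine to use UTF-16 as the system code page without breaking at least half of their applications?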
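The incompatibility of UTF-16 is easy to see on a single letter. The sketch below (the class `Utf16Bytes` and its `hex` helper are illustrative names, not part of the original post) prints the bytes of "A" under several charsets. Java's plain UTF-16 encoder also prepends a big-endian byte order mark.

```java
import java.nio.charset.StandardCharsets;

class Utf16Bytes {
    // Hypothetical helper: formats bytes as space-separated two-digit hex.
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) sb.append(' ');
            sb.append(String.format("%02X", b));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println("UTF-8:    " + hex("A".getBytes(StandardCharsets.UTF_8)));    // 41
        System.out.println("UTF-16LE: " + hex("A".getBytes(StandardCharsets.UTF_16LE))); // 41 00
        System.out.println("UTF-16BE: " + hex("A".getBytes(StandardCharsets.UTF_16BE))); // 00 41
        System.out.println("UTF-16:   " + hex("A".getBytes(StandardCharsets.UTF_16)));   // FE FF 00 41
    }
}
```

Every UTF-16 variant spends two bytes on the letter, so none of them can match the single ASCII byte 0x41.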

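As far as I know, all the pre-Unicode multi-byte encodings used in Asia still map the ASCII range 00-7F to something that is ASCII-compatible, at least for letters and digits. Are there any Asian encodings still in use that use more than one byte for all of their code points? Or perhaps encodings on other continents?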

Best answer

Here is a simple program to find out. Whether the charsets that fail are common enough to matter is up to you.

import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class EncodingTest {
    public static void main(String[] args) {
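        // Letters and digits only: the characters whose byte values we compare.
        String s = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789";
        // For these characters UTF-8 output is identical to ASCII, so it serves as the reference.
        byte[] b = s.getBytes(StandardCharsets.UTF_8);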
        for (Charset cs : Charset.availableCharsets().values()) {
            try {
                byte[] b2 = s.getBytes(cs);
                if (!Arrays.equals(b, b2)) {
                    System.out.println(cs.displayName() + " doesn't give the same result");
                }
            }
            catch (Exception e) {
                // Some charsets are decode-only and cannot encode at all.
                System.out.println(cs.displayName() + " throws an exception");
            }
        }
    }
}

The result on my machine is

IBM-Thai doesn't give the same result
IBM01140 doesn't give the same result
IBM01141 doesn't give the same result
IBM01142 doesn't give the same result
IBM01143 doesn't give the same result
IBM01144 doesn't give the same result
IBM01145 doesn't give the same result
IBM01146 doesn't give the same result
IBM01147 doesn't give the same result
IBM01148 doesn't give the same result
IBM01149 doesn't give the same result
IBM037 doesn't give the same result
IBM1026 doesn't give the same result
IBM1047 doesn't give the same result
IBM273 doesn't give the same result
IBM277 doesn't give the same result
IBM278 doesn't give the same result
IBM280 doesn't give the same result
IBM284 doesn't give the same result
IBM285 doesn't give the same result
IBM290 doesn't give the same result
IBM297 doesn't give the same result
IBM420 doesn't give the same result
IBM424 doesn't give the same result
IBM500 doesn't give the same result
IBM870 doesn't give the same result
IBM871 doesn't give the same result
IBM918 doesn't give the same result
ISO-2022-CN throws an exception
JIS_X0212-1990 doesn't give the same result
UTF-16 doesn't give the same result
UTF-16BE doesn't give the same result
UTF-16LE doesn't give the same result
UTF-32 doesn't give the same result
UTF-32BE doesn't give the same result
UTF-32LE doesn't give the same result
x-IBM1025 doesn't give the same result
x-IBM1097 doesn't give the same result
x-IBM1112 doesn't give the same result
x-IBM1122 doesn't give the same result
x-IBM1123 doesn't give the same result
x-IBM1364 doesn't give the same result
x-IBM300 doesn't give the same result
x-IBM833 doesn't give the same result
x-IBM834 doesn't give the same result
x-IBM875 doesn't give the same result
x-IBM930 doesn't give the same result
x-IBM933 doesn't give the same result
x-IBM935 doesn't give the same result
x-IBM937 doesn't give the same result
x-IBM939 doesn't give the same result
x-JIS0208 doesn't give the same result
x-JISAutoDetect throws an exception
x-MacDingbat doesn't give the same result
x-MacSymbol doesn't give the same result
x-UTF-16LE-BOM doesn't give the same result
X-UTF-32BE-BOM doesn't give the same result
X-UTF-32LE-BOM doesn't give the same result

The original question is on Stack Overflow: https://stackoverflow.com/questions/36688767/


