
java - Given an invalid base64-encoded string, how can Android's Base64.decode be made to reliably throw an exception (or return null)?

Reposted. Author: 搜寻专家. Updated: 2023-11-01 03:45:42

In Java SE, the decoder throws an exception when it is given an invalid base64-encoded string.

Java SE

// java.util.Base64
Base64.Decoder decoder = Base64.getDecoder();
String input = "キツネが怠惰な犬を飛び越える a quick fox jump over the lazy dog 敏捷的棕色狐狸跨過懶狗 Ein schneller Fuchs springt über den faulen Hund สุนัขจิ้งจอกตัวเตี้ยกระโดดข้ามสุนัขขี้เกียจ быстрый лис перепрыгнуть через ленивую собаку";
byte[] decode = decoder.decode(input); // java.lang.IllegalArgumentException thrown!!!
String output = new String(decode, StandardCharsets.UTF_8);
System.out.println(output);
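
For comparison with the Android case below, here is a minimal sketch of how the Java SE exception can be turned into a null return. The class name JavaSeDecodeOrNull and the method name decodeOrNull are hypothetical and do not come from the original post; only java.util.Base64 is used.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

class JavaSeDecodeOrNull {
    /** Returns the decoded bytes, or null when java.util.Base64 rejects the input. */
    static byte[] decodeOrNull(String input) {
        try {
            return Base64.getDecoder().decode(input);
        } catch (IllegalArgumentException e) {
            // The basic decoder throws on characters outside the base64 alphabet.
            return null;
        }
    }

    public static void main(String[] args) {
        byte[] ok = decodeOrNull("aGVsbG8=");
        System.out.println(new String(ok, StandardCharsets.UTF_8)); // hello
        System.out.println(decodeOrNull("not base64!") == null);    // true
    }
}
```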

In Android, however, no exception is thrown and nothing signals a problem, even when the input is an invalid base64-encoded string.

Android

String input = "キツネが怠惰な犬を飛び越える a quick fox jump over the lazy dog 敏捷的棕色狐狸跨過懶狗 Ein schneller Fuchs springt über den faulen Hund สุนัขจิ้งจอกตัวเตี้ยกระโดดข้ามสุนัขขี้เกียจ быстрый лис перепрыгнуть через ленивую собаку";
byte[] binary = input.getBytes(StandardCharsets.UTF_8);
// android.util.Base64
byte[] decode = Base64.decode(binary, Base64.NO_WRAP);
System.out.println("length = " + decode.length); // length = 50
String output = new String(decode, StandardCharsets.UTF_8);
System.out.println(output); // j��rG��;���ޮ�^���v��{�w���Ź�l���[z�^�����Ǻw

Why is this?

In Android, is there a quick and reliable way to throw an exception or return null when the given input string is not valid base64?


Issue filed

https://issuetracker.google.com/issues/141497577

Best answer

Initially, I planned to use Apache's isBase64 to validate the input before passing it to Base64.decode.

https://github.com/apache/commons-codec/blob/master/src/main/java/org/apache/commons/codec/binary/Base64.java

I later found that its logic is not strict enough. It does not check the following rules, mentioned in https://stackoverflow.com/a/8571544/72437:

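  • Check that the length is a multiple of 4 characters
  • Check that every character is in the set A-Z, a-z, 0-9, +, /, except for padding at the end, which is 0, 1 or 2 '=' characters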
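
The two rules above can also be sketched as a single regular expression. This is an illustrative alternative, not the code from the answer: the class name StrictBase64Check and the method name looksLikeBase64 are hypothetical. Like the rules themselves, it checks syntax only and rejects whitespace.

```java
import java.util.regex.Pattern;

class StrictBase64Check {
    // Zero or more groups of 4 alphabet characters, optionally followed by
    // a final group padded as "xx==" or "xxx=". Matching the whole string
    // enforces both the length rule and the alphabet/padding rule.
    private static final Pattern STRICT = Pattern.compile(
            "(?:[A-Za-z0-9+/]{4})*(?:[A-Za-z0-9+/]{2}==|[A-Za-z0-9+/]{3}=)?");

    /** Returns true when s satisfies both rules; null is treated as invalid. */
    static boolean looksLikeBase64(String s) {
        return s != null && STRICT.matcher(s).matches();
    }

    public static void main(String[] args) {
        System.out.println(looksLikeBase64("aGVsbG8="));  // true
        System.out.println(looksLikeBase64("aGVsbG8"));   // false: length is not a multiple of 4
        System.out.println(looksLikeBase64("aGVs bG8=")); // false: whitespace is rejected
    }
}
```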

So I modified Apache's isBase64 to apply the rules above. Note that, for simplicity, I do not accept whitespace as valid base64 for my use case.

/*
* Licensed to the Apache Software Foundation (ASF) under one or more
* contributor license agreements. See the NOTICE file distributed with
* this work for additional information regarding copyright ownership.
* The ASF licenses this file to You under the Apache License, Version 2.0
* (the "License"); you may not use this file except in compliance with
* the License. You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/

package org.apache.commons.codec.binary;

import java.nio.charset.Charset;

import static java.nio.charset.StandardCharsets.UTF_8;

/**
* Provides Base64 encoding and decoding as defined by <a href="http://www.ietf.org/rfc/rfc2045.txt">RFC 2045</a>.
*
* <p>
* This class implements section <cite>6.8. Base64 Content-Transfer-Encoding</cite> from RFC 2045 <cite>Multipurpose
* Internet Mail Extensions (MIME) Part One: Format of Internet Message Bodies</cite> by Freed and Borenstein.
* </p>
* <p>
* The class can be parameterized in the following manner with various constructors:
* </p>
* <ul>
* <li>URL-safe mode: Default off.</li>
* <li>Line length: Default 76. Line length that aren't multiples of 4 will still essentially end up being multiples of
* 4 in the encoded data.
* <li>Line separator: Default is CRLF ("\r\n")</li>
* </ul>
* <p>
* The URL-safe parameter is only applied to encode operations. Decoding seamlessly handles both modes.
* </p>
* <p>
* Since this class operates directly on byte streams, and not character streams, it is hard-coded to only
* encode/decode character encodings which are compatible with the lower 127 ASCII chart (ISO-8859-1, Windows-1252,
* UTF-8, etc).
* </p>
* <p>
* This class is thread-safe.
* </p>
*
* @see <a href="http://www.ietf.org/rfc/rfc2045.txt">RFC 2045</a>
* @since 1.0
*/
public class Base64 {
/**
* Byte used to pad output.
*/
protected static final byte PAD_DEFAULT = '='; // Allow static access to default

/**
* This array is a lookup table that translates Unicode characters drawn from the "Base64 Alphabet" (as specified
* in Table 1 of RFC 2045) into their 6-bit positive integer equivalents. Characters that are not in the Base64
* alphabet but fall within the bounds of the array are translated to -1.
*
* Thanks to "commons" project in ws.apache.org for this code.
* http://svn.apache.org/repos/asf/webservices/commons/trunk/modules/util/
*/
private static final byte[] DECODE_TABLE = {
// 0 1 2 3 4 5 6 7 8 9 A B C D E F
-1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, // 00-0f
-1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, // 10-1f
-1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, 62, -1, -1, -1, 63, // 20-2f + /
52, 53, 54, 55, 56, 57, 58, 59, 60, 61, -1, -1, -1, -1, -1, -1, // 30-3f 0-9
-1, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, // 40-4f A-O
15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, -1, -1, -1, -1, -1, // 50-5f P-Z
-1, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, // 60-6f a-o
41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51 // 70-7a p-z
};

/**
* Returns whether or not the <code>octet</code> is in the base 64 alphabet.
*
* @param octet
* The value to test
* @return <code>true</code> if the value is defined in the base 64 alphabet, <code>false</code> otherwise.
* @since 1.4
*/
static boolean isBase64(final byte octet) {
return (octet >= 0 && octet < DECODE_TABLE.length && DECODE_TABLE[octet] != -1);
}

private static boolean isPadding(final byte octet) {
return octet == PAD_DEFAULT;
}

/**
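* Tests a given String to see if it is well-formed Base64: its length is a multiple of 4 and it contains only
* characters from the Base64 alphabet, with at most two trailing '=' padding characters. Unlike the original
* Commons Codec method, this modified version does not treat whitespace as valid.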
*
* @param base64
* String to test
* @return <code>true</code> if all characters in the String are valid characters in the Base64 alphabet or if
* the String is empty; <code>false</code>, otherwise
* @since 1.5
*/
public static boolean isBase64(final String base64) {
return isBase64(getBytesUtf8(base64));
}

/**
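* Tests a given byte array to see if it is well-formed Base64: its length is a multiple of 4 and it contains only
* characters from the Base64 alphabet, with at most two trailing '=' padding characters. Unlike the original
* Commons Codec method, this modified version does not treat whitespace as valid.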
*
* @param arrayOctet
* byte array to test
* @return <code>true</code> if all bytes are valid characters in the Base64 alphabet or if the byte array is empty;
* <code>false</code>, otherwise
* @since 1.5
*/
private static boolean isBase64(final byte[] arrayOctet) {
    final int length = arrayOctet.length;

    // Check that the length is a multiple of 4 characters
    if (length % 4 != 0) {
        return false;
    }

    // Check that every character is in the set A-Z, a-z, 0-9, +, / except for padding at the
    // end which is 0, 1 or 2 '=' characters.
    final int end = Math.max(0, length - 2);

    for (int i = 0; i < end; i++) {
        if (!isBase64(arrayOctet[i])) {
            return false;
        }
    }

    // Once a padding character is seen, only padding may follow.
    boolean padding = false;

    for (int i = end; i < length; i++) {
        byte octet = arrayOctet[i];
        if (padding) {
            if (!isPadding(octet)) {
                return false;
            }
        } else {
            if (isPadding(octet)) {
                padding = true;
            } else if (!isBase64(octet)) {
                return false;
            }
        }
    }

    return true;
}

/**
* Encodes the given string into a sequence of bytes using the UTF-8 charset, storing the result into a new byte
* array.
*
* @param string
* the String to encode, may be <code>null</code>
* @return encoded bytes, or <code>null</code> if the input string was <code>null</code>
* @throws NullPointerException
* Thrown if {@link Charsets#UTF_8} is not initialized, which should never happen since it is
* required by the Java platform specification.
* @since As of 1.7, throws {@link NullPointerException} instead of UnsupportedEncodingException
* @see <a href="http://download.oracle.com/javase/7/docs/api/java/nio/charset/Charset.html">Standard charsets</a>
* @see #getBytesUnchecked(String, String)
*/
private static byte[] getBytesUtf8(final String string) {
return getBytes(string, UTF_8);
}

/**
* Calls {@link String#getBytes(Charset)}
*
* @param string
* The string to encode (if null, return null).
* @param charset
* The {@link Charset} to encode the <code>String</code>
* @return the encoded bytes
*/
private static byte[] getBytes(final String string, final Charset charset) {
if (string == null) {
return null;
}
return string.getBytes(charset);
}
}
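
To show how such a check fits in front of a lenient decoder, here is a minimal sketch of a validate-then-decode wrapper. The names GuardedDecode, isStrictBase64 and decodeOrNull are hypothetical, and the validation is a compact restatement of the two rules rather than the class above. The decoder is passed in as a parameter; on Android one would presumably pass b -> android.util.Base64.decode(b, android.util.Base64.NO_WRAP), while the demo uses java.util.Base64.getMimeDecoder(), which also ignores non-alphabet characters, as an off-device stand-in.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;
import java.util.function.Function;

class GuardedDecode {
    private static boolean isAlphabet(byte b) {
        return (b >= 'A' && b <= 'Z') || (b >= 'a' && b <= 'z')
                || (b >= '0' && b <= '9') || b == '+' || b == '/';
    }

    /** Length must be a multiple of 4; '=' may only appear in the last two positions. */
    static boolean isStrictBase64(byte[] in) {
        if (in.length % 4 != 0) {
            return false;
        }
        boolean padding = false;
        for (int i = 0; i < in.length; i++) {
            byte b = in[i];
            if (b == '=') {
                if (i < in.length - 2) {
                    return false; // padding too early
                }
                padding = true;
            } else if (padding || !isAlphabet(b)) {
                return false; // data after padding, or a non-alphabet byte
            }
        }
        return true;
    }

    /** Returns null instead of decoding when the input fails validation. */
    static byte[] decodeOrNull(String input, Function<byte[], byte[]> lenientDecoder) {
        byte[] raw = input.getBytes(StandardCharsets.UTF_8);
        return isStrictBase64(raw) ? lenientDecoder.apply(raw) : null;
    }

    public static void main(String[] args) {
        Function<byte[], byte[]> lenient = Base64.getMimeDecoder()::decode;
        System.out.println(new String(decodeOrNull("aGVsbG8=", lenient), StandardCharsets.UTF_8)); // hello
        System.out.println(decodeOrNull("not base64!", lenient) == null); // true
    }
}
```

Keeping the decoder as a parameter means the same guard can be unit-tested on a plain JVM and reused unchanged on a device.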

Regarding "java - Given an invalid base64-encoded string, how can Android's Base64.decode be made to reliably throw an exception (or return null)?", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/58053650/
