
macos - Decoding quoted-printable messages in Swift

Reposted. Author: 搜寻专家. Updated: 2023-10-31 08:26:26

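I have a quoted-printable string such as "The cost would be =C2=A31,000". How do I convert it to "The cost would be £1,000"?

At the moment I am just converting the text manually, which does not cover all cases. I am sure there is a single line of code that would help with this.

Here is my code: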

func decodeUTF8(message: String) -> String
{
    var newMessage = message.stringByReplacingOccurrencesOfString("=2E", withString: ".", options: NSStringCompareOptions.LiteralSearch, range: nil)
    newMessage = newMessage.stringByReplacingOccurrencesOfString("=E2=80=A2", withString: "•", options: NSStringCompareOptions.LiteralSearch, range: nil)
    newMessage = newMessage.stringByReplacingOccurrencesOfString("=C2=A3", withString: "£", options: NSStringCompareOptions.LiteralSearch, range: nil)
    newMessage = newMessage.stringByReplacingOccurrencesOfString("=A3", withString: "£", options: NSStringCompareOptions.LiteralSearch, range: nil)
    newMessage = newMessage.stringByReplacingOccurrencesOfString("=E2=80=9C", withString: "\"", options: NSStringCompareOptions.LiteralSearch, range: nil)
    newMessage = newMessage.stringByReplacingOccurrencesOfString("=E2=80=A6", withString: "…", options: NSStringCompareOptions.LiteralSearch, range: nil)
    newMessage = newMessage.stringByReplacingOccurrencesOfString("=E2=80=9D", withString: "\"", options: NSStringCompareOptions.LiteralSearch, range: nil)
    newMessage = newMessage.stringByReplacingOccurrencesOfString("=92", withString: "'", options: NSStringCompareOptions.LiteralSearch, range: nil)
    newMessage = newMessage.stringByReplacingOccurrencesOfString("=3D", withString: "=", options: NSStringCompareOptions.LiteralSearch, range: nil)
    newMessage = newMessage.stringByReplacingOccurrencesOfString("=20", withString: "", options: NSStringCompareOptions.LiteralSearch, range: nil)
    newMessage = newMessage.stringByReplacingOccurrencesOfString("=E2=80=99", withString: "'", options: NSStringCompareOptions.LiteralSearch, range: nil)

    return newMessage
}

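Thanks.

Best answer

An easy way is to use the (NS)String method stringByRemovingPercentEncoding for this purpose. This was observed in the thread "decoding quoted-printables", so the first solution is mainly a translation of the answers in that thread to Swift.

The idea is to replace the quoted-printable "=NN" encoding with the percent encoding "%NN" and then use the existing method to remove the percent encoding.

Continuation lines (soft line breaks) are handled separately. In addition, percent characters in the input string must be encoded first, otherwise they would be treated as the leading character of a percent encoding.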

func decodeQuotedPrintable(message : String) -> String? {
    return message
        .stringByReplacingOccurrencesOfString("=\r\n", withString: "")
        .stringByReplacingOccurrencesOfString("=\n", withString: "")
        .stringByReplacingOccurrencesOfString("%", withString: "%25")
        .stringByReplacingOccurrencesOfString("=", withString: "%")
        .stringByRemovingPercentEncoding
}

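The function returns an optional string, which is nil for invalid input. Invalid input can be:

  • An "=" character that is not followed by two hexadecimal digits, e.g. "=XX".
  • An "=NN" sequence that does not decode to a valid UTF-8 sequence, e.g. "=E2=64".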
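
For readers on Swift 4 or later, the same percent-encoding trick and its nil cases can be sketched as follows. This is only an illustration: the function name decodeViaPercentEncoding is made up for this example, and the code assumes the input escapes UTF-8 bytes.

```swift
import Foundation

// Hypothetical helper (name chosen for this sketch): the percent-encoding
// trick written with the Swift 4+ spellings of the Foundation methods.
func decodeViaPercentEncoding(_ message: String) -> String? {
    return message
        .replacingOccurrences(of: "=\r\n", with: "")   // soft line break (CRLF)
        .replacingOccurrences(of: "=\n", with: "")     // soft line break (LF)
        .replacingOccurrences(of: "%", with: "%25")    // protect literal "%"
        .replacingOccurrences(of: "=", with: "%")      // "=NN" becomes "%NN"
        .removingPercentEncoding                       // nil for invalid input
}

let valid = decodeViaPercentEncoding("=C2=A31,000")           // "£1,000"
let wrapped = decodeViaPercentEncoding("=C2=\r\n=A31,000")    // "£1,000"
let literalPercent = decodeViaPercentEncoding("50% =3D half") // "50% = half"
let badHex = decodeViaPercentEncoding("=XX")                  // nil
let badUTF8 = decodeViaPercentEncoding("=E2=64")              // nil
```

The order of the replacements matters: literal "%" characters have to be escaped before "=" is turned into "%", otherwise they would be read as the start of an escape.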

Examples:

if let decoded = decodeQuotedPrintable("=C2=A31,000") {
    print(decoded) // £1,000
}

if let decoded = decodeQuotedPrintable("=E2=80=9CHello =E2=80=A6 world!=E2=80=9D") {
    print(decoded) // “Hello … world!”
}

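Update 1: The code above assumes that the message uses the UTF-8 encoding for quoting non-ASCII characters, as in most of your examples: C2 A3 is the UTF-8 encoding of "£", and E2 80 A6 is the UTF-8 encoding of "…".

If the input is "Rub=E9n", then the message uses the Windows-1252 encoding. To decode it correctly, you have to replace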

.stringByRemovingPercentEncoding

with

.stringByReplacingPercentEscapesUsingEncoding(NSWindowsCP1252StringEncoding)
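
To see why the encoding matters, here is a small sketch (Swift 4+ syntax, not part of the original answer) that decodes the raw bytes behind "Rub=E9n" under both encodings:

```swift
import Foundation

// The bytes that "Rub=E9n" stands for once "=E9" is turned into one byte.
let rawBytes = Data([0x52, 0x75, 0x62, 0xE9, 0x6E])

// 0xE9 starts a three-byte UTF-8 sequence, but "n" is not a continuation
// byte, so the data is not valid UTF-8 and the result is nil.
let asUTF8 = String(data: rawBytes, encoding: .utf8)

// In Windows-1252 the single byte 0xE9 is "é".
let asCP1252 = String(data: rawBytes, encoding: .windowsCP1252)
```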

There are also ways to detect the encoding from the "Content-Type" header field; compare e.g. https://stackoverflow.com/a/32051684/1187415.
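
As a rough illustration of that idea (this is not the code from the linked answer), the charset parameter could be pulled out of a Content-Type value like this. The function name and the small lookup table are assumptions made for this sketch; a real mail parser would need to handle many more charset names and header quirks.

```swift
import Foundation

// Hypothetical lookup table: only a few common charset names are covered.
let knownCharsets: [String: String.Encoding] = [
    "utf-8": .utf8,
    "us-ascii": .ascii,
    "iso-8859-1": .isoLatin1,
    "windows-1252": .windowsCP1252,
]

// Hypothetical helper: returns the encoding named by the "charset"
// parameter of a Content-Type value, or nil if it is missing or unknown.
func encoding(fromContentType header: String) -> String.Encoding? {
    // Skip the media type itself ("text/plain") and scan the parameters.
    for part in header.split(separator: ";").dropFirst() {
        let pieces = part.split(separator: "=", maxSplits: 1)
        guard pieces.count == 2 else { continue }
        let name = pieces[0].trimmingCharacters(in: .whitespaces).lowercased()
        guard name == "charset" else { continue }
        let value = pieces[1]
            .trimmingCharacters(in: .whitespaces)
            .trimmingCharacters(in: CharacterSet(charactersIn: "\""))
            .lowercased()
        return knownCharsets[value]
    }
    return nil
}

let fromQuoted = encoding(fromContentType: "text/plain; charset=\"Windows-1252\"")
let fromPlain = encoding(fromContentType: "text/plain; charset=utf-8")
let missing = encoding(fromContentType: "text/plain")
```

The detected encoding would then be passed on to whichever decoding function is used.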


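Update 2: The stringByReplacingPercentEscapesUsingEncoding method is marked as deprecated, so the code above will always generate a compiler warning. Unfortunately, Apple does not seem to provide an alternative method.

So here is a new, completely self-contained decoding method that does not cause any compiler warnings. This time I have written it as an extension method of String. Explanatory comments are in the code.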

extension String {

    /// Returns a new string made by removing in the `String` all "soft line
    /// breaks" and replacing all quoted-printable escape sequences with the
    /// matching characters as determined by a given encoding.
    /// - parameter encoding: A string encoding. The default is UTF-8.
    /// - returns: The decoded string, or `nil` for invalid input.

    func decodeQuotedPrintable(encoding enc : NSStringEncoding = NSUTF8StringEncoding) -> String? {

        // Handle soft line breaks, then replace quoted-printable escape sequences.
        return self
            .stringByReplacingOccurrencesOfString("=\r\n", withString: "")
            .stringByReplacingOccurrencesOfString("=\n", withString: "")
            .decodeQuotedPrintableSequences(enc)
    }

    /// Helper function doing the real work.
    /// Decode all "=HH" sequences with respect to the given encoding.

    private func decodeQuotedPrintableSequences(enc : NSStringEncoding) -> String? {

        var result = ""
        var position = startIndex

        // Find the next "=" and copy characters preceding it to the result:
        while let range = rangeOfString("=", range: position ..< endIndex) {
            result.appendContentsOf(self[position ..< range.startIndex])
            position = range.startIndex

            // Decode one or more successive "=HH" sequences to a byte array:
            let bytes = NSMutableData()
            repeat {
                let hexCode = self[position.advancedBy(1) ..< position.advancedBy(3, limit: endIndex)]
                if hexCode.characters.count < 2 {
                    return nil // Incomplete hex code
                }
                guard var byte = UInt8(hexCode, radix: 16) else {
                    return nil // Invalid hex code
                }
                bytes.appendBytes(&byte, length: 1)
                position = position.advancedBy(3)
            } while position != endIndex && self[position] == "="

            // Convert the byte array to a string, and append it to the result:
            guard let dec = String(data: bytes, encoding: enc) else {
                return nil // Decoded bytes not valid in the given encoding
            }
            result.appendContentsOf(dec)
        }

        // Copy remaining characters to the result:
        result.appendContentsOf(self[position ..< endIndex])

        return result
    }
}

Example usage:

if let decoded = "=C2=A31,000".decodeQuotedPrintable() {
    print(decoded) // £1,000
}

if let decoded = "=E2=80=9CHello =E2=80=A6 world!=E2=80=9D".decodeQuotedPrintable() {
    print(decoded) // “Hello … world!”
}

if let decoded = "Rub=E9n".decodeQuotedPrintable(encoding: NSWindowsCP1252StringEncoding) {
    print(decoded) // Rubén
}

Update for Swift 4 (and later):

extension String {

    /// Returns a new string made by removing in the `String` all "soft line
    /// breaks" and replacing all quoted-printable escape sequences with the
    /// matching characters as determined by a given encoding.
    /// - parameter encoding: A string encoding. The default is UTF-8.
    /// - returns: The decoded string, or `nil` for invalid input.

    func decodeQuotedPrintable(encoding enc : String.Encoding = .utf8) -> String? {

        // Handle soft line breaks, then replace quoted-printable escape sequences.
        return self
            .replacingOccurrences(of: "=\r\n", with: "")
            .replacingOccurrences(of: "=\n", with: "")
            .decodeQuotedPrintableSequences(encoding: enc)
    }

    /// Helper function doing the real work.
    /// Decode all "=HH" sequences with respect to the given encoding.

    private func decodeQuotedPrintableSequences(encoding enc : String.Encoding) -> String? {

        var result = ""
        var position = startIndex

        // Find the next "=" and copy characters preceding it to the result:
        while let range = range(of: "=", range: position..<endIndex) {
            result.append(contentsOf: self[position ..< range.lowerBound])
            position = range.lowerBound

            // Decode one or more successive "=HH" sequences to a byte array:
            var bytes = Data()
            repeat {
                let hexCode = self[position...].dropFirst().prefix(2)
                if hexCode.count < 2 {
                    return nil // Incomplete hex code
                }
                guard let byte = UInt8(hexCode, radix: 16) else {
                    return nil // Invalid hex code
                }
                bytes.append(byte)
                position = index(position, offsetBy: 3)
            } while position != endIndex && self[position] == "="

            // Convert the byte array to a string, and append it to the result:
            guard let dec = String(data: bytes, encoding: enc) else {
                return nil // Decoded bytes not valid in the given encoding
            }
            result.append(contentsOf: dec)
        }

        // Copy remaining characters to the result:
        result.append(contentsOf: self[position ..< endIndex])

        return result
    }
}

Example usage:

if let decoded = "=C2=A31,000".decodeQuotedPrintable() {
    print(decoded) // £1,000
}

if let decoded = "=E2=80=9CHello =E2=80=A6 world!=E2=80=9D".decodeQuotedPrintable() {
    print(decoded) // “Hello … world!”
}

if let decoded = "Rub=E9n".decodeQuotedPrintable(encoding: .windowsCP1252) {
    print(decoded) // Rubén
}
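
A note on the design, with a small sketch that is not part of the original answer: the repeat loop in the helper collects all successive "=HH" escapes into one byte buffer before converting, presumably because a multi-byte character such as "£" cannot be decoded one byte at a time. The following lines illustrate this for UTF-8:

```swift
import Foundation

// "=C2=A3" stands for the two bytes C2 A3, which together encode "£".
let bothBytes = String(data: Data([0xC2, 0xA3]), encoding: .utf8)

// Each byte on its own is an incomplete or invalid UTF-8 sequence,
// so converting them separately yields nil.
let leadOnly = String(data: Data([0xC2]), encoding: .utf8)
let trailOnly = String(data: Data([0xA3]), encoding: .utf8)
```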

Regarding "macos - Decoding quoted-printable messages in Swift", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/32184783/
