
html - Extract a JSON string from HTML using only iOS APIs

Reposted. Author: 行者123. Updated: 2023-11-30 11:11:36

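I want to extract a JSON string from an HTML document without using a third-party framework. I am building an iOS framework and do not want it to depend on third-party frameworks.

Example URL: http://www.nicovideo.jp/watch/sm33786214

In that HTML, there is an element with the id js-initial-watch-data whose data-api-data attribute holds the JSON string.

I need to extract that attribute value (JSON_String_I_want_to_extract) and convert it into a JSON object.

With the third-party framework "Kanna", it looks like this: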



if let doc = Kanna.HTML(html: html, encoding: String.Encoding.utf8) {
    if let descNode = doc.css("#js-initial-watch-data[data-api-data]").first {
        let dataApiData = descNode["data-api-data"]
        if let data = dataApiData?.data(using: .utf8) {
            if let json = try? JSON(data: data, options: JSONSerialization.ReadingOptions.mutableContainers) {
                // ... use json
            }
        }
    }
}

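I searched the web for similar questions but could not apply the answers to my case :( (I have to admit I do not follow regular expressions very well.)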



if let html = String(data:data, encoding:.utf8) {
    let pattern = "data-api-data=\"(.*?)\".*?>"
    let regex = try! NSRegularExpression(pattern: pattern, options: .caseInsensitive)
    let matches = regex.matches(in: html, options: [], range: NSMakeRange(0, html.count))
    var results: [String] = []
    matches.forEach { (match) -> () in
        results.append( (html as NSString).substring(with: match.rangeAt(1)) )
    }
    if let stringJSON = results.first {
        let d = stringJSON.data(using: String.Encoding.utf8)
        if let json = try? JSONSerialization.jsonObject(with: d!, options: []) as? Any {
            // it does not get here...
        }
    }
}

Is anyone good at extracting a string like this from HTML and converting it to JSON?

Thanks.

Best answer

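Your pattern looks fine, except that attribute values of HTML elements may be written with character entities (for example, &quot; in place of a double quote).

You need to replace those entities with the actual characters before parsing the string as JSON.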
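As a small illustration of why this step matters, the sketch below tries to parse a made-up attribute value before and after decoding. The sample string is an assumption for demonstration only and is not taken from the actual page.

```swift
import Foundation

// Hypothetical attribute value, as it might appear inside the HTML source.
let rawAttribute = "{&quot;title&quot;:&quot;A &amp; B&quot;,&quot;ok&quot;:true}"

// Parsing the raw value fails: &quot; is not valid JSON syntax.
let rawResult = try? JSONSerialization.jsonObject(with: Data(rawAttribute.utf8))

// Decode the five predefined entities. "&amp;" is handled last so that
// text such as "&amp;quot;" becomes the literal text "&quot;", not a quote.
let decoded = rawAttribute
    .replacingOccurrences(of: "&quot;", with: "\"")
    .replacingOccurrences(of: "&apos;", with: "'")
    .replacingOccurrences(of: "&gt;", with: ">")
    .replacingOccurrences(of: "&lt;", with: "<")
    .replacingOccurrences(of: "&amp;", with: "&")

// After decoding, the same bytes parse as a JSON dictionary.
let object = (try? JSONSerialization.jsonObject(with: Data(decoded.utf8))) as? [String: Any]
```

Only the five predefined entities are handled here; numeric references such as &#39; would need extra handling if the page uses them.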

if let html = String(data:data, encoding: .utf8) {
    let pattern = "data-api-data=\"([^\"]*)\""
    let regex = try! NSRegularExpression(pattern: pattern, options: .caseInsensitive)
    let matches = regex.matches(in: html, range: NSRange(0..<html.utf16.count)) //<-USE html.utf16.count, NOT html.count
    var results: [String] = []
    matches.forEach {match in
        let propValue = html[Range(match.range(at: 1), in: html)!]
            //### You need to replace character entities into actual characters
            .replacingOccurrences(of: "&quot;", with: "\"")
            .replacingOccurrences(of: "&apos;", with: "'")
            .replacingOccurrences(of: "&gt;", with: ">")
            .replacingOccurrences(of: "&lt;", with: "<")
            .replacingOccurrences(of: "&amp;", with: "&")
        results.append(propValue)
    }
    if let stringJSON = results.first {
        let dataJSON = stringJSON.data(using: .utf8)!
        do {
            let json = try JSONSerialization.jsonObject(with: dataJSON)
            print(json)
        } catch {
            print(error) //You should not ignore errors silently...
        }
    } else {
        print("NO result")
    }
}
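
The same steps could be wrapped in a reusable function, roughly as sketched below. The helper name extractJSON(fromHTML:attribute:) and the sample HTML are hypothetical and are not part of the original answer. The sample puts an emoji in front of the element to show why the search range has to be measured in UTF-16 code units, which is what NSRange counts.

```swift
import Foundation

/// Hypothetical helper: finds the first `attribute="..."` in the HTML,
/// decodes the predefined character entities, and parses the value as JSON.
func extractJSON(fromHTML html: String, attribute: String) -> Any? {
    let pattern = NSRegularExpression.escapedPattern(for: attribute) + "=\"([^\"]*)\""
    guard let regex = try? NSRegularExpression(pattern: pattern, options: .caseInsensitive) else {
        return nil
    }
    // Build the NSRange from String indices so it covers every UTF-16 code unit.
    let fullRange = NSRange(html.startIndex..<html.endIndex, in: html)
    guard let match = regex.firstMatch(in: html, options: [], range: fullRange),
          let valueRange = Range(match.range(at: 1), in: html) else {
        return nil
    }
    let decoded = String(html[valueRange])
        .replacingOccurrences(of: "&quot;", with: "\"")
        .replacingOccurrences(of: "&apos;", with: "'")
        .replacingOccurrences(of: "&gt;", with: ">")
        .replacingOccurrences(of: "&lt;", with: "<")
        .replacingOccurrences(of: "&amp;", with: "&") // last, to avoid double decoding
    return try? JSONSerialization.jsonObject(with: Data(decoded.utf8))
}

// Made-up HTML for illustration. The emoji is one Character but two UTF-16
// code units, so html.count and html.utf16.count differ by one here.
let sampleHTML = "<p>😀</p><div id=\"js-initial-watch-data\" data-api-data=\"{&quot;ok&quot;:true}\"></div>"
let json = extractJSON(fromHTML: sampleHTML, attribute: "data-api-data") as? [String: Any]
```

With NSMakeRange(0, html.count), the search range for this sample would end one code unit before the end of the string. On a real page with many such characters, the tail of the document could be skipped entirely.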

Regarding "html - Extract a JSON string from HTML using only iOS APIs", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/52137374/
