
swift - Parsing HTML with SwiftSoup (Swift)?

Reposted. Author: 行者123. Updated: 2023-11-28 15:00:40

I'm trying to parse some websites with SwiftSoup; say one of them is a Medium article. How can I extract the body of the page and load it into another UIViewController, the way Instapaper does?


Here is the code I use to extract the title:

import SwiftSoup

class WebViewController: UIViewController, UIWebViewDelegate {

    ...

    override func viewDidLoad() {
        super.viewDidLoad()

        let url = URL(string: "https://medium.com/@timjwise/stop-lying-to-yourself-when-you-snub-panhandlers-its-not-for-their-own-good-199d0aa7a513")
        let request = URLRequest(url: url!)
        webView.loadRequest(request)

        guard let myURL = url else {
            print("Error: \(String(describing: url)) doesn't seem to be a valid URL")
            return
        }
        let html = try! String(contentsOf: myURL, encoding: .utf8)

        do {
            let doc: Document = try SwiftSoup.parseBodyFragment(html)
            let headerTitle = try doc.title()
            print("Header title: \(headerTitle)")
        } catch Exception.Error(let type, let message) {
            print("Message: \(message)")
        } catch {
            print("error")
        }
    }
}

But I've had no luck extracting the body of this site or any other. Is there a way to make this work, perhaps with CSS or JavaScript? (I know nothing about CSS or JavaScript.)

Best answer

Use the body() function (see https://github.com/scinfu/SwiftSoup#parsing-a-body-fragment). Try this:

let html = try! String(contentsOf: myURL, encoding: .utf8)

do {
    let doc: Document = try SwiftSoup.parseBodyFragment(html)
    let headerTitle = try doc.title()

    // The <body> element of the parsed document
    let body = doc.body()
    // Elements to remove; in this case, images
    let undesiredElements: Elements? = try body?.select("img[src]")
    // Remove them from the document tree
    try undesiredElements?.remove()

    print("Header title: \(headerTitle)")
} catch Exception.Error(let type, let message) {
    print("Message: \(message)")
} catch {
    print("error")
}
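
The snippet above strips the images but stops before reading the cleaned body back out. As a rough sketch of that last step, the example below parses an inline HTML string and returns the title together with the remaining body markup. The helper name `extractReadableBody(from:)`, the sample HTML, and the choice of which tags to drop are all assumptions made for illustration; none of them come from SwiftSoup or from the original post, and a real Medium page will likely need different selectors.

```swift
import Foundation
import SwiftSoup

// Hypothetical helper (not part of SwiftSoup): returns the page title and
// the <body> markup after unwanted elements have been removed.
func extractReadableBody(from html: String) throws -> (title: String, bodyHTML: String) {
    // parse(_:) treats the input as a complete document, so <head> and
    // <title> are kept where doc.title() looks for them.
    let doc: Document = try SwiftSoup.parse(html)
    let title = try doc.title()

    guard let body = doc.body() else {
        return (title, "")
    }

    // Assumption: scripts, styles and images are not wanted in a reading view.
    try body.select("script, style, img").remove()

    // html() returns the inner markup of <body>; text() would return plain text.
    let bodyHTML = try body.html()
    return (title, bodyHTML)
}

// Made-up sample input standing in for a downloaded page.
let sample = """
<html><head><title>Sample post</title></head>
<body><h1>Heading</h1><img src="a.png"><p>First paragraph.</p><script>track()</script></body></html>
"""

do {
    let result = try extractReadableBody(from: sample)
    print("Title: \(result.title)")
    print(result.bodyHTML)
} catch Exception.Error(_, let message) {
    print("Parse error: \(message)")
} catch {
    print("Unexpected error: \(error)")
}
```

The returned string could then be passed to the second view controller and shown there, for example with WKWebView's loadHTMLString(_:baseURL:). That wiring is left out here because it depends on how the app is structured.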

Regarding "swift - Parsing HTML with SwiftSoup (Swift)?", a similar question was found on Stack Overflow: https://stackoverflow.com/questions/48963919/
