
java - Retrieving XML from a URL: the first few lines are not written

Reposted. Author: 行者123. Updated: 2023-12-01 11:47:59

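I'm currently writing a basic weather application for university, which involves retrieving weather information from the BBC Weather RSS feed.

I have it all set up to write the RSS feed to a file (output.xml), which a parser class then uses to build the tree.

However, when I run it I get this error: The markup in the document following the root element must be well-formed.

After inspecting the downloaded XML file, I noticed that the first two nodes are missing.

Here is the downloaded XML: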

<channel>
<atom:link href="http://open.live.bbc.co.uk/weather/feeds/en/2656397/observations.rss" rel="self" type="application/rss+xml" />
<title>BBC Weather - Observations for Bangor, United Kingdom</title>
<link>http://www.bbc.co.uk/weather/2656397</link>
<description>Latest observations for Bangor from BBC Weather, including weather, temperature and wind information</description>
<language>en</language>
<copyright>Copyright: (C) British Broadcasting Corporation, see http://www.bbc.co.uk/terms/additional_rss.shtml for more details</copyright>
<pubDate>Thu, 12 Mar 2015 05:35:08 +0000</pubDate>
<item>
<title>Thursday - 05:00 GMT: Thick Cloud, 10°C (50°F)</title>
<link>http://www.bbc.co.uk/weather/2656397</link>
<description>Temperature: 10°C (50°F), Wind Direction: South Easterly, Wind Speed: 8mph, Humidity: 90%, Pressure: 1021mb, Falling, Visibility: Very Good</description>
<pubDate>Thu, 12 Mar 2015 05:35:08 +0000</pubDate>
<guid isPermaLink="false">http://www.bbc.co.uk/weather/2656397-2015-03-12T05:35:08.000Z</guid>
<georss:point>53.22647 -4.13459</georss:point>
</item>
</channel>
</rss>

The XML should have the following two nodes before <channel>:

<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:atom="http://www.w3.org/2005/Atom" xmlns:georss="http://www.georss.org/georss" version="2.0">

Here is the code I use to retrieve the XML file:

public static void main(String[] args) throws SAXException, IOException, XPathExpressionException {
    URL url = new URL("http://open.live.bbc.co.uk/weather/feeds/en/2656397/observations.rss");
    URLConnection con = url.openConnection();
    StringBuilder builder;
    try (BufferedReader in = new BufferedReader(new InputStreamReader(con.getInputStream()))) {

        builder = new StringBuilder();
        String line;

        if (!in.readLine().isEmpty()) {
            line = in.readLine();
        }

        while ((line = in.readLine()) != null) {
            builder.append(line).append("\n");
        }

        String input = builder.toString();

        BufferedWriter out = new BufferedWriter(new OutputStreamWriter(new FileOutputStream(new File("output.xml"))));
        out.write(input);
        out.flush();
    }
    try {
        WeatherParser parser = new WeatherParser();
        System.out.println(parser.parse("output.xml"));
    } catch (ParserConfigurationException ex) {
    }
}

Below is the code that parses the XML (WeatherParser.java):

public class WeatherParser {

    public WeatherParser() throws ParserConfigurationException {
        xpfactory = XPathFactory.newInstance();
        path = xpfactory.newXPath();
        dbfactory = DocumentBuilderFactory.newInstance();
        builder = dbfactory.newDocumentBuilder();
    }

    public String parse(String fileName) throws SAXException, IOException, XPathExpressionException {
        File f = new File(fileName);
        org.w3c.dom.Document doc = builder.parse(f);
        StringBuilder info = new StringBuilder();
        info.append(path.evaluate("/channel/item/title", doc));
        return info.toString();
    }

    private DocumentBuilderFactory dbfactory;
    private DocumentBuilder builder;
    private XPathFactory xpfactory;
    private XPath path;
}

I hope I have provided enough information.

Best answer

The first two lines are missing because you read them but never "saved" them.
Remove this block and it will work:

    if (!in.readLine().isEmpty()) {
        line = in.readLine();
    }

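The if condition reads the first line (<?xml ...) but does not keep it.
line = in.readLine(); then reads the second line, but as soon as you enter the while loop, its condition assigns a new value to the line variable and that content is lost.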
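
For illustration, here is a minimal sketch of the read loop with that if block removed. The class name FeedReader and the helper readAll are hypothetical names introduced for this example, and a StringReader stands in for the network stream so the sketch runs offline; it is not the asker's actual program.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;

// Hypothetical helper class, introduced only for this example.
class FeedReader {

    // Reads every line from the source, including the first two,
    // and returns them joined with "\n" (one trailing newline per line).
    static String readAll(Reader source) throws IOException {
        StringBuilder builder = new StringBuilder();
        try (BufferedReader in = new BufferedReader(source)) {
            String line;
            // Each readLine() result is appended; nothing is read and then discarded.
            while ((line = in.readLine()) != null) {
                builder.append(line).append("\n");
            }
        }
        return builder.toString();
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for the feed: an assumption so the example needs no network access.
        String sample = "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n"
                + "<rss version=\"2.0\">\n"
                + "<channel>\n"
                + "</channel>\n"
                + "</rss>\n";
        System.out.print(readAll(new StringReader(sample)));
    }
}
```

In the original program the same loop would wrap con.getInputStream() in an InputStreamReader. Passing StandardCharsets.UTF_8 as the second constructor argument is probably a sensible addition, since the feed declares UTF-8, but that is a suggestion rather than part of the answer above.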

Regarding "java - Retrieving XML from a URL: the first few lines are not written", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/29003192/
