java - Converting RSS Feed XML to JSON using Java is Displaying Special Characters

Reposted. Author: 太空宇宙. Last updated: 2023-11-04 11:03:29

I created a Spring MVC based RESTful controller that takes a hard-coded RSS HTTP URL and converts it from XML to JSON:

RssFeedController:

import java.io.IOException;
import java.io.InputStream;
import java.net.HttpURLConnection;
import java.net.MalformedURLException;
import java.net.URL;
import java.net.URLConnection;

import org.apache.commons.io.IOUtils;
import org.apache.log4j.Logger;
import org.json.JSONObject;
import org.json.XML;
import org.springframework.http.HttpHeaders;
import org.springframework.web.bind.annotation.RequestMapping;
import org.springframework.web.bind.annotation.RequestMethod;
import org.springframework.web.bind.annotation.RestController;

import com.fasterxml.jackson.databind.ObjectMapper;

@RestController
public class RssFeedController {

    private HttpHeaders headers = null;

    public RssFeedController() {
        headers = new HttpHeaders();
        headers.add("Content-Type", "application/json");
    }

    @RequestMapping(value = "/v2/convertToJson", method = RequestMethod.GET, produces = "application/json")
    public String getRssFeedAsJson() throws IOException {
        InputStream xml = getInputStreamForURLData("http://www.samplefeed.com/feed");
        // Reads the whole stream into a String using the platform default charset
        String xmlString = IOUtils.toString(xml);
        // Convert the XML document to a JSON object
        JSONObject jsonObject = XML.toJSONObject(xmlString);
        // Round-trip through Jackson to produce the response string
        ObjectMapper objectMapper = new ObjectMapper();
        Object json = objectMapper.readValue(jsonObject.toString(), Object.class);
        String response = objectMapper.writeValueAsString(json);
        return response;
    }

    // Opens an HTTP connection to targetUrl and returns the response body stream,
    // or null if the URL is malformed or the request fails
    public static InputStream getInputStreamForURLData(String targetUrl) {
        URL url = null;
        HttpURLConnection httpConnection = null;
        InputStream content = null;

        try {
            url = new URL(targetUrl);
            URLConnection conn = url.openConnection();
            conn.setRequestProperty("User-Agent", "Mozilla/5.0");
            httpConnection = (HttpURLConnection) conn;
            int responseCode = httpConnection.getResponseCode();
            content = (InputStream) httpConnection.getInputStream();
        }
        catch (MalformedURLException e) {
            e.printStackTrace();
        }
        catch (IOException e) {
            e.printStackTrace();
        }
        return content;
    }
}

pom.xml

<dependency>
    <groupId>org.json</groupId>
    <artifactId>json</artifactId>
    <version>20170516</version>
</dependency>

<dependency>
    <groupId>commons-io</groupId>
    <artifactId>commons-io</artifactId>
    <version>2.5</version>
</dependency>

The original RSS feed has content that looks like this:

<item>
<title>October Fest Weekend</title>
<link>http://www.samplefeed.com/feed/OctoberFestWeekend</link>
<comments>http://www.samplefeed.com/feed/OctoberFestWeekend/#comments</comments>
<pubDate>Wed, 04 Oct 2017 17:08:48 +0000</pubDate>
<dc:creator><![CDATA[John Doe]]></dc:creator>
<category><![CDATA[Uncategorized]]></category>

<guid isPermaLink="false">http://www.samplefeed.com/feed/?p=9227</guid>
<description><![CDATA[<p>
</p>
<p>Doors Open:6:30pm<br />
Show Begins: 7:30pm<br />
Show Ends (Estimated time): 11:00pm<br />
Location: Staples Center</p>
<p>Directions</p>
<p>Map of ...</p>
<p>The post <a rel="nofollow" href="http://www.samplefeed.com/feed/OctoberFestWeekend/">OctoberFest Weekend</a> appeared first on <a rel="nofollow" href="http://www.samplefeed.com">SampleFeed</a>.</p>
]]></description>

This is rendered as JSON like this:

{
    "guid": {
        "content": "http://www.samplefeed.com/feed/?p=9227",
        "isPermaLink": false
    },
    "pubDate": "Wed, 04 Oct 2017 17:08:48 +0000",
    "category": "Uncategorized",
    "title": "October Fest Weekend",
    "description": "<p>\n??</p>\n<p>Doors Open:6:30pm<br />\nShow Begins:?? 7:30pm<br />\nShow Ends (Estimated time):??11:00pm<br />\nLocation: Staples Center</p>\n<p>Directions</p>\n<p>Map of ...</p>\n<p>The post <a rel=\"nofollow\" href=\"http://www.samplefeed.com/feed/OctoberFestWeekend/\">OctoberFest Weekend</a> appeared first on <a rel=\"nofollow\" href=\"http://www.samplefeed.com\">Sample Feed</a>.</p>\n",
    "dc:creator": "John Doe",
    "link": "http://www.samplefeed.com/feed/OctoberFestWeekend",
    "comments": "http://www.samplefeed.com/feed/OctoberFestWeekend/#comments"
}

Notice that in the rendered JSON there are two question marks ("??") near the start of the value for the "description" key, like this:

"description": "<p>\n??</p>\n

There are also two question marks after "Show Begins:":

<br />\nShow Begins:??

The same happens before "11:00pm":

Show Ends (Estimated time):??11:00pm<br />

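This is not the only pattern in which special characters show up. In some places three question marks (???) are generated, and in others as many as six (??????).

For example: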

<title>Today’s 20th Annual Karaoke</title>

is rendered in JSON as:

"title": "Today???s 20th Annual Karaoke"

<content:encoded><![CDATA[(Monte Vista High School, NY.).  </span></p>]]></content:encoded>

is rendered in JSON as:

"content:encoded":  "(Monte Vista High School, NY.).????</span></p>

In some places the XML contains a dash ("–"):

<strong>Welcome</strong> – Welcome to the Party!

which is rendered in JSON as:

<strong>Welcome</strong>????? Welcome to the Party!

Does anyone know how to set the correct encoding in my code so that I can avoid these bad/special character rendering issues?

Best answer

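After going through your code line by line, I found the cause of the special characters coming back as "?", and I am updating my answer accordingly.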

Change this line of code

@RequestMapping(value = "/v2/convertToJson", method = RequestMethod.GET, produces = "application/json")

to

@RequestMapping(value = "/v2/convertToJson", method = RequestMethod.GET, produces = "application/json;charset=UTF-8")

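You need to specify the UTF-8 charset in the value of the produces attribute when producing JSON. I apologize for the misunderstanding in my earlier answer; it is now updated.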
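
Why the charset matters can be shown outside of Spring with a minimal sketch. When a handler returns a String and no charset is given, Spring's StringHttpMessageConverter falls back to ISO-8859-1 by default, which cannot represent characters such as ’ (U+2019). The class CharsetDemo and its helpers roundTrip and decodeUtf8AsAscii are hypothetical names made up for this illustration. The second helper assumes a scenario in which the feed's UTF-8 bytes are decoded with a non-UTF-8 default charset, which may or may not match your environment.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class CharsetDemo {

    // Encode text with the given charset and decode it back with the same charset,
    // imitating a response body that is written and then read using that charset.
    public static String roundTrip(String text, Charset charset) {
        return new String(text.getBytes(charset), charset);
    }

    // Encode text as UTF-8 but decode the bytes as US-ASCII, imitating a feed read
    // with a default charset that is not UTF-8 (an assumed scenario, not confirmed).
    public static String decodeUtf8AsAscii(String text) {
        return new String(text.getBytes(StandardCharsets.UTF_8), StandardCharsets.US_ASCII);
    }

    public static void main(String[] args) {
        // \u2019 is the curly apostrophe from the feed's <title>
        String title = "Today\u2019s 20th Annual Karaoke";

        // ISO-8859-1 has no mapping for U+2019, so encoding substitutes a single '?'
        System.out.println(roundTrip(title, StandardCharsets.ISO_8859_1));
        // prints "Today?s 20th Annual Karaoke"

        // UTF-8 can represent the character, so the text survives unchanged
        boolean preserved = roundTrip(title, StandardCharsets.UTF_8).equals(title);
        System.out.println("UTF-8 round trip preserved the title: " + preserved);

        // U+2019 is three bytes in UTF-8. Decoded as US-ASCII, each byte becomes a
        // replacement character, and each of those is then written out as '?'
        String misdecoded = decodeUtf8AsAscii(title);
        System.out.println(roundTrip(misdecoded, StandardCharsets.ISO_8859_1));
        // prints "Today???s 20th Annual Karaoke"
    }
}
```

If the reading side is affected as well, one option is to pass the charset explicitly when reading the stream, for example IOUtils.toString(xml, StandardCharsets.UTF_8) from commons-io. This assumes the feed really is served as UTF-8.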

Regarding "java - Converting RSS Feed XML to JSON using Java is Displaying Special Characters", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/46656103/
