
java - How to parse HTML tags using a DOM parser in Java

Reposted. Author: 行者123. Updated: 2023-11-30 09:52:36

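I am getting a value from a web service, and one of its attribute values contains HTML tags and special characters. Can anyone tell me how to parse that value?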

When I parse the value with a DOM parser, I get this exception:

org.xml.sax.SAXParseException: Attr.value missing f. WIDOWS: (position:START_TAG <ARTICLE ARTICLE_ID='23221' HIDE_HEADER='0' MIGRATED='0' CITNART_DOC_REGION_INFO='' ISCSUSER='1' ARTICLE_TYPE_ID='31' ARTICLE_TYPE='Mobile- News and Commentary - Europe' CITN_ISSUE_NUMBER='' CITN_ARTICLE_TYPE_ID='' CITN_ARTICLE_TYPE='' SHOW_AUTH='1' LOGO_TYPE='QUEST' TITLE='Elementis - europe' DATE='2010-11-04T11:58:21.387' BODY='<span style=' WIDOWS:='null'>@1:726 in java.io.StringReader@43d85268) 

The value I receive from the web service is:

<?xml version="1.0" encoding="utf-8"?><soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema"><soap:Body><getDataResponse xmlns="http://tempuri.org/QuestIPhoneWebService/QuestIPhoneWebService"><getDataResult><ROOT xmlns:sql="urn:schemas-microsoft-com:xml-sql"><ARTICLE ARTICLE_ID="23221" HIDE_HEADER="0" MIGRATED="0" CITNART_DOC_REGION_INFO="" ISCSUSER="1" ARTICLE_TYPE_ID="31" ARTICLE_TYPE="Mobile- News and Commentary - Europe" CITN_ISSUE_NUMBER="" CITN_ARTICLE_TYPE_ID="" CITN_ARTICLE_TYPE="" SHOW_AUTH="1" LOGO_TYPE="QUEST" TITLE="Elementis - europe" DATE="2010-11-04T11:58:21.387" BODY="<span style="WIDOWS: 2; TEXT-TRANSFORM: none; TEXT-INDENT: 0px; BORDER-COLLAPSE: separate; FONT: medium 'Times New Roman'; WHITE-SPACE: normal; ORPHANS: 2; LETTER-SPACING: normal; COLOR: rgb(0,0,0); WORD-SPACING: 0px; -webkit-border-horizontal-spacing: 0px; -webkit-border-vertical-spacing: 0px; -webkit-text-decorations-in-effect: none; -webkit-text-size-adjust: none; -webkit-text-stroke-width: 0px" class="Apple-style-span"><span class="Apple-style-span">
11-15 11:39:09.949: INFO/System.out(224): <p style="LINE-HEIGHT: 11pt" class="MsoNormal"><span lang="EN-US">At the end of 2008, the FTSE350 chemical sector consisted of just two names è Johnston Matthey and Croda. Since then we have had the admission of Victrex and, as of last week, Elementis and Yule Catto. Having met management, we believe that Elementis has all the ingredients for value creation that Croda has so successfully exhibited.</span></p>
11-15 11:39:09.949: INFO/System.out(224): <p style="LINE-HEIGHT: 11pt" class="MsoNormal"><span lang="EN-US">Being promoted into the FTSE250 opens Elementis up to a whole new investment audience. It has not just got there through a cyclical bounce back either. The company has gone through a very sensible rationalisation programme, exited a low-returning business (UK Chromium), is running much more efficient levels of working capital, and crucially, is more exposed to growth markets. To give an idea of managementès resolve, instead of selling the UK Chromium business they decided to effectively bulldoze the site. This will prevent a competitor from interfering in Elementisè position in US Chromium.</span></p>
11-15 11:39:09.961: INFO/System.out(224): <p style="LINE-HEIGHT: 11pt" class="MsoNormal"><span lang="EN-US">During the credit crunch Elementis picked up an Asian-focused speciality chemicals business called Deuchem for Å£38m (Å£45m sales). Deuchem has 12 offices in<?xml:namespace prefix = st1 /><st1:country-region><st1:place>China</st1:place></st1:country-region><span class="Apple-converted-space">&nbsp;</span>and is benefiting as the Chinese customer moves up the quality/performance scale. Previously, Chinese demand was not for sophisticated products è this is changing as we type. Coatings are the main market for speciality products, with Oilfield Chemicals the next biggest category. The cost of Elementisè products per end unit remains small, typically <5%. Yet the relationship with the customer (its largest is Akzo Nobel) is generally one that has been forged over many years (even decades) and required them to work closely together. In short, it is not particularly competitive, but does require consistent delivery and performance from Elementis. We have a very conservative top-line growth forecast of 3% for specialty chemicals, yet would not be surprised if it was nearer 5%. Margin progression here is key and we expect a mid-to-high teens margin up from 9%.</span></p>
11-15 11:39:09.980: INFO/System.out(224): <p style="LINE-HEIGHT: 11pt" class="MsoNormal"><span lang="EN-US">Another growth area is shale gas. Elementis makes the lubricant for the drill bit. Typically, drilling was vertical. But, now drill bits can be turned 90 degrees accessing much more of the shale seam. This requires much more lubricant è hence H1 2010 volumes were double the year before. There is only one competitor in this area. Elsewhere in the US Elementis has its US Chromium business. This is steady, has high<span class="Apple-converted-space">&nbsp;</span><st1:country-region>US</st1:country-region><span class="Apple-converted-space">&nbsp;</span>market shares and has a superior transport advantage to competitors exporting to the<span class="Apple-converted-space">&nbsp;</span><st1:country-region><st1:place>US</st1:place></st1:country-region>. This is a solid business growing at 3% with a 15% operating margin.</span></p>
11-15 11:39:09.980: INFO/System.out(224): <p style="LINE-HEIGHT: 11pt" class="MsoNormal"><span lang="EN-US">Since the credit crunch the CFO has tightened up inventory management and creditor days. This has helped to transfer c.ţ25m of value to shareholders, a vital step in maximizing returns for shareholders. On a separate note management think there is a chance that an EU fine worth ţ21m that Elementis has paid could be reversed.</span></p>
11-15 11:39:09.990: INFO/System.out(224): <p style="LINE-HEIGHT: 11pt" class="MsoNormal"><span lang="EN-US">Weève updated the Modeller approach we used in last monthès CITN note è <a href="http://www.csquest.com/QUEST?uid=MAIL&Tp=Cn&PCF=CNAR&ID=23243" target="_blank">Itès Elementary</a>è¡. Instead of using a<span class="Apple-converted-space">&nbsp;</span><a href="http://www.csquest.com/QUEST?clpg=ART&id=13586&clid=&pg=MDL&spl=&cid=0241854" target="_blank">central valuation (100p)</a><span class="Apple-converted-space">&nbsp;</span>è half way between the<a href="http://www.csquest.com/QUEST?clpg=ART&id=13629&clid=&pg=MDL&spl=&cid=0241854" target="_blank">bull (135p)</a><span class="Apple-converted-space">&nbsp;</span>and bear (67p) scenarios è since seeing management, weère now happier using a valuation halfway between the bull case and the central case. Given this renewed confidence, we think this 118p adjusted valuation is very credible indeed. With 24% upside to Fridayès close, Elementis is a buy.</span></p>
11-15 11:39:10.000: INFO/System.out(224): <p>
11-15 11:39:10.010: INFO/System.out(224): <table style="WIDTH: 345.75pt; BORDER-COLLAPSE: collapse; MARGIN-LEFT: 4pt" class="MsoTableGrid" border="0" cellspacing="0" cellpadding="0" width="461">
11-15 11:39:10.020: INFO/System.out(224): <tbody>
11-15 11:39:10.020: INFO/System.out(224): <tr>
11-15 11:39:10.020: INFO/System.out(224): <td style="PADDING-BOTTOM: 0cm; PADDING-LEFT: 5.4pt; WIDTH: 345.75pt; PADDING-RIGHT: 5.4pt; PADDING-TOP: 0cm" valign="top" width="461">
11-15 11:39:10.029: INFO/System.out(224): <p style="LINE-HEIGHT: 11pt; MARGIN: 0.75pt 0cm 0.75pt -3.95pt" class="MsoNormal"><b><span lang="EN-US">Sales Team</span></b><span lang="EN-US"><span class="Apple-converted-space">&nbsp;</span><a href="mailto:salesteam@collinsstewart.com" target="_blank">salesteam@collinsstewart.com</a>, Tel: +44 (0) 20 7523 8493</span></p></td></tr></tbody></table></p></span></span>" IS_PROTECTED="0" PDF_NAME="" REFERENCE_CITN_ARTICLE_ID="23221" ISNEWARTICLE="5" HYPERLINK="/PATH/23221.pdf"><SUMMARY>Elementis Europe Summary</SUMMARY><AUTHORS/></ARTICLE><ASSOCIATED_COMPANIES ARTICLE_ID="23221"/><COMPANIES_WITH_AUTH context="COMPANIES"/></ROOT>
11-15 11:39:10.029: INFO/System.out(224): </getDataResult></getDataResponse></soap:Body></soap:Envelope>

Can anyone tell me how to parse the special characters and HTML tags?

Any help would be appreciated.

Best answer

I copied your sample, formatted it, and I think I understand the problem. Your XML is a SOAP response that contains HTML. The HTML is held in the BODY attribute of the ARTICLE tag.
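A minimal sketch of the failure described above, assuming the standard JAXP DOM parser that ships with the JDK (the question appears to use a different parser on Android, so the exact exception message will differ). The class name UnescapedAttributeDemo and the shortened ARTICLE samples are made up for illustration; they are not the real service payload.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXParseException;

// Hypothetical demo class: checks whether a string is well-formed XML.
final class UnescapedAttributeDemo {

    // Returns true if the DOM parser accepts the input, false if it
    // rejects it with a SAXParseException (not well-formed).
    static boolean parses(String xml) {
        try {
            DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            return true;
        } catch (SAXParseException e) {
            return false;
        } catch (Exception e) {
            // Configuration or I/O problems are unrelated to the point here.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Raw HTML inside the attribute: the first inner quote ends the value
        // and the raw '<' is not allowed there, so parsing fails.
        String raw = "<ARTICLE BODY=\"<span style=\"WIDOWS: 2\">text</span>\"/>";

        // The same HTML with the special characters escaped.
        String escaped = "<ARTICLE BODY=\"&lt;span style=&quot;WIDOWS: 2&quot;"
                + "&gt;text&lt;/span&gt;\"/>";

        System.out.println("raw parses:     " + parses(raw));
        System.out.println("escaped parses: " + parses(escaped));
    }
}
```

The JDK parser may also print a "[Fatal Error]" line to standard error for the rejected input; that comes from its default error handler, not from this sketch.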

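An XML attribute value cannot contain characters such as < and & in raw form, nor the quote character that delimits the value; > is conventionally escaped as well. Your content contains many of these characters because it is HTML. To send HTML inside an attribute, you have to escape them, i.e. replace:

< by &lt;
> by &gt;
& by &amp;
' by &apos;
" by &quot;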
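The replacements above could be sketched in Java roughly as follows. This is an illustrative helper, not code from the original answer: the class name XmlAttributeEscaper and the sample HTML are assumptions. It walks the input once, character by character, so an ampersand that belongs to an already emitted entity is never escaped a second time.

```java
// Hypothetical helper: escapes text so it can be placed inside an
// XML attribute value (or element text).
final class XmlAttributeEscaper {

    static String escape(String raw) {
        StringBuilder out = new StringBuilder(raw.length() + 16);
        for (int i = 0; i < raw.length(); i++) {
            char c = raw.charAt(i);
            switch (c) {
                case '&':  out.append("&amp;");  break;
                case '<':  out.append("&lt;");   break;
                case '>':  out.append("&gt;");   break;
                case '"':  out.append("&quot;"); break;
                case '\'': out.append("&apos;"); break;
                default:   out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Shortened stand-in for the HTML carried in the BODY attribute.
        String html = "<span style=\"WIDOWS: 2\"><a href=\"x?a=1&b=2\">link</a></span>";
        System.out.println("<ARTICLE BODY=\"" + escape(html) + "\"/>");
    }
}
```

A DOM parser on the receiving side undoes these replacements on its own, so the client reads the original HTML back from the attribute without any extra decoding step.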

I mean you should do this when generating the response, not when parsing it! Good luck.
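If the service that generates the response happens to be written in Java (an assumption; the question does not say how the response is produced), one way to get the escaping right is to build the XML through the DOM API and let the serializer do it. The sketch below uses only standard JAXP classes; the class name ArticleRoundTrip and the one-attribute ARTICLE element are simplifications for illustration.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

// Hypothetical demo class: producer escapes via the serializer,
// consumer reads the attribute back with a DOM parser.
final class ArticleRoundTrip {

    // Producer side: set the raw HTML as an attribute and serialize.
    static String serialize(String bodyHtml) {
        try {
            DocumentBuilder builder =
                    DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = builder.newDocument();
            Element article = doc.createElement("ARTICLE");
            article.setAttribute("BODY", bodyHtml); // raw HTML, no manual escaping
            doc.appendChild(article);

            Transformer transformer = TransformerFactory.newInstance().newTransformer();
            transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            transformer.transform(new DOMSource(doc), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    // Consumer side: parse the XML and read the BODY attribute.
    static String readBody(String xml) {
        try {
            DocumentBuilder builder =
                    DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = builder.parse(new InputSource(new StringReader(xml)));
            return doc.getDocumentElement().getAttribute("BODY");
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String html = "<span style=\"WIDOWS: 2\"><a href=\"x?a=1&b=2\">link</a></span>";
        String xml = serialize(html);
        System.out.println(xml);
        System.out.println(readBody(xml).equals(html));
    }
}
```

Two caveats. An XML parser normalizes literal line breaks and tabs inside attribute values to spaces, so multi-line HTML such as the article body in the question will not come back byte for byte unless the serializer writes those characters as character references. Carrying the HTML in element text or a CDATA section instead of an attribute avoids that, but it would be a change to the service's format.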

Regarding "java - How to parse HTML tags using a DOM parser in Java", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/4181931/
