
java - Pattern optimization

Reposted. Author: 塔克拉玛干. Updated: 2023-11-02 19:00:08

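I need to scrape some content from an HTTP response using Java. The fields I need from the response are foo, bar and bla. My current pattern is slow. Any ideas for improving it?

Response: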

...
<div class="ui-a">
<div class="ui-b">
<p><strong>foo</strong></p>
<p>bar</p>
</div>
<div class="ui-c">
<p><strong>bla</strong></p>
<p>...</p>
</div>
</div>

<div class="ui-a">
<div class="ui-b">
<p><strong>foo1</strong></p>
<p>bar1</p>
</div>
<div class="ui-c">
<p><strong>bla1</strong></p>
<p>...</p>
</div>

Pattern:

.*?<div class="ui-a">.*?<strong>(.*?)</strong>.*?<p>(.*?)</p>.*?</div>.*?<div class="ui-c">.*?<strong>(.*?)</strong>.*?

Best answer

Since you can't use an HTML parser, try something like this:

import java.util.regex.*;

public class Main {
    public static void main (String[] args) {
        String html =
            "...\n" +
            "<div class=\"ui-a\">\n" +
            "<div class=\"ui-b\">\n" +
            "  <p><strong>foo</strong></p>\n" +
            "  <p>bar</p>\n" +
            "</div>\n" +
            "<div class=\"ui-c\">\n" +
            "  <p><strong>bla</strong></p>\n" +
            "  <p>...</p>\n" +
            "</div>\n" +
            "</div>\n" +
            "\n" +
            "<div class=\"ui-a\">\n" +
            "<div class=\"ui-b\">\n" +
            "  <p><strong>foo1</strong></p>\n" +
            "  <p>bar1</p>\n" +
            "</div>\n" +
            "<div class=\"ui-c\">\n" +
            "  <p><strong>bla1</strong></p>\n" +
            "  <p>...</p>\n" +
            "</div>";

        Pattern p = Pattern.compile(
            "(?sx)                               # enable DOT-ALL and COMMENTS       \n" +
            "<div\\s+class=\"ui-a\">               # match '<div...ui-a...>'           \n" +
            "(?:(?!<strong>).)*+                 # match everything up to <strong>   \n" +
            "<strong>([^<>]++)</strong>          # match <strong>...</strong>        \n" +
            "(?:(?!<p>).)*+                      # match up to <p>                   \n" +
            "<p>([^<>]++)</p>                    # match <p>...</p>                  \n" +
            "(?:(?!<div\\s+class=\"ui-c\">).)*+    # match up to '<div...ui-c...>'     \n" +
            "<div\\s+class=\"ui-c\">               # match '<div...ui-c...>'           \n" +
            "(?:(?!<strong>).)*+                 # match everything up to <strong>   \n" +
            "<strong>([^<>]++)</strong>          # match <strong>...</strong>        \n"
        );

        Matcher m = p.matcher(html);

        // Each find() consumes one ui-a block; groups 1..3 hold foo, bar and bla.
        while (m.find()) {
            System.out.println("---------------");
            for (int i = 1; i <= m.groupCount(); i++) {
                System.out.printf("group(%d) = %s\n", i, m.group(i));
            }
        }
    }
}

This prints the following to the console:

---------------
group(1) = foo
group(2) = bar
group(3) = bla
---------------
group(1) = foo1
group(2) = bar1
group(3) = bla1
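If the three values are needed as data rather than console output, the same regex can be compiled once and reused. The following is only a sketch under a few assumptions: the class name FieldScraper, the method extract and the group names foo, bar and bla are made up for illustration, and the pattern is the one from the answer written without COMMENTS mode.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper; the names below are illustrative and not from the original answer.
final class FieldScraper {

    // Compiled once and reused, so repeated calls do not pay for Pattern.compile again.
    // (?s) lets '.' cross line breaks; named groups replace the numeric indexes.
    private static final Pattern BLOCK = Pattern.compile(
        "(?s)<div\\s+class=\"ui-a\">"
        + "(?:(?!<strong>).)*+<strong>(?<foo>[^<>]++)</strong>"
        + "(?:(?!<p>).)*+<p>(?<bar>[^<>]++)</p>"
        + "(?:(?!<div\\s+class=\"ui-c\">).)*+<div\\s+class=\"ui-c\">"
        + "(?:(?!<strong>).)*+<strong>(?<bla>[^<>]++)</strong>");

    // Returns one {foo, bar, bla} triple per matched ui-a block.
    static List<String[]> extract(String html) {
        List<String[]> rows = new ArrayList<>();
        Matcher m = BLOCK.matcher(html);
        while (m.find()) {
            rows.add(new String[] { m.group("foo"), m.group("bar"), m.group("bla") });
        }
        return rows;
    }

    public static void main(String[] args) {
        String html =
            "<div class=\"ui-a\"><div class=\"ui-b\"><p><strong>foo</strong></p><p>bar</p></div>"
            + "<div class=\"ui-c\"><p><strong>bla</strong></p><p>...</p></div></div>";
        for (String[] row : extract(html)) {
            System.out.println(String.join(" | ", row)); // prints: foo | bar | bla
        }
    }
}
```

A Pattern is immutable and safe to share between threads, which is why it can live in a static field; each call still creates its own Matcher.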

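Note my changes: the leading and trailing .*? are gone, each .*? between tags is now a possessive loop guarded by a negative lookahead, such as (?:(?!<strong>).)*+, and the captured text uses [^<>]++ instead of (.*?).

This should make it faster (not sure by how much...).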
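To get a rough idea of the difference on your own data, both patterns can be timed side by side. The harness below is a sketch, not a proper benchmark: the class PatternTiming, the generated input from sampleHtml and the round counts are all assumptions, and the question's pattern is compiled with DOTALL on the assumption that this is how it was used. Timings from a loop like this vary with JIT warm-up and input shape, so treat the numbers as indicative only.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical timing harness; names, input size and round counts are illustrative.
final class PatternTiming {

    // The pattern from the question (DOTALL assumed so '.' can cross line breaks).
    static final Pattern RELUCTANT = Pattern.compile(
        ".*?<div class=\"ui-a\">.*?<strong>(.*?)</strong>.*?<p>(.*?)</p>.*?</div>"
        + ".*?<div class=\"ui-c\">.*?<strong>(.*?)</strong>.*?", Pattern.DOTALL);

    // The pattern from the answer, written without COMMENTS mode.
    static final Pattern POSSESSIVE = Pattern.compile(
        "<div\\s+class=\"ui-a\">(?:(?!<strong>).)*+<strong>([^<>]++)</strong>"
        + "(?:(?!<p>).)*+<p>([^<>]++)</p>"
        + "(?:(?!<div\\s+class=\"ui-c\">).)*+<div\\s+class=\"ui-c\">"
        + "(?:(?!<strong>).)*+<strong>([^<>]++)</strong>", Pattern.DOTALL);

    // Builds a made-up response with the given number of well-formed ui-a blocks.
    static String sampleHtml(int blocks) {
        StringBuilder sb = new StringBuilder("...\n");
        for (int i = 0; i < blocks; i++) {
            sb.append("<div class=\"ui-a\">\n<div class=\"ui-b\">\n")
              .append("  <p><strong>foo").append(i).append("</strong></p>\n")
              .append("  <p>bar").append(i).append("</p>\n</div>\n")
              .append("<div class=\"ui-c\">\n")
              .append("  <p><strong>bla").append(i).append("</strong></p>\n")
              .append("  <p>...</p>\n</div>\n</div>\n\n");
        }
        return sb.toString();
    }

    // Counts how many times the pattern is found in the input.
    static int countMatches(Pattern p, String input) {
        Matcher m = p.matcher(input);
        int n = 0;
        while (m.find()) {
            n++;
        }
        return n;
    }

    // Wall-clock time for scanning the input `rounds` times.
    static long timeNanos(Pattern p, String input, int rounds) {
        long start = System.nanoTime();
        for (int i = 0; i < rounds; i++) {
            countMatches(p, input);
        }
        return System.nanoTime() - start;
    }

    public static void main(String[] args) {
        String html = sampleHtml(500);
        // Warm-up rounds so the JIT has compiled the hot paths before measuring.
        timeNanos(RELUCTANT, html, 20);
        timeNanos(POSSESSIVE, html, 20);
        System.out.printf("reluctant:  %d ms%n", timeNanos(RELUCTANT, html, 100) / 1_000_000);
        System.out.printf("possessive: %d ms%n", timeNanos(POSSESSIVE, html, 100) / 1_000_000);
    }
}
```

The generated input contains only well-formed blocks. Reluctant patterns tend to suffer most when a match fails partway and the engine retries many split points, so a response with broken or missing ui-c sections may show a larger gap than this input does.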

Regarding java - Pattern optimization, a similar question was found on Stack Overflow: https://stackoverflow.com/questions/8086231/
