java - Extracting a specific part of a URL with a regular expression

Reposted. Author: 行者123. Updated: 2023-12-02 00:26:30

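I want to use a regular expression in Java to extract a part of a URL that sits in the middle of it. Here is what I tried. The main difficulty is that the part I need, regex+java, is in the middle of the last segment of the URL, and I don't know how to ignore the characters after it; my regex only ignores what comes before it: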

    String regex = "https://www\\.google\\.com/(search)?q=([^/]+)/";
    String url = "https://www.google.com/search?q=regex+java&ie=utf-8&oe=utf-8&aq=t&rls=org.mozilla:en-US:official&client=firefox-a";
    Pattern pattern = Pattern.compile(regex);
    Matcher matcher = pattern.matcher(url);

    if (matcher.matches()) {
        int n = matcher.groupCount();
        for (int i = 0; i <= n; ++i) {
            System.out.println(matcher.group(i));
        }
    }

The result should be regex+java, or even regex java. But my code doesn't work...

Best answer

Try:

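    // "\\?" matches the literal question mark; group 1 captures everything
    // after "q=" up to the next '&', and ".*" consumes the rest of the URL
    // so that matches() can succeed on the whole string.
    String regex = "https://www\\.google\\.com/search\\?q=([^&]+).*";
    String url = "https://www.google.com/search?q=regex+java&ie=utf-8&oe=utf-8&aq=t&rls=org.mozilla:en-US:official&client=firefox-a";
    Pattern pattern = Pattern.compile(regex);
    Matcher matcher = pattern.matcher(url);

    if (matcher.matches()) {
        int n = matcher.groupCount();
        // Group 0 is the whole match; group 1 is the value of q.
        for (int i = 0; i <= n; ++i) {
            System.out.println(matcher.group(i));
        }
    }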

The output is:

https://www.google.com/search?q=regex+java&ie=utf-8&oe=utf-8&aq=t&rls=org.mozilla:en-US:official&client=firefox-a
regex+java
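
For context on why the pattern in the question never matched: in `(search)?q=` the `?` is a quantifier that makes the group optional, so the pattern expects `searchq=` or `q=` rather than a literal question mark; the trailing `/` demands a slash the URL does not contain; and `matches()` only succeeds when the pattern covers the entire input. As an alternative to the approach above (not the answerer's method), here is a minimal sketch that uses `find()` with a shorter pattern, so the rest of the URL does not need to be described at all. The class name `QueryParamDemo` and the helper `extractQuery` are hypothetical names chosen for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class QueryParamDemo {
    // Inside a character class '?' is literal, so "[?&]q=" matches "?q=" or "&q=".
    // Group 1 captures the raw value up to the next '&' or '#'.
    static final Pattern Q_PARAM = Pattern.compile("[?&]q=([^&#]*)");

    // Hypothetical helper: returns the raw value of the q parameter, or null if absent.
    static String extractQuery(String url) {
        Matcher matcher = Q_PARAM.matcher(url);
        // find() looks for the first matching substring; matches() would
        // require the pattern to cover the whole URL.
        return matcher.find() ? matcher.group(1) : null;
    }

    public static void main(String[] args) {
        String url = "https://www.google.com/search?q=regex+java&ie=utf-8&oe=utf-8&aq=t&rls=org.mozilla:en-US:official&client=firefox-a";
        System.out.println(extractQuery(url)); // prints "regex+java"
    }
}
```

Because the pattern anchors on `?` or `&` directly before `q=`, the `aq=t` parameter in the sample URL is not mistaken for `q`.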

Edit

Replace all plus signs before printing:

    for (int i = 0; i <= n; ++i) {
        // '+' is a regex metacharacter, so it is escaped for replaceAll
        String str = matcher.group(i).replaceAll("\\+", " ");
        System.out.println(str);
    }
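
Replacing plus signs covers the sample URL, but a query value can also contain percent-encoded characters such as `%2B` for a literal plus. As a hedged alternative (a swapped-in technique, not part of the original answer), the standard `java.net.URLDecoder` handles both cases. The class name `QueryDecodeDemo` and the helper `decode` are hypothetical, and UTF-8 is assumed as the encoding because the sample URL carries `ie=utf-8`.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;

class QueryDecodeDemo {
    // Hypothetical helper: decodes a raw query value, turning '+' into a
    // space and %xy escapes into the characters they encode.
    static String decode(String rawValue) {
        try {
            return URLDecoder.decode(rawValue, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // Every Java platform is required to support UTF-8.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(decode("regex+java"));    // prints "regex java"
        System.out.println(decode("c%2B%2B+regex")); // prints "c++ regex"
    }
}
```

Only the extracted value should be passed to the decoder, not the whole URL, since decoding first would blur the `&` and `=` separators.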

Regarding "java - Extracting a specific part of a URL with a regular expression", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/9927297/
