
java - Getting a Twitter page's title, description, and keywords in Java

Reposted. Author: 太空宇宙. Updated: 2023-11-04 08:23:37

I want to get the title, description, and keywords of a Twitter page using Java.

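I have searched for this many times but could not find a solution. Everything I get back is in the ISO-8859 charset. Please help me get the response in the UTF-8 charset.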

I used the following code for this:

import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.net.MalformedURLException;
import java.net.URL;
import java.net.URLConnection;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class TitDesKey
{
    public static void main ( String[] args ) throws IOException
    {
        // Start from an empty string so the literal text "null" is never prepended.
        String inputLine, source = "", result_tit = null, result_des = null, result_key = null;
        try
        {
            URL url = new URL("http://www.twitter.com");

            URLConnection conn = url.openConnection();
            conn.setRequestProperty("User-Agent","Mozilla/5.0 (X11; U; Linux x86_64; en-GB; rv:1.8.1.6) Gecko/20070723 Iceweasel/2.0.0.6 (Debian-2.0.0.6-0etch1)");
            // No charset is passed here, so the platform default charset is used.
            BufferedReader in = new BufferedReader(new InputStreamReader(conn.getInputStream()));

            // Read only as far as the end of the <head> section.
            while ((inputLine = in.readLine()) != null)
            {
                source = source + " " + inputLine;
                if (inputLine.contains("</head>"))
                {
                    break;
                }
            }
        }
        catch (MalformedURLException e)
        {
            System.out.println("Please Enter Write Information");
        }
        catch (IOException e)
        {
            System.out.println("Please Enter Write Information");
        }

        // Title Data
        Pattern PATTERN_tit = Pattern.compile("<title>(.*?)</title>", Pattern.CASE_INSENSITIVE|Pattern.DOTALL);

        Matcher m_tit = PATTERN_tit.matcher(source);
        while (m_tit.find())
        {
            result_tit = m_tit.group(1);
            result_tit = result_tit.replace("/", "").trim();
            System.out.println(result_tit);
        }

        // Description Data
        Pattern Pattern_dis = Pattern.compile("<meta name=\"description\" content=(.*?)>", Pattern.CASE_INSENSITIVE|Pattern.DOTALL);

        Matcher m_dis = Pattern_dis.matcher(source);
        while (m_dis.find())
        {
            result_des = m_dis.group(1);
            result_des = result_des.replace("/", "").trim();
            System.out.println(result_des);
        }

        // Keyword Data
        Pattern Pattern_key = Pattern.compile("<meta name=\"keywords\" content=(.*?)>", Pattern.CASE_INSENSITIVE|Pattern.DOTALL);

        Matcher m_key = Pattern_key.matcher(source);
        while (m_key.find())
        {
            result_key = m_key.group(1);
            result_key = result_key.replace("/", "").trim();
            System.out.println(result_key);
        }
    }
}

Thanks in advance.

Best answer

If the page you are fetching is already encoded in UTF-8, use the overloaded InputStreamReader constructor that also accepts a Charset. Passing UTF-8 there should work.

Reference: the InputStreamReader documentation.
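
The suggestion above can be sketched as follows. This is a minimal illustration, not the original poster's final code: the class name Utf8HeadReader and the method readHead are hypothetical, and a ByteArrayInputStream stands in for conn.getInputStream() so the example runs without network access. It decodes the same UTF-8 bytes twice, once as UTF-8 and once as ISO-8859-1, to show how the wrong charset garbles non-ASCII characters.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration; not part of the original question or answer.
final class Utf8HeadReader {

    // Reads lines up to and including the one containing "</head>",
    // decoding the bytes with the given charset instead of the platform default.
    static String readHead(InputStream in, Charset charset) throws IOException {
        StringBuilder source = new StringBuilder();
        try (BufferedReader reader = new BufferedReader(new InputStreamReader(in, charset))) {
            String line;
            while ((line = reader.readLine()) != null) {
                source.append(line).append(' ');
                if (line.contains("</head>")) {
                    break;
                }
            }
        }
        return source.toString();
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for conn.getInputStream(): the UTF-8 bytes of a tiny page.
        byte[] page = "<head><title>caf\u00e9</title></head>".getBytes(StandardCharsets.UTF_8);

        // Decoded with the charset the bytes were written in: the accented letter survives.
        System.out.println(readHead(new ByteArrayInputStream(page), StandardCharsets.UTF_8));

        // Decoded as ISO-8859-1: the two UTF-8 bytes of the accented letter become two characters.
        System.out.println(readHead(new ByteArrayInputStream(page), StandardCharsets.ISO_8859_1));
    }
}
```

Applied to the question's code, the equivalent change would be to construct the reader as new InputStreamReader(conn.getInputStream(), StandardCharsets.UTF_8), assuming the server really sends the page as UTF-8.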

Regarding "java - Getting a Twitter page's title, description, and keywords in Java", a similar question was found on Stack Overflow: https://stackoverflow.com/questions/9034025/
