java - Unable to read words from a txt file and count them

Reposted. Author: 行者123. Updated: 2023-12-02 04:17:23

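I have a small project to write a Twitter crawler, and I have run into some problems analyzing the collected tweets.

The collected tweets are stored in a txt file. What I want to do is count how many words the txt file contains, how many of those words contain "engineering", and how many hashtags there are. Here is what I have tried so far: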

import java.io.*;
import java.util.StringTokenizer;

public class TwitterAnalyzer {

    public static void main(String args[]){
        try{

            String keyword = "Engineering";
            FileInputStream fInstream = new FileInputStream("C:\\Users\\Alan\\Documents\\NetBeansProjects\\TwitterCrawler\\"+keyword+"-data.txt");
            DataInputStream in = new DataInputStream(fInstream);
            BufferedReader br = new BufferedReader(new InputStreamReader(in));
            String strLine;

            int numberOfKeywords = 0;
            int numberOfWords = 0;
            int numberOfHashtags = 0;

            while((strLine = br.readLine()) != null){

                strLine = br.readLine();
                System.out.println(strLine);
                StringTokenizer st = new StringTokenizer(strLine, " \t\n\r\f.,;:!?\"");
                while(st.hasMoreTokens()){
                    String word = st.nextToken();
                    numberOfWords++;
                    if(word.contains(keyword)){
                        numberOfKeywords++;
                    }
                    if(word.contains("#")){
                        numberOfHashtags++;
                    }
                }
            }

            System.out.println(numberOfWords);
            System.out.println(numberOfKeywords);
            System.out.println(numberOfHashtags);
            br.close();

        }catch (FileNotFoundException fe){
            fe.printStackTrace();
            System.out.println("Unable to locate file");
            System.exit(-1);
        }catch (IOException ie){
            ie.printStackTrace();
            System.out.println("Unable to read file");
            System.exit(-1);
        }

    }
}

Here is a link to the txt file.

Any help would be greatly appreciated!

Best answer

The following code returns: 202, 14, 22

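import java.io.BufferedReader;
import java.io.FileInputStream;
import java.io.FileNotFoundException;
import java.io.IOException;
import java.io.InputStreamReader;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class TwitterAnalyzer {

    public static void main(String args[]){
        try{
            // The keyword is lowercase; each line is lowercased before matching,
            // so the keyword count is case-insensitive.
            String keyword = "engineering";
            Pattern keywordPattern = Pattern.compile(keyword);

            // A '#' followed by a letter, digit, or underscore counts as one hashtag.
            Pattern hashTagPattern = Pattern.compile("#[a-zA-Z0-9_]");

            FileInputStream fInstream = new FileInputStream("E:\\t.txt");
            BufferedReader br = new BufferedReader(new InputStreamReader(fInstream));
            String strLine;

            int numberOfKeywords = 0;
            int numberOfWords = 0;
            int numberOfHashtags = 0;

            // readLine() is called only in the loop condition, once per iteration,
            // so no line is skipped.
            while((strLine = br.readLine()) != null){
                Matcher matcher = keywordPattern.matcher(strLine.toLowerCase());
                while (matcher.find())
                    numberOfKeywords++;
                // Splits on every single whitespace character, so consecutive
                // spaces produce empty tokens that are counted as words too.
                numberOfWords += strLine.split("\\s").length;
                matcher = hashTagPattern.matcher(strLine);
                while (matcher.find())
                    numberOfHashtags++;
            }

            System.out.println(numberOfWords);
            System.out.println(numberOfKeywords);
            System.out.println(numberOfHashtags);
            br.close();

        }catch (FileNotFoundException fe){
            fe.printStackTrace();
            System.out.println("Unable to locate file");
            System.exit(-1);
        }catch (IOException ie){
            ie.printStackTrace();
            System.out.println("Unable to read file");
            System.exit(-1);
        }
    }
}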
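
For comparison, the loop in the question calls readLine() twice per iteration (once in the while condition and once in the body), so every other line is dropped before it is counted, and its keyword "Engineering" is matched case-sensitively. Below is a minimal sketch of the same three counts with the counting logic separated from the file I/O so it can be checked on in-memory lines. The class name TweetStats, the method count, and the UTF-8 encoding are assumptions for illustration, not part of the original answer. It also skips blank lines and collapses runs of whitespace, so its word total may differ from the figures reported above.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class; the name and layout are illustrative only.
class TweetStats {

    // '#' followed by one or more word characters ([a-zA-Z_0-9]) is one hashtag.
    private static final Pattern HASHTAG = Pattern.compile("#\\w+");

    /** Returns {words, keyword occurrences, hashtags} for the given lines. */
    static int[] count(List<String> lines, String keyword) {
        // Quote the keyword so it is matched literally, ignoring case.
        Pattern keywordPattern =
                Pattern.compile(Pattern.quote(keyword), Pattern.CASE_INSENSITIVE);
        int words = 0;
        int keywords = 0;
        int hashtags = 0;

        for (String line : lines) {
            // Blank lines contribute no words; runs of whitespace count once.
            String trimmed = line.trim();
            if (!trimmed.isEmpty()) {
                words += trimmed.split("\\s+").length;
            }
            Matcher m = keywordPattern.matcher(line);
            while (m.find()) {
                keywords++;
            }
            m = HASHTAG.matcher(line);
            while (m.find()) {
                hashtags++;
            }
        }
        return new int[] {words, keywords, hashtags};
    }

    public static void main(String[] args) throws IOException {
        // With a path argument, read that file (assumed to be UTF-8);
        // otherwise fall back to two sample lines.
        List<String> lines = args.length > 0
                ? Files.readAllLines(Paths.get(args[0]), StandardCharsets.UTF_8)
                : Arrays.asList("Engineering is fun #eng #STEM", "I love engineering");
        int[] counts = count(lines, "engineering");
        System.out.println(counts[0]);
        System.out.println(counts[1]);
        System.out.println(counts[2]);
    }
}
```

Passing the lines in as a List keeps count free of checked exceptions, which makes it easy to exercise without a file on disk. For very large files, reading line by line with a BufferedReader as in the answer above would use less memory.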

Regarding "java - Unable to read words from a txt file and count them", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/33170486/
