
Java Scanner does not fully read every line of a .txt file

Reposted. Author: 塔克拉玛干. Updated: 2023-11-02 08:46:54

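This program tries to split a text file into words and then count how many times each word is used. The Scanner seems to read only part of each line, and I don't know why. This is my first time scanning a file this way.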

import java.io.File;
import java.io.FileNotFoundException;
import java.util.ArrayList;
import java.util.Scanner;

public class WordStats {

    public static void main(String args[]){
        ArrayList<String> words = new ArrayList<>(1);
        ArrayList<Integer> num = new ArrayList<>(1);
        Scanner sc2 = null;
        try {
            sc2 = new Scanner(new File("source.txt"));
        } catch (FileNotFoundException e) {
            e.printStackTrace();
        }
        while (sc2.hasNextLine()) {
            Scanner s2 = new Scanner(sc2.nextLine());
            boolean set=false;
            while (s2.hasNext()) {
                num.add(1);
                String s = s2.next().replaceAll("[^A-Za-z ]", " ").toLowerCase().trim();
                for(int i=0;i<words.size(); i++){
                    if(s.equals(words.get(i))){
                        num.set(i,num.get(i)+1);
                        set=true;
                    }
                }
                if(!set){
                    words.add(s);
                    num.add(1);
                }
            }
        }
        for(int i=0;i<words.size();i++){
            System.out.println(words.get(i)+" "+num.get(i));
        }
    }
}

The text file is the Gettysburg Address:

ABRAHAM LINCOLN, “GETTYSBURG ADDRESS” (19 NOVEMBER 1863)

Fourscore and seven years ago our fathers brought forth on this continent, a new nation, conceived in Liberty, and dedicated to the proposition that all men are created equal.

Now we are engaged in a great civil war, testing whether that nation, or any nation so conceived and so dedicated, can long endure. We are met on a great battle-field of that war. We have come to dedicate a portion of that field, as a final resting place for those who here gave their lives that that nation might live. It is altogether fitting and proper that we should do this.

But, in a larger sense, we can not dedicate-we consecrate-we can not hallow-this ground. The brave men, living and dead, who struggled here, have consecrated it, far above our poor power to add or detract. The world will little note, nor long remember what we say here, but it can never forget what they did here. It is for us the living, rather, to be dedicated here to the unfinished work which they who fought here have thus far so nobly advanced. It is rather for us to be here dedicated to the great task remaining before us-that from these honored dead we take increased devotion to that cause for which they gave the last full measure of devotion-that we here highly resolve that these dead shall not have died in vain-that this nation, under God, shall have a new birth of freedom-and that government of the people, by the people, for the people shall not perish from the earth.

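The original line breaks are preserved. My output seems to count only part of each line, and it counts a blank as a word, with a count of two. Output: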

abraham 1
lincoln 1
gettysburg 1
address 1
2
november 1
fourscore 1
and 5
seven 1
years 1
ago 1
our 2
fathers 1
brought 1
forth 1
on 2
this 3
continent 1
a 7
new 2
nation 5
conceived 2
in 4
liberty 1
now 1
we 8
are 2
engaged 1
but 2

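The problem may not be in the scanning approach, but I am more familiar with that part of the code, and I don't think that's where it is.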

Best answer

You need to reset your boolean set at the start of this while loop:

while (s2.hasNext()) {
    set = false;

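As written, set is declared once per line, so as soon as you hit the first repeated word on a line it stays true for the rest of that line, and no further new words from that line are added to your list.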
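To make the fix concrete, here is a minimal sketch of the counting loop with the flag reset for every token. The class name WordStatsSketch, the method count, and the variable names are made up for illustration, and the input is a string rather than source.txt so the example runs on its own. The sketch also leaves out the num.add(1) at the top of the question's inner loop, since that extra add appears to push the counts out of step with the words; that is an observation beyond what the answer states.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;

public class WordStatsSketch {

    // Fills two parallel lists: counts.get(i) is the count for words.get(i).
    static void count(String text, List<String> words, List<Integer> counts) {
        Scanner lines = new Scanner(text);
        while (lines.hasNextLine()) {
            Scanner tokens = new Scanner(lines.nextLine());
            while (tokens.hasNext()) {
                // Reset for every token, not once per line.
                boolean found = false;
                String w = tokens.next()
                        .replaceAll("[^A-Za-z ]", " ")
                        .toLowerCase()
                        .trim();
                for (int i = 0; i < words.size(); i++) {
                    if (w.equals(words.get(i))) {
                        counts.set(i, counts.get(i) + 1);
                        found = true;
                        break; // each word is stored once, so stop at the first match
                    }
                }
                if (!found) {
                    words.add(w);
                    counts.add(1);
                }
            }
        }
    }

    public static void main(String[] args) {
        List<String> words = new ArrayList<>();
        List<Integer> counts = new ArrayList<>();
        count("we can not dedicate\nwe can not hallow this ground", words, counts);
        for (int i = 0; i < words.size(); i++) {
            System.out.println(words.get(i) + " " + counts.get(i));
        }
    }
}
```

With the flag reset per token, "hallow", "this" and "ground" are still added even though "we" was already matched earlier on the same line.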

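The blank entry comes from how your replaceAll handles "(19" and "1863)": these "words" contain no alphabetic characters, so every character is replaced with a space and trim() leaves an empty string, which then gets counted like any other word.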
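The effect can be checked in isolation. The sketch below applies the same normalization as the question's inner loop to a few tokens; the class name TokenCleanupDemo and the helper normalize are hypothetical. Skipping tokens whose cleaned form is empty is one possible way to keep the blank entry out of the counts, not something the answer prescribes.

```java
public class TokenCleanupDemo {

    // Same normalization as in the question's inner loop.
    static String normalize(String token) {
        return token.replaceAll("[^A-Za-z ]", " ").toLowerCase().trim();
    }

    public static void main(String[] args) {
        String[] tokens = {"(19", "1863)", "NOVEMBER", "nation,"};
        for (String token : tokens) {
            String cleaned = normalize(token);
            // An empty result means the token had no letters and could be skipped.
            System.out.println("[" + token + "] -> [" + cleaned + "] empty=" + cleaned.isEmpty());
        }
    }
}
```

In the counting loop, a guard such as if (s.isEmpty()) continue; placed right after the normalization would drop these tokens before they reach the word list.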

Regarding "Java Scanner does not fully read every line of a .txt file", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/25792346/
