
java - WordCount project flaws

Reposted. Author: 太空宇宙. Updated: 2023-11-04 13:11:21

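I am working on a class project that counts the total number of words, lines, characters, and paragraphs in a text file. So far the word count works, but my character count seems to be off by 3, and the paragraph count seems to be counting the two extra blank lines: I get 5 instead of 4.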

Here is what I have so far:

import java.util.*;
import java.io.*;

public class WordStats {

    /* getWordCount() method will receive a String parameter
     * and return the total number of words by splitting
     * the received string into words and increment word count */
    public static int getWordCount(String line) {

        int wordCount = 0;

        String[] str = line.split(" ");
        for (int i = 0; i < str.length; i++) {
            if (str[i].length() > 0) {
                wordCount++;
            }
        }

        return wordCount;
    }

    /* getParsCount method receives a string parameter
     * and returns the total number of paragraphs in
     * the text file. */
    /*public static int getParsCount(String line){

        int parCount=0;
        boolean isText = false;

        if(!line.isEmpty()){
            isText=false;
        }

        else {
            isText=true;
            parCount++;

        }

        return parCount;
    }
    */

    public static int getParsCount(String line) {
        boolean isText = false;
        if (!line.isEmpty()) {
            if (!isText) {
                isText = true;
                return 1;
            }
        }
        else {
            isText = false;
        }

        return 0;
    }
    public static void main(String[] args) {

        try {

            int chars = 0, words = 1, lines = 0, pars = 0;

            // creates new Scanner inFile
            Scanner inFile = new Scanner(new File("data.txt"));

            // creates file to write updated data file.
            PrintWriter outFile = new PrintWriter(new FileOutputStream("dataCopy.txt"));

            // Loop that sends string variables to methods so long as there is another
            // line in the file.
            while (inFile.hasNextLine()) {

                String line = inFile.nextLine(); // read a line from the input file

                lines++; // increment line count
                chars += (line.length()); // increment char count
                words += getWordCount(line); // increment word count
                pars += getParsCount(line); // increment paragraph count
                outFile.println(line + "\n");
            }

            System.out.println("The number of Characters in the file are: " + chars);
            System.out.println("The number of Words in the file are: " + words);
            System.out.println("The number of Lines in the file are: " + lines);
            System.out.println("The number of Paragraphs in the file are: " + pars);
            inFile.close(); // closes file input.
            outFile.close(); // closes output file.
            System.out.print("File Written");
        }

        catch (FileNotFoundException e) {
            System.out.print("ERROR: CANNOT PROCESS FILE");
        }

    }

}

Here is the input file:

Four score and seven years ago our fathers brought forth on this continent, a new nation, conceived in
Liberty, and dedicated to the proposition that all men are created equal.

Now we are engaged in a great civil war, testing whether that nation, or any nation so conceived and so
dedicated, can long endure. We are met on a great battle-field of that war. We have come to dedicate a
portion of that field, as a final resting place for those who here gave their lives that that nation might
live. It is altogether fitting and proper that we should do this.

But, in a larger sense, we can not dedicate -- we can not consecrate -- we can not hallow -- this ground.
The brave men, living and dead, who struggled here, have consecrated it, far above our poor power to add
or detract. The world will little note, nor long remember what we say here, but it can never forget
what they did here. It is for us the living, rather, to be dedicated here to the unfinished work which
they who fought here have thus far so nobly advanced. It is rather for us to be here dedicated to the great
task remaining before us -- that from these honored dead we take increased devotion to that cause for which
they gave the last full measure of devotion -- that we here highly resolve that these dead shall not have
died in vain -- that this nation, under God, shall have a new birth of freedom -- and that government of
the people, by the people, for the people, shall not perish from the earth.



Abraham Lincoln
November 19, 1863

The output looks like this:

The number of Characters in the file are: 1495
The number of Words in the file are: 283
The number of Lines in the file are: 22
The number of Paragraphs in the file are: 5

Best answer

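You can make the following changes to your code so that it correctly counts the paragraphs, that is, the blocks of consecutive text, in the input file. I created a boolean flag that is set to true when the current line has content and to false when the current line is empty. That way, if there are several blank lines between two paragraphs, they only count once. In addition, extra blank lines at the end of the input file are ignored.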

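public class WordStats2 {

    // Must be static so that the static method below can read and update it,
    // and so that its value survives from one call to the next.
    private static boolean isText = false;

    public static int getParsCount(String line) {
        if (!line.trim().isEmpty()) {
            if (!isText) {
                isText = true;
                return 1;
            }
        }
        else {
            isText = false;
        }

        return 0;
    }
}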
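
The flag-based idea in this answer can be tried out as a small self-contained program. This is only a sketch: the class name ParagraphCounter, the method countParagraphs, and the sample lines are assumptions made for illustration and do not come from the original post. Because all lines are handled in one call, the flag is a local variable here rather than a field.

```java
import java.util.List;

// Hypothetical helper (not from the original post): counts paragraphs as
// runs of consecutive non-blank lines.
class ParagraphCounter {

    static int countParagraphs(List<String> lines) {
        int paragraphs = 0;
        boolean inParagraph = false; // true while the previous line had text
        for (String line : lines) {
            if (line.trim().isEmpty()) {
                inParagraph = false; // a blank line ends the current paragraph
            } else if (!inParagraph) {
                inParagraph = true;  // first text line after a gap starts a paragraph
                paragraphs++;
            }
        }
        return paragraphs;
    }

    public static void main(String[] args) {
        // Made-up sample input: three blocks of text separated by one or
        // more blank (or whitespace-only) lines, plus a trailing blank line.
        List<String> lines = List.of(
            "First paragraph, line one.",
            "First paragraph, line two.",
            "",
            "",
            "Second paragraph.",
            "   ",
            "Third paragraph.",
            "");
        System.out.println(countParagraphs(lines)); // prints 3
    }
}
```

Consecutive blank lines and trailing blank lines do not change the result, because the counter only increases on the transition from "no text" to "text".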

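Since you never showed us your input, we can only guess why the character count is off as well. One possibility is that the extra blank lines at the end of the file are again the culprit. These "empty" lines are not really empty: they contain one or more end-of-line characters (\r\n on Windows, \n on Linux). So your program may be counting these characters. Post your input and I can revise my answer.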
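
One thing worth checking when chasing a small character-count difference is how the line terminators are treated. Scanner.nextLine() returns each line without its line separator, so a loop that sums line.length() leaves those characters out, while a tool that counts every character in the file includes them. The sketch below illustrates this under stated assumptions: the class name LineEndingDemo, the method countWithoutTerminators, and the sample strings are hypothetical and not part of the original post.

```java
import java.util.Scanner;

// Hypothetical demo (not from the original post): compares a per-line
// character count with the raw length of the same text.
class LineEndingDemo {

    // Sums line.length() over every line, the same way the question's loop does.
    static int countWithoutTerminators(String text) {
        int chars = 0;
        try (Scanner in = new Scanner(text)) {
            while (in.hasNextLine()) {
                chars += in.nextLine().length(); // the terminator is not part of the line
            }
        }
        return chars;
    }

    public static void main(String[] args) {
        String unix = "ab\ncd\n";        // two lines with Linux line endings
        String windows = "ab\r\ncd\r\n"; // the same two lines with Windows line endings

        System.out.println(countWithoutTerminators(unix));    // prints 4
        System.out.println(countWithoutTerminators(windows)); // prints 4
        System.out.println(unix.length());                    // prints 6
        System.out.println(windows.length());                 // prints 8
    }
}
```

Whether the terminators should count as characters depends on the assignment, so the expected total is the thing to confirm first.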

Regarding java - WordCount project flaws, we found a similar question on Stack Overflow: https://stackoverflow.com/questions/33928789/
