
java - Merged terms are displayed

Reposted. Author: 行者123. Updated: 2023-11-30 11:35:02

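I have a program that tokenizes the contents of a file and displays the tokens, but a few terms are merged together in the output. Why?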

import java.io.*;
import java.util.*;

class JavaApplication1
{
    static HashMap<String,Integer> hTable=new HashMap<String,Integer>();
    static int word,uwords,oncewords;

    public static void main(String args[]) throws IOException
    {
        File folder=new File("File.txt");
        File[] lFile=folder.listFiles();
        int len=lFile.length;
        for(int i=0 ;i<1 ;i++) {
            File file=lFile[i];
            if(file.isFile()) {
                Scanner scanner=new Scanner(file);
                String line = null;
                StringBuilder sb = new StringBuilder();
                while(scanner.hasNextLine()) {
                    line=scanner.nextLine();
                    sb.append(line);
                }
                // StringTokenizer st=new StringTokenizer(sb.toString(),"</>,?.[/]=()+|");
                StringTokenizer st=new StringTokenizer(sb.toString()," </DOC>.,TITLE-\n");
                //System.out.println("*************************");
                while(st.hasMoreTokens())
                {
                    String next=st.nextToken();
                    word=word+1;
                    if(hTable.containsKey(next))
                    {
                        int a=hTable.get(next);
                        hTable.put(next, a+1);
                        uwords++;
                    }
                    else
                    {
                        hTable.put(next,1);
                        System.out.println(next);
                        oncewords++;
                    }
                }
            }
        }
        System.out.println("Total number of tokens in the database is"+word);
        System.out.println("Total number of tokens that are unique in the database are "+ uwords);
        System.out.println("Total number of tokens that occur only once in the database is" +oncewords);

        int count=0;
        Collection<Integer> setofvalues=hTable.values();
        Object[] Varr=setofvalues.toArray();
        Arrays.sort(Varr,Collections.reverseOrder());
        Set<Object> Set1 = new LinkedHashSet<Object>(Arrays.asList(Varr));
        for (Object i:Set1)
        {
            for (Map.Entry<String, Integer> entry : hTable.entrySet())
            {
                /* if (i.equals(entry.getValue())&&count<30)
                {
                    System.out.println(entry.getKey()+ "=" +entry.getValue());
                    count=count+1;
                }*/
            }
        }

        int avg=(word/len);
        System.out.println("The average number of tokens per document" +avg);
    }
}



And the contents of the file are:
<DOC>
<DOCNO>
1
</DOCNO>
<TITLE>
experimental investigation of the aerodynamics of a
wing in a slipstream .
</TITLE>
<AUTHOR>
brenckman,m.
</AUTHOR>
<BIBLIO>
j. ae. scs. 25, 1958, 324.
</BIBLIO>
<TEXT>
an experimental study of a wing in a propeller slipstream was
made in order to determine the spanwise distribution of the lift
increase due to slipstream at different angles of attack of the wing
and at different free stream to slipstream velocity ratios . the
results were intended in part as an evaluation basis for different
theoretical treatments of this problem .
the comparative span loading curves, together with supporting
evidence, showed that a substantial part of the lift increment
produced by the slipstream was due to a /destalling/ or boundary-layer-control
effect . the integrated remaining lift increment,
after subtracting this destalling lift, was found to agree
well with a potential flow theory .
an empirical evaluation of the destalling effects was made for
the specific configuration of the experiment .
</TEXT>
</DOC>

And the output is:
N
1
experimental
investigation
of
the
aerodynamics
awing
in
a
slipstream
AU
H
R
brenckman
m
B
j
ae
scs
25
1958
324
X
an
study
wing
propeller
wasmade
order
to
determine
spanwise
distribution
liftincrease
due
at
different
angles
attack
wingand
free
stream
velocity
ratios
theresults
were
intended
part
as
evaluation
basis
for
differenttheoretical
treatments
this
problem
comparative
span
loading
curves
together
with
supportingevidence
showed
that
substantial
lift
incrementproduced
by
was
destalling
or
boundary
layer
controleffect
integrated
remaining
increment
after
subtracting
found
agreewell
potential
flow
theory
empirical
effects
made
forthe
specific
configuration
experiment
Total number of tokens in the database is151
Total number of tokens that are unique in the database are 58
Total number of tokens that occur only once in the database is93

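Best answer

The problem appears to be here:

    line=scanner.nextLine();
    sb.append(line);

Scanner.nextLine() returns each line without its line separator, and nothing is appended between lines when they are read into sb. As a result, the last word of one line is merged with the first word of the next line (for example, "a" and "wing" become "awing" in the output).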
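
One possible fix, shown here as a minimal sketch rather than the asker's final code, is to append a separator after each line before tokenizing. The class name LineJoinDemo and the helper methods joinLines and tokenize are hypothetical names introduced for illustration; the delimiter string is copied from the question.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;
import java.util.StringTokenizer;

public class LineJoinDemo {

    // Reads every line and appends a single space after each one, so the
    // last word of a line cannot fuse with the first word of the next line.
    public static String joinLines(Scanner scanner) {
        StringBuilder sb = new StringBuilder();
        while (scanner.hasNextLine()) {
            sb.append(scanner.nextLine()).append(' ');
        }
        return sb.toString();
    }

    // Splits on the same delimiter characters used in the question.
    public static List<String> tokenize(String text) {
        StringTokenizer st = new StringTokenizer(text, " </DOC>.,TITLE-\n");
        List<String> tokens = new ArrayList<>();
        while (st.hasMoreTokens()) {
            tokens.add(st.nextToken());
        }
        return tokens;
    }

    public static void main(String[] args) {
        // Two lines taken from the sample document in the question.
        String sample = "aerodynamics of a\nwing in a slipstream .\n";
        // prints [aerodynamics, of, a, wing, in, a, slipstream]
        System.out.println(tokenize(joinLines(new Scanner(sample))));
    }
}
```

A space works here because it is already one of the delimiter characters passed to StringTokenizer; appending "\n" instead would have the same effect with this delimiter set.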

Regarding "java - Merged terms are displayed", a similar question was found on Stack Overflow: https://stackoverflow.com/questions/15333278/
