
java - Removing duplicate key-value pairs from a map whose values are lists

Reposted. Author: 行者123. Updated: 2023-12-01 16:50:42

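Below is my code for detecting abbreviations and their long forms. The code loops over each line of a document, loops over each word in that line, and identifies acronym candidates. It then loops over every line of the document again to find a suitable long form. My problem is that if an acronym occurs several times in the document, my output contains multiple instances of it. I want to print each acronym, together with all of its possible long forms, only once. Here is my code: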

public static void main(String[] args) throws FileNotFoundException
{
    BufferedReader in = new BufferedReader(new FileReader("D:\\Workspace\\resource\\SampleSentences.txt"));
    String str = null;
    ArrayList<String> lines = new ArrayList<String>();
    String matchingLongForm;
    List<String> matchingLongForms = new ArrayList<String>();
    List<String> shortForm = new ArrayList<String>();
    Map<String, List<String>> abbreviationPairs = new HashMap<String, List<String>>();

    try
    {
        while ((str = in.readLine()) != null) {
            lines.add(str);
        }
    }
    catch (IOException e)
    {
        // TODO Auto-generated catch block
        e.printStackTrace();
    }
    String[] linesArray = lines.toArray(new String[lines.size()]);

    // document wide search for abbreviation long form and identifying several appropriate matches
    for (String line : linesArray) {
        for (String word : (Tokenizer.getTokenizer().tokenize(line))) {
            if (isValidShortForm(word)) {
                for (int i = 0; i < linesArray.length; i++) {
                    matchingLongForm = extractBestLongForm(word, linesArray[i]);
                    //shortForm.add(word);
                    if (matchingLongForm != null && !(matchingLongForms.contains(matchingLongForm))) {
                        matchingLongForms.add(matchingLongForm);

                        //System.out.println(matchingLongForm);
                        abbreviationPairs.put(word, matchingLongForms);
                        //matchingLongForms.clear();
                    }
                }

                if (abbreviationPairs != null) {
                    //for(abbreviationPairs.)
                    System.out.println("Abbreviation Pair:" + "\t" + abbreviationPairs);
                    abbreviationPairs.clear();
                    matchingLongForms.clear();
                    //System.out.println("Abbreviation Pair:" + "\t" + abbreviationPairsNew);
                }
                else
                    continue;
            }
        }
    }
}

Here is the current output:

Abbreviation Pair:  {GLBA=[Gramm Leach Bliley act]} 
Abbreviation Pair: {NCUA=[National credit union administration]}
Abbreviation Pair: {FFIEC=[Federal Financial Institutions Examination Council]}
Abbreviation Pair: {CFR=[comments for the Report]}
Abbreviation Pair: {CFR=[comments for the Report]}
Abbreviation Pair: {CFR=[comments for the Report]}
Abbreviation Pair: {CFR=[comments for the Report]}
Abbreviation Pair: {OFAC=[Office of Foreign Assets Control]}

Best answer

Try using java.util.Set to store the matched short forms and long forms. From the Javadoc for that class:

... If this set already contains the element, the call leaves the set unchanged and returns false. In combination with the restriction on constructors, this ensures that sets never contain duplicate elements...
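One way this suggestion might look in practice is sketched below. It is only an illustration under stated assumptions: the question does not show Tokenizer, isValidShortForm or extractBestLongForm, so the versions here are simplified, hypothetical stand-ins (a regex split, an all-uppercase check, and a naive initial-letter matcher), and the class name AbbreviationPairs and the method collect are invented for the example. The idea is that a Set of already-seen short forms prevents a repeated acronym from being processed again, and a Set per acronym keeps its long forms unique.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class AbbreviationPairs {

    // Hypothetical stand-in: any token of two or more uppercase letters is a short form.
    static boolean isValidShortForm(String word) {
        return word.matches("[A-Z]{2,}");
    }

    // Hypothetical stand-in: returns the first run of consecutive words whose
    // initials spell the short form (case-insensitive), or null if there is none.
    static String extractBestLongForm(String shortForm, String line) {
        String[] words = line.split("\\W+");
        int n = shortForm.length();
        for (int start = 0; start + n <= words.length; start++) {
            boolean match = true;
            for (int k = 0; k < n; k++) {
                String w = words[start + k];
                if (w.isEmpty() || Character.toUpperCase(w.charAt(0)) != shortForm.charAt(k)) {
                    match = false;
                    break;
                }
            }
            if (match) {
                return String.join(" ", Arrays.copyOfRange(words, start, start + n));
            }
        }
        return null;
    }

    // Collects each short form once, mapped to the set of its distinct long forms.
    static Map<String, Set<String>> collect(List<String> lines) {
        Map<String, Set<String>> pairs = new LinkedHashMap<>();
        Set<String> seen = new HashSet<>();
        for (String line : lines) {
            for (String word : line.split("\\W+")) {
                // Set.add returns false when the word was already seen, so a
                // repeated acronym is skipped instead of being reported again.
                if (!isValidShortForm(word) || !seen.add(word)) {
                    continue;
                }
                Set<String> longForms = new LinkedHashSet<>();
                for (String candidate : lines) {
                    String longForm = extractBestLongForm(word, candidate);
                    if (longForm != null) {
                        longForms.add(longForm); // duplicates are ignored by the set
                    }
                }
                if (!longForms.isEmpty()) {
                    pairs.put(word, longForms);
                }
            }
        }
        return pairs;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList(
            "The Gramm Leach Bliley Act (GLBA) applies here.",
            "GLBA is enforced by the National Credit Union Administration (NCUA).",
            "NCUA and GLBA are mentioned again.");
        // Print only after the whole document has been scanned.
        for (Map.Entry<String, Set<String>> e : collect(lines).entrySet()) {
            System.out.println("Abbreviation Pair:\t" + e.getKey() + "=" + e.getValue());
        }
    }
}
```

With the sample lines in main, GLBA and NCUA are each printed once even though both occur several times. LinkedHashMap and LinkedHashSet are used only to keep the output in order of first appearance; plain HashMap and HashSet would deduplicate just as well.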

Regarding "java - Removing duplicate key-value pairs from a map whose values are lists", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/40679028/
