
java - Cannot find the correct intersection of two string arrays when the strings contain commas

Reposted. Author: 行者123. Updated: 2023-12-01 21:56:54

I have two CSV files: "userfeatures" and "itemfeatures". Each row in userfeatures relates to one specific user. For example, the first row of the userfeatures file is:

"005c2e08","Action","nm0000148","dir_ nm0764316","USA"

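I need to find the intersection of this row with every row of the second file, "itemfeatures". (In fact, I need to repeat this process for all users, i.e. for every row of "userfeatures".)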

So the first comparison is against the first row of "itemfeatures", which is:

"tt0306047","Comedy,Action","nm0267506,nm0000221,nm0356021","dir_ nm0001878","USA"

The intersection should be ["Action", "USA"], but unfortunately my code only finds ["USA"] as a match. This is what I have tried so far:

import java.io.BufferedReader;
import java.io.FileReader;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

public class Main {
    public static void main(String[] args) throws Exception {
        BufferedReader userfeatures = new BufferedReader(new FileReader("userFeatureVectorsTest.csv"));
        BufferedReader itemfeatures = new BufferedReader(new FileReader("ItemFeatureVectorsTest.csv"));
        ArrayList<String> userlines = new ArrayList<>();
        ArrayList<String> itemlines = new ArrayList<>();
        String Uline = null;
        while ((Uline = userfeatures.readLine()) != null) {
            for (String Iline = itemfeatures.readLine(); Iline != null; Iline = itemfeatures.readLine()) {
                System.out.println(Uline);
                System.out.println(Iline);
                System.out.println(intersect(Uline, Iline));
                System.out.println(union(Uline, Iline));
            }
        }
        userfeatures.close();
        itemfeatures.close();
    }

    // Splits both lines on every comma (quotes are kept) and keeps the common tokens.
    static Set<String> intersect(String Uline, String Iline) {
        Set<String> result = new HashSet<String>(Arrays.asList(Uline.split(",")));
        Set<String> IlineSet = new HashSet<String>(Arrays.asList(Iline.split(",")));
        result.retainAll(IlineSet);
        return result;
    }

    // Splits both lines on every comma (quotes are kept) and merges all tokens.
    static Set<String> union(String Uline, String Iline) {
        Set<String> result = new HashSet<String>(Arrays.asList(Uline.split(",")));
        Set<String> IlineSet = new HashSet<String>(Arrays.asList(Iline.split(",")));
        result.addAll(IlineSet);
        return result;
    }
}

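I think the problem is related to Uline.split(",") and Iline.split(","), because they treat "Comedy,Action" as a single word and therefore cannot find [Action] as the intersection of "Comedy,Action" and "Action". I would appreciate it if anyone knows how to solve this.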
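One way to check this guess is to print the tokens that a plain split(",") produces for the item row. The sketch below is illustrative only: the class name SplitDemo and the helper rawTokens are hypothetical and not part of the original code.

```java
class SplitDemo {
    // Returns the raw tokens of a plain comma split; double quotes are left in place.
    static String[] rawTokens(String line) {
        return line.split(",");
    }

    public static void main(String[] args) {
        // First row of "itemfeatures" as quoted in the question.
        String item = "\"tt0306047\",\"Comedy,Action\",\"nm0267506,nm0000221,nm0356021\","
                + "\"dir_ nm0001878\",\"USA\"";
        // Print one token per line so stray quotes are easy to spot.
        for (String token : rawTokens(item)) {
            System.out.println(token);
        }
    }
}
```

The split does not keep "Comedy,Action" together. It cuts the field at the inner comma, so the pieces each carry only one of the two quotes.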

Best answer

Try removing the double quotes from both strings.

When you split

"tt0306047","Comedy,Action","nm0267506,nm0000221,nm0356021","dir_ nm0001878","USA"

on commas, you get the token

Action"

which will never match the token

"Action"

from the user line.
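A minimal sketch of that suggestion, assuming it is acceptable to strip every double quote before splitting (that is, no field contains a quote character that must be preserved). The class name FeatureSets and the helper tokens are hypothetical, and the user row is written with a leading quote, which is an assumption about the original file.

```java
import java.util.LinkedHashSet;
import java.util.Set;

class FeatureSets {
    // Removes all double quotes, then splits on commas, so a multi-valued
    // field such as "Comedy,Action" contributes Comedy and Action separately.
    static Set<String> tokens(String line) {
        Set<String> result = new LinkedHashSet<>();
        for (String token : line.replace("\"", "").split(",")) {
            String trimmed = token.trim();
            if (!trimmed.isEmpty()) {
                result.add(trimmed);
            }
        }
        return result;
    }

    // Tokens that appear in both lines, in the order of the user line.
    static Set<String> intersect(String uline, String iline) {
        Set<String> result = tokens(uline);
        result.retainAll(tokens(iline));
        return result;
    }

    // All tokens from either line, user tokens first.
    static Set<String> union(String uline, String iline) {
        Set<String> result = tokens(uline);
        result.addAll(tokens(iline));
        return result;
    }

    public static void main(String[] args) {
        String user = "\"005c2e08\",\"Action\",\"nm0000148\",\"dir_ nm0764316\",\"USA\"";
        String item = "\"tt0306047\",\"Comedy,Action\",\"nm0267506,nm0000221,nm0356021\","
                + "\"dir_ nm0001878\",\"USA\"";
        System.out.println(intersect(user, item)); // prints [Action, USA]
    }
}
```

This flattens every row into one bag of values, which matches the expected result in the question but ignores which column a value came from. If the files may contain escaped quotes or the comparison should be done column by column, a real CSV parser would be the safer choice. Separately, the inner loop in the question's main method reads itemfeatures to the end while handling the first user row, so later user rows find no item rows left. Reading the item rows into a list once and iterating over that list for each user row would avoid this.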

Regarding "java - Cannot find the correct intersection of two string arrays when the strings contain commas", a similar question was found on Stack Overflow: https://stackoverflow.com/questions/34158934/
