
java - apache.commons.text cosine distance

Reposted. Author: 行者123. Updated: 2023-11-30 10:33:10

I am trying to use the cosine distance class from Apache Commons Text, but it always returns 1.0. Am I missing something? Here is my code:

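import org.apache.commons.text.similarity.CosineDistance;

public class ComputeDistance {
    public static void main(String[] args) throws Exception {
        CosineDistance dist = new CosineDistance();
        CharSequence c1 = "example text1";
        CharSequence c2 = "another file";
        // Prints 1.0 for these two inputs (the behavior being asked about)
        System.out.println(dist.apply(c1, c2));
    }
}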

Best answer

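CosineDistance returns 1 - cosineSimilarity(leftVector, rightVector), where leftVector and rightVector map each word in the corresponding character sequence to its number of occurrences. Your two strings have no word in common, so cosineSimilarity(leftVector, rightVector) = 0 and the distance is 1.0. You can change your code to build the vectors from the characters of your strings instead of the words: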

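import java.util.Arrays;
import java.util.Map;
import java.util.stream.Collectors;

import org.apache.commons.text.similarity.CosineSimilarity;

public class ComputeDistance {
    public static void main(String[] args) throws Exception {
        CosineSimilarity similarity = new CosineSimilarity();

        String c1 = "example text1";
        String c2 = "another file";

        // split("") yields one token per character; count how often each one occurs
        Map<CharSequence, Integer> leftVector =
                Arrays.stream(c1.split(""))
                        .collect(Collectors.toMap(c -> c, c -> 1, Integer::sum));
        Map<CharSequence, Integer> rightVector =
                Arrays.stream(c2.split(""))
                        .collect(Collectors.toMap(c -> c, c -> 1, Integer::sum));

        // Distance = 1 - similarity, now computed over character counts
        System.out.println(1 - similarity.cosineSimilarity(leftVector, rightVector));
    }
}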
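
To see where the 1.0 comes from without pulling in the library, here is a minimal, dependency-free sketch of the same calculation. The class `CosineSketch` and its helpers `counts`, `norm`, and `cosineSimilarity` are hypothetical names made up for illustration, and splitting on `\W+` only approximates the word tokenization in Commons Text, so treat it as an illustration rather than the library's actual code.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.stream.Collectors;

// Hypothetical helper class, not part of Apache Commons Text.
class CosineSketch {

    // Split the text with the given regex and count each non-empty token.
    static Map<String, Integer> counts(String text, String splitRegex) {
        return Arrays.stream(text.split(splitRegex))
                .filter(token -> !token.isEmpty())
                .collect(Collectors.toMap(token -> token, token -> 1, Integer::sum));
    }

    // Euclidean length of a count vector.
    static double norm(Map<String, Integer> vector) {
        double sumOfSquares = 0;
        for (int count : vector.values()) {
            sumOfSquares += (double) count * count;
        }
        return Math.sqrt(sumOfSquares);
    }

    // Dot product over the shared keys, divided by the product of the norms.
    static double cosineSimilarity(Map<String, Integer> left, Map<String, Integer> right) {
        Set<String> shared = new HashSet<>(left.keySet());
        shared.retainAll(right.keySet());
        double dot = 0;
        for (String key : shared) {
            dot += (double) left.get(key) * right.get(key);
        }
        double normLeft = norm(left);
        double normRight = norm(right);
        // Assumption of this sketch: an empty vector is treated as similarity 0.
        return (normLeft == 0 || normRight == 0) ? 0.0 : dot / (normLeft * normRight);
    }

    public static void main(String[] args) {
        String c1 = "example text1";
        String c2 = "another file";

        // Word-level tokens: the two strings share no word.
        double wordDistance =
                1 - cosineSimilarity(counts(c1, "\\W+"), counts(c2, "\\W+"));
        // Character-level tokens: the strings share several characters.
        double charDistance =
                1 - cosineSimilarity(counts(c1, ""), counts(c2, ""));

        System.out.println("word-level distance: " + wordDistance);
        System.out.println("char-level distance: " + charDistance);
    }
}
```

With word tokens the two strings share no keys, so the dot product is 0 and the distance is 1.0. With character tokens they share e, a, l, t and the space, so the dot product is positive and the distance drops below 1.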

Regarding java - apache.commons.text cosine distance, we found a similar question on Stack Overflow: https://stackoverflow.com/questions/42362054/
