
java - Correct use of boolean logic in Lucene

Reposted. Author: 行者123. Updated: 2023-11-30 07:08:39

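I apologise for this question, but it has me a bit confused.

First, I have a set of address objects, and I am trying to find related address objects with a query that (in pseudocode) looks something like this: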

SELECT
WHERE
Fuzzy(addr1, "address line 1") // = true
AND
(Fuzzy(addr2, "address line 2") OR
Fuzzy(addrcity, "address city") OR
//all the other address fields
)

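Essentially, I want to return every entity where address line one matches at least roughly and at least one other part of the address also matches fuzzily.

I have verified that the data exists with this query: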

Query toRun = new FuzzyQuery(new Term("addr1", getLineOne()));

This returns documents with all the correct fields.

My code is as follows:

public List<Address> search() {
    List<Address> results = new ArrayList<>();

    BooleanQuery.Builder queryBuilder = new BooleanQuery.Builder();
    queryBuilder.setMinimumNumberShouldMatch(2);

    BooleanQuery.Builder subQueryBuilder = new BooleanQuery.Builder();
    subQueryBuilder.setMinimumNumberShouldMatch(1);

    if(!getLineOne().equals("")) {
        Query query = new FuzzyQuery(new Term("addr1", getLineOne()));
        queryBuilder.add(query, BooleanClause.Occur.MUST);
    }

    if(!getLineTwo().equals("")) {
        Query query = new FuzzyQuery(new Term("addr2", getLineTwo()));
        subQueryBuilder.add(query, BooleanClause.Occur.SHOULD);
    }
    if(!getCity().equals("")) {
        Query query = new FuzzyQuery(new Term("addrcity", getCity()));
        subQueryBuilder.add(query, BooleanClause.Occur.SHOULD);
    }
    if(!getCounty().equals("")) {
        Query query = new FuzzyQuery(new Term("addrcounty", getCounty()));
        subQueryBuilder.add(query, BooleanClause.Occur.SHOULD);
    }
    if(!getCountry().equals("")) {
        Query query = new FuzzyQuery(new Term("addrcountry", getCountry()));
        subQueryBuilder.add(query, BooleanClause.Occur.SHOULD);
    }
    if(!getPostcode().equals("")) {
        Query query = new FuzzyQuery(new Term("addrpostcode", getPostcode()));
        subQueryBuilder.add(query, BooleanClause.Occur.SHOULD);
    }

    queryBuilder.add(subQueryBuilder.build(), BooleanClause.Occur.MUST);

    try {
        Query toRun = queryBuilder.build();

        List<Document> searchResults = SearchEngine.getInstance(SEARCH_ENGINE)
                .performSearch(toRun, 50);

        searchResults.forEach(result -> {
            results.add(new Address(result));
        });
    } catch (IOException e) {
        e.printStackTrace();
    }

    return results;
}

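When the object is given line one, line two, and a country, this produces a query whose text form looks like this:

(+addr1:address line 1~2 +((addr2:address line 2~2 addrcountry:Romania~2)~1))~2

This query returns nothing.

Where has my logic gone wrong?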

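Best answer

You need to get rid of the first setMinimumNumberShouldMatch call.

setMinimumNumberShouldMatch specifies how many SHOULD clauses must match. Your queryBuilder has no SHOULD clauses (both of its clauses are MUST), so it can never match two of them, and you get no results.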
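To make that rule concrete, here is a minimal sketch in plain Java that models how a BooleanQuery decides whether a single document matches. It is not Lucene code: the class BooleanMatchSketch and its matches method are hypothetical names made up for illustration, and the model ignores MUST_NOT and FILTER clauses as well as scoring.

```java
import java.util.List;

// Toy model of BooleanQuery matching for ONE document. Illustration only, not Lucene's implementation.
class BooleanMatchSketch {

    /**
     * @param must           whether each MUST clause matched the document
     * @param should         whether each SHOULD clause matched the document
     * @param minShouldMatch the value given to setMinimumNumberShouldMatch (0 if never set)
     */
    static boolean matches(List<Boolean> must, List<Boolean> should, int minShouldMatch) {
        // Every MUST clause has to match.
        if (must.contains(false)) {
            return false;
        }
        long shouldHits = should.stream().filter(hit -> hit).count();
        // With no MUST clauses, at least one SHOULD clause is required by default.
        int required = (must.isEmpty() && minShouldMatch == 0) ? 1 : minShouldMatch;
        // MUST clauses never count towards the SHOULD minimum.
        return shouldHits >= required;
    }

    public static void main(String[] args) {
        // Outer query from the question: two MUST clauses (addr1 and the sub-query),
        // zero SHOULD clauses, minimum of 2. It can never match.
        System.out.println(matches(List.of(true, true), List.of(), 2)); // prints false

        // Same clauses with the minimum removed.
        System.out.println(matches(List.of(true, true), List.of(), 0)); // prints true

        // Single-query form: addr1 as MUST, the other fields as SHOULD, minimum of 1.
        System.out.println(matches(List.of(true), List.of(false, true, false), 1)); // prints true
    }
}
```

The first call mirrors the failing query: both MUST clauses match, but zero SHOULD hits can never reach a minimum of two.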

Simply removing both setMinimumNumberShouldMatch calls would give you a working query. Alternatively, you can keep the minimumShouldMatch logic and simplify to a single BooleanQuery, like so:

public List<Address> search() {
    List<Address> results = new ArrayList<>();

    BooleanQuery.Builder queryBuilder = new BooleanQuery.Builder();
    queryBuilder.setMinimumNumberShouldMatch(1);

    if(!getLineOne().equals("")) {
        //This is a MUST clause, and so doesn't factor into the minimumShouldMatch
        Query query = new FuzzyQuery(new Term("addr1", getLineOne()));
        queryBuilder.add(query, BooleanClause.Occur.MUST);
    }

    if(!getLineTwo().equals("")) {
        Query query = new FuzzyQuery(new Term("addr2", getLineTwo()));
        queryBuilder.add(query, BooleanClause.Occur.SHOULD);
    }
    if(!getCity().equals("")) {
        Query query = new FuzzyQuery(new Term("addrcity", getCity()));
        queryBuilder.add(query, BooleanClause.Occur.SHOULD);
    }
    if(!getCounty().equals("")) {
        Query query = new FuzzyQuery(new Term("addrcounty", getCounty()));
        queryBuilder.add(query, BooleanClause.Occur.SHOULD);
    }
    if(!getCountry().equals("")) {
        Query query = new FuzzyQuery(new Term("addrcountry", getCountry()));
        queryBuilder.add(query, BooleanClause.Occur.SHOULD);
    }
    if(!getPostcode().equals("")) {
        Query query = new FuzzyQuery(new Term("addrpostcode", getPostcode()));
        queryBuilder.add(query, BooleanClause.Occur.SHOULD);
    }

    try {
        Query toRun = queryBuilder.build();

        List<Document> searchResults = SearchEngine.getInstance(SEARCH_ENGINE)
                .performSearch(toRun, 50);

        searchResults.forEach(result -> {
            results.add(new Address(result));
        });
    } catch (IOException e) {
        e.printStackTrace();
    }

    return results;
}

Regarding java - correct use of boolean logic in Lucene, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/39598795/
