regex - Neo4j regular expression string matching not returning expected results

Reposted. Author: 行者123. Updated: 2023-12-03 09:23:28

I am trying to use regular expression matching in Cypher with Neo4j 2.1.5, but I have run into a problem.

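I need to implement full-text search on specific fields that a user has access to. The access requirement is key: it is what prevents me from just dumping everything into a Lucene instance and querying it that way. The access system is dynamic, so I need to query for the set of nodes a particular user can access and then perform the search within those nodes. I would really like to match the set of nodes against a Lucene query, but I can't figure out how to do that, so for now I am just using basic regular expression matching. My problem is that Neo4j doesn't always return the expected results.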

For example, I have about 200 nodes, one of which is the following:

( i:node {name: "Linear Glass Mosaic Tiles", description: "Introducing our new Rip Curl linear glass mosaic tiles. This Caribbean color combination of greens and blues brings a warm inviting feeling to a kitchen backsplash or bathroom. The colors work very well with white cabinetry or larger tiles. We also carry this product in a small subway mosaic to give you some options! SOLD OUT: Back in stock end of August. Call us to pre-order and save 10%!"})

This query produces one result:

MATCH (p)-->(:group)-->(i:node)
WHERE (i.name =~ "(?i).*mosaic.*")
RETURN i

> Returned 1 row in 569 ms

But this query produces zero results, even though the description property matches the expression:

MATCH (p)-->(:group)-->(i:node)
WHERE (i.description=~ "(?i).*mosaic.*")
RETURN i

> Returned 0 rows in 601 ms

This query also produces zero results, even though it includes the name property that returned a result before:

MATCH (p)-->(:group)-->(i:node)
WITH i, (p.name + i.name + COALESCE(i.description, "")) AS searchText
WHERE (searchText =~ "(?i).*mosaic.*")
RETURN i

> Returned 0 rows in 487 ms

MATCH (p)-->(:group)-->(i:node)
WITH i, (p.name + i.name + COALESCE(i.description, "")) AS searchText
RETURN searchText

>
...
SotoLinear Glass Mosaic Tiles Introducing our new Rip Curl linear glass mosaic tiles. This Caribbean color combination of greens and blues brings a warm inviting feeling to a kitchen backsplash or bathroom. The colors work very well with white cabinetry or larger tiles. We also carry this product in a small subway mosaic to give you some options! SOLD OUT: Back in stock end of August. Call us to pre-order and save 10%!
...

mosaic

Stranger still, if I search for a different term, it returns all of the expected results without any problem.

MATCH (p)-->(:group)-->(i:node)
WITH i, (p.name + i.name + COALESCE(i.description, "")) AS searchText
WHERE (searchText =~ "(?i).*plumbing.*")
RETURN i

> Returned 8 rows in 522 ms

I then tried caching the search text on the nodes and adding an index to see whether that would change anything, but it still produced no results.

CREATE INDEX ON :node(searchText)

MATCH (p)-->(:group)-->(i:node)
WHERE (i.searchText =~ "(?i).*mosaic.*")
RETURN i

> Returned 0 rows in 3182 ms

I then tried to simplify the data to reproduce the problem, but in this simple case it works as expected:

MERGE (i:node {name: "Linear Glass Mosaic Tiles", description: "Introducing our new Rip Curl linear glass mosaic tiles. This Caribbean color combination of greens and blues brings a warm inviting feeling to a kitchen backsplash or bathroom. The colors work very well with white cabinetry or larger tiles. We also carry this product in a small subway mosaic to give you some options! SOLD OUT: Back in stock end of August. Call us to pre-order and save 10%!"})

WITH i, (
i.name + " " + COALESCE(i.description, "")
) AS searchText

WHERE searchText =~ "(?i).*mosaic.*"
RETURN i

> Returned 1 rows in 630 ms

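I also tried the CYPHER 2.1.EXPERIMENTAL tag, but that didn't change any of the results. Am I making incorrect assumptions about how regular expression support works? Is there something else I should try, or another way to debug the problem?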

Additional information

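Here is a sample call I make to the Cypher Transactional REST API when creating nodes. This is the actual plain text sent when adding a node to the database (apart from some formatting for readability). Any string encoding is just the standard URL encoding that Go performs when creating a new HTTP request.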

{"statements":[
{
"parameters":
{
"p01":"lsF30nP7TsyFh",
"p02":
{
"description":"Introducing our new Rip Curl linear glass mosaic tiles. This Caribbean color combination of greens and blues brings a warm inviting feeling to a kitchen backsplash or bathroom. The colors work very well with white cabinetry or larger tiles. We also carry this product in a small subway mosaic to give you some options! SOLD OUT: Back in stock end of August. Call us to pre-order and save 10%!",
"id":"lsF3BxzFdn0kj",
"name":"Linear Glass Mosaic Tiles",
"object":"material"
}
},
"resultDataContents":["row"],
"statement":
"MATCH (p:project { id: { p01 } })
WITH p

CREATE UNIQUE (p)-[:MATERIAL]->(:materials:group {name: \"Materials\"})-[:MATERIAL]->(m:material { p02 })"
}
]}

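If it is an encoding problem, why does searching name work, while description does not and name + description does not either? Is there any way to inspect the database to see whether and how the data is encoded? When I perform searches, the returned text displays correctly.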

Best answer

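Just a few notes:

  • You should probably replace create unique with merge (which works slightly differently)
  • For full-text search I would go with the Lucene legacy index for performance, if your group restriction is not enough to keep response times under a few milliseconds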

I just tried your exact JSON statement, and it works perfectly.

Insert:

curl -H accept:application/json -H content-type:application/json -d @insert.json \
-XPOST http://localhost:7474/db/data/transaction/commit

json:

{"statements":[
{
"parameters":
{
"p01":"lsF30nP7TsyFh",
"p02":
{
"description":"Introducing our new Rip Curl linear glass mosaic tiles. This Caribbean color combination of greens and blues brings a warm inviting feeling to a kitchen backsplash or bathroom. The colors work very well with white cabinetry or larger tiles. We also carry this product in a small subway mosaic to give you some options! SOLD OUT: Back in stock end of August. Call us to pre-order and save 10%!",
"id":"lsF3BxzFdn0kj",
"name":"Linear Glass Mosaic Tiles",
"object":"material"
}
},
"resultDataContents":["row"],
"statement":
"MERGE (p:project { id: { p01 } })
WITH p

CREATE UNIQUE (p)-[:MATERIAL]->(:materials:group {name: \"Materials\"})-[:MATERIAL]->(m:material { p02 }) RETURN m"
}
]}

Query:

MATCH (p)-->(:group)-->(i:material)
WHERE (i.description=~ "(?i).*mosaic.*")
RETURN i

Returns:

name:   Linear Glass Mosaic Tiles
id: lsF3BxzFdn0kj
description: Introducing our new Rip Curl linear glass mosaic tiles. This Caribbean color combination of greens and blues brings a warm inviting feeling to a kitchen backsplash or bathroom. The colors work very well with white cabinetry or larger tiles. We also carry this product in a small subway mosaic to give you some options! SOLD OUT: Back in stock end of August. Call us to pre-order and save 10%!
object: material

One way to check your data is to look at the JSON or CSV dumps that the browser offers (the small download icons on the result and table result).

Or use neo4j-shell with my shell-import-tools to actually output CSV or GraphML, and check those files.

Or use some Java (or Groovy) code to inspect your data.
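As a rough illustration of that kind of check, the sketch below makes invisible characters in a property value visible. It is plain Java and does not touch Neo4j at all; the class name StringInspector and the helper escapeInvisible are hypothetical, and the assumption is that you have already fetched the description string by some other means (for example from the JSON dump).

```java
// Hypothetical helper, not part of any Neo4j API: it only post-processes a
// string that was already read from the database by other means.
final class StringInspector {

    // Returns the text with every control character and every non-ASCII
    // character replaced by a four-digit hexadecimal Unicode escape, so that
    // line breaks, tabs and similar characters become visible.
    static String escapeInvisible(String text) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (c < 0x20 || c > 0x7E) {
                // The cast to int is needed because %x does not accept a char.
                out.append(String.format("\\u%04x", (int) c));
            } else {
                out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Made-up sample value containing a Windows-style line break.
        String sample = "glass mosaic tiles.\r\nSOLD OUT";
        System.out.println(escapeInvisible(sample));
    }
}
```

If the printed value shows escapes for a carriage return or line feed (000d or 000a) in the middle of the description, the stored text spans several lines, which matters for how the dot in a regular expression behaves.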

The neo4j-enterprise download also ships with a consistency checker. Here is a blog post on how to run it.

java -cp 'lib/*:system/lib/*' org.neo4j.consistency.ConsistencyCheckTool /tmp/foo

I added a Groovy test script here: https://gist.github.com/jexp/5a183c3501869ee63d30

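Another idea: regular expression flags

Sometimes the text spans multiple lines, and there are two more flags for that:

  • multiline (?m), which makes ^ and $ match at the start and end of each line,
  • dotall (?s), which lets the dot also match line terminators such as newlines

So you could try (?ism).*mosaic.*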
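Since Cypher's =~ is backed by Java regular expressions and has to match the whole string, the effect of these flags can be tried outside the database. The following is a minimal sketch in plain Java; the class name RegexFlagDemo, the helper fullMatch and the sample strings are made up for illustration, and the line break in the second sample is an assumption about what the stored description might contain.

```java
import java.util.regex.Pattern;

// Hypothetical demo class; it uses only java.util.regex from the JDK.
final class RegexFlagDemo {

    // Pattern.matches succeeds only if the entire input matches the regex,
    // which is why the patterns in the question are wrapped in ".*".
    static boolean fullMatch(String regex, String text) {
        return Pattern.matches(regex, text);
    }

    public static void main(String[] args) {
        String oneLine = "linear glass mosaic tiles";
        String twoLines = "linear glass mosaic tiles\nSOLD OUT";

        // No line break: the plain case-insensitive pattern matches.
        System.out.println(fullMatch("(?i).*mosaic.*", oneLine));   // true

        // With a line break, "." stops at the newline, so the whole-string
        // match fails even though the word is present.
        System.out.println(fullMatch("(?i).*mosaic.*", twoLines));  // false

        // DOTALL (?s) lets "." cross the newline, so the match succeeds.
        System.out.println(fullMatch("(?is).*mosaic.*", twoLines)); // true

        // MULTILINE (?m) only changes ^ and $, so on its own it does not help.
        System.out.println(fullMatch("(?im).*mosaic.*", twoLines)); // false
    }
}
```

In this sketch it is the s flag that makes the difference for a ".*term.*" style pattern; adding m as well is harmless but does not change the result.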

A similar question about regex - Neo4j regular expression string matching not returning expected results can be found on Stack Overflow: https://stackoverflow.com/questions/26571379/
