configuration - Neo4j server hangs consistently every 2 hours. Please help me understand whether something is wrong with the configuration

Reposted. Author: 行者123. Updated: 2023-12-04 07:37:25

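We have a Neo4j graph database with about 60 million nodes and an equivalent number of relationships.

We have been facing continuous packet loss and processing delays, and after about 2 hours the server hangs completely. Every time this happens we have to shut down and restart the server, and we cannot work out what is wrong with our configuration.

We see the following kinds of exceptions in the console.log file -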

  1. java.lang.IllegalStateException: s=DISPATCHED i=true a=null o.e.jetty.server.HttpConnection - HttpConnection@609c1158{FILLING}
  2. java.lang.IllegalStateException: s=DISPATCHED i=true a=null o.e.j.util.thread.QueuedThreadPool
  3. java.lang.IllegalStateException: org.eclipse.jetty.util.SharedBlockingCallback$BlockerTimeoutException
  4. o.e.j.util.thread.QueuedThreadPool - Unexpected thread death: org.eclipse.jetty.util.thread.QueuedThreadPool$3@59d5a975 in qtp1667455214{STARTED,14<=21<=21,i=0,q=58}
  5. org.eclipse.jetty.server.Response - Committed before 500 org.neo4j.server.rest.repr.OutputFormat$1@39beaadf
  6. o.e.jetty.servlet.ServletHandler - /db/data/cypher java.lang.IllegalStateException: Committed at org.eclipse.jetty.server.Response.resetBuffer(Response.java:1253) ~[jetty-server-9.2.
  7. org.eclipse.jetty.server.HttpChannel - /db/data/cypher java.lang.IllegalStateException: Committed at org.eclipse.jetty.server.Response.resetBuffer(Response.java:1253) ~[jetty-server-9.2.
  8. org.eclipse.jetty.server.HttpChannel - Could not send response error 500: java.lang.IllegalStateException: Committed o.e.jetty.server.ServerConnector - Stopped
  9. o.e.jetty.servlet.ServletHandler - /db/data/cypher org.neo4j.graphdb.TransactionFailureException: Transaction was marked as successful, but unable to commit transaction so rolled back.

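We are running Neo4j Enterprise Edition 2.2.5 in single (non-clustered) mode on an Azure D-series machine with an 8-core CPU, 56 GB RAM and Ubuntu 14.04 LTS, with a 500 GB data disk attached.

Here is a snapshot of the neostore file sizes: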

  • 8.5G Oct 2 15:48 neostore.propertystore.db
  • 15G Oct 2 15:48 neostore.relationshipstore.db
  • 2.5G Oct 2 15:48 neostore.nodestore.db
  • 6.9M Oct 2 15:48 neostore.relationshipgroupstore.db
  • 3.7K Oct 2 15:07 neostore.schemastore.db
  • 145 Oct 2 15:07 neostore.labeltokenstore.db
  • 170 Oct 2 15:07 neostore.relationshiptypestore.db

The Neo4j configuration is as follows -

  1. Allocated 30GB to file buffer cache (dbms.pagecache.memory=30G)
  2. Allocated 20GB to JVM heap memory (wrapper.java.initmemory=20480, wrapper.java.maxmemory=20480)
  3. Using the default hpc(High performance) type cache.
  4. Forcing the RULE planner by default (dbms.cypher.planner=RULE)
  5. Maximum threads processing queries is 16(twice the number of cores) - org.neo4j.server.webserver.maxthreads=16
  6. Transaction timeout of 60 seconds - org.neo4j.server.transaction.timeout=60
  7. Guard Timeout if query execution time is greater than 10 seconds - org.neo4j.server.webserver.limit.executiontime=10000

Rest of the settings are default
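
As a rough cross-check of the figures above, the following sketch adds up the neostore file sizes from the listing and sets them against the configured page cache and heap. The helper `to_gib` and the decision to treat every quoted figure as GiB are assumptions made for illustration. It only restates the numbers in the question and does not diagnose the hang.

```python
# Rough memory budget using only figures quoted in the question.
# Assumption: the listed sizes use binary units (K, M, G = 2**10, 2**20, 2**30)
# and the "56 GB", "30G" and 20480 MB figures can all be compared as GiB.

UNITS = {"K": 2**10, "M": 2**20, "G": 2**30}

def to_gib(size: str) -> float:
    """Convert a size such as '8.5G', '6.9M' or '145' (bytes) to GiB."""
    if size[-1] in UNITS:
        return float(size[:-1]) * UNITS[size[-1]] / 2**30
    return float(size) / 2**30  # plain byte count

STORE_FILES = {
    "neostore.propertystore.db": "8.5G",
    "neostore.relationshipstore.db": "15G",
    "neostore.nodestore.db": "2.5G",
    "neostore.relationshipgroupstore.db": "6.9M",
    "neostore.schemastore.db": "3.7K",
    "neostore.labeltokenstore.db": "145",
    "neostore.relationshiptypestore.db": "170",
}

store_total_gib = sum(to_gib(s) for s in STORE_FILES.values())

PAGE_CACHE_GIB = 30.0        # dbms.pagecache.memory=30G
HEAP_GIB = 20480 / 1024      # wrapper.java.maxmemory=20480 (MB)
RAM_GIB = 56.0               # machine RAM as stated in the question

left_for_os_gib = RAM_GIB - PAGE_CACHE_GIB - HEAP_GIB

print(f"store files total : {store_total_gib:.2f} GiB")
print(f"fits in page cache: {store_total_gib < PAGE_CACHE_GIB}")
print(f"left for OS etc.  : {left_for_os_gib:.1f} GiB")
```

On these numbers the listed store files fit inside the page cache, and page cache plus heap leave about 6 GiB for the operating system and everything else on the machine.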

We actually want to set up a 3-node cluster, but before doing that we want to be sure our basic configuration is correct. Please help us.

------------------------------------------------------------------------

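Edited to add query samples

Typically we run about 18K Cypher queries per hour, which averages to roughly 5-6 queries per second. At times there are about 80 queries per second.

Our typical queries look like this: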
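
To put these rates next to the thread limit from the configuration, here is a back-of-the-envelope sketch based on Little's law (concurrency = arrival rate x latency). It assumes, as a simplification, that every in-flight query occupies one of the 16 configured web server threads for its whole duration. The function names are hypothetical, and real Jetty thread usage will differ.

```python
# Back-of-the-envelope capacity check using figures from the question.
# Assumption: one web server thread is held per in-flight query (simplified model).

QUERIES_PER_HOUR = 18_000
PEAK_QPS = 80.0
MAX_THREADS = 16          # org.neo4j.server.webserver.maxthreads=16
GUARD_TIMEOUT_S = 10.0    # org.neo4j.server.webserver.limit.executiontime=10000 (ms)

avg_qps = QUERIES_PER_HOUR / 3600

def max_sustainable_latency(qps: float, threads: int) -> float:
    """Little's law rearranged: the mean latency at which `threads` are just saturated."""
    return threads / qps

def backlog_growth(qps: float, threads: int, latency_s: float) -> float:
    """Requests per second that queue up when demand exceeds thread capacity."""
    capacity_qps = threads / latency_s
    return max(0.0, qps - capacity_qps)

print(f"average rate: {avg_qps:.1f} queries/s")
print(f"latency budget at average rate: {max_sustainable_latency(avg_qps, MAX_THREADS):.1f} s")
print(f"latency budget at peak rate   : {max_sustainable_latency(PEAK_QPS, MAX_THREADS):.1f} s")
print(f"backlog growth if queries run to the guard timeout: "
      f"{backlog_growth(avg_qps, MAX_THREADS, GUARD_TIMEOUT_S):.1f} requests/s")
```

Under this model, queries must average about 0.2 s to keep up with the 80 queries per second peaks, and if queries regularly ran close to the 10 second guard timeout a backlog would build even at the average rate.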

match (a:TypeA {param:{param}})-[:RELA]->(d:TypeD) with distinct d,a skip {skip} limit 100 optional match (d)-[:RELF]->(c:TypeC)<-[:RELF]-(b:TypeB)<-[:RELB]-(a) with distinct d,a,collect(distinct b.bid) as bids,collect(distinct c.param3) as param3Coll optional match (d)-[:RELE]->(p:TypeE)<-[:RELE]-(b1:TypeB)<-[:RELB]-(a) with distinct d as distD,bids+collect(distinct b1.bid) as tbids,param3Coll,collect(distinct p.param4) as param4Coll optional match (distD)-[:RELC]->(f:TypeF) return id(distD),distD.param5,exists((distD)<-[:RELG]-()) as param6, tbids,param3Coll,param4Coll,collect(distinct id(f)) as fids

match (a:TypeA {param:{param}})-[:RELB]->(b) return count(distinct b)

MATCH (a:TypeA{param:{param}})-[r:RELD]->(a1)-[:RELH]->(h) where r.param1=true with a,a1,h match (h)-[:RELL]->(d:TypeI) where (d.param2/2)%2=1 optional match (a)-[:RELB]-(b)-[:RELM {param3:true}]->(c) return a1.param,id(a1),collect(b.bid),c.param5

match (a:TypeA {param:{param}}) match (a)-[:RELB]->(b) with distinct b,a skip {skip} limit 100 match (a)-[:RELH]->(h1:TypeH) match (b)-[:RELF|RELE]->(x)<-[:RELF|RELE]-(h2:TypeH)<-[:RELH]-(a1) optional match (a1)<-[rd:RELD]-(a) with distinct a1,a,h1,b,h2,rd.param1 as param2,collect(distinct x.param3) as param3s,collect(distinct x.param4) as param4s optional match (a1)-[:RELB]->(b1) where b1.param7 in [0,1] and exists((b1)-[:RELF|RELE]->()<-[:RELF|RELE]-(h1)) with distinct a1,a,b,h2,param2,param3s,param4s,b1,case when param2 then false else case when ((a1.param5 in [2,3] or length(param3s)>0) or (a1.param5 in [1,3] or length(param4s)>0)) then case when b1.param7=0 then false else true end else false end end as param8 MERGE (a)-[r2:RELD]->(a1) on create set r2.param6=true on match set r2.param6=case when param8=true and r2.param9=false then true else false end MERGE (b)-[r3:RELM]->(h2) SET r2.param9=param8, r3.param9=param8

MATCH (a:TypeA {param:{param}})-[:RELI]->(g:TypeG {type:'type1'}) match (g)<-[r:RELI]-(a1:TypeA)-[:RELJ]->(j)-[:RELK]->(g) return distinct g, collect(j.displayName), collect(r.param1), g.gid, collect(a1.param),collect(id(a1))

match (a:TypeA {param:{param}})-[r:RELD {param2:true}]->(a1:TypeA)-[:RELH]->(b:TypeE) remove r.param2 return id(a1),b.displayName, b.firstName,b.lastName

match (a:TypeA {param:{param}})-[:RELA]->(b:TypeB) return a.param1,count(distinct id(b))

MATCH (a:TypeA {param:{param}}) set a.param1=true;

match (a:TypeE)<-[r:RELE]-(b:TypeB) where a.param4 in {param4s} delete r return count(b);

MATCH (a:TypeA {param:{param}}) return id(a);


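Adding some strange things I have been noticing...

I have stopped all my web servers, so there are currently no incoming requests to Neo4j. However, I see about 40K open file handles in the TCP CLOSE_WAIT state, which means the clients closed their connections because of timeouts and Neo4j has not yet processed and responded to those requests. I also see (from messages.log) that the Neo4j server is still processing queries, and while it does so the 40K open file handles are slowly decreasing. By the time of writing, about 27K open file handles remain in the TCP CLOSE_WAIT state.

I also see that queries are not processed continuously. Every so often I see a pause in messages.log, along with log rotation messages that mention an out-of-order sequence, as shown below: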
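
One way to track the CLOSE_WAIT count over time on Linux is to parse the kernel's TCP socket table. The sketch below assumes the `/proc/net/tcp` layout (local address in the second column, hexadecimal state code in the fourth) and filters on port 7474, Neo4j's default HTTP port. The function name and the `SAMPLE` rows are made up for illustration. A JVM listening on an IPv6 socket would appear in `/proc/net/tcp6` instead, which uses the same column layout.

```python
from collections import Counter

# Hex state codes used in /proc/net/tcp.
TCP_STATES = {
    "01": "ESTABLISHED", "02": "SYN_SENT", "03": "SYN_RECV",
    "04": "FIN_WAIT1", "05": "FIN_WAIT2", "06": "TIME_WAIT",
    "07": "CLOSE", "08": "CLOSE_WAIT", "09": "LAST_ACK",
    "0A": "LISTEN", "0B": "CLOSING",
}

def count_states(table_text, local_port=None):
    """Count sockets per TCP state, optionally only those bound to local_port."""
    counts = Counter()
    for line in table_text.splitlines()[1:]:        # first line is the header
        fields = line.split()
        if len(fields) < 4:
            continue
        local_address, state = fields[1], fields[3]
        port = int(local_address.rsplit(":", 1)[1], 16)  # port is hex after the colon
        if local_port is not None and port != local_port:
            continue
        counts[TCP_STATES.get(state, state)] += 1
    return counts

# Made-up sample in /proc/net/tcp format (0x1D32 == 7474, 0x0016 == 22).
SAMPLE = """\
  sl  local_address rem_address   st tx_queue rx_queue tr tm->when retrnsmt   uid  timeout inode
   0: 00000000:1D32 00000000:0000 0A 00000000:00000000 00:00000000 00000000   106        0 1001
   1: 0400000A:1D32 0500000A:C350 08 00000000:00000000 00:00000000 00000000   106        0 1002
   2: 0400000A:1D32 0500000A:C351 08 00000000:00000000 00:00000000 00000000   106        0 1003
   3: 0400000A:0016 0500000A:C352 01 00000000:00000000 00:00000000 00000000     0        0 1004
"""

print(count_states(SAMPLE, local_port=7474))
```

On a live host the same function could be fed `open("/proc/net/tcp").read()` at intervals to see whether the CLOSE_WAIT backlog is draining or growing.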

Rotating log version:5630

2015-10-04 05:10:42.712+0000 INFO [o.n.k.LogRotationImpl]: Log Rotation [5630]: Awaiting all transactions closed...

2015-10-04 05:10:42.712+0000 INFO [o.n.k.i.s.StoreFactory]: Waiting for all transactions to close...

committed: out-of-order-sequence:95494483 [95494476]

committing: 95494483

closed: out-of-order-sequence:95494480 [95494246]

2015-10-04 05:10:43.293+0000 INFO [o.n.k.LogRotationImpl]: Log Rotation [5630]: Starting store flush...

2015-10-04 05:10:44.941+0000 INFO [o.n.k.i.s.StoreFactory]: About to rotate counts store at transaction 95494483 to [/datadrive/graph.db/neostore.counts.db.b], from [/datadrive/graph.db/neostore.counts.db.a].

2015-10-04 05:10:44.944+0000 INFO [o.n.k.i.s.StoreFactory]: Successfully rotated counts store at transaction 95494483 to [/datadrive/graph.db/neostore.counts.db.b], from [/datadrive/graph.db/neostore.counts.db.a].

I also see these messages occasionally:

2015-10-04 04:59:59.731+0000 DEBUG [o.n.k.EmbeddedGraphDatabase]: NodeCache array:66890956 purge:93 size:1.3485746GiB misses:0.80978173% collisions:1.9829895% (345785) av.purge waits:13 purge waits:0 avg. purge time:110ms

2015-10-04 05:10:20.768+0000 DEBUG [o.n.k.EmbeddedGraphDatabase]: RelationshipCache array:66890956 purge:0 size:257.883MiB misses:10.522135% collisions:11.121769% (5442101) av.purge waits:0 purge waits:0 avg. purge time:N/A

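All of this happens while there are no incoming requests and Neo4j is working through the old backlog of 40K pending requests, as mentioned above.

Since this is a dedicated server, shouldn't it process queries continuously without building up such a large pending queue? Am I missing something here? Please help.

Best answer

This does not completely answer your question. You should check each of your frequently sent queries by prefixing it with PROFILE or EXPLAIN, to see the query plan and get an idea of how many database accesses it causes.

For example, the second match in the following query looks expensive, because the two patterns are not connected to each other: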

MATCH (a:TypeA{param:{param}})-[r:RELD]->(a1)-[:RELH]->(h) where    r.param1=true with a,a1,h match (m)-[:RELL]->(d:TypeI) where (d.param2/2)%2=1 optional match (a)-[:RELB]-(b)-[:RELM {param3:true}]->(c)  return a1.param,id(a1),collect(b.bid),c.bPhoto

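Also enable garbage collection logging in neo4j-wrapper.conf and check whether you are suffering from long pauses. If so, consider reducing the heap size.

Regarding "configuration - Neo4j server hangs consistently every 2 hours. Please help me understand whether something is wrong with the configuration", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/32911395/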
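
Once GC logging is on, long pauses can be picked out of the log with a few lines of Python. This is a minimal sketch that assumes a HotSpot JVM started with `-XX:+PrintGCApplicationStoppedTime`, which writes lines containing "Total time for which application threads were stopped". The sample log lines, the function name and the 1 second threshold are illustrative assumptions, not output from the server in the question.

```python
import re

# Line emitted by HotSpot when -XX:+PrintGCApplicationStoppedTime is enabled.
STOPPED = re.compile(
    r"Total time for which application threads were stopped: ([0-9.]+) seconds"
)

def long_pauses(log_text, threshold_s=1.0):
    """Return (seconds, line) pairs for stop-the-world pauses at or above threshold_s."""
    pauses = []
    for line in log_text.splitlines():
        match = STOPPED.search(line)
        if match and float(match.group(1)) >= threshold_s:
            pauses.append((float(match.group(1)), line.strip()))
    return pauses

# Illustrative sample only; real logs carry additional GC detail lines.
SAMPLE_LOG = """\
2015-10-04T05:10:40.100+0000: 7200.100: Total time for which application threads were stopped: 0.0004210 seconds
2015-10-04T05:10:42.712+0000: 7202.712: Total time for which application threads were stopped: 12.3456789 seconds
2015-10-04T05:10:55.300+0000: 7215.300: Total time for which application threads were stopped: 0.0150000 seconds
"""

for seconds, line in long_pauses(SAMPLE_LOG):
    print(f"{seconds:.2f} s pause: {line}")
```

Pauses of several seconds that line up with the gaps seen in messages.log would support the suggestion to revisit the heap size.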
