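I am trying to insert about 2 million nodes into Neo4j, but I am running into performance problems.
I am using Neo4j Enterprise 2.2.0 with a server extension written in Java. My machine has an SSD, 32 GB of RAM, and an Intel Core i7 CPU, and it runs Windows 8. I run a standalone server and start it with Neo4j.bat from the bin folder.
Right now, inserting 10,000 nodes without relationships takes about 25 seconds (I will need to add relationships later, but one problem at a time).
I assumed this was a configuration issue, so I tried a number of settings, but performance did not change. What I find odd is that the Java process never allocates more than 3 GB, even though I set initmemory and maxmemory to 15000 in neo4j-wrapper.conf.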
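One way to check whether the heap settings are actually reaching the server JVM is to log the maximum heap from inside the running process. The following is only a minimal sketch using the standard `Runtime` API; the class name `HeapCheck` and its helper methods are hypothetical and not part of Neo4j.

```java
public class HeapCheck {
    /** Converts a byte count to whole megabytes. */
    static long toMegabytes(long bytes) {
        return bytes / (1024L * 1024L);
    }

    /** Maximum heap the running JVM will attempt to use, in megabytes. */
    static long maxHeapMegabytes() {
        return toMegabytes(Runtime.getRuntime().maxMemory());
    }

    public static void main(String[] args) {
        // Compare this figure with the maxmemory value in neo4j-wrapper.conf.
        System.out.println("Max heap: " + maxHeapMegabytes() + " MB");
    }
}
```

Calling `maxHeapMegabytes()` from the server extension and comparing the result with the configured value would show whether the wrapper settings are being applied to the process that Neo4j.bat starts.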
// One transaction per node: every iteration pays the full commit cost.
for (Thing t : things) {
    List<ValuePair> properties = parseThing(t);
    String uid = createUid(t);
    try (Transaction tx = graphDb.beginTx()) {
        Node node = graphDb.createNode();
        node.setProperty("uid", uid);
        for (ValuePair vp : properties) {
            node.setProperty(vp.getName(), vp.getValue());
        }
        tx.success();
    }
}
# Neo4j
# neo4j.properties - database tuning parameters
# Enable this to be able to upgrade a store from an older version.
# The amount of memory to use for mapping the store files, in bytes (or
# kilobytes with the 'k' suffix, megabytes with 'm' and gigabytes with 'g').
# If Neo4j is running on a dedicated server, then it is generally recommended
# to leave about 2-4 gigabytes for the operating system, give the JVM enough
# heap to hold all your transaction state and query context, and then leave the
# rest for the page cache.
# The default page cache memory assumes the machine is dedicated to running
# Neo4j, and is heuristically set to 75% of RAM minus the max Java heap size.
# Enable this to specify a parser other than the default one.
# Keep logical logs, helps debugging but uses more disk space, enabled for
# legacy reasons. To limit space needed to store historical logs, use values such
# as: "7 days" or "100M size" instead of "true".
#keep_logical_logs=7 days
# Autoindexing
# Enable auto-indexing for nodes, default is false.
# The node property keys to be auto-indexed, if enabled.
# Enable auto-indexing for relationships, default is false.
# The relationship property keys to be auto-indexed, if enabled.
# Enable shell server so that remote clients can connect via Neo4j shell.
# The network interface IP the shell will listen on (use 0.0.0.0 for all interfaces).
# The port the shell will listen on, default is 1337.
# The type of cache to use for nodes and relationships.
# Maximum size of the heap memory to dedicate to the cached nodes.
# Maximum size of the heap memory to dedicate to the cached relationships.
# Enable online backups to be taken from this database.
# Port to listen to for incoming backup requests.
# Uncomment and specify these lines for running Neo4j in High Availability mode.
# See the High availability setup tutorial for more details on these settings
# http://neo4j.com/docs/2.2.0/ha-setup-tutorial.html
# ha.server_id is the number of each instance in the HA cluster. It should be
# an integer (e.g. 1), and should be unique for each cluster instance.
# ha.initial_hosts is a comma-separated list (without spaces) of the host:port
# where the ha.cluster_server of all instances will be listening. Typically
# this will be the same for all cluster instances.
# IP and port for this instance to listen on, for communicating cluster status
# information with other instances (also see ha.initial_hosts). The IP
# must be the configured IP address for one of the local interfaces.
# IP and port for this instance to listen on, for communicating transaction
# data with other instances (also see ha.initial_hosts). The IP
# must be the configured IP address for one of the local interfaces.
# The interval at which slaves will pull updates from the master. Comment out
# the option to disable periodic pulling of updates. Unit is seconds.
# Amount of slaves the master will try to push a transaction to upon commit
# (default is 1). The master will optimistically continue and not fail the
# transaction even if it fails to reach the push factor. Setting this to 0 will
# increase write performance when writing through master but could potentially
# lead to branched data (or loss of transaction) if the master goes down.
# Strategy the master will use when pushing data to slaves (if the push factor
# is greater than 0). There are two options available "fixed" (default) or
# "round_robin". Fixed will start by pushing to slaves ordered by server id
# (highest first) improving performance since the slaves only have to cache up
# one transaction at a time.
# Policy for how to handle branched data.
# Clustering timeouts
# Default timeout.
# How often heartbeat messages should be sent. Defaults to ha.default_timeout.
# Timeout for heartbeats between cluster members. Should be at least twice that of ha.heartbeat_interval.
# Neo4j
# neo4j-server.properties - runtime operational settings
# Server configuration
# location of the database directory
# Low-level graph engine tuning file
# Database mode
# Allowed values:
# HA - High Availability
# SINGLE - Single mode, default.
# To run in High Availability mode, configure the neo4j.properties config file, then uncomment this line:
# Let the webserver only listen on the specified IP. Default is localhost (only
# accept local connections). Uncomment to allow any connection. Please see the
# security section in the neo4j manual before modifying this.
# Require (or disable the requirement of) auth to access Neo4j
# HTTP Connector
# http port (for all data, administrative, and UI access)
# HTTPS Connector
# Turn https-support on/off
# https port (for all data, administrative, and UI access)
# Certificate location (auto generated if the file does not exist)
# Private key location (auto generated if the file does not exist)
# Internally generated keystore (don't try to put your own
# keystore there, it will get deleted when the server starts)
# Comma separated list of JAX-RS packages containing JAX-RS resources, one
# package name for each mountpoint. The listed package names will be loaded
# under the mountpoints specified. Uncomment this line to mount the
# org.neo4j.examples.server.unmanaged.HelloWorldResource.java from
# neo4j-server-examples under /examples/unmanaged, resulting in a final URL of
# http://localhost:7474/examples/unmanaged/helloworld/{nodeId}
# HTTP logging configuration
# HTTP logging is disabled. HTTP logging can be enabled by setting this
# property to 'true'.
# Logging policy file that governs how HTTP log output is presented and
# archived. Note: changing the rollover and retention policy is sensible, but
# changing the output format is less so, since it is configured to use the
# ubiquitous common log format
# Administration client configuration
# location of the server's round-robin database directory. possible values:
# - absolute path like /var/rrd
# - path relative to the server working directory like data/rrd
# - commented out, will default to the database data directory.
# Property file references
# JVM Parameters
# Remote JMX monitoring, uncomment and adjust the following lines as needed.
# Also make sure to update the jmx.access and jmx.password files with appropriate permission roles and passwords,
# the shipped configuration contains only a read only role called 'monitor' with password 'Neo4j'.
# For more details, see: http://download.oracle.com/javase/7/docs/technotes/guides/management/agent.html
# On Unix based systems the jmx.password file needs to be owned by the user that will run the server,
# and have permissions set to 0600.
# For details on setting these file permissions on Windows see:
# http://docs.oracle.com/javase/7/docs/technotes/guides/management/security-windows.html
# Some systems cannot discover host name automatically, and need this line configured:
# Uncomment the following lines to enable garbage collection logging
# Java Heap Size: by default the Java heap size is dynamically
# calculated based on available system resources.
# Uncomment these lines to set specific initial and maximum
# heap size in MB.
# Wrapper settings
# path is relative to the bin dir
# Wrapper Windows NT/2000/XP Service Properties
# WARNING - Do not modify any of these properties when an application
# using this configuration file has been installed as a service.
# Please uninstall the service before modifying this section. The
# service can then be reinstalled.
# Name of the service
# User account to be used for linux installs. Will default to current
# user if not set.
# Other Neo4j system properties
wrapper.java.additional=-agentlib:jdwp=transport=dt_socket,server=y,suspend=n,address=5005 -Xdebug -Xnoagent -Djava.compiler=NONE -Xrunjdwp:transport=dt_socket,server=y,suspend=n,address=5005
// Open a single transaction around the whole loop instead of one per node.
try (Transaction tx = graphDb.beginTx()) {
    for (Thing t : things) {
        List<ValuePair> properties = parseThing(t);
        String uid = createUid(t);
        Node node = graphDb.createNode();
        node.setProperty("uid", uid);
        for (ValuePair vp : properties) {
            node.setProperty(vp.getName(), vp.getValue());
        }
    }
    tx.success();
}
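With around 2 million nodes, a single transaction has to keep all of its state on the heap until it commits (the neo4j.properties comments above mention sizing the heap to hold transaction state), so a middle ground may be to commit every N nodes. The sketch below only shows the batching logic. `BatchedInsert`, `forEachBatch`, and the batch size are hypothetical names, and the callback stands in for the place where a Neo4j transaction would be opened and committed.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

public class BatchedInsert {
    /**
     * Splits items into consecutive batches of at most batchSize elements and
     * hands each batch to the callback. Returns the number of batches processed.
     */
    static <T> int forEachBatch(List<T> items, int batchSize, Consumer<List<T>> perBatch) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        int batches = 0;
        for (int start = 0; start < items.size(); start += batchSize) {
            int end = Math.min(start + batchSize, items.size());
            // In the real extension, this is where one transaction would wrap the batch.
            perBatch.accept(items.subList(start, end));
            batches++;
        }
        return batches;
    }

    public static void main(String[] args) {
        List<Integer> things = new ArrayList<>();
        for (int i = 0; i < 25; i++) {
            things.add(i);
        }
        int batches = forEachBatch(things, 10,
                batch -> System.out.println("committing batch of " + batch.size()));
        System.out.println(batches + " batches");
    }
}
```

In the server extension, the callback body would presumably be the `try (Transaction tx = graphDb.beginTx()) { ... tx.success(); }` block from the snippet above, applied to one batch at a time. A suitable batch size is something to measure rather than assume.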
