The data set: 40+ million Company nodes, 80+ million Person nodes, and roughly 200 million relationships.

$ head company.csv
id:ID,name,:LABEL
00002cefc5e2d05b9311a791fd009160,"岳阳市岳阳楼区亿诺厨房电器总经销",Company
00002ebe645c0f777887ff17d525ba6b,"禄丰县碧城镇零点俱乐部",Company
000037a77f24a10153dcb4c2e7ae082d,"淄川将军路星显日用品经营部",Company
0000a82f6261107197cb386c5e5e01bc,"上海高易电子有限公司",Company
0000b340976a6425d4a4be58c119b445,"木垒县阿合杰乐看民族用品店",Company
0000cc18233fd6d22b418ad49a4b0829,"萍乡市晴瑞商贸有限公司",Company
00010eb2bdb567719240bde6ea69f1b0,"西宁市城西区毳毳母婴用品经营部",Company
000150a5c28756610a4fb825f5b0e0fa,"昆明市金丛林公众电脑屋",Company
0001ac7561867af639293a1917129126,"凌海市大凌河创启教育咨询中心",Company
$ head person.csv
id:ID,name,:LABEL
00000006f2943d149920c79adec497ac,"顾海庆",Person
0000000a6a9b6a1d25b4f3fe9f783998,"刘士辉",Person
0000003beffacaeafb73ee32985202f3,"郑乃忠",Person
0000007ab0d18f3a68328b17a3378e74,"刘艳青",Person
000001719da1825fee8c986f47244745,"黄军良",Person
000001dd6ac4e639751df87b6ec3682e,"吴邦华",Person
00000250374ae5e560c68309e27cdc04,"辛均良",Person
0000028a45f28c45414adf07ce2cdc57,"王广",Person
000002afcb526d6fc4209ebf12f2decc,"朱义",Person
$ head relation.csv
:START_ID,:TYPE,:END_ID,from,to
000003cfc523f8a9c7fcad2e0c305e13,7,3b7bb5f7c992f7b51a2d060e227cc96b,000003cfc523f8a9c7fcad2e0c305e13,3b7bb5f7c992f7b51a2d060e227cc96b
000006fd2cb0bc23b6c7d974310a67fe,7,0641095a8f9698e9f52108a681746621,000006fd2cb0bc23b6c7d974310a67fe,0641095a8f9698e9f52108a681746621
000010f9b795c7a48b0ec2672bd53bf1,7,5b060d65359641193ba3f01b55f8bfea,000010f9b795c7a48b0ec2672bd53bf1,5b060d65359641193ba3f01b55f8bfea
000015763acdc7958a797198338a5f9c,7,ad8df642cb3aaadf5216f16eb6d9436f,000015763acdc7958a797198338a5f9c,ad8df642cb3aaadf5216f16eb6d9436f
000021ca5107086c623b880ac846f056,7,08dce64cc5b6c7c46b5935fdd1522c15,000021ca5107086c623b880ac846f056,08dce64cc5b6c7c46b5935fdd1522c15
00006057e098b1936447f8ce1f81038d,5,c0c328ec63b5c039a55458bfd9f8242e,00006057e098b1936447f8ce1f81038d,c0c328ec63b5c039a55458bfd9f8242e
00007e02109d6dc8fc3accddc15e1299,7,50f222e1c1fb5af8875edcbd9977db3a,00007e02109d6dc8fc3accddc15e1299,50f222e1c1fb5af8875edcbd9977db3a
00008e8236b86d868c8e52b50f9a25f3,7,2b74ec85d562c81b7ac02d9a06e8d09d,00008e8236b86d868c8e52b50f9a25f3,2b74ec85d562c81b7ac02d9a06e8d09d
00008f88b5974fb0f09412c49355f0d1,7,9402285be44bbb0e654968dfa37890cd,00008f88b5974fb0f09412c49355f0d1,9402285be44bbb0e654968dfa37890cd

Loading everything with neo4j-admin import takes roughly 16 to 20 minutes (the run below finished in 19m 17s):

$ bin/neo4j-admin import --ignore-missing-nodes=true --nodes /Users/argan/Data/company.csv --nodes /Users/argan/Data/person.csv --relationships /Users/argan/Data/relation.csv

...

IMPORT DONE in 19m 17s 939ms.
Imported:
  128880304 nodes
  196234357 relationships
  650229322 properties
Peak memory usage: 1.70 GB

Then run a query:

MATCH (n:Company)
WHERE n.id='0afa4218343abd81efe4881917412222'
RETURN n

This is very slow: it takes about 60 s.

Create indexes on the id property of Company and Person:

CREATE INDEX ON :Company(id);
CREATE INDEX ON :Person(id);

Then run the same query again:

MATCH (n:Company)
WHERE n.id='0afa4218343abd81efe4881917412222'
RETURN n

This time it is very fast:

neo4j> PROFILE MATCH (n:Company) WHERE n.id='0afa4218343abd81efe4881917412222' RETURN n;
+-----------------------------------------------------------------------------------------+
| Plan      | Statement   | Version      | Planner | Runtime       | Time | DbHits | Rows |
+-----------------------------------------------------------------------------------------+
| "PROFILE" | "READ_ONLY" | "CYPHER 3.3" | "COST"  | "INTERPRETED" | 10   | 0      | 1    |
+-----------------------------------------------------------------------------------------+

+-----------------+----------------+------+---------+-----------+-------------+-------------------+
| Operator        | Estimated Rows | Rows | DB Hits | Cache H/M | Identifiers | Other             |
+-----------------+----------------+------+---------+-----------+-------------+-------------------+
| +ProduceResults |              1 |    1 |       0 |       0/0 | n           | 0.0               |
| |               +----------------+------+---------+-----------+-------------+-------------------+
| +NodeIndexSeek  |              1 |    1 |       2 |       0/0 | n           | 0.0; :Company(id) |
+-----------------+----------------+------+---------+-----------+-------------+-------------------+

+-------------------------------------------------------------------------+
| n                                                                       |
+-------------------------------------------------------------------------+
| (:Company {name: "??????????", id: "0afa4218343abd81efe4881917412222"}) |
+-------------------------------------------------------------------------+

1 row available after 9 ms, consumed after another 1 ms

Writing the label as a predicate is equivalent and gets the same NodeIndexSeek plan:

neo4j> profile match (n) where n.id = '0afa4218343abd81efe4881917412222' and n:Company return n;
+-----------------------------------------------------------------------------------------+
| Plan      | Statement   | Version      | Planner | Runtime       | Time | DbHits | Rows |
+-----------------------------------------------------------------------------------------+
| "PROFILE" | "READ_ONLY" | "CYPHER 3.3" | "COST"  | "INTERPRETED" | 23   | 0      | 0    |
+-----------------------------------------------------------------------------------------+

+-----------------+----------------+------+---------+-----------+-------------+-------------------+
| Operator        | Estimated Rows | Rows | DB Hits | Cache H/M | Identifiers | Other             |
+-----------------+----------------+------+---------+-----------+-------------+-------------------+
| +ProduceResults |              1 |    0 |       0 |       0/0 | n           | 0.0               |
| |               +----------------+------+---------+-----------+-------------+-------------------+
| +NodeIndexSeek  |              1 |    0 |       1 |       0/0 | n           | 0.0; :Company(id) |
+-----------------+----------------+------+---------+-----------+-------------+-------------------+

0 rows available after 23 ms, consumed after another 0 ms

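Whether or not a query hits the index clearly makes an enormous difference in performance.

However, a neo4j schema index is attached to a label. If the query does not specify a label, the index still cannot be used: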

neo4j> profile match (n) where n.id = '0afa4218343abd81efe4881917412222' return n;
+-------------------------------------------------------------------------------------------+
| Plan      | Statement   | Version      | Planner | Runtime       | Time   | DbHits | Rows |
+-------------------------------------------------------------------------------------------+
| "PROFILE" | "READ_ONLY" | "CYPHER 3.3" | "COST"  | "INTERPRETED" | 180646 | 0      | 0    |
+-------------------------------------------------------------------------------------------+

+-----------------+----------------+-----------+-----------+-----------+-------------+-----------------------------+
| Operator        | Estimated Rows | Rows      | DB Hits   | Cache H/M | Identifiers | Other                       |
+-----------------+----------------+-----------+-----------+-----------+-------------+-----------------------------+
| +ProduceResults |       12888030 |         0 |         0 |       0/0 | n           | 0.0                         |
| |               +----------------+-----------+-----------+-----------+-------------+-----------------------------+
| +Filter         |       12888030 |         0 | 128880304 |       0/0 | n           | 0.0; n.id = {  AUTOSTRING0} |
| |               +----------------+-----------+-----------+-----------+-------------+-----------------------------+
| +AllNodesScan   |      128880304 | 128880304 | 128880305 |       0/0 | n           | 0.0                         |
+-----------------+----------------+-----------+-----------+-----------+-------------+-----------------------------+

0 rows available after 43 ms, consumed after another 180603 ms

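Semantically, a neo4j query without a label searches across all labels (by contrast, in Aerospike omitting the label searches the null label), and that is what we want. But the query bypasses the index and scans every node: the AllNodesScan above touches all 128,880,304 nodes and takes about 180 s, which is completely unacceptable.

What can be done? The simplest fix is to require users to specify a label in every query, so that the index under that label can be used.

Can users then specify multiple labels?

Multi-label matching in neo4j is an AND by default, whereas here we need OR. So a query like the following does use the index (:Company(id)) but returns nothing, because the candidate is then filtered out by :Person: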

neo4j> profile match (n:Company:Person) where n.id = '0afa4218343abd81efe4881917412222' return n;
+-----------------------------------------------------------------------------------------+
| Plan      | Statement   | Version      | Planner | Runtime       | Time | DbHits | Rows |
+-----------------------------------------------------------------------------------------+
| "PROFILE" | "READ_ONLY" | "CYPHER 3.3" | "COST"  | "INTERPRETED" | 24   | 0      | 0    |
+-----------------------------------------------------------------------------------------+

+-----------------+----------------+------+---------+-----------+-------------+-------------------+
| Operator        | Estimated Rows | Rows | DB Hits | Cache H/M | Identifiers | Other             |
+-----------------+----------------+------+---------+-----------+-------------+-------------------+
| +ProduceResults |              1 |    0 |       0 |       0/0 | n           | 0.0               |
| |               +----------------+------+---------+-----------+-------------+-------------------+
| +Filter         |              1 |    0 |       0 |       0/0 | n           | 0.0; n:Person     |
| |               +----------------+------+---------+-----------+-------------+-------------------+
| +NodeIndexSeek  |              1 |    0 |       1 |       0/0 | n           | 0.0; :Company(id) |
+-----------------+----------------+------+---------+-----------+-------------+-------------------+

0 rows available after 24 ms, consumed after another 0 ms

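Rewriting it as an OR, however, turns out not to use the index at all. The planner falls back to two label scans followed by a filter (this is surprising and looks like a planner limitation or bug in this neo4j version):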

neo4j> profile match (n) where n.id = '0afa4218343abd81efe4881917412222' and (n:Company OR n:Person) return n;
+-------------------------------------------------------------------------------------------+
| Plan      | Statement   | Version      | Planner | Runtime       | Time   | DbHits | Rows |
+-------------------------------------------------------------------------------------------+
| "PROFILE" | "READ_ONLY" | "CYPHER 3.3" | "COST"  | "INTERPRETED" | 702694 | 0      | 0    |
+-------------------------------------------------------------------------------------------+

+--------------------+----------------+-----------+-----------+-----------+-------------+-----------------------------+
| Operator           | Estimated Rows | Rows      | DB Hits   | Cache H/M | Identifiers | Other                       |
+--------------------+----------------+-----------+-----------+-----------+-------------+-----------------------------+
| +ProduceResults    |        9891053 |         0 |         0 |       0/0 | n           | 0.0                         |
| |                  +----------------+-----------+-----------+-----------+-------------+-----------------------------+
| +Filter            |        9891053 |         0 | 128880304 |       0/0 | n           | 0.0; n.id = {  AUTOSTRING0} |
| |                  +----------------+-----------+-----------+-----------+-------------+-----------------------------+
| +Distinct          |       98910532 | 128880304 |         0 |       0/0 | n           | 0.0; n                      |
| |                  +----------------+-----------+-----------+-----------+-------------+-----------------------------+
| +Union             |       47410183 | 128880304 |         0 |       0/0 | n           | 0.0                         |
| |\                 +----------------+-----------+-----------+-----------+-------------+-----------------------------+
| | +NodeByLabelScan |       81470121 |  81470121 |  81470122 |       0/0 | n           | 0.0; :Person                |
| |                  +----------------+-----------+-----------+-----------+-------------+-----------------------------+
| +NodeByLabelScan   |       47410183 |  47410183 |  47410184 |       0/0 | n           | 0.0; :Company               |
+--------------------+----------------+-----------+-----------+-----------+-------------+-----------------------------+

0 rows available after 107 ms, consumed after another 702587 ms

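So it seems the only option is to query each label separately and merge the results at the end. Fortunately, Cypher supports UNION, so the merge does not have to be done in the business layer: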

neo4j> PROFILE
       MATCH (n:Company)
       WHERE n.id='0afa4218343abd81efe4881917412222'
       RETURN n limit 10
       UNION MATCH (n:Person)
       WHERE n.id='0afa4218343abd81efe4881917412222'
       RETURN n limit 10
       ;
+-----------------------------------------------------------------------------------------+
| Plan      | Statement   | Version      | Planner | Runtime       | Time | DbHits | Rows |
+-----------------------------------------------------------------------------------------+
| "PROFILE" | "READ_ONLY" | "CYPHER 3.3" | "COST"  | "INTERPRETED" | 5    | 0      | 1    |
+-----------------------------------------------------------------------------------------+

+------------------+----------------+------+---------+-----------+-------------+--------------------+
| Operator         | Estimated Rows | Rows | DB Hits | Cache H/M | Identifiers | Other              |
+------------------+----------------+------+---------+-----------+-------------+--------------------+
| +ProduceResults  |              1 |    1 |       0 |       0/0 | n           | 0.0                |
| |                +----------------+------+---------+-----------+-------------+--------------------+
| +Distinct        |              1 |    1 |       0 |       0/0 | n           | 0.0; n             |
| |                +----------------+------+---------+-----------+-------------+--------------------+
| +Union           |              1 |    1 |       0 |       0/0 | n           | 0.0                |
| |\               +----------------+------+---------+-----------+-------------+--------------------+
| | +Projection    |              1 |    0 |       0 |       0/0 | n, n        | 0.0; {n :   n@116} |
| | |              +----------------+------+---------+-----------+-------------+--------------------+
| | +Limit         |              1 |    0 |       0 |       0/0 | n           | 0.0; 10            |
| | |              +----------------+------+---------+-----------+-------------+--------------------+
| | +NodeIndexSeek |              1 |    0 |       1 |       0/0 | n           | 0.0; :Person(id)   |
| |                +----------------+------+---------+-----------+-------------+--------------------+
| +Projection      |              1 |    1 |       0 |       0/0 | n, n        | 0.0; {n :   n@7}   |
| |                +----------------+------+---------+-----------+-------------+--------------------+
| +Limit           |              1 |    1 |       0 |       0/0 | n           | 0.0; 10            |
| |                +----------------+------+---------+-----------+-------------+--------------------+
| +NodeIndexSeek   |              1 |    1 |       2 |       0/0 | n           | 0.0; :Company(id)  |
+------------------+----------------+------+---------+-----------+-------------+--------------------+

+-------------------------------------------------------------------------+
| n                                                                       |
+-------------------------------------------------------------------------+
| (:Company {name: "??????????", id: "0afa4218343abd81efe4881917412222"}) |
+-------------------------------------------------------------------------+

1 row available after 4 ms, consumed after another 1 ms

Alternatively, collect can be used to combine the results of the two queries:

neo4j> PROFILE
       OPTIONAL MATCH (n1:Company)
       WHERE n1.id='0afa4218343abd81efe4881917412222'
       WITH collect(n1) as c1
       OPTIONAL MATCH (n2:Person)
       WHERE n2.id='0afa4218343abd81efe4881917412222'
       WITH collect(n2) + c1 as c2
       return c2
       ;
+-----------------------------------------------------------------------------------------+
| Plan      | Statement   | Version      | Planner | Runtime       | Time | DbHits | Rows |
+-----------------------------------------------------------------------------------------+
| "PROFILE" | "READ_ONLY" | "CYPHER 3.3" | "COST"  | "INTERPRETED" | 5    | 0      | 1    |
+-----------------------------------------------------------------------------------------+

+-------------------+----------------+------+---------+-----------+--------------------------+-------------------+
| Operator          | Estimated Rows | Rows | DB Hits | Cache H/M | Identifiers              | Other             |
+-------------------+----------------+------+---------+-----------+--------------------------+-------------------+
| +ProduceResults   |              1 |    1 |       0 |       0/0 | anon[178], anon[192], c2 | 0.0               |
| |                 +----------------+------+---------+-----------+--------------------------+-------------------+
| +Projection       |              1 |    1 |       0 |       0/0 | anon[178], anon[192], c2 | 0.0; {c2 :  + }   |
| |                 +----------------+------+---------+-----------+--------------------------+-------------------+
| +EagerAggregation |              1 |    1 |       0 |       0/0 | anon[178], anon[192]     | 0.0; anon[192]    |
| |                 +----------------+------+---------+-----------+--------------------------+-------------------+
| +Apply            |              1 |    1 |       0 |       0/0 | c1, n2                   | 0.0               |
| |\                +----------------+------+---------+-----------+--------------------------+-------------------+
| | +Optional       |              1 |    1 |       0 |       0/0 | n2                       | 0.0               |
| | |               +----------------+------+---------+-----------+--------------------------+-------------------+
| | +NodeIndexSeek  |              1 |    0 |       1 |       0/0 | n2                       | 0.0; :Person(id)  |
| |                 +----------------+------+---------+-----------+--------------------------+-------------------+
| +EagerAggregation |              1 |    1 |       0 |       0/0 | c1                       | 0.0               |
| |                 +----------------+------+---------+-----------+--------------------------+-------------------+
| +Optional         |              1 |    1 |       0 |       0/0 | n1                       | 0.0               |
| |                 +----------------+------+---------+-----------+--------------------------+-------------------+
| +NodeIndexSeek    |              1 |    1 |       2 |       0/0 | n1                       | 0.0; :Company(id) |
+-------------------+----------------+------+---------+-----------+--------------------------+-------------------+

+---------------------------------------------------------------------------+
| c2                                                                        |
+---------------------------------------------------------------------------+
| [(:Company {name: "??????????", id: "0afa4218343abd81efe4881917412222"})] |
+---------------------------------------------------------------------------+

1 row available after 4 ms, consumed after another 1 ms

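Note: this approach must use OPTIONAL MATCH; otherwise, as soon as one of the queries returns nothing, the whole result is empty. Also note that the result is now a single list rather than one row per element. To keep the result shape consistent with the earlier queries, just UNWIND it: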

PROFILE
OPTIONAL MATCH (n1:Company)
WHERE n1.id='0afa4218343abd81efe4881917412222'
WITH collect(n1) as c1
OPTIONAL MATCH (n2:Person)
WHERE n2.id='0afa4218343abd81efe4881917412222'
WITH collect(n2) + c1 as c2
UNWIND(c2) AS n
return n
;

This still boils down to one lookup per label (N labels need N index seeks), but as long as there are not many labels, performance is quite good.
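When the label list comes from the caller, the per-label UNION statement can be generated rather than written by hand. The following is a minimal sketch, not code from the original project: the class and method names (UnionLookup, quoteLabel, build) are made up for illustration. It keeps the id as the {id} parameter, which is the parameter syntax of the Cypher 3.3 used in this post, and it backtick-quotes each label, doubling any backtick inside the name.

```java
import java.util.List;
import java.util.stream.Collectors;

/** Hypothetical helper, for illustration only. */
final class UnionLookup {

    /** Backtick-quotes a label; a backtick inside the name is escaped by doubling it. */
    static String quoteLabel(String label) {
        return "`" + label.replace("`", "``") + "`";
    }

    /**
     * Builds one lookup per label and joins them with UNION, so that each
     * branch can be planned as its own index seek on that label's id index.
     * The id value is not concatenated; it stays the {id} parameter.
     */
    static String build(List<String> labels) {
        if (labels.isEmpty()) {
            throw new IllegalArgumentException("at least one label is required");
        }
        return labels.stream()
                .map(l -> "MATCH (n:" + quoteLabel(l) + ") WHERE n.id = {id} RETURN n")
                .collect(Collectors.joining(" UNION "));
    }
}
```

For the labels Company and Person this produces MATCH (n:`Company`) WHERE n.id = {id} RETURN n UNION MATCH (n:`Person`) WHERE n.id = {id} RETURN n, which has the same shape as the UNION query profiled above, minus the LIMIT clauses.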


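Going one step further: suppose the user does not know which label to query. Can we offer a search across every label in the database, for example when the user passes a vertexLabel=all parameter?

The answer seems to be yes. It only takes one extra step: first fetch all labels from neo4j, then apply the approach above to those labels.

The Cypher function that returns the labels of a given node is labels(node). What we want, though, is every label in the database, and for that neo4j provides the built-in procedure db.labels: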

neo4j> call db.labels;
+-----------+
| label     |
+-----------+
| "Company" |
| "Person"  |
+-----------+

2 rows available after 33 ms, consumed after another 0 ms

NOTES

An answer on StackOverflow does this with labels(node) instead:

neo4j> MATCH (n) RETURN distinct labels(n);
+-------------+
| labels(n)   |
+-------------+
| ["Company"] |
| ["Person"]  |
+-------------+

2 rows available after 102 ms, consumed after another 271604 ms

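This scans every node in the database and is extremely slow. Never use it.

There is one problem, though: Cypher does not support dynamic labels (see "How to bulk-import JSON data into neo4j", section 4, "How does neo4j support dynamic node labels?"). In the statement below, :label is read as a literal label named "label", not as the variable yielded by db.labels(), so the query returns nothing: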

neo4j> PROFILE
       CALL db.labels() YIELD label
       MATCH (n:label)
       WHERE n.id='0afa4218343abd81efe4881917412222'
       RETURN n;
+-----------------------------------------------------------------------------------------+
| Plan      | Statement   | Version      | Planner | Runtime       | Time | DbHits | Rows |
+-----------------------------------------------------------------------------------------+
| "PROFILE" | "READ_ONLY" | "CYPHER 3.3" | "COST"  | "INTERPRETED" | 11   | 0      | 0    |
+-----------------------------------------------------------------------------------------+

+--------------------+----------------+------+---------+-----------+-------------+---------------------------------------+
| Operator           | Estimated Rows | Rows | DB Hits | Cache H/M | Identifiers | Other                                 |
+--------------------+----------------+------+---------+-----------+-------------+---------------------------------------+
| +ProduceResults    |              0 |    0 |       0 |       0/0 | label, n    | 0.0                                   |
| |                  +----------------+------+---------+-----------+-------------+---------------------------------------+
| +Filter            |              0 |    0 |       0 |       0/0 | label, n    | 0.0; n.id = {  AUTOSTRING0}           |
| |                  +----------------+------+---------+-----------+-------------+---------------------------------------+
| +Apply             |              0 |    0 |       0 |       0/0 | label, n    | 0.0                                   |
| |\                 +----------------+------+---------+-----------+-------------+---------------------------------------+
| | +NodeByLabelScan |          10000 |    0 |       2 |       0/0 | label, n    | 0.0; :label                           |
| |                  +----------------+------+---------+-----------+-------------+---------------------------------------+
| +ProcedureCall     |          10000 |    2 |       1 |       0/0 | label       | 0.0; db.labels() :: (label :: String) |
+--------------------+----------------+------+---------+-----------+-------------+---------------------------------------+

0 rows available after 10 ms, consumed after another 1 ms

The statement has to be assembled on the server side by a procedure. APOC Procedures' apoc.cypher.run() can be used for this:

neo4j> PROFILE
       CALL db.labels() YIELD label
       WITH label
       CALL apoc.cypher.run("match (n:" + label + ") where n.id=" + "'0afa4218343abd81efe4881917412222'" + " return n", null) YIELD value
       RETURN value.n AS node
       ;
+-----------------------------------------------------------------------------------------+
| Plan      | Statement   | Version      | Planner | Runtime       | Time | DbHits | Rows |
+-----------------------------------------------------------------------------------------+
| "PROFILE" | "READ_ONLY" | "CYPHER 3.3" | "COST"  | "INTERPRETED" | 24   | 0      | 1    |
+-----------------------------------------------------------------------------------------+

+-----------------+----------------+------+---------+-----------+--------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Operator        | Estimated Rows | Rows | DB Hits | Cache H/M | Identifiers        | Other                                                                                                                                                                                                                                              |
+-----------------+----------------+------+---------+-----------+--------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| +ProduceResults |          10000 |    1 |       0 |       0/0 | label, node, value | 0.0                                                                                                                                                                                                                                                |
| |               +----------------+------+---------+-----------+--------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| +Projection     |          10000 |    1 |       0 |       0/0 | label, node, value | 0.0; {node : value.n}                                                                                                                                                                                                                              |
| |               +----------------+------+---------+-----------+--------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| +ProcedureCall  |          10000 |    1 |       2 |       0/0 | label, value       | 0.0; apoc.cypher.run(CoerceTo(Add(Add(Add(Add(Parameter(  AUTOSTRING0,String),Variable(label)),Parameter(  AUTOSTRING1,String)),Parameter(  AUTOSTRING2,String)),Parameter(  AUTOSTRING3,String)),String), CoerceTo(Null(),Map)) :: (value :: Map) |
| |               +----------------+------+---------+-----------+--------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| +ProcedureCall  |          10000 |    2 |       1 |       0/0 | label              | 0.0; db.labels() :: (label :: String)                                                                                                                                                                                                              |
+-----------------+----------------+------+---------+-----------+--------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+

+-------------------------------------------------------------------------+
| node                                                                    |
+-------------------------------------------------------------------------+
| (:Company {name: "??????????", id: "0afa4218343abd81efe4881917412222"}) |
+-------------------------------------------------------------------------+

1 row available after 19 ms, consumed after another 5 ms

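Of course, you can also handle this in two steps in the application layer: fetch all the labels first, then use the multi-label approach described earlier. That way no dynamic concatenation is needed on the server.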
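A rough sketch of that two-step flow follows. It is written against a hypothetical CypherRunner interface that stands in for whatever actually executes Cypher (a driver session, for example); neither the interface nor AllLabelLookup comes from the original project. It also uses a simpler variant of the merge: one indexed lookup per label with the results collected in the application, instead of a single UNION statement.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;

/** Hypothetical abstraction: runs a statement and returns each row as a column-to-value map. */
interface CypherRunner {
    List<Map<String, Object>> run(String statement, Map<String, Object> params);
}

/** Hypothetical helper, for illustration only. */
final class AllLabelLookup {

    static List<Object> findById(CypherRunner runner, String id) {
        // Step 1: fetch every label in the database (cheap, no node scan).
        List<Map<String, Object>> labelRows =
                runner.run("CALL db.labels() YIELD label RETURN label", Collections.emptyMap());

        // Step 2: one lookup per label, each with an explicit label so the
        // :Label(id) index can be used; results are merged here.
        Map<String, Object> params = Collections.singletonMap("id", id);
        List<Object> nodes = new ArrayList<>();
        for (Map<String, Object> row : labelRows) {
            // Backtick-quote the label; a backtick inside the name is doubled.
            String label = String.valueOf(row.get("label")).replace("`", "``");
            String statement = "MATCH (n:`" + label + "`) WHERE n.id = {id} RETURN n";
            for (Map<String, Object> hit : runner.run(statement, params)) {
                nodes.add(hit.get("n"));
            }
        }
        return nodes;
    }
}
```

The trade-off is one round trip per label plus one for db.labels(), in exchange for not needing APOC on the server. Labels without an index on id would still fall back to a label scan.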


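TIPS How to bind parameters in cypher-shell

The best way to debug Cypher is to get the statement working in cypher-shell first and only then assemble it in code. So how do you bind parameters in cypher-shell? The same way as in the web interface: with the :param command.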

neo4j> :param id: "0afa4218343abd81efe4881917412222"
neo4j> CALL db.labels() YIELD label
       WITH label
       CALL apoc.cypher.run('match (n:' + label + ') where n.id = {id} return n', {id: {id}}) YIELD value
       RETURN value.n AS node
       ;
+-------------------------------------------------------------------------+
| node                                                                    |
+-------------------------------------------------------------------------+
| (:Company {name: "??????????", id: "0afa4218343abd81efe4881917412222"}) |
+-------------------------------------------------------------------------+

1 row available after 1 ms, consumed after another 2 ms

neo4j> :param batch: {id: "0afa4218343abd81efe4881917412222"}
neo4j> UNWIND {batch} as row
       CALL db.labels() YIELD label
       WITH label ,row
       CALL apoc.cypher.run('match (n:' + label + ') where n.id = {id} return n', {id: row.id}) YIELD value
       RETURN value.n AS node
       ;
+-------------------------------------------------------------------------+
| node                                                                    |
+-------------------------------------------------------------------------+
| (:Company {name: "??????????", id: "0afa4218343abd81efe4881917412222"}) |
+-------------------------------------------------------------------------+

1 row available after 21 ms, consumed after another 2 ms

List values can be bound as well:

:param batch: [{properties: {name: "argan", label: "Person", id: "1", age: 31}}, {properties: {name: "magi", label: "Person", id: "2", age: 28}}]

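NOTE

Parameter binding in cypher-shell is a little finicky: the keys of a map passed to :param must not be wrapped in double quotes, otherwise you get an error: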

neo4j> :param batch: {id: "0afa4218343abd81efe4881917412222"}
neo4j> :params
batch: {id=0afa4218343abd81efe4881917412222}
neo4j> :param batch: {"id": "0afa4218343abd81efe4881917412222"}
Invalid input '"': expected whitespace, an identifier, UnsignedDecimalInteger, a property key name or '}' (line 1, column 9 (offset: 8))
"RETURN {"id": "0afa4218343abd81efe4881917412222"} as batch"
         ^
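Going the other way, from parameters built in code to a :param line that can be pasted into cypher-shell, the same rule applies: keys unquoted, string values quoted. Below is a small sketch of a renderer for that. CypherLiteral is a made-up name, and it only covers null, strings, plain numbers, booleans, maps and lists, which is an assumption about what the parameters contain.

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

/** Hypothetical debugging helper, for illustration only. */
final class CypherLiteral {

    /** Renders a Java value as a Cypher literal, with map keys left unquoted. */
    static String of(Object value) {
        if (value == null) {
            return "null";
        }
        if (value instanceof String) {
            // Only backslashes and double quotes are escaped here.
            String s = (String) value;
            return "\"" + s.replace("\\", "\\\\").replace("\"", "\\\"") + "\"";
        }
        if (value instanceof Number || value instanceof Boolean) {
            return value.toString();
        }
        if (value instanceof Map) {
            return ((Map<?, ?>) value).entrySet().stream()
                    .map(e -> key(String.valueOf(e.getKey())) + ": " + of(e.getValue()))
                    .collect(Collectors.joining(", ", "{", "}"));
        }
        if (value instanceof List) {
            return ((List<?>) value).stream()
                    .map(CypherLiteral::of)
                    .collect(Collectors.joining(", ", "[", "]"));
        }
        throw new IllegalArgumentException("unsupported type: " + value.getClass());
    }

    /** Plain identifiers stay bare; anything else is backtick-quoted, never double-quoted. */
    private static String key(String k) {
        return k.matches("[A-Za-z_][A-Za-z0-9_]*") ? k : "`" + k.replace("`", "``") + "`";
    }
}
```

Prefixing the rendered value with ":param batch: " gives a line in the form used throughout this section.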

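In Java, the statement can then be assembled like this:

/**
 * Neo4j indexes are bound to a label, and labels cannot be bound dynamically.
 * So this takes a rather ugly route: enumerate all labels on the server and
 * concatenate the statement dynamically.
 */
@Override
public List<Vertex> getVertices(String... ids) {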
    StatementBuilder sb = new StatementBuilder();
    sb.append("UNWIND {batch} as row ") //
            .append(" CALL db.labels() YIELD label ") //
            .append(" WITH label, row ") //
            .append(" CALL apoc.cypher.run(" //
                    + "'match (n:`' + label + '`) where n.id = {id} return n', {id: row.id}" //
                    + ") YIELD value") //
            .append(" RETURN value.n AS node");
    String statement = sb.toString();

    Map<String, Object> params = new HashMap<>();
    List<Map<String, Object>> batch = new ArrayList<>();
    for (String id : ids) {
        Map<String, Object> map = new HashMap<>();
        map.put("id", id);
        batch.add(map);
    }
    params.put("batch", batch);

    return cypher.query(statement, params, new StatementResultMapper<List<Vertex>>() {
        @Override
        public List<Vertex> mapResult(StatementResult result) {
            List<Vertex> vs = new ArrayList<>();
            while (result.hasNext()) {
                Record record = result.next();
                for (Value val : record.values()) {
                    if (!StringUtils.equalsIgnoreCase(val.type().name(), "NODE")) {
                        throw new RuntimeException("value should be a NODE");
                    }
                    Vertex v = ModelUtil.toVertex(val.asNode());
                    vs.add(v);
                }
            }
            return vs;
        }
    });
}

References

  1. How to find all labels that contain string in neo4j