Performance
Performance Benchmark
- Environment
- Configuration
- Dataset
- Update Performance
- Query Performance (with no updates)
- Query Performance (with updates)
Environment Setup
(1) Machine Setup
In this test, we use the server with the following setup:
Operating System and Software:Red Hat Enterprise Linux workstation Release 6.1 (Santiago) Kernel Linux 2.6.32-131.4.1.el6.x86_64 GNOME 2.28.2 MySQL Server version: 5.1.52
Hardware:
8-core Intel Xeon CPU E5620 @ 2.40 GHz Memory: 64GB Disksize: 350GB
(2) System Startup Configuration
We used default startup configuration for both Sensei and MySQL. However, Sensei requires custom property file for each application. So to start Sensei, checkout the sensei code from https://github.com/senseidb/sensei/downloads, Sensei comes with a car demo setup, so replace the car demo configuration file with this one: https://github.com/sguo/search-perf/tree/master/configs/mysql/cars/conf, and replace the original car demo data with a much bigger 3 million dataset(we will talk about how the data source is generated in the following section). change directory to Sensei home directory and start sensei as below,
$ bin/start-sensei-node.sh example/cars/conf/
( By default, Sensei will start up with 1GB memory: HEAP_OPTS="-Xmx1g -Xms1g -XX:NewSize=256m"
)
(3) Data Source
We use car data to generate the testing dataset, and the testing dataset will be used by both MySQL and Sensei. To do this, firstly download the car demo dataset from: https://raw.github.com/senseidb/sensei/master/example/cars/data/cars.json then we replicated car data 200 times to get to 3M docs. The python script to do this could be downloaded here: https://github.com/kwei/search-perf/tree/master/data. To generate the large 3 million testing dataset, simply run:
python expandData.py cars.json cars3m.json 3000000
It will create the testing data which contains 3 million lines. we also created a MySql table:
CREATE TABLE car( id INT not null AUTO_INCREMENT, category VARCHAR(30), city VARCHAR(255), mileage INT, tags VARCHAR(255), color VARCHAR(15), price FLOAT, groupid LONG, makemodel VARCHAR(255), year INT, contents VARCHAR(255), PRIMARY KEY (id), fulltext (contents) )
mysql> desc cars;
+-----------+--------------+------+-----+---------+----------------+ | Field | Type | Null | Key | Default | Extra | +-----------+--------------+------+-----+---------+----------------+ | id | int(11) | NO | PRI | NULL | auto_increment | | category | varchar(30) | YES | | NULL | | | city | varchar(255) | YES | | NULL | | | mileage | int(11) | YES | | NULL | | | tags | varchar(255) | YES | | NULL | | | color | varchar(15) | YES | | NULL | | | price | float | YES | | NULL | | | groupid | mediumtext | YES | | NULL | | | makemodel | varchar(255) | YES | | NULL | | | year | int(11) | YES | | NULL | | | contents | varchar(255) | YES | MUL | NULL | | +-----------+--------------+------+-----+---------+----------------+ 11 rows in set (0.00 sec)
We insert the 3 million testing dataset into MySQL database. Note that columns color and price have indexes, and contents is a fulltext field containing tokens from: color,category,tags, makemodel, and city.
To create index for the two columns of the table, we can execute the following two SQL statements:
mysql> create index index_color on car (color); Query OK, 3000000 rows affected (2 min 42.32 sec) Records: 3000000 Duplicates: 0 Warnings: 0 mysql> create index index_price on car (price); Query OK, 3000000 rows affected (2 min 47.14 sec) Records: 3000000 Duplicates: 0 Warnings: 0
Data Inserting Test
Inserting data results (3M records):
MySQL: Time spent:1469924 ms ( 2040.92 rows per second) Sensei: Time spent:449413 ms ( 6675.37 rows per second)
To insert data into MySQL server, we used the MySQL Java connector, and write the custom code to measure the inserting time.
To insert data into Sensei, we have configured the Gateway in the configuration file, so when Sensei is started, it will automatically insert data into the system.
Both Sensei and MySQL insert the data one by one for time measurement convenience.
Query Latency Test
We evaluated Sensei query latency in two ways. First we tested the latency while no inserting data statement is executed, then we tested the query latency while another thread keeps inserting data into the table.
(1) Query latency without inserting data
We did the following 10 types of query latency test. Each result covers a certain type of queries.
Summary
# | Query Type | MySQL | Sensei (JSON API) | Sensei (BQL) |
---|---|---|---|---|
1 | Simple select with a range query, including result count | 1.19 s | 34 ms | 33 ms |
2 | Sorting on one field with no index | 2.35 s | 40 ms | 37 ms |
3 | Sorting on one indexed field | 0.00 s | 40 ms | 39 ms |
4 | Sorting on two indexed fields | 2.31 s | 45 ms | 49 ms |
5 | Sorting on three fields, 2 index, 1 not indexed | 2.66 s | 46 ms | 45 ms |
6 | Group By on a field with an index | 5.07 s | 72 ms / 46 ms (facet-only) | 60 ms / 47 ms (facet-only) |
7 | Group By on a field without an index | 3.59 s | 49 ms / 40 ms (facet-only) | 41 ms / 44 ms (facet-only) |
8 | With text search | 1.11 s | 44 ms | 45 ms |
9 | With text search, with sort | 3.78 s | 46 ms | 46 ms |
10 | With text search, with sort and group by | 3.71 s | 65 ms / 52 ms (facet-only) | 64 ms / 52 ms (facet-only) |
Details
- (1) Simple select on mileage, a range query, including result count. Sensei automatically returns the count, which translate to the following mysql queries:
mysql> select id from car where mileage>12750 limit 1,2;
+----+ | id | +----+ | 2 | | 3 | +----+ 2 rows in set (0.00 sec)
mysql> select count(*) from car where mileage>12750;
+----------+ | count(*) | +----------+ | 1803200 | +----------+ 1 row in set (1.19 sec)
Sensei result(The two MySQL query can be done by sending one query to Sensei since Sensei result contains the total document number):
JSON API: 34ms
{ "query": { "query_string": { "query": "" } }, "from": 1, "size": 2, "explain": false, "fetchStored": false, "selections": [ { "range":{ "mileage":{ "from":"12750", "include_lower":false, "include_upper":false } } }] }
Sensei BQL: 33ms
select * from car where mileage>12750 limit 1,2
- (2) Sorting on one field with no index:
mysql> select id from car where mileage>12750 order by category limit 1,10;
+-------+ | id | +-------+ | 18891 | | 14084 | | 14085 | | 14086 | | 14087 | | 14088 | | 14089 | | 14090 | | 14091 | | 14092 | +-------+ 10 rows in set (2.35 sec)
Sensei result:
JSON API: 40ms
{ "query": { "query_string": { "query": "" } }, "from": 1, "size": 10, "explain": false, "fetchStored": false, "selections": [ { "range":{ "mileage":{ "from":"12750", "include_lower":false, "include_upper":false } } }], "sort":[ {"category":"asc"} ] }
Sensei BQL: 37ms
select * from car where mileage>12750 order by category limit 1,10
- (3) Sorting on one indexed field:
mysql> select id from car where mileage>12750 order by color limit 1,10;
+----+ | id | +----+ | 55 | | 57 | | 60 | | 63 | | 65 | | 66 | | 70 | | 81 | | 84 | | 90 | +----+ 10 rows in set (0.00 sec)
Sensei result:
JSON API: 40ms
{ "query": { "query_string": { "query": "" } }, "from": 1, "size": 10, "explain": false, "fetchStored": false, "selections": [ { "range":{ "mileage":{ "from":"12750", "include_lower":false, "include_upper":false } } }], "sort":[ {"color":"asc"} ] }
Sensei BQL: 39ms
select * from car where mileage>12750 order by color limit 1,10
- (4) Sorting on two indexed fields:
mysql> select id from car where mileage>12750 order by color,price limit 1,10;
+--------+ | id | +--------+ | 32205 | | 2205 | | 47205 | | 62205 | | 122205 | | 107205 | | 137205 | | 167205 | | 152205 | | 77205 | +--------+ 10 rows in set (2.31 sec)
Sensei result:
JSON API: 45ms
{ "query": { "query_string": { "query": "" } }, "from": 1, "size": 10, "explain": false, "fetchStored": false, "selections": [ { "range":{ "mileage":{ "from":"12750", "include_lower":false, "include_upper":false } } }], "sort":[ {"color":"asc"}, {"price":"asc"} ] }
Sensei BQL: 49ms
select * from car where mileage>12750 order by color,price limit 1,10
- (5) Sorting on three fields, 2 index, 1 not indexed:
mysql> select id from car where mileage>12750 order by color,price,category limit 1,10;
+--------+ | id | +--------+ | 17205 | | 47205 | | 62205 | | 32205 | | 77205 | | 92205 | | 107205 | | 137205 | | 152205 | | 122205 | +--------+ 10 rows in set (2.66 sec)
Sensei result:
JSON API: 46ms
{ "query": { "query_string": { "query": "" } }, "from": 1, "size": 10, "explain": false, "fetchStored": false, "selections": [ { "range":{ "mileage":{ "from":"12750", "include_lower":false, "include_upper":false } } }], "sort":[ {"color":"asc"}, {"price":"asc"}, {"category":"asc"} ] }
Sensei BQL: 45ms
select * from car where mileage>12750 order by color,price,category limit 1,10
- (6) Group By on a field with an index:
mysql> select color,count(*) from car where mileage>12750 group by color limit 1,10;
+--------+----------+ | color | count(*) | +--------+----------+ | blue | 135000 | | gold | 131400 | | green | 128400 | | red | 254200 | | silver | 249200 | | white | 265200 | | yellow | 264800 | +--------+----------+ 7 rows in set (5.07 sec)
Sensei result:
JSON API: 72ms
{ "query": { "query_string": { "query": "" } }, "from": 1, "size": 10, "explain": false, "fetchStored": false, "selections": [ { "range":{ "mileage":{ "from":"12750", "include_lower":false, "include_upper":false } } }], "facets": { "color": { "max": 10, "minCount": 1, "expand": false, "order": "hits" } }, "groupBy" : { "columns" : ["color"], "top" : 1 } }
Sensei BQL: 60ms
select color,_grouphitscount from car where mileage>12750 group by color top 1 limit 1,10
However, Sensei comes with the facet result, which contains the count number for columns, so the groupBy is not necessary when facet is specified. A facet only query is a lightweight alternative in this case:
JSON API: 46ms
{ "query": { "query_string": { "query": "" } }, "from": 1, "size": 10, "explain": false, "fetchStored": false, "selections": [ { "range":{ "mileage":{ "from":"12750", "include_lower":false, "include_upper":false } } }], "facets": { "color": { "max": 10, "minCount": 1, "expand": false, "order": "hits" } } }
Sensei BQL: 47ms
select color from car where mileage>12750 browse by color limit 1,10
- (7) Group By on a field without an index:
mysql> select category,count(*) from car where mileage>12750 group by category limit 1,10;
+---------------+----------+ | category | count(*) | +---------------+----------+ | exotic | 222600 | | luxury | 330600 | | mini-van | 105200 | | sedan | 133200 | | sports car | 193400 | | station wagon | 27800 | | sub-compact | 58800 | | suv | 163800 | | truck | 56800 | | van | 19600 | +---------------+----------+ 10 rows in set (3.59 sec)
Sensei result:
JSON API: 49ms
{ "query": { "query_string": { "query": "" } }, "from": 1, "size": 10, "explain": false, "fetchStored": false, "selections": [ { "range":{ "mileage":{ "from":"12750", "include_lower":false, "include_upper":false } } }], "facets": { "category": { "max": 10, "minCount": 1, "expand": false, "order": "hits" } }, "groupBy" : { "columns" : ["category"], "top" : 1 } }
Sensei BQL: 41ms
select category,_grouphitscount from car where mileage>12750 group by category top 1 limit 1,10
Facet Only method (When group-by is used only for counting):
JSON API: 40ms
{ "query": { "query_string": { "query": "" } }, "from": 1, "size": 10, "explain": false, "fetchStored": false, "selections": [ { "range":{ "mileage":{ "from":"12750", "include_lower":false, "include_upper":false } } }], "facets": { "category": { "max": 10, "minCount": 1, "expand": false, "order": "hits" } } }
Sensei BQL: 44ms
select category from car where mileage>12750 browse by category limit 1,10
- (8) With text search:
mysql> select id from car where mileage>12750 AND match(contents) against('cool') limit 1,10;
+---------+ | id | +---------+ | 1163528 | | 1208529 | | 2312764 | | 1062485 | | 1058529 | | 1208528 | | 2687764 | | 1058528 | | 1988528 | | 182764 | +---------+ 10 rows in set (1.11 sec)
Sensei result:
JSON API: 44ms
{ "query": { "query_string" : { "default_field" : "contents", "query" : "cool" } }, "from": 1, "size": 10, "explain": false, "fetchStored": false, "selections": [ { "range":{ "mileage":{ "from":"12750", "include_lower":false, "include_upper":false } } }] }
Sensei BQL: 45ms
select * from car where mileage>12750 AND match(contents) against('cool') limit 1,10
- (9) With text search, with sort:
mysql> select id from car where mileage>'12750' AND match(contents) against('cool') order by category limit 1,10;
+-----+ | id | +-----+ | 7 | | 8 | | 98 | | 99 | | 100 | | 101 | | 102 | | 103 | | 105 | | 106 | +-----+ 10 rows in set (3.78 sec)
Sensei result:
JSON API: 46ms
{ "query": { "query_string" : { "default_field" : "contents", "query" : "cool" } }, "from": 1, "size": 10, "explain": false, "fetchStored": false, "selections": [ { "range":{ "mileage":{ "from":"12750", "include_lower":false, "include_upper":false } } }], "sort":[ {"category":"asc"} ] }
Sensei BQL: 46ms
select * from car where mileage>12750 AND match(contents) against('cool') order by category limit 1,10
- (10) With text search, with sort and group by:
mysql> select color,count(*) from car where mileage>'12750' AND match(contents) against('cool') group by color order by category limit 1,10;
+--------+----------+ | color | count(*) | +--------+----------+ | gold | 63600 | | silver | 119600 | | green | 64200 | | blue | 69000 | | black | 185600 | | yellow | 135800 | | red | 123800 | +--------+----------+ 7 rows in set (3.71 sec)
Sensei result:
JSON API: 65ms
{ "query": { "query_string" : { "default_field" : "contents", "query" : "cool" } }, "from": 1, "size": 10, "explain": false, "fetchStored": false, "selections": [ { "range":{ "mileage":{ "from":"12750", "include_lower":false, "include_upper":false } } }], "sort":[ {"category":"asc"} ], "groupBy" : { "columns" : ["color"], "top" : 1 } }
Sensei BQL: 64ms
select color, _grouphitscount from car where mileage>12750 AND match(contents) against('cool') group by color top 1 order by category limit 1,10
Facet Only method (When group-by is used only for counting):
JSON API: 52ms
{ "query": { "query_string" : { "default_field" : "contents", "query" : "cool" } }, "from": 1, "size": 10, "explain": false, "fetchStored": false, "selections": [ { "range":{ "mileage":{ "from":"12750", "include_lower":false, "include_upper":false } } }], "sort":[ {"category":"asc"} ], "facets": { "color": { "max": 10, "minCount": 1, "expand": false, "order": "hits" } } }
Sensei BQL: 52ms
select color from car where mileage>12750 AND match(contents) against('cool') browse by color order by category limit 1,10
(2) Query latency while inserting data
We also did the following 10 types of query latency test while keep inserting another 3 million data items. Each result also covers a certain type of queries.
Since the testing MySQL query, Sensei JSON and BQL statements are the same, we only list the latency values in the following summary table.
# | Query Type | MySQL | Sensei (JSON API) | Sensei (BQL) |
---|---|---|---|---|
1 | Simple select with a range query, including result count | 1.40 s | 28 ms | 28 ms |
2 | Sorting on one field with no index | 2.78 s | 37 ms | 36 ms |
3 | Sorting on one indexed field | 0.00 s | 40 ms | 38 ms |
4 | Sorting on two indexed fields | 3.92 s | 56 ms | 57 ms |
5 | Sorting on three fields, 2 index, 1 not indexed | 4.59 s | 63 ms | 63 ms |
6 | Group By on a field with an index | 14.91 s | 114 ms / 65 ms (facet-only) | 94 ms / 64 ms (facet-only) |
7 | Group By on a field without an index | 6.58 s | 78 ms / 69 ms (facet-only) | 56 ms / 70 ms (facet-only) |
8 | With text search | 8.44 s | 79 ms | 82 ms |
9 | With text search, with sort | 10.09 s | 82 ms | 82 ms |
10 | With text search, with sort and group by | 10.72 s | 122 ms / 98 ms (facet-only) | 123 ms / 95 ms (facet-only) |
Sensei Multi-Thread Performance Test
We also conducted the multi-thread performance test for Sensei. The following pictures show our experimental results:
Latency test (Multi-thread):
QPS test (Multi-thread):
The testing query in JSON format is:
{ "partitions":[ ], "facetInit":{ }, "facets":{ "tags":{ "max":100000, "order":"hits", "minHit":0, "expand":true }, "category":{ "max":100000, "order":"hits", "minHit":0, "expand":true }, "price":{ "max":100000, "order":"hits", "minHit":0, "expand":true } }, "fetchStored":false, "selections":[ { "terms":{ "category":{ "values":[ "sub-compact", "luxury", "station wagon", "compact" ], "excludes":[ "sedan", "mini-van", "sports car" ], "operator":"or", "_noOptimize":false } } }, { "terms":{ "tags":{ "values":[ "automatic", "chick magnet", "reliable", "cool" ], "excludes":[ "expensive", "electric", "highend" ], "operator":"or", "_noOptimize":false } } }, { "range":{ "price":{ "to":"6700", "include_lower":true, "include_upper":true, "from":"*" } } } ], "explain":false, "sorts":[ ], "termVectors":[ ] }