SenseiDB
Open-source, distributed, realtime, semi-structured database
Powering the LinkedIn homepage and LinkedIn Signal.
Some Features:
- Full-text search
- Fast realtime updates
- Structured and faceted search
- BQL: SQL-like query language
- Fast key-value lookup
- High performance under concurrent heavy update and query volumes
- Hadoop integration
Sensei the Name
Sensei (先生) means teacher or professor in Japanese (http://en.wikipedia.org/wiki/Sensei).
It shares its pronunciation and writing with the Chinese word of the same meaning. The name indicates that the system can be used in place of an Oracle database in many applications.
Data Guarantees
Sensei provides a high-level guarantee of durability so your data is safe.
Sensei also provides eventual consistency between data replicas while not compromising on performance.
BQL
Browse Query Language (BQL) is an SQL-like language used to interface with Sensei.
Example BQL:
SELECT _uid,_score,color FROM members WHERE color="red" AND category IN ("van","exotic") AND MATCH(contents) AGAINST("cool leather seats") GROUP BY color TOP 3 BROWSE BY color,category ORDER BY RELEVANCE LIMIT 0,10
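The clause order in the example above (SELECT, FROM, WHERE, GROUP BY ... TOP, BROWSE BY, ORDER BY, LIMIT) can be sketched as a small string builder. This is only an illustration: build_bql and the eq, in_ and match helpers are hypothetical names, not part of any Sensei client, and quote escaping is deliberately not handled because the example does not show how BQL escapes strings.

```python
def quote(value):
    """Wrap a string in double quotes, as the example query does."""
    if '"' in value:
        # BQL's escaping rules are not covered here, so refuse rather than guess.
        raise ValueError("double quotes are not handled in this sketch")
    return f'"{value}"'


def eq(field, value):
    """Equality predicate, e.g. color="red"."""
    return f"{field}={quote(value)}"


def in_(field, values):
    """Set-membership predicate, e.g. category IN ("van","exotic")."""
    return f"{field} IN ({','.join(quote(v) for v in values)})"


def match(field, text):
    """Full-text predicate, e.g. MATCH(contents) AGAINST("...")."""
    return f"MATCH({field}) AGAINST({quote(text)})"


def build_bql(columns, index, predicates, group_by=None, top=None,
              browse_by=(), order_by=None, offset=0, count=10):
    """Assemble the clauses in the same order as the example query."""
    parts = [f"SELECT {','.join(columns)}", f"FROM {index}"]
    if predicates:
        parts.append("WHERE " + " AND ".join(predicates))
    if group_by:
        clause = f"GROUP BY {group_by}"
        if top is not None:
            clause += f" TOP {top}"
        parts.append(clause)
    if browse_by:
        parts.append("BROWSE BY " + ",".join(browse_by))
    if order_by:
        parts.append(f"ORDER BY {order_by}")
    parts.append(f"LIMIT {offset},{count}")
    return " ".join(parts)


query = build_bql(
    columns=["_uid", "_score", "color"],
    index="members",
    predicates=[
        eq("color", "red"),
        in_("category", ["van", "exotic"]),
        match("contents", "cool leather seats"),
    ],
    group_by="color",
    top=3,
    browse_by=["color", "category"],
    order_by="RELEVANCE",
    offset=0,
    count=10,
)
print(query)
```

With these arguments the builder reproduces the example query character for character, which makes it easy to see which part of the statement each clause contributes.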
Clients
Client libraries are available for interacting with SenseiDB programmatically, e.g.:
Gateways
Data can be streamed into Sensei via different gateways, e.g.:
Hadoop integration
Loading data into Hadoop via ETL has increasingly become an industry standard.
Sensei offers the ability to bootstrap from data in HDFS via a built-in MapReduce indexing job. This can be very appealing for a wide variety of data-warehousing applications.
Relevance Support
Sensei provides a relevance component that makes tuning relevance and writing a complicated relevance model much easier.
By embedding the Java scoring code inside a request query object, you can apply a complicated ranking scheme; the compiled model (Java code) is cached on the server.
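The idea of shipping scoring code with a request and caching the compiled result can be sketched as follows. This is not Sensei's implementation: Python stands in for Java, and ModelCache, rank, the score(doc, inner_score) signature and the field names are all assumptions made for illustration. The sketch compiles a model the first time its source text is seen and reuses the compiled function for later requests that carry the same source.

```python
import hashlib


class ModelCache:
    """Hypothetical server-side cache of compiled scoring models."""

    def __init__(self):
        self._compiled = {}
        self.compile_count = 0  # how many times source was actually compiled

    def get(self, source):
        # Key the cache on a hash of the model source text.
        key = hashlib.sha256(source.encode("utf-8")).hexdigest()
        if key not in self._compiled:
            namespace = {}
            # NOTE: executing request-supplied code is unsafe without
            # sandboxing; a real server would need to restrict it.
            exec(compile(source, "<relevance-model>", "exec"), namespace)
            self._compiled[key] = namespace["score"]
            self.compile_count += 1
        return self._compiled[key]


# Scoring code as it might travel inside a request (assumed shape).
MODEL_SOURCE = '''
def score(doc, inner_score):
    # Boost red items; otherwise keep the text-match score unchanged.
    boost = 2.0 if doc.get("color") == "red" else 1.0
    return inner_score * boost
'''


def rank(cache, model_source, hits):
    """Order hits by the custom model, highest score first."""
    score = cache.get(model_source)
    return sorted(hits,
                  key=lambda h: score(h["doc"], h["inner_score"]),
                  reverse=True)


cache = ModelCache()
hits = [
    {"doc": {"_uid": 1, "color": "blue"}, "inner_score": 0.9},
    {"doc": {"_uid": 2, "color": "red"}, "inner_score": 0.5},
]
ranked = rank(cache, MODEL_SOURCE, hits)
ranked_again = rank(cache, MODEL_SOURCE, hits)  # served from the cache
print([h["doc"]["_uid"] for h in ranked], cache.compile_count)
```

Keying the cache on the source text means a client can resend the same model with every query without paying the compilation cost each time.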