Introduction

Sensei is a distributed data system built to support many product initiatives at LinkedIn.com, e.g. LinkedIn Signal and the LinkedIn Homepage. It is foundational to LinkedIn's search and data infrastructure.

Sensei is both a search engine and a database. It is designed to query and navigate documents that combine unstructured text with well-structured meta information.


If you know RDBMS

A good way to start understanding Sensei is to compare it with a traditional RDBMS. The comparison gives a quick reference point for the common feature sets as well as the differences.

Query Language

RDBMS

SQL is the de facto standard for querying in the RDBMS world.

Sensei

In the Sensei world, the query language is BQL, a SQL variant that exposes Sensei-specific functionality.
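To make the comparison concrete, the sketch below runs a plain SQL query against an in-memory SQLite database using Python's standard sqlite3 module, then shows what a comparable BQL statement might look like. The `cars` table, its columns, and the BQL string are illustrative assumptions rather than something taken from the Sensei documentation; check the BQL reference for the exact grammar.

```python
import sqlite3

# RDBMS side: an in-memory SQLite database with a hypothetical `cars` table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE cars (id INTEGER PRIMARY KEY, color TEXT, price REAL)")
conn.executemany(
    "INSERT INTO cars (id, color, price) VALUES (?, ?, ?)",
    [(1, "red", 7500.0), (2, "blue", 9200.0), (3, "red", 11000.0)],
)

# Standard SQL: filter by color, sort by price, cap the result size.
sql = "SELECT id, price FROM cars WHERE color = ? ORDER BY price DESC LIMIT 10"
rows = conn.execute(sql, ("red",)).fetchall()
print(rows)

# Sensei side (assumption): a BQL statement of similar shape. This string is
# only illustrative and is not executed here.
bql = "SELECT id, price FROM cars WHERE color = 'red' ORDER BY price DESC LIMIT 10"
print(bql)
```

The point of the sketch is only that someone fluent in SQL should find the shape of a BQL statement familiar; the Sensei-specific extensions are not shown.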

Interface with data in a program

RDBMS

The JDBC API is the foundation of most Java-based frameworks for accessing your data programmatically.

Sensei

With Sensei, we offer a variety of client libraries (e.g. Java, Python) over a JSON/HTTP REST API. See details.
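As a rough illustration of what talking to a JSON/HTTP REST API involves, the sketch below builds (but does not send) a POST request using only Python's standard library. The endpoint URL, the `bql` payload key, and the helper name `build_sensei_request` are assumptions made for this example; the real client libraries and request format should be taken from the Sensei client documentation.

```python
import json
import urllib.request


def build_sensei_request(base_url: str, bql: str) -> urllib.request.Request:
    """Build a JSON-over-HTTP POST request carrying a query string.

    The payload shape {"bql": ...} is an assumption for illustration.
    """
    payload = json.dumps({"bql": bql}).encode("utf-8")
    return urllib.request.Request(
        url=base_url,
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_sensei_request(
    "http://localhost:8080/sensei",  # hypothetical endpoint
    "SELECT * FROM cars LIMIT 10",   # illustrative query string
)
print(request.get_method(), request.full_url)

# Sending is left out so the sketch runs without a server:
#   with urllib.request.urlopen(request) as response:
#       result = json.load(response)
```

Because the transport is plain JSON over HTTP, any language with an HTTP client can play the role that a JDBC driver plays for an RDBMS.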

Creating a table/store

RDBMS

A CREATE TABLE SQL statement is issued to the RDBMS.

Sensei

A Sensei schema is defined in the schema.xml configuration file. See example.

Data population

RDBMS

Data are pushed into the RDBMS via INSERT, DELETE and UPDATE SQL commands.

Sensei

Data are pulled into Sensei via Gateways, each of which defines a flowing stream of data events. See details.
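The pull model can be pictured with a toy sketch: instead of a client issuing write commands, the store reads from a stream of data events. Everything here (`example_gateway`, `PullingStore`, the event fields, and the "latest event per id wins" rule) is a hypothetical stand-in chosen for illustration, not Sensei's actual Gateway API or indexing semantics.

```python
from typing import Dict, Iterator

Event = Dict[str, object]


def example_gateway() -> Iterator[Event]:
    """Hypothetical gateway: yields a stream of data events."""
    yield {"id": 1, "color": "red"}
    yield {"id": 2, "color": "blue"}
    yield {"id": 1, "color": "green"}  # a later event for the same id


class PullingStore:
    """Toy store that pulls events from a gateway rather than receiving writes."""

    def __init__(self) -> None:
        self.docs: Dict[object, Event] = {}

    def consume(self, gateway: Iterator[Event]) -> int:
        """Drain the gateway and return the number of events consumed."""
        count = 0
        for event in gateway:
            # Toy rule (assumption): the latest event for an id replaces earlier ones.
            self.docs[event["id"]] = event
            count += 1
        return count


store = PullingStore()
consumed = store.consume(example_gateway())
print(consumed, sorted(store.docs))
```

The contrast with the RDBMS case is the direction of control: with INSERT, UPDATE and DELETE the writer decides when data arrives, whereas here the store decides when to read the next event from the stream.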

Architecture Diagram


Design considerations