Architecture

At a high level, a Sensei system consists of two parts: a cluster of Sensei servers (a.k.a. search nodes) and a cluster of Sensei brokers.

  1. The cluster of Sensei servers

    Each server covers one or more partitions (or shards) of the entire index space, and is responsible for real-time indexing and searching on the partition(s) belonging to the node.

  2. The cluster of Sensei brokers

    Brokers receive search requests from clients, pass them to selected servers in the Sensei search cluster, then merge the search results and return them to the clients.

Sensei brokers are relatively simple and lightweight. Most of the work is done at Sensei servers. A Sensei server plays two roles: indexer and searcher, both of which are implemented by Zoie and Bobo embedded in the node.
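The broker's role can be illustrated with a minimal scatter-gather sketch. This is not Sensei's actual API: `SearchNode`, `Broker`, `route`, and the `(doc_id, score)` hit format are hypothetical names chosen for illustration, query evaluation is omitted, and the routing rule (first node that hosts a partition wins) is an assumption.

```python
import heapq
from concurrent.futures import ThreadPoolExecutor


class SearchNode:
    """Hypothetical stand-in for a Sensei server that hosts some partitions."""

    def __init__(self, name, partitions):
        self.name = name
        # partition id -> list of (doc_id, score); a real node would query its index
        self.partitions = partitions

    def search(self, partition_ids, limit):
        """Return the best `limit` hits from the requested partitions."""
        hits = []
        for pid in partition_ids:
            hits.extend(self.partitions.get(pid, []))
        return heapq.nlargest(limit, hits, key=lambda hit: hit[1])


class Broker:
    """Hypothetical broker: routes a request to nodes, then merges the results."""

    def __init__(self, nodes):
        self.nodes = nodes

    def route(self):
        """Assign every partition to exactly one node (assumed rule: first host wins)."""
        plan = {}
        assigned = set()
        for node in self.nodes:
            for pid in sorted(node.partitions):
                if pid not in assigned:
                    assigned.add(pid)
                    plan.setdefault(node.name, []).append(pid)
        return plan

    def search(self, limit):
        """Fan the request out to the selected nodes and merge their partial results."""
        plan = self.route()
        by_name = {node.name: node for node in self.nodes}
        with ThreadPoolExecutor(max_workers=max(1, len(plan))) as pool:
            futures = [
                pool.submit(by_name[name].search, pids, limit)
                for name, pids in plan.items()
            ]
            partials = [future.result() for future in futures]
        merged = [hit for partial in partials for hit in partial]
        return heapq.nlargest(limit, merged, key=lambda hit: hit[1])


if __name__ == "__main__":
    node_a = SearchNode("a", {0: [("d1", 0.9), ("d2", 0.2)], 1: [("d3", 0.5)]})
    node_b = SearchNode("b", {1: [("d3", 0.5)], 2: [("d4", 0.7)]})
    print(Broker([node_a, node_b]).search(limit=3))
```

Because each partition is queried on only one node, a document replicated on several nodes is not returned twice. The merge itself is cheap, which matches the description of brokers as lightweight.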

Conceptually, a Sensei node consists of the following four components:

  1. Data Gateway

    Also called simply the gateway, this is the component responsible for getting data from external sources and passing it to the indexer after an optional conversion. The input data to the gateway can be in different formats, but the output of the gateway has to be in a format accepted by Zoie's streaming data provider. (By default, JSON is the format used by Sensei to communicate with Zoie.) The output data from the gateway also needs to match the table schema definition.

    Several types of built-in gateway are available in Sensei. They can be used to get data from common data sources including:

    For each built-in data gateway, a filter can be plugged in to convert the original source data into the format defined by the table schema. (See Figure 1.2, “Sensei Data Gateway”.)

    Figure 1.2. Sensei Data Gateway

  2. Indexing Manager

    The indexing manager acts as the bridge between the gateway and the Zoie system. It is responsible for passing data from the gateway to Zoie, controlling the pace of data consumption, and maintaining the index versions on the Sensei node.

  3. Zoie System

    This is the underlying system powering real-time indexing and search. Two types of Zoie systems are supported by Sensei today: regular Zoie and Hourglass.

    Hourglass (http://linkedin.jira.com/wiki/display/ZOIE/HourGlass+-+Forward-Rolling+Indexing) is a forward-rolling, append-only indexing system based on Zoie. It is used to power LinkedIn Signal (http://www.linkedin.com/signal).

  4. Facet Handlers

    Facet handlers are a key component in a Sensei server. They are how Sensei performs faceted search with Bobo. Without facet handlers, no column-based query can be executed in Sensei.
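The flow through these components can be sketched in a few lines. Everything named here is hypothetical and exists only for illustration: the `SCHEMA` columns, the `csv_filter` function, the `FakeIndexer` stand-in for Zoie, and the use of a simple counter as the index version. The final function only shows the idea behind a facet (counting hits per column value); it is not how Bobo implements facet handlers.

```python
import json
from collections import Counter

# Hypothetical table schema: column name -> expected Python type.
SCHEMA = {"id": int, "color": str, "price": float}


def csv_filter(line):
    """Hypothetical gateway filter: convert 'id,color,price' into a schema-shaped dict."""
    raw_id, color, raw_price = line.strip().split(",")
    return {"id": int(raw_id), "color": color, "price": float(raw_price)}


def gateway(source_lines, record_filter):
    """Yield (version, json_event) pairs; every event is checked against SCHEMA."""
    for version, line in enumerate(source_lines, start=1):
        record = record_filter(line)
        for column, column_type in SCHEMA.items():
            if not isinstance(record.get(column), column_type):
                raise ValueError(f"column {column!r} does not match the schema")
        yield version, json.dumps(record, sort_keys=True)


class FakeIndexer:
    """Stand-in for the Zoie system: stores decoded events in a list."""

    def __init__(self):
        self.docs = []

    def consume(self, events):
        self.docs.extend(json.loads(event) for event in events)


class IndexingManager:
    """Passes gateway output to the indexer in batches and tracks the version."""

    def __init__(self, indexer, batch_size):
        self.indexer = indexer
        self.batch_size = batch_size  # controls the pace of consumption
        self.version = 0              # last version handed to the indexer

    def run(self, events):
        batch, last_version = [], self.version
        for last_version, event in events:
            batch.append(event)
            if len(batch) >= self.batch_size:
                self._flush(batch, last_version)
                batch = []
        if batch:
            self._flush(batch, last_version)

    def _flush(self, batch, version):
        self.indexer.consume(batch)
        self.version = version


def facet_counts(docs, column):
    """Count documents per value of one column (the basic idea of a facet)."""
    return dict(Counter(doc[column] for doc in docs))


if __name__ == "__main__":
    manager = IndexingManager(FakeIndexer(), batch_size=2)
    manager.run(gateway(["1,red,19.5", "2,blue,25.0", "3,red,9.0"], csv_filter))
    print(manager.version, facet_counts(manager.indexer.docs, "color"))
```

Keeping the filter as a plain function passed to the gateway mirrors the pluggable filter described above: the source format can change without touching the indexing manager or the indexer.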