At a high level, a Sensei system consists of two parts: a cluster of Sensei servers (a.k.a. search nodes) and a cluster of Sensei brokers.
The cluster of Sensei servers
Each server covers one or more partitions (or shards) of the entire index space, and is responsible for real-time indexing and searching on the partition(s) assigned to it.
The cluster of Sensei brokers
Brokers receive search requests from clients, pass them to selected servers in the Sensei search cluster, and then merge the search results and return them to the clients.
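The broker's scatter-and-merge step can be illustrated with a minimal sketch. It is not Sensei's broker code: the function names (`broker_search`, `merge_results`), the stand-in servers, and the hit format with `uid` and `score` fields are all assumptions made for illustration. Each stand-in server returns hits for its own partition sorted by descending score, and the broker merges them into one global top list.

```python
import heapq
from itertools import islice


def merge_results(partition_hits, count):
    """Merge per-partition hit lists into a single global top `count`.

    Each input list is assumed to be sorted by descending score already.
    """
    merged = heapq.merge(*partition_hits, key=lambda hit: -hit["score"])
    return list(islice(merged, count))


def broker_search(query, servers, count=3):
    # Scatter: send the same query to every selected server.
    partition_hits = [server(query) for server in servers]
    # Gather: merge the partial results before returning them to the client.
    return merge_results(partition_hits, count)


# Hypothetical stand-ins for two search nodes, each covering one partition.
def server_a(query):
    return [{"uid": 1, "score": 0.9}, {"uid": 4, "score": 0.3}]


def server_b(query):
    return [{"uid": 7, "score": 0.7}, {"uid": 2, "score": 0.5}]


top = broker_search("color:red", [server_a, server_b], count=3)
```

Because every partition returns its hits already sorted, the broker only needs a streaming merge rather than a full re-sort, which is consistent with brokers being lightweight.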
Sensei brokers are relatively simple and lightweight. Most of the work is done at Sensei servers. A Sensei server plays two roles: indexer and searcher, both of which are implemented by Zoie and Bobo embedded in the node.
Conceptually, a Sensei node consists of the following four components:
The data gateway, also simply called the gateway, is the component responsible for getting data from external sources and passing it to the indexer after an optional conversion. The input data to the gateway can be in different formats, while the output of the gateway has to be in a format accepted by Zoie's streaming data provider. (By default, JSON is the format used by Sensei to communicate with Zoie.) The output data from the gateway also needs to match the table schema definition.
Several types of built-in gateways are available in Sensei. They can be used to get data from common data sources, including:
Line-based text file containing JSON objects.
Kafka (http://sna-projects.com/kafka/)
JMS
JDBC
For each built-in data gateway, a filter can be plugged in to convert the original source data into the format defined by the table schema. (See Figure 1.2, “Sensei Data Gateway”.)
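As an illustration of the gateway-plus-filter idea, the following sketch reads a line-based source of JSON objects and applies a filter that converts each record into the shape a schema expects. Everything in it is hypothetical: the column names in `SCHEMA_COLUMNS`, the source field names, and the `file_gateway` and `schema_filter` functions are invented for this example and are not Sensei's gateway API.

```python
import io
import json

# Hypothetical table schema: column name -> expected Python type.
SCHEMA_COLUMNS = {"id": int, "color": str, "price": float}


def schema_filter(raw):
    """Hypothetical filter: rename source fields and coerce their types."""
    return {
        "id": int(raw["item_id"]),
        "color": str(raw["colour"]).lower(),
        "price": float(raw["price"]),
    }


def file_gateway(lines, data_filter):
    """Yield one schema-conforming event per non-empty JSON line."""
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines in the source
        event = data_filter(json.loads(line))
        # Reject events that do not match the schema definition.
        for name, expected in SCHEMA_COLUMNS.items():
            if not isinstance(event.get(name), expected):
                raise ValueError(f"column {name!r} does not match the schema")
        yield event


source = io.StringIO(
    '{"item_id": "1", "colour": "RED", "price": "9.5"}\n'
    "\n"
    '{"item_id": 2, "colour": "Blue", "price": 20}\n'
)
events = list(file_gateway(source, schema_filter))
```

Keeping the filter as a separate function mirrors the plug-in arrangement described above: the same gateway loop can be reused with a different filter for a different source format.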
The indexing manager acts as the bridge between the gateway and the Zoie system. It is responsible for passing data from the gateway to Zoie, controlling the pace of data consumption, and maintaining the index versions on the Sensei node.
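The three responsibilities above (passing data on, pacing consumption, tracking versions) can be sketched as follows. This is a simplified model under stated assumptions, not the actual indexing manager: the `IndexingManager` and `ListIndexer` classes, the `batch_size` parameter, and the representation of gateway output as `(version, event)` pairs are all hypothetical.

```python
from itertools import islice


class ListIndexer:
    """Stand-in for the indexing system: it only stores the events."""

    def __init__(self):
        self.docs = []

    def consume(self, batch):
        self.docs.extend(event for _version, event in batch)


class IndexingManager:
    """Pulls (version, event) pairs from a gateway in bounded batches."""

    def __init__(self, gateway, indexer, batch_size=2):
        self.gateway = iter(gateway)
        self.indexer = indexer
        self.batch_size = batch_size  # upper bound on events per pull
        self.version = None  # version of the last event handed over

    def run_once(self):
        """Consume at most one batch and return how many events it held."""
        batch = list(islice(self.gateway, self.batch_size))
        if batch:
            self.indexer.consume(batch)
            self.version = batch[-1][0]
        return len(batch)


events = [(1, {"id": 1}), (2, {"id": 2}), (3, {"id": 3})]
indexer = ListIndexer()
manager = IndexingManager(events, indexer, batch_size=2)
consumed = [manager.run_once(), manager.run_once(), manager.run_once()]
```

Bounding each pull by `batch_size` is one simple way to control the pace of consumption; recording the version of the last event handed over gives a point from which consumption could resume.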
Zoie is the underlying system powering real-time indexing and search. Two types of Zoie systems are supported by Sensei today: regular Zoie and Hourglass.
Hourglass (http://linkedin.jira.com/wiki/display/ZOIE/HourGlass+-+Forward-Rolling+Indexing) is a forward-rolling, append-only indexing system based on Zoie. It is used to power LinkedIn Signal (http://www.linkedin.com/signal).
Facet handlers are a key component in a Sensei server. They are how Sensei performs faceted search with Bobo. Without facet handlers, none of the column-based queries can be done in Sensei.
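To make the idea of a column-based, faceted query concrete, here is a small conceptual sketch. It does not reflect how Bobo implements facet handlers; the `select` and `facet_counts` functions and the sample documents are assumptions used only to show what a faceted search returns: a selection on one column plus per-value counts on another.

```python
from collections import Counter


def select(docs, column, value):
    """Keep only the documents whose `column` equals `value`."""
    return [doc for doc in docs if doc.get(column) == value]


def facet_counts(docs, column):
    """Count how many documents carry each value of `column`."""
    return Counter(doc[column] for doc in docs if column in doc)


# Hypothetical documents with two columns.
docs = [
    {"color": "red", "year": 2010},
    {"color": "blue", "year": 2010},
    {"color": "red", "year": 2011},
]

hits = select(docs, "year", 2010)
counts = facet_counts(hits, "color")
```

In this toy model, a selection on `year` narrows the hit set and the counts on `color` describe what remains, which is the kind of result a faceted search interface presents to the user.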