Glossary
A list of Sensei terms.
Broker
A Sensei broker performs scatter-gather logic on query requests. The broker also listens for events from ZooKeeper and maintains a map from each partition to its list of nodes, which it uses to route and scatter query requests. Every node process has a co-located broker.
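The routing idea can be sketched as follows. This is a minimal illustration only, not the Sensei broker API: the class `ScatterGatherSketch`, the `NodeClient` interface, and the "pick the first node" policy are all assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch of scatter-gather routing; not a Sensei class.
class ScatterGatherSketch {
    // Hypothetical transport: sends a query for one partition to one node.
    interface NodeClient {
        List<String> search(String node, int partition, String query);
    }

    // Partition id -> nodes currently serving that partition.
    private final Map<Integer, List<String>> partitionToNodes = new TreeMap<>();

    // Would be called when a cluster event (for example from ZooKeeper) arrives.
    synchronized void updateView(Map<Integer, List<String>> newView) {
        partitionToNodes.clear();
        newView.forEach((p, nodes) -> partitionToNodes.put(p, new ArrayList<>(nodes)));
    }

    // Scatter: one request per partition. Gather: concatenate the partial results.
    synchronized List<String> query(String query, NodeClient client) {
        List<String> merged = new ArrayList<>();
        for (Map.Entry<Integer, List<String>> e : partitionToNodes.entrySet()) {
            if (e.getValue().isEmpty()) {
                continue; // no live node for this partition
            }
            String node = e.getValue().get(0); // simplistic policy: first node in the list
            merged.addAll(client.search(node, e.getKey(), query));
        }
        return merged;
    }
}
```

A real broker would also balance load across the nodes of a partition and merge results by relevance; the sketch only shows how the partition-to-node map drives routing.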
Data Event
A data event is a unit of indexing activity.
Data Stream
A stream of data events that Sensei consumes via Gateways.
Some properties of Data Streams:
- Versioned - each event on the stream has a monotonically increasing value indicating a unique point in the stream
- Ordered - ordering of all events should reflect the semantics of the application
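One consequence of versioning is that a consumer can remember the last version it applied and ignore anything at or before that point. The sketch below illustrates this under assumed names: `VersionedEvent` and `StreamConsumerSketch` are hypothetical types, not Sensei classes, and the version is modeled as a `long`.

```java
// Hypothetical event type: a payload tagged with its position in the stream.
final class VersionedEvent {
    final long version;
    final String payload;

    VersionedEvent(long version, String payload) {
        this.version = version;
        this.payload = payload;
    }
}

// Hypothetical consumer that applies events only when the version advances.
class StreamConsumerSketch {
    private long lastVersion = Long.MIN_VALUE;
    private int applied = 0;

    // Returns true if the event was applied, false if it was stale or replayed.
    boolean consume(VersionedEvent event) {
        if (event.version <= lastVersion) {
            return false;
        }
        lastVersion = event.version;
        applied++;
        return true;
    }

    long lastVersion() {
        return lastVersion;
    }

    int appliedCount() {
        return applied;
    }
}
```

Persisting `lastVersion` would let such a consumer resume from the same point in the stream after a restart.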
Gateway
Gateways are how Sensei consumes from a data stream. The following are some built-in gateways:
- File - Streams a text file in which each line is a JSON representation of a data event, with the line number as the version.
- JMS - Data stream abstraction over a JMS queue.
- JDBC - Data stream abstraction over a ResultSet produced by a defined PreparedStatement.
- Apache Kafka - Kafka is a fast distributed messaging system. A Kafka data-stream is versioned by offsets, and each message is a data event.
You can write your own Gateway by implementing the SenseiGateway interface and configuring it via the sensei.gateway properties.
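To make the File gateway's versioning concrete, here is a rough sketch of reading one JSON event per line and using the line number as the version. It does not use the SenseiGateway interface, whose signature is not shown here; `FileStreamSketch`, its `Event` type, and the choice of 1-based line numbers are assumptions for illustration.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of a line-oriented data stream; not the built-in File gateway.
class FileStreamSketch {
    static final class Event {
        final long version; // line number (assumed to start at 1)
        final String json;  // raw JSON text of the line, left unparsed here

        Event(long version, String json) {
            this.version = version;
            this.json = json;
        }
    }

    // Returns every event whose version is greater than sinceVersion.
    static List<Event> readFrom(Reader source, long sinceVersion) {
        List<Event> events = new ArrayList<>();
        try (BufferedReader reader = new BufferedReader(source)) {
            String line;
            long lineNo = 0;
            while ((line = reader.readLine()) != null) {
                lineNo++;
                if (lineNo > sinceVersion) {
                    events.add(new Event(lineNo, line));
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return events;
    }
}
```

Passing the last consumed version as `sinceVersion` skips lines that were already indexed, which is the same role an offset plays for a Kafka stream.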
Node
A Sensei node is a Java process that performs indexing work as well as handling query requests. A node can be configured to query over N shards.
Shard
A partition or slice of the data corpus.
Sharding Strategy
The ShardingStrategy interface defines how a data event, represented as a JSON object, is mapped to a shard.
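As an illustration, a common mapping is to take a numeric id field modulo the number of shards. The sketch below assumes a simplified interface: `ShardingStrategySketch` and `ModuloSharding` are hypothetical names, the event is modeled as a `Map` rather than a JSON object, and the actual ShardingStrategy signature may differ.

```java
import java.util.Map;

// Hypothetical stand-in for a sharding strategy; the real interface may differ.
interface ShardingStrategySketch {
    int shardFor(Map<String, Object> event, int numShards);
}

// Maps an event to a shard by taking a numeric field modulo the shard count.
class ModuloSharding implements ShardingStrategySketch {
    private final String field;

    ModuloSharding(String field) {
        this.field = field;
    }

    @Override
    public int shardFor(Map<String, Object> event, int numShards) {
        long id = ((Number) event.get(field)).longValue();
        // floorMod keeps the result in [0, numShards) even for negative ids.
        return (int) Math.floorMod(id, (long) numShards);
    }
}
```

With three shards, an event whose id is 7 lands on shard 1. Whatever the strategy, it must be deterministic so that the same event always maps to the same shard.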