Hadoop Indexing Sub-system

If you are not interested in the technical details of Sensei Hadoop indexing, you can skip this section and continue with the demo section that follows.

File layout of the Sensei Hadoop indexing system source packages:

System Workflow:

Figure 6.1. Sensei Hadoop Indexing Workflow


As the system workflow above shows, the Sensei Hadoop indexer is relatively independent of the other Sensei components. Users can specify the number of output shards, as well as the sharding strategy, the input data converter, and so on. Sensei can bootstrap directly from the generated index.
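To make the idea of a pluggable sharding strategy more concrete, here is a minimal sketch in plain Java. It is not the Sensei API: the names ShardingSketch, ShardingStrategy, ModuloSharding, and partition are hypothetical and chosen only for illustration, and the modulo-on-document-id rule is an assumed example of one possible strategy. The sketch shows how a configurable shard count and a strategy object could together decide which output shard each document lands in.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative only: none of these names come from the Sensei code base.
public class ShardingSketch {

    // Hypothetical strategy contract: map a document id to a shard number.
    interface ShardingStrategy {
        int shardFor(long uid, int numShards);
    }

    // Assumed example strategy: document id modulo the number of shards.
    // floorMod keeps the result in [0, numShards) even for negative ids.
    static final class ModuloSharding implements ShardingStrategy {
        @Override
        public int shardFor(long uid, int numShards) {
            return (int) Math.floorMod(uid, (long) numShards);
        }
    }

    // Groups document ids into numShards buckets using the given strategy.
    static List<List<Long>> partition(long[] uids, int numShards, ShardingStrategy strategy) {
        if (numShards <= 0) {
            throw new IllegalArgumentException("numShards must be positive");
        }
        List<List<Long>> shards = new ArrayList<>();
        for (int i = 0; i < numShards; i++) {
            shards.add(new ArrayList<>());
        }
        for (long uid : uids) {
            shards.get(strategy.shardFor(uid, numShards)).add(uid);
        }
        return shards;
    }

    public static void main(String[] args) {
        long[] uids = {0, 1, 2, 3, 4, 5};
        List<List<Long>> shards = partition(uids, 3, new ModuloSharding());
        for (int i = 0; i < shards.size(); i++) {
            System.out.println("shard " + i + ": " + shards.get(i));
        }
    }
}
```

In a real indexing job the same decision would typically be made once per input record, so that each shard's index is built only from the documents assigned to it. Swapping in a different ShardingStrategy implementation changes the assignment without touching the rest of the pipeline.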