System Design — Autocomplete
Product Requirements
- Give a list of query suggestions based on user input (as a prefix)
- Suggestions ordered by some ranking score
Non-product Requirements
- High performance, response should be quicker than user type speed, let’s say < 200ms
- Result accuracy and system availability are not strictly required
Main Data Structure
Trie is a data structure that fits prefix search well.
To save space, instead of presenting each char with one node, we present common substring in a node. For example, ‘rest’, ‘restaurant’, ‘restroom’ and ‘restriction’ share the common prefix ‘rest’. In our trie, ‘rest’ is stored in a parent node with a bit set to True indicating the the node itself is a completion (‘rest’), and its children branches are ‘restaurant’, ‘restroom’ and ‘restriction’.
Now if there’s a user input of length L, and we want to return K results from the trie that contains N words, average lookup time would be: O(L) + O(Nlog(K))