Trie or Prefix Tree is an interesting tree based data structure for storing
set of words efficiently and for
- Finding the words stored
- Finding the words with common prefix
in an efficient manner.
Its common use-cases are
- Auto-complete
- Spell checker
- Longest Prefix Matching
Trie Node Structure
Trie Node structure is simple with an array of Trie Node links. The size of
this array depends on the data we want to insert. We want to store only
english words, we use size of 26 here representing each alphabet in English.
We also store a boolean variable to mark a word's end. The Trie Node
structure in Java represented as below.
Trie Implementation
In this implementation, we are going to implement three methods
- Inserting the word into trie
- Searching the whole word if exists in trie or not
- Searching the prefix if exists or not
The insertion case is straight forward. We run a loop to the length of the
word to be inserted into trie, for each character, we check if in the array of
the node corresponding to character is not null. If it is null, we create a
node and point it to the corresponding character in the array and move forward
to that node. If it not null, we do not insert instead we move forward to that
node. We do this until all the characters in the word are inserted into the
trie and we mark isEnd variable to true which means the end of the word.
The search implementation is similar to insert except here we do not insert,
we check whether each word exists in the trie from the root. There are two
cases here
-
If the word exists or prefix exists, then it should return the last node of
the word or prefix.
- If it doesn't exist, it will return null.
We identify the complete word from prefix with the isEnd variable set while
inserting. For prefix check, we don't need to check the isEnd variable as
return of the last node means, the prefix exists in the trie.
Trie Time Complexity Insert and search : O(length of word to be inserted)
Trie Space Complexity : In worst case, We have insert nodes equal to the
length of the word.
Trie Implementation ii
In this extension of the above problem, we see how to add erase functionality
and the count of search words and prefixes inserted in the trie.
For Example, if we insert "pranay" string two times into the trie. When we
search for the count, we should get 2 as answer.
For this purpose, we are going to our existing TrieNode Entity, Here we
added two variables end and prefix to keep track of the count and
corresponding methods for the encapsulation purpose. Code would be cleaner
this way.
Updated Implementation
While inserting, we increment prefix count of the node till the end and
in the last node, we increment end variable count. For returning count of both
search and prefix, we can reuse the old searchprefix method which will return
the last node the word or prefix and we can return the prefix or end count
from the node.
For Erase to erase the word from trie, we do the exact opposite of the insert.
We decrement the prefix for every character and at the end, we decrement the
end and mark isEnd as false if the count is equal to zero. Here there is no
need delete the pointers for simplicity.
We are assuming, the erase will be called for the words which already exist in
the trie.
Post a Comment