java trie example

Trie or Prefix Tree is an interesting tree based data structure for storing set of words efficiently and for

  • Finding the words stored  
  • Finding the words with common prefix 
in an efficient manner. 

Its common use-cases are 

  • Auto-complete
  • Spell checker
  • Longest Prefix Matching

Trie Node Structure

Trie Node structure is simple with an array of Trie Node links. The size of this array depends on the data we want to insert. We want to store only english words, we use size of 26 here representing each alphabet in English.

We also store a boolean variable to mark a word's end. The Trie Node structure in Java represented as below. 

class TrieNode {
private TrieNode[] links;
private boolean isEnd;
public TrieNode() {
links = new TrieNode[26];
}
public TrieNode getNode(char ch) {
return links[ch - 'a'];
}
public void setNode(char ch, TrieNode node) {
links[ch - 'a'] = node;
}
public boolean containsKey(char ch) {
return links[ch - 'a'] != null;
}
public void setEnd() {
isEnd = true;
}
public boolean isEnd() {
return isEnd;
}
}
view raw TrieNode.java hosted with ❀ by GitHub

Trie Implementation

In this implementation, we are going to implement three methods 

  • Inserting the word into trie
  • Searching the whole word if exists in trie or not
  • Searching the prefix if exists or not
The insertion case is straight forward. We run a loop to the length of the word to be inserted into trie, for each character, we check if in the array of the node corresponding to character is not null. If it is null, we create a node and point it to the corresponding character in the array and move forward to that node. If it not null, we do not insert instead we move forward to that node. We do this until all the characters in the word are inserted into the trie and we mark isEnd variable to true which means the end of the word. 

class Trie {
private TrieNode root;
public Trie() {
root = new TrieNode();
}
public void insert(String word) {
TrieNode node = root;
for (int i = 0; i < word.length(); i++) {
char ch = word.charAt(i);
if (!node.containsKey(ch)) {
node.setNode(ch, new TrieNode());
}
node = node.getNode(ch);
}
node.setEnd();
}
private TrieNode searchPrefix(String word) {
TrieNode node = root;
for (int i = 0; i < word.length(); i++) {
char ch = word.charAt(i);
if (node.containsKey(ch)) {
node = node.getNode(ch);
} else {
return null;
}
}
return node;
}
public boolean search(String word) {
TrieNode node = searchPrefix(word);
return node != null && node.isEnd();
}
public boolean startsWith(String prefix) {
TrieNode node = searchPrefix(prefix);
return node != null;
}
}
view raw Trie.java hosted with ❀ by GitHub

The search implementation is similar to insert except here we do not insert, we check whether each word exists in the trie from the root. There are two cases here

  • If the word exists or prefix exists, then it should return the last node of the word or prefix. 
  • If it doesn't exist, it will return null.
We identify the complete word from prefix with the isEnd variable set while inserting. For prefix check, we don't need to check the isEnd variable as return of the last node means, the prefix exists in the trie. 

Trie Time Complexity Insert and search : O(length of word to be inserted)
Trie Space Complexity : In worst case, We have insert nodes equal to the length of the word.


Trie Implementation ii

In this extension of the above problem, we see how to add erase functionality and the count of search words and prefixes inserted in the trie. 

For Example, if we insert "pranay" string two times into the trie. When we search for the count, we should get 2 as answer. 

For this purpose, we are going to our existing TrieNode Entity,  Here we added two variables end and prefix to keep track of the count and corresponding methods for the encapsulation purpose. Code would be cleaner this way. 

class TrieNode {
private final TrieNode[] links;
private boolean isEnd;
private int end = 0;
private int prefix = 0;
private static final int count = 26;
public TrieNode() {
links = new TrieNode[count];
}
public TrieNode getNode(char ch) {
return links[ch - 'a'];
}
public void setNode(char ch, TrieNode node) {
links[ch - 'a'] = node;
}
public boolean containsKey(char ch) {
return links[ch - 'a'] != null;
}
public void setEnd(boolean value) {
isEnd = value;
}
public boolean isEnd() {
return isEnd;
}
public void incrementEnd() {
end++;
}
public int getEnd() {
return end;
}
public void incrementPrefix() {
prefix++;
}
public int getPrefix() {
return prefix;
}
public void decrementPrefix() {
prefix--;
}
public void decrementEnd() {
end--;
}
}
view raw TrieNode.java hosted with ❀ by GitHub


Updated Implementation


While inserting, we increment prefix  count of the node till the end and in the last node, we increment end variable count. For returning count of both search and prefix, we can reuse the old searchprefix method which will return the last node the word or prefix and we can return the prefix or end count from the node. 

For Erase to erase the word from trie, we do the exact opposite of the insert. We decrement the prefix for every character and at the end, we decrement the end and mark isEnd as false if the count is equal to zero. Here there is no need delete the pointers for simplicity. 

We are assuming, the erase will be called for the words which already exist in the trie.

class TrieWithErase {
private TrieNode root;
public TrieWithErase() {
root = new TrieNode();
}
public void insert(String word) {
TrieNode node = root;
for (int i = 0; i < word.length(); i++) {
char ch = word.charAt(i);
if (!node.containsKey(ch)) {
node.setNode(ch, new TrieNode());
}
node = node.getNode(ch);
node.incrementPrefix();
}
node.incrementEnd();
node.setEnd(true);
}
private TrieNode searchPrefix(String word) {
TrieNode node = root;
for (int i = 0; i < word.length(); i++) {
char ch = word.charAt(i);
if (node.containsKey(ch)) {
node = node.getNode(ch);
} else {
return null;
}
}
return node;
}
public boolean search(String word) {
TrieNode node = searchPrefix(word);
return node != null && node.isEnd();
}
public int getEndsWithCount(String word) {
TrieNode node = searchPrefix(word);
return (node != null && node.isEnd()) ? node.getEnd() : 0;
}
public boolean startsWith(String prefix) {
TrieNode node = searchPrefix(prefix);
return node != null;
}
public int getStartsWithCount(String word) {
TrieNode node = searchPrefix(word);
return (node != null) ? node.getPrefix() : 0;
}
// Assuming erase will be called only if the word exists
public void eraseWord(String word) {
TrieNode node = root;
for (int i = 0; i < word.length(); i++) {
char ch = word.charAt(i);
if (node.containsKey(ch)) {
node = node.getNode(ch);
node.decrementPrefix();
}
}
node.decrementEnd();
if (node.getEnd() == 0) {
node.setEnd(false);
}
}
}
view raw TrieWithErase.java hosted with ❀ by GitHub

Post a Comment