Implementing Trie (Prefix Tree): A Complete Guide with Java

The Problem: Fast String Lookups and Autocomplete

Imagine you're building a search engine or a text editor with autocomplete functionality. As the user types "pro", you want to instantly suggest words like "program", "project", and "professional". Or perhaps you're developing a spell checker that needs to quickly verify if a word exists in a dictionary with millions of entries.

How do you efficiently store and retrieve strings for these scenarios? A naive approach might use an array or a hash map, but these structures fall short when dealing with prefix-based operations. Arrays require linear search time, and hash maps, while offering O(1) lookups for exact matches, don't efficiently support prefix-based queries.

This is where the Trie data structure shines.

What is a Trie?

A Trie (pronounced "try" or "tree") is a specialized tree-based data structure designed for efficient string operations, particularly those involving prefixes. The name comes from "retrieval," highlighting its primary purpose.

Unlike binary trees where each node typically contains a single value, in a Trie:

Each node represents a single character in a string
The path from the root to a node forms a prefix
Children of a node are indexed by the next character in the sequence
Nodes may contain a flag indicating if they represent the end of a complete word

This structure allows Tries to excel at:

Prefix matching: Find all words starting with a given prefix
Exact string search: Check if a string exists in the collection
Lexicographical ordering: Words are naturally stored in sorted order
Space efficiency: Common prefixes are stored only once

Building a Trie in Java

Let's implement a complete Trie data structure in Java with the following operations:

Insert a word
Search for a word
Delete a word
Count words with a given prefix
Get all words with a given prefix

Here's our implementation:

java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class Trie {
    private TrieNode root;
    
    public Trie() {
        root = new TrieNode();
    }
    
    // Node class for our Trie
    private static class TrieNode {
        Map<Character, TrieNode> children;
        boolean isEndOfWord;
        int wordCount;  // Count of words passing through this node
        
        public TrieNode() {
            children = new HashMap<>();
            isEndOfWord = false;
            wordCount = 0;
        }
    }
    
    /**
     * Inserts a word into the trie.
     */
    public void insert(String word) {
        if (word == null || word.isEmpty()) {
            return;
        }
        
        TrieNode current = root;
        
        for (char c : word.toCharArray()) {
            current.children.putIfAbsent(c, new TrieNode());
            current = current.children.get(c);
            current.wordCount++;  // Increment count for prefix statistics
        }
        
        current.isEndOfWord = true;
    }
    
    /**
     * Returns true if the word is in the trie.
     */
    public boolean search(String word) {
        if (word == null || word.isEmpty()) {
            return false;
        }
        
        TrieNode node = findNode(word);
        return node != null && node.isEndOfWord;
    }
    
    /**
     * Returns true if there is any word in the trie that starts with the given prefix.
     */
    public boolean startsWith(String prefix) {
        return findNode(prefix) != null;
    }
    
    /**
     * Returns the count of words with the given prefix.
     */
    public int countWordsWithPrefix(String prefix) {
        TrieNode node = findNode(prefix);
        return node != null ? node.wordCount : 0;
    }
    
    /**
     * Helper method to find a node corresponding to the given string.
     */
    private TrieNode findNode(String str) {
        TrieNode current = root;
        
        for (char c : str.toCharArray()) {
            if (!current.children.containsKey(c)) {
                return null;
            }
            current = current.children.get(c);
        }
        
        return current;
    }
    
    /**
     * Removes a word from the trie if present.
     * Returns true if the word was removed.
     */
    public boolean delete(String word) {
        if (word == null || word.isEmpty()) {
            return false;
        }
        
        return delete(root, word, 0);
    }
    
    /**
     * Recursive helper method for delete operation.
     */
    private boolean delete(TrieNode current, String word, int index) {
        if (index == word.length()) {
            // If word doesn't exist
            if (!current.isEndOfWord) {
                return false;
            }
            
            // Mark as non-end of word
            current.isEndOfWord = false;
            
            // Return true if this node has no other children
            return current.children.isEmpty();
        }
        
        char c = word.charAt(index);
        TrieNode node = current.children.get(c);
        
        if (node == null) {
            return false;
        }
        
        // Decrement the word count as we're removing a word
        node.wordCount--;
        
        boolean shouldDeleteCurrentNode = delete(node, word, index + 1);
        
        // If true is returned from the recursive call and the node is not 
        // the end of another word, we can delete it
        if (shouldDeleteCurrentNode && !node.isEndOfWord) {
            current.children.remove(c);
            return current.children.isEmpty();
        }
        
        return false;
    }
    
    /**
     * Returns all words in the trie with the given prefix.
     */
    public List<String> getWordsWithPrefix(String prefix) {
        List<String> result = new ArrayList<>();
        TrieNode node = findNode(prefix);
        
        if (node != null) {
            collectWords(node, prefix, result);
        }
        
        return result;
    }
    
    /**
     * Helper method to collect all words from a given node.
     */
    private void collectWords(TrieNode node, String prefix, List<String> result) {
        if (node.isEndOfWord) {
            result.add(prefix);
        }
        
        for (Map.Entry<Character, TrieNode> entry : node.children.entrySet()) {
            collectWords(entry.getValue(), prefix + entry.getKey(), result);
        }
    }
}

How the Trie Works

Let's break down the key operations of our Trie implementation:

Insertion

When inserting a word like code, we:

Start at the root node
For each character ('c', 'o', 'd', 'e'):
- Create a new node if the character doesn't exist as a child
- Move to that child node
- Increment the word count for prefix statistics
Mark the final node as the end of a word

This process gives us O(m) time complexity, where m is the length of the word.

Searching

To search for a word:

Start at the root and follow the path defined by each character
If at any point a character doesn't exist in the current node's children, return false
If we reach the end of the word, check if the current node is marked as an end of word

Like insertion, search has O(m) time complexity.

Deletion

Deletion is more complex as we need to:

Find the end node of the word
Mark it as not an end of word
Remove nodes that are no longer needed (no children and not marking the end of another word)
Update word counts along the path

Prefix Operations

The real power of Tries comes from prefix operations:

startsWith(prefix): Check if any word starts with the given prefix
countWordsWithPrefix(prefix): Count words with the given prefix
getWordsWithPrefix(prefix): Get all words with the given prefix

Practical Example: Autocomplete System

Let's see how our Trie can be used to build a simple autocomplete system:

java
public class AutocompleteSystem {
    private Trie dictionary;
    
    public AutocompleteSystem() {
        dictionary = new Trie();
        
        // Initialize with some words
        String[] words = {
            "apple", "application", "apply", "apartment",
            "book", "bookstore", "bookmark",
            "code", "coding", "coder", "computer"
        };
        
        for (String word : words) {
            dictionary.insert(word);
        }
    }
    
    public List<String> getSuggestions(String prefix) {
        return dictionary.getWordsWithPrefix(prefix);
    }
    
    public static void main(String[] args) {
        AutocompleteSystem system = new AutocompleteSystem();
        
        // Test with different prefixes
        System.out.println("Suggestions for 'a': " + system.getSuggestions("a"));
        System.out.println("Suggestions for 'ap': " + system.getSuggestions("ap"));
        System.out.println("Suggestions for 'co': " + system.getSuggestions("co"));
        
        // Count words with prefix
        System.out.println("Words starting with 'co': " + 
                          system.dictionary.countWordsWithPrefix("co"));
        
        // Delete a word and check again
        system.dictionary.delete("code");
        System.out.println("After deleting 'code', suggestions for 'co': " + 
                          system.getSuggestions("co"));
    }
}

Output:

plaintext
Suggestions for 'a': [apple, application, apply, apartment]
Suggestions for 'ap': [apple, application, apply, apartment]
Suggestions for 'co': [code, coding, coder, computer]
Words starting with 'co': 4
After deleting 'code', suggestions for 'co': [coding, coder, computer]

Performance Analysis

Let's analyze the performance of our Trie implementation:

Time Complexity

Insert: O(m) where m is the length of the word
Search: O(m) where m is the length of the word
Delete: O(m) where m is the length of the word
Prefix operations: O(m + k) where m is the length of the prefix and k is the number of words with that prefix

Space Complexity

O(n × m) where n is the number of words and m is the average word length
This can be reduced if many words share common prefixes

Comparison with Other Data Structures

Operation	Trie	HashMap	ArrayList
Insert	O(m)	O(m)	O(n)
Search	O(m)	O(m)	O(n×m)
Delete	O(m)	O(m)	O(n)
Prefix Search	O(m+k)	O(n×m)	O(n×m)

Where:

m = length of the word
n = number of words
k = number of words with the given prefix

Real-World Applications

Tries are used in numerous applications:

Autocomplete and Search Suggestions: As demonstrated in our example
Spell Checkers: Quickly verify if a word exists in a dictionary
IP Routing: Longest prefix matching in network routers
Predictive Text: Mobile keyboards that suggest the next word
Word Games: Efficiently check if a string forms a valid word
Contact Lists: Fast lookup as users type a name

Optimizations and Variations

Several optimizations can make Tries even more efficient:

Compressed Tries (Radix Trees)

In a standard Trie, nodes with only one child can be compressed, reducing space requirements:

t -> r -> e -> e

Can be compressed to:

tree

Ternary Search Tries

Instead of using a HashMap for children, a ternary search trie uses three pointers per node:

Left (for smaller characters)
Equal (for the next character in the sequence)
Right (for larger characters)

This can be more memory-efficient for certain character distributions.

Character Set Optimization

If you know your character set is limited (e.g., only lowercase English letters), you can use an array instead of a HashMap:

java
private static class TrieNode {
    TrieNode[] children = new TrieNode[26]; // For lowercase English letters
    boolean isEndOfWord;
}

Conclusion

Tries are powerful data structures that excel at string operations, particularly those involving prefixes. Their efficiency in handling these operations makes them invaluable for applications like autocomplete systems, spell checkers, and search engines.

By understanding how Tries work and implementing them effectively, you can significantly improve the performance of string-intensive applications. The implementation we've covered provides a solid foundation that you can adapt and optimize for your specific use cases.

Whether you're building a text editor, a search engine, or any application that deals with string lookups, consider using a Trie to enhance your solution's efficiency and capabilities.

The Problem: Fast String Lookups and Autocomplete

This is where the Trie data structure shines.

What is a Trie?

Unlike binary trees where each node typically contains a single value, in a Trie:

Each node represents a single character in a string
The path from the root to a node forms a prefix
Children of a node are indexed by the next character in the sequence
Nodes may contain a flag indicating if they represent the end of a complete word

This structure allows Tries to excel at:

Prefix matching: Find all words starting with a given prefix
Exact string search: Check if a string exists in the collection
Lexicographical ordering: Words are naturally stored in sorted order
Space efficiency: Common prefixes are stored only once

Building a Trie in Java

Let's implement a complete Trie data structure in Java with the following operations:

Insert a word
Search for a word
Delete a word
Count words with a given prefix
Get all words with a given prefix

Here's our implementation:

java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class Trie {
    private TrieNode root;
    
    public Trie() {
        root = new TrieNode();
    }
    
    // Node class for our Trie
    private static class TrieNode {
        Map<Character, TrieNode> children;
        boolean isEndOfWord;
        int wordCount;  // Count of words passing through this node
        
        public TrieNode() {
            children = new HashMap<>();
            isEndOfWord = false;
            wordCount = 0;
        }
    }
    
    /**
     * Inserts a word into the trie.
     */
    public void insert(String word) {
        if (word == null || word.isEmpty()) {
            return;
        }
        
        TrieNode current = root;
        
        for (char c : word.toCharArray()) {
            current.children.putIfAbsent(c, new TrieNode());
            current = current.children.get(c);
            current.wordCount++;  // Increment count for prefix statistics
        }
        
        current.isEndOfWord = true;
    }
    
    /**
     * Returns true if the word is in the trie.
     */
    public boolean search(String word) {
        if (word == null || word.isEmpty()) {
            return false;
        }
        
        TrieNode node = findNode(word);
        return node != null && node.isEndOfWord;
    }
    
    /**
     * Returns true if there is any word in the trie that starts with the given prefix.
     */
    public boolean startsWith(String prefix) {
        return findNode(prefix) != null;
    }
    
    /**
     * Returns the count of words with the given prefix.
     */
    public int countWordsWithPrefix(String prefix) {
        TrieNode node = findNode(prefix);
        return node != null ? node.wordCount : 0;
    }
    
    /**
     * Helper method to find a node corresponding to the given string.
     */
    private TrieNode findNode(String str) {
        TrieNode current = root;
        
        for (char c : str.toCharArray()) {
            if (!current.children.containsKey(c)) {
                return null;
            }
            current = current.children.get(c);
        }
        
        return current;
    }
    
    /**
     * Removes a word from the trie if present.
     * Returns true if the word was removed.
     */
    public boolean delete(String word) {
        if (word == null || word.isEmpty()) {
            return false;
        }
        
        return delete(root, word, 0);
    }
    
    /**
     * Recursive helper method for delete operation.
     */
    private boolean delete(TrieNode current, String word, int index) {
        if (index == word.length()) {
            // If word doesn't exist
            if (!current.isEndOfWord) {
                return false;
            }
            
            // Mark as non-end of word
            current.isEndOfWord = false;
            
            // Return true if this node has no other children
            return current.children.isEmpty();
        }
        
        char c = word.charAt(index);
        TrieNode node = current.children.get(c);
        
        if (node == null) {
            return false;
        }
        
        // Decrement the word count as we're removing a word
        node.wordCount--;
        
        boolean shouldDeleteCurrentNode = delete(node, word, index + 1);
        
        // If true is returned from the recursive call and the node is not 
        // the end of another word, we can delete it
        if (shouldDeleteCurrentNode && !node.isEndOfWord) {
            current.children.remove(c);
            return current.children.isEmpty();
        }
        
        return false;
    }
    
    /**
     * Returns all words in the trie with the given prefix.
     */
    public List<String> getWordsWithPrefix(String prefix) {
        List<String> result = new ArrayList<>();
        TrieNode node = findNode(prefix);
        
        if (node != null) {
            collectWords(node, prefix, result);
        }
        
        return result;
    }
    
    /**
     * Helper method to collect all words from a given node.
     */
    private void collectWords(TrieNode node, String prefix, List<String> result) {
        if (node.isEndOfWord) {
            result.add(prefix);
        }
        
        for (Map.Entry<Character, TrieNode> entry : node.children.entrySet()) {
            collectWords(entry.getValue(), prefix + entry.getKey(), result);
        }
    }
}

How the Trie Works

Let's break down the key operations of our Trie implementation:

Insertion

When inserting a word like code, we:

Start at the root node
For each character ('c', 'o', 'd', 'e'):
- Create a new node if the character doesn't exist as a child
- Move to that child node
- Increment the word count for prefix statistics
Mark the final node as the end of a word

This process gives us O(m) time complexity, where m is the length of the word.

Searching

To search for a word:

Start at the root and follow the path defined by each character
If at any point a character doesn't exist in the current node's children, return false
If we reach the end of the word, check if the current node is marked as an end of word

Like insertion, search has O(m) time complexity.

Deletion

Deletion is more complex as we need to:

Find the end node of the word
Mark it as not an end of word
Remove nodes that are no longer needed (no children and not marking the end of another word)
Update word counts along the path

Prefix Operations

The real power of Tries comes from prefix operations:

startsWith(prefix): Check if any word starts with the given prefix
countWordsWithPrefix(prefix): Count words with the given prefix
getWordsWithPrefix(prefix): Get all words with the given prefix

Practical Example: Autocomplete System

Let's see how our Trie can be used to build a simple autocomplete system:

java
public class AutocompleteSystem {
    private Trie dictionary;
    
    public AutocompleteSystem() {
        dictionary = new Trie();
        
        // Initialize with some words
        String[] words = {
            "apple", "application", "apply", "apartment",
            "book", "bookstore", "bookmark",
            "code", "coding", "coder", "computer"
        };
        
        for (String word : words) {
            dictionary.insert(word);
        }
    }
    
    public List<String> getSuggestions(String prefix) {
        return dictionary.getWordsWithPrefix(prefix);
    }
    
    public static void main(String[] args) {
        AutocompleteSystem system = new AutocompleteSystem();
        
        // Test with different prefixes
        System.out.println("Suggestions for 'a': " + system.getSuggestions("a"));
        System.out.println("Suggestions for 'ap': " + system.getSuggestions("ap"));
        System.out.println("Suggestions for 'co': " + system.getSuggestions("co"));
        
        // Count words with prefix
        System.out.println("Words starting with 'co': " + 
                          system.dictionary.countWordsWithPrefix("co"));
        
        // Delete a word and check again
        system.dictionary.delete("code");
        System.out.println("After deleting 'code', suggestions for 'co': " + 
                          system.getSuggestions("co"));
    }
}

Output:

plaintext
Suggestions for 'a': [apple, application, apply, apartment]
Suggestions for 'ap': [apple, application, apply, apartment]
Suggestions for 'co': [code, coding, coder, computer]
Words starting with 'co': 4
After deleting 'code', suggestions for 'co': [coding, coder, computer]

Performance Analysis

Let's analyze the performance of our Trie implementation:

Time Complexity

Insert: O(m) where m is the length of the word
Search: O(m) where m is the length of the word
Delete: O(m) where m is the length of the word
Prefix operations: O(m + k) where m is the length of the prefix and k is the number of words with that prefix

Space Complexity

O(n × m) where n is the number of words and m is the average word length
This can be reduced if many words share common prefixes

Comparison with Other Data Structures

Operation	Trie	HashMap	ArrayList
Insert	O(m)	O(m)	O(n)
Search	O(m)	O(m)	O(n×m)
Delete	O(m)	O(m)	O(n)
Prefix Search	O(m+k)	O(n×m)	O(n×m)

Where:

m = length of the word
n = number of words
k = number of words with the given prefix

Real-World Applications

Tries are used in numerous applications:

Autocomplete and Search Suggestions: As demonstrated in our example
Spell Checkers: Quickly verify if a word exists in a dictionary
IP Routing: Longest prefix matching in network routers
Predictive Text: Mobile keyboards that suggest the next word
Word Games: Efficiently check if a string forms a valid word
Contact Lists: Fast lookup as users type a name

Optimizations and Variations

Several optimizations can make Tries even more efficient:

Compressed Tries (Radix Trees)

In a standard Trie, nodes with only one child can be compressed, reducing space requirements:

t -> r -> e -> e

Can be compressed to:

tree

Ternary Search Tries

Instead of using a HashMap for children, a ternary search trie uses three pointers per node:

Left (for smaller characters)
Equal (for the next character in the sequence)
Right (for larger characters)

This can be more memory-efficient for certain character distributions.

Character Set Optimization

If you know your character set is limited (e.g., only lowercase English letters), you can use an array instead of a HashMap:

java
private static class TrieNode {
    TrieNode[] children = new TrieNode[26]; // For lowercase English letters
    boolean isEndOfWord;
}

Conclusion

Whether you're building a text editor, a search engine, or any application that deals with string lookups, consider using a Trie to enhance your solution's efficiency and capabilities.

Implementing Trie (Prefix Tree): A Complete Guide with Java

The Problem: Fast String Lookups and Autocomplete

What is a Trie?

Building a Trie in Java

How the Trie Works

Insertion

Searching

Deletion

Prefix Operations

Practical Example: Autocomplete System

Performance Analysis

Time Complexity

Space Complexity

Comparison with Other Data Structures

Real-World Applications

Optimizations and Variations

Compressed Tries (Radix Trees)

Ternary Search Tries

Character Set Optimization

Conclusion

Further Reading

Related Posts

Merkle Trees: Implementation in Java and Real-World Applications

Spring Boot 3.x: What Actually Changed (and What Matters)

Java Multithreading Coding Questions for Interviews: Classic Problems and Solutions

Implementing Trie (Prefix Tree): A Complete Guide with Java

The Problem: Fast String Lookups and Autocomplete

What is a Trie?

Building a Trie in Java

How the Trie Works

Insertion

Searching

Deletion

Prefix Operations

Practical Example: Autocomplete System

Performance Analysis

Time Complexity

Space Complexity

Comparison with Other Data Structures

Real-World Applications

Optimizations and Variations

Compressed Tries (Radix Trees)

Ternary Search Tries

Character Set Optimization

Conclusion

Further Reading

Related Posts

Merkle Trees: Implementation in Java and Real-World Applications

Spring Boot 3.x: What Actually Changed (and What Matters)

Java Multithreading Coding Questions for Interviews: Classic Problems and Solutions