Trie 树 | Notion

结构示意

当我们在 Trie 树中查找一个字符串的时候，比如查找字符串“her”，那我们将要查找的字符串分割成单个的字符 h，e，r，然后从 Trie 树的根节点开始匹配。

这里红色节点表示一个单词的结束，红色节点并非都是叶子节点。

在实现上，Trie 树是一个多叉树，每一个节点对应一个字符，因此，我们可以采用数组或者 Hash表两种方式来记录子树。

在实现上，Hash表更节约内存，数组因为每个节点都要分配 K（ K是字符串中所有字符的总数）个数组空间，所以更浪费内存，但有更好的性能。

构建 Trie 树的过程，需要扫描所有的字符串，时间复杂度是 O(n)（n 表示所有字符串的长度和）。

每次查询时，如果要查询的字符串长度是 k，那我们只需要比对大约 k 个节点，就能完成查询操作。所以查询的时间负载度是 O( k )，k 表示要查找的字符串的长度。

public class Trie {
  private TrieNode root = new TrieNode('/'); // 存储无意义字符

  // 往Trie树中插入一个字符串
  public void insert(char[] text) {
    TrieNode p = root;
    for (int i = 0; i < text.length; ++i) {
      int index = text[i] - 'a';
      if (p.children[index] == null) {
        TrieNode newNode = new TrieNode(text[i]);
        p.children[index] = newNode;
      }
      p = p.children[index];
    }
    p.isEndingChar = true;
  }

  // 在Trie树中查找一个字符串
  public boolean find(char[] pattern) {
    TrieNode p = root;
    for (int i = 0; i < pattern.length; ++i) {
      int index = pattern[i] - 'a';
      if (p.children[index] == null) {
        return false; // 不存在pattern
      }
      p = p.children[index];
    }
    if (p.isEndingChar == false) return false; // 不能完全匹配，只是前缀
    else return true; // 找到pattern
  }

  public class TrieNode {
    public char data;
    public TrieNode[] children = new TrieNode[26];
    public boolean isEndingChar = false;
    public TrieNode(char data) {
      this.data = data;
    }
  }
}

其他变种

替换数据结构

通过将数组替换为其他数据结构，如序数组、跳表、散列表、红黑树等，Trie 树有多种变种。

缩点优化

对只有一个子节点的节点，而且此节点不是一个串的结束节点，可以将此节点与子节点合并。

双数组Trie树

实际项目中由于内存问题，原生的 Trie 树用的不是很多，这里还有另一种实现。

结构示意

其他变种

替换数据结构

缩点优化

双数组Trie树

Compressed trie

应用