Longest Common Prefix of K Strings After Removal

HARD

Description

You are given an array of strings words and an integer k.

For each index i in the range [0, words.length - 1], find the length of the longest common prefix among any k strings (selected at distinct indices) from the remaining array after removing the ith element.

Return an array answer, where answer[i] is the answer for ith element. If removing the ith element leaves the array with fewer than k strings, answer[i] is 0.

 

Example 1:

Input: words = ["jump","run","run","jump","run"], k = 2

Output: [3,4,4,3,4]

Explanation:

  • Removing index 0 ("jump"):
    • words becomes: ["run", "run", "jump", "run"]. "run" occurs 3 times. Choosing any two gives the longest common prefix "run" (length 3).
  • Removing index 1 ("run"):
    • words becomes: ["jump", "run", "jump", "run"]. "jump" occurs twice. Choosing these two gives the longest common prefix "jump" (length 4).
  • Removing index 2 ("run"):
    • words becomes: ["jump", "run", "jump", "run"]. "jump" occurs twice. Choosing these two gives the longest common prefix "jump" (length 4).
  • Removing index 3 ("jump"):
    • words becomes: ["jump", "run", "run", "run"]. "run" occurs 3 times. Choosing any two gives the longest common prefix "run" (length 3).
  • Removing index 4 ("run"):
    • words becomes: ["jump", "run", "run", "jump"]. "jump" occurs twice. Choosing these two gives the longest common prefix "jump" (length 4).

Example 2:

Input: words = ["dog","racer","car"], k = 2

Output: [0,0,0]

Explanation:

  • Removing any index results in an answer of 0.

 

Constraints:

  • 1 <= k <= words.length <= 105
  • 1 <= words[i].length <= 104
  • words[i] consists of lowercase English letters.
  • The sum of words[i].length is smaller than or equal 105.

Approaches

Checkout 2 different approaches to solve Longest Common Prefix of K Strings After Removal. Click on different approaches to view the approach and algorithm in detail.

Optimized Trie with Pre-computation

This efficient approach avoids rebuilding the Trie for each word. Instead, it builds a single Trie for all words and pre-computes necessary information. For each word to be removed, it uses this pre-computed data to quickly find the longest common prefix among the remaining strings.

Algorithm

  • Build a single Trie for all words in the input array. Each node should store a count of words passing through it.
  • Perform a post-order DFS traversal on the Trie to compute two values for each node u at depth d:
    • maxLen1[u]: The maximum length of a prefix in the subtree of u (inclusive) whose original count is >= k.
    • maxLen2[u]: The maximum length of a prefix in the subtree of u (inclusive) whose original count is >= k+1.
  • Initialize an answer array ans of the same size as words.
  • For each index i from 0 to words.length - 1:
    • If words.length - 1 < k, set ans[i] = 0 and continue.
    • Initialize currentMax = 0.
    • Traverse the path for words[i] in the Trie, starting from the root. Let the current node be p.
    • At each step of the traversal for words[i] (for character at depth d):
      • Let the next node on the path be childOnPath.
      • Iterate through all children c of the current node p. If a child c is not childOnPath, it represents a branch of prefixes not found in words[i]. Update currentMax = max(currentMax, c.maxLen1).
      • Move to the next node: p = childOnPath.
      • The prefix ending at the new node p is a prefix of words[i]. Its count will be reduced by 1. It is a valid LCP if its original count was >= k+1. If so, update currentMax = max(currentMax, d + 1).
    • After the traversal for words[i] is complete, the final node p might have children. These branches are also not part of words[i]. For all children c of p, update currentMax = max(currentMax, c.maxLen1).
    • Set ans[i] = currentMax.
  • Return ans.

The core idea is to build one Trie for all n words and then, for each i, calculate the result without modifying the Trie structure. First, we insert all words into a single Trie where each node stores a count of how many words pass through it.

The key insight is that when we remove words[i], the count of nodes corresponding to prefixes of words[i] decreases by 1, while counts for all other nodes remain the same. A prefix P is a candidate for the LCP if its count among the remaining n-1 strings is at least k.

  • If P is a prefix of words[i], its new count is original_count - 1. It's a candidate if original_count - 1 >= k, i.e., original_count >= k+1.
  • If P is not a prefix of words[i], its count is unchanged. It's a candidate if original_count >= k.

To find the maximum length efficiently, we pre-compute two values for each node u in the Trie using a post-order DFS traversal:

  1. maxLen1: The maximum length of a prefix in the subtree of u (inclusive) whose original count is >= k.
  2. maxLen2: The maximum length of a prefix in the subtree of u (inclusive) whose original count is >= k+1.

After this pre-computation, we iterate through each words[i] to calculate answer[i]. We traverse the path of words[i] in the Trie. At each node p on the path, the maximum LCP length can come from two sources:

  1. Prefixes of words[i] itself. We check if count(p) >= k+1.
  2. Prefixes not belonging to words[i]. These are found in the subtrees of p's children that are not on the path of words[i]. For any such "off-path" child c, the maximum length we can get from its subtree is maxLen1[c], since counts in that subtree are unaffected.

We take the maximum of all these possibilities to find answer[i].

class Solution {
    class TrieNode {
        TrieNode[] children = new TrieNode[26];
        int count = 0;
        int maxLen1 = 0; // Max depth in subtree with count >= k
        int maxLen2 = 0; // Max depth in subtree with count >= k+1
    }

    public int[] longestCommonPrefix(String[] words, int k) {
        int n = words.length;
        int[] answer = new int[n];
        if (k > n - 1) {
            return answer; // All answers are 0
        }

        TrieNode root = new TrieNode();
        for (String word : words) {
            TrieNode curr = root;
            for (char c : word.toCharArray()) {
                if (curr.children[c - 'a'] == null) {
                    curr.children[c - 'a'] = new TrieNode();
                }
                curr = curr.children[c - 'a'];
                curr.count++;
            }
        }

        dfs(root, 0, k);

        for (int i = 0; i < n; i++) {
            String word = words[i];
            TrieNode curr = root;
            int maxLcp = 0;

            for (int j = 0; j < word.length(); j++) {
                char c = word.charAt(j);
                TrieNode nextNode = curr.children[c - 'a'];
                
                // Check other branches
                for (int childIdx = 0; childIdx < 26; childIdx++) {
                    TrieNode child = curr.children[childIdx];
                    if (child != null && child != nextNode) {
                        maxLcp = Math.max(maxLcp, child.maxLen1);
                    }
                }
                
                curr = nextNode;
                if (curr.count >= k + 1) {
                    maxLcp = Math.max(maxLcp, j + 1);
                }
            }
            
            // Check branches from the last node of the word's path
            for(TrieNode child : curr.children){
                if(child != null){
                    maxLcp = Math.max(maxLcp, child.maxLen1);
                }
            }

            answer[i] = maxLcp;
        }

        return answer;
    }

    private void dfs(TrieNode node, int depth, int k) {
        if (node == null) return;

        node.maxLen1 = (node.count >= k) ? depth : 0;
        node.maxLen2 = (node.count >= k + 1) ? depth : 0;

        for (TrieNode child : node.children) {
            if (child != null) {
                dfs(child, depth + 1, k);
                node.maxLen1 = Math.max(node.maxLen1, child.maxLen1);
                node.maxLen2 = Math.max(node.maxLen2, child.maxLen2);
            }
        }
    }
}

Complexity Analysis

Time Complexity: O(S * A), where `S` is the total length of all words and `A` is the alphabet size. Building the Trie and the DFS for pre-computation takes `O(S)`. Calculating each answer takes `O(length(word) * A)`. The total for all words sums up to `O(S * A)`.Space Complexity: O(S * A), where `S` is the total length of all words and `A` is the alphabet size (26). This space is for storing the Trie. Each of the `O(S)` nodes has an array of `A` children pointers and a few integer fields.

Pros and Cons

Pros:
  • Highly efficient and passes the given constraints.
  • Avoids redundant computations by using a single Trie and pre-computation.
  • Scales well with a large number of words.
Cons:
  • More complex to implement and reason about compared to the brute-force approach.
  • Requires more space to store the pre-computed values in each Trie node.

Code Solutions

Checking out 2 solutions in different languages for Longest Common Prefix of K Strings After Removal. Click on different languages to view the code.

class Solution {
  static class TrieNode {
    int count = 0;
    int depth = 0;
    int[] children = new int[26];
    TrieNode() {
      for (int i = 0; i < 26; ++i)
        children[i] = -1;
    }
  } static class SegmentTree {
    int n;
    int[] tree;
    int[] globalCount;
    SegmentTree(int n, int[] globalCount) {
      this.n = n;
      this.globalCount = globalCount;
      this.tree = new int[4 * (n + 1)];
      for (int i = 0; i < tree.length; i++)
        tree[i] = -1;
      build(1, 1, n);
    }
    void build(int idx, int l, int r) {
      if (l == r) {
        tree[idx] = globalCount[l] > 0 ? l : -1;
        return;
      }
      int mid = (l + r) / 2;
      build(idx * 2, l, mid);
      build(idx * 2 + 1, mid + 1, r);
      tree[idx] = Math.max(tree[idx * 2], tree[idx * 2 + 1]);
    }
    void update(int idx, int l, int r, int pos, int newVal) {
      if (l == r) {
        tree[idx] = newVal > 0 ? l : -1;
        return;
      }
      int mid = (l + r) / 2;
      if (pos <= mid) {
        update(idx * 2, l, mid, pos, newVal);
      } else {
        update(idx * 2 + 1, mid + 1, r, pos, newVal);
      }
      tree[idx] = Math.max(tree[idx * 2], tree[idx * 2 + 1]);
    }
    int query() { return tree[1]; }
  } public int[] longestCommonPrefix(String[] words, int k) {
    int n = words.length;
    int[] ans = new int[n];
    if (n - 1 < k)
      return ans;
    ArrayList<TrieNode> trie = new ArrayList<>();
    trie.add(new TrieNode());
    for (String word : words) {
      int cur = 0;
      for (char c : word.toCharArray()) {
        int idx = c - 'a';
        if (trie.get(cur).children[idx] == -1) {
          trie.get(cur).children[idx] = trie.size();
          TrieNode node = new TrieNode();
          node.depth = trie.get(cur).depth + 1;
          trie.add(node);
        }
        cur = trie.get(cur).children[idx];
        trie.get(cur).count++;
      }
    }
    int maxDepth = 0;
    for (int i = 1; i < trie.size(); ++i) {
      if (trie.get(i).count >= k) {
        maxDepth = Math.max(maxDepth, trie.get(i).depth);
      }
    }
    int[] globalCount = new int[maxDepth + 1];
    for (int i = 1; i < trie.size(); ++i) {
      TrieNode node = trie.get(i);
      if (node.count >= k && node.depth <= maxDepth) {
        globalCount[node.depth]++;
      }
    }
    List<List<Integer>> fragileList = new ArrayList<>();
    for (int i = 0; i < n; ++i) {
      fragileList.add(new ArrayList<>());
    }
    for (int i = 0; i < n; ++i) {
      int cur = 0;
      for (char c : words[i].toCharArray()) {
        int idx = c - 'a';
        cur = trie.get(cur).children[idx];
        if (trie.get(cur).count == k) {
          fragileList.get(i).add(trie.get(cur).depth);
        }
      }
    }
    int segSize = maxDepth;
    if (segSize >= 1) {
      SegmentTree segTree = new SegmentTree(segSize, globalCount);
      for (int i = 0; i < n; ++i) {
        if (n - 1 < k) {
          ans[i] = 0;
        } else {
          for (int d : fragileList.get(i)) {
            segTree.update(1, 1, segSize, d, globalCount[d] - 1);
          }
          int res = segTree.query();
          ans[i] = res == -1 ? 0 : res;
          for (int d : fragileList.get(i)) {
            segTree.update(1, 1, segSize, d, globalCount[d]);
          }
        }
      }
    }
    return ans;
  }
}

Video Solution

Watch the video walkthrough for Longest Common Prefix of K Strings After Removal



Data Structures:

ArrayStringTrie

Subscribe to Scale Engineer newsletter

Learn about System Design, Software Engineering, and interview experiences every week.

No spam, unsubscribe at any time.

Longest Common Prefix of K Strings After Removal | Scale Engineer