Sum of Prefix Scores of Strings

HARD

Description

You are given an array words of size n consisting of non-empty strings.

We define the score of a string term as the number of strings words[i] such that term is a prefix of words[i].

For example, if words = ["a", "ab", "abc", "cab"], then the score of "ab" is 2, since "ab" is a prefix of both "ab" and "abc".

Return an array answer of size n where answer[i] is the sum of scores of every non-empty prefix of words[i].

Note that a string is considered as a prefix of itself.

Example 1:

Input: words = ["abc","ab","bc","b"]
Output: [5,4,3,2]
Explanation: The answer for each string is the following:
- "abc" has 3 prefixes: "a", "ab", and "abc".
- There are 2 strings with the prefix "a", 2 strings with the prefix "ab", and 1 string with the prefix "abc".
The total is answer[0] = 2 + 2 + 1 = 5.
- "ab" has 2 prefixes: "a" and "ab".
- There are 2 strings with the prefix "a", and 2 strings with the prefix "ab".
The total is answer[1] = 2 + 2 = 4.
- "bc" has 2 prefixes: "b" and "bc".
- There are 2 strings with the prefix "b", and 1 string with the prefix "bc".
The total is answer[2] = 2 + 1 = 3.
- "b" has 1 prefix: "b".
- There are 2 strings with the prefix "b".
The total is answer[3] = 2.

Example 2:

Input: words = ["abcd"]
Output: [4]
Explanation:
"abcd" has 4 prefixes: "a", "ab", "abc", and "abcd".
Each prefix has a score of one, so the total is answer[0] = 1 + 1 + 1 + 1 = 4.

Constraints:

1 <= words.length <= 1000
1 <= words[i].length <= 1000
words[i] consists of lowercase English letters.

Approaches

Checkout 3 different approaches to solve Sum of Prefix Scores of Strings. Click on different approaches to view the approach and algorithm in detail.

Optimal Approach using a Trie (Prefix Tree)

The most efficient solution utilizes a Trie data structure, which is perfectly suited for problems involving prefixes. We first build a Trie from all the words, incrementing a counter in each node to track how many words pass through it (i.e., the prefix score). Then, for each word, we traverse the Trie and sum the counters of the nodes corresponding to its prefixes.

Algorithm

Define a TrieNode class containing an array of children (e.g., size 26 for lowercase English letters) and an integer count.
First Pass (Build Trie):
- Create a root node for the Trie.
- Iterate through each word in the words array.
- For each word, traverse the Trie from the root, character by character.
- For each character, move to the corresponding child node. If it doesn't exist, create it.
- Increment the count of the visited node. This count represents the score of the prefix ending at that node.
Second Pass (Calculate Scores):
- Initialize an integer array answer of size n.
- Iterate through each word at index i in the words array.
- Initialize totalScore to 0.
- Traverse the Trie again for the current word, character by character.
- At each node in the traversal, add its count to totalScore.
- After traversing the entire word, store totalScore in answer[i].
Return the answer array.

A Trie, or prefix tree, is an ideal data structure for this problem. Each node in the Trie can represent a character, and a path from the root to a node represents a prefix. We can augment the Trie node to store a count of how many words in the input array share the prefix represented by that node.

The algorithm works in two passes:

Build Trie and Count Prefixes: We insert every word from the input array into the Trie. While inserting a word, we traverse the Trie character by character. For each node we visit (or create), we increment its count field. By the end of this pass, the count at any node will represent the score of the prefix corresponding to that node.
Calculate Sum of Scores: We iterate through the input words array again. For each word, we traverse the Trie following the path of its characters. As we visit each node along the path, we add its count to a running total for the current word. This sum is the answer for that word.

This approach is highly efficient because it processes each character of each word a constant number of times.

class TrieNode {
    TrieNode[] children;
    int count;

    public TrieNode() {
        children = new TrieNode[26];
        count = 0;
    }
}

class Solution {
    public int[] sumPrefixScores(String[] words) {
        TrieNode root = new TrieNode();

        // Pass 1: Build the Trie and count prefixes
        for (String word : words) {
            TrieNode curr = root;
            for (char c : word.toCharArray()) {
                int index = c - 'a';
                if (curr.children[index] == null) {
                    curr.children[index] = new TrieNode();
                }
                curr = curr.children[index];
                curr.count++;
            }
        }

        int n = words.length;
        int[] answer = new int[n];

        // Pass 2: Calculate the sum of scores for each word
        for (int i = 0; i < n; i++) {
            String word = words[i];
            TrieNode curr = root;
            int totalScore = 0;
            for (char c : word.toCharArray()) {
                int index = c - 'a';
                curr = curr.children[index];
                totalScore += curr.count;
            }
            answer[i] = totalScore;
        }

        return answer;
    }
}

Complexity Analysis

Time Complexity: O(L), where `L` is the total number of characters in all words combined (`sum(|w|)`). We traverse each character of each word twice: once to build the Trie and once to calculate the scores. Each character operation (node creation, traversal, increment) takes constant time.Space Complexity: O(L), where `L` is the total number of characters in all words combined (`sum(|w|)`). The space is dominated by the Trie. In the worst case, where there are no overlapping prefixes, the number of nodes in the Trie is equal to the total number of characters `L`. Each node has a constant size.

Pros and Cons

Pros:

Highly efficient in both time and space.
It's the optimal solution for this problem as it avoids string manipulations and redundant computations.

Cons:

Requires knowledge of the Trie data structure, making it slightly more complex to implement than the other approaches.

Code Solutions

Checking out 4 solutions in different languages for Sum of Prefix Scores of Strings. Click on different languages to view the code.

Video Solution

Watch the video walkthrough for Sum of Prefix Scores of Strings

Patterns:

Counting

Data Structures:

ArrayStringTrie