Number of Unique Good Subsequences

HARD

Description

You are given a binary string binary. A subsequence of binary is considered good if it is not empty and has no leading zeros (with the exception of "0").

Find the number of unique good subsequences of binary.

  • For example, if binary = "001", then all the good subsequences are ["0", "0", "1"], so the unique good subsequences are "0" and "1". Note that subsequences "00", "01", and "001" are not good because they have leading zeros.

Return the number of unique good subsequences of binary. Since the answer may be very large, return it modulo 109 + 7.

A subsequence is a sequence that can be derived from another sequence by deleting some or no elements without changing the order of the remaining elements.

 

Example 1:

Input: binary = "001"
Output: 2
Explanation: The good subsequences of binary are ["0", "0", "1"].
The unique good subsequences are "0" and "1".

Example 2:

Input: binary = "11"
Output: 2
Explanation: The good subsequences of binary are ["1", "1", "11"].
The unique good subsequences are "1" and "11".

Example 3:

Input: binary = "101"
Output: 5
Explanation: The good subsequences of binary are ["1", "0", "1", "10", "11", "101"]. 
The unique good subsequences are "0", "1", "10", "11", and "101".

 

Constraints:

  • 1 <= binary.length <= 105
  • binary consists of only '0's and '1's.

Approaches

Checkout 2 different approaches to solve Number of Unique Good Subsequences. Click on different approaches to view the approach and algorithm in detail.

Dynamic Programming

A highly efficient approach using dynamic programming. We iterate through the binary string once, keeping track of the number of unique good subsequences ending in '0' and '1'. This avoids generating and storing the actual subsequences, leading to a linear time and constant space solution.

Algorithm

  • Initialize two variables, endsWithZero = 0 and endsWithOne = 0, to store the counts of unique good subsequences ending with '0' and '1' respectively. These subsequences must start with '1'.
  • Initialize a boolean flag hasZero = false to track if the input string contains a '0'.
  • Define the modulus MOD = 10^9 + 7.
  • Iterate through each character c of the input binary string.
  • If c == '1': The new count of subsequences ending in '1' is the sum of all previously found unique good subsequences plus 1 (for the new subsequence "1"). Update endsWithOne = (endsWithZero + endsWithOne + 1) % MOD.
  • If c == '0': The new count of subsequences ending in '0' is the sum of all previously found unique good subsequences. Update endsWithZero = (endsWithZero + endsWithOne) % MOD. Also, set hasZero = true.
  • After the loop, the total number of unique good subsequences starting with '1' is (endsWithZero + endsWithOne) % MOD.
  • The final result is this total plus 1 if hasZero is true (to account for the unique good subsequence "0").

This problem can be solved efficiently by breaking it down. The set of unique good subsequences consists of two disjoint sets:

  1. The single subsequence "0" (if the input contains at least one '0').
  2. The set of all unique non-empty subsequences that start with '1'.

We can calculate the size of the second set using dynamic programming. Let's maintain two counts as we iterate through the string:

  • endsWithZero: The number of unique good subsequences seen so far that end with '0'.
  • endsWithOne: The number of unique good subsequences seen so far that end with '1'.

When we encounter a character c:

  • If c == '1': We can form new subsequences ending in '1' by appending '1' to all existing unique good subsequences. The number of such subsequences is endsWithZero + endsWithOne. We can also form a new subsequence, "1" itself. Thus, the new count for endsWithOne becomes endsWithZero + endsWithOne + 1.
  • If c == '0': We can form new subsequences ending in '0' by appending '0' to all existing unique good subsequences. The number of such subsequences is endsWithZero + endsWithOne. We don't add 1 because a subsequence starting with "0" is not good (the special case "0" is handled separately). Thus, the new count for endsWithZero becomes endsWithZero + endsWithOne.

We also track whether we've seen a '0' in the input string with a hasZero flag. The final answer is (endsWithZero + endsWithOne + (hasZero ? 1 : 0)) % MOD.

class Solution {
    public int numberOfUniqueGoodSubsequences(String binary) {
        int mod = 1_000_000_007;
        long endsWithZero = 0;
        long endsWithOne = 0;
        boolean hasZero = false;

        for (char c : binary.toCharArray()) {
            if (c == '1') {
                // New subsequences ending with '1' are formed by appending '1' to:
                // - all previous good subsequences ending with '0'
                // - all previous good subsequences ending with '1'
                // - the empty subsequence (which forms "1")
                endsWithOne = (endsWithZero + endsWithOne + 1) % mod;
            } else { // c == '0'
                // New subsequences ending with '0' are formed by appending '0' to:
                // - all previous good subsequences ending with '0'
                // - all previous good subsequences ending with '1'
                // We don't add 1 because "0" itself is not a good subsequence we count here.
                endsWithZero = (endsWithZero + endsWithOne) % mod;
                hasZero = true;
            }
        }

        long result = (endsWithZero + endsWithOne) % mod;
        if (hasZero) {
            // Add 1 for the unique good subsequence "0"
            result = (result + 1) % mod;
        }

        return (int) result;
    }
}

Complexity Analysis

Time Complexity: O(N), where N is the length of the binary string. We perform a single pass through the string, with constant time operations at each step.Space Complexity: O(1). We only use a few variables (`endsWithZero`, `endsWithOne`, `hasZero`) to store our state, regardless of the input string's length.

Pros and Cons

Pros:
  • Optimal solution with linear time complexity.
  • Constant space complexity, making it highly memory-efficient.
  • Easily handles the large constraints of the problem.
Cons:
  • The dynamic programming logic can be non-intuitive to derive without careful consideration of how unique subsequences are formed.

Code Solutions

Checking out 3 solutions in different languages for Number of Unique Good Subsequences. Click on different languages to view the code.

class Solution {
public
  int numberOfUniqueGoodSubsequences(String binary) {
    final int mod = (int)1 e9 + 7;
    int f = 0, g = 0;
    int ans = 0;
    for (int i = 0; i < binary.length(); ++i) {
      if (binary.charAt(i) == '0') {
        g = (g + f) % mod;
        ans = 1;
      } else {
        f = (f + g + 1) % mod;
      }
    }
    ans = (ans + f + g) % mod;
    return ans;
  }
}

Video Solution

Watch the video walkthrough for Number of Unique Good Subsequences



Patterns:

Dynamic Programming

Data Structures:

String

Companies:

Subscribe to Scale Engineer newsletter

Learn about System Design, Software Engineering, and interview experiences every week.

No spam, unsubscribe at any time.