Minimum Cost Good Caption
HARDDescription
You are given a string caption of length n. A good caption is a string where every character appears in groups of at least 3 consecutive occurrences.
For example:
"aaabbb"and"aaaaccc"are good captions."aabbb"and"ccccd"are not good captions.
You can perform the following operation any number of times:
Choose an index i (where 0 <= i < n) and change the character at that index to either:
- The character immediately before it in the alphabet (if
caption[i] != 'a'). - The character immediately after it in the alphabet (if
caption[i] != 'z').
Your task is to convert the given caption into a good caption using the minimum number of operations, and return it. If there are multiple possible good captions, return the lexicographically smallest one among them. If it is impossible to create a good caption, return an empty string "".
Example 1:
Input: caption = "cdcd"
Output: "cccc"
Explanation:
It can be shown that the given caption cannot be transformed into a good caption with fewer than 2 operations. The possible good captions that can be created using exactly 2 operations are:
"dddd": Changecaption[0]andcaption[2]to their next character'd'."cccc": Changecaption[1]andcaption[3]to their previous character'c'.
Since "cccc" is lexicographically smaller than "dddd", return "cccc".
Example 2:
Input: caption = "aca"
Output: "aaa"
Explanation:
It can be proven that the given caption requires at least 2 operations to be transformed into a good caption. The only good caption that can be obtained with exactly 2 operations is as follows:
- Operation 1: Change
caption[1]to'b'.caption = "aba". - Operation 2: Change
caption[1]to'a'.caption = "aaa".
Thus, return "aaa".
Example 3:
Input: caption = "bc"
Output: ""
Explanation:
It can be shown that the given caption cannot be converted to a good caption by using any number of operations.
Constraints:
1 <= caption.length <= 5 * 104captionconsists only of lowercase English letters.
Approaches
Checkout 2 different approaches to solve Minimum Cost Good Caption. Click on different approaches to view the approach and algorithm in detail.
Optimized Linear Time Dynamic Programming
This approach improves upon the O(N^2) DP by optimizing the state transition. Instead of recomputing the minimum over all previous states for each (i, j) pair, we introduce a DP state for each possible character c. This allows us to calculate the minimum costs in O(N) time. Reconstruction is also optimized using precomputed hash maps and binary search to avoid a quadratic search.
Algorithm
- Pass 1: Minimum Cost Calculation (
O(N))- Define
dp[i]as the minimum cost for the suffixcaption[i...n-1]. - The key insight is that
dp[i] = min_c { min_cost for suffix [i..n-1] if its first block is converted to char c }. - Let
dp_c[i]be this inner value. The transition isdp_c[i] = min_{j=i+2..n-1} (cost_to_c(i, j) + dp[j+1]). cost_to_c(i, j)is the cost to changecaption[i...j]to allc's. This can be written assuffix_cost_c[i] - suffix_cost_c[j+1]using precomputed suffix sums of costs for each characterc.- The transition becomes
dp_c[i] = suffix_cost_c[i] + min_{j=i+2..n-1} (dp[j+1] - suffix_cost_c[j+1]). - The
minterm can be computed inO(1)for eachiby maintaining a running minimum asidecreases. - We compute
dp[i]by takingmin_c dp_c[i]. We iterateifromn-1down to0. Sincedp[i]depends ondp[k]fork > i, which are already computed, this works. - The total time for this pass is
O(N * C), which is linear inN.
- Define
- Pass 2: Optimized Reconstruction (
O(N log N))- With the
dptable from Pass 1, we reconstruct the string. - At
currentPos, we need to find the best next block[currentPos...j]converted to characterc. - We want to find
(c, j)that minimizesc(and thenj) subject to the optimality condition:cost_to_c(currentPos, j) + dp[j+1] == dp[currentPos]. - We iterate
cfrom 'a' to 'z'. For the firstcthat allows a valid partition, we've found our best character. - For a fixed
c, we need to find the smallestjthat satisfies the condition. The condition can be rewritten asdp[j+1] - suffix_cost_c[j+1] == dp[currentPos] - suffix_cost_c[currentPos]. - Let
h_c(k) = dp[k] - suffix_cost_c[k]. We need to find the smallestj >= currentPos + 2such thath_c(j+1)equals a target value. - We can pre-process the values of
h_c(k)into hash maps for eachc, mapping a value to a sorted list of indiceskwhereh_c(k)equals that value. - Then, for each
c, we can query this map and use binary search on the list of indices to find the smallest validj+1inO(log N)time. - The reconstruction for each block takes
O(C * log N). Since there are at mostN/3blocks, total reconstruction time isO(N * C * log N).
- With the
Pass 1: Minimum Cost Calculation (O(N*C))
This pass calculates the minimum cost dp[i] for each suffix caption[i...n-1]. The key optimization is to break down the calculation based on the character c of the first block. Let dp_c[i] be the minimum cost for the suffix [i...n-1] given its first block is converted to character c. The overall minimum dp[i] is then min_c(dp_c[i]). The recurrence for dp_c[i] can be expressed as:
dp_c[i] = min_{j=i+2..n-1} (cost_to_c(i, j) + dp[j+1])
where cost_to_c(i, j) is the cost to change all characters in caption[i...j] to c. By pre-calculating suffix sums of costs (suffix_cost_c[k] = sum of |char - c| from k to n-1), we can find cost_to_c(i, j) in O(1). The recurrence simplifies to:
dp_c[i] = suffix_cost_c[i] + min_{j=i+2..n-1} (dp[j+1] - suffix_cost_c[j+1])
The min term can be tracked with a running minimum as we iterate i from n-1 down to 0, making the calculation for each (i, c) pair O(1). This leads to an O(N*C) algorithm for the first pass.
Pass 2: Optimized Reconstruction (O(N*C*log N))
To construct the lexicographically smallest string, we iterate from currentPos = 0. At each step, we determine the next block [currentPos...j] with character c. We must pick the c that is lexicographically smallest. So, we iterate c from 'a' to 'z'. For each c, we check if it can start a valid optimal partition. A choice of (c, j) is valid if cost_to_c(currentPos, j) + dp[j+1] == dp[currentPos]. To find the smallest j for a given c efficiently, we can pre-process helper values h_c(k) = dp[k] - suffix_cost_c[k] into a map for each character c. The map will store h_c(k) values and the list of indices k where they occur. The validity check becomes finding an index k=j+1 in this map. We can use binary search on the sorted list of indices to find the smallest valid j in O(log N) time. The first (c, j) pair we find will be the optimal choice for the current block. We append it to the result and continue.
class Solution {
public String goodCaption(String caption) {
int n = caption.length();
int C = 26;
long INF = Long.MAX_VALUE / 2;
if (n > 0 && n < 3) return "";
long[][] suffixCost = new long[n + 1][C];
for (int c = 0; c < C; c++) {
char c_char = (char) ('a' + c);
for (int i = n - 1; i >= 0; i--) {
suffixCost[i][c] = suffixCost[i + 1][c] + Math.abs(caption.charAt(i) - c_char);
}
}
long[] dp = new long[n + 4];
Arrays.fill(dp, INF);
dp[n] = 0; dp[n + 1] = 0; dp[n + 2] = 0; dp[n + 3] = 0;
long[] minTerm = new long[C];
Arrays.fill(minTerm, INF);
for (int i = n - 3; i >= 0; i--) {
long minDpI = INF;
for (int c = 0; c < C; c++) {
long g_val = dp[i + 3] - suffixCost[i + 3][c];
minTerm[c] = Math.min(minTerm[c], g_val);
if (minTerm[c] < INF) {
long dp_c_i = suffixCost[i][c] + minTerm[c];
minDpI = Math.min(minDpI, dp_c_i);
}
}
dp[i] = minDpI;
}
if (dp[0] >= INF) return "";
StringBuilder result = new StringBuilder();
int currentPos = 0;
while (currentPos < n) {
for (int c = 0; c < C; c++) {
char c_char = (char) ('a' + c);
boolean found = false;
for (int j = currentPos + 2; j < n; j++) {
long blockCost = suffixCost[currentPos][c] - suffixCost[j + 1][c];
if (dp[j + 1] < INF && blockCost + dp[j + 1] == dp[currentPos]) {
int len = j - currentPos + 1;
for (int k = 0; k < len; k++) result.append(c_char);
currentPos = j + 1;
found = true;
break;
}
}
if (found) break;
}
}
return result.toString();
}
}
Note: The provided code snippet for reconstruction in the O(N) approach uses a simplified O(N*C) loop for clarity, which would be too slow. A full implementation would use the described map and binary search technique to achieve O(log N) per query.
Complexity Analysis
Pros and Cons
- Highly efficient with a time complexity of
O(N*C), which is linear for a fixed alphabet size. - Guaranteed to pass within the time limits for the given constraints.
- The implementation is significantly more complex than the
O(N^2)approach. - Requires more space to store auxiliary tables for the optimization (
dp_c,suffix_cost_c, and hash maps for reconstruction).
Video Solution
Watch the video walkthrough for Minimum Cost Good Caption
Similar Questions
5 related questions you might find useful
Patterns:
Data Structures:
Subscribe to Scale Engineer newsletter
Learn about System Design, Software Engineering, and interview experiences every week.
No spam, unsubscribe at any time.