Woodstock Blog

a tech blog for general algorithmic interview questions

[Question] Number of Distinct Sub-sequence

Question

link

Find the number of distinct subsequences of a string (include “” as a subsequence).

For example, Input

AAA 
ABCDEFG 
CODECRAFT 

Output

4 
128 
496 

Solution

In [LeetCode 115] Distinct Subsequences, we discuss finding occurence of a given subsequence.

Now if we do not specify a subsequence, we want the total number of distinct subsequence.

The solution is DP, with the following equation:

Let, 

dp[i] = number of distinct subsequences ending with a[i]

last[i] = last position of character i in the given string.

Equation:

dp[i] = dp[last[i] - 1] + ... + dp[i - 1]

The final result is:

Distinct Subsequences = dp[1] + ... dp[len - 1]

Example 1:

Input   : - A B C
dp array: 1 1 2 4
Total = 8

Example 2:

Input   : - A A C
dp array: 1 1 1 3
Total = 6

The code is posted below.

Optimize Solution

There is a good optimization of this DP solution, which is to keep another dp array ‘sum’, which sum[i] = dp[1] + dp[2] + … + dp[i]. The final answer would be sum[len - 1].

This nice idea is from this post. Credit goes to IVlad.

Code

un-optimized code. calculate dp[0] … dp[n], then sum to final result.

public int countDistinctSubseq(String input) {
    int len = input.length();
    int[] dp = new int[len + 1];
    // dp[i] denotes the number of distinct subseq within first 'i' chars
    dp[0] = 1;
    // the first 0 chars is "" - we consider it as 1 subseq

    for (int i = 1; i <= len; i++) {
        // set dp[i]
        // dp[i] = dp[i-1] + ... + dp[k] where input{k} == input{i}
        int p = i - 1;
        while (p >= 0) {
            dp[i] += dp[p];
            if (p > 0 && input.charAt(p - 1) == input.charAt(i - 1)) {
                // when meeting a same char ahead of position i, stop
                // adding to dp[i]
                break;
            }
            p--;
        }
    }
    int sum = 0;
    for (int i : dp) {
        sum += i;
    }
    return sum;
}