Woodstock Blog

a tech blog for general algorithmic interview questions

[Amazon] Longest Repeating Substring

Question

link

Finding the longest repeated substring.

Example: “banana” ==> “ana”

Solution

There are 2 solutions: Suffix array, and Suffix tree.

1. Suffix array. Simple code, explained here.

Bentley’s programming pearl book has the simplest implementation (less than 15 lines code) which sort all suffix, and then check common prefix length among adjacent suffix. The time complexity is O(n2logn) for sorting the suffix (which has avg length of O(n)).

A detailed step-by-step explanation:

str = banana, its suffixes are:
banana
anana
nana
ana
na
a

after sort, the suffix array looks like:

a
ana
anana
banana
na
nana

Then for each two adjacent suffixes, check the length of the common prefix.

The answer is “ana” (if overlapping is allowed, otherwise, should be “an”).

2. Suffix tree. Suggest by this post, Or this:

a good solution is to create a suffix tree for the given word and then find the deepest internal node in that tree (node with at least 2 descendants under it)…

For a nice PPT presentation about suffix tree, look here.

Code

Suffix array approach.

public String longestRepeat(String input) {
    int len = input.length();
    String[] suffixArray = new String[len];
    for (int i = 0; i < len; i++) {
        suffixArray[i] = input.substring(i);
    }
    // now sort the suffix array
    Arrays.sort(suffixArray);
    String longest = "";
    // start to compare neighborhood suffixes, and check LCP
    for (int i = 0; i < suffixArray.length - 1; i++) {
        String lcp = longestCommonPrefix(suffixArray[i], suffixArray[i + 1]);
        if (lcp.length() > longest.length()) {
            longest = lcp;
        }
    }
    return longest;
}

private String longestCommonPrefix(String s1, String s2) {
    int p = 0;
    while (p < s1.length() && p < s2.length()) {
        if (s1.charAt(p) != s2.charAt(p)) {
            break;
        }
        p++;
    }
    return s1.substring(0, p);
}

[Google] Crazy Distance Between Strings

Question

link

X and Y are strings formed by 0 or 1. Distance is define as:

D(X,Y) = Remove chars common at the start from both X & Y. 
Then add the remaining lengths from both the strings.

For e.g.

D(1111, 1000) = Only First alphabet is common. So the remaining string is 111 & 000. Therefore the result length("111") & length("000") = 3 + 3 = 6

For e.g.

D(0101, 01100) = Only First two alphabets are common. So the remaining string is 01 & 100. Therefore the result length("01") & length("100") = 2 + 3 = 5

Now given n input, say like

1111
1000
101
1100

Find out the maximum crazy distance between 2 strings.

n is the number of input strings. m is the max length of any input string.

Solution

This is the source.

Put the strings into a tree, where 0 means go left and 1 means go right. O(m*n) time.

Example:

            Root
             1
          0      1
         0 1*   0  1
        0*     0*    1*

where the * means that an element ends there. Constructing this tree clearly takes O(n m).

Now we have to find the diameter of the tree (the longest path between two nodes).

How to find out longest path between 2 leaf nodes? Please refer to [Google] Diameter of a Binary Tree for explanation.

Total time complexity is O(m*n) time.

Code

public int crazyDist(String[] input) {
    TreeNode root = this.buildTree(input);
    return this.findMaxPath(root).path - 1;
}

private Result findMaxPath(TreeNode node) {
    if (node == null) {
        return new Result(Integer.MIN_VALUE, 0);
    }
    Result lr = this.findMaxPath(node.left);
    Result rr = this.findMaxPath(node.right);
    int path = Math.max(lr.path, rr.path);
    if (lr.depth != 0 && rr.depth != 0) {
        // this check is important, because if any of the child node is
        // NULL, this root will not be eligible for computing the path
        path = Math.max(path, lr.depth + rr.depth + 1);
        // Why? cuz diameter must go from one leaf, thru root, and reach
        // another leaf. This is different from "Maximum Path Sum" leetcode
    }
    return new Result(path, 1 + Math.max(lr.depth, rr.depth));
}

private TreeNode buildTree(String[] input) {
    TreeNode root = new TreeNode(123);
    // share a common root. this root is deducted from the final calculation
    for (String str : input) {
        // insert str under the root
        TreeNode p = root;
        for (char c : str.toCharArray()) {
            if (c == '0') {
                if (p.left == null) {
                    p.left = new TreeNode(124);
                    // if 0, go to left; otherwise go to right
                    // thus value of TreeNodes does not really matter
                }
                p = p.left;
            } else {
                if (p.right == null) {
                    p.right = new TreeNode(125);
                }
                p = p.right;
            }
        }
    }
    return root;
}

class Result {
    int path;
    int depth;

    public Result(int a, int b) {
        path = a;
        depth = b;
    }
}

[Google] Check if Repeating Subsequence Exists

Question

link

Given a string, find if there is any sub-sequence that repeats itself. Do not reuse any char.

Eg:

1. abab <------yes, ab is repeated
2. abba <---- No, a and b follow different order 
3. acbdaghfb <-------- yes, a followed by b twice 
4. abcdacb <----- yes, a followed by b twice 

Note that no char should be reused. I.e. “aab” is a false.

Solution

This looks like a question without any clue. However, this actually is a modified version of [LintCode] Longest Common Subsequence.

Look at that question: there’s 2 input string, and they match char-by-char. For this question, we are simply matching input string with input string itself. And chars should be match ONLY at different positions, that’s the key. As pointed out by the top comment:

Now we can modify the longest_common_subsequence(a, a) to find the value of the longest repeated subsequence in a by excluding the cases when i == j

Code

public boolean checkRepeatSubseq(String input) {
    int len = input.length();
    int[][] dp = new int[len + 1][len + 1];
    // dp[i][j] denotes the length of subseq between 2 strings:
    // 1. first i chars of input
    // 2. first j chars of input
    for (int i = 1; i <= len; i++) {
        for (int j = i; j <= len; j++) {
            if (i != j && input.charAt(i - 1) == input.charAt(j - 1)) {
                int temp = Math.max(dp[i - 1][j], dp[i][j - 1]);
                dp[i][j] = Math.max(temp, dp[i - 1][j - 1] + 1);
            } else {
                dp[i][j] = Math.max(dp[i - 1][j], dp[i][j - 1]);
            }
        }
    }
    return dp[len][len] >= 2;
}

[LinkedIn] Find All Repeating Substring With Given Length

Question

link

Find all the repeating substring of specified length in a large string sequence.

For e.g.

Input String: "ABCACBABC" 
repeated sub-string length: 3 
Output: ABC 

eg.

Input String: "ABCABCA" 
repeated sub-string length: 2 
Output: AB, BC, CA

Solution

Similar to [Amazon] Longest Repeating Substring, the best solution is to do Suffix Tree, or suffix array. We then need to print nodes on a certain level, who has more than 1 descendant.

However, since the length of substring is given, we can also do simply iteration: insert all substring with given length into a HashSet, and check repetition. ref

Code

Suffix tree solution: not written.

Hashset code:

public List<String> solve(String input, int k) {
    List<String> ans = new ArrayList<String>();
    HashSet<String> set = new HashSet<String>();
    for (int i = 0; i <= input.length() - k; i++) {
        String sub = input.substring(i, i + k);
        if (set.contains(sub)) {
            ans.add(sub);
        }
        set.add(sub);
    }
    return ans;
}

[Question] All Distinct Subsequences With Given Length

Question

link

Find a polynomial-time algorithm that takes a string of length n, and a number k, output the number of distinct k-character subsequences.

For instance, input “food” and number k=2, output should be 4. There are four distinct 2-character subsequences of “food”: “fo”, “fd”, “oo”, and “od”.

Solution

Similar to [Question] Number of distinct sub-sequence, we solve this problem with DP. The dp equation is a bit difficult to write.

The idea come from comment from gareth_rees:

Let θ(S, k) be the number of distinct k-character subsequences in the string S of length n.

Clearly θ(S, k) = 1 if n = k or k = 0

and θ(S, k) = 0 if n < k.

Otherwise, choose 1 unique char from S, and deduct k by 1, then do the DP calculation with the remaining part of S.

Look at this example:

θ("food", 2) = θ("ood", 1) + θ("od", 1) + θ("", 1)
= (θ("od", 0) + θ("", 0)) + (θ("d", 0) + θ("", 0)) + 0
= (1 + 1) + (1 + 1)
= 4

“food” is divided into 3 parts. First part we choose “f” to be the first char, thus the value is θ(“ood”, 1). Second part we choose “o”, and final part we choose “d”.

Note that when we choose a char, it must never have been chosen before. In case of “food”, we only choose ‘f’, ‘o’, ’d' once for each.

This is a very difficult DP question, but the explanation really makes the answer easier. Read my implementation below.

Code

public int countSubSeq(String input, int k) {
    // assuming all input chars are small letter
    return choose(input, 0, k);
}

private int choose(String input, int start, int numChar) {
    int charLeft = input.length() - start;
    if (charLeft == numChar || numChar == 0) {
        return 1;
    } else if (charLeft < numChar || numChar < 0) {
        return 0;
    }
    // now numChar is smaller than charLeft, and larger than 0
    // start to pick a char (which is at first appearance)
    int total = 0;
    HashSet<Character> chosen = new HashSet<Character>();
    while (start < input.length()) {
        char currentChar = input.charAt(start);
        if (!chosen.contains(currentChar)) {
            // pick the char pointer by 'start'
            total += choose(input, start + 1, numChar - 1);
            chosen.add(currentChar);
        }
        start++;
    }
    return total;
}

[Google] Diameter of a Binary Tree

Question

link

The diameter of a tree (sometimes called the width) is the number of nodes on the longest path between two leaves in the tree.

Solution

This is a similar question to [LeetCode 124] Binary Tree Maximum Path Sum. However there’s a significant difference which might be overlooked while coding.

Look at this example:

     0
       1
        1
       0  1
           1

If we only want to find the max path, that would return result of 5, which is root-to-rightmost-leaf. However, the diameter should be 4, which is the distance between 2 leaf nodes.

A solution is available for reading here.

For [Google] Crazy Distance Between Strings, there is another special case: {“1”, “11”, “10”}. The program will not output correct result (1), because this is not really the diameter of a tree, but instead, a max path from a non-leaf to a leaf. I leave this part for you to finish.

Code

Refer to [Google] Crazy Distance Between Strings for complete code.

[Google] Reverse a Stack Without DS

Question

link

Reverse a stack using recursion. You are not allowed to use loops or data structure, and you can only use the following functions:

isEmpty(S)
push(S)
pop(S)

Solution

Well since we are not allowed to use additional DS or loop, we have to use system stack to help us!

We add a new method: insert at stack bottom. Then we can solve this question recursively. Nice question, and tricky answer!

Code

public void reverse(Stack<Integer> stack) {
    if (stack.isEmpty() || stack.size() == 1) {
        return;
    }
    int top = stack.pop();
    this.reverse(stack);
    this.insertAtBottom(stack, top);
}

private void insertAtBottom(Stack<Integer> stack, int val) {
    if (stack.isEmpty()) {
        stack.push(val);
        return;
    }
    int temp = stack.pop();
    this.insertAtBottom(stack, val);
    stack.push(temp);
}

[Amazon] Mininum Range That Includes at Least One

Question

link

There are many sorted arrays. Find a minimum range, so that in each array there’s at least one integer within this range.

Solution

Min-heap. source

There are k lists of sorted integers. Make a min heap of size k containing 1 element from each list. Keep track of min and max element and calculate the range.

In min heap, minimum element is at top. Delete the minimum element and another element instead of that from the same list to which minimum element belong. Repeat the process till any one of the k list gets empty.

Code

public void printMinRange(int[][] input) {
    Comparator<Pointer> compr = new HeapComparator(input);
    // Note that we pass in 'input' arrays to the comparator
    PriorityQueue<Pointer> heap = new PriorityQueue<Pointer>(SIZE, compr);

    int maxVal = Integer.MIN_VALUE;
    for (int i = 0; i < SIZE; i++) {
        heap.add(new Pointer(i, 0));
        // insert the head of each array into the heap
        maxVal = Math.max(maxVal, input[i][0]);
        // keep additional value to keep track of the max value in heap
    }

    int left = 0;
    int right = Integer.MAX_VALUE;
    while (heap.size() == SIZE) {
        Pointer p = heap.remove();
        // first, update the range
        if (maxVal - input[p.index][p.position] < right - left) {
            right = maxVal;
            left = input[p.index][p.position];
        }
        // then, push the next element after 'p' to the heap
        // meanwhile, update 'maxVal'
        if (p.position + 1 < input[p.index].length) {
            Pointer nextP = new Pointer(p.index, p.position + 1);
            heap.add(nextP);
            maxVal = Math.max(maxVal, input[nextP.index][nextP.position]);
        }
        // when 'p' is the last element in the row, terminate loop
    }
    System.out.println("Left boundary: " + left);
    System.out.println("Right boundary: " + right);
}

class HeapComparator implements Comparator<Pointer> {

    int[][] arrays = null;

    public HeapComparator(int[][] input) {
        arrays = input;
    }

    public int compare(Pointer p1, Pointer p2) {
        return arrays[p1.index][p1.position]
                - arrays[p2.index][p2.position];
    }
}

class Pointer {
    int index, position;

    public Pointer(int x, int y) {
        index = x;
        position = y;
    }
}

[Google] Maximum Count Array in a Queue

Question

link1

给一个数组a[n],令s[i]为a[i+1..n-1]中比a[i]大的数的数量。

求最大的s[i]。要求O(nlogn)

Solution

This is very similar question to [Google] Form a Queue Given Heights. The idea is to insert elements into BST and count number of larger elements.

Naitive solution can be reached with a list.

[Google] Form a Queue Given Heights

Question

link1, link2, link3.

There is an array of integers which represent heights of persons.

Given another array… Let’s call it count-array. It contain how many persons in front of him are greater than him in height.

求原数组。(原数组中元素是从1到n。)

Example:

Input(Count array): 0, 0, 2, 0
Output(原数组): 2, 3, 1, 4

求nlogn的算法。

Solution

This is naive solution from floor 29 of this thread:

总结一下,用一个List存放1…n。

从头到尾扫描给定的数组,每得到一个值,从List里删掉。

因为List里数据是有序的,因此remove操作可以使用二分法,复杂度为O(logn).

这样本算法复杂度为O(nlogn).

Example:

count array 
i C[0,0,2,0]   N[4, 3, 2, 1]
3 C[3] = 0     在N里面删除N[0]=4, N=[3, 2, 1],   Ans=[4]
2 C[2] = 2     在N里面删除N[2]=1, N=[3, 2],   Ans=[1, 4]
1 C[1] = 0     在N里面删除N[0]=3, N=[2],   Ans=[3, 1, 4]
0 C[0] = 0     在N里面删除N[0]=2, N=[], Ans=[2, 3, 1, 4]

But there is a problm here, since removing item from list requires O(n), we will achieve O(n2) time. How do we optimize this?

The answer is BST with each node keeping track of how many nodes is on the left branch, and how many on right branch. For details of this type of TreeNode, refer to [CC150v5] 11.8 Get Rank in Stream of Integers.

The conclusion:

可以化归为这样一道题:

设计一个有序的数据结构,最初里头有自然数1到n这n个元素,

随后这些元素可以被删除,但剩下元素仍然保持有序。

要求实现方法GetKthElement(int k)和RemoveKthElemenet(int k),

使得它们在任意时刻都不超过O(lgN), N为当前的元素个数

感觉要结合BST之类

Code

Naive approach, O(n2):

public int[] form(int peopleCount, int[] countArray) {
    int len = peopleCount;
    int[] heightQueue = new int[len];
    List<Integer> list = new ArrayList<Integer>();
    for (int i = peopleCount; i > 0; i--) {
        list.add(i);
    }
    for (int i = len - 1; i >= 0; i--) {
        heightQueue[i] = list.remove(countArray[i]);
    }
    return heightQueue;
}