Woodstock Blog

a tech blog for general algorithmic interview questions

[Google] Transform a Unbalanced Tree Into Balanced Tree

Question

link

How to transform a unbalanced tree into balanced tree?

Solution

Solution 1, do AVL tree balancing. Refer to [Design] BST Node Insertion / Deletion. This is of course, pretty difficult to code.

Best solution is to convert to DLL and then build a new tree!

Code

not written

[Google] Transform a Unbalanced Tree Into Balanced Tree

Question

link

How to serialize strings and pass it over the network and de-serialize the string?

The string may contain any possible character out of 256 valid characters.

Solution

The difficulty is how to differentiate between data (string) and metadata. Some interesting ideas:

  1. XML
  2. message with header + body

The best is the top answer:

I think that a good answer is to send the length of the string in bytes (coded for example as a 4 bytes unsigned integer) followed by the list of bytes, one per char.

The receiver reads the first 4 bytes and understand the string length (let says L), then it reads the following L bytes and build the string.

Code

eg. 5hello 5world 2my 4name 2is

[Facebook] Generate Number With Given Probability

Question

link

给定一个随机函数可以按照0.5的概率返回true。

要求实现一个函数随机返回任意概率的true。

Solution

Eg. 0.25 true, then should be:

  1. generate once. If false, return false, else…

  2. generate again, directly return.

So, always return result for the possibility larger than 0.5. This recursion might go on forever.

Read code below.

Code

not my code

boolean RNGwithGivenProb(double p, boolean expected)
{
   if(p < 0.5)
       return RNGwithGivenProb(1 - p, !expected);  
   if(RNG() == expected)
       return expected;
   else
       return RNGwithGivenProb((p - 0.5) * 2, expected);
}

boolean RNG()
{
   return T/F with 0.5 probability.
}

[Design] Find Similar Library Books

Question

link

有很大一个电子图书馆,里面每本书的每一页都是OCR转换出来的text,所有每页大 约有5%的error(转换错误,错误分割单词,跳脱。。。)。设计一个方法判定图书馆 里是否有完全一样的书(duplicate),或者将来有书进来时判定同样的书是否已存在。

Solution

Very large string matching, we mush use hashing or similar technique. Since books have error, we need to do fuzzy search/matching. Bloom filter is designed to do this!

Refer to [Design] Big Data - Fuzzy Search Url (Bloom Filter).

[Design] Difference: Internet and the Web

Internet

The Internet is a massive network of networks, a networking infrastructure. It connects millions of computers together globally, and computers talk thru protocols.

The Internet is a Big Collection of Computers and Cables.

Web

The World Wide Web, or simply Web, is a way of accessing information over the medium of the Internet.

It is an information-sharing model that is built on top of the Internet. The Web uses the HTTP protocol, only one of the languages spoken over the Internet, to transmit data.

The Web Is a Big Collection of HTML Pages on the Internet.

[Google] BST Find Ceiling

Question

link

A binary search tree is given. Find the ceiling of given key. Eg.

                    8
               3          12
           2      6    10    15
                4

Ceiling is defined as:

Ceil Value Node: Node with smallest data larger than the key value.

Testing:

key - 13 => 15 
key - 4 =>6 
key - 8 =>10

Solution

A really interesting idea is to go top-to-bottom, and update value in every left turn only. Suggested by the top answer.

It’s a very very tricky solution. I’ve never seen this b4.

Code

Not my code.

public int findCeil(TreeNode node, int num, int found) {
    if (node != null) {
        if (num >= node.val)
            return findCeil(node.right, num, found);
        else
            return findCeil(node.left, num, node.val);
    } else
        return found;
}

Now input:

TreeNode tree = Common.constructBstFromPreorder(new int[] { 40, 20, 10, 30, 60, 50, 70 });

Output:

The ceiling of 0 is  10
The ceiling of 1 is  10
The ceiling of 2 is  10
The ceiling of 3 is  10
The ceiling of 4 is  10
The ceiling of 5 is  10
The ceiling of 6 is  10
The ceiling of 7 is  10
The ceiling of 8 is  10
The ceiling of 9 is  10
The ceiling of 10 is  20
The ceiling of 11 is  20
The ceiling of 12 is  20
The ceiling of 13 is  20
The ceiling of 14 is  20
The ceiling of 15 is  20
The ceiling of 16 is  20
The ceiling of 17 is  20
The ceiling of 18 is  20
The ceiling of 19 is  20
The ceiling of 20 is  30
The ceiling of 21 is  30
The ceiling of 22 is  30
The ceiling of 23 is  30
The ceiling of 24 is  30
The ceiling of 25 is  30
The ceiling of 26 is  30
The ceiling of 27 is  30
The ceiling of 28 is  30
The ceiling of 29 is  30
The ceiling of 30 is  40
The ceiling of 31 is  40
The ceiling of 32 is  40
The ceiling of 33 is  40
The ceiling of 34 is  40
The ceiling of 35 is  40
The ceiling of 36 is  40
The ceiling of 37 is  40
The ceiling of 38 is  40
The ceiling of 39 is  40
The ceiling of 40 is  50
The ceiling of 41 is  50
The ceiling of 42 is  50
The ceiling of 43 is  50
The ceiling of 44 is  50
The ceiling of 45 is  50
The ceiling of 46 is  50
The ceiling of 47 is  50
The ceiling of 48 is  50
The ceiling of 49 is  50
The ceiling of 50 is  60
The ceiling of 51 is  60
The ceiling of 52 is  60
The ceiling of 53 is  60
The ceiling of 54 is  60
The ceiling of 55 is  60
The ceiling of 56 is  60
The ceiling of 57 is  60
The ceiling of 58 is  60
The ceiling of 59 is  60
The ceiling of 60 is  70
The ceiling of 61 is  70
The ceiling of 62 is  70
The ceiling of 63 is  70
The ceiling of 64 is  70
The ceiling of 65 is  70
The ceiling of 66 is  70
The ceiling of 67 is  70
The ceiling of 68 is  70
The ceiling of 69 is  70
The ceiling of 70 is  2147483647
The ceiling of 71 is  2147483647
The ceiling of 72 is  2147483647
The ceiling of 73 is  2147483647
The ceiling of 74 is  2147483647
The ceiling of 75 is  2147483647
The ceiling of 76 is  2147483647
The ceiling of 77 is  2147483647
The ceiling of 78 is  2147483647
The ceiling of 79 is  2147483647
The ceiling of 80 is  2147483647
The ceiling of 81 is  2147483647
The ceiling of 82 is  2147483647
The ceiling of 83 is  2147483647
The ceiling of 84 is  2147483647
The ceiling of 85 is  2147483647
The ceiling of 86 is  2147483647
The ceiling of 87 is  2147483647
The ceiling of 88 is  2147483647
The ceiling of 89 is  2147483647
The ceiling of 90 is  2147483647
The ceiling of 91 is  2147483647
The ceiling of 92 is  2147483647
The ceiling of 93 is  2147483647
The ceiling of 94 is  2147483647
The ceiling of 95 is  2147483647
The ceiling of 96 is  2147483647
The ceiling of 97 is  2147483647
The ceiling of 98 is  2147483647
The ceiling of 99 is  2147483647

[Question] Reservoir Sampling

Reservoir sampling is a family of randomized algorithms for randomly choosing k samples from a list of n items, where n is either a very large number. Typically n is large enough that the list doesn’t fit into main memory. For example, a list of search queries in Google and Facebook.

Question

link

given a big array (or stream) of numbers (to simplify), and we need to write an efficient function to randomly select k numbers where 1 <= k <= n. Let the input array be stream[].

Solution

O(n) time!

  1. Create an array sample[0..k-1] and copy first k items of stream[] to it.

  2. Now one by one consider all items from (k+1)th item to nth item.

    1. Generate a random number ‘j’ from 0 to i where i is index of current item in stream[].

    2. If j is in range 0 to k-1, replace sample[j] with stream[i]

Code

not written

[Design] Multithreading Async Increment Problem

Question

link

If two threads are incrementing a variable 100 times each without synchronization, what would be the possible min and maximum value.

Solution

Well, max is init+200, and min is init+2. Suggested by the top answer:

  1. P1 & P2 copy var
  2. P1 increments 99 times. so var becomes var + 99
  3. P2 increments once. so var becomes var + 1
  4. P1 copies var (value is var + 1)
  5. P2 increments 99 times. so var becomes var + 100
  6. P1 increments once. so var becomes var + 2

[Google] Max Prodcut of Strings That Have No Common Char

Question

link

Given a dictionary of wrods,find the pair of word with following property:

  1. the two word don’t have same letter.

  2. the multiple of the two word’s length is maximum.

I give a simple O(nnk)(k is the average length of word) method.but i think there will be better one.

Solution

Best answer suggest as top comment:

Assuming the word is A-Z/a-z only, use a bitmap to set which letters it contains.

e.g. ca => 000….101

bb => 000…010

This is called Bitmask of string. Read [Question] Check string with no common letters (Bitmask).

Then:

Iterate over the words in decreasing order of length.

for each pair of words, AND the bitmaps.

Return the first pair that gives a 0 result.

This should be nk + nn

Be cautious

For the second part of the solution above, is this code going to work?

Arrays.sort(strs) in descending order;
for (int i = 0; i < strs.length; i++) {
    for (int j = 0; j < i; j++) {
        if (strs[i].bitmask & strs[j].bitmask == 0) {
            // this pair do not have common char
            // since strs in descending order, and i, j start from 0
            // the product of length should be max
            return i + ' ' + j;
        }
    }
}

Well, this is wrong. For example: {“ababa”, “aaa”, “bbb”, “cc”}, if we do longest-string to shorest-string, we would return “aaa”, “bbb” immediately when we found it. However, 5 * 2 > 3 * 3.

So, we have to find largest product using max-heap, like we did in [Google] Top N Values From Sum of 2 Arrays, “pop 1 and push 2”.

Another solution

DP: for each string, find the reversed set of char, and then find the max string using the reversed set. This idea is great, too, but less intuitive.

It is explained here.

Code

not written