Woodstock Blog

a tech blog for general algorithmic interview questions

[Amazon] Grep Command Interview Question

Question

link

You have 50,000 html files, some of which contain phone numbers. How would you create a list of all the files which contain phone numbers?

Solution

This is a famous inteview question by former Amazon engineer Steve Yegge:

About 25% to 35% of all software development engineer candidates, independent of experience level, cannot solve this problem, even given the entire interview hour and lots of hints.

This question tests your understanding of scripting languages.

Code

grep -l -R --perl-regexp "\b(\(\d{3}\)\s*|\d{3}-)\d{3}-\d{4}\b" * > output.txt

[Google] Data Structure of Insert, Remove, GetRandom

Question

link

Design a data structure where the following 3 functions at O(1):

1. Insert(n) 
2. GetRandomElement() 
3. Remove(n) 

Solution

Array is best for:

  1. random access

HashMap is best for:

  1. Searching
    1. insert
    2. remove

So the answer is array + hashmap:

  1. Insertion can be done by appending to the array and adding to the hash-map.

  2. Deletion can be done by first looking up and removing the array index in the hash-map, then swapping the last element with that element in the array.

  3. Get random can be done by returning a random index from the array.

  4. All operations take O(1).

Note how hashmap is used:

insert(value): append the value to array and let i be it’s index in A. Set H[value]=i.

Hashmap stores value’s index in the array - that is to say: this DS does not support inserting duplicating values.

Finally, when we delete, we swap the last element to replace the gap. This is an nice idea!

follow-up

what if we want to get the top x% number?

Well, heap of course. And note that Heap size is auto-increasing:

PriorityQueue is unbounded, it can grow as big as your memory allows, and it will grow automatically when needed. The initialCapacity parameter is just a hint to reserve room for that many elements initially.

[Google] Collatz Conjecture (Oneness Property)

Collatz Conjecture

The Collatz conjecture is a conjecture in mathematics known as the 3n + 1 conjecture.

Take any natural number n.

  1. If n is even, divide it by 2 to get n / 2.
  2. If n is odd, multiply it by 3 and add 1 to obtain 3n + 1.

Repeat the process (which has been called “Half Or Triple Plus One”, or HOTPO) indefinitely. The conjecture is that no matter what number you start with, you will always eventually reach 1.

The property has also been called oneness.

Question

link

Collatz conjecture says that if you do the following

If n is even, replace n by n/2.
If n is odd, replace n by 3n+1.
You ultimately end up with 1.

For instance, 5 -> 16 -> 8 -> 4 -> 2 -> 1

Chain length is the number of steps required to get to 1. (The chain length of 1 is 0).

Now, the problem is given natural numbers n and k, find all numbers between 1 and n, such that the chain length is <= k.

Solution

Generate all numbers in backwards fashion, suggest by templatetypedef:

                  1
                  |
                  2
                  |
                  4
                  |
                  8
                  |
                  16
                  | \
                  32 \
                  |   5
                  64  |
                 /|   10
                / 128 | \
               21     20 3

Implementation: using a queue and keep appending numbers.

Duplication handling?

Assuming that the Collatz conjecture holds true, we’ll never get any duplicates this way.

Time complexity is O(S) time, where S is the number of numbers we need to output.

Code

not written

[Design] Monitor Rps for Past Sec/min/hr

Question

link

Given a server that has requests coming in.

Design a data structure such that you can fetch the count of the number requests in the last second, minute and hour.

Solution 1

Keep a record of all request timestamps, suggested by the top answer by whatevva:

  1. Use a queue implemented as a resizable array to store the timestamps of all new requests

  2. maintain head/tail pointers as usual

  3. Also maintain three pointers, for past sec, past min and past hr.

Whenever a request comes in, update 3 pointers. Then in the for-loop of the thread, remove old entries and also update 3 pointers.

Print Rps in real time. I posted my code below (the code is without thread-safety consideration).

Solution 2

This solution does not store all timestamps, and it does not generate real-time Rps data. But it’s good enough because result is only updated every 1 second, so its performance is better.

Keep an array of int of size 60 * 60. Each second, use the number of request in the past second to update the array values in a rolling way.

Code

Solution 1. this is my code so please correct me!

public class SetRps {

    AtomicInteger count = new AtomicInteger(0);
    int limit = -1;
    int printIndex = 1;
    long startTimestamp = -1;

    void setRPS(int num) {
        limit = num;
    }

    boolean process(long timestamp) {
        // suppose timestamp is ms
        synchronized (this) {
            if (count.get() < limit) {
                // can process
                count.incrementAndGet();
                System.out.println(printIndex++ + ". processing request "
                        + timestamp % 100000 / 1000 + "," + timestamp % 1000);
                return true;
            }
            if (timestamp - startTimestamp >= 1000) {
                // every 1 seconds, reset
                count.set(0);
                startTimestamp = timestamp;
                System.out.println("clear!");
                return true;
            }
        }
        return false;
    }
}

[Question] Stock Span Problem (Couting BST)

Question

Given stock price of Amazon for some consecutive days. Need to find the maximum span of each day’s stock price.

Definition of ‘span’ have got 2 variant:

Variant 1

link

Span is the number of consecutive days right before that day, which have less or equal stock value.

(Or in GFG language): The span Si of the stock’s price on a given day i is defined as the maximum number of consecutive days just before the given day, for which the price of the stock on the current day is less than or equal to its price on the given day.

Solution

Use stack.

Variant 2

link

Span is the amount of days before the given day where the stock price is less than that of given day.

Solution

The top answer in here is wrong. Eg. {1,3,2,4}, the count for 4 would be 2, instead of 3.

Instead, the BST (AVL tree) solution is correct. It’s commented by user zahidbuet106.

insert numbers in a AVL tree one by one from right to left. During each insert we will keep updating the size of left subtree at the node being inserted. This will give us our desired smaller element count.

We also need to handle balancing the tree while insert.

The key of this question is the special BST, where each node stores an additional counting number.

This type of special BST is extremely frequntly used in Computer Science, especially when we want to dynamically insert elements and find out it’s ranking within the past history.

Read another very interesting post: [CC150v5] 11.8 Get Rank in Stream of Integers.

Code

class TreeNodePlus extends TreeNode {
    int leftCount;
    int dupCount;

    public TreeNodePlus(int v, int leftC) {
        super(v);
        this.leftCount = leftC;
        this.dupCount = 1;
    }

    public int findRank(TreeNodePlus node) {
        TreeNodePlus leftBranch = (TreeNodePlus) this.left;
        TreeNodePlus rightBranch = (TreeNodePlus) this.right;

        if (this == node) {
            return 0;
        } else if (node.val < this.val) {
            if (this.left == null) {
                return 0;
            } else {
                return leftBranch.findRank(node);
            }
        } else {
            if (this.right == null) {
                return this.leftCount + this.dupCount;
            } else {
                return this.leftCount + this.dupCount
                        + rightBranch.findRank(node);
            }
        }
    }

    public TreeNodePlus insertNum(int num) {
        TreeNodePlus leftBranch = (TreeNodePlus) this.left;
        TreeNodePlus rightBranch = (TreeNodePlus) this.right;

        if (num == this.val) {
            this.dupCount++;
            return this;
        } else if (num < this.val) {
            // insert num to the left branch
            this.leftCount++;
            if (this.left == null) {
                this.left = new TreeNodePlus(num, 0);
                return (TreeNodePlus) this.left;
            } else {
                return leftBranch.insertNum(num);
            }
        } else {
            // insert num to the right branch
            // this.leftCount does not change
            if (this.right == null) {
                this.right = new TreeNodePlus(num, 0);
                return (TreeNodePlus) this.right;
            } else {
                return rightBranch.insertNum(num);
            }
        }
    }
}

[Design] Limit the Request Per Second

Question

link

有一个接口叫 void setRPS(int num);

接下来不断有request过来,如何实现下面的接口,返回accept或者deny,

bool process(int timestamp){

}

Solution

Suggested by level 5 of this post:

  1. maintain a variable for the number of request processed/rejected.
    1. This variable must be atomic, thus a AtomicInteger.
    2. the variable is called ‘count’
  2. have a method to process request
    1. if count < limit, do it
    2. otherwise, reject
  3. This is the most important: clear the count every 1 seconds!
    1. eg. LIMIT = 5r/s, so:
    2. the first 5 number of requests in every second are getting fulfilled
    3. from 6th request onward, the request all rejected, until the next second.

Code

public class SetRps {

    AtomicInteger count = new AtomicInteger(0);
    int limit = -1;
    int printIndex = 1;
    long startTimestamp = -1;

    void setRPS(int num) {
        limit = num;
    }

    boolean process(long timestamp) {
        // suppose timestamp is ms
        synchronized (this) {
            // to process or not to process
            if (count.get() < limit) {
                // can process
                count.incrementAndGet();
                System.out.println(printIndex++ + ". processing request "
                        + timestamp % 100000 / 1000 + "," + timestamp % 1000);
                return true;
            }
            // to clear or not to clear
            if (timestamp - startTimestamp >= 1000) {
                // every 1 seconds, reset
                count.set(0);
                startTimestamp = timestamp;
                System.out.println("clear!");
                return true;
            }
        }
        return false;
    }
}

[Greedy] Each Employee 2 Events

link

Question

You are given N ranges of date offsets when N employees are present in an organization. Something like

    1-4 (i.e. employee will come on 1st, 2nd, 3rd and 4th day ) 
    2-6 
    8-9 
    .. 
    1-14 

Organize an event on minimum number of days such that each employee can attend the event at least twice.

Solution

Greedy algorithm, according to the top answer:

  1. First, sort all ranges based on ENDING date in increasing order (bucket or counting sort).

  2. For each range, select last two days (because they produce maximum possibility to overlap next range)

  3. Skip following ranges that also contains those two days, until:

    1. a range that either covers only one day (we then select last day of this range) or
    2. does not cover any of the two (we then select last two days of this range).
  4. Then continue.

Code

not written

[Greedy] Activity Selection Problem

link

Question

Given n activities with their start and finish times.

Select maximum number of activities that can be performed in one run.

Example:

     start[]  =  {1, 3, 0, 5, 8, 5};
     finish[] =  {2, 4, 6, 7, 9, 9};

     output[] = {0, 1, 3, 4}

Solution

Greedy algorithm, according to gfg:

  1. Sort the activities according to their finishing time

  2. Select the first activity from the sorted array and print it.

  3. For the rest, if start time is greater than previous finish time, then select this activity.

Code

not written.

[Question] 2D Bin Packing

Question

link

Objects of different volumes must be packed into a finite number of bins or containers each of volume V in a way that minimizes the number of bins used.

Solution

Explanation from here:

  1. Build a binary tree. Each branch in the tree contains a sprite.
  2. Each leaf node represents available space.
  3. Initially the tree has just the root node, which represents all available space.
  4. To add a sprite to the tree, search the tree for an unoccupied (leaf) node big enough to hold the sprite.
  5. Turn that node from a leaf into a branch by setting the sprite as the node’s occupant and giving the node two children.
  6. One child represents the remaining space to the right of the sprite;
  7. the other represents the remaining space below the sprite and the first child.

For detailed implementation and code, refer to this comprehensite guide. BTW, it’s used for auto generating CSS Sprites which puts images into a large graph.

[Question] Product Array Puzzle

Question

link

Given an array of integers , replace each element with the product of the remaining elements.

Eg : Input - 1 2 3 4
Output : 24 12 8 6

Do not use division.

Solution

Store the product of the left side elements for each integer, and also the right side.

For eg : L[]= {1 , 1 , 2 , 6 }

and R[] = { 24 , 12 , 4 , 1}

The multiply R[i] and L[i] to get the resultant array.