Woodstock Blog

a tech blog for general algorithmic interview questions

[Question] Find Closest Leaf in Binary Tree

Question

link

Given a Binary Tree and a key, find distance of the closest leaf.

Examples:

          1
        /    \    
       2       3
             /   \  
            5     6   
           /       \
          7         8
         / \       /
        9  10     11

The closest leaf to '8' is '11', so the distance is 1 for '8'
The closest leaf to '3' is '2', so the distance is 2 for '3'
The closest leaf to '5' is either '9' or '10', so the distance is 2 for '5'
The closest leaf to '2' is '2' itself, so the distance is 0 for '2'

Solution

Traverse the given tree in preorder and keep track of ancestors in a caching data structure (either a List, or an array with a pointer to the current depth).

When we find our target, we do 2 things:

  1. find the closest leaf distance within the subtree below the current node.

  2. for every ancestor, find the closest leaf distance below that ancestor, then add the distance from the target to that ancestor.

Finally, return the smallest of these values.

Code

Inspired by the code from G4G

int answer;

public int findClosest(TreeNode root, int key) {
    answer = Integer.MAX_VALUE;
    helper(root, key, new ArrayList<TreeNode>());
    return answer;
}

private void helper(TreeNode node, int key, List<TreeNode> path) {
    if (node == null) {
        return;
    } else if (node.val != key) {
        path.add(node);
        helper(node.left, key, path);
        helper(node.right, key, path);
        path.remove(path.size() - 1);
    } else {
        // key matches with current node value
        answer = lenToLowerLeaf(node);
        // answer initially = closest leaf from below

        int len = path.size();
        for (int i = 0; i < len; i++) {
            // for every ancestor, calculate distance and compare
            int ithToLowerLeaf = lenToLowerLeaf(path.get(i));
            answer = Math.min(answer, (len - i) + ithToLowerLeaf);
        }
    }
}

// distance from this node down to its nearest leaf
private int lenToLowerLeaf(TreeNode node) {
    if (node == null) {
        return Integer.MAX_VALUE;
    } else if (node.left == null && node.right == null) {
        return 0;
    } else {
        return 1 + Math.min(lenToLowerLeaf(node.left), lenToLowerLeaf(node.right));
    }
}

[Amazon] All Strings by Placing Spaces

Question

link

Given a string, print all possible strings that can be made by placing spaces (zero or one) between adjacent characters.

Input: str[] = “ABC”

Output:

    ABC
    AB C
    A BC
    A B C

Solution

Recursion: for each gap between adjacent characters, branch on either inserting a space or not.

Code

public void printAll(String input) {
    if (input == null || input.length() <= 1) {
        // spaces go in-between chars, so strings with
        // fewer than 2 chars have no gap to fill
        return;
    }
    int len = input.length();
    // len >= 2
    helper(input, len - 1);
}

private void helper(String s, int p) {
    if (p == 1) {
        System.out.println(s);
        // no insertion
        System.out.println(s.substring(0, 1) + " " + s.substring(1));
        // insert at position 1
    } else {
        helper(s, p - 1);
        helper(s.substring(0, p) + " " + s.substring(p), p - 1);
    }
}

[Fundamental] the 7 Bridges Problem

Overview

In Königsberg, East Prussia, people tried to walk across all 7 bridges without crossing any bridge twice.

Leonhard Euler (pronounced “oiler”), a Swiss mathematician, proved that no such walk exists.

Euler path

An Euler path, also called an Eulerian trail, is a walk on a graph that uses each edge exactly once.

Degree

The degree of a vertex is the number of edges incident to it.

Euler Theorem

A graph contains an Euler path iff either of the following cases holds:

  1. all nodes except two have even degree – the 2 odd-degree nodes must be the start and end points;
  2. all nodes have even degree.
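
As a quick sketch of how to apply the theorem (assuming a connected graph given as an adjacency list; the method name is mine, for illustration):

import java.util.List;

public boolean hasEulerPath(List<List<Integer>> adj) {
    int oddCount = 0;
    for (List<Integer> neighbors : adj) {
        if (neighbors.size() % 2 != 0) {
            oddCount++;
        }
    }
    // 0 odd-degree nodes: Euler circuit (a special Euler path)
    // 2 odd-degree nodes: Euler path between the two odd nodes
    return oddCount == 0 || oddCount == 2;
}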

Application

networks, distributed systems, coding theory

[Google] Shortest Manhattan Distance

Question

link

Given an n*m room which may contain walls in arbitrary places, e pieces of equipment have already been placed in cells of the room. Place one new piece of equipment so that its total distance to the other e pieces is minimized. Breadth-first search.

Solution

BFS from each of the e pieces of equipment, computing the distance from that piece to every cell where the new equipment could be placed, then accumulate the distances per cell.

Finally, scan all cells in O(n^2) to find the minimum. Total complexity: O(e * n^2).

As for whether we check each equipment position or each vacant position, it depends on how many pieces of equipment there are. If there are very few (e is small), this solution works well.

However, what if there are obstacles in the matrix?

We then have to use real BFS. It uses more space, but the time complexity stays the same.
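
Below is a hedged sketch of the BFS accumulation from one equipment cell, assuming input[i][j] == -1 marks a wall (an encoding I am adding for illustration; the original doesn't specify one). Call it once per equipment cell, then scan for the smallest total among vacant cells, just like the Manhattan-distance version below:

import java.util.ArrayDeque;
import java.util.Deque;

private void bfsCumulate(int[][] input, int sx, int sy, int[][] dis) {
    int m = input.length, n = input[0].length;
    boolean[][] seen = new boolean[m][n];
    Deque<int[]> queue = new ArrayDeque<>();
    queue.add(new int[]{sx, sy, 0});        // {row, col, distance}
    seen[sx][sy] = true;
    int[][] dirs = {{1, 0}, {-1, 0}, {0, 1}, {0, -1}};
    while (!queue.isEmpty()) {
        int[] cur = queue.poll();
        dis[cur[0]][cur[1]] += cur[2];      // accumulate distance at this cell
        for (int[] d : dirs) {
            int nx = cur[0] + d[0], ny = cur[1] + d[1];
            if (nx >= 0 && nx < m && ny >= 0 && ny < n
                    && !seen[nx][ny] && input[nx][ny] != -1) {
                seen[nx][ny] = true;
                queue.add(new int[]{nx, ny, cur[2] + 1});
            }
        }
    }
}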

Code

public void findCenter(int[][] input, int numberOfEquip) {
    int m = input.length;
    int n = input[0].length;

    // there's gonna be m * n positions
    // we gonna cumulate (numberOfEquip) distances for each position
    int[] dis = new int[m * n];

    // from the input map, find Equipments
    for (int i = 0; i < m; i++) {
        for (int j = 0; j < n; j++) {
            if (input[i][j] == 1) {
                // 1 represents equipment
                // when found, add the distance to every position
                cumulateDistance(i, j, dis, m, n);
            }
        }
    }

    // find the smallest cumulated distance among vacant cells
    int sIndex = -1;
    int smallest = Integer.MAX_VALUE;
    for (int i = 0; i < m * n; i++) {
        if (input[i / n][i % n] == 1) {
            continue;   // skip cells already occupied by equipment
        }
        if (dis[i] < smallest) {
            smallest = dis[i];
            sIndex = i;
        }
    }

    // index sIndex is the final answer
    System.out.println("Answer: " + (sIndex / n) + " " + (sIndex % n));
}

private void cumulateDistance(int x, int y, int[] dis, int m, int n) {
    for (int i = 0; i < m * n; i++) {
        int a = i / n;
        int b = i % n;
        dis[i] += Math.abs(a - x) + Math.abs(b - y);
    }
}

[Design] Facebook Photo Storage

Stats

Facebook has 1.5 billion monthly active users, 970 million daily active users as of June 2015.

[image from statista.com]

In 2009, Facebook stored 15 billion user photos, growing at 220 million new photos per week, and served 550,000 images per second at peak.

It’s 2015 now, so you might want to multiply these numbers by 3~6.

I have roughly estimated the statistics of Facebook users, Facebook photos and growth rate, just to give you an idea of the size of data Facebook has got:

Total users: 1.5b

Total photos: 150b, which is 100 photos/user

Each photo is stored in 4 different sizes, so 600b images are stored.

New photos per day: 500m

New photos per second: 6,000

Peak incoming photo requests per second: 3m

Old architecture

3-tier design:

  1. Upload tier receives users’ photo uploads, scales the original images and saves them on the NFS storage tier.

  2. Photo serving tier receives HTTP requests for photo images and serves them from the NFS storage tier.

  3. NFS storage tier built on top of commercial storage appliances.

Network File System (NFS) is a distributed file system protocol originally developed by Sun Microsystems in 1984, allowing a user on a client computer to access files over a network much like local storage is accessed.

Problem

  1. there is an enormous amount of metadata

    … so much that it exceeds the caching abilities of the NFS storage tier, resulting in multiple I/O operations per photo upload or read request

Solution

  1. relies heavily on CDNs to serve photos.

  2. Cachr: a caching server tier caching Facebook “profile” images.

  3. NFS file handle cache: deployed on the photo serving tier, it eliminates some of the NFS storage tier's metadata overhead

Haystack Photo Infrastructure

The new photo infrastructure merges the photo serving and storage tiers into one physical tier. It implements an HTTP-based photo server which stores photos in a generic object store called Haystack.

Goal: eliminate any unnecessary metadata overhead for photo read operations, so that each read I/O operation was only reading actual photo data

5 main functional layers:

  1. HTTP server

  2. Photo Store

  3. Haystack Object Store

  4. Filesystem

  5. Storage

Storage

The commodity machine HW typically is 2x quadcore CPU + 32GB RAM + 512MB NV-RAM cache + 12TB SATA drives.

Non-volatile random-access memory (NVRAM) is random-access memory that retains its information when power is turned off (non-volatile).

This is in contrast to dynamic random-access memory (DRAM) and static random-access memory (SRAM)

So each storage blade provides around 10TB of usable space, configured as a RAID-6 partition.

RAID 6, also known as double-parity RAID, uses two parity stripes on each disk. It allows for two disk failures within the RAID set before any data is lost.

Pros:

  1. adequate redundancy
  2. excellent read performance
  3. low storage cost

Cons:

Poor write performance, partially mitigated by the RAID controller's NVRAM write-back cache. Since the reads are mostly random, the NVRAM cache is fully reserved for writes.

The disk caches are disabled in order to guarantee data consistency in the event of a crash or a power loss.

Filesystem

How does the filesystem read pictures?

Photo read requests result in read() system calls at known offsets in these files.

Each file in the filesystem is represented by a structure called an inode which contains a block map that maps the logical file offset to the physical block offset on the physical volume.

For large files, the block map can be quite large.

Problem

Block based filesystems maintain mappings for each logical block, and for large files, this information will not typically fit into the cached inode and is stored in indirect address blocks instead, which must be traversed in order to read the data for a file.

There can be several layers of indirection, so a single read could result in several I/Os (if not cached).

Solution

Extent based filesystems maintain mappings only for contiguous ranges of blocks (extents). A block map for a contiguous large file could consist of only one extent which would fit in the inode itself.

An extent is a contiguous area of storage reserved for a file in a file system, represented as a range. A file can consist of zero or more extents; one file fragment requires one extent. The direct benefit is in storing each range compactly as two numbers, instead of canonically storing every block number in the range.

Extent-based file systems can also eliminate most of the metadata overhead of large files that would traditionally be taken up by the block allocation tree… saving on storage space is pretty slight, but… the reduction in metadata is significant and reduces exposure to file system corruption as one bad sector in the block allocation tree causes much greater data loss than one bad sector in stored data.

Problem of Extent-based file systems

However, if the file is severely fragmented and its blocks are not contiguous on the underlying volume, its block map can grow large as well.

The solution

Fragmentation can be mitigated by aggressively allocating a large chunk of space whenever growing the physical file.

Currently, the filesystem of choice is XFS, an extent based filesystem providing efficient file preallocation.

XFS is a high-performance 64-bit journaling file system created by Silicon Graphics, Inc (SGI) in 1993.

…was ported to the Linux kernel in 2001. As of June 2014, XFS is supported by most Linux distributions, some of which use it as the default file system.

XFS enables extreme scalability of I/O threads, file system bandwidth, and size of files and of the file system itself when spanning multiple physical storage devices.

Space allocation is performed via extents with data structures stored in B+ trees, improving the overall performance of the file system, especially when handling large files.

Delayed allocation assists in the prevention of file system fragmentation; online defragmentation is also supported. A feature unique to XFS is the pre-allocation of I/O bandwidth at a pre-determined rate, which is suitable for many real-time applications.

Haystack

Haystack is a simple log-structured (append-only) object store. Many images are stored in one haystack. Typically, a haystack store file is 100GB in size.

A Haystack consists of two files:

  1. haystack store file (containing the needles)
    1. superblock
    2. needles (1 needle is 1 image)
  2. an index file

1. haystack store file

  1. The first 8KB of the haystack store is occupied by the superblock.

  2. following the superblock are needles

    Needles represent the stored objects. Each needle consists of a header, the data, and a footer.

    A needle is uniquely identified by its <Offset, Key, Alternate Key, Cookie> tuple.

    Haystack doesn’t put any restriction on the values of the keys, and there can be needles with duplicate keys.

2. Index File

The index file provides the minimal metadata required to locate a particular needle in the haystack store file.

The index file is not critical, as it can be rebuilt from the haystack store file if required.

The main purpose of the index: quick loading of the needle metadata into memory without traversing the larger Haystack store file, since the index is usually less than 1% the size of the store file.

Summary

All needles are stored in the haystack store file, and their location information is stored in the index file.

What is a Needle? A needle represents one stored object (1 needle = 1 image).

Haystack Operations

  1. write

    append new needle to haystack store file.

    later, the corresponding index records are updated asynchronously.

    In case of failure, any partial needles are discarded, and the index is fixed from the end of the haystack.

    You can't overwrite a needle, but you can insert a new one with the same key. Among needles with duplicate keys, the one with the largest offset is the most recent.

  2. read

    parameters passed in: offset, key, alt key, cookie, data size

    if the key, alt key and cookie match, the checksum is correct, and the needle is not marked as deleted, return the data.

  3. delete

    marks the needle as deleted (sets 1 bit), but the index is not modified.

    the deleted space is not reclaimed until we compact the haystack
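
A hedged sketch of the read-path check just described (the Needle fields and method names here are hypothetical, for illustration only):

class Needle {
    long key, altKey, cookie;
    boolean deleted;
    byte[] data;

    boolean checksumValid() {
        // compare the stored checksum against one recomputed from data
        return true; // placeholder for the real comparison
    }
}

byte[] readNeedle(Needle n, long key, long altKey, long cookie) {
    if (n.key != key || n.altKey != altKey || n.cookie != cookie) {
        return null; // wrong or stale needle
    }
    if (n.deleted || !n.checksumValid()) {
        return null; // deleted, or corrupted on disk
    }
    return n.data;
}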

Photo Store Server

Photo Store Server is responsible for accepting HTTP requests and translating them to the corresponding Haystack store operations.

In order to minimize the number of I/Os required to retrieve photos, the server keeps an in-memory index of all photo offsets.

At startup, the photo server reads the haystack index file and populates the in-memory index. The in-memory index is different from the index file in Haystack, in that it stores less information per photo.

  1. Photo Store Write/Modify Operation

    1. writes photos to the haystack
    2. updates the in-memory index

    if there are duplicate images, the one stored at the larger offset is valid.

  2. Photo Store Read Operation

    passed in:

    1. haystack id
    2. a photo key,
    3. size
    4. cookie

    The server performs a lookup in the in-memory index based on the photo key and retrieves the offset of the needle containing the requested image (see the sketch after this list).

    The haystack delete operation doesn't update the haystack index file record, so a freshly populated in-memory index can contain stale entries for previously deleted photos. Reading such a photo will fail, and the in-memory index entry is then updated to offset 0.

  3. Photo Store Delete Operation

    After calling the haystack delete operation, the in-memory index entry is updated to 'offset = 0'

  4. Compaction

    Compaction is an online operation which reclaims the space used by the deleted and duplicate needles.

    It does this by copying live needles into a new haystack.

  5. HTTP Server

    evhttp server.

    Multiple threads, each serving a single HTTP request. The workload is mostly I/O bound, so the performance of the HTTP server is not critical.
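
As referenced in the read operation above, here is a minimal, hypothetical sketch of the in-memory index (photo key -> {offset, size}; all names are illustrative):

import java.util.HashMap;
import java.util.Map;

class InMemoryIndex {
    // offset == 0 marks a deleted photo, matching the delete path above
    private final Map<Long, long[]> index = new HashMap<>();

    void put(long photoKey, long offset, long size) {
        index.put(photoKey, new long[]{offset, size});
    }

    long[] lookup(long photoKey) {
        long[] entry = index.get(photoKey);
        if (entry == null || entry[0] == 0) {
            return null; // unknown or deleted photo
        }
        return entry;
    }

    void markDeleted(long photoKey) {
        long[] entry = index.get(photoKey);
        if (entry != null) {
            entry[0] = 0; // set offset = zero
        }
    }
}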

Summary

Storing photos as needles in the haystack eliminates the metadata overhead by aggregating hundreds of thousands of images in a single haystack store file.

This keeps the metadata overhead very small and allows us to store each needle’s location in the store file in an in-memory index.

This allows retrieval of an image’s data in a minimal number of I/O operations, eliminating all unnecessary metadata overhead.

Ref: https://code.facebook.com/posts/685565858139515/needle-in-a-haystack-efficient-storage-of-billions-of-photos/

[Design] How Google Search Works

Overview

Launched on Sep 15th, 1997

60 trillion individual pages. 100 million GB of data.

40,000 searches per second, or 3 billion searches per day.

As of Feb 2015, 65% market share in US.

1. Crawl

Google crawls from one page to another using Googlebot. It starts from previously crawled URLs, augmented with Sitemap data.

A sitemap is a list of pages of a web site accessible to crawlers or users. It can be either a document in any form used as a planning tool for Web design, or a Web page that lists the pages on a Web site, typically organized in hierarchical fashion.

2. Indexing

Compile the data (key content tags and attributes, like title tags and ALT attributes). Google doesn't process rich media or dynamic files.

3. Algorithm

When a user searches, Google pulls all relevant results from the index.

It ranks the results based on 200+ factors, one of which is the PageRank of the page.

PageRank is the measure of the importance of a page based on the incoming links from other pages. In simple terms, each link to a page on your site from another site adds to your site’s PageRank.
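
As a toy illustration of the idea, here is a simplified power iteration (the 0.85 damping factor is the commonly cited textbook value, not something stated in this post):

import java.util.Arrays;
import java.util.List;

double[] pageRank(List<List<Integer>> outLinks, int iterations) {
    int n = outLinks.size();
    double d = 0.85; // damping factor
    double[] rank = new double[n];
    Arrays.fill(rank, 1.0 / n);
    for (int it = 0; it < iterations; it++) {
        double[] next = new double[n];
        Arrays.fill(next, (1 - d) / n);
        for (int u = 0; u < n; u++) {
            int outDeg = outLinks.get(u).size();
            if (outDeg == 0) continue; // dangling pages ignored in this toy
            for (int v : outLinks.get(u)) {
                next[v] += d * rank[u] / outDeg; // each incoming link adds rank
            }
        }
        rank = next;
    }
    return rank;
}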

Not all links are equal: Google works hard to identify spam links. The best types of links are those that are given based on the quality of your content.

To rank higher, follow Webmaster Guidelines.

[NineChap System Design] Class 4.1: Crawler

Overview

KISS - Keep It Simple, Sweetie.

In today’s lecture:

  1. write a crawler
  2. thread-safe consumer & producer
  3. GFS, BigTable and MapReduce
  4. Top 10 keyword/anagram using MapReduce
  5. Log analysis

News Aggregator App

  1. Info Collection

    crawler

  2. Info retrieval: rank, search and recommend.

    They are, in fact, all related to sorting.

Step 1, Info collection with crawler

crawler code

Python:

import urllib2

# request (Python 2; the URL needs an explicit scheme)
url = "http://www.google.com"
request = urllib2.Request(url)
response = urllib2.urlopen(request)
page = response.read()

# save the file (mode 'w', not 'web')
webFile = open('webpage.html', 'w')
webFile.write(page)
webFile.close()

Network process

Use of socket.

Socket is like the cellphone in the Call Center example.

A socket is an endpoint of inter-process communication across a computer network.

Today, most communication between computers is based on the Internet Protocol; therefore most network sockets are Internet sockets.

What is a socket? Where does it sit?

It's in between the Application Layer (HTTP, FTP, DNS) and the Transport Layer (UDP, TCP).

Remember that a socket is like a cellphone: an abstraction layer that hides the complexity of the lower layers, making it easier to send data from the application layer.

How does the client connect to the server?

3-way handshake. Read [Design] TCP 3-Way Handshake

HTML

DOM tree:

Document Object Model (DOM) is an application programming interface (API) for valid HTML and well-formed XML documents. It defines the logical structure of documents and the way a document is accessed and manipulated.

How to crawl all the news

  1. Go to index page
  2. identify all the links (regex; see the sketch below)
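
A hedged sketch of the regex step (it assumes the page HTML is already in a String; a production crawler should prefer a real HTML parser):

import java.util.regex.Matcher;
import java.util.regex.Pattern;

void printLinks(String pageHtml) {
    // naive: grab absolute URLs out of href="..." attributes
    Pattern linkPattern = Pattern.compile("href=\"(http[^\"]+)\"");
    Matcher matcher = linkPattern.matcher(pageHtml);
    while (matcher.find()) {
        System.out.println(matcher.group(1)); // the captured URL
    }
}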

Crawl more websites

Simple design

  1. use crawlers to find out all list pages
  2. send the new lists to a Scheduler
  3. Scheduler will use crawlers again, to crawl pages.

This design is bad, because crawler capacity is wasted. How can we reuse these crawlers?

Adv design

Design crawler that can crawl both list and pages information.

Look at our crawler: the text-extraction logic and the regex for abc.com and bfe.com are totally different. However, they both share the same crawling techniques.

So we pass in all info a crawler task needs. Like:

  1. we give higher priority to list pages than to content pages. Otherwise, your content gets out of date soon.

  2. Type includes both list/content and the source info.

  3. status can be done, working, or new.

  4. timestamps help us make sure each crawler task runs every hour (let’s say)

So when the scheduler picks the next crawler task to run, it chooses based on priority. However, if the timestamp (availableTime) has not been reached yet, the job won't be executed.

If your crawler runs until endTime and hasn't finished, force-finish it. We should also add the task creation time to the info.
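
Putting these fields together, a crawler task record might look like this (a hypothetical shape; all names are illustrative):

class CrawlerTask {
    String url;
    int priority;        // list pages get higher priority than content pages
    String type;         // "list" or "content", plus the source it belongs to
    String state;        // "new", "working", or "done"
    long availableTime;  // don't run before this timestamp
    long createdTime;    // when the task was created
    long endTime;        // force-finish if still running at this time
}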

How to identify similar news?

Calculate the similarity between pages. More on this subject later.

How to design Scheduler?

Solution with Sleep

Define variables:

taskTable<task> - stores the task list
pageTable<page> - stores the page contents

task.url
task.state = new/working/done
task.type = list/page

code:

while (true) {
    // get 1 task. If can't get, wait
    taskTable.lock();               // IMPORTANT
    newTask = taskTable.findOne(state == 'new')

    if (!newTask) {
        taskTable.unlock();         // IMPORTANT
        sleep(1000);                // IMPORTANT
        continue;
    }

    newTask.state = "working";
    taskTable.unlock();

    // execute the task, and insert to
    // either taskTable or pageTable
    newPage = Crawl(newTask.url);

    if (newTask.type === 'list') {   // type, not state: state is now "working"
        // insert all urls to taskTable
        taskTable.lock();
        foreach (url in newPage) {
            taskTable.add(new task(url));
        }

        // mark the task as "done"
        newTask.state = "done";     // IMPORTANT
        taskTable.unlock();
    } else {
        // insert page content to pageTable
        pageTable.lock();
        pageTable.add(newPage.content());
        pageTable.unlock();

        // mark the task as "done"
        taskTable.lock();
        newTask.state = "done";     // IMPORTANT
        taskTable.unlock();
    }
}

Solution with Condition Variable

What is a Condition Variable:

A condition variable is basically a container of threads that are waiting on a certain condition.

Monitors provide a mechanism for threads to temporarily give up exclusive access in order to wait for some condition to be met, before regaining exclusive access and resuming their task.

Look at the last 3 lines of the code. Before going to sleep, the CV has to release the lock, so that other threads can access the taskTable.

Then the CV goes to sleep. Right after the CV has been woken up, it has to lock the mutex again.

Solution w/ cv:

while (true) {
    // get 1 task. If can't get, wait
    taskTable.lock();
    newTask = taskTable.findOne(state == 'new')

    if (!newTask) {
        taskTable.unlock();
        Cond_Wait(cond, taskTable);  // Modified
        continue;
    }

    newTask.state = "working";
    taskTable.unlock();

    // execute the task, and insert to
    // either taskTable or pageTable
    newPage = Crawl(newTask.url);

    if (newTask.type === 'list') {   // type, not state
        // insert all urls to taskTable
        taskTable.lock();
        foreach (url in newPage) {
            taskTable.add(new task(url));
            Cond_Signal(cond);       // Modified
        }

        // mark the task as "done"
        newTask.state = "done";
        taskTable.unlock();
    } else {
        // insert page content to pageTable
        pageTable.lock();
        pageTable.add(newPage.content());
        pageTable.unlock();

        // mark the task as "done"
        taskTable.lock();
        newTask.state = "done";
        taskTable.unlock();
    }
}

Why this is better: no busy-spinning (the example above always has to sleep 1 second before re-checking).

Solution with Semaphore

This is arguably better than a Condition Variable, because it's easier to use. Also, a semaphore can synchronize not only threads but also processes.

A CV waits on a certain condition, while a semaphore counts numbers (of tasks, resources, etc.).

Semaphore implementation (fairly difficult, read for interest):

Wait(semaphore) {
    Lock(semaphore);
    semaphore.value--;
    if (semaphore.value < 0) {
        semaphore.processWaitList.Add(this.process);
        Release(semaphore);
        Block(this.process);
    } else {
        Release(semaphore);
    }
}

Signal(semaphore) {
    Lock(semaphore);
    semaphore.value++;
    if (semaphore.value <= 0) {
        process = semaphore.processWaitList.pop();
        Wakeup(process);
    }
    Release(semaphore);
}

Note that in Java and C++, Wait() == acquire() and Signal() == release(). Read more in Jenkov's post.

code w/ semaphore:

while (true) {
    // get 1 task. If can't get, wait
    Wait(numberOfNewTask);           // Modified

    taskTable.lock();
    newTask = taskTable.findOne(state == 'new')
    newTask.state = "working";
    taskTable.unlock();

    // execute the task, and insert to
    // either taskTable or pageTable
    newPage = Crawl(newTask.url);

    if (newTask.type === 'list') {   // type, not state
        // insert all urls to taskTable
        taskTable.lock();
        foreach (url in newPage) {
            taskTable.add(new task(url));
            Signal(numberOfNewTask); // Modified
        }

        // mark the task as "done"
        newTask.state = "done";
        taskTable.unlock();
    } else {
        // insert page content to pageTable
        pageTable.lock();
        pageTable.add(newPage.content());
        pageTable.unlock();

        // mark the task as "done"
        taskTable.lock();
        newTask.state = "done";
        taskTable.unlock();
    }
}

What happens in Line 3, ‘Wait(numberOfNewTask)’? The program checks the numberOfNewTask counter variable, and:

  1. If there is 1 or more tasks, just proceed.
  2. If no task is available, block and wait there. (Later, a Signal will wake it up and it will resume.)

Design a consumer-producer

Stay tuned for a future post.

[NineChap System Design] Class 3.2: Web Service

Question 4

fix the MP3 problem

The process of fetching an MP3 (from a CDN):

  1. acquire the MP3 link, and prepare the request
  2. send the request to the CDN
  3. the CDN receives the request and finds the MP3
  4. respond to the client
  5. play the music

Question: in step 2 we mostly see network errors, but in step 4 there are no network errors, only timeouts. Why?

Fix step 2, Network error

Problem: the MP3 url is invalid. It actually comes from a failed CDN server.

Solution: fix the server.

Fix step 3, CDN can’t find MP3

The problem is associated with anti-leeching.

A leech is one who benefits, usually deliberately, from others' information or effort but does not offer anything in return.

Examples: Wi-Fi leeches, direct linking (or hot-linking), and, in most P2P networks, those who download too much.

Anti-Leech specializes in protecting file downloads and stopping bandwidth leeching.

Since some P2P and leeching software will steal your URL links, the MP3 URL expiration time is set to 5 minutes.

So when the CDN server's clock and the web server's clock are not well synchronized, an MP3 URL can expire prematurely.

Solution: sync the CDN clock with the web server clock every 10 minutes.
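
A hypothetical sketch of how such expiring links can work (the hmacSha1 helper and all names are assumptions for illustration, not from the lecture):

String makeLink(String url, long nowMillis, String secret) {
    long expiry = nowMillis + 5 * 60 * 1000;           // valid for 5 minutes
    String sig = hmacSha1(url + "|" + expiry, secret); // hypothetical helper
    return url + "?expires=" + expiry + "&sig=" + sig;
}

boolean cdnAccepts(String url, long expiry, String sig,
                   long cdnNowMillis, String secret) {
    if (!sig.equals(hmacSha1(url + "|" + expiry, secret))) {
        return false; // tampered link
    }
    // if the CDN clock runs ahead of the web server clock, valid links
    // appear expired -- exactly the clock-sync problem described above
    return cdnNowMillis <= expiry;
}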

Fix step 4, Timeout error

Some MP3 files are relatively large, hence the timeouts.

MP3 performs better at higher bitrates, while AAC (Advanced Audio Coding) works better at lower bitrates.

Solution:

  1. compress the MP3 to 48kbps, or use the AAC format. That is, play the lower-rate version first, then switch automatically.

  2. pre-load the next song while the previous one is playing.

  3. optimize CDN

A CDN (content delivery network) is a large distributed system of servers deployed in multiple data centers across the Internet.

The goal of a CDN is to serve content to end-users with high availability and high performance.

Which CDN should client choose?

Not the DNS; the web server calculates which CDN to choose. It can be calculated using IP distance, or the ISP provider, but neither is accurate.

We can also use local desktop apps (in different locations) to ping CDN servers. This may violate user privacy, though.

Fix step 5, Fail to play

Problem: some files fail to decode correctly.

Fix step 6, unknown error

Problem: some users close the page while the MP3 is loading.

Question 5

fix player problem

Problem: iOS devices can never play Flash.

Solution: develop an HTML5 player.

5.2 how to evaluate that you solved the problem

  1. user complains

  2. important: daily retention rate!

We can’t use daily active users, because that depends on marketing, competitors, and infrastructure changes.

One day retention rate:

Today’s visitors = {U1, U3, U7, U9, U10}

Tomorrow’s visitors = {U2, U3, U9}

Today’s one-day retention rate = 2/5 (U3 and U9 returned)
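
The same calculation in code (a tiny sketch; the user IDs come from the example above):

import java.util.HashSet;
import java.util.Set;

double oneDayRetention(Set<String> today, Set<String> tomorrow) {
    Set<String> retained = new HashSet<>(today);
    retained.retainAll(tomorrow);                   // intersection: {U3, U9}
    return (double) retained.size() / today.size(); // 2 / 5 = 0.4
}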

Question 6: Flash Sale (秒杀)

Design

Queue A and Queue B

Queue A

Many queues, each located on an individual web server or reverse proxy. They mainly accept the large volume of requests coming from clients.

Each machine may take 10,000 or more requests per second.

Queue A will redirect most requests to a static page (cached).

Queue B

Queue B is a single machine, to avoid distributed-system problems. It takes in a small number of requests and decides the results (e.g. redirect to the payment page).

Now, why do we need 2 queues, not 1?

Think about a hospital: Queue A is the hospital lobby plus the security guard, and Queue B is the queue of patients.
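
A toy sketch of the two-queue idea (all names and numbers are hypothetical):

class QueueB {                     // a single machine; decides the results
    private int stock = 100;       // items available in the flash sale

    synchronized String submit(String request) {
        if (stock > 0) {
            stock--;
            return "redirect-to-payment-page";
        }
        return "static-sold-out-page";
    }
}

class QueueA {                     // one per web server / reverse proxy
    private final QueueB queueB;
    private final double passRate; // e.g. let 1 in 10,000 requests through

    QueueA(QueueB queueB, double passRate) {
        this.queueB = queueB;
        this.passRate = passRate;
    }

    String handle(String request) {
        if (Math.random() > passRate) {
            return "static-sold-out-page"; // cached static page for most users
        }
        return queueB.submit(request);
    }
}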

How to reduce traffic

  1. no image
  2. cache more static pages
  3. reverse proxy: batch sending the requests

Also, we can use front-end JavaScript to prevent repeated clicking. There are methods to bypass it, so we also need to check IPs or do some filtering.

How to keep it simple?

  1. no DB, just basic logic. But remember to use a log file
  2. no lock

How to improve stability

  1. use a dedicated new server for the flash sale, in case of crash
  2. process everything asynchronously! Don't make users wait, in case of crash.

How to defend hackers

  1. IP address checks (to defend against automated software), but it's easy for hackers to fake IP addresses
  2. CAPTCHA

CAPTCHA (an acronym for “Completely Automated Public Turing test to tell Computers and Humans Apart”) is a type of challenge-response test used in computing to determine whether or not the user is human.

follow-up

How to design 12306 (supporting several million QPS)

[NineChap System Design] Class 3.1: Web Service

Overview

Today we’ll look at 6 examples of problems associated with Web Service:

  1. how the internet works
  2. DNS
  3. Web server
  4. Music Player
  5. MP3 file
  6. flash sale (秒杀)

Question 0

how to solve radio-play failures

> Failure rate = % of users who can't listen to music properly

> = # of users who fail to play one song / # of total users

Mission: reduce the failure rate.

How does server identify a user?

If a server uses cookies to identify unique users, the result might be > the real user count.

However, if the server uses IP addresses, the result might be < the real user count.

How to collect data for failure rate

Version 1

Log:

  1. the user sends a log to the server upon visiting
  2. the user sends another log after playing a song
  3. we can then identify users who failed to play a song

In fact, everything should be logged, including play, pause, switch song, refresh etc.

Version 2

User logins are, in fact, automatically logged when the user visits. Thus the client ONLY has to send a log after it plays music.

Summary

  1. define the failure rate
  2. use cookies to identify users
  3. use logs to collect failure data
  4. analyze failure patterns against date/time

Question 1

the process of playing music

  1. Prepare
  2. Send DNS request
  3. Prepare DNS reply
  4. Send DNS reply
  5. Process DNS reply

    -

  6. Send webpage request

  7. Prepare webpage reply
  8. Send webpage reply
  9. Process webpage

    -

  10. Request music player

  11. Prepare music player
  12. Send music player
  13. Process music player

    -

  14. Request MP3

  15. Prepare MP3
  16. Send MP3
  17. Play MP3

What does "process music player" involve?

The local browser does rendering, flash decoding, etc. If any of these 17 steps goes wrong, the music play fails.

Is there a system/browser default Music Play?

HTML player is, but flash player is not. So the flash module have to be requested every time.

Real data: failure rate 20%

In practice, the real failure rate is 20%, broken down as:

  1. 8% DNS
  2. 5% Web
  3. 5% MP3
  4. 2% Player

Question 2

fix DNS problem

First of all, how do we find out about DNS failures? There are 2 ways: first, the help desk reports them; second, use the desktop app to help detect the host address.

Step 1. HOSTs hijack

Some users' hosts files can be modified by competitors.

  1. ping the website URL
  2. modify the hosts file manually or via the desktop app

Step 2. ISP

Each ISP has its own DNS service. E.g. CSTNET failed to update the latest DNS records after a server change.

After this step, the DNS failure rate falls from 8% to 1%. Why still 1%? Some companies ban music playback on their corporate networks.

Question 3

fix the web problem

Failure rate is highest at:

  1. 3pm, office hours
  2. 9pm, the nationwide bandwidth peak

Solution 1, reverse proxy

Use a reverse proxy with more servers. The reverse proxy acts like a load balancer.

A reverse proxy is a type of proxy server that retrieves resources on behalf of a client from one or more servers. These resources are then returned to the client as though they originated from the proxy server itself.

Common uses for a reverse proxy server include:

  1. Load balancing

    Acts as a “traffic cop,” sitting between client requests and your back-end servers, trying to maximize speed and capacity utilization while ensuring no single server is overloaded.

    If a server goes down, the load balancer redirects traffic to the remaining online servers.

  2. Web acceleration

    can compress inbound and outbound data, as well as cache commonly requested content

    also perform additional tasks such as SSL encryption to take load off of your web servers

  3. Security and anonymity

    By intercepting requests headed for your back-end servers, a reverse proxy server protects their identities and acts as an additional defense against security attacks.

Solution 2, reduce size of web page

  1. simplify javascript files
  2. compress images (lower dpi)
  3. merge images into 1 image (fewer requests)
  4. lazy loading (Pinterest uses it a lot)

Solution 3, more cacheable pages

Change dynamic webpages to static pages. The advantages of this are:

  1. more search engine friendly.
  2. more cache friendly.

Summary on caching

Caching can happen at several points along the request path; the numbers below refer to the lecture diagram (Numbers 1, 2 and 3 are cache layers):

At Number 4, we can add more servers. Number 3 is the reverse proxy. Number 2 is caching within the ISP network, which avoids requesting info again from the backend. Number 1 is the front-end browser cache.

After this step, the Web failure rate falls from 7% to 4%. Why still 4%? These failures mainly come from the junk users created by marketing.