System Design Note
Web failure rate analysis (divide and conquer)
failure rate | DNS | Web | Player | MP3 |
---|---|---|---|---|
17 | 1 | 7 | 3 | 6 |
15 | 1 | 4 | 3 | 6 |
Web
1. Too many users and requests to handle
Solution:
- Reverse proxy in front of more servers (dispatching, caching, packaging)
- Reduce the page size (simplify content ~10%, compress/merge images ~40%, lazy load ~70%)
- Serve more cacheable pages (turn dynamic pages into static ones)
Some failure rate on the web tier cannot be reduced further. Reason: impatient users refuse to wait and click too often.
CDN(Content Delivery Network)
rate limit
1. Gap algorithm
Make sure the gap between two consecutive requests is >= 0.2s; then QPS will be less than 5.
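A minimal sketch of the gap check, assuming a monotonic clock and a single client (the class name is mine):

```python
import time

class GapLimiter:
    """Allow a request only if at least `min_gap` seconds have passed
    since the last *allowed* request (a 0.2s gap keeps QPS below 5)."""

    def __init__(self, min_gap=0.2):
        self.min_gap = min_gap
        self.last_allowed = float("-inf")

    def allow(self, now=None):
        # `now` can be injected for testing; defaults to the monotonic clock.
        now = time.monotonic() if now is None else now
        if now - self.last_allowed >= self.min_gap:
            self.last_allowed = now
            return True
        return False
```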
2. Time-bucket algorithm
Make sure every second admits at most n requests.
- with a database
- without a database (keep an in-memory counter)
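The database-free counter variant can be sketched like this (names and the 1-second bucket size are illustrative):

```python
import time

class FixedWindowLimiter:
    """At most `limit` requests per one-second bucket; a plain in-memory
    counter, no database. The counter resets when the second rolls over."""

    def __init__(self, limit=5):
        self.limit = limit
        self.bucket = None   # which second the counter belongs to
        self.count = 0

    def allow(self, now=None):
        now = time.time() if now is None else now
        bucket = int(now)
        if bucket != self.bucket:      # new second: reset the counter
            self.bucket, self.count = bucket, 0
        if self.count < self.limit:
            self.count += 1
            return True
        return False
```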
3. Request-list algorithm
Put each request timestamp into a queue. Find the fifth-oldest request and calculate the time difference; it should be larger than 1s.
Optimization: circular queue
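A sketch of the request-list check using a fixed-size deque as the "circular queue" optimization (limit and window values are examples):

```python
import time
from collections import deque

class SlidingWindowLimiter:
    """Keep the timestamps of the last `limit` allowed requests.
    A new request is allowed only if the oldest of those timestamps
    is more than `window` seconds old."""

    def __init__(self, limit=5, window=1.0):
        self.window = window
        self.times = deque(maxlen=limit)  # drops the oldest automatically

    def allow(self, now=None):
        now = time.monotonic() if now is None else now
        if len(self.times) == self.times.maxlen and now - self.times[0] <= self.window:
            return False                   # limit-th oldest is still too recent
        self.times.append(now)
        return True
```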
Follow up:
How to save space with 10^9 queries per hour?
Batch queries
how to support multiple threads?
lock
Producer/consumer
how to support a per-user limiter?
<uid, requestList>
how to support queries with different quotas?
Token Bucket: space complexity O(1)
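A token bucket keeps only two numbers per user (current tokens and last refill time), hence the O(1) space; different quotas are just different rates and capacities. A minimal sketch, with an injectable clock for testing:

```python
import time

class TokenBucket:
    """O(1) state per user: (tokens, last_refill_time).
    Refills `rate` tokens per second up to `capacity`."""

    def __init__(self, rate, capacity):
        self.rate = rate
        self.capacity = capacity
        self.tokens = capacity   # start full
        self.last = 0.0

    def allow(self, now=None, cost=1):
        now = time.monotonic() if now is None else now
        # Refill lazily based on elapsed time since the last call.
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False
```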
Flash sale (seckill)
- time window
- number of items
- QPS
Target: keep the system working well under the surge
Traditional four-layer proxy chain:
browser, ISP proxy, reverse proxy, load balancer
Solution 2:
Two-level queue: the first-level queue admits only a few users and absorbs the traffic spike; the second-level queue keeps the admitted users in order.
how to reduce traffic?
- reduce page size: no images
- cache more: static pages
- proxy: batch requests, reuse connections
- limit requests: in JS, at the balancer
how to keep it simple, sweetie?
- no DB: memory + log
- no lock
how to isolate it?
- a dedicated new server
- asynchronous processing
how to defend?
- IP
- captcha
Lottery algorithm
Distributed System
- Design a distributed file system and database (Bigtable)
- Calculate word counts / inverted index / anagrams with MapReduce
Design a search engine
Google's three key papers: MapReduce / Bigtable / GFS
Don't recite the details of the papers; find a suitable solution for the given scenario.
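A toy single-process model of MapReduce word count, assuming the framework groups mapper output by key before calling the reducer (all names here are illustrative):

```python
from collections import defaultdict

def map_phase(doc_id, text):
    # Mapper: emit (word, 1) for every word in the document.
    for word in text.lower().split():
        yield word, 1

def shuffle(pairs):
    # Shuffle: group all values by key, as the framework would.
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(word, counts):
    # Reducer: sum the per-word counts.
    return word, sum(counts)

docs = {1: "the quick fox", 2: "the lazy dog"}
pairs = [p for doc_id, text in docs.items() for p in map_phase(doc_id, text)]
counts = dict(reduce_phase(w, vs) for w, vs in shuffle(pairs).items())
# counts["the"] is 2: it appears once in each document.
```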
how to save a file?
- Disk: metadata (file info, index) + blocks
how to save a large file?
- use larger blocks
how to save an extra-large file?
- the master stores metadata only (which chunkservers hold a chunk, not offsets); the many chunkservers each manage their own offsets
Single point of failure: hot standby, and reduce the master's workload (GFS has only one master)
How to find out that a file is broken?
- keep a checksum per chunk; checksums are small enough to stay in memory.
how to avoid losing data when a chunkserver is down?
- replicate the chunks: 2 extra copies (3 in total)
how to select chunkservers for a chunk? (heuristics)
- servers with below-average disk-space utilization
- servers with a limited number of "recent" creations
- across racks: 2 + 1
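A hypothetical sketch combining the three heuristics above; the field names and thresholds are mine, not from the GFS paper:

```python
def pick_chunkservers(servers, replicas=3, max_recent=5):
    """servers: list of dicts with 'id', 'rack', 'disk_util', 'recent_creations'.
    Prefer below-average disk utilization, skip servers with too many
    recent creations, and allow at most 2 replicas per rack (2 + 1)."""
    avg_util = sum(s["disk_util"] for s in servers) / len(servers)
    candidates = sorted(
        (s for s in servers
         if s["disk_util"] <= avg_util and s["recent_creations"] < max_recent),
        key=lambda s: s["disk_util"])
    chosen, per_rack = [], {}
    for s in candidates:
        if len(chosen) == replicas:
            break
        if per_rack.get(s["rack"], 0) < 2:      # spread across racks
            chosen.append(s["id"])
            per_rack[s["rack"]] = per_rack.get(s["rack"], 0) + 1
    return chosen
```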
how to recover when a chunk is broken?
- ask the master for help (just ask which chunkserver holds a good replica)
how to find out whether a chunkserver is down?
- the master keeps a liveness record of chunkservers: each chunkserver sends a heartbeat message, and the master updates the record
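The master's liveness record can be sketched as follows (the timeout value and class name are assumptions):

```python
import time

class HeartbeatMonitor:
    """Each heartbeat refreshes the chunkserver's timestamp; a server is
    declared down after `timeout` seconds of silence."""

    def __init__(self, timeout=10.0):
        self.timeout = timeout
        self.last_seen = {}          # server_id -> last heartbeat time

    def heartbeat(self, server_id, now=None):
        self.last_seen[server_id] = time.monotonic() if now is None else now

    def dead_servers(self, now=None):
        now = time.monotonic() if now is None else now
        return [s for s, t in self.last_seen.items() if now - t > self.timeout]
```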
how to recover when a chunkserver is down?
- the master keeps a repair list (repair priority is based on the remaining number of replicas)
how to avoid hot spots?
- the master runs a rebalance procedure: it records chunk access frequency and replicates a chunk into more copies when it is busy
- place the new replicas on chunkservers with more free space and more bandwidth
how to read from a file?
- ask the master which chunkserver holds the file's chunk
- then go to that chunkserver to read the data
how to write into a file?
- ask the master where the chunk is; the master returns the primary chunkserver and the replica chunkservers
- the client sends the data to a chunkserver's cache, then the chunkservers forward the data to each other
- the primary chunkserver tells each replica to write the data to disk
BigTable
how to save a large table?
- metadata of table -> table
how to save an extra-large table?
- metadata of table -> table -> sstable
how to write into a table? (Task: add(b, yeah))
- write into memory first (an sstable cannot be modified once written)
Memory | GFS/Disk |
---|---|
table0 | sstable0 |
what if the memtable is too large?
- when the memtable is full, dump it to disk as an sstable
how to avoid losing data on a system failure?
- write-ahead log
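The write path above (log first, then memory, dump to an immutable sorted sstable when full) can be sketched as a toy LSM store; the size limit and names are illustrative:

```python
class MiniLSM:
    """Toy LSM write path: log -> memtable -> immutable sorted sstables."""

    def __init__(self, memtable_limit=2):
        self.log = []          # write-ahead log (stand-in for a log file)
        self.memtable = {}
        self.sstables = []     # immutable sorted runs, newest last
        self.limit = memtable_limit

    def put(self, key, value):
        self.log.append((key, value))          # 1. log first, for recovery
        self.memtable[key] = value             # 2. then the in-memory table
        if len(self.memtable) >= self.limit:   # 3. dump when full
            self.sstables.append(sorted(self.memtable.items()))
            self.memtable = {}

    def get(self, key):
        if key in self.memtable:               # newest data wins
            return self.memtable[key]
        for run in reversed(self.sstables):    # then newest sstable first
            for k, v in run:
                if k == key:
                    return v
        return None
```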
how to read data?
- data is ordered inside an sstable
- data is unordered across sstables
how to read fast?
- build an index and keep the index in memory
- Bloom filter: it can tell for sure that a key is NOT in a block; "possibly in this block" needs an extra check
how to build bloom filter?
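One possible answer, sketched: k hash positions are derived per key and set in a bit array. A zero bit proves the key is absent; all bits set only means "maybe present". The parameters m and k here are arbitrary:

```python
import hashlib

class BloomFilter:
    def __init__(self, m=1024, k=3):
        self.m, self.k = m, k    # m bits, k hash functions
        self.bits = 0            # Python int as a bit array

    def _positions(self, key):
        # Derive k positions by salting the key; a real system would use
        # faster non-cryptographic hashes.
        for i in range(self.k):
            h = hashlib.sha256(f"{i}:{key}".encode()).digest()
            yield int.from_bytes(h[:8], "big") % self.m

    def add(self, key):
        for p in self._positions(key):
            self.bits |= 1 << p

    def might_contain(self, key):
        # False => definitely absent; True => possibly present (check the block).
        return all(self.bits >> p & 1 for p in self._positions(key))
```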
how to save a table into GFS?
key/value is the core of the database
Key = (row, column, time)
Design Twitter/Instagram/Facebook
product hunt web
design twitter
Q: Use cases
A: post tweet / read news feed / repost / follow / comment / like
Q: Requirements
A:
read timeline: QPS = 300K
write tweet: QPS = 5K
- Notify 1M followers within 1s
- Concurrent users = 150M
- Daily tweets = 400M
Application
Q: how to write a tweet
A: push model
Write API -> look up followers in the social graph -> append the tweet to each follower's timeline list
Q: how to read the timeline
A: the timeline service reads in O(1)
Q: how to deal with the storm of Lady Gaga?
A: a big star writes a tweet and you need to push a lot of data to millions of followers. We could notify only the online users, or we could use pull mode.
Q: pull mode
A: write in O(1) -> at read time, walk the social graph and merge the followees' tweets into the timeline
Q: how to deal with the Architect of the MATRIX
A: push & pull hybrid
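The hybrid can be sketched as follows. The celebrity threshold and all function names are assumptions: normal users are pushed into followers' cached timelines at write time; users over the threshold are skipped at write time and merged in at read time.

```python
CELEB_THRESHOLD = 10000  # assumed cutoff for "big star"

def post_tweet(user, tweet_id, followers, timelines, celeb_tweets):
    if len(followers[user]) >= CELEB_THRESHOLD:
        # Pull path: store once, merged in at read time.
        celeb_tweets.setdefault(user, []).append(tweet_id)
    else:
        # Push path: fan out to every follower's cached timeline.
        for f in followers[user]:
            timelines.setdefault(f, []).append(tweet_id)

def read_timeline(user, following, followers, timelines, celeb_tweets):
    feed = list(timelines.get(user, []))         # pushed tweets
    for u in following[user]:
        if len(followers[u]) >= CELEB_THRESHOLD:
            feed.extend(celeb_tweets.get(u, []))  # pulled celebrity tweets
    return feed
```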
Q: how to choose between pull and push
A:
- Understand the specific scenario
- Balance latency, computation & storage
- O(n) fanout is not workable
- Real time with huge QPS needs pull
- Simplicity asks you to focus on one model
Q: speed up
A:
- put data into memory
Q: save space
A:
- only cache the users who were active within the last week
- only cache the latest 800 tweets
- only keep tweet ids in memory
Q: how to deal with a cache miss
A: timeline builder (load from disk and append to memory)
Q: how to save the tweet-list table on disk
A: split the data into chunks and store them in decreasing time order
Q: how to support search?
chat
Q: data structures
A: user table / friend table / channel table / message table
Q: how to choose the storage
A: SQL: relationships / edits / correctness / transactions / combined state
NoSQL: K/V pairs, append-only, fault-tolerant / high performance / sequential records
Chat architecture:
Batch messages / compress messages
Delay changes / buffer writes
Typeahead
Requirements:
- average latency < 100ms
- 95th percentile < 50ms
Algorithm: binary tree / trie
SELECT * FROM table WHERE name LIKE '<query>%' ORDER BY name ASC
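A minimal trie supporting the same prefix lookup as the LIKE query above (class and method names are mine):

```python
class Trie:
    def __init__(self):
        self.children = {}
        self.is_word = False

    def insert(self, word):
        node = self
        for ch in word:
            node = node.children.setdefault(ch, Trie())
        node.is_word = True

    def starts_with(self, prefix, limit=10):
        # Walk down to the prefix node, then collect up to `limit`
        # completions in lexicographic order.
        node = self
        for ch in prefix:
            node = node.children.get(ch)
            if node is None:
                return []
        results = []

        def dfs(n, path):
            if len(results) >= limit:
                return
            if n.is_word:
                results.append(prefix + path)
            for ch in sorted(n.children):
                dfs(n.children[ch], path + ch)

        dfs(node, "")
        return results
```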
Architecture:
Browser <- Aggregator <- many {personalized ranking/ Global ranking}
speed up
- memory
- multi-level cache
- store content ids, not full details