High Level Designs
Steps for System Design
- Define MVP -> Requirement gathering
- Estimate Scale w.r.t. #users, #traffic
- Design Goals (CAP theorem)
- API Design
- System Design
Rule of thumb for estimation: 1 billion ≈ giga, so 1 billion bytes ≈ 1 GB.
Databases
- HBase - optimized for a high volume of writes; a columnar (wide-column) DB
- MongoDB - suits read-heavy systems. Has tunable consistency, so it can also work for write-heavy loads. Favors consistency and partition tolerance (CP); the primary can be a single point of failure, i.e. MongoDB sacrifices availability
- DynamoDB, MySQL - read heavy
- Cassandra - gives up strong consistency (AP)
- Redis - global cache
Queues
- Kafka/RabbitMQ - persistent queues
- RabbitMQ pushes messages to consumers, while Kafka consumers have to pull (poll) messages from the broker themselves (see the sketch below)
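A toy sketch of the difference (not the real RabbitMQ/Kafka client APIs): a push-style broker invokes subscriber callbacks as soon as a message is published, while a pull-style broker only appends to a log and consumers poll it at their own pace, tracking their own offset.

```python
from collections import defaultdict

class PushBroker:
    """Push model (RabbitMQ-style): the broker calls each subscriber when a message arrives."""
    def __init__(self):
        self.subscribers = defaultdict(list)

    def subscribe(self, topic, callback):
        self.subscribers[topic].append(callback)

    def publish(self, topic, message):
        for callback in self.subscribers[topic]:
            callback(message)                     # broker drives delivery

class PullBroker:
    """Pull model (Kafka-style): the broker only stores messages; consumers poll."""
    def __init__(self):
        self.log = defaultdict(list)

    def publish(self, topic, message):
        self.log[topic].append(message)

    def poll(self, topic, offset, max_records=10):
        batch = self.log[topic][offset:offset + max_records]
        return batch, offset + len(batch)         # consumer drives delivery and tracks its offset

# Usage
push = PushBroker()
push.subscribe("notifications", lambda m: print("pushed:", m))
push.publish("notifications", "hi A")

pull = PullBroker()
pull.publish("notifications", "hi B")
batch, next_offset = pull.poll("notifications", offset=0)
print("pulled:", batch, "next offset:", next_offset)
```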
Messaging App
- Facebook: 20-30 Million messages per day
- How to scale any product so that it keeps working as the user base grows from 100 to 100 million
Messenger
- MVP
- Ability to send and receive messages
- Message History
- Login with Facebook accounts
- Contacts/Friends
- Messages have a max length (250 chars); text messages only
- Notifications
- UTF-8 Support
- Estimation of Scale
- Facebook users/contacts: ~3 billion
- 10 Billion messages/day
- Storage requirement -> ~250 bytes per message, which includes:
- text, sender, recipient, timestamp, messageId
- 10 billion * 250 bytes ≈ 2.5 TB/day
- 2.5 TB/day * 365 * 10 yr ≈ 10 petabytes (see the quick check below)
- Hence the complete data cannot be stored on one machine and has to be split across machines. Splitting data across multiple machines is called sharding
- Sharding Required
- Both READ and Write Heavy
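A quick check of the arithmetic above (10 billion messages/day at ~250 bytes each, retained for 10 years):

```python
MESSAGES_PER_DAY = 10 * 10**9   # 10 billion messages/day
BYTES_PER_MESSAGE = 250         # text + sender + recipient + timestamp + messageId

per_day = MESSAGES_PER_DAY * BYTES_PER_MESSAGE   # bytes written per day
ten_years = per_day * 365 * 10                   # bytes kept over 10 years

print(f"per day : {per_day / 10**12:.1f} TB")    # ~2.5 TB/day
print(f"10 years: {ten_years / 10**15:.1f} PB")  # ~9.1 PB, i.e. on the order of 10 PB
```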
- Design Goals
- Latency should be low
- Consistency vs Availability (CAP theorem)
- API Design
- sendMessage(userId, recipient, text, msgId)
- getConversations(userId, lastTimeStamp, maxCount)
- getMessages(userId, conversationId, #msgs, offset)
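A minimal in-memory sketch of these three calls, assuming the signatures listed above; the list-based store stands in for the real sharded database, and a conversation is identified by the other participant.

```python
from dataclasses import dataclass, field
from time import time
from typing import List

MAX_LEN = 250                    # per the MVP: text only, max 250 chars

@dataclass
class Message:
    msg_id: str
    sender: str
    recipient: str
    text: str
    timestamp: float = field(default_factory=time)

_store: List[Message] = []       # stand-in for the sharded message store

def send_message(user_id: str, recipient: str, text: str, msg_id: str) -> Message:
    if len(text) > MAX_LEN:
        raise ValueError("message exceeds 250 characters")
    msg = Message(msg_id, user_id, recipient, text)
    _store.append(msg)
    return msg

def get_conversations(user_id: str, last_timestamp: float, max_count: int):
    """Return up to max_count contacts this user exchanged messages with since last_timestamp."""
    partners = {m.recipient if m.sender == user_id else m.sender
                for m in _store
                if user_id in (m.sender, m.recipient) and m.timestamp >= last_timestamp}
    return sorted(partners)[:max_count]

def get_messages(user_id: str, conversation_id: str, count: int, offset: int):
    """Page through one 1-1 conversation (conversation_id = the other participant here)."""
    conv = [m for m in _store
            if {m.sender, m.recipient} == {user_id, conversation_id}]
    conv.sort(key=lambda m: m.timestamp, reverse=True)
    return conv[offset:offset + count]
```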
- System Design
- Client(mobile app/browser)
- Load balancer (on Facebook's side)
- Application servers
- Sharding: conversation-based sharding makes it hard to fetch a user's most recent conversations, so shard by userId instead
- App servers hold a cache along with the application logic - a write-through cache to achieve a highly consistent system (sticky sessions pin a user to the same app server); see the sketch after this section
- When a message is sent from A to B
- Message goes to A's cache, then to A's DB
- Message goes to B's cache, then to B's DB; also added to the queue for notifications
- Trade-off: when an app server goes down, it may take 2-3 seconds for a new app server to be assigned and pick up the user's details
- Write Request from A to B
- load balancer
- webserver
- B cache/app server -> B DB
- A cache/app server -> A DB
- Add to persistent queue (Kafka/RabbitMQ)
- Push Notifications
- Message is pushed to the machines subscribed to the queues
- Machines maintain connection details - Device DB (user, deviceId, machine)
- The notification is delivered to the recipient's device
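A rough sketch of this write path under userId-based sharding: each user's shard has a write-through cache in front of its DB slice, and notification work goes through a queue. Everything here is an in-memory stand-in for the real cache, database, queue, and Device DB.

```python
from collections import defaultdict

class UserShard:
    """One user's app server + cache + DB slice (write-through: cache and DB updated together)."""
    def __init__(self):
        self.cache = []   # hot recent messages
        self.db = []      # durable store (stand-in)

    def write(self, message):
        self.cache.append(message)   # write-through: update the cache...
        self.db.append(message)      # ...and the backing store in the same request

shards = defaultdict(UserShard)      # userId-based sharding
notification_queue = []              # stand-in for Kafka/RabbitMQ

def write_message(sender, recipient, text, msg_id):
    message = {"id": msg_id, "from": sender, "to": recipient, "text": text}
    shards[sender].write(message)        # A's cache + DB
    shards[recipient].write(message)     # B's cache + DB
    notification_queue.append((recipient, msg_id))   # picked up by notification workers

# Push-notification worker: looks up which machine/device holds B's connection and notifies it.
device_db = {"B": ("machine-7", "device-42")}        # user -> connection details (stand-in)

def drain_notifications():
    while notification_queue:
        recipient, msg_id = notification_queue.pop(0)
        machine, device = device_db.get(recipient, (None, None))
        if machine:
            print(f"notify {recipient} on {machine}/{device} about {msg_id}")

write_message("A", "B", "hello", "m1")
drain_notifications()
```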
Books
- https://www.goodreads.com/book/show/23463279-designing-data-intensive-applications
System Types
- Read Heavy
- Youtube, Instagram, Netflix
- Write Heavy
- Logs, Rating
- Read/Write Heavy
Slack
- Read heavy; ~600K organizations use Slack
- 5 Step framework
- Gather Requirements/MVP(P1, P2, P3)
- Estimate Scale //What is the problem
- Storage(GB/TB/PB)
- Read>Write(QPS)
- API Design //how to do it
- Design goals //design to remove bottlenecks in 2 &3
- Latency
- CAP theorem
- System Design/High level design
- Remove all bottlenecks found in steps 2, 3, 4 (implementation details that address 2, 3, 4)
- MVP/Requirements
- Send/Receive messages
- Group & 1-1 messages
- Fetch all messages for a conversation
- adding multimedia
- Estimate Scale
- Daily Active Users
- similarweb.com
- competitive analysis
- 10M users sending 100 messages/day, each message ~1 KB in size
- storage per year = 10M * 100 * 1 KB * 365 = 10^9 KB/day * 365 = 1 TB/day * 365 = 365 TB (verified in the snippet below)
- QPS
- If Slack has 10M Daily active users and each users reads about 1000 msgs per day, what will be the number of read queries / sec?
- (10M * 1000) / (3600*24) ~ 120K
- Inferences:
- One server cannot store all messages
- One server cannot handle so many reads per sec
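The same estimates in a couple of lines:

```python
DAU = 10 * 10**6                 # 10M daily active users
MSGS_SENT_PER_USER = 100         # messages sent per user per day, ~1 KB each
MSGS_READ_PER_USER = 1000        # messages read per user per day
SECONDS_PER_DAY = 3600 * 24

storage_per_year_tb = DAU * MSGS_SENT_PER_USER * 1_000 * 365 / 10**12
read_qps = DAU * MSGS_READ_PER_USER / SECONDS_PER_DAY

print(f"storage/year ~ {storage_per_year_tb:.0f} TB")    # ~365 TB
print(f"read QPS     ~ {read_qps / 1000:.0f}K")          # ~116K, i.e. roughly 120K
```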
- API Design
- GetConversations(userId, authToken, #conv, offset)
- SendMessage(userId, authToken, conversationId, timestamp, messageBody, attachment(blob_url), msg_id)
- FetchMsgs(authToken, convId, offset, #msgs) (see the auth-check sketch below)
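Since every call here carries an authToken instead of a trusted userId, the app server has to resolve and validate the token before touching storage. A hedged sketch, where the session map, membership check, and storage call are hypothetical stand-ins:

```python
_SESSIONS = {"secret-token-u123": "u123"}   # authToken -> userId (stand-in for a session service)

def fetch_msgs(auth_token: str, conv_id: str, offset: int, count: int):
    user_id = _SESSIONS.get(auth_token)
    if user_id is None:
        raise PermissionError("invalid or expired auth token")
    if not _is_member(user_id, conv_id):                   # hypothetical membership check
        raise PermissionError("user is not part of this conversation")
    return _message_store_fetch(conv_id, offset, count)    # hypothetical call into the sharded store

def _is_member(user_id: str, conv_id: str) -> bool:
    return True    # placeholder: would consult a conversation-membership table

def _message_store_fetch(conv_id: str, offset: int, count: int):
    return []      # placeholder: would page messages for conv_id from its shard
```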
- Design Goals
- Latency: low
- CAP: a CP system (when the network partitions, only two options remain: consistency or availability, and we pick consistency). CAP is a property of distributed systems in general, not just of databases.
- System Design
- Client -> DNS server -> IP address of load balancer -> application servers (as1, as2, ...)
- Application servers are deployed in a private network and cannot be accessed directly
- Sharding options
- By MessageId -> requires visiting all shards to fetch the messages of a single conversation, which means high latency
- By ConversationId -> requires visiting multiple data nodes/shards to fetch all conversations of a single user, which means high latency
- By UserId -> cannot be used for group conversations
- By OrgId -> a single large org can outgrow one shard, e.g. TCS: 300K * 100 msgs * 1 KB = 30 GB/day ≈ 10 TB/year
- OrgId+Month -> keeps a month of one org's messages together while spreading its history across shards (see the sketch below)
- check the architecture of DRUID
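A sketch of an OrgId+Month composite shard key, with an illustrative shard count and hash choice: hashing (orgId, month) keeps one month of an org's messages on a single shard while spreading a huge org's history over time, similar in spirit to Druid's time-partitioned segments.

```python
import hashlib
from datetime import datetime

NUM_SHARDS = 64   # illustrative

def shard_for(org_id: str, ts: datetime) -> int:
    """Route a message by (orgId, month) so a 300K-employee org spreads over shards
    month by month instead of overflowing a single node."""
    key = f"{org_id}:{ts.strftime('%Y-%m')}"
    digest = hashlib.md5(key.encode()).hexdigest()
    return int(digest, 16) % NUM_SHARDS

print(shard_for("org-tcs", datetime(2024, 1, 15)))
print(shard_for("org-tcs", datetime(2024, 2, 15)))   # different month -> likely a different shard
```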
- Read TPS
- Read Heavy
- Low latency
- Replication - keep copies of the data to share the read load (and survive node failures)
- 300K users * 1000 reads/day / 86,400 sec ≈ 3,500 message reads/sec
- with 1 replica, each copy serves ~1,750 reads/sec
- how does replication happen
- data nodes talk to each other using Gossip protocol
- Caching further improves the response time
- client -> load balancer -> application server (group) -> global cache -> database (cluster)
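A sketch of that read path with a cache-aside global cache in front of the DB cluster: check the cache first, and on a miss read from the database (one of the replicas) and populate the cache. Both stores are plain dicts standing in for Redis and the sharded/replicated DB.

```python
cache = {}        # stand-in for the global cache (e.g. Redis)
database = {}     # stand-in for the replicated DB cluster

def read_conversation(conv_id):
    # 1. try the cache first
    if conv_id in cache:
        return cache[conv_id]                 # cache hit: no DB round-trip
    # 2. on a miss, go to the database (one of the replicas)...
    messages = database.get(conv_id, [])
    # 3. ...and populate the cache so the next read is fast
    cache[conv_id] = messages
    return messages

database["conv-1"] = ["hi", "hello"]
print(read_conversation("conv-1"))   # miss -> DB
print(read_conversation("conv-1"))   # hit  -> cache
```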
- Signup -> add friends -> post updates (text, images, comments on posts, likes on images)