What to consider when estimating scale in system design
- Traffic Estimation
- requests in queries per second
- how many daily users?
- how are the users distributed geographically?
- what are going to be the peak hours
- from this data, estimate concurrent users
- how many requests per sec per user?
- from this data calculate total requests per sec during peak hours.
- throughput of data ingestion
- how often and how much of data is expected to be ingested from the source.
- how many daily users?
- read-write ratio
- which of the data and related objects/services are read intensive, which are write intensive, and what is the ratio?
- bandwidth
- scale of read-write actions on servers, databases
- network bandwidth for data ingestion, output, integrations, pipelines, etc
- user bandwidth expectation
- requests in queries per second
- Storage Estimation
- how much total data is generated and where is what stored? databases/caches/archival/data warehouses/lakes
- what is the data retention policy?
- logging and analytics data.