Important Topics For System Design and Scalability Interview

  1. Horizontal vs Vertical scaling: Horizontal scaling means that you scale by adding more machines into your pool of resources whereas Vertical scaling means that you scale by adding more power (CPU, RAM) to an existing machine.
  2. DB Partitioning (Sharding): Partitioning is the more generic term for dividing data across tables or databases. Sharding is one specific type of partitioning, a form of horizontal partitioning in which the rows of a table are spread across separate database servers.
  3. Database Denormalization (Relational DB vs NoSQL): Denormalization is a time-space trade-off. Normalized data takes less space but may require joins to construct the desired result set, which takes more time. Denormalized data is replicated in several places, so it takes more space, but the desired view of the data is readily available.
  4. Caching: Caching is a technique that stores a copy of a given resource and serves it back when requested.
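  5. Load balancer (L4 vs L7): Load balancing refers to efficiently distributing incoming network traffic across a group of backend servers, also known as a server farm or server pool. An L4 load balancer routes on transport-layer information such as IP address and port, while an L7 load balancer routes on application-layer content such as HTTP headers or URL paths.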
  6. MapReduce: MapReduce is a programming model and an associated implementation for processing and generating big data sets with a parallel, distributed algorithm on a cluster.
  7. Asynchronous processing and queues: Asynchronous processing lets slow work run outside the request path, so that several workflow steps can proceed at the same time. In some cases the work can be done in advance. For example, a queue of jobs, each updating some part of the system, can be processed in the background.
  8. Networking Metrics (Bandwidth, Throughput, Latency): Bandwidth is the maximum data transmission rate possible on a network. Throughput is the data transmission rate the network actually achieves, which can vary widely between different parts of the network. Latency is the delay between a node or device requesting data and that data being fully delivered.
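
To make items 2 and 4 above more concrete, here is a minimal sketch of hash-based sharding with a small least-recently-used cache in front of it. The names ShardedStore, LRUCache, and shard_for are hypothetical, the shards are plain in-memory dictionaries standing in for separate database servers, and LRU is only one of several possible eviction policies.

```python
import hashlib
from collections import OrderedDict


class ShardedStore:
    """Hypothetical key-value store that spreads keys over N in-memory shards."""

    def __init__(self, num_shards):
        # Each dict stands in for a separate database server (an assumption).
        self.shards = [{} for _ in range(num_shards)]

    def shard_for(self, key):
        # Use a stable hash rather than Python's per-process randomized hash(),
        # so the same key always maps to the same shard.
        digest = hashlib.sha256(key.encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big") % len(self.shards)

    def put(self, key, value):
        self.shards[self.shard_for(key)][key] = value

    def get(self, key):
        return self.shards[self.shard_for(key)].get(key)


class LRUCache:
    """Tiny least-recently-used cache placed in front of the store."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.items = OrderedDict()
        self.hits = 0
        self.misses = 0

    def get(self, key, load):
        if key in self.items:
            self.items.move_to_end(key)  # mark as most recently used
            self.hits += 1
            return self.items[key]
        self.misses += 1
        value = load(key)  # cache miss: fall through to the backing store
        self.items[key] = value
        if len(self.items) > self.capacity:
            self.items.popitem(last=False)  # evict the least recently used entry
        return value


store = ShardedStore(num_shards=4)
for user_id in ("user:1", "user:2", "user:3"):
    store.put(user_id, {"id": user_id})

cache = LRUCache(capacity=2)
cache.get("user:1", store.get)  # miss: loaded from the shard that owns the key
cache.get("user:1", store.get)  # hit: served from the cache
```

One caveat with the modulo step in shard_for: changing the number of shards remaps most keys, which is the problem consistent hashing is meant to reduce.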

Some More Bonus and Related Topics

  1. CAP theorem: The CAP theorem states that a distributed database system has to make a tradeoff between Consistency and Availability when a Partition occurs.
  2. ACID vs BASE: ACID transactions are the more pessimistic approach, i.e. they prioritize data safety and consistency. BASE relaxes consistency so that the system can keep processing requests even while its data is temporarily inconsistent.
  3. Consistent hashing
  4. optimistic vs pessimistic locking
  5. Strong vs eventual consistency
  6. Types of NoSQL — key-value, wide column, document-based, graph-based
  7. data center/racks/hosts
  8. CPU, memory, hard drive, network bandwidth
  9. random vs sequential read-write on disk
  10. HTTP vs HTTP/2 vs WebSockets
  11. TCP/IP model
  12. TCP vs UDP
  13. IPv4 vs IPv6
  14. DNS lookup
  15. HTTPS and TLS
  16. Public key infrastructure and certificate authority
  17. symmetric key vs asymmetric key
  18. CDNs vs edge
  19. bloom filters and count-min sketch
  20. Paxos consensus over distributed hosts and leader election
  21. design patterns and object-oriented design
  22. virtual machines and containers
  23. publisher-subscriber or queue
  24. multithreading
  25. concurrency
  26. locks
  27. CAS: The Central Authentication Service is a single sign-on protocol for the web. Its purpose is to permit a user to access multiple applications while providing their credentials only once.
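
For the ACID side of item 2 above, the following is a small sketch of atomicity using Python's built-in sqlite3 module. The accounts table, the transfer and balances helpers, and the sample names are all made up for illustration. The point is only that a failed transfer leaves no partial update behind.

```python
import sqlite3


def transfer(conn, src, dst, amount):
    """Move `amount` between two accounts as a single all-or-nothing unit."""
    with conn:  # commits on success, rolls back if an exception escapes
        conn.execute(
            "UPDATE accounts SET balance = balance - ? WHERE name = ?",
            (amount, src),
        )
        (balance,) = conn.execute(
            "SELECT balance FROM accounts WHERE name = ?", (src,)
        ).fetchone()
        if balance < 0:
            # Raising here undoes the debit above; the credit never runs.
            raise ValueError("insufficient funds")
        conn.execute(
            "UPDATE accounts SET balance = balance + ? WHERE name = ?",
            (amount, dst),
        )


def balances(conn):
    """Return the current balances as a {name: balance} dictionary."""
    return dict(conn.execute("SELECT name, balance FROM accounts"))


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (name TEXT PRIMARY KEY, balance INTEGER)")
conn.executemany(
    "INSERT INTO accounts VALUES (?, ?)", [("alice", 100), ("bob", 50)]
)
conn.commit()

transfer(conn, "alice", "bob", 30)       # succeeds: alice 70, bob 80
try:
    transfer(conn, "alice", "bob", 500)  # fails partway and is rolled back
except ValueError:
    pass
print(balances(conn))                    # balances are unchanged by the failed transfer
```

A BASE-style system would typically accept the write on one replica and reconcile the others later, trading this all-or-nothing guarantee for availability.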

Source: Various

Software Developer with 4+ years of experience in Java and web technologies. Health and Fitness enthusiast.