

Let’s assume that each stored object will be approximately 500 bytes (just a rough estimate).

Bandwidth (network bandwidth) denotes the amount of data transferred in a period of 1 second. Data transfer includes incoming as well as outgoing data.

- Incoming data is the amount of data transferred to the server (similar to a file upload).
- Outgoing data is the amount of data returned from the server to the client (similar to a download).

For write requests, at ~200 new URLs per second and 500 bytes per object, the total incoming data for our service will be 100 KB per second.

For read requests, since we expect ~20K URL redirections every second, the total outgoing data for our service will be 10 MB per second.

To optimize performance and reduce DB round trips, caching URLs will be an efficient solution. Here, let’s follow the Pareto principle, aka the 80:20 rule, i.e. 20% of the short links generate 80% of the system traffic.

So, since we have a total of 20K URLs/s (or 20K requests/s), 1 day will have:

20K * 3600 seconds * 24 hours = 1.7B requests / day

To cache 20% of these requests, we will need:

0.2 * 1.7B * 500 bytes = ~170 GB

NOTE: Due to duplicate requests, our actual memory usage will be less than 170 GB.

One choice to make for the cache is the backend server's built-in cache vs. a dedicated cache such as Redis or Memcached.

Since our system has 500M URLs in a month and a read:write ratio of 100:1, the estimates worked out so far (about 100 KB/s incoming, 10 MB/s outgoing, and up to 170 GB of cache memory) form the system specifications.

Based on the analysis done so far, either SQL or NoSQL will work for this solution (but NoSQL will be a much better fit). You can also use the auto-increment ID feature of the database (for the base62 encoding approach, as shown in the algorithms section).

Load Balancer (LB)

As scale is important, adding a load balancer will significantly improve the end-user experience. Though load balancing can be applied at many places, starting with load balancing between the clients and application servers will be sufficient. Then, based on estimates of scale, load balancing can additionally be applied at other points in the system.
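The bandwidth and cache-memory arithmetic above can be double-checked with a few lines of Python. This is only a sketch under the article's own assumptions (500 bytes per object, ~200 writes/s, ~20K reads/s, caching 20% of daily requests); the function names are made up for illustration.

```python
OBJECT_SIZE_BYTES = 500      # rough per-object size assumed in the article
WRITES_PER_SECOND = 200      # ~200 new URLs per second
READS_PER_SECOND = 20_000    # ~20K redirections per second
SECONDS_PER_DAY = 24 * 3600


def bandwidth_bytes_per_second(requests_per_second, object_size=OBJECT_SIZE_BYTES):
    """Data moved per second for a given request rate."""
    return requests_per_second * object_size


def cache_memory_bytes(reads_per_second, cached_percent=20, object_size=OBJECT_SIZE_BYTES):
    """Memory needed to cache a percentage of one day's requests (80:20 rule)."""
    requests_per_day = reads_per_second * SECONDS_PER_DAY
    return requests_per_day * cached_percent // 100 * object_size


incoming = bandwidth_bytes_per_second(WRITES_PER_SECOND)
outgoing = bandwidth_bytes_per_second(READS_PER_SECOND)
cache = cache_memory_bytes(READS_PER_SECOND)

print(f"incoming: {incoming / 1e3:.0f} KB/s")
print(f"outgoing: {outgoing / 1e6:.0f} MB/s")
print(f"cache:    {cache / 1e9:.1f} GB")
```

The unrounded cache figure comes out slightly above 170 GB because the article rounds 1.728B requests per day down to 1.7B.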
Since we have 500M new URLs every month, the total number of objects we expect to store over 10 years will be:

500 million * 10 years * 12 months = 60 billion
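As a quick check of the storage estimate, the sketch below recomputes the object count and turns it into a raw storage figure. The 500-byte object size is the rough estimate used elsewhere in this article; the resulting total is derived here as an illustration, not a number stated in the original, and the function names are hypothetical.

```python
def total_objects(new_urls_per_month, years):
    """Number of stored URL objects if nothing is deleted for `years` years."""
    return new_urls_per_month * 12 * years


def total_storage_bytes(objects, bytes_per_object=500):
    """Raw storage needed, ignoring indexes, replication and other overhead."""
    return objects * bytes_per_object


objects = total_objects(500_000_000, 10)
storage = total_storage_bytes(objects)

print(f"objects: {objects / 1e9:.0f} billion")
print(f"storage: {storage / 1e12:.0f} TB")
```

Under these assumptions the raw data is about 30 TB; real deployments would need more once replication and indexes are included.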

Short links are usually around 7 to 8 characters long.

Our URL shortening system should meet the following functional requirements.

- Given a URL, the service should generate a shorter and unique URL from it.
- When users access a short link, the service should redirect them to the original link.
- Links can expire after a standard default expiry timespan. Users may optionally be able to specify an expiry time.
- Users can optionally create a custom short link for their URL (Note: the custom short link should be unique).

Our URL shortening service should also adhere to the below non-functional requirements (NFRs).

- The system should be highly available. If the service is down, all the URL redirections will fail.
- URL redirection should happen in real time, with minimal latency.

Apart from the functional requirements and NFRs, analytical requirements are nice-to-have features.

The critical factor to consider in this kind of application/service is that reads will far outnumber writes. We can assume a read-to-write ratio of 100:1 (you can change this depending on your context).

Assuming 500 million new URL shortening requests per month and a 100:1 read/write ratio, we can expect 50B redirections per month.

Let’s also calculate the Queries Per Second (QPS) for our new system.

New URL shortening requests per second:

500M / (30 days * 24 hours * 3600 seconds) = ~200 URLs/second

So, URL redirections per second will be (100:1 read/write ratio):

100 * 200 URLs/second = ~20K redirections/second

Let’s assume we store every URL shortening request for 10 years.
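The traffic numbers above follow from two inputs, the monthly write volume and the read/write ratio. A minimal sketch of that calculation, assuming a 30-day month as in the text (the function and key names are illustrative only):

```python
SECONDS_PER_MONTH = 30 * 24 * 3600  # 30-day month, as in the estimate above


def traffic_estimate(new_urls_per_month, read_write_ratio):
    """Derive monthly redirections and per-second write/read rates."""
    writes_per_second = new_urls_per_month / SECONDS_PER_MONTH
    return {
        "redirections_per_month": new_urls_per_month * read_write_ratio,
        "writes_per_second": writes_per_second,
        "reads_per_second": writes_per_second * read_write_ratio,
    }


estimate = traffic_estimate(new_urls_per_month=500_000_000, read_write_ratio=100)
for name, value in estimate.items():
    print(f"{name}: {value:,.0f}")
```

The exact rates are roughly 193 writes/s and 19,300 reads/s, which the text rounds to ~200 and ~20K. Changing the ratio or the monthly volume to match your own context updates all three figures at once.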

This is a very common interview question at product and FANG companies. Even though the problem statement appears simple, the actual implementation has some architectural challenges.

A URL shortening service (or web service) is used to create shorter, terse aliases for long URLs. We call these shortened aliases “short links.” Users are redirected to the original URL when they hit these short links.

Short links save a lot of space when displayed, messaged, printed, tweeted, or used for any other purpose. Additionally, users are less likely to mistype shorter URLs, which avoids a lot of 404s.

For example, a long page URL can be shortened through TinyURL.
