Imagine you have built a revolutionary algorithm (it could be anything) on your own computer. People want to start using it, and they are ready to pay you because it's useful to them, but the algorithm lives only on your computer. Hmm, that's quite a problem. There are two possible solutions:
You can lend your computer to the people who want to use the algorithm, each for a limited time and one after another.
You can expose your code as a service that runs on the internet. Instead of storing the output of the algorithm in a file, you send it back as a response whenever a client sends you a request.
Obviously, the first solution has a lot of disadvantages:
You cannot lend your computer to more than one client at a time, which creates a huge gap between the money you earn and the money you could be earning.
The business becomes increasingly difficult to manage as the number of people wanting to use your service grows.
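To make the second solution a bit more concrete, here is a minimal sketch of a request/response service using only Python's standard library. The function `revolutionary_algorithm`, the `/run` path, the `input` query parameter, and the port are all made up for illustration; your real algorithm and API would look different.

```python
import json
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer
from urllib.parse import parse_qs, urlparse


def revolutionary_algorithm(text: str) -> str:
    """Hypothetical stand-in for your algorithm: it just reverses the input."""
    return text[::-1]


class AlgorithmHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # The client sends its input in the query string, e.g. /run?input=hello
        url = urlparse(self.path)
        if url.path != "/run":
            self.send_error(404, "unknown path")
            return
        text = parse_qs(url.query).get("input", [""])[0]
        # Instead of writing the output to a file, send it back as the response.
        body = json.dumps({"output": revolutionary_algorithm(text)}).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the sketch quiet: no per-request logging


def make_server(port: int = 8000) -> ThreadingHTTPServer:
    """Build (but do not start) a server bound to localhost on the given port."""
    return ThreadingHTTPServer(("127.0.0.1", port), AlgorithmHandler)
```

Calling `make_server().serve_forever()` starts the service, and a request to `http://127.0.0.1:8000/run?input=hello` then gets a JSON response back. Many clients can send requests at the same time, which is exactly what lending out your computer could not do.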
So you go with the second solution, and now you have an API (Application Programming Interface) service running on the internet that exposes your algorithm to the people willing to pay for it. There are lots of factors to consider when you take a service online in this manner. One major question is:
- What happens if there is a power cut in your house or area? The system goes down, and users don't want that to happen often (or at all, if possible) because they are paying you for the service.
The answer is to use a cloud-based machine. What's a cloud-based machine? The cloud is a set of computers that a provider (for example AWS, Azure, or GCP) rents out to you for your own needs. Put simply, you pay and get computational power in return.
Do these machines already have your algorithm on them?
The simple answer is that they don't. Yes, you read that right: the machines you are paying for do not come with your algorithm. So what good are they to you? That's a fair concern, and you might be wondering what you're spending money on.
You are paying for the ability to run your algorithm and expose it to the internet without having to maintain the machine yourself all the time. You will have to log in remotely to one of these computers and set up everything required to expose your service to the internet, just like you did on your own computer.
Now that your service is exposed to the internet through the cloud, we should start thinking about the business requirements. Let's say your algorithm goes viral and lots of people now want to use it, but your only machine on the cloud cannot handle all those requests (connections). Hmm, that's a big problem: you do not want to turn away new clients who are willing to pay just because you cannot handle their requests. There are two ways to handle such a scenario:
You can increase the capability of the machine (a bigger computer in terms of specs).
You can increase the number of machines (more computers running your algorithm).
Solution-1:
When you buy a bigger machine with higher specifications and more processing power, it can process each request faster and so handle many more requests. This kind of scaling is called vertical scaling.
Solution-2:
When you buy more machines, each incoming request is sent to one of the available machines, so the system as a whole can handle more requests too. This kind of scaling is called horizontal scaling.
Key differences & Pros and Cons:
Horizontal Scaling:
It needs a load balancer to route each request to one of the available boxes/servers.
When one of the servers fails, requests can be routed to another available instance running the same algorithm, so horizontal scaling is more resilient than vertical scaling.
Any communication between the servers happens through network calls, which can be significantly slower than the inter-process communication (IPC) that happens on the single machine of a vertically scaled system.
If each instance is connected to its own database, it's really hard to keep the data consistent. If the system is mission-critical and we expect a lot of database transactions, we might have to lock writes on every instance's database until an operation has been applied to all of them. This process is error-prone and impractical.
As the number of people using our service grows, this scales well: capacity grows nearly linearly with the number of servers, so you can keep adding servers as your user count grows.
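The load balancer and the failover behaviour described in this list can be sketched in a few lines. This is only a toy, in-memory illustration assuming a simple round-robin policy; the `Server` and `RoundRobinBalancer` classes and the box names are hypothetical, and a real load balancer does much more (health checks, timeouts, connection handling).

```python
from itertools import cycle


class Server:
    """Hypothetical stand-in for one machine running the algorithm."""

    def __init__(self, name):
        self.name = name
        self.healthy = True

    def handle(self, request):
        if not self.healthy:
            raise ConnectionError(f"{self.name} is down")
        return f"{self.name} handled {request}"


class RoundRobinBalancer:
    """Sends each request to the next server in rotation, skipping failed ones."""

    def __init__(self, servers):
        self.servers = list(servers)
        self._rotation = cycle(self.servers)

    def route(self, request):
        # Try every server at most once, starting from the next one in rotation.
        for _ in range(len(self.servers)):
            server = next(self._rotation)
            try:
                return server.handle(request)
            except ConnectionError:
                continue  # this instance failed, so try the next one
        raise RuntimeError("no healthy servers available")


servers = [Server("box-1"), Server("box-2"), Server("box-3")]
balancer = RoundRobinBalancer(servers)

# With all boxes healthy, requests are spread evenly across them.
before_failure = [balancer.route(f"req-{i}") for i in range(3)]

# When box-2 goes down, its requests are re-routed to the next available box.
servers[1].healthy = False
after_failure = [balancer.route(f"req-{i}") for i in range(3, 6)]
```

Here `before_failure` uses box-1, box-2 and box-3 in turn, while `after_failure` only ever lands on box-1 and box-3. The clients never notice that one box is gone, which is the resilience a single machine cannot give you.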
Vertical Scaling:
It does not need a load balancer, because there is only one machine and all incoming requests go to it.
When the server fails, there is no other machine to re-route requests to, so it is a single point of failure.
Communication happens through inter-process communication (IPC) on the same machine, so it is significantly faster.
There is no database consistency problem here because there is only one machine.
As the number of people using our service grows, we run into a hardware limit on how far a single system can be scaled vertically; we can't just keep making it bigger and bigger. At some point in the future, it has to be scaled horizontally too.
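The database consistency point that appears in both lists is easier to see with a tiny example. This is a sketch under simple assumptions: each "database" is just a Python dict, and the `Instance` class and the account name are made up for illustration.

```python
class Instance:
    """Hypothetical server instance that stores account balances in its own database."""

    def __init__(self):
        self.db = {}  # a plain dict standing in for this instance's database

    def deposit(self, account, amount):
        self.db[account] = self.db.get(account, 0) + amount

    def balance(self, account):
        return self.db.get(account, 0)


# Horizontally scaled, each instance with its own database:
# two deposits for the same account land on different instances.
box_a, box_b = Instance(), Instance()
box_a.deposit("alice", 100)
box_b.deposit("alice", 50)
horizontal_view = (box_a.balance("alice"), box_b.balance("alice"))

# Vertically scaled: one machine, one database, so there is only one answer.
big_box = Instance()
big_box.deposit("alice", 100)
big_box.deposit("alice", 50)
vertical_view = big_box.balance("alice")
```

In the horizontal case the two instances disagree about the balance (`horizontal_view` is `(100, 50)`), and neither holds the true total of 150. Keeping them in sync is what requires the locking described earlier. On the single machine, `vertical_view` is simply `150`.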
What should be done in the real world?
It should be a hybrid design of sorts: we take the pros of both vertical and horizontal scaling to build a system that is resilient, fast, and consistent in terms of data. The ideal approach is to start with as big a box/server (in terms of specs, not size..xD) as is feasible, and then scale it horizontally as the business requirements demand.
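As a rough back-of-the-envelope sketch of that hybrid idea: pick the biggest box you can feasibly get, then work out how many of them the expected load needs. The box sizes and the requests-per-second figures below are invented numbers, not benchmarks, and the `boxes_needed` helper is hypothetical.

```python
import math

# Hypothetical machine sizes and the requests per second each one can serve.
BOX_CAPACITY_RPS = {"small": 100, "medium": 250, "large": 600}


def boxes_needed(peak_rps, biggest_feasible_box, min_boxes=1):
    """Number of boxes of the chosen size needed to serve the peak load.

    min_boxes lets you keep a spare box around for resilience even when
    one box would be enough for the load.
    """
    per_box = BOX_CAPACITY_RPS[biggest_feasible_box]
    return max(min_boxes, math.ceil(peak_rps / per_box))


# Early days: one large box is enough (vertical scaling only).
early = boxes_needed(500, "large")

# After going viral: the same large box, now scaled out horizontally.
viral = boxes_needed(2000, "large")

# The same load on small boxes needs far more machines to manage.
viral_small = boxes_needed(2000, "small")
```

With these made-up numbers, `early` is 1, `viral` is 4 and `viral_small` is 20. Starting from a bigger box keeps the number of machines (and the load balancing and consistency work that comes with them) smaller for the same load.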