Tuesday, January 10, 2012

Designing a scalable application

Imagine your traditional non-Web-based distributed application has ten users. Chances are your application works just fine in this environment - responses to queries are fast, print jobs run on time, etc. Now imagine your company gets absorbed into a much, much larger one and now your distributed application has 10,000 users. Will your application run just as fast and will your users be just as happy?

Well, no. Not without, at the very least, hardware upgrades. But, even after you redo all the server hardware, will your application still perform to specification?

If it doesn't (and chances are it won't), we say that the application does not scale well; in other words, the application is not capable of adapting to a new environment where there are more users. If the application copes, then we say it does scale well.

Scalability is important in many areas, but the consequences of ignoring it can be felt particularly acutely if you suddenly find yourself with a successful Web-based distributed application, like an e-commerce site, where the original traffic of 1,000 visits a day suddenly jumps to 10,000,000. How do you design your application in such a way that increases of that magnitude are manageable?

Luckily, you're halfway there. The first step in managing that kind of increase is to adopt a 3-tier architecture. In short, by separating the business tier from the presentation and database roles, you can build business objects that work smarter and scale better. The trick with managing scale is to think about the number of users working on the system concurrently at any one time. Say you do have 10,000,000 visits a day. Spread evenly over the 86,400 seconds in a day, that is about 116 visits arriving every second. Imagine all of these users are using the same page. How many database connections do they need in total? The answer: 116, one each. Now, how many database connections will be required at the same time? The answer: fewer than 116 on average, provided each request holds its connection for less than a second.
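The arithmetic behind those figures can be sketched in a few lines of Python. This is only a rough illustration: it assumes traffic is spread evenly across the day (real traffic has peaks), and the 0.25-second connection hold time is an invented figure, not something measured. The estimate of connections in use multiplies the arrival rate by the hold time (Little's law).

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400


def arrivals_per_second(visits_per_day):
    """Average number of visits arriving each second, assuming even traffic."""
    return visits_per_day / SECONDS_PER_DAY


def average_connections_in_use(visits_per_day, seconds_held_per_visit):
    """Average connections open at once: arrival rate times hold time."""
    return arrivals_per_second(visits_per_day) * seconds_held_per_visit


rate = arrivals_per_second(10_000_000)
print(round(rate))  # 116

# Assumed hold time of 0.25 s per visit (a made-up figure for illustration).
print(round(average_connections_in_use(10_000_000, 0.25)))  # 29
```

The shorter each request holds its connection, the fewer connections are open at the same moment, which is the lever the rest of this article is concerned with.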

The trick to designing business objects to scale well is to understand that the resources you have are limited. Say your installation of SQL Server can handle only 500 simultaneous connections. If a user needs a database connection and all 500 are in use, that user will have to wait in a queue for one to become available. That is why we often see such a reduction in performance when we start to increase the load on distributed applications - the demand for resources outstrips supply.
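The queueing effect described above can be imitated with a small simulation. This is a sketch, not a model of SQL Server: `LimitedResource` is a hypothetical stand-in that uses a semaphore to cap the number of simultaneous users, and the limit of 5 connections, the 20 requests and the 0.05-second hold time are all made-up numbers chosen so the example runs quickly.

```python
import threading
import time


class LimitedResource:
    """Hypothetical stand-in for a server with a fixed connection limit."""

    def __init__(self, max_connections):
        self._slots = threading.BoundedSemaphore(max_connections)
        self._lock = threading.Lock()
        self.in_use = 0
        self.peak_in_use = 0

    def use(self, hold_seconds):
        # Blocks (queues) here whenever every slot is already taken.
        with self._slots:
            with self._lock:
                self.in_use += 1
                self.peak_in_use = max(self.peak_in_use, self.in_use)
            time.sleep(hold_seconds)  # pretend to hold the connection
            with self._lock:
                self.in_use -= 1


def run_requests(resource, request_count, hold_seconds):
    """Fire all requests at once and return the total elapsed time."""
    threads = [
        threading.Thread(target=resource.use, args=(hold_seconds,))
        for _ in range(request_count)
    ]
    start = time.perf_counter()
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return time.perf_counter() - start


db = LimitedResource(max_connections=5)
elapsed = run_requests(db, request_count=20, hold_seconds=0.05)
print(f"peak connections in use: {db.peak_in_use}, elapsed: {elapsed:.2f}s")
```

With 20 requests competing for 5 slots, the requests are served in roughly four waves, so the total time is about four times the hold time rather than one. Halving the hold time halves the wait for everyone in the queue.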

Imagine I write a business object that takes 5 seconds to walk through a list of 100 invoices and perform a complex calculation on each one. If I connect to the database as I enter the CalculateRefund method and close the connection just before I leave, chances are I've wasted a lot of the database's time as I read one record, process it for a little while, get the next record, and so on. If, instead, I design my object to open the connection, read and cache all the records, and close the connection before starting the calculations, the database resource is used far more efficiently. Because the connection is held only for as long as it takes to read the records, other pages servicing other users can obtain database connections more readily, reducing the performance lag we were previously feeling.
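The read-cache-close pattern might look something like the following sketch. It uses Python's built-in sqlite3 module purely as a stand-in for a real database server; the `invoices` table, its columns, the sample data and the `complex_calculation` placeholder are all assumptions made for illustration, and `calculate_refund` is just a Python-style spelling of the CalculateRefund method mentioned above.

```python
import os
import sqlite3
import tempfile


def create_sample_database(path, invoice_count=100):
    """Build a throwaway database with a hypothetical invoices table."""
    conn = sqlite3.connect(path)
    try:
        conn.execute("CREATE TABLE invoices (id INTEGER PRIMARY KEY, amount REAL)")
        conn.executemany(
            "INSERT INTO invoices (amount) VALUES (?)",
            [(10.0 * n,) for n in range(1, invoice_count + 1)],
        )
        conn.commit()
    finally:
        conn.close()


def complex_calculation(amount):
    """Placeholder for the slow per-invoice work (here, 10% of the amount)."""
    return amount * 0.1


def calculate_refund(path):
    # Hold the connection only long enough to read every record into memory.
    conn = sqlite3.connect(path)
    try:
        invoices = conn.execute("SELECT id, amount FROM invoices").fetchall()
    finally:
        conn.close()  # released before any slow processing starts

    # The expensive work runs against the cached rows, not the database.
    return sum(complex_calculation(amount) for _invoice_id, amount in invoices)


path = os.path.join(tempfile.mkdtemp(), "invoices.db")
create_sample_database(path)
total = calculate_refund(path)
print(total)
```

The trade-off is memory: caching 100 invoices is trivial, but a much larger result set may need to be read in batches instead.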

Coming back to our 10,000,000 visits example: if we can build our business objects to use the database efficiently, we may be able to reduce the number of simultaneous database connections required to always be below 500, and our application won't feel a performance hit when we scale.

Now that we understand the online business model we're after and we have an understanding of what a 3-tier distributed application is, we can move on to designing the software we'll need to make our site a reality.

1 comment:

  1. Thanks Binay for the informative article. BTW, can you suggest a good CMS in classic ASP?