Microsoft ArcReady: Architecting Scalable and Usable Web Applications
Presenter: Larry Clarkin
Email: larry.clarkin@microsoft.com
Blog: larryclarkin.com
Podcasts: thirstydeveloper.com
Scalable
Discussed the meaning of scalable
Performance = how app behaves with one user
Scalability = how app behaves with multiple users
Bad scalability is not entirely a bad thing; it means you are more popular than your setup allows for
Scalability is a step function rather than a straight line
Basically by increasing our hardware, we can support more users, until we can support no more
Our goal is to minimize the amount of money spent per request
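A minimal sketch of the step-function idea and the cost-per-request goal. The capacity and cost figures are made-up assumptions for illustration, not numbers from the talk.

```python
import math

# Hypothetical figures for illustration only (not from the talk).
REQUESTS_PER_SERVER = 1_000_000   # requests one server can handle per month
COST_PER_SERVER = 200.0           # monthly cost of one server

def servers_needed(requests: int) -> int:
    """Capacity is bought in whole servers, so it grows in steps."""
    return max(1, math.ceil(requests / REQUESTS_PER_SERVER))

def cost_per_request(requests: int) -> float:
    """Total hardware cost divided by the requests actually served."""
    return servers_needed(requests) * COST_PER_SERVER / requests

# Cost per request jumps right after each step and falls as the new server fills up.
for requests in (500_000, 1_000_000, 1_000_001, 2_000_000):
    print(requests, servers_needed(requests), cost_per_request(requests))
```

Under these assumptions, the cheapest point per request is just before the next server has to be added.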
Most websites have just one application server and one database server
This works for most websites
Where can we go from here if we want to improve performance/scalability?
Basics
- use basic strategies to improve performance
- turn off debugging
- ensure that you understand the network architecture to prevent surprises/problems
- told story of a slow website because someone incorrectly configured a bridge
Scale up
- add hardware to existing hardware (more RAM, etc.) without changing architecture
- improving network connections
- typically applies to database servers
- don’t overlook scaling up with software
- going to next version might be faster
Scale out
- putting more application servers into the system
- offers scalability boost, but starts introducing more complicated issues
- problems include session affinity, load balancing, SSL connection problems
- reduce/eliminate single-point-of-failure problem (SPOF)
- unless your load balancer goes down
- this is unlikely due to simplicity of hardware
- plus, you can have a backup sitting around to swap in
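A rough sketch of the load balancing and session affinity issue mentioned above. The server names and both function names are hypothetical; real load balancers are usually dedicated hardware or software, not application code.

```python
import hashlib
from itertools import cycle

SERVERS = ["app1", "app2", "app3"]  # hypothetical application servers

_round_robin = cycle(SERVERS)

def pick_round_robin() -> str:
    """Spread requests evenly; one user's requests may hit different servers."""
    return next(_round_robin)

def pick_sticky(session_id: str) -> str:
    """Session affinity: the same session id always maps to the same server."""
    digest = hashlib.sha256(session_id.encode("utf-8")).digest()
    return SERVERS[int.from_bytes(digest[:8], "big") % len(SERVERS)]
```

With round robin, session state kept in one server's memory is not visible to the next server that handles the user. Sticky routing avoids that, but in this simple modulo scheme adding or removing a server remaps most sessions.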
Specialize
- have certain servers be responsible for certain services
- might introduce more SPOF
- helpful to have image server
- doesn’t need to know about session, etc. usually
Split the application
- microsoft.com, msdn.microsoft.com, technet.microsoft.com
- each section has its own DB
- information may be shared between DB/app servers
Split the database 1
- reference data (read) vs. transaction data (write)
- there are some problems with normalizing the database
Split the database 2
- many read databases, fewer write databases
- web 2.0 typically has a lot more reads of data than writes
- news feeds, wall posts, contact information, etc.
- after a write, the data is lazily propagated to the read databases
- the small time lag is usually not perceptible to users
- “typically if a user refreshes and they see what they expect, they think it’s a browser problem”
- write database could be set up as a queue
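A toy model of the read/write split, using in-memory dictionaries in place of real databases. The class and method names are hypothetical; the point is only to show writes going to one place, reads coming from replicas, and the lag in between.

```python
from collections import deque

class ReplicatedStore:
    """One write database, several read databases, lazy propagation."""

    def __init__(self, replica_count=2):
        self.primary = {}                                   # write database
        self.replicas = [{} for _ in range(replica_count)]  # read databases
        self.pending = deque()                              # writes not yet replicated
        self._next = 0

    def write(self, key, value):
        """Writes go to the primary and are queued for the replicas."""
        self.primary[key] = value
        self.pending.append((key, value))

    def read(self, key):
        """Reads rotate across replicas and may return stale data."""
        replica = self.replicas[self._next]
        self._next = (self._next + 1) % len(self.replicas)
        return replica.get(key)

    def replicate(self):
        """Drain the queue; in a real system this runs in the background."""
        while self.pending:
            key, value = self.pending.popleft()
            for replica in self.replicas:
                replica[key] = value

store = ReplicatedStore()
store.write("wall:larry", "hello")
stale = store.read("wall:larry")   # None: the write has not reached the replicas
store.replicate()
fresh = store.read("wall:larry")   # "hello" once replication has caught up
```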
Split the database 3
- essentially sharding
- have all users A-L on one database, M-Z on another
- MySpace has 100,000 users per database and just keeps adding databases
- talked about Bloom filters for hashing into the correct database
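A small sketch of the two sharding schemes described above: by first letter of the username, and by fixed-size blocks of user ids. The database names and function names are hypothetical, and this uses plain lookups rather than the Bloom filter approach mentioned in the talk.

```python
USERS_PER_DATABASE = 100_000  # figure quoted above for MySpace

def shard_by_name(username: str) -> str:
    """Range sharding: A-L on one database, M-Z on another."""
    first = username[:1].upper()
    if not ("A" <= first <= "Z"):
        raise ValueError("username must start with a letter A-Z")
    return "db_a_l" if first <= "L" else "db_m_z"

def shard_by_user_id(user_id: int) -> int:
    """Fixed-size shards: ids 0-99999 on database 0, the next block on database 1, ..."""
    if user_id < 0:
        raise ValueError("user_id must be non-negative")
    return user_id // USERS_PER_DATABASE
```

The id-based scheme grows by simply adding a database for the next block of users, which matches the "just keeps adding databases" description. The letter-based split is easy to understand, but the two halves will rarely hold the same number of users.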
Geodistribution
- have data centers in various places
- more redundancy, better performance
- obviously this presents new issues to consider
Offload the work
- content distribution network
- might be expensive, but improves performance
- an example is Silverlight Streaming
Anti-patterns
- spending all of your time looking at the code
- caching everything
- services calling services
Discussion
37signals says to scale later, focus on getting things out quickly
One problem with that is a fickle customer base; you don’t want to alienate customers
Retail seems to have more of a valve than purely information/marketing/viral apps
people can only buy so much stuff
Usability
Reading Don’t Make Me Think will probably give you most of the points he made
“A good application can make you cry.”
A good application is:
- desirable
- usable
- useful
- adaptive
- cost-effective
- reliable
Tradeoffs can be made
ProtoXAML = users don’t always respond with good feedback to polished demos
70-20-10 rule
Use the 70-20-10 rule for the home page
70% of information/functionality for new users
20% of information/functionality for returning users
10% of information/functionality for power users
Derek Featherstone is an expert on usability, has a good website