CDN 101

Content Delivery Network 

A CDN, or Content Delivery Network, is an overlay network that augments an existing web infrastructure. It is typically used to solve several problems for medium to large web properties:

  •  Scale
  •  Reliability
  •  Performance  
  •  Cost
  •  Reporting

Let's use an example: a web property that gets many, many visitors in a given day. This presents several scalability problems. Broadband DSL and cable connections mean that even Gigabit Ethernet links fill quickly when a few hundred simultaneous users are browsing large, high-quality pictures or video.

1,000 Mbps (GigE) /  2 Mbps per User = 500 Simultaneous Users

So, we buy another GigE link in our datacenter. Not cheap, and we've only scaled linearly, to 1,000 simultaneous users. Yes, I know we get wait_state connections from users who are just looking at a picture, but you get the point. Millions of people a day hit the site, but only a few hundred to a few thousand can come across a GigE link at once.
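The back-of-the-envelope arithmetic above can be sketched in a few lines of Python. This is only an illustration: the function names are made up for this example, the 2 Mbps per-user figure is the one used above, and the 10,000-user target is a hypothetical number chosen to show how the link count grows linearly.

```python
import math

def simultaneous_users(link_mbps: float, per_user_mbps: float) -> int:
    """Users a link can carry if every user pulls per_user_mbps at once."""
    return int(link_mbps // per_user_mbps)

def links_needed(target_users: int, link_mbps: float, per_user_mbps: float) -> int:
    """Whole links required to carry target_users simultaneously."""
    return math.ceil(target_users * per_user_mbps / link_mbps)

GIGE_MBPS = 1_000    # one Gigabit Ethernet link
PER_USER_MBPS = 2    # per-user rate from the example above

print(simultaneous_users(GIGE_MBPS, PER_USER_MBPS))      # 500
print(simultaneous_users(2 * GIGE_MBPS, PER_USER_MBPS))  # 1000
# Hypothetical target of 10,000 simultaneous users:
print(links_needed(10_000, GIGE_MBPS, PER_USER_MBPS))    # 20
```

It ignores idle connections and protocol overhead, so treat the results as a rough ceiling rather than a capacity plan.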

It also takes lots of servers to serve all that content, especially when adding in video tours and the typical dynamic stack of web servers, app servers, and database servers. These scale issues mean that we need to add other devices, like load balancers, firewalls, IDS, and N+1 switches. All this gear is very expensive, and the vaunted economies of scale for serving more content disappear.

Then, when we do a special event, run a promotion, or get a mention in the Wall Street Journal or the Drudge Report, the site is inundated with visits. "We don't want to build the church for Easter Sunday," as one of my clients tells me. So we have three problems of scale: Bandwidth, Servers, and Overflow Capacity. They all cost money.

A traditional hosted architecture puts all the eggs in one basket (unless you want to get really fancy and start global load balancing between geographically diverse datacenters). A datacenter is a fairly good basket, until it's not, and goes down, as recently happened at the Los Angeles Garland building on 2-26-07.

Now that we've looked at the problems facing a decent-sized web property, let's look at one of the most common solutions, the Content Delivery Network, or CDN.

If we took all the techniques discussed above and built out a series of global datacenters, all load balanced with robust intelligent DNS and replicated to each other to keep content current at each location, it would cost millions. Luckily, CDNs have already done that for us.
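To make the "intelligent DNS" idea a little more concrete, here is a minimal sketch of one possible policy: send each user to the nearest datacenter that is currently healthy. Everything in it is an assumption for illustration, including the `Datacenter` type, the site list, the coordinates, and the distance-only rule. Real CDNs weigh more signals, such as measured latency and server load.

```python
import math
from dataclasses import dataclass, replace

@dataclass(frozen=True)
class Datacenter:
    # Hypothetical record for one site; not any real CDN's data model.
    name: str
    lat: float
    lon: float
    healthy: bool = True

def distance_km(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance between two points (haversine formula)."""
    earth_radius_km = 6371.0
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * earth_radius_km * math.asin(math.sqrt(a))

def pick_datacenter(user_lat: float, user_lon: float, sites: list) -> Datacenter:
    """Return the closest healthy site; fail loudly if none are up."""
    healthy = [s for s in sites if s.healthy]
    if not healthy:
        raise RuntimeError("no healthy datacenter available")
    return min(healthy, key=lambda s: distance_km(user_lat, user_lon, s.lat, s.lon))

# Illustrative site list with approximate coordinates.
SITES = [
    Datacenter("los-angeles", 34.05, -118.24),
    Datacenter("new-york", 40.71, -74.01),
    Datacenter("london", 51.51, -0.13),
]

# A user in San Francisco is sent to Los Angeles.
print(pick_datacenter(37.77, -122.42, SITES).name)  # los-angeles

# If Los Angeles goes down, the same user fails over to New York.
degraded = [replace(s, healthy=False) if s.name == "los-angeles" else s for s in SITES]
print(pick_datacenter(37.77, -122.42, degraded).name)  # new-york
```

The failover case is the point: with more than one basket, an outage at a single site reroutes users instead of taking the whole property offline.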

The original CDN is Akamai. Founded in 1998, Akamai has over 20,000 servers deployed across the world. They provide services for the likes of Yahoo! and Apple's iTunes. You get the picture: really big web properties rely on Akamai because it helps solve the problems of scale, reliability, performance, and cost while providing reporting.

In the last two years, more startup CDNs have been built. Akamai has bought several of them, including Speedera and Netli. Here's my CDN comparison guide:

CDN Guide