In the Clouds…

2009 February 15
by Hamed

For the last few weeks I’ve been working on integrating Radio Javan’s website with Amazon’s new CloudFront CDN (backed by S3) to serve our content. In this post I’ll share some of my experiences…

Amazon S3 is not new; it was around even before I started developing the last iteration of our site (RJ3, as I call it…). However, I never really saw a good fit for using it to serve content. Yes, it offloads some web traffic and you don’t have to worry as much about losing your data. But its latency is high, and quite honestly our own hosting solution delivers content to end users much faster than S3 ever did. Not to mention my hesitation over the possibility of S3 going down.

But S3 has definitely evolved since then. Last fall they introduced their CloudFront CDN solution.

Every few weeks I receive a phone call or email from a CDN company such as Level3, Limelight, or Internap, because they notice our traffic on RJ and want to offer their services. What they would do is basically automagically host our static content, such as images and even Flash movie files. But I really hate their whole sales procedure. You have to deal with a salesperson, wait for a quote, and you don’t really get any technical details on how things work. Unfortunately, the experience is almost like buying a used car: you really don’t know what you’re going to get.

Amazon is very different. They lay out everything right in front of you. You know the prices ahead of time, and they provide a bandwidth calculator to figure things out. But much more important than that, examples of their API are everywhere! For me, their Ruby walkthrough was all I needed. You can also start small; you don’t need to commit to thousands of dollars every month to find out whether the solution is right for you. There’s no contract. We’ve actually been using S3 for our DB backups for over a year now, so I knew what I was getting into.

CloudFront has edge servers in the U.S., Europe, and Asia. When I access CloudFront from home in Atlanta, I get routed to a datacenter in St. Louis; when I try from DC, I get routed to a datacenter in Virginia. Yes, they definitely don’t have thousands of servers across all ISPs like Akamai does. But that’s OK. All I really want from this deal is to be able to serve static content away from my app servers and let managing the asset servers be someone else’s problem. As long as the performance is just as good as what our current hosting provider delivers, I will be happy.

The way I ended up integrating CloudFront was fairly easy.

First, it was important to me to keep a local copy of all files. For example, when a user uploads a new image, I first store that image locally in a /static directory (just like I used to do), then I upload the file to S3.
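The local-first flow described above can be sketched roughly as follows. This is only an illustration, not the site's actual code: `store_image` is a made-up helper name, and `uploader` is a hypothetical stand-in for whatever S3 client the app uses (any object that responds to `put(key, data)`).

```ruby
require "fileutils"

# Writes the uploaded bytes under local_root first, then hands the same bytes
# to the uploader. `uploader` is a hypothetical object responding to
# `put(key, data)`; in a real app it would wrap an S3 client.
def store_image(local_root, key, data, uploader)
  local_path = File.join(local_root, key)

  # Keep the local copy, exactly as before the CDN was introduced.
  FileUtils.mkdir_p(File.dirname(local_path))
  File.open(local_path, "wb") { |f| f.write(data) }

  # Only after the local write succeeds is the file pushed to remote storage.
  uploader.put(key, data)

  local_path
end
```

Because the local write happens first, a failed upload still leaves a usable copy on disk that can be retried or served directly.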

Using ActionController::Base.asset_host in Rails makes it easy to switch between different asset servers (e.g. switch back to local storage in case Amazon starts causing problems). However, that meant I had to go through and convert every img tag to use the Rails “image_tag” method… oops, should’ve built my app like that from the start!
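A minimal sketch of such a switch, under some assumptions: the host name is a placeholder rather than a real distribution, and the `USE_CDN` environment variable and `asset_host_for` helper are invented here for illustration.

```ruby
# Placeholder for the CloudFront distribution's domain (not a real host).
CDN_HOST = "http://cdn.example.com"

# Returns the host that asset URLs should be prefixed with, or nil to fall
# back to serving assets from the app's own servers. USE_CDN is a
# hypothetical switch; any config mechanism would do.
def asset_host_for(env = ENV)
  env["USE_CDN"] == "0" ? nil : CDN_HOST
end

# In a Rails app this would be wired up in an environment file, e.g.:
#   ActionController::Base.asset_host = asset_host_for
# after which image_tag("logo.png") emits URLs under that host.
```

With the host decided in one place, falling back to local storage becomes a configuration change rather than a code change, which is why every img tag has to go through image_tag.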

When uploading, I set the max-age value in the Cache-Control header so that the file essentially never expires. This way browsers will always cache images and never have to refetch them (or even check whether they’ve been modified). According to the docs, if you upload a new file to S3 with the same name, CloudFront won’t pick up the changes for at least 24 hours. However, I’ve seen it do so within 30 minutes, so I’m not sure what’s up with that.
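As a rough illustration, the "never expire" headers could be built like this. The ten-year lifetime, the `public` directive, and the extra Expires header are assumptions; the post does not say which exact values were used.

```ruby
require "time" # adds Time#httpdate

# Ten years in seconds; an assumed value standing in for "never expire".
FAR_FUTURE_SECONDS = 10 * 365 * 24 * 60 * 60

# Builds the caching headers to attach to the object at upload time.
def far_future_headers(now = Time.now)
  {
    "Cache-Control" => "public, max-age=#{FAR_FUTURE_SECONDS}",
    "Expires"       => (now + FAR_FUTURE_SECONDS).httpdate
  }
end
```

The resulting hash would be passed along as object metadata in whatever upload call the S3 client provides.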

But the expiry issue leads me to one important point: I’m not serving my CSS or JavaScript files from it. The reason is that those files get modified all the time as we do updates, and having to manage and keep track of multiple file names would become an enormous hassle. Storing user-generated images on CloudFront is OK because all file names are unique (with an MD5 hash), but there’s no really simple way to do that with CSS and JavaScript files that are committed to a source repository. What most people are waiting for is for CloudFront to accept ? query params in the request URL as a signal that it should fetch a fresh copy from S3.
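For the user-generated images, the unique naming could look something like the sketch below. It assumes the hash is taken over the file's contents; the post does not say what is actually hashed, and `unique_image_name` is a made-up helper name.

```ruby
require "digest/md5"

# Derives a file name from the MD5 of the image bytes, keeping the original
# extension. Hashing the contents is an assumption made for this sketch.
def unique_image_name(original_name, data)
  ext = File.extname(original_name).downcase
  "#{Digest::MD5.hexdigest(data)}#{ext}"
end
```

Since a changed image gets a different name, a cached copy never needs to be invalidated, which is what makes the far-future max-age safe for these files.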

Conclusion

I would have to say that so far I have been happy with my experience. Even if the performance improvement for end users is only small (because our host is already super fast), I am happy that it takes so much load off our app servers, allowing us to tolerate traffic spikes more easily.

Using S3/CloudFront was actually very simple; it just required me to refactor some code. However, I am glad that I did it at the point where I felt our traffic deserved a CDN, and not before. I like to focus only on real problems and not get caught up in the whole buzzword business that many get trapped in.

But because it is so simple, S3 feels almost like a gateway drug to EC2 and other Amazon services… but that’s a story for another day.

One Response
  1. 2009 February 16

    Thank you for the article explaining the value of CloudFront and the ease of setup. I hope the readers of this blog will find our CloudBerry Explorer useful; it is a freeware tool that helps manage Amazon S3 and CloudFront services. Check the link in my name.
