SharePoint 2010 Infrastructure for Amazon EC2 Part I: Storage and Provisioning

The Amazon Web Services (AWS) have been around for a while now, but from what I've seen there's been surprisingly little use or abuse of them in the SharePoint community. A notable exception is Andrew Woodward's novel and interesting approach to Exchange BPOS migration via Amazon EC2, but that doesn't say much about SharePoint on Amazon, so in these posts I'll give an introduction to the design constraints that apply to SharePoint 2010 development environments on EC2. Even if the Amazon Web Services aren't appealing to you, many of the issues discussed here will apply to consumption of other Pay-As-You-Go infrastructure services, presumably including the forthcoming Windows Azure VM role AKA Hyper-V Cloud. In this first post I focus on the platform, storage, snapshots and provisioning.

Other posts in this series:

What are the Amazon Web Services?

AWS is a platform in the cloud, like Windows Azure in some respects. While these web services are distinct from traditional hosting offerings, Amazon also provides Infrastructure as a Service (IaaS) in the form of Elastic Compute Cloud (EC2). This is a Red Hat implementation of the Xen hypervisor, on which virtual machines (instances) can be launched. For accuracy, I should note that Amazon recently introduced a second, Oracle-based hypervisor within EC2, but that's a distraction from this discussion. Amazon have been providing their web services since 2006. For the purposes of these posts I am concerned with the EC2 offering as a cloud-based alternative to desktop development workstations, although there are other scenarios that may suit deployment in EC2, such as demonstrations or large infrastructure tests. For more information on the difference between traditional hosting and EC2, see Amazon's FAQ on the matter.

What is Elasticity?

This term arises frequently in the Amazon vernacular. In essence it means that scalability is built into the platform. Need more CPU or memory? Just re-launch your instance at a larger size. Need more instances? Create them in a few minutes. Need more storage? They've got it and then some. IP addresses? They even have Elastic IP addresses. Bandwidth? It's the cloud, fool.
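To make the "re-launch at a larger size" point concrete, here is a minimal sketch of the stop, resize, start cycle using the AWS Python SDK (boto3), which I'm assuming here; the instance ID and target size are placeholders, and in practice you would check that each state change has completed before moving on.

    import boto3

    # A sketch only: the region, instance ID and target size are placeholders.
    ec2 = boto3.client("ec2", region_name="us-east-1")
    instance_id = "i-0123456789abcdef0"  # hypothetical instance

    # Resizing means stopping the instance, changing its type, then starting it again.
    ec2.stop_instances(InstanceIds=[instance_id])
    ec2.get_waiter("instance_stopped").wait(InstanceIds=[instance_id])

    ec2.modify_instance_attribute(InstanceId=instance_id,
                                  InstanceType={"Value": "m1.large"})

    ec2.start_instances(InstanceIds=[instance_id])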

AWS largely deliver on these promises, although you'll encounter some provisioning fiddlery before realising them. More importantly, increased size comes at a cost. Nearly all of the Amazon price points are ridiculously low at their smallest, but these costs are not always linear – particularly with CPU and memory. Additionally, cost permeates nearly every design option, and these costs persist over time. Seemingly tiny prices need to be considered over very long periods if IaaS is to become an alternative to hardware. I will discuss costs in more detail later, as this topic is fundamental to the desirability of cloud computing: if it isn't cost effective, it probably won't be the right option. But in a nutshell, Elasticity means that if you need more of anything, you can pay for it for the duration that you need it. The intent is that you can shrink back down as well.
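To see why small hourly prices still demand long-horizon thinking, here is a back-of-the-envelope projection. The rates below are purely illustrative placeholders, not Amazon's actual pricing; the point is the shape of the sum, not the numbers.

    # Illustrative only: these rates are placeholders, not real Amazon prices.
    instance_per_hour = 0.50    # hypothetical Windows instance rate (USD/hour)
    ebs_per_gib_month = 0.10    # hypothetical EBS storage rate (USD/GiB-month)
    volume_gib = 50

    hours_always_on = 24 * 365       # left running all year
    hours_work_only = 8 * 5 * 52     # business hours only

    for label, hours in [("always on", hours_always_on), ("work hours", hours_work_only)]:
        compute = hours * instance_per_hour
        storage = 12 * volume_gib * ebs_per_gib_month  # storage accrues whether running or not
        print(f"{label}: compute ${compute:,.0f} + storage ${storage:,.0f} per year")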

EC2 Design Complexity

I couldn't possibly hope to explain everything that's important to know about AWS in these blog posts, and I won't try. However, it's important to know that the design constraints that pricing and scalability impose on AWS require a fresh perspective. Infrastructure architecture for AWS will require time, testing, piloting and a good understanding of end-user working patterns. Once this configuration and these patterns are clearly understood, the costs need to be projected over long periods. This is likely to be a deep consulting exercise, since so few design options can be left to chance; this will hopefully become clearer when I discuss pricing later. For now, if you don't believe this is complicated, have a look at the 237-page User Guide, which I would class as required reading for anyone serious about EC2. The topics covered below summarise the areas that I feel are most important to understand for SharePoint on EC2.

Storage

The first thing to understand about AWS is that there are two types of storage: the Simple Storage Service (S3) and the Elastic Block Store (EBS). Some older documents and forum posts were written before EBS was available as a root device, so watch out for potentially misleading information.

All SharePoint 2010 environments need to run on EBS-backed instances, because Windows Server 2008 will chew up more than 10 GiB off the bat, and 10 GiB is the maximum size of an S3-backed root volume. EBS storage is more expensive than S3 and you also pay for the number of I/O requests, so projecting costs is a fairly inexact science; however, in my brief testing time the I/O charges were relatively small. It's worth noting that for the extra cost, EBS volumes also persist and they launch faster. I am only touching on this topic briefly, so please review the User Guide if this is insufficient detail. The key points for now are that you must use EBS for SharePoint instances, and that EBS is more expensive than S3.
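If you are scripting against the API rather than clicking through the console, you can confirm that an image is EBS-backed before launching it. A minimal sketch with boto3 (the owner and filter values are just one plausible way to narrow the list):

    import boto3

    ec2 = boto3.client("ec2", region_name="us-east-1")

    # List Amazon-owned Windows images whose root device is EBS rather than instance store.
    images = ec2.describe_images(
        Owners=["amazon"],
        Filters=[
            {"Name": "platform", "Values": ["windows"]},
            {"Name": "root-device-type", "Values": ["ebs"]},
        ],
    )

    for image in images["Images"][:5]:
        print(image["ImageId"], image.get("Name"), image["RootDeviceType"])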

Provisioning
Taking snapshots and creating new images from them is quick and easy in EC2, once you get your head around the key concepts: AMIs, Volumes and Snapshots.

AMIs
An AMI is an Amazon Machine Image. This will be the first design choice you encounter when launching an instance. Amazon provides a basic Windows image, or you can use an Amazon image with SQL included (at a cost). You can use your own licenses for everything but Windows.

Once an instance is running you can modify it to your taste. Once you've created a new standard baseline, you can create a new image from your instance, and when you provision new instances you will be able to select this new image rather than the Amazon one you started with. Note: the Amazon Windows license cost is built into the billing process; your instance costs include the license even after you've created your own new image from the original. Also note: Windows Server 2008 R2 is not available yet.
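Here is a rough sketch of that launch-then-capture flow against the API, again using boto3; the AMI ID, key pair name and image name are placeholders, and the baseline configuration itself happens on the instance between the launch and the stop.

    import boto3

    ec2 = boto3.client("ec2", region_name="us-east-1")

    # Launch an instance from a (placeholder) Amazon Windows AMI.
    reservation = ec2.run_instances(
        ImageId="ami-0123456789abcdef0",   # hypothetical Windows Server 2008 image
        InstanceType="m1.large",
        KeyName="my-keypair",              # assumed existing key pair
        MinCount=1,
        MaxCount=1,
    )
    instance_id = reservation["Instances"][0]["InstanceId"]

    # ... install and configure your baseline on the instance, then stop it ...
    ec2.stop_instances(InstanceIds=[instance_id])
    ec2.get_waiter("instance_stopped").wait(InstanceIds=[instance_id])

    # Capture the configured instance as a reusable image.
    image = ec2.create_image(InstanceId=instance_id, Name="sp2010-dev-baseline-v1")
    print("New AMI:", image["ImageId"])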

Volumes
A volume is basically a virtual hard disk. When a new instance is created, the selected AMI is deployed to a new volume – the same size as the image it was created from. A volume can only be attached to one instance at a time, but an instance can have many volumes attached to it if you want to add storage capacity.

Remember that you pay for the storage you allocate, not just what you fill, so size your volumes wisely. 30 GiB is unlikely to last anyone very long with Windows Server 2008, so consider at least 40, if not 50 GiB for any new root volumes. Keep in mind that you may find the less expensive S3-backed (instance store) storage useful as secondary, disposable storage if that suits temporary needs.
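Adding capacity to a running instance is a two-step call: create the volume in the same Availability Zone as the instance, then attach it. A rough boto3 sketch with placeholder values:

    import boto3

    ec2 = boto3.client("ec2", region_name="us-east-1")
    instance_id = "i-0123456789abcdef0"  # hypothetical instance

    # Create a 50 GiB volume in the instance's Availability Zone (placeholder zone).
    volume = ec2.create_volume(Size=50, AvailabilityZone="us-east-1a")
    ec2.get_waiter("volume_available").wait(VolumeIds=[volume["VolumeId"]])

    # Attach it as a secondary disk; the device name is a typical choice, not a requirement.
    ec2.attach_volume(
        VolumeId=volume["VolumeId"],
        InstanceId=instance_id,
        Device="xvdf",
    )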

Snapshots
A snapshot records the state of a volume at a point in time. Once a snapshot has been taken, a new volume (of equal or greater size) can be created from the snapshot and that new volume can be attached to a new instance. That new instance can be used to create a new AMI at the new size. Snapshots and new volumes together enable you to increase system disk size. Snapshots can also be used for backup.
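Growing a disk follows directly from this: snapshot the existing volume, then create a bigger volume from that snapshot. A minimal sketch (the IDs, zone and sizes are placeholders):

    import boto3

    ec2 = boto3.client("ec2", region_name="us-east-1")

    # Snapshot the existing root volume (placeholder ID).
    snapshot = ec2.create_snapshot(VolumeId="vol-0123456789abcdef0",
                                   Description="SP2010 baseline before resize")
    ec2.get_waiter("snapshot_completed").wait(SnapshotIds=[snapshot["SnapshotId"]])

    # Create a larger volume from the snapshot; equal or greater size is allowed.
    bigger = ec2.create_volume(SnapshotId=snapshot["SnapshotId"],
                               Size=50,
                               AvailabilityZone="us-east-1a")
    print("New 50 GiB volume:", bigger["VolumeId"])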

An example workflow for getting your first image at the right size might go like this (a scripted sketch of the volume-swap steps follows the note below):

  • Launch an instance from the default Windows Server 2008 image.
  • Install SQL and SharePoint (this should be possible at just under 30 GiB).
  • Configure stuff and shut down the instance.
  • Take a snapshot.
  • Create a new volume at 50 GiB based on the snapshot.
  • Detach the existing volume from the instance and attach the new volume.
  • Create an image from the instance.
  • Launch the existing instance and create additional instances from the new AMI as needed.

Note: if you will be including Visual Studio or any other sizeable software, you will need to go through a process like this before installing it, as it will push you over the 30 GiB initial size.
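The volume-swap steps in the workflow above (snapshot, bigger volume, detach, attach, re-image) are the part most worth scripting. A rough boto3 sketch, with every ID a placeholder and no error handling:

    import boto3

    ec2 = boto3.client("ec2", region_name="us-east-1")
    instance_id = "i-0123456789abcdef0"      # hypothetical stopped instance
    old_volume_id = "vol-0123456789abcdef0"  # its current ~30 GiB root volume

    # Snapshot the configured root volume, then build a 50 GiB copy of it.
    snap = ec2.create_snapshot(VolumeId=old_volume_id, Description="SP2010 baseline")
    ec2.get_waiter("snapshot_completed").wait(SnapshotIds=[snap["SnapshotId"]])
    new_vol = ec2.create_volume(SnapshotId=snap["SnapshotId"], Size=50,
                                AvailabilityZone="us-east-1a")
    ec2.get_waiter("volume_available").wait(VolumeIds=[new_vol["VolumeId"]])

    # Swap the root device on the stopped instance.
    ec2.detach_volume(VolumeId=old_volume_id)
    ec2.get_waiter("volume_available").wait(VolumeIds=[old_volume_id])
    ec2.attach_volume(VolumeId=new_vol["VolumeId"], InstanceId=instance_id,
                      Device="/dev/sda1")   # root device name assumed

    # Register the resized instance as a new AMI for future launches.
    image = ec2.create_image(InstanceId=instance_id, Name="sp2010-dev-50gib")
    print("New AMI:", image["ImageId"])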

This process is oversimplified, but it hopefully illustrates the relationship between AMIs, snapshots and volumes as they relate to provisioning. All told, I think this way of working with images, volumes and snapshots is sensible and not terribly complicated in the EC2 scheme of things, and the choices should be pretty straightforward once you understand the options and costs. However, this could potentially get more complicated as end-users engage with these decisions. How will they know what to ask for? Will it be necessary to involve EC2 experts in the approval of any new systems? Training, consultancy or winging it all have associated costs and risks. Even though I'm only talking about development environments here, there are still risks in committing to a Pay-As-You-Go platform where usage is unrestricted. Keep this in mind.

I'm aware this is lengthy already, so I'm going to split this up. In my next post I'll review Cloning and Networking.