Amazon's Cool Cloud Move: Adds Public Data Sets to EC2
Amazon has provided a great starting point for enterprises that want to experiment with its EC2 cloud service. Now, cloud neophytes can learn how to build real systems, using data sets that match the size and complexity of those in the enterprise.
Wed, December 10, 2008
Last week Amazon announced something very cool: the availability of public data sets hosted in its Elastic Block Store (EBS) service, part of its Elastic Compute Cloud (EC2) offering. These data sets are available for free, with only typical EC2 runtime charges applied for access and use of the data. If you're not familiar with EBS, it's Amazon's persistent storage service integrated with EC2, making it easy to read and write data sets from applications hosted in EC2.
The first data offerings cover genome, chemistry, and economic statistics. I'm not very familiar with the first two categories, but I have worked a lot with the economic ones Amazon has on tap: Census, Bureau of Labor Statistics, and (coming soon) Bureau of Economic Analysis data. The BLS keeps track of job statistics, and the BEA tracks overall economic data like GDP, capital investment, and so on. Amazon indicates that it plans to grow the number of data sets it will offer.
What is striking about the announcement is how Amazon keeps making unexpected moves in its cloud offerings. It pioneered the category and showed the way. Hundreds of startups have jumped on EC2, using it as the foundation for inexpensive offerings like backup services, image manipulation, and so on. Just as the rest of the industry has started to catch up, Amazon comes along and offers pre-built data sets, free for the asking.
This is a great initiative, and offers real promise in a number of ways:
It provides a way for neophytes to learn how to build real systems, using the data sets as a jumping-off point. If Amazon wanted to create a perfect testbed for enterprises to get comfortable with EC2, it succeeded. These data sets mirror the most demanding corporate ones in terms of size and complexity. Therefore, they are a great starting point for corporations to experiment with Amazon's EC2 service. This is also a big win for Amazon in that it will induce more people to use its cloud services.
It offers great learning tools for educators. I was talking with a friend and he pointed out that a statistics teacher could easily use the economics data sets as the basis for class assignments. In fact, Amazon notes that the sets can be made even more valuable by creating customized system images that contain preconfigured apps that use one or more of the data sets; these images can be shared. A teacher could easily create a preconfigured system with data for all of the students in a class and allow them to start work immediately.