by Stephanie Overby

How cloud infrastructure helped Instacart focus on business growth

Sep 04, 2015
Cloud Computing

Nick Elser, the lead engineer of Instacart, discusses how leveraging the cloud has been a powerful and successful experience for the one-hour grocery delivery service.


Instacart was founded in July 2012 with a new solution to an old problem: how to get groceries delivered fast. Rather than investing millions in infrastructure (forging relationships with distributors, building warehouses, amassing a fleet of trucks and drivers), the startup worked with existing resources, partnering with grocery stores and contracting with individual “personal shoppers” to pick up and deliver orders. And when it came to the technology backbone to support the Y Combinator-incubated business, Instacart also eschewed investing in its own infrastructure and personnel, instead embracing hosted options, first from Heroku and now from Amazon Web Services (AWS).

The company has seen its customer base, order volume, number of retail partners, and catalog size explode; every month Instacart is adding terabytes of new data. Yet engineering team lead Nick Elser hasn’t had to assemble an infrastructure team (he made his first infrastructure hire in May 2015) or build new data centers.

When AWS announced its relational database service (RDS) for PostgreSQL in Nov. 2013, Instacart was one of the first customers to sign up. “Instead of adding infrastructure engineers, we just grow our RDS hardware fleet,” Elser says. His team doesn’t have to worry about back-ups, resizing, security, production monitoring or reporting—“all hard problems for a startup to solve,” Elser says. “Amazon gave us the API to solve it.” Instead, the engineering team can focus on solving business problems.

Elser talked about the infrastructure demands of the one-hour grocery delivery service, the trade-offs and risks of moving from Heroku to AWS, the difficulty of managing production databases in a rapidly changing environment, and the benefits of just-in-time infrastructure decision making.

What is your role as engineering lead at Instacart?

Elser: I’m one of the long-time employees here—which is to say, I’ve been here about two-and-a-half years. I was doing full-stack development and had experience dealing with infrastructure-related issues so I naturally evolved into this role. I lead a bunch of teams and am responsible for planning and executing our infrastructure needs.

Why did you decide to move from Heroku [Platform-as-a-Service for building and deploying Web apps] to AWS RDS as the company began to grow?

Elser: Like a lot of startups, we launched on Heroku. It’s incredibly powerful and very useful. And we still use it for a lot of smaller products. But at a certain point, it wasn’t as powerful as we needed it to be.

We were dealing with impressive growth, not just with our customer base but also internally. We were hiring so many engineers and pushing out hundreds of updates every day to our servers. One of the big engineering challenges that Instacart faces is managing the incredible size and variety of our catalog: inventory information from thousands of stores that needs to be kept up to date. As a result, we needed more control over those processes. At the same time, it turns out that managing a production database is really, really hard.

I led the charge to move to AWS more than a year ago and we’ve utilized more and more of their technology and moved more of our stack to the cloud since then. AWS gives us a little bit more control over managing our infrastructure as well as better and more powerful management of our database with RDS.

As soon as Amazon launched RDS for PostgreSQL, we jumped on board. It’s a fully managed, hosted, industrial strength database solution that makes it easy for us to scale up as we grow.

Tell me a little bit about the advanced analytics that this infrastructure is supporting.

Elser: There’s an ever-increasing proliferation of technology on the back-end to enable the assignment algorithms we have. One of the big things we do is machine learning with heavy Python and R libraries. Essentially we have a vast number of machines running on the EC2 environment crunching these prediction algorithms that decide which shopper to assign an order to, which varies dramatically from city to city and by time of day. And that’s all enabled by AWS.

How has your embrace of cloud infrastructure influenced the type of IT professionals you need to hire internally?

Elser: We don’t have anyone focused on database administration specifically, although we have people who understand it. It’s changed how we hire and who we hire. We’ve been able to hire more generalists able to solve business problems. Instead of focusing on the fabric of where that data is stored, we can think about things that are actually important to the business like getting information out quickly and optimizing the experience for our shoppers and customers.

How difficult was the transition from Heroku to AWS and how did you manage the risks of the switchover?

Elser: We did extensive testing before we converted over. We had two clusters running for a while to ensure that it would work for us. Based on that testing, we had no worries.

What were the drawbacks to switching from Heroku to AWS?

Elser: Heroku is an extremely powerful platform that hides its complexity beneath the surface. And that’s something you don’t realize until you migrate off of it. We had to write replacements for things like provisioning, scaling and deployment. Luckily, there is a wealth of open source solutions that enabled us to replicate that. And we were able to create a solution that was more customized to our business needs.

Heroku also has an amazing support layer to take care of a lot of the problems you might have with infrastructure. So we no longer have access to that team helping us out all the time. Amazon also has a support team, but it operates at a lower level. So ultimately, moving to AWS gave us a more powerful and flexible solution, but without the safety net that Heroku provided.

Does the business care how the IT infrastructure is provisioned? What are the benefits that they see?

Elser: First, we’re able to support new growth and able to scale up quickly and seamlessly when we add a new city or grow in one of our existing markets. We’ve expanded from one to 17 cities in two years and from a few hundred to several thousand shoppers. Secondly, the platform offers the stability we need to deliver an incredible experience to customers. It’s always up, it always returns quickly, we have no security issues. Our number one priority has always been the customer experience, and a reliable and responsive infrastructure is critical.

What advice would you offer others about leveraging Infrastructure-as-a-Service?

Elser: The biggest lesson for us is that we didn’t have to manage our primary data stores. That means we can worry about the customer experience instead. I can’t overstate how hard it is to manage production databases in a rapidly changing environment. Being able to leverage the cloud for that is crazy powerful.

The second lesson was the benefits of making infrastructure decisions just in time. There’s really no need for massive architecture overhauls every year. Instead you can do capacity planning for the next couple of months, allocating resources to the product while delivering on your service-level agreements. All of our architecture decision making is built around delivering an excellent customer experience.