by Mansour Karam

Build vs buy for your next-gen data center network infrastructure?

Mar 16, 2018
Data Center, IT Strategy

Maybe the answer is...neither?

[Image: decision fork in the road. Credit: Thinkstock]

As you implement digital transformation strategies, you need to plan for an explosion in traffic. That means building data centers on the same state-of-the-art principles the hyperscalers have adopted:

  • A leaf-spine Clos design, which accommodates the large amounts of east-west traffic generated by today’s web applications
  • A multi-vendor hardware strategy that draws on both established hardware vendors (Arista, Cisco, Dell, Juniper, Mellanox) and open alternatives (Cumulus and OCP hardware)
  • Major investments in automation and analytics for efficient, scalable operations
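To make the first principle concrete, here is a minimal sizing sketch for a two-tier leaf-spine fabric. It assumes one uplink from every leaf to every spine; the function name and the port counts and speeds are illustrative assumptions, not figures from any particular deployment.

```python
def leaf_spine_summary(leaves, spines, server_ports_per_leaf, server_gbps, uplink_gbps):
    """Summarize a two-tier leaf-spine fabric.

    Assumes exactly one uplink from every leaf to every spine.
    """
    downlink_gbps = server_ports_per_leaf * server_gbps  # server-facing capacity per leaf
    total_uplink_gbps = spines * uplink_gbps             # fabric-facing capacity per leaf
    return {
        # Every leaf connects to every spine once.
        "fabric_links": leaves * spines,
        # Traffic between two leaves can cross any spine, so there is one
        # equal-cost path per spine for east-west traffic.
        "equal_cost_paths_between_leaves": spines,
        # Ratio of server-facing to fabric-facing bandwidth on each leaf.
        "oversubscription_ratio": downlink_gbps / total_uplink_gbps,
    }


# Illustrative numbers only: 16 leaves, 4 spines, 48 x 10G server ports, 40G uplinks.
summary = leaf_spine_summary(
    leaves=16, spines=4, server_ports_per_leaf=48, server_gbps=10, uplink_gbps=40
)
print(summary)
```

Adding spines raises both the number of equal-cost paths and the uplink capacity per leaf, which is why this design scales well for east-west traffic.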

Doing this right means embracing a disaggregated approach that separates the hardware, the network operating system, and the automation layer. If you are a regular attendee of the OCP Summit, ONS Conference, or ONUG Conferences, you are familiar with these best-practice approaches deployed by Facebook, Google, Microsoft and Amazon.

Hardware lock-in versus DIY – flawed choices

When it comes to an automated operational model for the data center network, engagements with customers and prospects (webscale companies, service providers and large enterprises) make it clear that many organizations face the same two choices:

Choice 1: Should I use the software that’s provided by a hardware vendor?

Choice 2: Or should I build my automation and analytics software myself to avoid vendor lock-in?

Both choices are flawed. With Choice 1, software provided by a hardware vendor locks an organization into that vendor and makes it exceedingly hard to pursue a dual-vendor strategy. How can you believe that switch vendor C would support switch vendor A’s hardware, even if vendor C claims so, when A is C’s most feared competitor?

Yet most organizations I have talked to in the past few years want at least two hardware vendors in their data centers.

Choice 2, the DIY approach, is fraught with risk.

Organizations are advised to investigate both DIY and multi-vendor commercial solutions carefully. It is important to size the DIY effort realistically: it requires a large team, and it is easy to underestimate the high cost and the many years needed to deliver such a critical undertaking.

In this investigation, organizations should look for solutions that do not force a choice between build and buy. Consider commercial solutions built on an extensible platform, so that you can integrate them into your environment and meet your own requirements.

Whether organizations decide to take a DIY approach, or adopt commercial solutions, they need to make a proper assessment of all the capabilities that are required – in the case of DIY, these are capabilities that organizations will have to build themselves or integrate. Among these capabilities:

  • A highly scalable distributed data store
  • Abstractions that capture user intent
  • A graph representation of all intent and infrastructure state, which captures in real-time all the relationships between objects, e.g. user intent, topology, physical elements (including switches, interfaces, transceivers, links), logical elements (virtual networks, security zones), and telemetry
  • Extensible telemetry agents that can extract telemetry across platforms
  • Device drivers for the various vendor devices, used both to configure these devices and to extract telemetry from them
  • Comprehensive design tools that architecture teams can use to design data center PODs in a matter of minutes
  • Comprehensive build tools to stand up PODs in minutes
  • A continuous validation engine that generates anomaly alerts in real-time anytime infrastructure state deviates from intent
  • Intent-based analytics, an integrated big data pipeline that enables the user to configure signatures pertaining to how they’d like their network to run in a matter of minutes
  • An intuitive web interface for designing, building, deploying, and operating these networks with minimal effort
  • An extensible architecture with programmatic APIs (e.g. streaming interfaces, graph queries, plugin modules) and other methods to extend the platform and integrate it within the organization’s environment
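To give a sense of what two of these capabilities involve (the graph representation and the continuous validation engine), here is a deliberately tiny sketch. The class, function, node, and relationship names are all hypothetical, and telemetry is reduced to a plain dictionary; a real system would also need the distributed data store, device drivers, and streaming telemetry listed above.

```python
from dataclasses import dataclass, field


@dataclass
class IntentGraph:
    """Toy graph: nodes carry attributes, edges carry a relationship label."""
    nodes: dict = field(default_factory=dict)  # node id -> attribute dict
    edges: list = field(default_factory=list)  # (source, relationship, target)

    def add_node(self, node_id, **attrs):
        self.nodes[node_id] = attrs

    def add_edge(self, source, relationship, target):
        self.edges.append((source, relationship, target))

    def neighbors(self, source, relationship):
        """Return targets reachable from source over the given relationship."""
        return [t for s, r, t in self.edges if s == source and r == relationship]


def validate(graph, telemetry):
    """Return one anomaly message per link whose observed state differs from intent."""
    anomalies = []
    for node_id, attrs in graph.nodes.items():
        if attrs.get("kind") != "link":
            continue
        expected = attrs["intended_state"]
        observed = telemetry.get(node_id, "unknown")  # missing telemetry is a deviation too
        if observed != expected:
            anomalies.append(f"{node_id}: expected {expected}, observed {observed}")
    return anomalies


# Intent: two switches joined by a link that should be up.
graph = IntentGraph()
graph.add_node("leaf1", kind="switch", role="leaf")
graph.add_node("spine1", kind="switch", role="spine")
graph.add_node("leaf1-spine1", kind="link", intended_state="up")
graph.add_edge("leaf1", "has_link", "leaf1-spine1")
graph.add_edge("spine1", "has_link", "leaf1-spine1")

print(validate(graph, {"leaf1-spine1": "up"}))    # state matches intent
print(validate(graph, {"leaf1-spine1": "down"}))  # state deviates from intent
```

Even this toy version hints at the real work: every new vendor, object type, or telemetry source has to be modeled, collected, and checked continuously.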

Further, organizations need to realize that the DIY path is not a one-time investment but a permanent internal commitment to keep up with best-of-breed solutions and best practices. Essentially, they need to build a stellar development team whose sole mission is to build, continuously improve, and maintain these solutions.

Organizations should also plan the roadmap that’s required to keep up with their ever-growing requirements and use cases. They’ll need to have the ability to deliver features at the speed of the business…on an ongoing basis.

I’ve seen organizations spend $20M over 2 years investing in DIY and get nowhere. DIY requires the ability to hire dozens of top software engineers and keep them focused on building the solution from the ground up. Just as important, DIY requires the ability to retain those engineers, so they can support the solution they’ve built over many years. To quote Tsvi Gal, CTO at Morgan Stanley, at a recent ONUG event, “the worst vendor lock-in is our own … We are basically locked into our own environment.”

Innovative organizations like Yahoo Japan Corporation have shown that there is now a much better path, a choice that was not available to the original hyperscalers (Facebook, Google, Amazon) when they built out their automation solutions.

Choice 3: Use a commercial solution that works across multiple hardware vendors and lets you automate the entire process of designing, building, and operating your data center networks (day 0, day 1 and day 2+).

With today’s state-of-the-art multi-vendor commercial solutions, you can meet all the requirements laid out above. Furthermore, with today’s commercial options, organizations have the full range of hardware vendors to choose from in their dual-vendor strategy and can avoid lock-in. And you don’t have to build and retain a world-leading development team to reinvent the wheel and recreate an integrated solution that enables a fully automated operational model.

Instead, you can focus your management and development talent on areas that are strategic to your business.