By Jeff Carpenter

You might have heard of Apache Cassandra, the open-source NoSQL database. And you might know that some big, very successful companies rely on it, including LinkedIn, Netflix, The Home Depot, and Apple.

But did you know that Cassandra is used by a huge range of companies, including small, cloud-native application builders, financial firms, and broadcasters?

Here, I’ll give you an overview of Cassandra, along with a few reasons why this database might just be the right way to persist data at your organization and ensure your data and the apps that your developers build on it are infinitely scalable, secure, and fast.

A (very abridged) look at the database landscape

Many people in technology first became familiar with relational databases like Oracle DB or MySQL. They’re very powerful because they ensure data consistency and availability at the same time, and they’re effective and relatively easy to use, as long as your database is running on a single machine.

But if you need to run more transactions or need more space to store your data, you’ll run into upper limits pretty quickly, as relational databases can’t scale out efficiently.

The solution? Split the data among multiple machines and create a distributed system. NoSQL (“Not only SQL”) databases were invented to cope with these new requirements of volume (capacity), velocity (throughput), and variety (format) of big data.

NoSQL was born out of necessity, as the rise of Big Tech over the past decade has driven the global data sphere to skyrocket 15-fold; relational databases simply can’t cope with the new data volume or new performance requirements.
Huge global operations like Google, Facebook, and LinkedIn created NoSQL databases to enable them to scale efficiently, go global, and achieve zero downtime.

Cassandra’s early days

In the mid-2000s, engineers at young, fast-growing Facebook had a problem: how could they store and access the mushrooming data created by Messenger, the platform that enabled users of the social networking site to communicate with one another? Nothing on the market could handle the hundreds of millions of users who were on the platform at peak times, spread across tens of thousands of servers in data centers around the world.

So, Facebook’s team built their own database to enable users to search their Messenger inboxes. It replicated data across geographies to keep latencies down, handled billions of writes per day, and could scale as the number of users grew. (The original Facebook Cassandra paper, authored by its creators, is worth geeking out on.)

As it became clear that this technology was suitable for other purposes, the company gave Cassandra to the Apache Software Foundation (ASF), where it became an open-source project (it was voted a top-level project in 2010).

Cassandra’s scalability is impressive, but its reliability also sets it apart among databases. Because of its geographic distribution and the fact that data is replicated across multiple data centers, Cassandra’s uptime and disaster recovery capabilities are unparalleled. This quickly caught the eye of other rising web stars, like Netflix. The company launched its streaming service in 2007 using an Oracle database housed in a single data center. Its rapid growth quickly highlighted the danger of managing data at a single point of failure. By 2013, most of Netflix’s data was housed in Cassandra.

Cassandra has become the de facto standard database for high-growth applications that need reliability, high performance, and scalability: it’s used by approximately 90% of the Fortune 100, and a bunch of relatively recent developments are making it even more accessible to a wider range of organizations.

Why Cassandra?

Let’s quickly recap some of the unique capabilities of Cassandra:

For more details, see this excellent Cassandra overview provided by the ASF.

Why Cassandra for your organization?

Online banking services, airline booking systems, and popular retail apps. These modern applications and workloads, many of which operate at huge, distributed scale, should never go down. Cassandra’s seamless and consistent ability to scale to hundreds of terabytes, along with its exceptional performance under heavy loads, has made it a key part of the data infrastructures of companies that operate these kinds of applications.

For instance, Best Buy, the world’s biggest multichannel consumer electronics retailer, describes Cassandra as “flawless” in how it handles huge spikes in holiday shopping traffic.

But Cassandra isn’t just for big, established sector leaders like Best Buy or Bloomberg. It’s a powerful data store for developers and architects who build high-growth applications at organizations of all sizes.
Consider Praveen Viswanath, a cofounder of Alpha Ori Technologies, which offers an IoT platform for data acquisition from ships and processing and analytics for their operators.

Having experienced the power of the NoSQL database in earlier roles, Viswanath again turned to Cassandra, delivered via DataStax’s Astra DB cloud service, for its distributed reliability and high throughput, as Alpha Ori’s platform required the constant gathering of thousands of data points from the 40 or so major systems aboard the more than 260 ships that it served.

Because of his team’s need to focus on development rather than database operation, Viswanath chose the Astra DB managed service, a serverless solution that scales up and down when needed.

A flourishing ecosystem

The availability of Cassandra as a managed service is one way that this powerful database is reaching more organizations. But there’s also an ecosystem of complementary open-source technologies that have sprung up around Cassandra to make it simpler for developers to build apps with it.

Stargate is an open-source data gateway that provides a pluggable API layer that greatly simplifies developer interaction with any Cassandra database. REST, GraphQL, Document, and gRPC APIs make it easy to just start coding with Cassandra without having to learn the complexities of CQL and Cassandra data modeling.

K8ssandra is another open-source project that demonstrates this approachability, making it possible to deploy Cassandra on any Kubernetes engine, from the public cloud providers to VMware and OpenStack. K8ssandra extends the Kubernetes promise of application portability to the data tier, making it easier to avoid vendor lock-in.

A vibrant future

As a highly active open-source project, Cassandra is always being updated and extended by a vibrant community of very smart people at companies like Apple, Netflix, and my employer, DataStax.
Indeed, the Apache Software Foundation today announced the general availability of Cassandra 4.1. Through exciting innovations like ACID transaction support (long a holy grail of distributed NoSQL databases) and improved indexing, we are working to make Cassandra more powerful, easier to use, and ready for the future.

Want to learn more about Apache Cassandra? Register now for the Cassandra Summit, which takes place in San Jose, Calif., March 13-14, 2023.

About Jeff Carpenter:

Jeff has worked as a software engineer and architect in multiple industries and as a developer advocate helping engineers succeed with Apache Cassandra. He’s involved in multiple open-source projects in the Cassandra and Kubernetes ecosystems, including Stargate and K8ssandra. Jeff is coauthor of the O’Reilly books Cassandra: The Definitive Guide and Managing Cloud Native Data on Kubernetes.