The Rise of GraphQL Databases

Sometimes a tech wave starts with the force of a large company. Google, for example, birthed Kubernetes, pushed it forward, and that shove gave it the momentum it needed to become its own industry. Kubernetes has quickly become a technology that all businesses, large and small, aim to adopt. Some require the technology to push their cutting-edge agendas forward and others want to use it merely because it seems to be the thing to do. Regardless, it is a great example of how a company’s idea can be developed and widely adopted with the right push and market needs.

Other times, new technology becomes a force of its own with many voices and ideas within the industry gradually creating an unstoppable wave of adoption. The NoSQL movement was like that. It started as an idea, upon which products were built. Even as fans of relational databases fostered doubt, the flexibility and relative ease for developers, as well as the potential to scale, led to developer adoption that created an inexorable wave of new technology.

Deciding whether to build a product or service on top of a NoSQL database was tough for us as developers and architects. The technology was new and the idea of breaking beyond the SQL and relational world was a bit scary. Over time, we went from questioning the inclusion of a NoSQL database in our solutions to making one the default choice in many cases.

Just a few years ago, I saw SQL-based relational databases as the go-to choice. Now, that is simply not the case. Many viable options are available, and the list keeps growing.

Because of the NoSQL movement, you can now find databases from its lineage in Google Cloud, AWS, and Microsoft Azure. IBM acquired Cloudant to gain a NoSQL offering, and even Oracle was forced to concede and release its own NoSQL products, rather than try to absorb the movement into its relational databases, as it had done successfully with previous innovations.

Developers have spoken: we want to choose the technology we use to store our data, and we want options to choose from. This has forced most players in the database space to either alienate developers who want a choice or move forward with their own NoSQL offerings.

Looking forward, I see the same patterns forming around GraphQL. There’s a swell of developer adoption, with clear technological reasons to adopt GraphQL and new kinds of applications that it enables. Technologies that promise to make applications easier to develop and maintain appear frequently, but adoption is usually sparse. It has been a long time since a technology swept through the developer community the way GraphQL has.

Unlike the shift from SOAP services to RESTful service adoption, GraphQL is changing not only how we develop apps but also how we architect them. There’s innovation at all levels of the stack, including continued innovation in databases that leverage the paradigms brought forth with GraphQL adoption. It’s not yet clear if it’s a new movement on its own or a continuation of the NoSQL movement, but it is clear that GraphQL databases are here and making a wave.

Before digging further into GraphQL databases, let’s look back on the NoSQL movement, which provides useful background on why GraphQL databases are looking like a movement too.

NoSQL Database Movement - Development Flexibility and Scale

Our modern understanding of a NoSQL database goes back to the late 2000s, as the Web 2.0 push really started to take hold. It was fueled by ideas about data management, support, and scalability, and by a sense that the current norms were not serving these needs as well as they could. Job growth in NoSQL picked up around 2010, slightly after large amounts of VC money started flowing into NoSQL startups. As scale and design flexibility became a necessity, NoSQL really began to take off.

For a very long time, the relational database was everything. The question was never about SQL versus NoSQL, but merely which database provider you would be using. Relational databases were the default for every enterprise and every project within them. Although capable of supporting most use cases, relational databases do have limited flexibility due to their fixed tabular format. That was a fine solution for pre-2000s applications, but the growth of the web brought new challenges for data storage and persistence.

Emerging giants like Google, Facebook, Twitter, and Amazon had data that did not fit the tabular paradigm - and lots of it. As these companies changed the game in terms of how applications were built and used, most other companies began to follow these new methodologies as well. We are now in the midst of an era of massive amounts of data, unlike anything we have ever seen before. With this comes the need to push traditional databases to their brink, at a scale that stretches the engineering limits of databases.

That forced engineering enhancements in existing databases, but NoSQL also changed the landscape and allowed a change in the thought process around databases, development, and data modelling. In assessing the situation in O’Reilly Radar in 2012, Mike Loukides wrote:

“For years, the relational default has kept developers from understanding their real back-end requirements. The NoSQL movement has given us the opportunity to explore what we really require from our databases, and to find out what we already knew: there is no one-size-fits-all solution.”

What the NoSQL movement did was to free developers from engineering around the constraints of existing database solutions. A new type of flexibility was now available, all ushered in by the notion of developing new kinds of databases to serve the needs of modern developers and the modern web.

Relational databases continued to be important, especially in many legacy solutions. But, the field of database flavours continued to grow and gave developers the ability to choose the tech to match the requirements of their applications.

The concurrent rise of cloud-based PaaS (Platform as a Service) services meant developers could also have those database innovations served for them, instantly allowing use in a solution without major configuration upfront while also being scalable across the globe. This amplified the movement, making it even more appealing for developers looking to build apps more quickly with global scaling available out-of-the-box.

The innovations in NoSQL databases and cloud computing meant that sub-second response times for global apps and sharded data to meet regional or scale demands were now in the hands of every developer, startup, and company.

What I witnessed while working in large enterprises around the time of mass adoption of NoSQL was a major shift in how such technologies were being brought into the technical stack. Developers like myself were no longer content with adhering to the norms but began to voice opinions on how NoSQL could bring value to a place once controlled by vast expanses of relational databases.

As 50+ years of hardened architectures began to loosen and allow new technologies to flow in, I saw this shift improve the lives of developers and the applications we were building. We had a choice to start creating “best” solutions instead of merely “good” ones.

REST APIs - Data and the Web

From a development perspective, NoSQL databases allowed for a more flexible data model. That meant developers could make fast changes and cope with rising demand or quickly changing requirements as their Web applications grew in popularity. Indeed, agile development practices and NoSQL fit so nicely together that the rise of database options allowed developers to be agile throughout their backend tech stacks. Often that also meant databases and modelling that fit their app’s query requirements, instead of database constraints. But, to some extent, that flexibility stopped at the backend APIs that power the apps.

Traditionally, most web applications are built around RESTful APIs: HTTP data services that supply the data for modern apps. REST APIs, like relational databases, do not always lend themselves perfectly to the outcome intended by the developers. This has usually meant that developers must either bow to the constraints of the technology or find workarounds to manage the requirement. Neither of these solutions is optimal: the first leads to an implementation that does not fully handle the requirement, and the second heaps on more technical debt through all the required workarounds.

Just as tabular data didn’t represent the changes in data requirements as the web grew, documents don’t perfectly represent the data requirements of the modern web either.

Modern Web apps are about interlinked data and lots of it. Relationships within the data can be so complex that traditional systems cannot accurately depict them. Sometimes this also puts the burden on the developer to keep track of all these connections, leading to undesirable complexities. When looking at our modern data problems, we are not talking about single pages of data anymore, but rather data and their intrinsic relationships to other data.

Facebook is a great example that requires the rendering of complex interlinked data. The complexity of the relationships within the data is compounded by the sheer amount of data available (and needed). Now, not every app is at Facebook’s scale, but increasingly all web applications are about interlinked data and not single documents. The complexity has seeped into all companies and applications, big and small.

NoSQL databases coupled with RESTful APIs helped to bring a more harmonious approach to data and the applications that used it. But they were far from perfection and developers still needed a truly elegant way to make application development easier. RESTful APIs force programmers to view that data as a set of individual operations that they must link and traverse programmatically.

The document-oriented NoSQL database and RESTful APIs solved many problems. Although this has been the standard for modern web development for many years, the continued growth and feature requirements of modern apps created an API problem - a big one.

The premise of a RESTful API revolves around a contract of request and response. This contract is rigid and leads to minimal flexibility. What if an application needs more or less data than the contract describes? It means that you must go one of two routes: amend the contract and risk breaking other applications using the API, or build a new API with the revised contract. Neither option is optimal. For those situations where you need data from two or more disparate APIs, you can either handle it in the calling application’s code or once again create a new API for this need. The lack of flexibility of RESTful APIs became apparent very quickly.
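To make the fixed-contract problem concrete, here is a minimal sketch in Python. The three functions stand in for REST endpoints; the endpoint paths, data, and names such as `build_post_card` are hypothetical and exist only for illustration. The client wants a title, an author name, and a comment count, yet each endpoint returns its full fixed shape and the client has to stitch the results together itself.

```python
# Hypothetical in-memory data standing in for a backend store.
AUTHORS = {1: {"id": 1, "name": "Ada", "email": "ada@example.com", "bio": "Writes about databases."}}
POSTS = {10: {"id": 10, "title": "Hello", "body": "First post", "author_id": 1}}
COMMENTS = {10: [{"id": 100, "post_id": 10, "text": "Nice!"}]}


def get_post(post_id):
    """Simulates GET /posts/<id>: always returns the full, fixed post shape."""
    return dict(POSTS[post_id])


def get_author(author_id):
    """Simulates GET /authors/<id>: always returns the full author record."""
    return dict(AUTHORS[author_id])


def get_comments(post_id):
    """Simulates GET /posts/<id>/comments: returns every comment in full."""
    return [dict(c) for c in COMMENTS.get(post_id, [])]


def build_post_card(post_id):
    """Client-side stitching: three calls to show three small pieces of data."""
    calls = 0
    post = get_post(post_id)
    calls += 1
    author = get_author(post["author_id"])
    calls += 1
    comments = get_comments(post_id)
    calls += 1
    card = {
        "title": post["title"],
        "author": author["name"],
        "comment_count": len(comments),
    }
    return card, calls


card, calls = build_post_card(10)
print(card, calls)
```

In this sketch the client makes three round trips and discards most of what it receives (the post body, the author's email and bio, the comment text). Avoiding that would mean changing one of the contracts or adding a fourth endpoint, which is the trade-off described above.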

What did all of this mean for developers and our organizations? It meant that we now had teams that spent more time managing and supporting their APIs than actually building new and impactful applications. We were put in a state of perpetual catch-up: frontend teams had to wait on the backend team to build the needed APIs, and backend teams had to contend with a massive list of APIs the frontend team requested. As application portfolios became increasingly complex, application owners were forced to manage possibly hundreds of APIs. Even though development tools were making it easier to build applications, it wasn’t enough to keep up with the demands put on developers by the constant “API problem”.

GraphQL APIs - The Freedom of the Flexibility of Graphs

In search of a better way to solve the API problem, a rising wave of developer interest is showing that Facebook’s creation of GraphQL could be the answer we have been waiting for.

GraphQL lets a data source expose its data as a single graph. In terms of Facebook, this meant that data involving posts, the authors of the posts, the comments and authors, likes, related posts, etc. could all be available through a single API endpoint. Through this GraphQL API, client web apps can navigate their data requirements with simple queries that traverse the data graph.

Similar to a SQL query in a relational database, this allows the API user to tell the API what data they need and the shape in which to return that data. Rather than requiring a new RESTful API for each data requirement, a GraphQL API makes all the necessary data available and lets the developer specify what they need.
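The idea of a client-specified selection over a single graph can be sketched in a few lines of Python. This is a toy illustration only, not the GraphQL specification or any GraphQL library: the data, the `LINKS` table, and the `select` function are all hypothetical, and the selection is written as a nested dictionary rather than parsed from GraphQL query text.

```python
# Hypothetical in-memory graph: posts link to authors and comments by id.
AUTHORS = {1: {"id": 1, "name": "Ada", "email": "ada@example.com"}}
COMMENTS = {100: {"id": 100, "text": "Nice!", "author": 1}}
POSTS = {10: {"id": 10, "title": "Hello", "body": "First post", "author": 1, "comments": [100]}}

# How each linked field is followed: field name -> table holding the target nodes.
LINKS = {"author": AUTHORS, "comments": COMMENTS}


def select(node, selection):
    """Return only the requested fields, following links for nested selections.

    `selection` maps a field name to True (plain field) or to a nested
    selection dict (a linked node, or a list of linked nodes).
    """
    result = {}
    for field, sub in selection.items():
        value = node[field]
        if sub is True:
            result[field] = value
        elif isinstance(value, list):
            result[field] = [select(LINKS[field][ref], sub) for ref in value]
        else:
            result[field] = select(LINKS[field][value], sub)
    return result


# Roughly the shape of this GraphQL query (illustrative only):
#   { post(id: 10) { title author { name } comments { text author { name } } } }
query = {
    "title": True,
    "author": {"name": True},
    "comments": {"text": True, "author": {"name": True}},
}
response = select(POSTS[10], query)
print(response)
```

The response mirrors the shape of the request: the post body and the author's email are never returned because the client did not ask for them, and the linked author and comments arrive in the same call instead of through separate endpoints.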

That difference makes a GraphQL API more flexible. It also allows developers to be more agile in using it as they no longer need to worry about waiting for a new endpoint to be created just to access the data they need.

Similar to how big tech moved with the NoSQL movement, well-known tech innovators are at the fore of the GraphQL wave. Following Facebook’s initiation of the GraphQL movement, Netflix, Twitter, Airbnb, PayPal, Shopify, and many more tech companies have jumped on board. The wave hasn’t stopped with the tech industry - companies like Starbucks, the New York Times, NBC, Fairfax, KLM airlines, Intuit, and Yelp are converting their tech stacks to GraphQL in response to business needs. Even companies that service the tech industry like GitHub and Atlassian have moved to GraphQL. The GraphQL Foundation lists many GraphQL adopters, and the list is growing daily.

It’s easy to see why GraphQL awareness and adoption are growing when companies like Airbnb claim a 10x development improvement. Mark Stuart, who leads Web Platform at PayPal, calls it a game-changer. He remarked, “At PayPal, GraphQL has been a complete game-changer to the way we think about data, fetch data and build applications.”

Similar to NoSQL, no one company owns GraphQL. This has led to a space filled with innovation and different approaches to adopting the underlying technology. There are now numerous ways of bringing GraphQL into an organization with varying approaches to fitting the technology into your architecture. Because of this, subsequent developer adoption has been exponential.

GraphQL has seemingly solved the API problems that constrained many web applications. Because GraphQL is a specification about how front-end web clients and backends should talk to each other, and not an engineering solution in itself, every organization that adopts GraphQL does it a bit differently. Most of the time this means engineering their own solutions.

When those solutions are layered over traditional relational and NoSQL databases, we again hit the engineering and flexibility limits of such platforms. As apps grow, these platforms must be scaled or improved to maintain the desired performance. We find ourselves in a similar situation to the early stages of the NoSQL movement before technical solutions established themselves to satisfy the developers interested in the potential of the technology.

As much as NoSQL solved many problems developers were facing, when coupled with RESTful APIs it created new problems in the application’s API layer. GraphQL has solved many of the most pressing API problems but pushed new engineering challenges into the data layer. Traditional databases that weren’t designed for interlinked, graph-like queries are now being tasked with handling these complexities. The result has been subpar performance and a multitude of workarounds.

Initially, the GraphQL solution didn’t stick out to me as much as it should have. I’ve seen enough technologies come and go that promised a simpler way to do things but ended up being short-lived. I dismissed GraphQL as another promise I wanted to believe in, but couldn’t. This was until I actually started developing apps with it. The learning curve of using GraphQL within the frontend wasn’t very steep and I found it refreshing to stop constantly building new APIs to get to my data.

The downside of these projects was implementing the GraphQL server itself and learning a new way of developing APIs. Using GraphQL seemed to be a lot simpler than implementing it thanks to new-to-me concepts such as implementing resolvers and designing a schema that worked with my current data. Of course, as with learning any new technology it eventually became easier and more routine but still required a fair amount of research upfront and a lot of trial and error. This is likely because I was learning the tech for the first time while implementing it. Mapping my SQL table-based data into a graph-like schema also proved to be tricky.
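For readers who have not written resolvers before, the following sketch shows the general idea of mapping table-based data onto a graph-like schema. It uses Python's built-in `sqlite3` module and plain functions rather than a GraphQL server library. The tables, the schema in the comment, and the resolver names are assumptions made for illustration, not the setup used in the projects described above.

```python
import sqlite3

# Hypothetical relational data: posts reference authors through a foreign key.
conn = sqlite3.connect(":memory:")
conn.row_factory = sqlite3.Row
conn.executescript("""
    CREATE TABLE authors (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE posts (
        id INTEGER PRIMARY KEY,
        title TEXT,
        author_id INTEGER REFERENCES authors(id)
    );
    INSERT INTO authors VALUES (1, 'Ada');
    INSERT INTO posts VALUES (10, 'Hello', 1), (11, 'Again', 1);
""")

# The graph-like schema these resolvers are meant to serve (illustrative):
#   type Post   { id: ID!  title: String  author: Author }
#   type Author { id: ID!  name: String   posts: [Post] }


def resolve_post(post_id):
    """Query.post: look up a single post row by primary key."""
    row = conn.execute(
        "SELECT id, title, author_id FROM posts WHERE id = ?", (post_id,)
    ).fetchone()
    return dict(row) if row else None


def resolve_post_author(post):
    """Post.author: turn the author_id foreign key into an edge to an author."""
    row = conn.execute(
        "SELECT id, name FROM authors WHERE id = ?", (post["author_id"],)
    ).fetchone()
    return dict(row) if row else None


def resolve_author_posts(author):
    """Author.posts: the reverse edge, found by filtering on the foreign key."""
    rows = conn.execute(
        "SELECT id, title, author_id FROM posts WHERE author_id = ? ORDER BY id",
        (author["id"],),
    ).fetchall()
    return [dict(r) for r in rows]


# One resolver per (type, field) pair that is not a plain column.
RESOLVERS = {
    ("Query", "post"): resolve_post,
    ("Post", "author"): resolve_post_author,
    ("Author", "posts"): resolve_author_posts,
}

post = RESOLVERS[("Query", "post")](10)
author = RESOLVERS[("Post", "author")](post)
titles = [p["title"] for p in RESOLVERS[("Author", "posts")](author)]
print(post["title"], author["name"], titles)
```

Each edge in the graph schema becomes a function that issues its own SQL query, which hints at why the mapping gets tricky: the schema has to be designed around the existing tables, and deeply nested requests can fan out into many small queries against a database that was not built for that access pattern.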

The Rise of GraphQL Databases - Here Comes the NoSQL Movement 2.0

The rise of the Web forced the engineering innovations that led to the NoSQL movement. With more interlinked, graph-like data and GraphQL, we now increasingly see the API layer in many applications getting ripped out. This is leading to a new trend: the rise of GraphQL databases to support new Web apps - completely redefining the architecture of modern web applications and how we develop them.

Mike Loukides said the NoSQL movement freed developers to think of their data requirements and choose a database to match. The same is becoming true of new GraphQL databases. Developers can now choose a database tailored to this new era of data requirements, one that goes beyond the bounds of traditional database approaches and technologies.

A few products have entered the space with tech that allows developers to view their relational databases like GraphQL databases. Similarly, MongoDB and other NoSQL databases like AWS’s DynamoDB now come with cloud offerings of GraphQL support on top of their document databases.

GraphQL has even penetrated the headless-CMS world. Most of these products traditionally offered document stores as serverless backends for websites. Now, GraphQL has become the API of choice. Many new and trending startups in the space have positioned themselves as providers of GraphQL CMS databases. Existing players like Contentful and Jahia have also moved to GraphQL.

Other serverless databases and web-native GraphQL stores describe themselves as aiming squarely at the GraphQL database market. The push for GraphQL functionality is definitely in full swing.

Dgraph - The Native-GraphQL Database and Cloud Platform

Dgraph offers a solution that was built specifically as a GraphQL database. This is unlike other GraphQL database offerings that have added GraphQL to existing databases. Dgraph was engineered around the kinds of data and queries that GraphQL apps need. This is the only “truly native” approach to a GraphQL database.

We believe that Dgraph’s edge over competitors is the fact that it was built as a distributed GraphQL database from the ground up. Dgraph backers and customers agree:

“Graph databases that exist today are not truly distributed: they run fine on one node but rely on a variety of architectural hacks to run on multiple nodes, and are thus not scalable,” said Salil Deshpande, managing director at Bain Capital Ventures.

“This was a key consideration for us and one of the main reasons we looked at graphs. Writing queries against graphs is far more natural than writing SQL queries. You don’t need to have a deep understanding of data schema or write across hundreds of joins.” - Mark Boxall, Principal Software Engineer at FactSet

Just as the NoSQL movement built up steam through developer adoption and innovation based on the growing needs of the web, GraphQL databases are seeing a similar path. This is tightly linked to the engineering needs of the changing way modern applications work.

Growing web giants and new application requirements fed the technological need that became the NoSQL movement. This fundamentally changed the way developers think about databases. Now, as modern applications move towards interlinked and graph-like data, GraphQL databases are fundamentally changing how developers access data and are poised to become the backbone of the database movement over the next decade. The shift is already well underway.

Conclusion

This new approach to storing data and building APIs is bringing a new wave of development forward. Will it completely obliterate the paradigms of the past? It’s unlikely. But it does give developers a choice, much like SQL vs NoSQL, to build apps the way we want.

There are always many ways to solve a problem but usually only a few that solve it efficiently. For complex problems with data storage, retrieval, and APIs that have struggled with the current technology stacks, GraphQL and graph databases offer an alternative solution with the ideal balance of efficiency and simplicity.