Turning 300 tables into a single graph

How Capventis is Using Dgraph to Streamline Messy Legacy Data

Executive Summary

Capventis pulls vast quantities of legacy and real-time data from diverse sources for its top-tier clients. The team needed a solution that could streamline and scale that data, offering improved insight and performance to deliver the best results for their clients. After testing multiple graph databases, Capventis found Dgraph to be the easiest to use and the best performing, making the Capventis team more efficient, accurate, and flexible.

Dgraph is a no-brainer. I can ingest any data and any structure, and I don't have to worry about it. If I were trying to do this in a SQL table, we would end up with horrible joins and get tied up in knots. Dgraph is a godsend.

Mike Hawkes

CTO, Capventis

Problem

As a specialist business and technology consulting company based in the UK, Capventis provides solutions for customer engagement, experience management, analytics, and data science. The company’s technology facilitates and simplifies complex integrations from multiple data sources and formats, providing holistic views of customer experiences and engagement.

Customer experience design, support, and sentiment analysis require a range of tools and data. Examples include:

  • Broadcasters might need analytics on how users are consuming media. By discerning consumers’ media preferences, the broadcaster can provide real-time recommendations. This data can also be used to set email cadence and offers.
  • Online retailers may want to track multichannel interactions across email, websites, visits to dealers, feedback from a test drive, and post-purchase sentiment.

With notable customers in all major market segments, Capventis integrates data that might come from legacy datasets with hundreds of tables, as well as modern API-driven SaaS applications.

A client might have Zendesk for tickets, Wrike for time management, and Qualtrics for surveys, but still be reliant on a giant old Oracle database. There are typically many legacy systems that need to interact. Most have no connector, and those that do don't handle maintaining integrity between the various systems.

Mike Hawkes

CTO, Capventis

This variety of data and integrations required a specific data stack, which Capventis named Glü. Based on the scope and scale of their projects, the team wanted a GraphQL-based interface with a flexible, reliable, high-performance graph database on the back end.

Database design requirements included:

  • Flexible data source integration and ETL
  • Rapid scalability from test applications to global scale production environments
  • Platform agnostic – works on Windows, Linux, macOS, iOS, and Android
  • Small footprint and efficient data usage
  • Reliable and enterprise-ready

Approach

Capventis’ team used the Go programming language for most integration projects. The company was already bumping up against the limitations of legacy SQL databases: rapid, flexible, scaled-out integrations were hard to build on tabular structures that required large numbers of joins to bring disparate data sources together. Moving from tabular structures to a graph database would give Capventis more flexibility and agility to design projects and mix and match data sources and types.

With a graph, I can say 'I want that data from there, this data from here,' and bring them all together. The linkage and integration can be handled easily by the graph. You don't need to worry about the structure of the data or a schema.

Mike Hawkes

CTO, Capventis
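
As a rough illustration of that idea, here is what such a cross-source traversal might look like as a DQL query carried in Go code. The predicate names (customer.ticket, customer.survey, and so on) are hypothetical, not Capventis' actual schema; the point is that where a SQL database would need several joins, the graph answers it as one nested query.

    package queries

    // crossSourceQuery is a hypothetical DQL query. Starting from a customer,
    // it follows edges out to ticket and survey data that originated in
    // separate systems; in SQL this would typically require joins across
    // several tables.
    const crossSourceQuery = `
    {
      customers(func: type(Customer)) {
        customer.name
        customer.email
        customer.ticket {      # edge into data from the ticketing system
          ticket.subject
          ticket.status
        }
        customer.survey {      # edge into data from the survey platform
          survey.score
          survey.comment
        }
      }
    }`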

When Dgraph was released to the Go community, Capventis decided to test it as a candidate solution alongside two other graph databases: Neo4j and Cayley. One of the alternatives was complicated to deploy and configure, causing headaches in development that also limited scalability. The other proved extremely inefficient: moving a million records for GDPR compliance required gigabytes of space and exhausted the available memory.

In testing, Dgraph was far superior to the alternatives. It was configured and running in less than an hour, and it quickly scaled from one compute node to dozens with minimal configuration changes. Dgraph also provided global sharding, further enhancing scalability. Capventis found that Dgraph had a tiny data footprint, handled the million-record test efficiently, and was small enough to embed.

The Capventis team preferred the added features of Dgraph’s DQL query language.

It accepted normal native GraphQL queries and had a lot of the features that were missing from the language, like filtering and cascading requests. We could script quickly, pull objects in and out, and all of this was still handled in the Go memory models. It was awesome. The team loved Dgraph and DQL, so we decided to adopt it for Glü.

Mike Hawkes

CTO, Capventis
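
A minimal sketch of what that looks like in practice, using Dgraph's official Go client (dgo). The endpoint address, predicate names, and struct shapes are assumptions for illustration, not Capventis' production code:

    package main

    import (
        "context"
        "encoding/json"
        "log"

        "github.com/dgraph-io/dgo/v210"
        "github.com/dgraph-io/dgo/v210/protos/api"
        "google.golang.org/grpc"
    )

    func main() {
        // Assumed local Dgraph Alpha endpoint.
        conn, err := grpc.Dial("localhost:9080", grpc.WithInsecure())
        if err != nil {
            log.Fatal(err)
        }
        defer conn.Close()
        dg := dgo.NewDgraphClient(api.NewDgraphClient(conn))

        // @filter narrows results server-side and @cascade drops nodes that are
        // missing any requested edge: two of the DQL features mentioned above.
        q := `
        {
          engaged(func: type(Customer)) @filter(ge(customer.score, 8)) @cascade {
            customer.name
            customer.ticket { ticket.status }
          }
        }`

        resp, err := dg.NewReadOnlyTxn().Query(context.Background(), q)
        if err != nil {
            log.Fatal(err)
        }

        // Results unmarshal straight into ordinary Go structs
        // ("handled in the Go memory models").
        var out struct {
            Engaged []struct {
                Name    string `json:"customer.name"`
                Tickets []struct {
                    Status string `json:"ticket.status"`
                } `json:"customer.ticket"`
            } `json:"engaged"`
        }
        if err := json.Unmarshal(resp.Json, &out); err != nil {
            log.Fatal(err)
        }
        log.Printf("found %d engaged customers", len(out.Engaged))
    }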

Outcome

An early test of Dgraph came with a project for a government agency that involved integrating several older proprietary databases that were still live and being regularly updated with new information.

The database involved multiple departments that had merged and separated and merged again over the years. As a result, what should have been a dozen tables had exploded into 300, and the existing database vendor refused to collaborate on the project. The government agency couldn’t provide Capventis with an accurate data schema. Meanwhile, another department added data but continued making changes to that data even as the aggregation was underway.

The Capventis team used Dgraph to convert all the legacy data from the multiple sources, cleansing it on the fly with no data loss and generating a clear schema ready for immediate queries.

We linked Glü to their database servers, fetched all the data, and threw it into Dgraph. Within 20 minutes, I had the entire structure of the data set with all the proper interactions captured in Dgraph. I didn't expect it to be as good as it was. We solved this entire problem in one hit. It was months in the planning and minutes in the execution in the final version.

Mike Hawkes

CTO, Capventis
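
A simplified sketch of that ingestion pattern (not the actual Glü code): rows are read from a legacy SQL database, turned into JSON, and handed to Dgraph, which absorbs the structure as it goes. The connection strings, table name, and predicate names are placeholders.

    package main

    import (
        "context"
        "database/sql"
        "encoding/json"
        "log"

        "github.com/dgraph-io/dgo/v210"
        "github.com/dgraph-io/dgo/v210/protos/api"
        _ "github.com/go-sql-driver/mysql" // or whichever driver the legacy system needs
        "google.golang.org/grpc"
    )

    func main() {
        // Placeholder connection to the legacy database.
        legacy, err := sql.Open("mysql", "user:pass@/legacydb")
        if err != nil {
            log.Fatal(err)
        }
        defer legacy.Close()

        conn, err := grpc.Dial("localhost:9080", grpc.WithInsecure())
        if err != nil {
            log.Fatal(err)
        }
        defer conn.Close()
        dg := dgo.NewDgraphClient(api.NewDgraphClient(conn))

        rows, err := legacy.Query(`SELECT id, name, department FROM records`)
        if err != nil {
            log.Fatal(err)
        }
        defer rows.Close()

        for rows.Next() {
            var id, name, dept string
            if err := rows.Scan(&id, &name, &dept); err != nil {
                log.Fatal(err)
            }

            // Each legacy row becomes a node; keeping the original key as a
            // predicate lets later loads link back to the source system.
            // A real loader would batch mutations rather than commit per row.
            node := map[string]interface{}{
                "dgraph.type":      "Record",
                "record.source_id": id,
                "record.name":      name,
                "record.dept":      dept,
            }
            payload, err := json.Marshal(node)
            if err != nil {
                log.Fatal(err)
            }

            if _, err := dg.NewTxn().Mutate(context.Background(),
                &api.Mutation{SetJson: payload, CommitNow: true}); err != nil {
                log.Fatal(err)
            }
        }
        if err := rows.Err(); err != nil {
            log.Fatal(err)
        }
    }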

Capventis continues to use Dgraph for a wide range of projects. The team deploys Dgraph as Docker containers on AWS and Microsoft Azure, and maintains Dgraph and Glü development environments that can run on laptops. For Glü, Capventis has added functionality such as encryption, rate limiting, and data schema visualization, all leveraging Dgraph. Capventis also frequently uses DQL to quickly build interfaces for exporting compound datasets, collected from various systems, into multiple business intelligence tools simultaneously.
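
For the export side, a small illustrative helper shows the general shape of that pattern: take the JSON returned by a DQL query (resp.Json from the dgo client, as in the earlier sketch) and flatten it into a CSV file that a BI tool can import. The field names and column layout are assumptions, not part of Glü.

    package export

    import (
        "encoding/csv"
        "encoding/json"
        "os"
        "strconv"
    )

    // exportEngagementCSV flattens the JSON result of a hypothetical DQL query
    // into a two-column CSV file for downstream business intelligence tools.
    func exportEngagementCSV(dqlJSON []byte, path string) error {
        var data struct {
            Engaged []struct {
                Name  string  `json:"customer.name"`
                Score float64 `json:"customer.score"`
            } `json:"engaged"`
        }
        if err := json.Unmarshal(dqlJSON, &data); err != nil {
            return err
        }

        f, err := os.Create(path)
        if err != nil {
            return err
        }
        defer f.Close()

        w := csv.NewWriter(f)
        defer w.Flush()

        // Header row, then one row per customer node returned by the query.
        if err := w.Write([]string{"name", "score"}); err != nil {
            return err
        }
        for _, c := range data.Engaged {
            row := []string{c.Name, strconv.FormatFloat(c.Score, 'f', 2, 64)}
            if err := w.Write(row); err != nil {
                return err
            }
        }
        return nil
    }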

Conclusion

To date, Dgraph has proven highly available, reliable, and fast. Capventis hasn’t encountered any performance issues, despite working with some very messy data integrations and complex graph traversals. In the rare instances when Capventis has had a problem, they found the Dgraph community to be responsive and friendly. That level of support is cited as a significant part of the decision to push Dgraph to all relevant Capventis customers.

For Capventis customers, Dgraph often opens up new horizons in their data. By enabling easy linkage between edges and facets, and by providing flexibility in how queries can be built, Dgraph can answer queries that give users views of their data, and of the relationships between nodes in the graph, that were not previously possible to explore.

It's been consistent. It's been performant. And it's been scalable. I can take Dgraph from a demo environment, turn on more nodes, and it immediately becomes an enterprise version. It's been an enjoyable journey.

Mike Hawkes

CTO, Capventis
