Dgraph: JSON vs. Binary clients

When I started building the initial version of the Dgraph Go client, we were looking for a serialization format that was fast, easy to use, and supported across multiple language runtimes. We finally implemented our client using Protocol Buffers, which in our benchmarks was about twice as fast as JSON and allocated roughly two-thirds less memory when marshalling.

Dgraph v0.2 already supported serialization to JSON for the HTTP client. For our language-specific drivers, we wanted something that would perform better than JSON. Though we use FlatBuffers for everything internally, they lacked support for encoding recursive data structures. Protocol Buffers seemed the right choice because they work with most modern languages and can encode recursive data structures efficiently.

How to use Protocol Buffers

To use protocol buffers, you define the messages (the data structures that form the basis of communication) in a .proto file and then compile it using the protocol buffer compiler. For communication, we use gRPC, an open-source RPC framework by Google. gRPC services are described in .proto files as well, so we declare ours alongside the messages. Using gRPC allows us to communicate in a binary format, which is faster than retrieving JSON-formatted results.

// The Node object, which can have other child nodes and properties.
message Node {
    uint64 uid = 1;
    string xid = 2;
    string attribute = 3;
    repeated Property properties = 4;
    repeated Node children = 5; // Each node can have multiple children
}

message Request {
    string query = 1;
    // and other fields
}

message Response {
    Node n = 1;
    // and other fields
}

// Dgraph service used for communication between the Dgraph server and client over gRPC.

service Dgraph {
    rpc Query (Request) returns (Response) {}
}

You can find the full .proto file here. The .proto file can be used to generate the corresponding Go code using the protoc compiler and the runtime library.
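To make the recursive shape of the Node message concrete, here is a minimal sketch in Go. The struct below is hand-written for illustration and only mirrors some of the fields of the message; it is not the code that protoc generates, and the helper countNodes is a hypothetical name. It shows how a node that contains a list of child nodes is naturally walked with recursion.

```go
package main

import "fmt"

// Node is a hand-written stand-in that mirrors part of the Node message.
// The type generated by protoc has more fields and methods than this.
type Node struct {
	Uid       uint64
	Xid       string
	Attribute string
	Children  []*Node // each node can have multiple children
}

// countNodes (hypothetical helper) walks the tree recursively and returns
// the number of nodes in it, including the root.
func countNodes(n *Node) int {
	if n == nil {
		return 0
	}
	total := 1
	for _, child := range n.Children {
		total += countNodes(child)
	}
	return total
}

func main() {
	root := &Node{
		Uid:       1,
		Attribute: "director",
		Children: []*Node{
			{Uid: 2, Attribute: "film"},
			{Uid: 3, Attribute: "film", Children: []*Node{{Uid: 4, Attribute: "genre"}}},
		},
	}
	fmt.Println(countNodes(root)) // prints 4
}
```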

Benchmarks

In Go, you can easily measure how your algorithm does (in terms of time and space) by writing benchmarks. Go benchmarks are unusual in that they run the code under test b.N times, where b.N is adjusted until the benchmark function lasts long enough to be timed reliably. To see how our protocol buffer implementation compared with our JSON implementation, we wrote benchmarks for both. But first, let’s understand what a benchmark is and how we can interpret its results.

Let’s write a simple function, which just adds integers to a list.

func addToList() {
    list := make([]int, 10)
    for i := 0; i < 1000; i++ {
        list = append(list, i)
    }
}

Here’s benchmarking code:

func BenchmarkAddToList(b *testing.B) {
    b.ReportAllocs()
    for i := 0; i < b.N; i++ {
        addToList()
    }
}

We can run the above benchmark with the command shown below. Go repeatedly calls the benchmark function with increasing values of b.N until the run lasts long enough to be timed reliably.

$ go test -bench=.

BenchmarkAddToList-4  200000  8153 ns/op  22624 B/op  7 allocs/op

Here’s what the output means:

  • 200000 is the number of times the benchmark loop ran.
  • 8153 ns/op represents the time it took on average for an iteration of the loop to finish.
  • 22624 B/op is the number of bytes allocated per iteration.
  • 7 allocs/op is the number of distinct memory allocations per iteration.
  • Note that we get B/op and allocs/op only if we call b.ReportAllocs() as part of the benchmark or pass the -benchmem flag to go test.
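The same four numbers can also be read programmatically. The sketch below uses testing.Benchmark from the standard library, which runs a benchmark function outside of go test and returns a testing.BenchmarkResult. The wrapper name runBenchmark is our own, and the exact figures will differ from machine to machine.

```go
package main

import (
	"fmt"
	"testing"
)

// addToList is the function under test from the example above.
func addToList() {
	list := make([]int, 10)
	for i := 0; i < 1000; i++ {
		list = append(list, i)
	}
}

// runBenchmark (hypothetical wrapper) runs the benchmark without go test.
func runBenchmark() testing.BenchmarkResult {
	return testing.Benchmark(func(b *testing.B) {
		b.ReportAllocs()
		for i := 0; i < b.N; i++ {
			addToList()
		}
	})
}

func main() {
	res := runBenchmark()
	fmt.Println("iterations:", res.N)                  // number of times the loop ran
	fmt.Println("ns/op:", res.NsPerOp())               // average time per iteration
	fmt.Println("B/op:", res.AllocedBytesPerOp())      // bytes allocated per iteration
	fmt.Println("allocs/op:", res.AllocsPerOp())       // allocations per iteration
}
```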

If we change the line which initializes the slice to list := make([]int, 0, 1000) and run the benchmarks again, we get better results:

BenchmarkAddToList-4  1000000  1618 ns/op  0 B/op  0 allocs/op

The allocs/op dropped because the slice is created with enough capacity up front, so the runtime never has to reallocate it as we append elements. The B/op dropping all the way to zero is most likely because the slice never leaves the function and its capacity is a small constant, which lets the compiler keep the backing array on the stack instead of the heap.
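The effect of preallocating can also be checked directly with testing.AllocsPerRun, which reports the average number of heap allocations per call. This is a sketch rather than the code we benchmarked: the functions grow and prealloc and the package-level variable sink are hypothetical names, and the slices are deliberately stored in sink so that both variants allocate on the heap and can be compared on equal terms.

```go
package main

import (
	"fmt"
	"testing"
)

// sink keeps each slice reachable after the function returns, which forces
// its backing array onto the heap.
var sink []int

// grow starts from an empty slice and lets append resize it as needed.
func grow() {
	var list []int
	for i := 0; i < 1000; i++ {
		list = append(list, i)
	}
	sink = list
}

// prealloc reserves room for all 1000 elements before appending.
func prealloc() {
	list := make([]int, 0, 1000)
	for i := 0; i < 1000; i++ {
		list = append(list, i)
	}
	sink = list
}

// allocs returns the average number of heap allocations per call of f.
func allocs(f func()) float64 {
	return testing.AllocsPerRun(100, f)
}

func main() {
	fmt.Println("grow allocs/op:    ", allocs(grow))
	fmt.Println("prealloc allocs/op:", allocs(prealloc))
}
```

On a typical run, grow reports several allocations (one for each time the slice is resized) while prealloc reports a single one.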

Benchmarking ToPB against ToJSON

After implementing serialization using protocol buffers, we wrote benchmark tests for our ToJSON and ToProtocolBuffer methods to get exact metrics. These methods convert the internal SubGraph data structure to a byte array which is transferred over the network. Benchmark tests are an excellent way to compare different implementations or to measure whether new code leads to any improvement.

// Benchmark test for ToProtocolBuffer method.

func benchmarkToPB(file string, b *testing.B) {
    b.ReportAllocs()
    var sg SubGraph
    var l Latency

    // Reading the SubGraph data structure from a file.
    f, err := ioutil.ReadFile(file)
    if err != nil {
        // Without the input data there is nothing to benchmark, so stop here.
        b.Fatal(err)
    }

    buf := bytes.NewBuffer(f)
    dec := gob.NewDecoder(buf)
    err = dec.Decode(&sg)
    if err != nil {
        b.Fatal(err)
    }

    b.ResetTimer()
    // Running the benchmark tests.
    for i := 0; i < b.N; i++ {
        pb, err := sg.ToProtocolBuffer(&l)
        if err != nil {
            b.Fatal(err)
        }
        r := new(graph.Response)
        r.N = pb
        var c Codec
        if _, err = c.Marshal(r); err != nil {
            b.Fatal(err)
        }
    }
}

// Benchmark test for ToJSON
func benchmarkToJson(file string, b *testing.B) {
    b.ReportAllocs()
    var sg SubGraph
    var l Latency

    // Reading the SubGraph data structure from a file.
    f, err := ioutil.ReadFile(file)
    if err != nil {
        b.Fatal(err)
    }

    buf := bytes.NewBuffer(f)
    dec := gob.NewDecoder(buf)
    err = dec.Decode(&sg)
    if err != nil {
        b.Fatal(err)
    }

    b.ResetTimer()
    for i := 0; i < b.N; i++ {
        if _, err := sg.ToJSON(&l); err != nil {
            b.Fatal(err)
        }
    }
}

You can find the complete benchmark tests here. The algorithms that convert the internal SubGraph structure to JSON and to Protocol Buffers differ in some places. You can have a look at the code responsible for this here.
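The pattern used above (one helper per encoder, each timing the same input) can be tried without the Dgraph code base. The sketch below is not our benchmark: it uses a hand-written Node type, and because the standard library has no protocol buffer encoder, encoding/gob stands in as the binary format. All names here (sampleTree, toJSON, toGob, benchmarkEncode) are hypothetical, and the numbers it prints say nothing about protocol buffers themselves.

```go
package main

import (
	"bytes"
	"encoding/gob"
	"encoding/json"
	"fmt"
	"testing"
)

// Node is a simplified, hand-written recursive structure used as test input.
type Node struct {
	Uid       uint64
	Attribute string
	Children  []*Node
}

// sampleTree builds a root node with the given number of children.
func sampleTree(children int) *Node {
	root := &Node{Uid: 1, Attribute: "director"}
	for i := 0; i < children; i++ {
		root.Children = append(root.Children, &Node{Uid: uint64(i + 2), Attribute: "film"})
	}
	return root
}

// toJSON encodes the tree as JSON.
func toJSON(n *Node) ([]byte, error) { return json.Marshal(n) }

// toGob encodes the tree with gob, the stand-in binary format.
func toGob(n *Node) ([]byte, error) {
	var buf bytes.Buffer
	if err := gob.NewEncoder(&buf).Encode(n); err != nil {
		return nil, err
	}
	return buf.Bytes(), nil
}

// benchmarkEncode times one encoder against a fixed input.
func benchmarkEncode(encode func(*Node) ([]byte, error), n *Node) testing.BenchmarkResult {
	return testing.Benchmark(func(b *testing.B) {
		b.ReportAllocs()
		for i := 0; i < b.N; i++ {
			if _, err := encode(n); err != nil {
				b.Fatal(err)
			}
		}
	})
}

func main() {
	tree := sampleTree(1000)
	fmt.Println("JSON:", benchmarkEncode(toJSON, tree), benchmarkEncode(toJSON, tree).MemString())
	fmt.Println("gob: ", benchmarkEncode(toGob, tree), benchmarkEncode(toGob, tree).MemString())
}
```

Keeping the input construction outside the timed function serves the same purpose as the b.ResetTimer() call in our benchmarks: only the encoding step is measured.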

Using these benchmark tests, we were able to improve our metrics by over 50% by switching from interface{} to []byte for ObjectValue as part of this change. Later, when we moved to Gogo Protobuf, we compared the new benchmark results with the previous ones to confirm the improvement.

Marshalling

This is how the final benchmark results compare for a query which returns 1000 entities in the result.

BenchmarkToJSON_1000_Director-2  500  2512808 ns/op  560427 B/op  9682 allocs/op
BenchmarkToPB_1000_Director-2   2000  1338410 ns/op  196743 B/op  3052 allocs/op

The benchmarks show that the ToPB method is almost 2x faster than ToJSON (about 1.34 ms per operation versus 2.51 ms). The bytes allocated per operation show that ToPB allocates about 65% less memory than ToJSON. You can find more information about these benchmarks and what we changed to get here in our README.

Unmarshalling

BenchmarkToJSONUnmarshal_1000_Director-4  1000  1279297 ns/op  403746 B/op  5144 allocs/op
BenchmarkToPBUnmarshal_1000_Director-4    3000   489585 ns/op  202256 B/op  5522 allocs/op

We can see that unmarshalling on the client is also about 2.6x faster with protocol buffers than with JSON. It allocates about 50% fewer bytes, although the number of allocations per operation is slightly higher (5522 versus 5144).
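The ratios quoted for marshalling and unmarshalling follow directly from the ns/op and B/op columns. As a quick check, the arithmetic can be written out as below; the helper names speedup and reduction are ours.

```go
package main

import "fmt"

// speedup returns how many times faster the new timing is than the old one.
func speedup(oldNs, newNs float64) float64 {
	return oldNs / newNs
}

// reduction returns the percentage saved when going from old to new.
func reduction(oldBytes, newBytes float64) float64 {
	return (1 - newBytes/oldBytes) * 100
}

func main() {
	// Marshalling: ToJSON versus ToPB.
	fmt.Printf("marshal:   %.2fx faster, %.0f%% less memory\n",
		speedup(2512808, 1338410), reduction(560427, 196743))
	// prints "marshal:   1.88x faster, 65% less memory"

	// Unmarshalling: JSON versus protocol buffers.
	fmt.Printf("unmarshal: %.2fx faster, %.0f%% less memory\n",
		speedup(1279297, 489585), reduction(403746, 202256))
	// prints "unmarshal: 2.61x faster, 50% less memory"
}
```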

Golang protobuf vs. Gogo protobuf

If both your server and client are written in Go, then we recommend Gogo Protobuf instead of Golang Protobuf as the runtime library. Compared to Golang protobuf, Gogo marshals 2.3x faster while allocating roughly 78% fewer bytes per operation, and unmarshals 1.5x faster, as shown in the benchmarks below.

BenchmarkToPBMarshal_1000_Director-4     3000  360545 ns/op  226504 B/op    22 allocs/op  # Golang protobuf
BenchmarkToPBMarshal_1000_Director-4    10000  156820 ns/op   49152 B/op     1 allocs/op  # Gogo protobuf

BenchmarkToPBUnmarshal_1000_Director-4   2000  733481 ns/op  200241 B/op  5523 allocs/op  # Golang protobuf
BenchmarkToPBUnmarshal_1000_Director-4   3000  487745 ns/op  202256 B/op  5522 allocs/op  # Gogo protobuf

Note that Gogo protobuf supports only Go as of now. However, if you are using some other language, this isn’t a problem: Gogo protobuf is backward compatible with Golang protobuf. Our Python and Java clients can still interact with the server (which marshals using Gogo), which makes Gogo a safe choice.
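Clients in other languages keep working because what travels over the network is the standard protocol buffer wire format, regardless of which library produced it. As an illustration only, the sketch below encodes the first two fields of the Node message by hand, using the varint routines from Go's encoding/binary package, which implement the same varint encoding as protocol buffers. The function names are hypothetical and this is not how either library is implemented.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// appendUvarint appends v in base-128 varint form.
func appendUvarint(buf []byte, v uint64) []byte {
	var tmp [binary.MaxVarintLen64]byte
	n := binary.PutUvarint(tmp[:], v)
	return append(buf, tmp[:n]...)
}

// encodeNode hand-encodes "uint64 uid = 1" and "string xid = 2".
// Each field starts with a key: (field number << 3) | wire type.
func encodeNode(uid uint64, xid string) []byte {
	var buf []byte
	buf = appendUvarint(buf, 1<<3|0) // field 1, wire type 0 (varint)
	buf = appendUvarint(buf, uid)
	buf = appendUvarint(buf, 2<<3|2) // field 2, wire type 2 (length-delimited)
	buf = appendUvarint(buf, uint64(len(xid)))
	buf = append(buf, xid...)
	return buf
}

func main() {
	fmt.Printf("% x\n", encodeNode(150, "alice"))
	// prints "08 96 01 12 05 61 6c 69 63 65"
}
```

Any conforming protocol buffer library, in any language, reads these bytes back as a Node with uid 150 and xid "alice".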

Recommendations

  • If you are interacting with Dgraph directly from the browser with JavaScript, we recommend that you use the more browser-friendly JSON.
  • If you are interacting with Dgraph through one of our language drivers, we recommend you do it using gRPC and protocol buffers.
  • If you are using the Go client with gRPC, you will automatically be using the faster Gogo protobuf.
  • Overall, the fastest way to query Dgraph is via a Go client communicating over gRPC.

We would love to hear about your interaction with the Dgraph server.