This document is the second episode of a series that introduces Dgraph concepts through hands-on examples. It covers the notion of a Dgraph schema.
The hands-on examples are a way to better understand each concept by experimenting directly with Dgraph. They are not a substitute for the product documentation.
Follow Episode 1 to get a Dgraph instance up and running and load data.
You can perform all the steps of this post using a local Learning Environment, with a Dgraph instance and Ratel UI running in Docker containers.
We are continuing the hands-on with the data loaded in Episode 1.
If you need to re-create the data, run the following mutation from a terminal window:
curl "localhost:8080/mutate?commitNow=true" \
-s -H "Content-Type: application/rdf" -X POST -d $'
{
set {
<_:jedi1> <character_name> "Luke Skywalker" .
<_:jedi1> <eye_color> "blue" .
<_:leia> <character_name> "Leia" .
<_:sith1> <character_name> "Anakin" (aka="Darth Vader",villain=true) .
<_:sith1> <has_for_child> <_:jedi1> .
<_:sith1> <has_for_child> <_:leia> .
}
}
' | jq
We will now do some basic queries to understand why we would need to tell Dgraph more about the predicates and nodes in the form of schema metadata.
Let’s find the list of entities having the predicate eye_color equal to “blue” and return the character_name of those entities.
We will explain the query language syntax in Episode 3, so for now just use the following query:
curl "localhost:8080/query" -s \
-H "Content-Type: application/dql" \
-X POST \
--data '
{
characters(func:eq(eye_color,"blue")) {
uid
character_name
}
}' | jq
The response is an error:
{
"errors": [
{
"message": ": Predicate eye_color is not indexed",
"extensions": {
"code": "ErrorInvalidRequest"
}
}
],
"data": null
}
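When scripting against the endpoint rather than reading the output by eye, it can help to extract the error messages programmatically. The following is a minimal Python sketch that parses the response body shown above; the helper name error_messages is hypothetical, and the assumption that a successful response simply carries no "errors" key is ours.

```python
import json
from typing import List

# Response body copied from the failed query above.
RESPONSE = """
{
  "errors": [
    {
      "message": ": Predicate eye_color is not indexed",
      "extensions": {"code": "ErrorInvalidRequest"}
    }
  ],
  "data": null
}
"""

def error_messages(body: str) -> List[str]:
    """Return the error messages found in a JSON response body."""
    payload = json.loads(body)
    # Assumption: a response without an "errors" key reported no error.
    # The leading ": " seen in the message is stripped for readability.
    return [err["message"].lstrip(": ") for err in payload.get("errors", [])]

messages = error_messages(RESPONSE)
print(messages)
```

A script could stop early when this list is not empty instead of reading a null "data" field.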
If you are using Ratel UI, the Error tab of the result panel should display:
Message: : Predicate eye_color is not indexed
Dgraph is complaining that the predicate eye_color is not indexed.
An index is required to be able to use certain query functions.
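To build intuition for why an equality lookup relies on an index, here is a toy model in Python. It is only a conceptual sketch and says nothing about how Dgraph stores data internally; the TRIPLES list, the uids, and the build_exact_index helper are all made up for illustration.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# Toy triples (uid, predicate, value); the uids are illustrative only.
TRIPLES: List[Tuple[str, str, str]] = [
    ("0x1", "character_name", "Luke Skywalker"),
    ("0x1", "eye_color", "blue"),
    ("0x2", "character_name", "Leia"),
    ("0x3", "character_name", "Anakin"),
]

def build_exact_index(triples: List[Tuple[str, str, str]],
                      predicate: str) -> Dict[str, List[str]]:
    """Map each value of `predicate` to the uids holding that value."""
    index = defaultdict(list)
    for uid, pred, value in triples:
        if pred == predicate:
            index[value].append(uid)
    return dict(index)

eye_color_index = build_exact_index(TRIPLES, "eye_color")
# With the index, an equality lookup is a single dictionary access
# instead of a scan over every triple.
print(eye_color_index.get("blue", []))
```

Without such a structure, answering "which entities have eye_color equal to blue" would mean visiting every value of the predicate, which is why the query is rejected rather than run slowly.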
Let’s try another query to find the list of entities whose character_name contains the term “Luke”, and return the uid, the character_name, and the eye_color of those entities.
curl "localhost:8080/query" -s \
-H "Content-Type: application/dql" \
-X POST \
--data '
{
characters(func:anyofterms(character_name,"Luke")) {
uid
character_name
eye_color
}
}' | jq
In this case, the error is more specific:
Message: : Attribute character_name is not indexed with type term
To use the function anyofterms on a predicate, Dgraph expects a specific index type named term.
The function documentation specifies which kind of index is needed by each function.
Certain query functions require specific index types.
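The difference between index types can be illustrated with a simplified term index in Python. This is a sketch under our own assumptions: the terms tokenizer below (lowercase, split on word characters) is a stand-in and not Dgraph's actual term tokenizer, and the names NAMES, build_term_index, and any_of_terms are hypothetical.

```python
import re
from collections import defaultdict
from typing import Dict, List, Set

# Toy values of character_name keyed by illustrative uids.
NAMES: Dict[str, str] = {"0x1": "Luke Skywalker", "0x2": "Leia", "0x3": "Anakin"}

def terms(text: str) -> List[str]:
    """Split text into lowercase terms (a simplified tokenizer)."""
    return re.findall(r"\w+", text.lower())

def build_term_index(values: Dict[str, str]) -> Dict[str, Set[str]]:
    """Map each term to the set of uids whose value contains it."""
    index = defaultdict(set)
    for uid, text in values.items():
        for term in terms(text):
            index[term].add(uid)
    return dict(index)

def any_of_terms(index: Dict[str, Set[str]], query: str) -> List[str]:
    """Return the uids matching at least one term of the query."""
    matches: Set[str] = set()
    for term in terms(query):
        matches |= index.get(term, set())
    return sorted(matches)

term_index = build_term_index(NAMES)
print(any_of_terms(term_index, "Luke"))
```

An index keyed on the whole value "Luke Skywalker" could not answer a lookup for "Luke" alone, which is why a term-based function asks for a term-based index.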
So let’s add indexes by pushing a Dgraph schema to the /alter endpoint:
curl "localhost:8080/alter" --silent --request POST \
--data $'
character_name: string @index(term) .
eye_color: string @index(hash) .
' | jq
The response should be:
{
"data": {
"code": "Success",
"message": "Done"
}
}
At this stage, the Dgraph schema is simply a list of predicate names with predicate type and indexes, in the following syntax:
character_name: string @index(term) .
eye_color: string @index(hash) .
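If the schema is generated from code, the syntax above is easy to produce from a small data structure. The sketch below is one possible approach in Python; the schema_lines helper and its input shape (predicate name mapped to a type and a list of index names) are assumptions made for this example, not part of any Dgraph client.

```python
from typing import Dict, List, Tuple

def schema_lines(predicates: Dict[str, Tuple[str, List[str]]]) -> str:
    """Render predicate declarations in the 'name: type @index(...) .' syntax."""
    lines = []
    for name, (ptype, indexes) in predicates.items():
        # The @index directive is only emitted when at least one index is listed.
        directive = f" @index({', '.join(indexes)})" if indexes else ""
        lines.append(f"{name}: {ptype}{directive} .")
    return "\n".join(lines)

schema = schema_lines({
    "character_name": ("string", ["term"]),
    "eye_color": ("string", ["hash"]),
})
print(schema)
```

The resulting string is what would be sent as the body of the request to the /alter endpoint.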
We can now re-run the queries.
curl "localhost:8080/query" -s \
-H "Content-Type: application/dql" \
-X POST \
--data '
{
characters(func:eq(eye_color,"blue")) {
uid
character_name
}
}' | jq
The JSON tab of the Result panel should display the result:
{
"data": {
"characters": [
{
"uid": "0x1",
"character_name": "Luke Skywalker"
}
]
},
...
The uid may differ on your system. It is an internal unique id generated by Dgraph.
And the second query:
curl "localhost:8080/query" -s \
-H "Content-Type: application/dql" \
-X POST \
--data '
{
characters(func:anyofterms(character_name,"Luke")) {
uid
character_name
eye_color
}
}' | jq
will result in:
{
"data": {
"characters": [
{
"uid": "0x1",
"character_name": "Luke Skywalker",
"eye_color": "blue"
}
]
},
...
With the proper indexes declared in the Dgraph schema, the queries are working as expected!
We noticed that you can always add a fact about an entity using a mutation. We did that in Episode 1 when adding:
{
set {
<0x01> <eye_color> "blue".
}
}
We don’t need to tell Dgraph what the entity <0x01> is. In that sense, triples are schema-less and very flexible.
However, there are two use cases where knowing the expected predicates of a given entity will help: deleting everything known about an entity, and retrieving all the predicates of an entity.
The latter can be done with a query using the expand function:
First let’s get the internal ID of our entities:
curl "localhost:8080/query" -s \
-H "Content-Type: application/dql" \
-X POST \
--data '
{
characters(func:has(character_name)) {
character_name
uid
}
}' | jq
The result is a list of all entities having a character_name:
{
"data": {
"characters": [
{
"character_name": "Luke Skywalker",
"uid": "0x1"
},
{
"character_name": "Leia",
"uid": "0x2"
},
{
"character_name": "Anakin",
"uid": "0x3"
}
]
},
...
Note the uid for “Luke”, in our example “0x1”.
Replace “0x1” with the correct uid in the following queries.
curl "localhost:8080/query" -s \
-H "Content-Type: application/dql" \
-X POST \
--data '
{
character(func:uid(0x1)) {
expand(_all_)
}
}' | jq
At this point, the result is empty: Dgraph can find the entity but does not know what to expand, i.e., the list of predicates for this entity.
{
"data": {
"character": []
},
...
We have the same issue with the delete operation.
Deleting an entity is done by deleting everything Dgraph knows about this entity. This is done with a mutation using a wildcard delete.
curl "localhost:8080/mutate?commitNow=true" -s \
-H "Content-Type: application/rdf" \
-X POST \
--data '
{
delete {
<0x01> * * .
}
}' | jq
Replace 0x01 with the uid noted above.
The mutation is ‘Done’, but a simple query will show that the entity is still there:
curl "localhost:8080/query" -s \
-H "Content-Type: application/dql" \
-X POST \
--data '
{
character(func:uid(0x1)) {
character_name
}
}' | jq
The delete operation using the wildcard did not delete the predicates. To produce the expected result, Dgraph needs to know the list of predicates for this entity.
This is the role of Types in the Dgraph schema.
Let’s define a type Character with the list of predicates a Character may have:
curl "localhost:8080/alter" --silent --request POST \
--data $'
character_name: string @index(term) .
eye_color: string @index(hash) .
has_for_child: [uid] .
type Character {
character_name
eye_color
has_for_child
}
' | jq
Verify that the result is a “Success”:
{
"data": {
"code": "Success",
"message": "Done"
}
}
We just updated the Dgraph schema and declared that an entity of type Character may have facts about character_name, eye_color, and has_for_child.
We need to tell Dgraph that our entities are of type Character.
To do that, we save a fact, i.e., a triple, using the reserved predicate dgraph.type.
curl "localhost:8080/mutate?commitNow=true" \
-s -H "Content-Type: application/rdf" -X POST -d $'
upsert {
query {
characters as var(func: has(character_name))
}
mutation {
set {
uid(characters) <dgraph.type> "Character" .
}
}
}' | jq
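The same upsert pattern applies to any predicate and type pair, so the request body can be templated. The Python sketch below only builds the text of the upsert block shown above; type_upsert is a hypothetical helper, and the variable name nodes replaces characters purely to keep the template generic.

```python
def type_upsert(predicate: str, type_name: str) -> str:
    """Build an upsert block that sets dgraph.type on every node having `predicate`."""
    return (
        "upsert {\n"
        "  query {\n"
        f"    nodes as var(func: has({predicate}))\n"
        "  }\n"
        "  mutation {\n"
        "    set {\n"
        f'      uid(nodes) <dgraph.type> "{type_name}" .\n'
        "    }\n"
        "  }\n"
        "}"
    )

body = type_upsert("character_name", "Character")
print(body)
```

The printed text can be used as the request body of the mutation above, with the same Content-Type header.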
At this point, every entity that has a character_name is of type Character, and Dgraph knows the predicates for type Character.
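The link between an entity, its type, and the predicates of that type can be pictured with a small Python model. This is a conceptual sketch, not Dgraph's implementation: the TYPES dictionary mirrors the Character type declared above, and expand_all is a hypothetical function that mimics how an expand over all predicates depends on the declared types.

```python
from typing import Any, Dict, List

# Mirrors the Character type declared in the schema above.
TYPES: Dict[str, List[str]] = {
    "Character": ["character_name", "eye_color", "has_for_child"],
}

def expand_all(node: Dict[str, Any], types: Dict[str, List[str]]) -> Dict[str, Any]:
    """Return the predicates of `node` that are listed by its declared types."""
    result: Dict[str, Any] = {}
    for type_name in node.get("dgraph.type", []):
        for predicate in types.get(type_name, []):
            if predicate in node:
                result[predicate] = node[predicate]
    return result

luke_untyped = {"character_name": "Luke Skywalker", "eye_color": "blue"}
luke_typed = {**luke_untyped, "dgraph.type": ["Character"]}

print(expand_all(luke_untyped, TYPES))  # nothing to expand without a type
print(expand_all(luke_typed, TYPES))
```

In this model the facts are present in both cases; only the typed node gives the expansion a list of predicates to look for.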
We can re-test our query using expand:
curl "localhost:8080/query" -s \
-H "Content-Type: application/dql" \
-X POST \
--data '
{
character(func:uid(0x1)) {
expand(_all_)
}
}' | jq
The result should now provide the predicate values:
{
"data": {
"character": [
{
"eye_color": "blue",
"character_name": "Luke Skywalker"
}
]
},
Let’s delete all the facts about this entity in a mutation:
curl "localhost:8080/mutate?commitNow=true" -s \
-H "Content-Type: application/rdf" \
-X POST \
--data '
{
delete {
<0x01> * * .
}
}' | jq
and re-run the query
curl "localhost:8080/query" -s \
-H "Content-Type: application/dql" \
-X POST \
--data '
{
character(func:uid(0x1)) {
expand(_all_)
}
}' | jq
to verify that Dgraph no longer has any information about this entity.
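The behaviour of the wildcard delete before and after typing the entity can be summarised with a toy simulation in Python. It is a sketch under stated assumptions: after_wildcard_delete is a hypothetical function, it only models the rule observed in this episode (predicates are removed when they belong to a type of the entity), and it does not model what happens to dgraph.type itself.

```python
from typing import Dict, List

# Mirrors the Character type declared in the schema above.
TYPES: Dict[str, List[str]] = {
    "Character": ["character_name", "eye_color", "has_for_child"],
}
LUKE: Dict[str, str] = {"character_name": "Luke Skywalker", "eye_color": "blue"}

def after_wildcard_delete(predicates: Dict[str, str],
                          node_types: List[str],
                          types: Dict[str, List[str]]) -> Dict[str, str]:
    """Return the predicate values left after a simulated '<uid> * *' delete."""
    # Only predicates listed by one of the node's types are considered known.
    known = {p for t in node_types for p in types.get(t, [])}
    return {p: v for p, v in predicates.items() if p not in known}

print(after_wildcard_delete(LUKE, [], TYPES))             # untyped: facts remain
print(after_wildcard_delete(LUKE, ["Character"], TYPES))  # typed: facts removed
```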
Define a type and list all predicates for this type in the Dgraph schema, and set the type of entities.
It is a best practice to create a schema with proper indexes before saving a large volume of facts. Indexes will improve data ingestion performance.