How The Graph Works

A non-technical deep dive into the "Google of Blockchains"

Sep 28, 2022

What Is The Graph?

The Graph, often called “the Google of blockchains,” is a complex solution to the problem of blockchain indexing.

You’ve already used an index before: the list at the back of a book. It helps you find information more quickly than reading through the book, line by line, until you find what you seek. Blockchain indexing is similar. A more accurate definition of The Graph would be “An open API that defines how data retrieved from blockchains will be organized and indexed.”

I found The Graph quite difficult to understand at first, but hopefully this article makes it easier for you.

What Are The Two Problems that The Graph Solves?

The Graph solves two problems. The first is the time it takes to sift through blockchain transactions. All blockchain transactions may be public, but the sheer number of them means that you must spend time looking for the data you want. Ethereum, for example, processes roughly a million transactions a day. This poses a problem for app developers who want to access only certain transactions. Maybe they want to send a message to only the addresses that have received or sent a billion dollars’ worth of crypto. One way to do this is to look through each transaction, one by one, and pick out the billion-dollar ones. However, this would take hours, even days. That’s no good. Finding the right blockchain data quickly is the first problem that The Graph solves.
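To make the difference concrete, here is a minimal Python sketch. The transactions, names, and dollar values are made up for illustration, and this is not how The Graph itself is implemented. It only contrasts scanning every transaction on each request with building an index once and reusing it.

```python
# Hypothetical toy transactions: (sender, receiver, value in USD).
transactions = [
    ("alice", "bob", 50),
    ("carol", "dave", 2_000_000_000),
    ("bob", "erin", 1_200),
    ("erin", "alice", 1_500_000_000),
]

THRESHOLD_USD = 1_000_000_000


def scan_for_whales(txs, threshold):
    """Slow way: read every transaction each time someone asks."""
    found = set()
    for sender, receiver, value in txs:
        if value >= threshold:
            found.update((sender, receiver))
    return found


def build_whale_index(txs, threshold):
    """Index way: do the scan once and keep the result for later lookups."""
    return frozenset(scan_for_whales(txs, threshold))


whale_index = build_whale_index(transactions, THRESHOLD_USD)

# Later questions are answered from the index without touching the pile again.
print("carol" in whale_index)  # True
print("bob" in whale_index)    # False
```

On four transactions the two approaches feel the same. The gap only shows up when the pile holds millions of entries and the question is asked over and over.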

Decentralized app (dapp) developers have faced a dilemma when solving this problem: prioritize either decentralization or speed. Prioritizing decentralization means that you spend your own money and time indexing the blockchain and building your own convenient database out of it. The cost is that the dapp will run more slowly. If you want to be speedy, then you can use someone else’s curated database of blockchain information. The cost here is that you must trust the other party not to tamper with or incorrectly relay the blockchain information. This dilemma is the second problem, and The Graph solves it by decentralizing the indexing work itself.

Before we continue, let’s introduce an analogy that we’ll return to repeatedly, courtesy of Kyle Rojas. A blockchain is a lot like a disorganized Amazon Warehouse. Instead of things being arranged, shippers just throw things into one big pile in the middle of the warehouse. Yikes!

The more recent the blockchain data, the easier it is to access, just as things on the top of the pile are easier to see than things at the bottom. If you want to buy … nail polish and see it at the pile’s top, then it’s easy to grab it and ship it. But if there’s no nail polish at the top of the pile, then you have to spend time digging for it.

Sound familiar? The transactions in a blockchain are the products being thrown into the big pile. The most recent transactions are the products at the top. In theory, everything in the pile has some value, just as every transaction has value. But without a way to sort everything, that value is hard to get at. That’s how the blockchain works in this analogy.

A subgraph is the product that makes The Graph so useful. It is a set of data collected from the blockchain that leaves out everything you don’t need. Keeping with our analogy, a subgraph would be an organized collection of all the different shoes from Amazon’s pile of stuff. Another subgraph could be all the different boots, or shampoos, or sex toys. A real-life example of a subgraph could be a list of addresses that have received or sent more than $1 billion worth of crypto. In short, a subgraph is a dataset that contains only the relevant information.
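Sticking with the warehouse, a subgraph can be pictured as a filter plus a shape: which items to keep, and which details about each item to record. The sketch below is a toy under that assumption, with invented items and field names. Real subgraphs are defined with The Graph’s own tooling, not with a Python function like this.

```python
# Hypothetical pile of warehouse items, in the order they were thrown in.
pile = [
    {"id": 1, "category": "shoes", "name": "boat shoes", "size": 13},
    {"id": 2, "category": "shampoo", "name": "mint shampoo"},
    {"id": 3, "category": "boots", "name": "rain boots", "size": 9},
    {"id": 4, "category": "shoes", "name": "slippers", "size": 13},
]


def build_subgraph(items, category, fields):
    """Keep only the items in one category, and only the listed fields."""
    return [
        {field: item[field] for field in fields}
        for item in items
        if item["category"] == category
    ]


shoes_subgraph = build_subgraph(pile, "shoes", ["name", "size"])
print(shoes_subgraph)
# [{'name': 'boat shoes', 'size': 13}, {'name': 'slippers', 'size': 13}]
```

The shampoo and the boots are still in the pile. They just aren’t part of this particular subgraph.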

To recap:

- Indexing is when you develop a system that lets people find relevant information more quickly.
- Developers who want blockchain data had to choose between
  - quick performance (compromising decentralization), or
  - decentralization (compromising performance).
- An analogy for a blockchain is a pile of items in an Amazon warehouse.
  - The biggest issue you’d face would be finding the item you want. The more recently it was thrown into the pile, the more likely it is that you’d find it.
- Subgraphs are digestible datasets made from blockchain data.

What Are The Three Roles Played In The Graph?

The Graph solves both the indexing problem and the decentralization-versus-speed dilemma by decentralizing the entire process. It splits the work into three different roles. Each role helps to make useful blockchain indexes because each can earn a profit from doing so. These three roles are Curators, Indexers, and Delegators.

Curators are the people who bet on which subgraphs will be in high demand. They bet GRT tokens (The Graph’s native token) on the subgraphs they think will be popular. The more popular a subgraph gets, the more GRT they earn. Keeping with our analogy, we’d say that curators are the people who see that summer is coming up, so they put down money on the subgraphs for slippers and boat shoes because they think these will become popular. Curators may be developers who’ve made a subgraph for their own purposes and want it to be indexed. They can also be end users who think a subgraph will be valuable and thus worth an upfront investment of GRT.

Indexers are the next group of people. They would be analogous to the small businesses on Amazon that use Amazon products but differentiate themselves through branding. They do three things to earn GRT tokens. The first is that they index relevant subgraphs. This means taking subgraphs and organizing them together, kind of like how Call It Spring brings shoes, boots, and slippers together in one store to sell. Indexers get GRT for indexing popular subgraphs, and prices for indexing are kept low because indexers compete with one another.

The second way indexers earn GRT is by serving queries on subgraphs. This would be the equivalent of Call It Spring looking through their computers to find out if there’s a size 13 in their system. The only difference would be that a customer would pay Call It Spring for every search. Thus, indexers earn money for looking up information in their subgraphs.

The third and final way that indexers make money is by running The Graph’s nodes with GRT at stake. The Graph is not a blockchain of its own; it is a protocol whose rules are enforced by smart contracts on Ethereum. Anyone who wants to run a node as an indexer needs to stake GRT, much as anyone who wants to be a validator for Ethereum needs to stake Ether. If indexers index and serve data correctly, then they earn GRT. If the data they serve is found to be incorrect, then their stake is slashed.

That was indexers. Last but not least are the delegators. They’re a lot like investors because they delegate their GRT to indexers, adding to those indexers’ stake, and they get a part of the indexers’ rewards in return. You as a random member of the public can become a delegator easily. You just need a crypto wallet such as MetaMask, some GRT, and an internet connection.
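Here is a rough sketch of how such a reward split could work. It assumes a simplified model in which the indexer keeps a fixed percentage and the rest is shared among delegators in proportion to the GRT each one delegated. The function name, the 20% cut, and all the amounts are hypothetical, and the real protocol has more moving parts than this.

```python
def split_rewards(total_grt, indexer_cut_percent, delegations):
    """Split rewards between an indexer and its delegators.

    Simplified assumption: the indexer keeps a fixed percentage, and the
    remainder is shared in proportion to each delegator's GRT.
    Expects at least one delegator with a non-zero amount.
    """
    indexer_share = total_grt * indexer_cut_percent // 100
    pool = total_grt - indexer_share
    total_delegated = sum(delegations.values())
    payouts = {
        name: pool * amount // total_delegated
        for name, amount in delegations.items()
    }
    return indexer_share, payouts


# Hypothetical numbers: 1,000 GRT of rewards and an indexer that keeps 20%.
indexer_share, payouts = split_rewards(
    total_grt=1_000,
    indexer_cut_percent=20,
    delegations={"you": 3_000, "someone_else": 1_000},
)
print(indexer_share)  # 200
print(payouts)        # {'you': 600, 'someone_else': 200}
```

In this toy model you supplied three quarters of the delegated GRT, so you receive three quarters of whatever the indexer passes on.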

How Exactly Does This Prevent The Trust Problem?

After learning all of the above, I couldn’t see how this elaborate setup solved the trust problem we went through earlier. What stopped indexers from modifying data for their own advantage?

I got different answers from different sources. Some YouTube videos say that there are groups of people called Fishermen and Arbitrators who decide whether the information provided is accurate or not. But a developer friend of mine says something else entirely. As a customer, you can apparently look at the code used to pull data from the blockchain into a subgraph. If the code looks like it will pull in the data that you need, then you don’t need to check the subgraph itself. The data is transparently collected.
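My friend’s point can be sketched in a few lines of Python. The idea is that if the indexing code is public and always gives the same output for the same input, anyone can rerun it on the same blockchain data and compare results. Everything below, including the `mapping` and `verify` functions and the data, is a hypothetical toy. It is not The Graph’s actual dispute process.

```python
import hashlib
import json

# Hypothetical public chain data: (sender, receiver, value in USD).
chain_data = [
    ("alice", "bob", 50),
    ("carol", "dave", 2_000_000_000),
    ("erin", "alice", 1_500_000_000),
]


def mapping(txs):
    """The open, published indexing code: addresses in $1B+ transfers."""
    whales = {addr for s, r, v in txs if v >= 1_000_000_000 for addr in (s, r)}
    return sorted(whales)


def fingerprint(result):
    """Hash a result so two answers can be compared cheaply."""
    return hashlib.sha256(json.dumps(result).encode()).hexdigest()


def verify(answer, txs):
    """Rerun the public code yourself and compare it with the answer given."""
    return fingerprint(answer) == fingerprint(mapping(txs))


honest_answer = mapping(chain_data)
tampered_answer = honest_answer + ["mallory"]  # an indexer slipping in an extra address

print(verify(honest_answer, chain_data))    # True
print(verify(tampered_answer, chain_data))  # False
```

Of course, rerunning the code yourself means doing the slow work you wanted to avoid, which is presumably why you would only spot-check.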

But if my developer friend is correct, then why not share your database and the code used to write it? I guess we’ll have to leave this as an open question.
