I hate CRUD a lot. But do you know what I hate more? GraphQL. GraphQL is the absolute bane of my existence. There is not a single word that can properly encapsulate just how much I despise opening an API spec and seeing GraphQL. Like don’t get me wrong, I’ve read Theory (the yearly HN articles on why GraphQL is the Next Big Thing) but it still does nothing to combat the sheer annoyance I have at GraphQL.
I get it, your REST API sucks
GraphQL wasn’t really meant to be “this is better than REST in every way”, and I think the fact that it gets pitched that way is more an accident than anything else. GraphQL is simple: You send a query, and it sends back exactly the data you asked for. This is great when you have extremely complex data operations (like you are Facebook needing to manage streaming tons of different user data in different circumstances), or when you need the ability to co-fetch or stop N+1 fetching problems.
So in theory, I’m all for that. I hate when I have to make several requests to the same API, and GraphQL gives you an easy way to prevent that. Take the Stripe API for product pricing as an example: you first need to make a fetch request to get the ID of the Price object, then another fetch to get the Price object itself. Latency stacks, especially in SSR contexts. In GraphQL, that is one fetch (which is how the BigCommerce product API handles it). The theory here is twofold: You can be more generalized with how you define objects and relations, and you can save on server-to-server latency by combining things into one request.
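To make the shape of that difference concrete, here is a minimal sketch. Everything in it is an assumption for illustration: the `fetchJson` transport, the paths, the query, and the field names (`defaultPriceId`, `unitAmount`) are hypothetical and are not the real Stripe or BigCommerce schemas.

```typescript
// Hypothetical transport: takes a path or a query string, resolves to parsed JSON.
type FetchJson = (request: string) => Promise<any>;

// REST style: the second request needs an ID from the first, so the two
// round trips run back to back and their latency adds up.
async function priceViaRest(fetchJson: FetchJson, productId: string): Promise<number> {
  const product = await fetchJson(`/products/${productId}`);
  const price = await fetchJson(`/prices/${product.defaultPriceId}`);
  return price.unitAmount;
}

// GraphQL style: one round trip; the server follows the relation itself.
async function priceViaGraphql(fetchJson: FetchJson, productId: string): Promise<number> {
  const query = `{ product(id: "${productId}") { defaultPrice { unitAmount } } }`;
  const result = await fetchJson(query);
  return result.data.product.defaultPrice.unitAmount;
}
```

The REST version cannot run its two requests in parallel because the second URL depends on the first response, which is exactly where the stacking comes from.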
But in practice, I think more often than not, this is used to wave away poor documentation. Oh, here is your list of queries and mutations, you can figure it out! Who needs examples, here is a playground where you can view a flow chart nobody will understand, then compose your query! There is a better way to do that? Oh who cares. Particularly, I think it ends up as an excuse to not think about data relations or object models as much, since you can just combine things as needed. But at a certain point, you still have to, and this is where you start to see some of the cracks: Fetches several levels deep, pagination horrors, you name it. You end up 6 curly braces deep in a fetch, trying to figure out how to link this and that and whatever nonsense you need to properly fetch everything in one go. Well-designed GraphQL interfaces take way more effort than well-designed REST APIs, and badly designed GraphQL interfaces are, in a way, worse than a badly designed REST API. I hate to use this cursed phrase, but REST can be fairly self-documenting (especially CRUD). GraphQL, by nature, is the opposite of self-documenting: Nobody knows what your query is meant for unless it is well documented.
(Not) Cache Money
One of the most significant flaws with GraphQL, in my opinion, is caching. Caching is the lifeblood of the internet; without tiered caches, we would see TTFBs explode. The problem is that GraphQL, unless you do a lot of work, is extremely difficult to cache.
There are a few approaches to caching GraphQL that are popular, and IMO all of them kinda stink.
- Just stick the query and response in a KV. This means that you can’t cache individual objects, so a slightly different query can’t be partially cached.
- Cache at the object layer. This involves deserializing the request, sticking the objects in some sort of KV, and then grabbing them when you need and sending out partial requests.
- Implement a GraphQL server in your cache and only send partial requests for information you need.
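As a rough illustration of the first approach in the list, here is a minimal sketch of a whole-query cache. The names `queryKey` and `cachedQuery` are hypothetical, the `Map` stands in for whatever KV you use, and the whitespace normalization is deliberately naive (it is not a GraphQL parser and would also squash whitespace inside string arguments).

```typescript
// Stand-in for a real KV store: normalized query text -> serialized response.
const queryCache = new Map<string, string>();

// Collapse whitespace so pure formatting differences don't cause misses.
function queryKey(query: string): string {
  return query.replace(/\s+/g, " ").trim();
}

// Return the cached response for a query, or run it and remember the result.
function cachedQuery(query: string, run: (q: string) => string): string {
  const key = queryKey(query);
  const hit = queryCache.get(key);
  if (hit !== undefined) return hit;
  const response = run(query);
  queryCache.set(key, response);
  return response;
}
```

With this, `{ product { name } }` and `{ product { name price } }` are two unrelated cache entries. The second query goes all the way to the origin even though part of the answer is already sitting in the cache.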
That last one is implemented by a few major GraphQL libraries for you, but honestly, this is a horrible approach to caching. In general, I hate black box caches that don’t let you drop down to a basic KV; it means you are dependent on their software in order to maintain your cache infrastructure. When working with serverless stacks like Cloudflare, that can be pretty difficult.
Compare this to CRUD-type REST APIs, where caching is usually pretty simple. For GETs, you can cache the request URL and response in KV. If you need to differentiate by headers (like for user accounts), you can either add this to the key, or create a new mini cache. Partial updates aren’t really a thing, since each endpoint by design will return just its data model.
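A minimal sketch of that REST cache key, assuming a hypothetical helper called `restCacheKey` and an illustrative key format. In a real system you would probably hash the header value rather than store a raw token in the key.

```typescript
// Build a cache key from the method, URL, and any headers the response varies on.
function restCacheKey(
  method: string,
  url: string,
  headers: Record<string, string>,
  varyOn: string[] = []
): string {
  // Header names are case-insensitive, so normalize them before lookup.
  const lowered: Record<string, string> = {};
  for (const name of Object.keys(headers)) {
    lowered[name.toLowerCase()] = headers[name];
  }
  // Sort the varying header names so their order never changes the key.
  const varyPart = varyOn
    .map((name) => name.toLowerCase())
    .sort()
    .map((name) => `${name}=${lowered[name] !== undefined ? lowered[name] : ""}`)
    .join("&");
  return varyPart ? `${method} ${url} | ${varyPart}` : `${method} ${url}`;
}
```

A public endpoint gets a key like `GET /products/1`, while a per-user endpoint called with `varyOn` set to `["authorization"]` gets one key per token.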
This also exposes another weakness of GraphQL: If-Modified-Since and Cache-Control headers. GraphQL doesn’t have a neat standardized way of doing either of these, which means a server has no standard way to return fast based on the headers. Cache-Control isn’t as common as it should be, but If-Modified-Since and Last-Modified are staples of good API design, with the paradigm of returning fast when the server doesn’t need to lock the DB and do work. Being able to communicate the TTL of the returned data is extremely good practice for CRUD as well, even if it isn’t always done. I know it seems like I’m giving a win to CRUD/REST when a lot of APIs don’t go the extra mile, but GraphQL doesn’t really have any good way of doing this, so I think it’s fair.
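Here is a small sketch of what returning fast looks like on the REST side. The `respond` function, the `CachedResponse` shape, and the `loadBody` callback (standing in for the expensive DB work) are all hypothetical names for illustration; only the header names and the 304 status come from HTTP itself.

```typescript
interface CachedResponse {
  status: number;
  headers: Record<string, string>;
  body: string | null;
}

function respond(
  ifModifiedSince: string | undefined,
  lastModified: Date,
  maxAgeSeconds: number,
  loadBody: () => string
): CachedResponse {
  // HTTP dates only carry whole seconds, so compare at that resolution.
  const modifiedAt = Math.floor(lastModified.getTime() / 1000) * 1000;
  const headers = {
    "Last-Modified": new Date(modifiedAt).toUTCString(),
    "Cache-Control": `max-age=${maxAgeSeconds}`,
  };
  const since = ifModifiedSince === undefined ? NaN : Date.parse(ifModifiedSince);
  if (!isNaN(since) && modifiedAt <= since) {
    // Return fast: nothing changed, so skip the expensive load entirely.
    return { status: 304, headers, body: null };
  }
  return { status: 200, headers, body: loadBody() };
}
```

The point of the 304 branch is that `loadBody` never runs, so an unchanged resource costs a timestamp comparison instead of a database read.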
Server Latency
By far, one of the things that annoys me the most about GraphQL is the server latency that it adds to process requests. This isn’t even about the checks that need to be in place to not overspend resources; the actual time that most processors take to respond to a single query can get very high.
Take the previous Stripe vs BigCommerce example. It’s not exactly the same, since I don’t know how they cache or are peered internally, but pretty consistently, doing two Stripe API calls is faster than one BigCommerce GraphQL call. If that isn’t enough though, BigCommerce also has a (badly documented) CRUD API, and doing a GET from that is faster than an equivalent GraphQL fetch. The difference is up to 100ms, too!
A lot of this also comes down to the actual underlying scope of what needs to be checked: Each additional object and query leads to more possibilities and more work for the parser. We don’t have functional programming GraphQL parsers in wide use yet, so re-parsing queries where the user requests the same values almost every time is just wasted processing time and bandwidth. In general, unless a REST API will result in significant over-bandwidth usage, it’s probably not worth the extra TTFB latency to add GraphQL. 100ms adds up fast, and you can transfer a lot of data in that time.
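For a rough sense of scale on that last point, here is a back-of-the-envelope sketch. The function name and the 100 Mbit/s link speed are assumptions for illustration, and the math ignores real-world effects like TCP slow start and protocol overhead.

```typescript
// How many bytes fit through a link of the given speed in the given time.
function bytesTransferable(megabitsPerSecond: number, milliseconds: number): number {
  // bits = Mbit/s * 1e6 * (ms / 1000); bytes = bits / 8
  return (megabitsPerSecond * 1e6 * milliseconds) / (8 * 1000);
}

// 100 ms on an assumed 100 Mbit/s link:
const extraBytes = bytesTransferable(100, 100); // 1250000 bytes, about 1.25 MB
```

Under those assumptions, a REST response would need to carry over a megabyte of unneeded data before trimming it saves more time than 100ms of extra processing costs.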
I think GraphQL has a place in programming, and it’s with very specific applications. But I also think about 80% of the time, your app does not need GraphQL, and all GraphQL is doing is adding complexity on everyone’s end. Or maybe I’m just crazy, and this is a bad reaction to having to work with the BigCommerce GraphQL API so much. Who knows.