the scapegoat dev

Learning and practicing abstraction

A few days ago, I wrote an article about why I am learning category theory, which made it to the Hacker News front page and thus put a lot of eyes on the article. I enjoy it when that happens because I know that there will be an active comment thread with civil, if often heated, debate. In this case, the discussion was fantastic, and I got many additional resources, links, and inspiration from the responses.

A few comments were, however, along the lines of "what does category theory bring me as a developer concretely," often also phrased as "you should study X instead; it's much more concrete and applicable."

Questions like these are genuinely hard to answer because there is nothing I can point to in my programming that I could explain as coming from category theory. All the code I write is plain old design patterns (I work in a language that uses the vocabulary from category theory only to obfuscate things, vs., say, Haskell, where I could reuse type class machinery). These design patterns can be more easily studied independently or using concepts from algebra or analysis.

After a few discussions and continuing to very slowly read through Eugenia Cheng's "The Joy of Abstraction" (a fantastic book of crystal-clear simplicity; I wonder why I even write this blog post when I could just point at the book), I think I can put into words what category theory brings to me as a developer and how it influences the way I program.

Category theory, the final step in the abstraction ladder

Category theory is less about being a "productive" version of mathematics, full of insights and theorems and proving novel properties about things, and more about providing a framework that allows us to abstract all the other mathematical objects that diverse disciplines study into "simpler" objects.

Computers are ultimately mathematical machines, and every aspect of writing and running software (think distributed systems, CPU caches, operating systems, not just lambda calculus) can be considered a mathematical object.

If we consider abstraction to be a ladder with many levels, where each level simplifies (or "erases") details of the levels below to expose their commonalities, category theory tries to position itself at the highest level. Categories form a category (or many categories) as well. Every time we form a new abstraction within the language of category theory, we can immediately move on to abstracting said new abstraction.

It's abstraction all the way up, dude.

Abstraction, the intangible beast

I have a hard time thinking about abstraction. As cognitive animals, we are designed to be abstraction machines. The natural world is infinitely complex, and it needs to be broken down into "simpler" (more abstract) concepts for our brains to reason about it.

I am not even an amateur neuroscientist, but I think it's probable that there is a difference between reasoning (carefully working things out) and intuiting (having a gut feeling or intuition about something). While reasoning requires us to abstract things, intuition is a much more diffuse thinking mode where the bandwidth from the natural world to cognition is potentially much broader. Intuition is fast, and reasoning is slow (these are ideas lifted from "Thinking, Fast and Slow" by Daniel Kahneman).

To turn slow thinking into fast thinking, abstractions need to be internalized. And once they are internalized, we cease to be conscious that we are working with abstractions. After all, our thinking is now intuitive and fast. We spend so much time learning to speak, to count, and as developers, what a function and a variable are, how HTTP works—yet these concepts are almost invisible to us once acquired. Of course, we speak words and form sentences. Of course, we use variables to store values. Of course, 1 + 1 = 2. Of course, HTTP GET 200.

Category theory as deliberate practice for abstraction

I like the act of forming abstractions and working with them until they become intuitive. Abstraction is why I am drawn to computers: I love seeing how things that seem different or tedious can be reformulated in a much more straightforward manner yet work just as well (or better). I love legacy software and extracting its inner meaning and cleaning it up, leaving the areas that can be messy and focusing on its foundational abstractions.

I spent a lot of time learning about many different abstractions. The easiest and most gratifying way for me is to learn other programming languages and write actual, big, real-world systems with them. The next best way is to learn some mathematics. But it wasn't until I started digging deeper into category theory that I realized all these things were about discovering and refining abstractions. Similar to how a musician practices their instrument (playing scales, trying out new fingerings, using a metronome), this is what category theory is for me: an instrument to do deliberate abstraction practice.

Abstraction is a double-edged sword. It makes the world simpler once the abstraction has been well understood (when it can be intuited). But getting to that point requires a lot of effort, and we often forget how much, since the very goal of abstraction is to make thinking effortless. Speaking in the language of abstraction makes it easy to exchange complex ideas, but it also makes it easy to alienate people who haven't formed the same abstractions.

Furthermore, abstraction is just that: abstract. Abstraction can't directly be applied to something concrete and, as such, can never stand on its own. I can intuit that a sewage system is just a series of connected pipes and is thus very similar to the internet, but I'll still have to learn how to change an O-ring if I want to stop my tap from leaking, and all my networking knowledge won't stop me from fucking it up either. However, I might very well be able to provide some thoughtful advice about where a lack of pressure might be coming from, since I have a good grasp of bandwidth, back pressure, and other concepts (no idea if this makes any sense because I have no idea how plumbing works, but it sounds cool).

Why people think abstraction is useless

I think so many people believe that learning more abstract things is useless because they haven't realized that all the things they know were abstract to them at some point. Because learning is rarely framed as "learning an abstraction for X, as well as its practical application" and instead just called "learning X," it causes people to write responses such as "you could learn all this from Y instead," where Y is a lower rung on the ladder of abstraction (say, abstract algebra, or software design patterns, or "just write some CRUD code, who needs that fancy stuff anyway").

These people learned X, internalized its abstractions, and think that's all there is to it because, after all, they can get their work done just fine. I call this anti-intellectualism because I think it is anti-intellectual to tell other people not to engage in whatever tickles their interest because it is "useless." Things are only useless to the person who has no use for them, and what is joy if not something useful? This attitude stems from not having been exposed to the idea that abstraction is a ladder that can be climbed at different speeds and to different heights, that some people genuinely enjoy working at more lofty heights, and that everybody needs to come down to get real work done.

I love category theory because it makes my abstraction muscles stronger. This means that I can make the world I work with (software, computers, networks) simpler by stirring it all into one giant "everything is just a monad, yo" melting pot, and then, through sheer professional practice and muscle memory, bring it back to the real world and write it down as a line of ugly PHP.

The PHP I write today solves the same problem that I used to solve 20 years ago: a user sends a request, I transform their request into some query to storage, and I return the result and handle errors while doing so.

The value of category theory is that I find more and more ways to break down an absurd variety of problems into exactly that formulation: response = lookup(request). It doesn't teach me anything about what response or = or lookup or request or () is; I need to work these things out using other resources (and often by just writing a lot of code).
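To make the shape response = lookup(request) a little more tangible, here is a minimal sketch. It is in Python rather than PHP purely for illustration, and every name in it (Response, make_lookup, lookup_user, lookup_sqrt) is made up for this example. It is not how any particular system is written, just one way the same formulation could wrap two unrelated problems.

```python
import math
from dataclasses import dataclass
from typing import Callable, Generic, Optional, TypeVar

Req = TypeVar("Req")
Res = TypeVar("Res")


@dataclass
class Response(Generic[Res]):
    """Hypothetical result wrapper: either a value or an error message."""
    value: Optional[Res] = None
    error: Optional[str] = None


def make_lookup(fetch: Callable[[Req], Res]) -> Callable[[Req], Response[Res]]:
    """Turn any 'fetch' function into the shape response = lookup(request)."""
    def lookup(request: Req) -> Response[Res]:
        try:
            return Response(value=fetch(request))
        except (KeyError, ValueError) as exc:
            # Error handling lives in one place, whatever 'fetch' does.
            return Response(error=f"{type(exc).__name__}: {exc}")
    return lookup


# Two very different "storages" behind the same formulation (both assumed).
users = {1: "ada", 2: "grace"}
lookup_user = make_lookup(lambda user_id: users[user_id])
lookup_sqrt = make_lookup(lambda n: math.sqrt(n))

response = lookup_user(1)
```

The abstraction says nothing about dictionaries or square roots; those details still have to be worked out the usual way.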

That the formulation doesn't tell me anything about how it works in the real world is precisely why category theory both looks trivial and is so helpful. It means I can forget all the detail and form a naive, simple, intuitive understanding of what works and what doesn't.