Category Theory is an interesting theory all its own. But on top of that, I wanted to learn it because Constructor Theory is (according to Logan Chipkin’s interview with Chiara Marletto) being put into a Category Theory based grammar. (It currently uses Set Theory instead, which is related but not as powerful.)
I’m going to do a series of posts documenting the basics of Category Theory if for no other reason that to help me remember it. I’m mostly taking this from the book Category Theory for Programmers by Bartosz Milewski (CTFP). Learning Haskell was a bonus since I was struggling to understand the concepts without the programming language examples. Learn You a Haskell for Great Good by Miran Lipovaca is the best beginner’s guide to Haskell I’ve found so far.
All tutorials I’ve found so far on Category Theory have one main weakness, they describe the concepts but don’t offer examples. I’m going to try to correct that, but frankly I may not understand it correctly and my examples might be bad. As with everything, if you make a mistake, oh well. Go learn it on your own. Oh, and tell me what I get wrong so I can correct myself.
Objects and Arrows
The first most basic concept in Category Theory is objects and arrows. Yes, that’s what they are apparently called. “An object can be drawn as a circle or a point, and an arrow… is an arrow.” (CTFP, p. 2) Arrows are equivalent to functions and are also called “morphisms” apparently because they represent how an object in one category relates to an object in another category analogously. I.e. how it ‘morphs’ from one object to another via that function.
Category A: The positive integers
Category B: The positive even integers
A morphism / function between these two sets (We’ll call it f) is the function that maps each positive integer to it’s equivalently indexed even number. That function M would be described as:
b = f(a) = a * 2
And it would be drawn as something like:
A -f-> B (That’s supposed to be an arrow labeled f)
So if a = 1, i.e. the first positive integer in category A and we apply the morphism to it we get the first positive even integer of 2. So 1 -> 2; 2 -> 4, 3 -> 6, etc.
Category theory is at heart about composition. Composition is represented by chains of arrows or by showing a single arrow representing the whole chain.
Starting with the same Category A and B above, let’s add a new category:
Category C: The positive odd integers
What is the morphism/function from B to C? It might be:
c = g(b) = b – 1
This would be drawn:
B -g-> C
Now here is the question? Do we have a morphism from A to C?
Well, of course we do by composition of the morphism from A to B and then the one from B to C. Let’s draw it this way:
A -f-> B -g-> C
Now it’s not too hard to see that we could easily get from A to C by composing functions f and g, which we notationally write as “g o f” or “g . f”. (The book says to read this “g after f” to remind yourself that Category theory has right to left order of composition.)
So g o f = g(f(a)) = (a * 2) – 1
Which, if you do the math in your head, is a mapping between the positive integers to the odd integers. This could also be drawn simply as:
A -g o f -> C
Or something like that.
One extra concept here that matters, especially if you are programming this in something like Haskell, though this is part of Category Theory itself: functions have types of input and output and the inputs and outputs have to match. So f input something of type A and output something of type B and g input something of type B and output something of type C and you can compose f and g because the inputs and outputs match. (i.e. f outputs a type B which is the right input for type C).
Properties of Composition
There are two important properties of composition:
- Composition is associative. So: h o (g o f ) = (h o g) o f = h o g o f
- For every object A there is an arrow that loops from the object back to itself. This arrow is a “unit of composition” which means “when composed with an arrow that either starts at A or ends at A, it gives back the same arrow.” (p. 6) The unit arrow for object A is called idA (“identity on A”)
This implies that if f goes from A to B then f o idA = f and idB o f = f.
How does all this relate to functional programming? The book claims that composition is the essence of programming, i.e. “We decompose bigger programs into smaller problems.” (p. 8). Notice that conceptually a function/morphism/arrow is a pure function in functional programming.