Edit: I changed some wording to avoid potential bad connotations.
You don’t have to wander far into the functional programming world before you run into references to “purity”. Worse, you often hear it as an admonishment:
“You should factor your code to use pure functions.” Or, “this function is complicated because it is impure.” At the very edges there are even ways to write entire programs as “pure functions”. Those familiar with that practice have a tendency to lay blame for every fault at the feet of impurity.
Even the word is pretty polarizing. It suggests that your impure, dirty code is full of lurking issues and sits far away from some important ideal. But that’s a tragic point of view. Plenty of “impure” code is still simple, clear, and extremely valuable. The implied judgement is misguided at best.
But before we can talk about the tradeoffs of using pure functional style, we need to understand it. What does it mean for a function to be pure anyway?
What does it mean to be a pure function?
“Purity” is a quality that applies to a function. It means that the function uses no information not provided by its arguments and emits no information except its return value. In less abstract terms, it is a dumb pipe concerned exclusively with what goes in and out.
But, no, even dumber than that. A pure function cannot change its behavior over time. You might imagine a flow meter accumulating data over time on a physical dumb pipe, but that is too smart. A “function” like that depends not only on its inputs, but also its previous state. A pure function remains fixed throughout time.
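To make the flow meter comparison concrete, here is a minimal sketch in Python. The names `scale` and `FlowMeter` are made up for illustration; the point is only that the second one remembers its past.

```python
def scale(volume_litres: float, factor: float) -> float:
    """Pure: the result depends only on the two arguments."""
    return volume_litres * factor


class FlowMeter:
    """Impure: each reading also depends on everything recorded before."""

    def __init__(self) -> None:
        self.total_litres = 0.0

    def record(self, volume_litres: float) -> float:
        self.total_litres += volume_litres  # hidden state changes over time
        return self.total_litres


meter = FlowMeter()
first = meter.record(2.0)
second = meter.record(2.0)  # same argument, different answer
```

Calling `scale` with the same arguments gives the same answer every time, while `record` gives a new answer on each call even though its argument never changed.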
The reason pure functions are interesting at all is that we’re looking for just about the dumbest, most austere notion of programming possible. Dumb, simple code has a harder time getting away from us. In a sense, pure functions play a similar role to Turing Machines as being incredibly simple and still universal. Functional programming is the practice of noticing just how far this utterly stripped down system can go.
Calling pure functions is “side effect” free
Pure functions are also characterized as functions which cause no side effects when they are called.
This description can sometimes be very intuitive. For instance, a function which prints out a message or causes a robot arm to crack an egg is clearly causing an effect. Calling either one clearly conveys information beyond just its return value. If it didn’t, we’d consider these functions to be broken! Moreover, both of these functions plausibly don’t have return values at all (or only trivial ones). A pure function without a return value might as well never get called.
This example is straightforward because we see the “side effects” as being the principal point of writing these sorts of functions. We wrote them to impact the world and expect to see them “convey information” through that channel.
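As a small illustration of the printing case, here is a sketch with two hypothetical functions: one that only builds a message, and one whose entire job is the effect.

```python
def greeting(name: str) -> str:
    """Pure: builds the message and hands it back to the caller."""
    return f"Hello, {name}!"


def greet(name: str) -> None:
    """Impure: the whole point is the effect of writing to standard output."""
    print(greeting(name))


greet("Ada")
```

Note that `greet` returns nothing useful. If printing were not an effect we cared about, there would be no reason to call it at all.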
More challenging side effects
Side effects aren’t always so cut and dried. Some more challenging examples might be:
- Reading from a global variable (definitely a side effect)
- Reading from a global constant (not a side effect!)
- Reading from a global variable that you promise will never change (depends on whether you keep your promise)
These are tricky because changing global information may allow a function’s behavior to evolve over time, making it responsive to more than just its inputs. But, there isn’t a hard and fast rule here.
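The global variable cases can be sketched like this. The names `SHIPPING_FEE` and `discount` are invented for the example, and Python does not enforce constants, so the “constant” here is exactly the kind of promise described above.

```python
SHIPPING_FEE = 5  # a constant by agreement: nothing ever reassigns it
discount = 0      # a global variable that other code may change


def with_shipping(price: int) -> int:
    """Pure as long as everyone keeps the promise about SHIPPING_FEE."""
    return price + SHIPPING_FEE


def with_discount(price: int) -> int:
    """Impure: the answer depends on the current value of `discount`."""
    return price - discount


before = with_discount(100)
discount = 30
after = with_discount(100)  # same argument, different answer
```

Nothing about the two function bodies looks different. The only thing separating them is whether the global they read is allowed to change.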
- Writing to a log (side effect, but harmless?)
- Taking no arguments and returning 2+2 (totally pure!)
- Changing the CPU state machine as necessary in order to evaluate 2+2 (obviously a side effect… right?)
- Changing the cache state of your CPU so that the second time you call a pure function it runs more quickly (wait wait wait…)
These examples take this tension to an even greater extreme. We might not care about information leaking through some benign side channels. We might choose to ignore the details of some abstraction—in this case, the CPU—and thus deliberately ignore some side effects. We might also ignore side effects which mostly serve to improve speed or convenience because we’re more concerned with correctness.
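The caching case has a close analogue in ordinary code. Here is a minimal sketch using `functools.lru_cache` from the Python standard library; the function `slow_square` is made up and stands in for something expensive.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def slow_square(n: int) -> int:
    """Returns n * n; the cache it fills is an effect we agree to ignore."""
    return n * n


first = slow_square(12)
second = slow_square(12)  # answered from the cache this time
info = slow_square.cache_info()
```

The second call changed nothing about the answer, but `info.hits` shows that the first call did leave a trace behind. Whether that trace counts as a side effect depends on whether you decide to look at it.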
In any case, side effects are a useful tool for understanding purity. Simultaneously, purity is an interesting tool for understanding side effects. The way you divide between them underscores major values you bring to bear while programming.
Purity is an agreement
Which sort of gets to the heart of things: purity is an agreement. There isn’t a hard and fast rule for it as it will always depend upon what we agree to care about and what we agree to ignore.
We can now reexamine the definition I gave earlier in a new light: a pure function uses no information not provided by its arguments and emits no information except through its return value. But here we have to choose what we mean by “information”. It’s typical and standard to ignore information such as “how quickly did my function compute” and “what was the exact sequence of CPU instructions executed”, but that’s just a choice we make.
This gives a whole new light to conversations about purity, too. When someone declares that a function is pure or otherwise, they’re implicitly talking about what information they want to pay attention to and what information they’re willing to ignore.
A pure function is totally in the hands of the caller
Ultimately, what purity boils down to is a contract: you call me with my expected arguments and I’ll give you back an answer. After that, it’s entirely up to you, the caller, what happens. A pure function will do no funny business, no back channeling, no side deals. You ask it a question, it provides an answer, and that answer won’t ever budge. If you ignore that answer then it’s exactly as though you never called the function in the first place.
That’s a big part of why people get so excited about pure functions. They are totally reliable partners in whatever work you need to get done.
Your new superpower: seeing purity and side effects
So now you have the beginnings of a new superpower: a refined taste for whether a function is pure or whether it will cause side effects when called. Not only will this be a foundation for understanding the point of functional programming, it’s a tool you can already use as you write programs.
If this is the first time you’ve ever dug in and learned what a pure function really is then you might be surprised to find that you already write them regularly. Most code bases accumulate at the very least a “utility drawer” of simple functions that are often pure transformation functions.
Other times people subconsciously tend toward pure functionality because it’s trivial to test. Since a pure function cares only about its input and produces information only in its output, there’s no need to set up context before your tests or tear it down afterward.
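Here is roughly what that looks like in practice, using a hypothetical `slugify` helper of the sort that tends to live in a utility drawer.

```python
def slugify(title: str) -> str:
    """Pure: lowercases the title and joins its words with hyphens."""
    return "-".join(title.lower().split())


def test_slugify() -> None:
    # No fixtures, mocks, or teardown: just inputs and expected outputs.
    assert slugify("Hello World") == "hello-world"
    assert slugify("  Purity   is an Agreement ") == "purity-is-an-agreement"


test_slugify()
```

The whole test is a list of inputs next to the outputs you expect, which is about as cheap as testing gets.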
Try out your new superpower on code you’re already familiar with in your next project. Just take notice of the pure functions you write automatically, or go several steps further and see if it’s possible to factor your code into pure and impure pieces.
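One way that factoring might look, as a sketch rather than a recipe: keep the decision-making in a pure function and push the effect into a thin wrapper. The names `summarize` and `report` are invented for this example.

```python
import sys
from typing import Iterable, TextIO


def summarize(numbers: Iterable[int]) -> str:
    """Pure piece: turns numbers into a report string."""
    values = list(numbers)
    if not values:
        return "no data"
    return f"count={len(values)} total={sum(values)}"


def report(numbers: Iterable[int], out: TextIO = sys.stdout) -> None:
    """Impure piece: the only place that touches the outside world."""
    out.write(summarize(numbers) + "\n")


report([1, 2, 3])
```

All of the logic worth testing now lives in `summarize`, and `report` is small enough that there is little left in it to get wrong.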
The more you use this superpower, the more refined your sense of which side effects are ignorable will become. You’ll also develop a taste for when “purifying” a section of code can make it simpler, easier to read, and more maintainable.