Choosing a loss function is an important step in setting up a well-designed machine learning task. It’s a choice that requires domain and business context. It also often requires some amount of technical experience. Finally, it’s something you probably don’t want to change too often or at all.
So: an up-front, somewhat irreversible decision that requires expertise and input from multiple disciplines. Super fun, right? Let’s talk about what goes into this kind of decision, what loss functions entail, and then how you can pick the best one. I’ll also have a list of common loss functions toward the bottom.
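To make the stakes concrete, here’s a toy sketch (my own illustration, not from the post): even for a constant prediction, the loss function decides which prediction counts as “best”. The mean minimizes squared error while the median minimizes absolute error, so a target with outliers pulls the two choices apart.

```python
import statistics

def mse(y_true, pred):
    """Mean squared error of a constant prediction: large residuals dominate."""
    return sum((t - pred) ** 2 for t in y_true) / len(y_true)

def mae(y_true, pred):
    """Mean absolute error of a constant prediction: more robust to outliers."""
    return sum(abs(t - pred) for t in y_true) / len(y_true)

y = [1.0, 2.0, 3.0, 100.0]           # one large outlier
mean_pred = statistics.mean(y)       # 26.5
median_pred = statistics.median(y)   # 2.5

# MSE prefers the mean; MAE prefers the median.
print(mse(y, mean_pred), mse(y, median_pred))
print(mae(y, mean_pred), mae(y, median_pred))
```

Whether you want the outlier to dominate is exactly the kind of domain and business question the choice of loss encodes.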
Continue reading “Choosing loss functions”
A lot of the most fascinating parts of both practical (and deeply, deeply magical) Haskell programming take at least their names from concepts in category theory. So, maybe you have to learn some CT in order to be good at Haskell? Or, maybe at least you could learn some CT in order to be better at understanding Haskell? Or, at least then I’d understand what the heck a monad is, right?
I have to say, resoundingly, that none of those are true. At least not inherently and directly. If your goal is to better understand some of these perhaps-we-can-call-them advanced FP techniques then CT is not the most important thing for you to learn. I’ll explain, and then talk about what you might want to learn instead.
Continue reading “How to learn enough category theory to be good at Haskell”
Category theory is really cool. You should study it for fun (though not, directly, to be good at programming).
If you’re interested in studying it for fun, I recommend the following resources.
Continue reading “How to learn enough category theory to have fun”
An extremely painful, easily missed issue with machine learning products is that their performance will tend to degrade over time. Generally speaking, the best day of a new model’s life is its last day in development. Performance will likely take a hit the moment it hits production and slowly degrade from there. This is totally normal and simply something to prepare for as your data products become more and more highly developed.
The pain of lost opportunity can be subtle or dramatic. We often spend a lot of time developing data sources and inferential products. We struggle to get them to achieve strong performance in our lab tests. After spending all that time, it can be easy to hold high expectations for the model’s performance. Really, though, lab performance should be thought of as something closer to a soft upper bound on live model performance.
In practice, model performance can be severely impacted almost immediately. It can slowly degrade over time in ways that are more subtle but leave just as big of a gap. Even a very advanced model can perform no better than random guessing if the context it has been deployed in changes significantly. Finally, model degradation is difficult and expensive to measure in the lab. It’s possible you won’t even know how bad the degradation problem will be until the model is live. It’ll just show up later in the bottom line.
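A minimal simulation of the idea (purely illustrative; the drift size and threshold are made up): a model frozen at development time keeps its decision rule while the feature distribution it sees in production shifts underneath it, and accuracy falls.

```python
import random

random.seed(0)

def make_data(n, shift=0.0):
    """Label is 1 when the latent quantity exceeds 1.0; the observed
    feature drifts away from that latent quantity by `shift`."""
    xs, ys = [], []
    for _ in range(n):
        latent = random.gauss(0.0, 1.0)
        xs.append(latent + shift)            # observed feature drifts
        ys.append(1 if latent > 1.0 else 0)  # true label does not
    return xs, ys

def accuracy(xs, ys, threshold=1.0):
    """A frozen 'model': predict 1 when the observed feature > threshold."""
    preds = [1 if x > threshold else 0 for x in xs]
    return sum(p == y for p, y in zip(preds, ys)) / len(ys)

lab_x, lab_y = make_data(10_000, shift=0.0)    # development data
live_x, live_y = make_data(10_000, shift=0.5)  # drifted production data

lab_acc = accuracy(lab_x, lab_y)
live_acc = accuracy(live_x, live_y)
print(f"lab: {lab_acc:.3f}  live: {live_acc:.3f}")
```

The model itself never changed; only the world did. That gap is invisible in lab tests built from pre-drift data.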
Continue reading “Model performance often degrades over time”
Types can cause a lot of pain. They tend to inflexibly demand a way of programming which isn’t always effective and can seriously slow down your work. There’s also a big ergonomics problem. Type systems are notorious for verbosity and truly awful error messages.
Yet, people who are familiar with them don’t often have the same feeling. On one hand, familiarity helps soften a lot of that pain. The bad error messages don’t seem so bad once you’re used to them. But there’s something else, too.
Types influence the way you think about programs, and that influence is actually what makes them valuable.
The way to become comfortable with types is to embrace that influence. It can change how you think about and communicate about programs you read and write. Types aren’t a silver bullet—there are problems today where it’s not clear whether they help—but they’re a really good tool to own.
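One small example of the influence I mean, in Python’s optional type hints (my own illustration; the function names are hypothetical): the same lookup can advertise its failure mode in its type, which changes how a reader thinks about callers before reading any of them.

```python
from typing import Optional

def find_index_sentinel(xs: list, target) -> int:
    """Sentinel style: -1 means 'not found', but the type doesn't say so."""
    for i, x in enumerate(xs):
        if x == target:
            return i
    return -1

def find_index_typed(xs: list, target) -> Optional[int]:
    """Typed style: Optional makes 'might be absent' part of the signature,
    so a checker can flag callers that forget to handle None."""
    for i, x in enumerate(xs):
        if x == target:
            return i
    return None
```

Both run identically on the happy path; the difference is entirely in what the signature makes you think about.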
Continue reading “Thinking in types”
It can be a bit of a pain to think about the various “shapes of data” available across languages. Often in learning a new language you might seek a “Rosetta Stone”. It’s valuable to be able to translate common shapes from one syntax to another. At the same time, it’s not always possible to make all of these translations cleanly. It can seem like some shapes are available in some places and not others.
Let me share my Rosetta Stone. It’s more fundamental, and thus more widely applicable, than others. It’s also likely to be familiar to you, because it’s built from exactly the same things you learned in high school algebra.
Continue reading “The shapes of data”
I often see questions about what to read to learn type theory, and I recommend Harper’s Practical Foundations for Programming Languages (PFPL). To be clear, this book is not a good recommendation if you want to learn, more practically, how to think in types. It’s instead a deep dive into what types are, why they work, and how we design practical type systems that cover the features we see in many programming languages.
Continue reading “My favorite type theory book”