Choosing loss functions

Choosing a loss function is an important step in setting up a well-designed machine learning task. It’s a choice that requires domain and business context. It also often requires some amount of technical experience. Finally, it’s something you probably don’t want to change too often or at all.

So, an up-front, somewhat irreversible decision that requires expertise and weigh-in from multiple disciplines. Super fun, right? Let’s talk about what goes into this kind of decision, what loss functions entail, and then how you can pick the best one. I’ll also have a list of common loss functions toward the bottom.
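To make the stakes concrete, here is a minimal sketch of how two common losses disagree about the “best” prediction on the same data. The delivery-time numbers and function names are made up for illustration. It relies only on the standard facts that the mean minimizes squared error and the median minimizes absolute error.

```python
from statistics import mean, median


def mean_squared_error(targets, prediction):
    """Average squared gap between one constant prediction and each target."""
    return sum((t - prediction) ** 2 for t in targets) / len(targets)


def mean_absolute_error(targets, prediction):
    """Average absolute gap between one constant prediction and each target."""
    return sum(abs(t - prediction) for t in targets) / len(targets)


# Hypothetical delivery times in minutes, with one extreme outlier.
delivery_minutes = [10, 12, 11, 13, 90]

# The constant prediction that minimizes squared error is the mean;
# the one that minimizes absolute error is the median.
best_for_mse = mean(delivery_minutes)
best_for_mae = median(delivery_minutes)

print("best under squared error:", best_for_mse)
print("best under absolute error:", best_for_mae)
```

Squared error lets the single outlier drag the prediction far from the typical delivery, while absolute error mostly ignores it. Which behavior is right depends on whether a rare 90-minute delivery is a disaster to guard against or noise to tolerate, and that is a domain question rather than a purely technical one.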

Continue reading “Choosing loss functions”

How to learn enough category theory to be good at Haskell

A lot of the most fascinating parts of practical (and deeply, deeply magical) Haskell programming take at least their names from concepts in category theory. So, maybe you have to learn some CT in order to be good at Haskell? Or, maybe at least you could learn some CT in order to be better at understanding Haskell? Or, at least then I’d understand what the heck a monad is, right?

(Or the parts of other language ecosystems which have followed this trend. I’m looking at you, Scala, OCaml, JavaScript, etc.)

I have to say, resoundingly, that none of those are true. At least not inherently and directly. If your goal is to better understand some of these perhaps-we-can-call-them advanced FP techniques then CT is not the most important thing for you to learn. I’ll explain, and then talk about what you might want to learn instead.

Continue reading “How to learn enough category theory to be good at Haskell”

Model performance often degrades over time

An extremely painful, easily missed issue with machine learning products is that their performance will tend to degrade over time. Generally speaking, the best day of a new model’s life is its last day in development. Performance will likely take a hit the moment it hits production and slowly degrade from there. This is totally normal and simply something to prepare for as your data products become more and more highly developed.

The pain of lost opportunity can be subtle or dramatic. We often spend a lot of time developing data sources and inferential products. We struggle to get them to achieve strong performance in our lab tests. After spending all that time, it can be easy to hold high expectations for the model’s performance. Really, though, lab performance should be thought of as something closer to a soft upper bound on live model performance.

In practice, model performance can be severely impacted almost immediately. It can also degrade slowly over time in ways that are more subtle but leave just as big a gap. Even a very advanced model can perform no better than random if the context it has been deployed in changes significantly. Finally, model degradation is difficult and expensive to measure in the lab. It’s possible you won’t even know how bad the degradation problem will be until the model is live. It’ll just show up later in the bottom line.
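One way to keep degradation from only showing up in the bottom line is to compare live accuracy against the lab figure window by window. The following is a minimal sketch, assuming true labels eventually arrive for production predictions. The weekly numbers, the 0.05 tolerance, and the function names are all hypothetical.

```python
def accuracy(predictions, labels):
    """Fraction of predictions that match the true labels."""
    correct = sum(p == y for p, y in zip(predictions, labels))
    return correct / len(labels)


def find_degraded_windows(window_accuracies, lab_accuracy, tolerance=0.05):
    """Indices of windows whose accuracy sits more than `tolerance` below the lab figure."""
    return [
        i
        for i, live_accuracy in enumerate(window_accuracies)
        if lab_accuracy - live_accuracy > tolerance
    ]


# Hypothetical numbers: lab accuracy, then four weeks of live accuracy.
lab_accuracy = 0.92
weekly_accuracy = [0.90, 0.89, 0.85, 0.80]

degraded_weeks = find_degraded_windows(weekly_accuracy, lab_accuracy)
print("weeks needing attention:", degraded_weeks)
```

Treating the lab number as a ceiling rather than a target is the point of the comparison: the first weeks already sit a little below it, and the check only fires once the gap grows past the chosen tolerance.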

Continue reading “Model performance often degrades over time”

Checklists for data product staging

Naming the level of development of a data product is a matter of judgement and experience, but the following list of questions can help you develop that judgement and be consistent in its application. Feel free to use it as a starting point for your own checklist.

Data projects come in many shapes, sizes, and forms. If there are questions you use to judge a project’s maturity or if there’s a major aspect of a data project missing below, please email me and I’ll add it.

Continue reading “Checklists for data product staging”

Avoiding murky data science projects

Paying for data science projects can be stressful.

The dream is straightforward application of existing, high-power statistical and machine learning technologies to relevant, existing data. Flip a switch and out pop tools for better decision making or new products for your customers. The reality is that none of that is straightforward and, in the worst cases, you end up in research hell.

A big part of data science is learning and exploration. You may not know what you don’t know, and you may not know what opportunities exist in data that you have access to (or could easily get access to). So, when you charter a data science team to solve a business problem, you may be setting off on a long, murky journey.

Research hell is when these projects struggle to deliver, but stay tantalizing. You invest and invest and wait and wait and the project trudges on.

And on. And on. And on… Misery.

Now, instead of jumping on whole new opportunities born of your investment into data you’re nursing a murky plan and managing a distressed, disconnected long-term research team. Or, worse, left judging a science fair.

Continue reading “Avoiding murky data science projects”

Thinking in types

Types can cause a lot of pain. They tend to inflexibly demand a way of programming which isn’t always effective and can seriously slow down your work. There’s also a big ergonomics problem. Type systems are notorious for verbosity and truly awful error messages.

Yet, people who are familiar with them don’t often have the same feeling. On one hand, familiarity helps soften a lot of that pain. The bad error messages don’t seem so bad once you’re used to them. But there’s something else, too.

Types influence the way you think about programs, and that influence is actually what makes them valuable.

The way to become comfortable with types is to embrace that influence. It can change how you think about and communicate about programs you read and write. Types aren’t a silver bullet—there are problems today where it’s not clear whether they help—but they’re a really good tool to own.

Continue reading “Thinking in types”

The shapes of data

It can be a bit of a pain to think about the various “shapes of data” available across languages. Often in learning a new language you might seek a “Rosetta Stone”. It’s valuable to be able to translate common shapes from one syntax to another. At the same time, it’s not always possible to make all of these translations cleanly. It can seem like some shapes are available in some places and not others.

Let me share my Rosetta Stone. It’s more fundamental, and thus more widely applicable, than others. It’s also likely to be familiar to you because it’s exactly what you learned in high school algebra.

Continue reading “The shapes of data”

My favorite type theory book

I often see questions about what to read to learn type theory, and I recommend Harper’s Practical Foundations for Programming Languages (PFPL). To be clear, this book is not a good recommendation if you want to learn, more practically, how to think in types. It’s instead a deep dive into what types are, why they work, and how we design practical type systems which cover the features we see in many programming languages.

Continue reading “My favorite type theory book”