I work in an office with an in-house data science R&D team, so the terms 'precision' and 'recall' get thrown around a lot. These are important concepts for every product manager to understand, especially now that data is an important part of nearly every software product.
What is precision, what is recall, and why do I need to know the difference?
Before we define precision and recall, let's talk about the four possible outcomes of binary classification. Whoa... did I lose you there? I know it sounds dry, but I promise this is relevant, and once you finish reading this post, you'll think, 'I am so glad I learned this.'
Let's take the example of a cancer screening:
There are four possible outcomes.
- The test could come back positive, and it could be correct.
- The test could come back positive, and it could be incorrect.
- The test could come back negative, and it could be correct.
- The test could come back negative, and it could be incorrect.
In data science terms, these four outcomes of binary classification each have a name (Stay with me, people.):
- True positives: data points labeled as positive that are actually positive
- False positives: data points labeled as positive that are actually negative
- True negatives: data points labeled as negative that are actually negative
- False negatives: data points labeled as negative that are actually positive
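If it helps to see the four buckets as code, here is a minimal Python sketch that sorts a handful of screening results into them. The patient data is made up for illustration, and `count_outcomes` is a hypothetical helper, not part of any library.

```python
def count_outcomes(actual, predicted):
    """Count the four outcomes of binary classification.

    Both arguments are lists of booleans, where True means 'positive'.
    """
    counts = {"tp": 0, "fp": 0, "tn": 0, "fn": 0}
    for truth, guess in zip(actual, predicted):
        if guess and truth:
            counts["tp"] += 1  # labeled positive, actually positive
        elif guess and not truth:
            counts["fp"] += 1  # labeled positive, actually negative
        elif not guess and not truth:
            counts["tn"] += 1  # labeled negative, actually negative
        else:
            counts["fn"] += 1  # labeled negative, actually positive
    return counts


# Made-up results for eight patients (illustrative only, not real data).
actual = [True, True, True, False, False, False, False, False]
predicted = [True, True, False, True, False, False, False, False]

outcomes = count_outcomes(actual, predicted)
print(outcomes)
```

Every data point lands in exactly one bucket, so the four counts always add up to the total number of data points.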
What is the difference between precision and recall?
Rather than giving you the scientific definitions of these two terms, I'll give you what you really need to understand:
- Recall is about finding all of the relevant instances. In other words, 'high recall' means you're missing very few of them (few false negatives). The downside is that a system tuned for recall usually lets some other junk into the mix (false positives).
- Precision is about finding only relevant instances. In other words, 'high precision' means nearly everything you label as positive really is positive (few false positives). The downside is that a system tuned for precision usually misses some relevant instances (false negatives).
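For anyone who does want the math, the standard formulas are short: precision is true positives divided by everything labeled positive, and recall is true positives divided by everything that is actually positive. Here is a minimal Python sketch. The counts are made up, and returning 0.0 when there is nothing to divide by is my own assumption (conventions vary).

```python
def precision(tp, fp):
    # Of everything we labeled positive, what fraction was actually positive?
    labeled_positive = tp + fp
    # Assumption: report 0.0 when nothing was labeled positive.
    return tp / labeled_positive if labeled_positive else 0.0


def recall(tp, fn):
    # Of everything that was actually positive, what fraction did we catch?
    actually_positive = tp + fn
    # Assumption: report 0.0 when there were no actual positives.
    return tp / actually_positive if actually_positive else 0.0


# Made-up counts: 8 true positives, 2 false positives, 4 false negatives.
tp, fp, fn = 8, 2, 4

p = precision(tp, fp)  # 8 / (8 + 2)
r = recall(tp, fn)     # 8 / (8 + 4)
print(f"precision={p:.2f} recall={r:.2f}")
```

Notice that false positives only hurt precision and false negatives only hurt recall, which is why the two can move in opposite directions.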
The easiest way to understand precision and recall is to use our cancer screening example:
If you tune the test for high recall, you're catching nearly all of the patients that actually do have cancer. In other words, you're not missing anyone. The downside is that you'll probably have to tell a lot of patients they have cancer when they do not, since pushing recall up usually means getting more false positives.
If you tune the test for high precision, nearly every patient you identify as testing positive does actually have cancer, BUT you may miss some people that DO have cancer, since you are getting more false negatives.
Which is the better option? To have high recall and be incorrectly telling people they have cancer, when in fact they do not... OR missing some people that actually do have cancer? Both have different repercussions.
An example from my actual job:
The cancer example is a little depressing, so let's look at another real-life example where the stakes aren't quite as high.
Let's take the example of a not-safe-for-work (NSFW) model. You're feeding a bunch of images into the NSFW model, and as the PM on the project, you have to decide which images to pass through to customers in your data platform, assuming those customers don't want to see porn. Pretty standard use case.
If you lean towards high recall: You're catching nearly every sensitive image, but you're probably also labeling a ton of photos as NSFW that aren't actually gross/sensitive images. So you're filtering out potentially useful data from your end-users. Yikes.
If you lean towards high precision: You're filtering out sensitive images and only sensitive images, but some questionable images are definitely slipping through to your end-users, i.e., your end-users might end up seeing some gross images they don't want to see.
This brings up the product question: Which is more important?
- That your end-users see absolutely zero sensitive images, OR
- That you pass through as many useful images as possible, even though some might be sensitive?
There is no 'correct answer.' There is always a trade-off. It depends on your product, your customers' needs, and your product goals. However, as a product manager, you should be able to tell your engineering teams whether your goals lean towards higher precision or higher recall.
High Recall, Low Precision: The system is guessing 'yes' too much.
Low Recall, High Precision: The system is more conservative; it wants to be sure that when it does guess 'yes,' it's going to be correct.
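One common way this trade-off shows up in practice is as a score threshold: the model gives each image a score, and you decide how high the score must be before you say 'yes.' The Python sketch below assumes that setup. The scores, labels, thresholds, and the `precision_recall_at` helper are all made up for illustration and don't come from any real model.

```python
# Hypothetical NSFW scores (higher = more likely NSFW) and the true labels.
scores = [0.95, 0.90, 0.80, 0.60, 0.40, 0.30, 0.20, 0.10]
is_nsfw = [True, True, False, True, False, True, False, False]


def precision_recall_at(threshold, scores, labels):
    """Flag every image scoring at or above the threshold, then grade the result."""
    tp = fp = fn = 0
    for score, label in zip(scores, labels):
        flagged = score >= threshold
        if flagged and label:
            tp += 1  # correctly flagged
        elif flagged and not label:
            fp += 1  # useful image filtered out by mistake
        elif not flagged and label:
            fn += 1  # sensitive image that slipped through
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# A low threshold guesses 'yes' a lot; a high threshold is conservative.
low = precision_recall_at(0.25, scores, is_nsfw)
high = precision_recall_at(0.85, scores, is_nsfw)
print("low threshold  (precision, recall):", low)
print("high threshold (precision, recall):", high)
```

With this toy data, the low threshold catches every sensitive image but also flags two harmless ones, while the high threshold never flags a harmless image but lets half of the sensitive ones through.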
Can you have both high precision and high recall? Yes.
This goes beyond the 'basics' that I promised in this post, but you could design a system that is the right amount of conservative but also willing to make guesses at the right time. I may write another post delving into this further.