One Score to Rule Them All

Tom Boates

20 Jun 2023 • 5 min read

Why the "overall score" often comes up short

“It would be great if we could just have a single score to show users how they’re doing.”

Sound familiar? This idea has come up in some form at many of the clients I’ve worked with, from health and fitness to home maintenance to finance. The idea sounds great in theory: a way to simplify all of the complex data and scenarios into a single consumable number. The problem is that when projects I’ve worked on test or even ship something like this, there is almost always initial excitement, followed by confusion and frustration, and ultimately a lack of real results.

Here are some of the reasons a single score may not be the magic bullet solution you’re looking for, learned from many tried and failed attempts at implementing it.

“Does this represent where I am currently? What I’m doing currently? Or how I will be doing if I stay the course?”

The idea of scope and timeframe is the most common feedback across all versions of a score I’ve tested, especially when the score is meant to be a “live” metric that is shown persistently.

Where someone stands currently reflects the result of their past actions and behavior up to that moment. Once they create a plan and start changing their behavior for the better, that single number is often expected to adjust to factor in current actions and/or reinforce that they are doing the right things going forward. In most cases, meaningful change to this number takes a LONG time.

Consider a person’s weight. When a person steps on the scale with the intention of being healthier going forward, their weight isn’t just a reflection of where they are currently. It’s a reflection of the actions and behaviors that led them to this point. Sometimes after weeks of following a new diet plan, exercising regularly, and getting better sleep, their weight will have decreased only a small amount (if at all!) compared to the level of effort they’ve put in. This creates a cognitive dissonance: even though that weight is an accurate representation of their current state, it no longer reflects their current actions and behaviors.

An overall health score may be more “rounded” than that because it can factor in other things like body measurements, resting heart rate, calorie intake, and sleep, but the same general principle remains true. If the score change is small compared to the effort being put in, it causes confusion and the score ceases to feel like a proper representation of the user.

“Good” is often different for everyone

Body Mass Index (BMI) is an example of an “overall score.” According to the CDC, it is calculated as a person's weight in kilograms divided by the square of their height in meters (or, in US units, weight in pounds divided by the square of height in inches, multiplied by 703). This number then sorts you into categories like “healthy weight,” “overweight,” and “obese”.

The average linebacker in the NFL is 6’2” and 240 lbs. According to this equation their BMI would be 30.8 and they’d be considered “obese,” even though it’s more likely they have a lot more muscle mass than excess fat.
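To make the arithmetic concrete, here is a minimal sketch of the calculation. The function names are made up for illustration, the 703 conversion factor and the adult cutoffs (18.5, 25, 30) are the standard CDC values, and the category labels follow the wording used in this article.

```python
def bmi_metric(weight_kg: float, height_m: float) -> float:
    """BMI from metric units: kilograms divided by meters squared."""
    return weight_kg / height_m ** 2


def bmi_imperial(weight_lb: float, height_in: float) -> float:
    """BMI from US units: pounds divided by inches squared, times 703."""
    return 703 * weight_lb / height_in ** 2


def adult_category(bmi: float) -> str:
    """Map a BMI value to the standard adult weight-status category."""
    if bmi < 18.5:
        return "underweight"
    if bmi < 25:
        return "healthy weight"
    if bmi < 30:
        return "overweight"
    return "obese"


# The linebacker from the example: 6'2" is 74 inches, 240 lbs.
linebacker_bmi = bmi_imperial(240, 74)
print(round(linebacker_bmi, 1), adult_category(linebacker_bmi))  # 30.8 obese
```

Nothing in the formula knows about muscle mass, body composition, or sport, which is exactly why the resulting label can miss the mark for this person.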

Or consider credit cards. High usage of your revolving credit is often seen as a bad thing with regard to your credit score because it’s an indicator that you are likely to spend beyond your means. But many people use credit cards that have rewards programs so they can earn discounts or miles or cash back on every dollar they spend, and pay it all off every month. So is high usage really a negative indicator? Or only in certain instances?

In addition, “good” is relative. A credit score of 600 might be considered “poor” by standard measures, but someone might see it as “good” because they’ve been working hard over the past year to bring it up from a 520. Or even though 600 is labeled poor, it may be higher than the credit score of everyone else they know in similar circumstances, which makes it feel a little different.

The average of a data set that is too large or too diverse loses the necessary context and becomes meaningless. The average income of an American citizen is pointless because it includes extreme outliers like billionaires and a significant population of children who aren’t earning an income yet. It isn’t until you narrow the focus down and filter the data to something that is contextual and relevant (like a certain state, age group, profession, etc.) that the number actually has any meaning. The same is true for most "overall scores."

Why is this my score?

The first thing most people say when they see their score is “That’s great! But what does that mean?” Overall scores are almost always one abstraction too many on top of many weighted data points, and as a result there is no relatable unit of measure. This makes it difficult for people to truly understand the meaning of the number and why it is what it is. If I feel like I’m doing what I need to do, why is my score a 60? If I want to improve my score from a 60 to a 75, what do I need to do? What if the things that knock me down to a 60 are things I don’t care about?

The answer usually lies in surfacing the sub-scores used to calculate the overall score. For fitness, I may see that while I’m exercising regularly and keeping my calorie intake to a minimum, I’m not getting enough sleep. For finance, maybe I’m paying bills on time, contributing to my retirement plan, and have 3 months of emergency savings, but I have no life insurance. In either case, if all I have is an overall score of 60 out of 100, I have no useful information to help me improve.
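As a rough sketch of what surfacing sub-scores can look like, the snippet below computes an overall score as a weighted average and then points at the sub-score dragging it down the most. Everything here is hypothetical: the `SubScore` type, the sub-score names, their values, and the equal weights are invented for illustration and are not taken from any real scoring system.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SubScore:
    name: str
    value: float   # 0 to 100, higher is better (assumed scale)
    weight: float  # relative importance (assumed, not from a real model)


def overall_score(sub_scores: list[SubScore]) -> float:
    """Weighted average of the sub-scores."""
    total_weight = sum(s.weight for s in sub_scores)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive number")
    return sum(s.value * s.weight for s in sub_scores) / total_weight


def biggest_opportunity(sub_scores: list[SubScore]) -> SubScore:
    """Sub-score whose shortfall from 100 costs the most weighted points."""
    return max(sub_scores, key=lambda s: (100 - s.value) * s.weight)


# Hypothetical finance profile, loosely following the example above.
finance = [
    SubScore("bills paid on time", 90, 1),
    SubScore("retirement contributions", 80, 1),
    SubScore("emergency savings", 70, 1),
    SubScore("life insurance", 0, 1),
]

print(overall_score(finance))             # 60.0
print(biggest_opportunity(finance).name)  # life insurance
```

The 60 on its own says nothing actionable. The breakdown shows three healthy areas and one gap, and it also lets a user decide that the gap is something they don't care about.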

So, when are scores useful?

At this point you may be thinking that scores aren’t terribly useful and should be avoided, but this isn’t totally the case. As with most things, the value comes from digging in to get to a deeper level of understanding of what you’re trying to accomplish. When you do, you can start to understand that scores are still incredibly useful when presented with the proper context.

Here are some ways to use scores properly:

When it’s not just a single score

Whether it’s providing a second score that represents adherence to a plan in addition to a score for current state, or a list of sub-scores that clearly show how the overall score was calculated with more tangible granularity, an overall score works best when it has the proper context and support. This is also enhanced by written descriptions of what the score means and how it was calculated, even if this information is tucked out of sight at first.

When it's a point-in-time score

If someone enters a bunch of their health information into a system and it gives them a general score based on what the system knows in that moment, the result is often exactly what users expect. Users may still run this assessment periodically to see how much progress they’ve made. That can seem incredibly similar to constantly showing the value, but the perception of what the number represents is usually different.

When it still achieves the desired effect

The point of an overall score is usually to motivate positive behavior change. If your users are motivated by whatever score you show them to complete the desired actions or make the necessary behavior changes, then that’s the most important thing.

When it's required for the business

For enterprises and agencies especially, sometimes you are beholden to just building what the customer says. The good news is, it’s unlikely that an overall score is actually going to have a negative impact on end user behavior or action. If your customer is adamant that they need an overall score, take heed of the above recommendations and find ways to make the most of it.

Overall scores purport to take incredibly complex systems and data and reduce them into a single data point. If this sounds too good to be true, it’s often because it is. By digging in a little deeper to understand the true actions and behaviors your user is trying to achieve, you can come up with a set of scores that will work really well.