Why Conventional Product Testing Gets Attribute Intensities Wrong

Conventional tests don’t give you an accurate picture of how consumers perceive the intensity of different flavors. If companies want to learn about how consumers perceive flavor so they can create successful products, they should use Gastrograph AI as part of their product development process.

Imagine you make camping chairs. Every year, you sell millions and millions of chairs. It’s gotten to the point where you don’t even need to advertise. Everyone who buys a chair recommends it to their friends. Now, your profits are so high you decide to invest the money in revamping the chair to give even more value to your buyers. You add a cupholder, a footrest, and some clips and hooks on the back. The next year, sales plummet.

What happened?

You didn’t realize that the reason people loved that chair was that it was so lightweight and easy to carry.

By adding all the extra features, you’ve ruined it.

Knowing why people like or dislike your product is essential for repeat sales in any industry. In CPG food and beverage, that means knowing which flavors consumers perceive and how intensely.

In a validation study we conducted last year, we discovered a major flaw in how conventional central location tests (CLTs) collect intensity data. We also verified that our system is at least as accurate at predicting consumer preference as CLTs, but that’s a story for another day.

What Are Attribute Intensities?

Attribute intensities are a score of how strongly a taster perceives a flavor. They’d tell you, for example, if consumers detect a “hint” of strawberry or an overpowering flavor.

How They’re Collected in Conventional Testing

Our partner for the study conducted conventional tests for nine products, and then we reviewed the same products with the Gastrograph platform.

First, the products went through a round of tasting with expert tasters. The experts decided which attributes were present in the products, and they listed six terms: roasted, dairy, bitter, rich, sugar, and retronasal (that’s “aftertaste” to you and me).

In one of the most popular testing methods (Quantitative Descriptive Analysis), expert tasters generate the list of flavor attributes by majority voting. If only one or two of the experts pick up on a flavor that the others don’t perceive, that term doesn’t end up on the scorecard. So consumers only get to rate the flavors that the majority of the expert tasters noticed and aren’t allowed to add their own terms — this is a constrained lexicon.
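To make the majority-vote step concrete, here is a minimal sketch in Python. The panel data is invented for illustration, and the "more than half of the experts" threshold is our assumption about how the vote is counted; it is not taken from the study or from any specific testing protocol.

```python
from collections import Counter

def majority_lexicon(expert_terms):
    """Return the terms named by more than half of the experts (assumed rule)."""
    # Count each term once per expert, even if an expert repeats it.
    counts = Counter(term for terms in expert_terms for term in set(terms))
    threshold = len(expert_terms) / 2
    return sorted(term for term, n in counts.items() if n > threshold)

# Hypothetical panel of five experts (illustrative data, not from the study).
panel = [
    {"roasted", "bitter", "dairy"},
    {"roasted", "bitter", "earthy"},
    {"roasted", "dairy", "bitter"},
    {"roasted", "dairy"},
    {"bitter", "roasted", "marine"},
]

lexicon = majority_lexicon(panel)
print(lexicon)
```

In this made-up panel, "earthy" and "marine" are each named by only one expert, so neither reaches the scorecard and consumers never get the chance to rate them.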

Next, our partner brought in the consumers to taste the products and rate the intensity of each of the flavors on a scale of zero to five. Finally, they asked consumers to rate their product preferences on a scale of one to seven.

So, as in most CLTs, experts are responsible for describing what flavors they taste, and consumers rate how much they like the products. Perception data and preference data come from two different rounds of testing.

How They’re Collected by Gastrograph

When we bring products onto the Gastrograph system, there are no separate expert rounds. Each consumer taster describes what they taste, rates the intensity of those flavors, and then rates their preference for the product. Because our perception and preference data come from the same consumer tasters, we can link the flavors people taste to the products they like.
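As a rough illustration of why it helps to collect both kinds of data from the same person, the sketch below compares the mean preference of tasters who reported a flavor with the mean preference of those who did not. The record layout, the numbers, the 1 to 7 preference scale, and the `preference_by_flavor` helper are all hypothetical; this is not how the Gastrograph platform stores or models its data.

```python
from statistics import mean

# Hypothetical records: each taster supplies flavors with intensities
# and a preference score (assumed 1-7 scale). Illustrative data only.
reviews = [
    {"flavors": {"roasted": 4, "bitter": 2}, "preference": 6},
    {"flavors": {"roasted": 3}, "preference": 5},
    {"flavors": {"bitter": 4}, "preference": 2},
    {"flavors": {"dairy": 2, "bitter": 3}, "preference": 3},
]

def preference_by_flavor(reviews, flavor):
    """Mean preference of tasters who reported the flavor vs. those who did not."""
    with_flavor = [r["preference"] for r in reviews if flavor in r["flavors"]]
    without = [r["preference"] for r in reviews if flavor not in r["flavors"]]
    # Note: mean() raises StatisticsError if either group is empty.
    return mean(with_flavor), mean(without)

roasted_pref = preference_by_flavor(reviews, "roasted")
print(roasted_pref)
```

This kind of comparison is only possible when each preference score sits next to that same person's flavor description. When experts describe and consumers rate, there is no shared record to group by.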

We avoid forcing tasters to use a constrained lexicon by letting them describe flavors however they want. The Gastrograph system has 24 attributes that “totally encompass the gustatory flavor space,” says Jason Cohen, CEO and founder of Gastrograph. Within those 24 attributes, tasters select a label to describe a flavor or create their own label. They’re free to add as many labels as they want to describe a product. If they don’t taste a flavor, they don’t have to rate it.

The extended process consumer tasters go through to identify flavors before rating them also helps us predict sustained preference. Tasters spend more time thinking about flavors than in a conventional tasting, so the results we get give us an indication of how much consumers will like the products over time.

What’s the Problem with the Conventional Method?

Forced responses and a constrained lexicon mean that the data companies get from conventional testing is skewed and biased.

False Positives

If you tell people that a product has a flavor attribute, they’re naturally more likely to rate its intensity as at least one out of five. So you get false positives (people say a flavor is there even when they don’t taste it) or an inflated intensity score (people say a flavor is stronger than it is).

In our validation study, we discovered that the consumers’ perceptions of intensity for each of the six attributes were systematically higher in the conventional-style test than in the Gastrograph system.

[Figure: comparison of intensity]

We argue that the higher intensity scores from the conventional test are because the attribute list primed consumers to taste certain flavors. They’re forced to rate the intensity of all the attributes (even if they rate the intensity as zero), and this process “increases the minimum response value and mean response values.” So consumers are more likely to say they perceive a flavor and with a higher intensity because it’s on the list of attributes from the expert tasters.

It’s a version of confirmation bias: the fact the attributes appear on the list encourages the consumers to taste them, so they end up rating them as stronger than they actually are.

Imagine this scenario: I get you to taste some chocolate.

In a conventional CLT, you’re asked to rate the intensity of the bitterness from zero to five.

With Gastrograph, you're asked to describe what you taste and rate the intensity of what you taste.

In the conventional CLT, you’re more likely to rate “bitterness” with an intensity of at least one because you’re expecting to taste it.
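Here is a small sketch of how the two protocols can produce different averages for the same attribute. The numbers are invented for illustration and are not data from the validation study. Counting a flavor that a taster never mentions as an intensity of zero is also our assumption for the sake of the example.

```python
from statistics import mean

def mean_intensity(responses, attribute):
    """Mean 0-5 intensity; a taster who did not mention the attribute counts as 0."""
    return mean(r.get(attribute, 0) for r in responses)

# Illustrative numbers only, not data from the validation study.
# Forced scorecard: every taster must give "bitter" a score.
forced = [{"bitter": 1}, {"bitter": 2}, {"bitter": 1}, {"bitter": 3}]
# Free description: tasters only rate what they report tasting.
free = [{}, {"bitter": 2}, {}, {"bitter": 3}]

forced_mean = mean_intensity(forced, "bitter")
free_mean = mean_intensity(free, "bitter")
print(forced_mean, free_mean)
```

In this toy data, the two tasters who would not have mentioned bitterness on their own each give it a one on the forced scorecard, which is enough to lift both the minimum and the mean.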

Flavor Offloading

When consumers don’t have the freedom to describe what they taste, they “offload” the intensity of those flavors onto flavors that are on the list. The result is that the flavors on the attribute list appear more intense than they are.

When we look in more detail at the different attribute intensities from the study, the biggest disagreement is around the term “rich.”

In the Gastrograph platform, more people gave “rich” an intensity of zero or one. In the CLT, more people rated “rich” as a level three, four, or five intensity.

The Gastrograph AI results are gray on the graph

In the Gastrograph system, there are several terms that we can count as variants of, or alternatives to, the term “rich,” such as mouthfeel, earthy, or spices. But those terms weren't on the questionnaire in the conventional test.

Let’s say someone tasted a flavor they’d categorize as “earthy.” In the conventional test, they had no way to communicate it, so they transferred its intensity to a flavor that was on the list, “rich,” giving it an artificially high intensity score.

An Incomplete Picture

When people can’t describe all of what they taste, you don’t get the full picture of how people perceive your product.

Let’s look at an example: marine flavors in beer.

“Marine” is one of the 24 Gastrograph attributes. At a beer tasting (or any other tasting because the 24 attributes are always the same), a taster can list a “marine” flavor.


The 24 Gastrograph Attributes

Jason explains this used to cause some arguments with sensory scientists. They would say, “but beer doesn't have marine flavors.” Perhaps most beers don’t have marine flavors, “but I can certainly think of some counter-examples,” says Jason.

If you mis-ferment beer, it can taste like iodine. We categorize iodine as a marine flavor. “So if your beer was flawed [and] if someone didn’t have a way of marking it, they would have to offload onto the other flavors.” In other words, if you don’t include “marine” on your attributes list, you don’t get to find out if your batch of beer is flawed.

Plus, there are now some new beers that have marine flavors by design. The Kelpie beer brewers use malted barley that they grow in fields fertilized by seaweed. “That certainly tastes fishy,” says Jason.


Kelpie Seaweed Ale

If you don’t include “marine” as a flavor attribute all the time, “where it’s supposed to be there or even not supposed to be there,” then you’re not able to compare tasting data, explains Jason.

A New Era of Personalized Products

At Gastrograph, because we gather accurate data about what consumers perceive, we’re able to predict consumer preferences more reliably than conventional consumer tests. With our data, you can develop a nuanced understanding of what consumers love about different products. Then you can transform your product development. You'll be able to start creating products that target the preferences of specific demographic groups. 


