How to build a tagging taxonomy for qualitative research that scales across teams

Qualitative research generates rich, detailed data. It also generates a mess if you do not have a system for organizing it.

When one researcher tags an insight as "onboarding friction" and another tags the same type of observation as "new user confusion," you end up with fragmented data that is difficult to search, aggregate, or act on. Multiply that across five teams and two years of research, and you have a tagging system that actively works against you.

A tagging taxonomy solves this. It gives everyone a shared vocabulary for coding qualitative data so that findings from different studies, researchers, and teams can be compared and combined. But building a taxonomy that actually works at scale—across teams, over time—requires more than a brainstorming session and a spreadsheet.

This guide walks through how to build one, what mistakes to avoid, and how to maintain it as your organization grows.

Why tagging taxonomies matter

Tagging is the primary mechanism for turning unstructured qualitative data into something searchable and analyzable. Every time a researcher highlights a passage in a transcript and assigns it a label, they are making an analytical decision. The taxonomy is the structure that governs those decisions.

Without a taxonomy, teams run into predictable problems:

Duplicate tags proliferate. "Payment issue," "billing problem," and "checkout error" may all refer to the same theme but get counted separately.
Findings cannot be aggregated. If each project uses its own tagging conventions, you cannot look across projects to understand how often a theme appears.
Onboarding new researchers is slow. Without a shared system, every new team member invents their own approach.
Stakeholders lose trust. When the same data gets interpreted differently depending on who tagged it, the credibility of the research function suffers.

A taxonomy addresses all of these by providing a single, documented structure that everyone uses.
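
To make the duplicate-tag problem concrete, here is a minimal Python sketch. It assumes the three labels mentioned above really do describe the same theme in your data. The alias map and the `canonical` helper are illustrative names, not part of any particular tool.

```python
from collections import Counter

# Hypothetical alias map: each free-form label points to one canonical tag.
ALIASES = {
    "payment issue": "Payment issue",
    "billing problem": "Payment issue",
    "checkout error": "Payment issue",
}


def canonical(label: str) -> str:
    """Return the canonical tag for a label, or the cleaned label if unmapped."""
    cleaned = label.strip()
    return ALIASES.get(cleaned.lower(), cleaned)


# Labels as different researchers might have typed them.
raw_labels = ["Payment issue", "billing problem", "Checkout error", "Slow performance"]

# Without a shared vocabulary, each variant is counted on its own.
raw_counts = Counter(raw_labels)

# With a shared vocabulary, the variants roll up into one theme.
merged_counts = Counter(canonical(label) for label in raw_labels)
```

In this toy data, `raw_counts` treats the payment theme as three separate one-off tags, while `merged_counts` shows it as a single theme with three mentions.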

Start with your research questions, not your data

The most common mistake teams make is building a taxonomy bottom-up from a single dataset. A researcher finishes a study, looks at the codes they created, and says, "Let's use these as our taxonomy." The problem is that a taxonomy derived from one study reflects the concerns of that study. It may not accommodate the themes that matter in the next five studies.

Instead, start by asking what your organization needs to learn across research efforts. Common starting points include:

What are the recurring questions from product and design teams? If stakeholders consistently want to understand usability issues, unmet needs, and competitive comparisons, those concerns should shape your top-level categories.
What themes keep appearing across past research? Review the last 6–12 months of studies to identify patterns that transcend individual projects.
What decisions does research inform? If research feeds into roadmap prioritization, your taxonomy should make it easy to surface insights by product area or customer segment.

This does not mean you should design the taxonomy in a vacuum. You still need real data to validate and refine it. But the initial structure should reflect organizational priorities, not just whatever happened to come up in the most recent interview study.

Designing the taxonomy structure

A useful taxonomy has layers. Flat tag lists become unmanageable once they exceed a few dozen items. Instead, organize tags into a hierarchy with two or three levels.

Top-level categories

These are broad groupings that represent the major dimensions your research program cares about. Common examples include:

User needs — what people are trying to accomplish
Pain points — where the current experience breaks down
Behaviors — what people actually do (as opposed to what they say)
Sentiment — how people feel about an experience
Feature feedback — reactions to specific product capabilities
Context — situational factors like device, environment, or use case

Four to eight top-level categories is a reasonable starting range. Fewer than four usually means categories are too broad to be useful. More than eight starts to create cognitive overhead for taggers.

Second-level tags

Within each category, define specific tags that represent distinct themes. Under "Pain points," for example, you might have:

Navigation confusion
Slow performance
Missing functionality
Error handling
Unclear terminology

Each tag should be distinct enough that two researchers working independently would apply the same tag to the same passage. If you find yourself debating whether an observation is "navigation confusion" or "unclear terminology," the definitions are not precise enough, or you may need a different tag structure.

Third-level tags (use sparingly)

Some organizations add a third level for highly specific subtags. For instance, "Missing functionality" might break into specific feature areas. This is useful for large research operations but adds complexity. Only add a third level when you have enough data volume that the second level is too coarse to be useful.
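
One way to picture the hierarchy is as nested data. The sketch below is only an illustration: the category and tag names come from the examples above, the two third-level subtags ("Reporting" and "Integrations") are hypothetical, and the function names are assumptions rather than any tool's API.

```python
# Top-level category -> second-level tag -> optional third-level subtags.
TAXONOMY = {
    "User needs": {},        # second-level tags omitted for brevity
    "Pain points": {
        "Navigation confusion": [],
        "Slow performance": [],
        "Missing functionality": ["Reporting", "Integrations"],  # hypothetical subtags
        "Error handling": [],
        "Unclear terminology": [],
    },
    "Behaviors": {},
    "Sentiment": {},
    "Feature feedback": {},
    "Context": {},
}


def tag_paths(taxonomy):
    """Yield every tag and subtag as a 'Category / Tag / Subtag' path."""
    for category, tags in taxonomy.items():
        for tag, subtags in tags.items():
            yield f"{category} / {tag}"
            for subtag in subtags:
                yield f"{category} / {tag} / {subtag}"


def top_level_count_ok(taxonomy, low=4, high=8):
    """Check the number of top-level categories against the suggested range."""
    return low <= len(taxonomy) <= high


paths = list(tag_paths(TAXONOMY))
```

Keeping the structure as plain data makes it easy to print the full list of paths for documentation, or to check it against the four-to-eight guideline as the taxonomy grows.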

Writing tag definitions

This is the step that most teams skip, and it is the single biggest cause of taxonomy failure. A tag without a definition is an invitation for inconsistency.

Every tag in your taxonomy should have:

A clear name — short, descriptive, and written in consistent grammatical form (all nouns, or all verb phrases—pick one convention)
A definition — one to two sentences explaining what the tag covers
An inclusion example — a sample quote or observation that should receive this tag
An exclusion example — a sample quote that might seem related but should get a different tag

For example:

Tag: Navigation confusion

Definition: The participant expresses difficulty finding a feature, page, or piece of information within the product's interface, or takes a clearly wrong path to reach their goal.

Include: "I had no idea where to find my billing settings. I looked under my profile first."

Exclude: "The billing page loaded really slowly." (This is a performance issue, not a navigation issue.)

Documenting these definitions takes time upfront. It saves significantly more time downstream by reducing miscoding, tag duplication, and debates about what a tag means.
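
If you keep definitions in a structured form, a simple completeness check becomes possible. The following is a minimal sketch, assuming Python 3.9 or later. The `TagDefinition` class and its field names are hypothetical, chosen to mirror the four elements listed above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TagDefinition:
    name: str
    definition: str
    include_example: str
    exclude_example: str

    def missing_fields(self) -> list[str]:
        """Return the names of any fields that were left blank."""
        return [field for field, value in vars(self).items() if not value.strip()]


# The worked example from this guide, expressed as a record.
navigation_confusion = TagDefinition(
    name="Navigation confusion",
    definition=(
        "The participant expresses difficulty finding a feature, page, or piece "
        "of information within the product's interface, or takes a clearly "
        "wrong path to reach their goal."
    ),
    include_example=(
        "I had no idea where to find my billing settings. "
        "I looked under my profile first."
    ),
    exclude_example="The billing page loaded really slowly.",
)

# A tag with a name but nothing else: a word list entry, not a definition.
draft = TagDefinition(
    name="Communication", definition="", include_example="", exclude_example=""
)
```

Here `navigation_confusion.missing_fields()` returns an empty list, while `draft.missing_fields()` lists the three fields that still need to be written before the tag is ready to use.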

Getting buy-in across teams

A taxonomy only works if people use it. If researchers feel the taxonomy was imposed on them without their input, adoption will be low. Here is how to build it collaboratively without falling into design-by-committee paralysis.

Involve representatives, not everyone

Invite one or two researchers from each team that will use the taxonomy to participate in its design. They bring knowledge of their team's research focus areas and can advocate for the taxonomy internally. Trying to involve every individual researcher leads to lengthy debates and compromises that water down the structure.

Run a pilot with real data

Before rolling out the taxonomy broadly, have three or four researchers independently tag the same dataset using the draft taxonomy. Compare their results. Where they agree, the taxonomy is working. Where they disagree, the definitions need refinement or the tag structure needs adjustment.

This inter-rater reliability exercise does not need to be statistically rigorous for most applied research contexts. The goal is to catch obvious problems before they propagate across the organization.
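
A rough version of this comparison can be sketched in a few lines of Python. The pilot data below is invented for illustration, and it assumes each researcher applies exactly one tag per passage. Simple percent agreement is used here because it is easy to read; a statistic such as Cohen's kappa would also correct for chance agreement if you need more rigor.

```python
from itertools import combinations

# Hypothetical pilot: three researchers each tag the same five passages.
pilot = {
    "researcher_a": [
        "Navigation confusion", "Slow performance", "Error handling",
        "Unclear terminology", "Navigation confusion",
    ],
    "researcher_b": [
        "Navigation confusion", "Slow performance", "Error handling",
        "Navigation confusion", "Navigation confusion",
    ],
    "researcher_c": [
        "Unclear terminology", "Slow performance", "Error handling",
        "Unclear terminology", "Navigation confusion",
    ],
}


def percent_agreement(tags_a, tags_b):
    """Share of passages where two researchers chose the same tag."""
    matches = sum(a == b for a, b in zip(tags_a, tags_b))
    return matches / len(tags_a)


def contested_passages(pilot):
    """Indexes of passages where the researchers did not all agree."""
    return [
        index
        for index, tags in enumerate(zip(*pilot.values()))
        if len(set(tags)) > 1
    ]


pairwise = {
    (a, b): percent_agreement(pilot[a], pilot[b])
    for a, b in combinations(pilot, 2)
}
contested = contested_passages(pilot)
```

In this made-up pilot, the contested passages are the first and fourth, and both disagreements are between "Navigation confusion" and "Unclear terminology". That points to the definitions of those two tags as the place to start refining.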

Make it accessible

Store the taxonomy and its definitions somewhere everyone can find them without asking. A shared document works. A dedicated page in your research repository is better. If your team uses Dovetail, you can manage your tags and their hierarchy directly within the platform, which means researchers encounter the taxonomy in the same tool where they do their coding rather than having to reference an external document.

Maintaining and evolving the taxonomy

A taxonomy is not a document you write once and forget. Research programs evolve, products change, and new themes emerge. The question is how to accommodate change without losing consistency.

Establish a governance process

Designate a small group—two or three people—responsible for taxonomy decisions. When a researcher encounters a theme that does not fit any existing tag, they should be able to propose a new tag through a lightweight process: describe the theme, suggest where it fits in the hierarchy, and provide a draft definition.

The governance group reviews proposals on a regular cadence, such as monthly or quarterly, and decides whether to add a new tag, merge it with an existing one, or adjust an existing tag's definition.

Resist tag sprawl

The most common failure mode for mature taxonomies is unchecked growth. Every new study surfaces themes that seem unique, and if every unique theme gets its own tag, the taxonomy quickly balloons past the point of usability.

Before adding a tag, ask:

Does this theme genuinely represent something distinct from existing tags?
Will this tag apply to future research, or is it specific to one study?
Can the existing structure accommodate this with a minor definition adjustment?

If a proposed tag only applies to a single project, it may be better handled as a project-level code rather than a taxonomy-level tag.

Audit and clean regularly

At least once a year, and ideally twice, review the taxonomy for:

Unused tags — tags that no one has applied in the last six months. Consider retiring or merging them.
Overloaded tags — tags that appear in a high percentage of highlights. These are probably too broad and should be split.
Definition drift — cases where different teams have started interpreting a tag differently. Re-align by updating the definition and communicating the change.

Tools that centralize your qualitative data and tagging make these audits straightforward. Dovetail, for example, lets you see tag usage across all projects in one view, which makes it easy to spot tags that have drifted, been duplicated, or fallen out of use.
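
If you can get tag applications out of your tool as simple records, the first two audit checks can be approximated with a short script. This is a sketch under stated assumptions: the record format (highlight ID, tag, date applied) is hypothetical, the six-month window is approximated as 182 days, and the 50% threshold for "overloaded" is an arbitrary starting point you would tune. Definition drift still needs a human to read the tagged passages.

```python
from collections import defaultdict
from datetime import date, timedelta

# Hypothetical export: one (highlight ID, tag, date applied) record per tag application.
records = [
    ("h1", "Navigation confusion", date(2024, 5, 2)),
    ("h2", "Navigation confusion", date(2024, 5, 9)),
    ("h2", "Slow performance", date(2024, 5, 9)),
    ("h3", "Navigation confusion", date(2024, 6, 1)),
    ("h4", "Error handling", date(2023, 1, 15)),  # outside the audit window
]

ALL_TAGS = {
    "Navigation confusion", "Slow performance",
    "Error handling", "Unclear terminology",
}


def audit(records, all_tags, today, window_days=182, overloaded_share=0.5):
    """Return (unused tags, overloaded tags) for the recent window."""
    cutoff = today - timedelta(days=window_days)

    # Collect the recent highlights each tag was applied to.
    highlights_by_tag = defaultdict(set)
    for highlight_id, tag, applied_on in records:
        if applied_on >= cutoff:
            highlights_by_tag[tag].add(highlight_id)

    recent_highlights = set().union(*highlights_by_tag.values())

    unused = sorted(all_tags - set(highlights_by_tag))
    overloaded = sorted(
        tag
        for tag, ids in highlights_by_tag.items()
        if len(ids) / len(recent_highlights) >= overloaded_share
    )
    return unused, overloaded


unused, overloaded = audit(records, ALL_TAGS, today=date(2024, 6, 30))
```

With this sample data, "Error handling" and "Unclear terminology" come back as unused in the window, and "Navigation confusion" comes back as overloaded because it appears on every recent highlight.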

Common taxonomy structures

There is no single right way to organize a taxonomy. The best structure depends on what your organization needs from its research. Here are three common approaches:

By experience stage

Tags are organized around phases of the customer or user journey: awareness, onboarding, core usage, renewal, and so on. This works well for product-led organizations that want to understand where in the lifecycle problems occur.

By insight type

Tags are organized around the nature of the observation: needs, pain points, behaviors, motivations, and mental models. This approach is more abstract but allows findings to be combined across different product areas.

By product area

Tags map to specific features or sections of the product. This makes it easy for product teams to pull insights related to their area of ownership, but it can be brittle—when the product changes, the taxonomy breaks.

Many organizations use a hybrid that combines two or more of these approaches. For instance, top-level categories might reflect insight types (needs, pain points, behaviors), while second-level tags reflect product areas or journey stages.
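
As a rough illustration of a hybrid structure, the sketch below tags each highlight with an insight type at the top level and a product area at the second level. The quotes and the product areas ("Billing" and "Reporting") are invented for the example, and the `Highlight` type is a hypothetical name.

```python
from collections import Counter
from typing import NamedTuple


class Highlight(NamedTuple):
    quote: str
    insight_type: str  # top-level category
    product_area: str  # second-level tag


# Invented sample data for illustration only.
highlights = [
    Highlight("I couldn't find my billing settings.", "Pain points", "Billing"),
    Highlight("I export reports every Friday.", "Behaviors", "Reporting"),
    Highlight("The invoice total looked wrong.", "Pain points", "Billing"),
    Highlight("I need to share reports with my manager.", "User needs", "Reporting"),
]

# Cut across product areas: how often does each insight type appear?
by_insight_type = Counter(h.insight_type for h in highlights)

# Cut within one product area: what are the billing pain points?
billing_pain_points = [
    h.quote
    for h in highlights
    if h.insight_type == "Pain points" and h.product_area == "Billing"
]
```

The same highlights answer both a cross-product question (which insight types dominate) and a team-specific one (what hurts in billing), which is the main appeal of combining the two dimensions.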

Scaling the taxonomy to new teams

When a new team adopts the taxonomy, whether it is a research team in a different region, a design team that has started doing its own research, or a customer-facing team logging feedback, a few practices smooth the transition.

Onboard with examples, not just definitions. Walk new users through the taxonomy using real tagged data so they can see how each tag is applied in practice.

Allow a grace period. New users will make mistakes. Review their tagging in the first few weeks and provide feedback. This is not punitive; it is how everyone learns a shared coding system.

Expect to adapt. New teams often surface themes that the existing taxonomy does not cover. This is a feature, not a bug. Use the governance process to evaluate proposed additions and expand the taxonomy where it genuinely has gaps.

What to avoid

A few patterns consistently undermine tagging taxonomies:

Building it in isolation. A taxonomy designed by one person without input from the people who will use it rarely survives contact with real research.
Making it too detailed too soon. Start with a structure that covers the most common themes and expand over time. A 200-tag taxonomy on day one will overwhelm researchers and lead to inconsistent coding.
Skipping definitions. A tag named "Communication" means different things to different people. Without a definition, you are not building a taxonomy—you are building a word list.
Treating it as permanent. Research programs change. Products change. A taxonomy that worked two years ago may need significant revision today. Schedule regular reviews and be willing to restructure when needed.

Bringing it all together

A tagging taxonomy is infrastructure. Like any infrastructure, it is invisible when it works well and painfully obvious when it does not. The investment in designing, documenting, and maintaining a taxonomy pays off every time a stakeholder asks, "What are the top pain points across all our research this quarter?" and you can answer that question in minutes rather than days.

The key principles are straightforward: start from organizational priorities, keep the structure simple enough to use consistently, document everything, and establish a governance process that allows the taxonomy to evolve without losing coherence.

If your team is building or rebuilding its research operations, the tagging taxonomy is one of the highest-leverage things to get right. It determines whether your qualitative data becomes a searchable, cumulative knowledge base or a collection of isolated project files that only the original researcher can navigate. Platforms like Dovetail are designed to support exactly this kind of structured, scalable approach to qualitative data—but regardless of the tool you use, the principles in this guide apply.