How to use the 5 Whys Root Cause Analysis (with examples)
Here’s the reality most operations leaders won’t admit: your team is probably spending 15-20% of your revenue dealing with the same quality issues over and over again. Not because your people aren’t capable. Not because you don’t have good intentions. But because you’re treating symptoms instead of finding root causes.
I’ve spent nearly 30 years walking manufacturing floors and service operations, and I can tell you the pattern is always the same. A problem shows up. The team scrambles. They implement a “fix.” Then, three weeks later, or three months later, the same problem is back. Everyone’s frustrated. Nobody knows why.
The 5 Whys method breaks that cycle. When used correctly, it helps you discover the actual problem and implement preventative actions that stop issues from recurring. Fewer defects, lower quality costs, and stronger systems, all from asking “why” five times.
I’ll walk you through exactly how to use the 5 Whys method, and give you the validation techniques that separate root causes from symptoms. You’ll have a framework you can implement tomorrow morning.
When the 5 Whys Works Best
The 5 Whys is most effective for:
- Internal operational issues
- Quality non-conformances
- Process breakdowns
- Recurring problems that “shouldn’t be happening”
- Situations where you need fast, collaborative problem-solving
It’s less effective for highly complex problems with multiple interacting causes, or when you need extensive documentation for external stakeholders. In those cases, you’ll want to use an 8D or a CAR (Corrective Action Request), either of which can incorporate the 5 Whys as part of its process.
The Three Failure Points: What You’re Actually Looking For
Before you start asking “why,” you need to understand what you’re looking for. In my 25 years of doing this work, I’ve found that every problem ultimately traces back to one of three failure points:
1. Equipment Failure
A piece of equipment has broken down, a part is worn out, or a tool isn’t functioning as designed. This is the easiest to identify, and it comes up less often than system failure.
Indicators you’re dealing with equipment failure:
- Recent maintenance records show wear or damage
- The problem correlates with specific machines or tools
- Physical inspection reveals broken or degraded components
2. System Failure
This is where it gets interesting, and where most problems actually live. A system failure means you have a documented process, but people aren’t following it. Or they’re following it, but the process itself is broken.
Here’s the critical distinction most people miss: if someone didn’t follow the process, that’s not automatically a “people problem.” In my experience, the vast majority of people aren’t intentionally causing problems. When someone works around a documented process, there’s almost always a reason.
The most common reason? They don’t understand why that’s the process to begin with. Or they’ve “found a shortcut” that works for them but doesn’t reproduce the desired output. We’re all wired to seek efficiency, and when people don’t understand the bigger picture, they optimize their own work without realizing how it impacts the rest of the system.
Indicators you’re dealing with system failure:
- The process exists on paper but varies in practice
- Different people do the same task differently
- People mention “the way we really do it” versus “the official way”
- Workarounds have become standard practice
3. Human Failure (No System Exists)
This is the only time I call something a human failure: when there is literally no documented process and people have made one up based on their experience. If you have five operators doing the same task five different ways because nobody ever created a standard, that’s not their fault, that’s a system design failure.
Indicators you’re dealing with no system:
- “That’s how we’ve always done it”
- New employees get trained differently by different trainers
- No written procedures or work instructions exist
- Results vary wildly depending on who’s doing the work
If I had to rate these in terms of frequency, it’s almost always a system failure. Next is equipment. And only occasionally do you find there’s truly no system in place.
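The three failure points can be read as a short decision rule. What follows is a minimal sketch, assuming the investigation has already answered two yes/no questions; the function name and return labels are hypothetical, not part of any standard.

```python
def classify_failure_point(equipment_fault: bool, documented_process_exists: bool) -> str:
    """Map two yes/no findings onto the three failure points described above."""
    if equipment_fault:
        # Broken, worn, or malfunctioning equipment was confirmed by inspection.
        return "equipment failure"
    if documented_process_exists:
        # A process exists but isn't followed, or is followed and is itself broken.
        return "system failure"
    # Nobody ever created a standard, so people improvised their own.
    return "no system exists"


print(classify_failure_point(equipment_fault=False, documented_process_exists=True))
```

In practice the two inputs come out of the 5 Whys session itself, so treat this as a way to label the finding, not a shortcut around the questioning.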
Step-by-Step: How to Use the 5 Whys Method
Here’s the process I use with clients, refined over hundreds of root cause analysis sessions:
Step 1: Collect All the Information
Before you gather the team, get specific about the problem:
- What exactly happened?
- When did it occur? (Date, time, shift)
- Where did it happen? (Location, process step, equipment)
- How big is the problem? (Scope, frequency, impact)
- What’s the risk to your customer?
- What data do we have? (Inspection reports, production logs, customer complaints)
The more specific you are upfront, the more effective your 5 Whys session will be. Vague problems produce vague root causes.
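If you track problems in a script or spreadsheet export, the Step 1 questions can become a record with a completeness check. This is a rough sketch; the class and field names are illustrative assumptions, and the sample values come from the printing example later in this article.

```python
from dataclasses import dataclass, fields
from typing import List


@dataclass
class ProblemRecord:
    """Hypothetical container for the Step 1 questions."""
    what_happened: str = ""
    when: str = ""           # date, time, shift
    where: str = ""          # location, process step, equipment
    size: str = ""           # scope, frequency, impact
    customer_risk: str = ""
    data_sources: str = ""   # inspection reports, production logs, complaints


def missing_fields(record: ProblemRecord) -> List[str]:
    """Return the names of any Step 1 questions still unanswered."""
    return [f.name for f in fields(record) if not getattr(record, f.name).strip()]


record = ProblemRecord(
    what_happened="Intermittent print misregistration",
    where="Printing on plastic material with plastic liner",
)
print(missing_fields(record))
# → ['when', 'size', 'customer_risk', 'data_sources']
```

An empty list means you have at least a first answer to every question before the team sits down.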
Step 2: Create Your Team
Don’t do this alone. Ever. I’ve made this mistake and paid the long-term price!
Pull in people from the affected department, but also include others who can offer fresh perspective. I generally include people from before or after the specific department we think is causing the issue, then a mixture of leadership depending on the organization.
Sometimes it’s much less obvious where to start, so we begin with a process flow map to capture all touchpoints, then dive into subject matter expert interviews, taking note of things we find until we can do a proper RCA exercise.
Key principle: Follow the value stream when building your team. If you’re investigating a quality issue in production, include someone from quality control, someone from the production department, and someone from either incoming inspection or the next upstream/downstream process.
Step 3: Write a Clear Problem Statement
This is one of the biggest challenges because there really isn’t a perfect formula, but here’s a good test: ask someone from an unrelated department to read your problem statement. Using only general company context, they should be able to understand what you’re describing.
Bad example: “Machine is broken”
Good example: “Intermittent print misregistration when printing on plastic material with plastic liner”
The difference? Specificity. The second statement tells you what’s wrong (misregistration), when it happens (intermittently), what process is affected (printing), and what materials are involved (plastic with liner).
Step 4: Start Asking Why
Now comes the actual 5 Whys. Ask “why” and let the team answer based on facts, not assumptions. After each answer, ask “why” again.
Here’s the part nobody tells you: The biggest mistake I see is teams stopping too early or giving surface-level answers. Both are protection mechanisms. The team feels like this is an “exercise” rather than a fact-finding mission to solve a real problem. That typically stems from leadership treating this as an activity rather than something that’s practiced and celebrated when real problems are solved.
Before you start, set the ground rules:
- This is about fixing the system, not assigning blame
- We’re looking for facts, not opinions
- We’re solving this problem permanently, not just checking a box
- Everyone’s perspective matters—speak up if something doesn’t make sense
As you work through the whys, you’ll notice something: when you hit the actual root cause, it’s usually obvious to everyone in the room. There’s an “aha moment” where people nod and say “yeah, that’s it.” But recognition isn’t enough. You actually need to validate it.
Step 5: Validate Through Testing
This is the step most people completely skip, and it’s why their “solutions” don’t stick. Once you think you’ve found the root cause, you need to test it. Can you recreate the problem by reintroducing the root cause condition? Can you eliminate the problem by addressing the root cause?
The timeline for testing depends on what you’re validating. Sometimes it can be done almost in real time, while other tests may take weeks to coordinate and execute. The whole idea is to outline what’s reasonable based on what you learned in the exercise.
Step 6: Implement Countermeasures and Audit
Select your countermeasure or preventative action and put it in place. But here’s the critical part: you need to audit whether it’s actually working. I recommend checking in at a couple of weeks, then again at a couple of months to ensure nothing else has changed and that the solution is still effective. Your organizational structure will determine who does this work. Some organizations have continuous improvement committees, others use outside counsel, and sometimes leadership or appointed team members tackle it directly. The audit phase is where most solutions fail. Don’t let yours be one of them.
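To keep the audits from slipping, put them on the calendar the day the countermeasure goes live. Here is a small sketch, assuming “a couple of weeks” means 14 days and “a couple of months” means 60 days; both intervals and the function name are my assumptions, so adjust them to your own cadence.

```python
from datetime import date, timedelta
from typing import List


def audit_dates(implemented_on: date, first_days: int = 14, second_days: int = 60) -> List[date]:
    """Return the two follow-up audit dates for a countermeasure."""
    return [
        implemented_on + timedelta(days=first_days),
        implemented_on + timedelta(days=second_days),
    ]


for d in audit_dates(date(2024, 3, 1)):
    print(d.isoformat())
# → 2024-03-15
# → 2024-04-30
```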
Real Example 1: Manufacturing Quality Issue
Let me show you how this works with a real case from a printing operation I worked with.
Problem Statement: Intermittent print misregistration when printing on plastic material with plastic liner.
Why #1: Why is this happening? The material is stretching during the printing process.
Why #2: Why is the material stretching? Could be turnbars causing drag, could be in-feed press tension, could be excessive heat from the curing system.
Why #3: Why are the turnbars causing drag? The web path isn’t optimized, and the turnbars need maintenance.
Why #4: Why is there excessive heat? The run speed is too low for the curing lights to work at their designed parameters.
Why #5: Why is the run speed too low? It’s based purely on operator experience, and it varies with different raw material surfaces. There’s no reference document.
Root Causes Identified:
- No standardized process for setting press speed based on material type
- Turnbar maintenance is reactive, not preventative
- Operators are making equipment decisions without proper parameters
Notice we didn’t stop at “material stretching.” That was just the symptom. We kept going until we found the system failures.
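Notice also that this chain branches at Why #2, so it is really a tree and not a straight line. One way to capture that, sketched below with a hypothetical `Cause` class, is to record each answer as a node and treat the leaves as candidates that still need validation. The wording of each node is paraphrased from the example above.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Cause:
    """One answer to a 'why'; its children are the answers to the next 'why'."""
    text: str
    whys: List["Cause"] = field(default_factory=list)


def leaves(node: Cause) -> List[str]:
    """Collect the deepest answers on every branch (candidate root causes)."""
    if not node.whys:
        return [node.text]
    found: List[str] = []
    for child in node.whys:
        found.extend(leaves(child))
    return found


tree = Cause("Intermittent print misregistration", [
    Cause("Material is stretching during printing", [
        Cause("Turnbars causing drag", [
            Cause("Web path isn't optimized"),
            Cause("Turnbars need maintenance"),
        ]),
        Cause("In-feed press tension"),  # branch not explored further in the session
        Cause("Excessive heat from the curing system", [
            Cause("Run speed too low for the curing lights", [
                Cause("Speed set by operator experience; no reference document"),
            ]),
        ]),
    ]),
])

for candidate in leaves(tree):
    print(candidate)
```

A leaf that sits only two levels deep, like the press tension branch here, is a visible reminder that a branch was parked and not followed to the end.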
Countermeasures Implemented:
- Created a reference document for press speed, light power, and material specifications as a starting benchmark
- Implemented a turnbar monitoring and maintenance schedule
- Began investigating an auto-adjusting press tension mechanism
The Key to Making It Stick: Here’s what I told the team about that reference document: this is a starting point, not a rule. If operators start with those settings but need to adjust significantly based on their experience, that adjustment gets documented and reviewed. We look at whether we need to update the benchmark or if that particular run was simply different.
The biggest mistake leaders make is turning the “guide” into gospel and discouraging thinking and participation. You want people to think and react, but you also want to give them a prompt that drives the intended outcome.
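The “starting point, not a rule” idea can be expressed as a simple check: operators are free to adjust, and only significant adjustments get flagged for review. The sketch below is illustrative only. The benchmark value, its units, the 10% tolerance, and the names are all hypothetical placeholders, not figures from the actual case.

```python
# Hypothetical benchmark table: material type -> starting press speed (arbitrary units).
BENCHMARK_SPEED = {"plastic_with_liner": 100.0}


def needs_review(material: str, actual_speed: float, tolerance: float = 0.10) -> bool:
    """Flag a run for review when the operator moved well away from the benchmark.

    The adjustment is never blocked; it is only documented and reviewed.
    """
    baseline = BENCHMARK_SPEED[material]
    deviation = abs(actual_speed - baseline) / baseline
    return deviation > tolerance


print(needs_review("plastic_with_liner", 105.0))  # small adjustment
print(needs_review("plastic_with_liner", 120.0))  # significant adjustment
```

What matters is that a flagged run triggers a conversation about whether the benchmark needs updating, not a reprimand.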
Result: They went from recurring quality issues costing thousands in waste and rework to a documented, preventable system. The problem was solved permanently. All in, it took about two weeks from identification to validation, but the problem had been ongoing for months prior.
Real Example 2: Service Industry Bottleneck
Now here’s a service industry example that shows the 5 Whys isn’t just for manufacturing.
Problem Statement: Customer orders are consistently being delivered 2-3 days late, causing complaints and lost business.
Why #1: Why are orders late? The warehouse is shipping orders after the promised delivery date.
Why #2: Why is the warehouse shipping late? They’re receiving picking lists from customer service later than expected.
Why #3: Why are picking lists arriving late? Customer service is waiting for credit approval before processing orders.
Why #4: Why is credit approval taking so long? The finance team only reviews credit applications twice per week, on Tuesdays and Thursdays.
Why #5: Why only twice per week? Because there’s no automated system. It requires manual review, and finance thought batching would be more efficient for their department.
Root Cause Identified: An artificial bottleneck created by local optimization. Finance was trying to be efficient within their own silo, with zero regard for how their batching policy impacted the rest of the value stream.
This happens everywhere because we’re all wired to seek efficiency within our own area of expertise. Opening the conversation by talking about the end-to-end value stream helps people across the organization start to see how their work impacts others.
The Key Realization: Here’s what I had to help the team understand: several areas will actually need to be less efficient locally if we’re working toward making the overall system more effective. Finance spending 10 minutes per day on credit checks feels less efficient than batching them twice per week, but system-wide, daily checks were dramatically better for customer satisfaction and revenue.
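You can see the cost of batching with simple arithmetic. This is a rough model, not the client’s data: it assumes orders arrive Monday through Friday, that an order arriving on a review day is reviewed that same day, and that weekend days count toward the wait. The function name is hypothetical.

```python
from typing import Iterable, List


def wait_days(arrival_weekday: int, review_weekdays: Iterable[int]) -> int:
    """Days an order waits for the next credit review (0 = Monday ... 6 = Sunday)."""
    return min((review - arrival_weekday) % 7 for review in review_weekdays)


TUE_THU = [1, 3]          # the twice-weekly batching policy
DAILY = [0, 1, 2, 3, 4]   # daily review, Monday through Friday
ARRIVALS = range(5)       # one order arriving each weekday

batched: List[int] = [wait_days(day, TUE_THU) for day in ARRIVALS]
daily: List[int] = [wait_days(day, DAILY) for day in ARRIVALS]

print(batched)  # → [1, 0, 1, 0, 4]
print(daily)    # → [0, 0, 0, 0, 0]
print(max(batched), sum(batched) / len(batched))  # → 4 1.2
```

Under these assumptions a Friday order sits for four days before anyone in finance looks at it, which is the kind of hidden delay the 5 Whys surfaced.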
Countermeasures Implemented:
- Implemented automated credit checks for orders under $5,000
- Daily manual credit review for anything requiring human judgment
- Real-time notification system between finance and customer service
Result: Order processing time cut from 4-5 days to same-day for most orders. Customer complaints dropped 80% in the first month.
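The first two countermeasures amount to a routing rule. Here is a minimal sketch, assuming order value is the only criterion; in the real process “requiring human judgment” likely covers more than the dollar amount. The $5,000 threshold comes from the countermeasure above, while the function name and labels are hypothetical.

```python
AUTO_APPROVAL_LIMIT = 5000.00  # orders under this amount get an automated credit check


def route_credit_check(order_total: float) -> str:
    """Decide which credit-check path an order takes."""
    if order_total < AUTO_APPROVAL_LIMIT:
        return "automated"
    # Everything else goes into the daily manual review queue.
    return "manual_daily_review"


print(route_credit_check(1200.00))
print(route_credit_check(5000.00))
```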
Notice something? Same tool, completely different industries. Manufacturing or service, B2B or B2C, the 5 Whys works because you’re following the problem back to the system failure, not just treating the symptom.
The Case That Surprised Me: Looking Beyond Your Four Walls
I’ve been doing this work for 25 years, and I’ve seen hundreds of root cause analyses. But one case last year reminded me that even experienced people can miss the obvious until they follow the process completely.
A client was experiencing an intermittent product quality issue for nearly a decade. Several root cause exercises and 5 Whys sessions had been conducted with multiple teams throughout that time. They felt like they’d get close to solving it, then the problem would disappear for a while. This cycle repeated itself for years.
When I got involved, we started fresh with the 5 Whys process. We mapped the entire value stream, not just their internal operations, but their suppliers too. Ultimately, we traced the issue to a tier 2 supplier. That supplier had very few process controls and little documentation in place, which explained why the problem was intermittent.
Here’s what had happened: The company had done extensive RCA and data analysis on themselves and their tier 1 suppliers, but they didn’t have real visibility into their tier 2 suppliers. They’d been looking in the wrong place for a decade.
The lesson: Sometimes the root cause isn’t within your four walls. Don’t limit your 5 Whys to your own processes. Follow the problem wherever it leads, even if that means looking at suppliers, vendors, or upstream processes you don’t directly control.
Common Mistakes That Kill Your Root Cause Analysis
After 25 years of facilitating these sessions, I’ve seen the same mistakes over and over:
1. Stopping at Symptoms
The team gets to a plausible-sounding answer and stops asking why. “Why did the machine break?” “Because the bearing failed.” That’s not the root cause, that’s still a symptom. Why did the bearing fail? Was it not maintained? Is the maintenance schedule inadequate? Is there no maintenance schedule at all?
Keep going until you hit one of the three failure points: equipment, system, or no system.
2. Skipping the Audit Step
You identify the root cause, implement a countermeasure, and assume it’s fixed. Three months later, the problem is back. Why? Because you never validated that your solution actually worked.
Test your countermeasure. Monitor it. Audit it at a couple of weeks, then again at a couple of months. Make sure it’s actually solving the problem before you declare victory.
3. Going Solo
When one person conducts the 5 Whys alone, they bring their own biases and blind spots. You need diverse perspectives from across the value stream to see the full picture.
4. Making It About Blame Instead of Systems
The moment someone feels blamed, they shut down. They give you surface-level answers designed to protect themselves or their department. You’ll never find the real root cause in a defensive environment.
Frame the session as “we’re fixing the system” not “we’re finding out who screwed up.” Most problems are system problems, not people problems.
5. Turning Guidelines Into Rules
You create a reference document or standard procedure as a countermeasure, then you punish anyone who deviates from it. This kills continuous improvement.
Your countermeasures should be starting points that get refined based on real-world feedback. If someone has to adjust significantly from the standard, that’s valuable data. Document it and review whether the standard needs updating.
When NOT to Use the 5 Whys
The 5 Whys is a powerful tool, but it’s not right for every situation:
Use 8D or CAR (Corrective Action Request) instead when:
- You’re dealing with external customer complaints that require formal documentation
- The problem has already caused significant customer impact
- You need detailed tracking and sign-offs for regulatory or quality system requirements
The good news? 8D and CAR formats can still use the 5 Whys internally to get to the root cause. Think of 8D or CAR as the documentation wrapper, and 5 Whys as the analysis engine inside it.
Use more advanced RCA methods when:
- You’re dealing with complex problems that have multiple interacting root causes
- The problem involves safety or significant financial risk
- Initial 5 Whys sessions aren’t revealing clear root causes
Final Thoughts
Most organizations already know how to put out fires. What they struggle with is slowing down long enough to understand why the fire started in the first place. That’s not a people problem. That’s a system problem.
When the 5 Whys is treated like a checkbox exercise, you’ll get surface-level answers and short-term fixes. When it’s treated like a real investigation with facts, validation, and follow-through, it becomes one of the fastest ways to reduce repeat issues, lower quality costs, and build trust across the team.
The real payoff isn’t just fewer defects. It’s consistency. It’s teams that stop guessing. It’s leaders who don’t have to escalate to get results because the system actually supports good outcomes.
If you’re tired of solving the same problems quarter after quarter, start asking better questions. Then stick around long enough to hear the uncomfortable answers. That’s where real improvement starts.
That’s it for today.
See you all again next week!
Dave