The Art and Science of Troubleshooting
August 5, 2008
“How do I fix xyz?”
Thinking about how to troubleshoot reminds me of trying to fix my leaky shower.
I want an “instant fix”. My plumber knows better, realizing that you have to go through a systematic process of analysis. In the end, of course, he is right. And after eliminating some of the variables, we indeed locate and fix the problem.
The frustrating aspect of troubleshooting is that sometimes it’s just so darn difficult, if not impossible, to locate the underlying root cause. Often you just have to “make it better”. And, of course, it always takes longer than you, and management, want it to take.
So where do you start? In one instance, a client reports that an Email node in ProcessFlow takes a while to complete. Ten minutes is way too long, so that sounds like a good place to start.
- Does *every* email take a long time?
- Or just the ones sent by this flow?
- Is this the only flow that is slow?
- Is the flow well designed? A lot of times I find that flows are poorly designed.
- Have you had “another set of eyes” look at this flow?
Sometimes these types of issues are environmental. If every email takes a long time, that would point me toward a network issue rather than the flow itself.
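One way to make those questions concrete is to time a few sends and compare them. Here is a minimal Python sketch of that idea. The function names, the 60-second threshold, and the result labels are all my own assumptions for illustration; none of this is built into ProcessFlow.

```python
import time
from statistics import median


def timed(action):
    """Run a zero-argument callable and return the elapsed time in seconds."""
    start = time.perf_counter()
    action()
    return time.perf_counter() - start


def triage(flow_email_secs, other_email_secs, slow_threshold_secs=60.0):
    """Suggest where to look first, given measured send times in seconds.

    flow_email_secs:  timings for emails sent by the suspect flow.
    other_email_secs: timings for emails sent any other way
                      (another flow, a plain test message).
    slow_threshold_secs: hypothetical cutoff for "too slow"; pick your own.
    """
    flow_slow = median(flow_email_secs) > slow_threshold_secs
    others_slow = median(other_email_secs) > slow_threshold_secs

    if flow_slow and others_slow:
        return "environment"     # every email is slow: suspect network or settings
    if flow_slow:
        return "flow design"     # only this flow is slow: review the flow
    if others_slow:
        return "other senders"   # the suspect flow is fine; the problem is elsewhere
    return "not reproduced"      # nothing slow in this sample: gather more data
```

Feeding it a handful of timings, for example `triage([600, 620], [2, 3])`, gives you a first guess at which variable to eliminate next. It does not find the root cause for you; it only narrows the search.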
I have also found that some underlying settings can wreak havoc: for instance, the pfserv config settings for trace and logging, the JVM memory and thread settings in WebSphere, or OS kernel settings that are too low.
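As a rough illustration of checking for settings that are too low, here is a small Python sketch that reads per-process resource limits on a Unix-like system and flags any below a minimum you choose. I am assuming here that the OS settings in question include limits such as open files and process counts. The helper names and the minimum values are hypothetical, and this says nothing about pfserv or WebSphere configuration.

```python
import resource  # standard library, Unix-like systems only


def read_limits(names=("RLIMIT_NOFILE", "RLIMIT_NPROC")):
    """Return {name: (soft, hard)} for each limit this platform defines."""
    limits = {}
    for name in names:
        const = getattr(resource, name, None)
        if const is None:
            continue  # this limit does not exist on this platform
        limits[name] = resource.getrlimit(const)
    return limits


def too_low(limits, minimums):
    """List the limit names whose soft value is below the given minimum."""
    flagged = []
    for name, minimum in minimums.items():
        if name not in limits:
            continue
        soft, _hard = limits[name]
        # An unlimited soft limit can never be too low.
        if soft != resource.RLIM_INFINITY and soft < minimum:
            flagged.append(name)
    return flagged
```

For example, `too_low(read_limits(), {"RLIMIT_NOFILE": 8192})` lists the limits that fall short of a minimum of 8192 open files. That number is made up for the example; use whatever your vendor documentation recommends.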
Just some random thoughts for this morning.