In 2001, I was blindsided by a problem that is plaguing the development of
technology in our country. We don't teach our rapidly growing technical community
of students, developers, computer scientists and engineers how to solve problems.
The race that companies face to deliver new products with cutting-edge features in
less time is causing us to shortchange quality. Complexity of products skyrockets
and we end up putting more problems into our products without having the skills and
expertise to take them back out again.
"[U]sers are constantly uncovering defects in our . . . systems. We devote more time
to fixing problems than developing new applications. We don't have time to do it
right the first time, but we always have time to do it again."
[Quality Assurance Institute]
It's all about problem solving. The proliferation of computers into everything from
key chains to vehicles underscores the growing problem of low quality. When problems
are exposed, every developer at every experience level is faced with the same question,
"Okay, what do I try next?" How the developer answers that question is crucial, and
it is the difference between an effective developer and debugger, and a struggling one.
The origin of the term "bug" is often attributed (incorrectly) to a moth that became
trapped in a relay of the Harvard University Mark II Aiken Relay
Calculator (a primitive computer) in 1947. The poor moth has been immortalized in
a lab notebook (below) and now lives in the National
Museum of American History of the Smithsonian Institution.
[Figure: First Computer "Bug"]
While the moth-in-the-relay story is true, the term "bug" was actually in use before
that time. The story and uses of the term are explored in an interesting article,
Computer Bug, by James S. Huggins.
A very early reference to "bugs" (1896) appears in the electrical handbook
"Hawkins' New Catechism of Electricity":
"The term 'bug' is used to a limited extent to designate any fault or trouble
in the connections or working of electric apparatus." (Theo. Audel & Co.)
Thomas Edison referred to problems in his inventions in a letter to a friend
as "...difficulties arise - this thing gives out and [it is] then that 'Bugs' - as such
little faults and difficulties are called - show themselves..."
How did the concept first come about?
"As soon as we started programming, we found to our surprise that it wasn't as
easy to get programs right as we had thought. We had to discover debugging. I
can remember the exact instant when I realized that a large part of my life from
then on was going to be spent in finding mistakes in my own programs."
-Maurice Wilkes, 1949
Just like the term "embedded systems," the term "debugging" also means different things
to different people. It is an activity, a process, a consuming mystery.
But contrary to most references, debugging is NOT just a list of common mistakes
and available tools.
Debugging is a methodical process that requires
conscientious thought, analysis, and logical activities.
During the process,
different types of tools can be used, but the process is not defined by these
tools. Although debugging a difficult problem can be challenging, the
process of debugging is not hard.
Lists of common coding mistakes are available in books and on the internet.
Debugging tools are also commonly listed. However, very few sources
actually introduce methods to identify problems and to select the
appropriate tools to use. The process involves considering the evidence and
hypothesizing possible causes of the problem.
My goal is to help you understand
some logical steps to take when debugging a problem, and how to identify
what tools you might need and which common problems may be in action.
However, before we start, I want to explore the common thoughts about debugging to
tease out the useful elements and find out why debugging has such an
aura of voodoo and mystery surrounding it.
"If debugging is the process of removing bugs, then programming must
be the process of putting them in." [unknown]
"Part of the challenge of debugging embedded systems is in out-thinking
the program and devising new means of getting useful information out of
the limited I/O found in the system. Be creative."
The Art of Programming Embedded Systems by Jack Ganssle.
"Debugging presents a continuing sequence of riddles to the developer.
Each riddle is a consuming mystery, right up to the point when it is
discovered to result from an obvious mistake. Coding, Debugging, and
Integrating often merge into a single confusing process."
How the Pros Develop Embedded Software, by David Clifton.
Debugging isn't this mysterious, and the ultimate root cause
is not always an obvious mistake. An effective problem solver learns to identify
and isolate the root cause of a bug without knowing in advance what
caused it. Otherwise we'd end up with a bunch of technical folks who
can't fix a bug unless they've seen it before!
You will seldom find a difficult bug without first forming a hypothesis as to its cause.
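One way to make a hypothesis concrete is to write it down as a check that the program itself can confirm or refute. The following is a minimal sketch in Python, not a prescribed method. The helper `parse_reading`, the sample inputs, and the suspected cause (trailing whitespace after the unit letter) are all hypothetical and chosen only for illustration.

```python
def parse_reading(text):
    """Parse a reading such as '21.5C' into a float (hypothetical helper)."""
    return float(text.rstrip("C"))


def fails(text):
    """Report whether parse_reading raises ValueError for this input."""
    try:
        parse_reading(text)
        return False
    except ValueError:
        return True


# Hypothesis (an assumption for this sketch): parsing fails exactly when
# the input has trailing whitespace after the unit letter.
samples = ["21.5C", "19.0C ", "22.1C", "18.7C\n"]

# For each sample, record what actually happened and what the hypothesis predicts.
observations = [(s, fails(s), s != s.rstrip()) for s in samples]

# The hypothesis is supported only if prediction and fact agree for every sample.
hypothesis_supported = all(failed == has_ws for _, failed, has_ws in observations)

for sample, failed, has_ws in observations:
    print(f"{sample!r}: failed={failed} trailing_whitespace={has_ws}")
print("hypothesis supported:", hypothesis_supported)
```

If even one observation disagrees with the prediction, the hypothesis is wrong or incomplete, and the disagreeing sample is the next piece of evidence to examine.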
"The most important debugging tool is a healthy dose of common sense,
a sometimes rare commodity in the frenzy of debugging."
The Art of Programming Embedded Systems by Jack Ganssle.
A very interesting thought comes from Stephen W. Draper of the University
of Glasgow. In the early days of teaching programming, students wrote
sloppy, ad hoc programs and then compiled them,
reacting to the list of errors and modifying the program until it worked,
and in doing so learned to debug problems. But in
the more recent age of top-down structured programming and smart editors, fewer errors
are introduced into code, significantly reducing the opportunity for students
to be trained as debuggers.
"Your mind is organized to make sense out of poor quality, poorly perceived,
and poorly organized input. This hurts you badly during debugging, because
you tend to read code in such a way that you see what it should say,
rather than what it actually says...
The key way to protect yourself
...is to spend only limited time studying your code,
before making the program tell you what it is actually doing."
- Clayton Lewis, University of Colorado at Boulder
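As a rough illustration of "making the program tell you what it is actually doing," the sketch below uses Python's standard `logging` module to report the values a loop really sees, instead of relying on a reading of the code. The function `average` and its sensor-reading input are hypothetical, and logging is only one of several ways to get this information (a debugger or print statements would serve as well).

```python
import logging

logging.basicConfig(level=logging.DEBUG, format="%(levelname)s %(message)s")
log = logging.getLogger("average_demo")


def average(readings):
    """Return the mean of a list of readings (hypothetical example)."""
    total = 0
    for i, value in enumerate(readings):
        total += value
        # Report the facts at each step: index, input value, running total.
        log.debug("i=%d value=%r running_total=%r", i, value, total)
    result = total / len(readings)
    log.debug("count=%d result=%r", len(readings), result)
    return result


average([2, 4, 6])
```

The debug lines show each intermediate value, so a wrong input or a wrong running total is visible at the step where it first appears.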
1. Intuition and hunches are great; you just have to test them out. When a hunch
and a fact collide, the fact wins.
2. Don't look for complex explanations. Even the simplest omission or typo can lead
to very weird behavior. Everyone is capable of producing extremely simple and obvious
errors from time to time. Look at code critically; don't just sweep your eye over that
series of simple statements assuming that they are too simple to be wrong.
3. The clue to what is wrong in your code is in the values of your variables and the
flow of control. Try to see what the facts are pointing to. The computer is not trying to
mislead you. Work from the facts.
4. Be systematic. Be persistent. Don't panic. The bug is not moving around in your
code, trying to trick or evade you. It is just sitting in one place, doing the wrong thing
in the same way every time.
5. If your code was working a minute ago but now isn't, what was the last thing
you changed? This incredibly reliable rule of thumb is the reason your section leader
told you to test your code as you go rather than all at once.
6. Do not change your code haphazardly trying to track down a bug. This is sort of
like a scientist who changes more than one variable at a time. It makes the observed
behavior much more difficult to interpret, and you tend to introduce new bugs.
7. If you find some wrong code which does not seem to be related to the bug you
were tracking, fix the wrong code anyway. Many times the wrong code was related to
or obscured the bug in a way you had not imagined.
8. You should be able to explain in Sherlock Holmes style the series of facts, tests,
and deductions which led you to find a bug. Alternatively, if you have a bug but can't
pinpoint it, then you should be able to give an argument to a critical third party
detailing why each one of your procedures cannot contain the bug. One of these
arguments will contain a flaw since one of your procedures does in fact contain a bug.
Trying to construct the arguments may help you to see the flaw.
9. Be critical of your beliefs about your code. It's almost impossible to see a bug in a
procedure when your instinct is that the procedure is innocent. In that case, only when
the facts have proven without question that the procedure is the source of the problem
will you be able to see the bug.
10. You need to be systematic, but there is still an enormous amount of room for
beliefs, hunches, guesses, etc. Use your intuition about where the bug probably is to
direct the order that you check things in your systematic search. Check the procedures
you suspect the most first. Good instincts will come with experience.
11. Debugging depends on an objective and reasoned approach. It depends on overall
perspective and understanding of the workings of your code. Debugging code is more
mentally demanding than writing code. The longer you try to track down a bug without
success, the less perspective you tend to have. Realize when you have lost the
perspective on your code to debug. Take a break. Get some sleep. You cannot debug
when you are not seeing things clearly. Many times a programmer can spend hours late
at night hunting for a bug only to finally give up at 4:00 A.M. The next day, they find the
bug in 10 minutes. What allowed them to find the bug the next day so quickly? Maybe
they just needed some sleep and time for perspective. Or maybe their subconscious
figured it out while they were asleep. In any case, the "go do something else for a
while, come back, and find the bug immediately" scenario happens too often to be an