<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE guide SYSTEM "/dtd/guide.dtd">
<!-- $Header: /var/cvsroot/gentoo/xml/htdocs/doc/en/draft/debugging-howto.xml,v 1.3 2005/07/14 09:42:27 swift Exp $ -->

<guide link="/doc/en/debugging-howto.xml">
<title>Gentoo Linux Debugging Guide</title>

<author title="Author">
  <mail link="chriswhite@gentoo.org">Chris White</mail>
</author>
<author title="Editor">
  <mail link="fox2mike@gentoo.org">Shyam Mani</mail>
</author>

<abstract>
This document aims at helping the user debug various errors they may encounter
during day-to-day usage of Gentoo.
</abstract>

<!-- The content of this document is licensed under the CC-BY-SA license -->
<!-- See http://creativecommons.org/licenses/by-sa/2.5 -->
<license/>

<version>1.0</version>
<date>2005-07-13</date>

<chapter>
<title>Introduction</title>
<section>
<title>Preface</title>
<body>

<p>
One of the factors that delay a bug being fixed is the way it is reported. By
creating this guide, we hope to help improve the communication between
developers and users in bug resolution. Getting bugs fixed is an important, if
not crucial, part of the quality assurance for any project, and hopefully this
guide will help make that a success.
</p>

</body>
</section>
<section>
<title>Bugs!!!!</title>
<body>

<p>
You're emerge-ing a package or working with a program and suddenly the worst
happens -- you find a bug. Bugs come in many forms like emerge failures or
segmentation faults. Whatever the cause, the fact still remains that such a bug
must be fixed. Here are a few examples of such bugs.
</p>

<pre caption="A run time error">
$ <i>./bad_code `perl -e 'print Ax100'`</i>
Segmentation fault
</pre>

<pre caption="An emerge failure">
/usr/lib/gcc-lib/i686-pc-linux-gnu/3.3.2/include/g++-v3/backward/backward_warning.h:32:2:
warning: #warning This file includes at least one deprecated or antiquated
header. Please consider using one of the 32 headers found in section 17.4.1.2 of
the C++ standard. Examples include substituting the &lt;X&gt; header for the &lt;X.h&gt;
header for C++ includes, or &lt;sstream&gt; instead of the deprecated header
&lt;strstream.h&gt;. To disable this warning use -Wno-deprecated.
In file included from main.cc:40:
menudef.h:55: error: brace-enclosed initializer used to initialize `
OXPopupMenu*'
menudef.h:62: error: brace-enclosed initializer used to initialize `
OXPopupMenu*'
menudef.h:70: error: brace-enclosed initializer used to initialize `
OXPopupMenu*'
menudef.h:78: error: brace-enclosed initializer used to initialize `
OXPopupMenu*'
main.cc: In member function `void OXMain::DoOpen()':
main.cc:323: warning: unused variable `FILE*fp'
main.cc: In member function `void OXMain::DoSave(char*)':
main.cc:337: warning: unused variable `FILE*fp'
make[1]: *** [main.o] Error 1
make[1]: Leaving directory
`/var/tmp/portage/xclass-0.7.4/work/xclass-0.7.4/example-app'
make: *** [shared] Error 2

!!! ERROR: x11-libs/xclass-0.7.4 failed.
!!! Function src_compile, Line 29, Exitcode 2
!!! 'emake shared' failed
</pre>

<p>
These errors can be quite troublesome. However, once you find them, what do
you do? The following sections will look at two important tools for handling
run time errors. After that, we'll take a look at compile errors, and how to
handle them. Let's start out with the first tool for debugging run time
errors -- <c>gdb</c>.
</p>

</body>
</section>
</chapter>


<chapter>
<title>Debugging using GDB</title>
<section>
<title>Introduction</title>
<body>

<p>
GDB, or the (G)NU (D)e(B)ugger, is a program used to find run time errors that
normally involve memory corruption. First off, let's take a look at what
debugging entails. One of the main things you must do in order to debug a
program is to <c>emerge</c> the program with <c>FEATURES="nostrip"</c>. This
prevents the stripping of debug symbols. Why are programs stripped by default?
For the same reason that man pages are gzipped -- to save space. Here's how
the size of a program varies with and without debug symbol stripping.
</p>

<pre caption="Filesize Comparison">
<comment>(debug symbols stripped)</comment>
-rwxr-xr-x 1 chris users 3140 6/28 13:11 bad_code
<comment>(debug symbols intact)</comment>
-rwxr-xr-x 1 chris users 6374 6/28 13:10 bad_code
</pre>

<p>
Just for reference, <e>bad_code</e> is the program we'll be debugging with
<c>gdb</c> later on. As you can see, the program without debugging symbols is
3140 bytes, while the program with them is 6374 bytes. That's roughly double
the size! Two more things can be done for debugging. The first is adding
<c>-ggdb3</c> to your CFLAGS and CXXFLAGS. This flag adds more debugging
information than is generally included. We'll see what that means later on.
This is how <path>/etc/make.conf</path> <e>might</e> look with the newly added
flags.
</p>

<pre caption="make.conf settings">
CFLAGS="-O2 -pipe -ggdb3"
CXXFLAGS="${CFLAGS}"
</pre>

<p>
Lastly, you can also add <c>debug</c> to the package's USE flags. This can be
done with the <path>package.use</path> file.
</p>

<pre caption="Using package.use to add debug USE flag">
# <i>echo "category/package debug" &gt;&gt; /etc/portage/package.use</i>
</pre>

<note>
The directory <path>/etc/portage</path> does not exist by default, so you may
have to create it if you have not already done so. If the package already has
USE flags set in <path>package.use</path>, you will need to modify them
manually in your favorite editor.
</note>
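
<p>
The note above can be turned into a small guard. What follows is only a sketch:
<c>add_debug_use</c> is a hypothetical helper name, not a Portage tool, and it
assumes the one-line-per-package layout of <path>package.use</path> used in
this guide. It appends the <c>debug</c> entry only when the package has no
entry yet, and otherwise asks you to edit the file by hand.
</p>

<pre caption="Sketch: adding the debug USE flag without duplicating an entry">
```shell
# Hypothetical helper, not part of Portage.
# Usage: add_debug_use category/package [path to package.use]
add_debug_use() {
    pkg="$1"
    file="${2:-/etc/portage/package.use}"

    # /etc/portage may not exist yet, so create the parent directory.
    mkdir -p "$(dirname "$file")"

    # An existing entry has to be merged by hand, as the note says.
    if grep -q "^${pkg}[[:space:]]" "$file" 2>/dev/null; then
        echo "$pkg already has an entry in $file; edit it manually"
        return 1
    fi

    echo "$pkg debug" >> "$file"
}
```
</pre>

<p>
Running <c>add_debug_use category/package</c> as root would then behave like
the <c>echo</c> command above the first time, and refuse to add a second line
for the same package afterwards.
</p>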

<p>
Then we re-emerge the package with the modifications we've made so far, as
shown below.
</p>

<pre caption="Re-emerging a package with debugging">
# <i>FEATURES="nostrip" emerge package</i>
</pre>

<p>
Now that debug symbols are set up, we can continue with debugging the program.
</p>
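
<p>
Before moving on, it can be worth confirming that the rebuilt binary really
kept its symbols. The sketch below assumes the <c>file</c> utility is
installed; for ELF binaries its output normally ends in either "stripped" or
"not stripped". <c>symbols_status</c> is a hypothetical helper name that simply
interprets one line of that output.
</p>

<pre caption="Sketch: checking whether a binary was stripped">
```shell
# Hypothetical helper: interpret one line of output from file(1).
symbols_status() {
    case "$1" in
        *"not stripped"*) echo "symbols intact" ;;
        *stripped*)       echo "symbols stripped" ;;
        *)                echo "unknown" ;;
    esac
}

# Typical use on a real binary:
#   symbols_status "$(file ./bad_code)"
```
</pre>

<p>
Note that this only reports whether the symbol table is present. It does not
tell you how much extra detail <c>-ggdb3</c> added.
</p>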

</body>
</section>
<section>
<title>Running the program with GDB</title>
<body>

<p>
Let's say we have a program here called "bad_code". Someone claims that the
program crashes and provides an example. You go ahead and test it out:
</p>

<pre caption="Breaking The Program">
$ <i>./bad_code `perl -e 'print Ax100'`</i>
Segmentation fault
</pre>

<p>
It seems this person was right. Since the program is obviously broken, we have
a bug at hand. Now, it's time to use <c>gdb</c> to help solve this matter. First
we run <c>gdb</c> with <c>--args</c>, then give it the full program with
arguments as shown:
</p>

<pre caption="Running Our Program Through GDB">
$ <i>gdb --args ./bad_code `perl -e 'print Ax100'`</i>
GNU gdb 6.3
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB. Type "show warranty" for details.
This GDB was configured as "i686-pc-linux-gnu"...Using host libthread_db library "/lib/libthread_db.so.1".
</pre>

<note>
One can also debug with core dumps. These core files contain the same
information that the program would produce when run with gdb. To debug
bad_code with a core file, you would run <c>gdb ./bad_code core</c>, where
core is the name of the core file.
</note>

<p>
You should see a prompt that says "(gdb)" and waits for input. First, we have to
run the program. We type in <c>run</c> at the prompt and receive a notice like:
</p>

<pre caption="Running the program in GDB">
(gdb) <i>run</i>
Starting program: /home/chris/bad_code

Program received signal SIGSEGV, Segmentation fault.
0xb7ec6dc0 in strcpy () from /lib/libc.so.6
</pre>

<p>
Here we see the program starting, as well as a notification of SIGSEGV, or
Segmentation Fault. This is GDB telling us that our program has crashed. It
also names the last function it could trace when the program crashed.
However, this isn't too useful, as there could be multiple strcpy() calls in
the program, making it hard for developers to find which one is causing the
issue. In order to help them out, we do what's called a backtrace. A backtrace
walks backwards through the chain of function calls that were still active when
the program crashed, starting at the function at fault. Functions that had
already returned (without causing a crash) will not show up in the backtrace.
To get a backtrace, type <c>bt</c> at the (gdb) prompt. You will get something
like this:
</p>

<pre caption="Program backtrace">
(gdb) <i>bt</i>
#0 0xb7ec6dc0 in strcpy () from /lib/libc.so.6
#1 0x0804838c in run_it ()
#2 0x080483ba in main ()
</pre>

<p>
The trace pattern is clear. main() is called first, followed by run_it(), and
somewhere in run_it() lies the strcpy() at fault. Details like this help
developers narrow down problems. The output will not always look this clean,
however. The most common cause is forgetting to keep debug symbols with
<c>FEATURES="nostrip"</c>. With debug symbols stripped, the output looks
something like this:
</p>

<pre caption="Program backtrace With debug symbols stripped">
(gdb) <i>bt</i>
#0 0xb7e2cdc0 in strcpy () from /lib/libc.so.6
#1 0x0804838c in ?? ()
#2 0xbfd19510 in ?? ()
#3 0x00000000 in ?? ()
#4 0x00000000 in ?? ()
#5 0xb7eef148 in libgcc_s_personality () from /lib/libc.so.6
#6 0x080482ed in ?? ()
#7 0x080495b0 in ?? ()
#8 0xbfd19528 in ?? ()
#9 0xb7dd73b8 in __guard_setup () from /lib/libc.so.6
#10 0xb7dd742d in __guard_setup () from /lib/libc.so.6
#11 0x00000006 in ?? ()
#12 0xbfd19548 in ?? ()
#13 0x080483ba in ?? ()
#14 0x00000000 in ?? ()
#15 0x00000000 in ?? ()
#16 0xb7deebcc in __new_exitfn () from /lib/libc.so.6
#17 0x00000000 in ?? ()
#18 0xbfd19560 in ?? ()
#19 0xb7ef017c in nullserv () from /lib/libc.so.6
#20 0xb7dd6f37 in __libc_start_main () from /lib/libc.so.6
#21 0x00000001 in ?? ()
#22 0xbfd195d4 in ?? ()
#23 0xbfd195dc in ?? ()
#24 0x08048201 in ?? ()
</pre>

<p>
This backtrace contains a large number of ?? marks. This is because without
debug symbols, <c>gdb</c> cannot map the addresses in the program back to
function names. Hence, it is crucial that debug symbols are <e>not</e>
stripped. Now remember that a while ago we mentioned the -ggdb3 flag. Let's see
what the output looks like with the flag enabled:
</p>

<pre caption="Program backtrace with -ggdb3">
(gdb) <i>bt</i>
#0 0xb7e4bdc0 in strcpy () from /lib/libc.so.6
#1 0x0804838c in run_it (input=0x0) at bad_code.c:7
#2 0x080483ba in main (argc=1, argv=0xbfd3a434) at bad_code.c:12
</pre>

<p>
Here we see that a lot more information is available for developers. Not only
is function information displayed, but also the function arguments and the
exact line numbers in the source files. This is the preferred method if you can
spare the extra space. Here's how much the file size varies between stripped,
unstripped, and -ggdb3 enabled programs.
</p>

<pre caption="Filesize differences With -ggdb3 flag">
<comment>(debug symbols stripped)</comment>
-rwxr-xr-x 1 chris users 3140 6/28 13:11 bad_code
<comment>(debug symbols enabled)</comment>
-rwxr-xr-x 1 chris users 6374 6/28 13:10 bad_code
<comment>(-ggdb3 flag enabled)</comment>
-rwxr-xr-x 1 chris users 19552 6/28 13:11 bad_code
</pre>

<p>
As you can see, -ggdb3 adds <e>13178</e> more bytes to the file size compared
to the one with plain debugging symbols. However, as shown above, this increase
in file size can be worth it when presenting debug information to developers.
The backtrace can be saved to a file by copying and pasting from the terminal.
On a non-X based terminal you can use gpm for this. To keep this document
simple, I recommend you read the gpm documentation to see how to copy and paste
with it. Now that we're done with <c>gdb</c>, we can quit.
</p>

<pre caption="Quitting GDB">
(gdb) <i>quit</i>
The program is running. Exit anyway? (y or n) <i>y</i>
$
</pre>
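
<p>
As an alternative to copying and pasting, the whole session can be captured
without typing at the (gdb) prompt. The sketch below is hedged in two ways: it
assumes a GDB version that supports the <c>-batch</c> and <c>-ex</c> options
(newer than the 6.3 shown in this guide), and the helper names
<c>capture_backtrace</c> and <c>unresolved_frames</c> are made up for
illustration. The second helper counts the ?? frames discussed earlier, which
gives a quick hint that the binary was stripped.
</p>

<pre caption="Sketch: saving a backtrace to a file non-interactively">
```shell
# Hypothetical helpers. GDB can be overridden, which is handy for dry runs.

# Run a program under gdb non-interactively and save the session,
# including the backtrace, to a file.
# Usage: capture_backtrace backtrace.txt ./bad_code [program arguments]
capture_backtrace() {
    out="$1"
    shift
    "${GDB:-gdb}" -batch -ex run -ex bt --args "$@" > "$out"
}

# Count frames that gdb could not resolve to a function name.
# Usage: unresolved_frames backtrace.txt
unresolved_frames() {
    grep -c -F ' in ?? ()' "$1"
}
```
</pre>

<p>
A count well above zero from <c>unresolved_frames</c> suggests re-emerging the
package with <c>FEATURES="nostrip"</c> before attaching the backtrace to a bug
report.
</p>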

<p>
This ends the walk-through of <c>gdb</c>. We hope that you will be able to use
it to create better bug reports. However, there are other types of errors that
can cause a program to fail during run time. One of them is improper file
access. We can find those using a nifty little tool called <c>strace</c>.
</p>

</body>
</section>
</chapter>

<chapter>
<title>Finding file access errors using strace</title>
<section>
<title>Introduction</title>
<body>

<p>
Programs often use files to fetch configuration information, access hardware or
write logs. Sometimes, a program attempts to reach such files incorrectly. A
tool called <c>strace</c> was created to help deal with this. <c>strace</c>
traces system calls (hence the name), which include the calls that deal with
memory and files. For our example, we're going to take a program called
foobar2. This is an updated version of foobar. However, during the changeover
to foobar2, you notice all your configurations are missing! In foobar version
1, you had it set up to say "foo", but now it's using the default "bar".
</p>

<pre caption="Foobar2 With an invalid configuration">
$ <i>./foobar2</i>
Configuration says: bar
</pre>

<p>
Our previous configuration specifically had it set to foo, so let's use
<c>strace</c> to find out what's going on.
</p>

</body>
</section>
<section>
<title>Using strace to track the issue</title>
<body>

<p>
We make <c>strace</c> log the results of the system calls. To do this, we run
<c>strace</c> with the -o[file] argument. Let's use it on foobar2 as shown.
</p>

<pre caption="Running foobar2 through strace">
# <i>strace -ostrace.log ./foobar2</i>
</pre>

<p>
This creates a file called <path>strace.log</path> in the current directory.
The relevant parts of the file are shown below.
</p>

<pre caption="A Look At the strace Log">
open(".foobar2/config", O_RDONLY) = 3
read(3, "bar", 3) = 3
</pre>

<p>
Aha! So there's the problem. Someone moved the configuration directory to
<path>.foobar2</path> instead of <path>.foobar</path>. We also see the program
reading in "bar" as it should. In this case, we can recommend that the ebuild
maintainer add a warning about it. For now though, we can copy over the
config file from <path>.foobar</path> and modify it to produce the correct
results.
</p>
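
<p>
A real <path>strace.log</path> is usually much longer than the two lines shown
above, so it helps to filter it. The following sketch uses a hypothetical
helper, <c>list_opened_files</c>, that keeps only the lines for the open() and
openat() system calls. It assumes the default <c>strace</c> log layout, where
each line starts with the name of the system call.
</p>

<pre caption="Sketch: listing the files a program tried to open">
```shell
# Hypothetical helper: show only the open()/openat() calls from an strace log.
# Usage: list_opened_files strace.log
list_opened_files() {
    grep -E '^(open|openat)\(' "$1"
}
```
</pre>

<p>
Lines whose result is -1 mark files the program looked for but could not open,
which is often where a moved or missing configuration file shows up.
</p>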

</body>
</section>
<section>
<title>Conclusion</title>
<body>

<p>
<c>strace</c> is a great way to see what a program is asking the kernel to do
with the filesystem. Another program exists to help users see what the kernel
is doing, and to help with kernel debugging. This program is called
<c>dmesg</c>.
</p>

</body>
</section>
</chapter>

<chapter>
<title>Kernel Debugging With dmesg</title>
<section>
<title>dmesg Introduction</title>
<body>

<p>
<c>dmesg</c> is a system program created for debugging kernel operation. It
reads the messages that the kernel keeps in a buffer, letting the user see them
later on. Here's an example of what <c>dmesg</c> output looks like:
</p>

<pre caption="dmesg sample output">
SIS5513: IDE controller at PCI slot 0000:00:02.5
SIS5513: chipset revision 208
SIS5513: not 100% native mode: will probe irqs later
SIS5513: SiS 961 MuTIOL IDE UDMA100 controller
ide0: BM-DMA at 0x4000-0x4007, BIOS settings: hda:DMA, hdb:DMA
ide1: BM-DMA at 0x4008-0x400f, BIOS settings: hdc:DMA, hdd:DMA
Probing IDE interface ide0...
input: ImPS/2 Generic Wheel Mouse on isa0060/serio1
hda: WDC WD800BB-60CJA0, ATA DISK drive
hdb: CD-RW 52X24, ATAPI CD/DVD-ROM drive
ide0 at 0x1f0-0x1f7,0x3f6 on irq 14
Probing IDE interface ide1...
hdc: SAMSUNG DVD-ROM SD-616T, ATAPI CD/DVD-ROM drive
hdd: Maxtor 92049U6, ATA DISK drive
ide1 at 0x170-0x177,0x376 on irq 15
hda: max request size: 128KiB
hda: 156301488 sectors (80026 MB) w/2048KiB Cache, CHS=65535/16/63,
UDMA(100)
hda: cache flushes not supported
hda: hda1
hdd: max request size: 128KiB
hdd: 39882528 sectors (20419 MB) w/2048KiB Cache, CHS=39566/16/63,
UDMA(66)
hdd: cache flushes not supported
hdd: unknown partition table
hdb: ATAPI 52X CD-ROM CD-R/RW drive, 2048kB Cache, UDMA(33)
Uniform CD-ROM driver Revision: 3.20
hdc: ATAPI 48X DVD-ROM drive, 512kB Cache, UDMA(33)
ide-floppy driver 0.99.newide
libata version 1.11 loaded.
usbmon: debugs is not available
</pre>

<p>
The dmesg output displayed here is from my machine's bootup. You can see the
hard disks and input devices being initialized. While what you see here seems
relatively harmless, <c>dmesg</c> is also good at showing when things go wrong.
Let's take for example an IPAQ 1945 I have. After a couple of minutes of
inactivity, the device powers off. I have the device connected to the USB port
in the front of my system, and I want to copy over some files using libsynCE,
so I go ahead and initiate a connection:
</p>

<pre caption="IPAQ connection attempt">
# <i>synce-serial-start</i>
/usr/sbin/pppd: In file /etc/ppp/peers/synce-device: unrecognized option
'/dev/tts/USB0'

synce-serial-start was unable to start the PPP daemon!
</pre>

<p>
The connection fails, as we see here, and we assume that only the screen is in
powersave mode, and that maybe the connection is faulty. In order to see what
truly happened, we can use <c>dmesg</c>. Now, <c>dmesg</c> tends to give a
rather large amount of output. One can use the <c>tail</c> command to help
keep the output down:
</p>

<pre caption="Adjusting the output amount with tail">
$ <i>dmesg | tail -n 4</i>
usb 1-1.2: PocketPC PDA converter now attached to ttyUSB0
usb 1-1.2: USB disconnect, address 11
PocketPC PDA ttyUSB0: PocketPC PDA converter now disconnected from ttyUSB0
ipaq 1-1.2:1.0: device disconnected
</pre>

<p>
This gives us the last 4 lines of the <c>dmesg</c> output, which is enough to
give us some information on the situation. The first line shows the PocketPC
being recognized as connected. The remaining lines, however, show it being
disconnected again. With this information we check the PocketPC again, find out
it is powered off, and now know about the powersave mode. We can use this
information to turn the feature off, or be aware of it next time. While this is
a somewhat simple example, it does go to show how well <c>dmesg</c> can work.
However, in more complex cases (such as kernel bugs), the entire <c>dmesg</c>
output may be required. To obtain that, simply redirect it to a log file like
this:
</p>

<pre caption="Saving dmesg output to a log">
$ <i>dmesg &gt; dmesg.log</i>
</pre>

<p>
You can then attach this to a bug report, or post it online somewhere for
collaborative debugging sessions.
</p>
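
<p>
Once the output is saved, the same trimming idea used with <c>tail</c> earlier
can be narrowed to one device. The sketch below uses a hypothetical helper,
<c>recent_kernel_messages</c>, which prints the last few lines of a saved log
that mention a keyword. The keyword match is case-insensitive, and the default
of 4 lines is only chosen to mirror the example above.
</p>

<pre caption="Sketch: showing the most recent kernel messages for one device">
```shell
# Hypothetical helper: print the last N lines of a saved dmesg log that
# mention a keyword (case-insensitive). N defaults to 4.
# Usage: recent_kernel_messages dmesg.log usb 10
recent_kernel_messages() {
    grep -i -- "$2" "$1" | tail -n "${3:-4}"
}
```
</pre>

<p>
For the IPAQ example, <c>recent_kernel_messages dmesg.log usb</c> would leave
out unrelated lines such as the IDE probing messages.
</p>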

</body>
</section>
<section>
<title>Conclusion</title>
<body>

<p>
Now that we've taken a look at a few ways to debug runtime and kernel errors,
let's take a look at how to handle emerge errors.
</p>

</body>
</section>
</chapter>

<chapter>
<title>Handling emerge Errors</title>
<section>
<title>Introduction</title>
<body>

<p>
<c>emerge</c> errors, such as the one displayed earlier, can be a major cause
of frustration for users. Reporting them is considered crucial for maintaining
the health of Gentoo. Let's take a look at a sample ebuild, foobar2, which
contains some build errors.
</p>

</body>
</section>
<section id="emerge_error">
<title>Evaluating emerge Errors</title>
<body>

<p>
Let's take a look at this very simple <c>emerge</c> error:
</p>

<pre caption="emerge Error">
gcc -D__TEST__ -D__GNU__ -D__LINUX__ -L/usr/lib -I/usr/include -L/usr/lib/nspr/ -I/usr/include/fmod -c -o foobar2-7.o foobar2-7.c
gcc -D__TEST__ -D__GNU__ -D__LINUX__ -L/usr/lib -I/usr/include -L/usr/lib/nspr/ -I/usr/include/fmod -c -o foobar2-8.o foobar2-8.c
gcc -D__TEST__ -D__GNU__ -D__LINUX__ -L/usr/lib -I/usr/include -L/usr/lib/nspr/ -I/usr/include/fmod -c -o foobar2-9.o foobar2-9.c
gcc -D__TEST__ -D__GNU__ -D__LINUX__ -L/usr/lib -I/usr/include -L/usr/lib/nspr/ -I/usr/include/fmod -c -o foobar2.o foobar2.c
foobar2.c:1:17: ogg.h: No such file or directory
make: *** [foobar2.o] Error 1

!!! ERROR: sys-apps/foobar2-1.0 failed.
!!! Function src_compile, Line 19, Exitcode 2
!!! Make failed!
!!! If you need support, post the topmost build error, NOT this status message
</pre>

<p>
The program is compiling smoothly when it suddenly stops and presents an error
message. This particular error can be split into three sections: the compile
messages, the build error, and the emerge error message, as shown below.
</p>

<pre caption="Parts of the error">
<comment>(Compilation Messages)</comment>
gcc -D__TEST__ -D__GNU__ -D__LINUX__ -L/usr/lib -I/usr/include -L/usr/lib/nspr/ -I/usr/include/fmod -c -o foobar2-7.o foobar2-7.c
gcc -D__TEST__ -D__GNU__ -D__LINUX__ -L/usr/lib -I/usr/include -L/usr/lib/nspr/ -I/usr/include/fmod -c -o foobar2-8.o foobar2-8.c
gcc -D__TEST__ -D__GNU__ -D__LINUX__ -L/usr/lib -I/usr/include -L/usr/lib/nspr/ -I/usr/include/fmod -c -o foobar2-9.o foobar2-9.c
gcc -D__TEST__ -D__GNU__ -D__LINUX__ -L/usr/lib -I/usr/include -L/usr/lib/nspr/ -I/usr/include/fmod -c -o foobar2.o foobar2.c

<comment>(Build Error)</comment>
foobar2.c:1:17: ogg.h: No such file or directory
make: *** [foobar2.o] Error 1

<comment>(emerge Error)</comment>
!!! ERROR: sys-apps/foobar2-1.0 failed.
!!! Function src_compile, Line 19, Exitcode 2
!!! Make failed!
!!! If you need support, post the topmost build error, NOT this status message
</pre>

<p>
The compilation messages are what lead up to the error. Most often, it's good to
include at least 10 lines of compile information so that the developer knows
where the compilation was when the error occurred.
</p>

<p>
The build error is the actual error and the information the developer needs.
When you see "make: ***", this is often where the error has occurred. Normally,
you can copy and paste the 10 lines above it and the developer will be able to
address the issue. However, this may not always work and we'll take a look at
an alternative shortly.
</p>

<p>
The emerge error is what <c>emerge</c> itself reports. Sometimes, this might
also contain some important information. Often people make the mistake of
posting only the emerge error. This is useless by itself, but combined with the
build error and compile information, it tells a developer which application and
which version of the package is failing. As a side note, make is commonly used
as the build system for programs (<b>but not always</b>). If you can't find a
"make: ***" error anywhere, then simply copy and paste the 20 lines before the
emerge error. This should take care of almost all build system error messages.
Now let's say the errors are quite large, and 10 lines won't be enough to catch
everything. That's where PORT_LOGDIR comes into play.
</p>
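
<p>
The rule of thumb above can be sketched as a small filter for a saved build
log. The helper name <c>build_error_context</c> is hypothetical, and the sketch
assumes a <c>grep</c> that supports the <c>-B</c> and <c>-m</c> options, as GNU
grep does. It prints the 10 lines leading up to the first "make: ***" line, and
falls back to the 20 lines before the emerge error when no such line exists.
</p>

<pre caption="Sketch: extracting the relevant part of a build log">
```shell
# Hypothetical helper: print the part of a build log worth posting.
# Usage: build_error_context build.log
build_error_context() {
    log="$1"
    if grep -q '^make.*: \*\*\*' "$log"; then
        # First make error plus up to 10 lines of compile messages before it.
        grep -B 10 -m 1 '^make.*: \*\*\*' "$log"
    else
        # Not a make based build: take 20 lines before the emerge error.
        grep -B 20 -m 1 '^!!! ERROR' "$log"
    fi
}
```
</pre>

<p>
The pattern also matches nested forms such as "make[1]: ***", as seen in the
xclass example at the start of this guide.
</p>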

</body>
</section>
<section>
<title>emerge and PORT_LOGDIR</title>
<body>

<p>
PORT_LOGDIR is a Portage variable that sets up a log directory for separate
emerge logs. Let's take a look and see what that entails. First, run your emerge
with PORT_LOGDIR set to your favorite log location. Let's say we have a
location <path>/var/log/portage</path>. We'll use that for our log directory:
</p>

<note>
In the default setup, <path>/var/log/portage</path> does not exist, and you will
most likely have to create it. If you do not, Portage will fail to write the
logs.
</note>

<pre caption="emerge-ing With PORT_LOGDIR">
# <i>PORT_LOGDIR=/var/log/portage emerge foobar2</i>
</pre>

<p>
Now the emerge fails again. However, this time we have a log we can work with,
and attach to the bug later on. Let's take a quick look at our log directory.
</p>

<pre caption="PORT_LOGDIR Contents">
# <i>ls -la /var/log/portage</i>
total 16
drwxrws--- 2 root root 4096 Jun 30 10:08 .
drwxr-xr-x 15 root root 4096 Jun 30 10:08 ..
-rw-r--r-- 1 root root 7390 Jun 30 10:09 2115-foobar2-1.0.log
</pre>

<p>
The log files have the format [counter]-[package name]-[version].log. Counter
is a special variable that marks this package as the n-th package you've
emerged. This prevents duplicate logs from appearing. A quick look at the log
file will show the entire emerge process. This can be attached later on, as
we'll see in the bug reporting section. Now that we've safely obtained the
information needed to report the bug, we can continue to do so. However,
before we get started on that, we need to make sure no one else has reported
the issue.
</p>
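
<p>
The two manual steps in this section, creating the log directory and picking
out the right log afterwards, can be sketched as follows. Both helper names are
hypothetical, and the second one relies on the
[counter]-[package name]-[version].log naming described above, which may differ
on other Portage versions.
</p>

<pre caption="Sketch: preparing PORT_LOGDIR and finding the newest log">
```shell
# Hypothetical helpers, assuming the log naming shown above.

# The log directory may be missing in a default setup, so create it first.
# Usage: prepare_logdir /var/log/portage
prepare_logdir() {
    mkdir -p "$1"
}

# Print the most recently modified log for a package.
# Usage: latest_emerge_log /var/log/portage foobar2
latest_emerge_log() {
    ls -t "$1"/*-"$2"-*.log | head -n 1
}
```
</pre>

<p>
With the directory listing above, <c>latest_emerge_log /var/log/portage
foobar2</c> would print the path of <path>2115-foobar2-1.0.log</path>, which is
the file to attach to the bug report.
</p>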

</body>
</section>
</chapter>
</guide>