This shows you the differences between two versions of the page.
Next revision | Previous revision | ||
haas:spring2015:common:intro-to-gdb [2012/09/29 16:12] – external edit 127.0.0.1 | haas:spring2015:common:intro-to-gdb [2015/04/04 15:00] (current) – [Viewing program data] wedge | ||
---|---|---|---|
Line 25: | Line 25: | ||
There are many, many more features, but we're just getting started, and these are by far the most useful for us in our endeavors. | There are many, many more features, but we're just getting started, and these are by far the most useful for us in our endeavors. | ||
+ | |||
+ | =====Compile-time Support===== | ||
+ | To take full advantage of the debugging environment, | ||
+ | |||
+ | We can do this by adding a **-g** to the compiler' | ||
+ | |||
+ | <cli> | ||
+ | $ gcc -g -o hello hello.c | ||
+ | </ | ||
+ | |||
+ | =====Using a debugger during program execution===== | ||
+ | A debugger acts as a sort of wrapper when running a program. It runs the desired program within it, so that we can use the debugger' | ||
+ | |||
+ | As such, we need to start a debugging session as follows: | ||
+ | |||
+ | <cli> | ||
+ | $ gdb ./hello | ||
+ | </ | ||
+ | |||
+ | Note we run the program as usual, but we take care to prefix it with " | ||
+ | |||
+ | =====Segfault mitigation===== | ||
+ | If your program is segfaulting, | ||
+ | |||
+ | Let's take a program that will segfault on execution: | ||
+ | |||
+ | <code c 1> | ||
+ | #include < | ||
+ | |||
+ | struct thing { | ||
+ | int val; | ||
+ | struct thing *other; | ||
+ | }; | ||
+ | |||
+ | int main() | ||
+ | { | ||
+ | struct thing *stuff; | ||
+ | char c, *s, hi = 0, len = 0; | ||
+ | while ((c = fgetc(stdin)) != ' | ||
+ | { | ||
+ | *(s+len) = c; | ||
+ | fprintf(stdout, | ||
+ | len = len + 1; | ||
+ | |||
+ | if (c > hi) | ||
+ | hi = c; | ||
+ | } | ||
+ | fprintf(stdout, | ||
+ | |||
+ | stuff -> val = hi; | ||
+ | |||
+ | fprintf(stdout, | ||
+ | |||
+ | return(0); | ||
+ | } | ||
+ | </ | ||
+ | |||
+ | Type this in and name it (I'll use **input.c** as my example name). | ||
+ | |||
+ | Compile it with debugging support: | ||
+ | |||
+ | <cli> | ||
+ | $ gcc -g -o input input.c | ||
+ | $ | ||
+ | </ | ||
+ | |||
+ | The debugger is only useful with code free of syntax errors (because it requires the code to successfully compile to work). If your code does not compile, you cannot use the debugger to help fix the problem. | ||
+ | |||
+ | Now, let us start gdb with **input** as our debug target: | ||
+ | |||
+ | <cli> | ||
+ | $ gdb ./input | ||
+ | </ | ||
+ | |||
+ | A smallish banner message will appear, and at the very bottom will be a " | ||
+ | |||
+ | For starters, let us just run the program and see what happens. We do this by issuing the " | ||
+ | |||
+ | <cli> | ||
+ | (gdb) run | ||
+ | </ | ||
+ | |||
+ | It'll appear to pause; that's because it is expecting input. Type in something (hello) and hit ENTER to allow it to proceed. You should then see something resembling the following: | ||
+ | |||
+ | <cli> | ||
+ | (gdb) run | ||
+ | Starting program: / | ||
+ | hello | ||
+ | just read: ' | ||
+ | just read: ' | ||
+ | just read: ' | ||
+ | just read: ' | ||
+ | just read: ' | ||
+ | hello | ||
+ | |||
+ | Program received signal SIGSEGV, Segmentation fault. | ||
+ | 0x0000000000400682 in main () at input.c:23 | ||
+ | 23 stuff -> val = hi; | ||
+ | (gdb) | ||
+ | </ | ||
+ | |||
+ | Aha! A segfault! And look at what the debugger just told us... the EXACT line that, when processed, results in a segfault: | ||
+ | |||
+ | <cli> | ||
+ | 0x0000000000400682 in main () at input.c:23 | ||
+ | 23 stuff -> val = hi; | ||
+ | </ | ||
+ | |||
+ | In fact, there are 3 important pieces of information that are immediately useful to us: | ||
+ | |||
+ | - This problem occurred within the **main()** function (narrowing our search) | ||
+ | - The problem manifested itself specifically on line 23 of input.c, within the main() function | ||
+ | - The offending statement is this piece of code: **stuff -> val = hi;** | ||
+ | |||
+ | Now, we also know that the code compiled cleanly-- no warnings or errors. So there are no syntax errors. | ||
+ | |||
+ | So what could the problem be? | ||
+ | |||
+ | For that, more debugging steps are in order. | ||
+ | |||
+ | First, if there were multiple function calls at work, it might help to know the order in which they took place (how did we get here? The problem may not be here, but in something that came before). It is a good idea to perform a function call backtrace, which lists the chain of calls from where we are now back to where we started (we always start at **main()**). If there do not appear to be any problems here, we can then come up with strategies for testing the functions that ran before it. | ||
+ | |||
+ | To do a backtrace, simply type **bt** at the " | ||
+ | |||
+ | <cli> | ||
+ | (gdb) bt | ||
+ | #0 0x0000000000400682 in main () at input.c:23 | ||
+ | (gdb) | ||
+ | </ | ||
+ | |||
+ | =====Setting a breakpoint===== | ||
+ | Now that we know our problem is on line 23 of main(), and if just knowing that didn't lead to identifying and fixing the problem (you did go back and take a look, right? The debugger assists you in solving problems, it does not solve problems for you), we'll have to dig a little deeper. | ||
+ | |||
+ | The next approach we should take is setting a breakpoint. A breakpoint is essentially a cue given to the debugger to STOP execution once it reaches a given line. It is important to realize that the line in question **HAS NOT** yet been run, but is **ABOUT** to be run. | ||
+ | |||
+ | ====set a breakpoint==== | ||
+ | We know the problem is on line 23, so let us set a breakpoint there: | ||
+ | |||
+ | <cli> | ||
+ | (gdb) break 23 | ||
+ | Breakpoint 1 at 0x40067a: file input.c, line 23. | ||
+ | (gdb) | ||
+ | </ | ||
+ | |||
+ | ====re-run the program==== | ||
+ | Now, let us start execution once again: | ||
+ | |||
+ | <cli> | ||
+ | (gdb) run | ||
+ | The program being debugged has been started already. | ||
+ | Start it from the beginning? (y or n) y | ||
+ | Starting program: / | ||
+ | hello | ||
+ | just read: ' | ||
+ | just read: ' | ||
+ | just read: ' | ||
+ | just read: ' | ||
+ | just read: ' | ||
+ | hello | ||
+ | |||
+ | Breakpoint 1, main () at input.c:23 | ||
+ | 23 stuff -> val = hi; | ||
+ | (gdb) | ||
+ | </ | ||
+ | |||
+ | Okay... we've done it... re-run the program, and this time stopped just short of where the segfault seems to be taking place. | ||
+ | |||
+ | =====Viewing program data===== | ||
+ | Now it is time to take a look at what is actually going on. We THINK we know what is going on, but clearly what we think and what is actually are two different things (we think there shouldn' | ||
+ | |||
+ | So, looking at our suspect line: | ||
+ | |||
+ | <code c> | ||
+ | 23 stuff -> val = hi; | ||
+ | </ | ||
+ | |||
+ | Let us see what the states of these variables are. | ||
+ | |||
+ | ====printing values during debug==== | ||
+ | To check the current state of a variable, we can use the **print** or **display** command to gdb. | ||
+ | |||
+ | **print** will do a one time display of the state of a variable. | ||
+ | |||
+ | **display** will set a display point, printing that variable's state after each further gdb command that stops the program (very useful for watching a loop play out). | ||
+ | |||
+ | For now, let us take a look at both the **hi** and **stuff -> val** variables: | ||
+ | |||
+ | <cli> | ||
+ | (gdb) print hi | ||
+ | $1 = 111 ' | ||
+ | (gdb) print stuff -> val | ||
+ | $2 = -1991643855 | ||
+ | </ | ||
+ | |||
+ | That value of **hi** should make sense (it should be set to the highest character value encountered during execution (user input)... if you typed in " | ||
+ | |||
+ | **stuff -> val** prints a seemingly random value. But we know that **stuff** is a pointer, and we didn't initialize it, so we're seeing whatever garbage happens to be at the memory location it points to. | ||
+ | |||
+ | Nothing seemingly out of place... let's check out the **stuff** variable itself: | ||
+ | |||
+ | <cli> | ||
+ | (gdb) print stuff | ||
+ | $3 = (struct thing *) 0x4004d0 < | ||
+ | </ | ||
+ | |||
+ | Even that seems okay... it is a pointer, it should have an address. | ||
+ | |||
+ | Okay, so everything seems in order... let's try executing this line (and just this one line) and see what happens. | ||
+ | |||
+ | =====Single-Stepping===== | ||
+ | The debugger allows us to ' | ||
+ | |||
+ | There are 2 stepping commands: | ||
+ | |||
+ | * **s**tep: execute the next instruction (line of code), following execution into any function that is called | ||
+ | * **n**ext: execute the next instruction, | ||
+ | |||
+ | The **step** command lets us follow the thread of program execution, wherever it may lead. This can have its uses, but we have to be careful: we can only go where there is debugging support. While we compiled our program with debugging support, we linked against a non-debug C library. So any of those functions (**fgetc()**, | ||
+ | |||
+ | When faced with a function call without debug symbols, or we simply do not wish to follow the thread of execution into that function, we can instead opt to step over it as if it were just a simple instruction. This is where the **next** command comes in handy. | ||
+ | |||
+ | Let us execute that variable assignment by issuing a **next** command (abbreviated **n**): | ||
+ | |||
+ | <cli> | ||
+ | (gdb) n | ||
+ | |||
+ | Program received signal SIGSEGV, Segmentation fault. | ||
+ | 0x0000000000400682 in main () at input.c:23 | ||
+ | 23 stuff -> val = hi; | ||
+ | (gdb) | ||
+ | </ | ||
+ | |||
+ | Everything seemed fine, but then when we tried to run it, bam- segfault. | ||
+ | |||
+ | So something is clearly awry here. | ||
+ | |||
+ | Knowing what those two variables are, **hi** likely isn't the problem; it is just a regular scalar variable. | ||
+ | |||
+ | But **stuff** is a pointer. We know that when using pointers, we open the door to these kinds of problems. | ||
+ | |||
+ | So what might the problem be? | ||
+ | |||
+ | =====Solution===== | ||
+ | This solution requires knowledge of the program itself-- its purpose, and the code contained therein. So clearly, if you aren't familiar with the code, not even the debugger can help you get to some solutions. | ||
+ | |||
+ | In this case, the problem was that while we declared **stuff** as a pointer to a thing struct, we neglected to **allocate** memory, or point it at an existing instance of a thing struct. | ||
+ | |||
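+ | Adding this line near the top of main(), before **stuff** is first used, would clear up the problem (**malloc()** is declared in **stdlib.h**, so that header needs to be included as well): | ||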
+ | |||
+ | <code c> | ||
+ | stuff = (struct thing *) malloc (sizeof (struct thing)); | ||
+ | </ | ||
+ | |||
+ | What also could have helped identify this problem sooner would have been to initialize **stuff** to NULL (one should ALWAYS set their variables to sane initial values). Printing **stuff** in the debugger would then have shown it to be NULL, making it obvious that there was no **val** element to access, and that trying to access one is what caused the segfault. | ||
+ | |||
+ | As it was, **stuff** WAS pointing somewhere, but an invalid location... so trying to modify the data there resulted in the operating system yelling at us. | ||
+ | |||
+ | Seeing the NULL would have better clued us in that we had forgotten to **malloc()** the space, and could have more easily come to that solution. As it was, we had to do a little bit of detective work to eventually figure out it was the lack of memory allocation (and default invalid pointing of pointer) that created our problem. |