Another good use of gdb is getting a detailed picture of what a program is actually doing.
Instead of using it to hunt down segfaults, we will more commonly use it to chase down logic errors in our code.
The following example explores why the data/dll1 unit-compare unit test reports a mismatch, specifically in test 8:
    Test 8: Comparing populated lists that differ ...
    List1: 64 -> 48 -> 32 -> 37 -> 8 -> 4 -> 2 -> 1 -> NULL
    List2: 64 -> 48 -> 32 -> 37 -> 8 -> 4 -> 2 -> 1 -> NULL
    you have: CMP_EQUALITY
    should be: CMP_L2_GREATER | CMP_L1_LESS
For the best gdb experience, always compile with debug support. If you are using a Makefile, a debug recipe has typically been set up already and should take care of this for you (essentially, it adds -g to the compiler command line).
We are often not interested in ALL the processing that takes place from start to finish, but instead a particular sub-section of code. So, for our sanity, instead of manually sifting through EACH and EVERY instruction from the program's commencement, we will instead let it start up as usual, and go until it reaches a point we designate.
We will do this by setting a breakpoint, which will stop automatic execution, and allow us to have a more fine-grained look at everything (step-by-step).
So, we load the program into gdb:
dll1$ gdb bin/unit-compare
At the (gdb) prompt, we will list the main function and navigate to the code in question:
    (gdb) list
    1       #include <stdio.h>
    2       #include "list.h"
    3       #include "support.h"
    4
    5       int main()
    6       {
    7           //////////////////////////////////////////////////////////////////
    8           //
    9           // Declare variables
    10          //
Hitting enter repeats the last command, which in this case will cause it to list the next 10 lines, and again, and again.
In this case, we are interested in test #8, so we are scanning the printf() messages for that “Comparing populated lists that differ” message, which occurs here:
    (gdb)
    101         lscodes(result);
    102         fprintf(stdout, "should be: ");
    103         lscodes(CMP_L2_EMPTY);
    104         fflush (stdout);
    105
    106         fprintf(stdout, "\nTest %d: Comparing populated lists that differ ...\n", testno++);
    107         myList2 = NULL;
    108         cplist(myList1, &myList2);
    109         myList2 -> lead -> right -> right -> right -> VALUE = 37;
    110         result = compare(myList1, myList2, &pos);
Line 106 would be an excellent stopping point, don't you think?
So, we'll set a breakpoint there:
    (gdb) break 106
    Breakpoint 1 at 0x40124b: file unit-compare.c, line 106.
    (gdb)
Now, we run the program as usual and let it get us to the breakpoint:
    (gdb) run
    Starting program: /home/wedge/src/repos/mfaucet2/2020/data/dll1/bin/unit-compare
    ==========================================
    UNIT TEST: list library compare() function
    ==========================================
    ...
    Test 7: Comparing populated list against empty list ...
    List1: 64 -> 48 -> 32 -> 16 -> 8 -> 4 -> 2 -> 1 -> NULL
    List2: -> NULL
    you have: CMP_L2_EMPTY
    should be: CMP_L2_EMPTY
    Breakpoint 1, main () at unit-compare.c:106
    106         fprintf(stdout, "\nTest %d: Comparing populated lists that differ ...\n", testno++);
    (gdb)
OF IMPORTANCE: it has STOPPED at line 106, but it has NOT YET EXECUTED line 106.
There are 2 gdb commands for proceeding to the next line of code: step (abbreviated s) and next (abbreviated n).
Both will execute the current line and proceed onto the next, but there is an important difference between them: step will single step into a function, whereas next will step over it (letting it run as usual).
Since the next line to be run is a call to fprintf(), a function we did not write, we do NOT want to step into it (no debugging symbols are there for it), so we will step OVER it with next.
Another valuable use of gdb is to give us the current values of things, so that we can determine if they are currently correct (was the problem before now, or after now).
Our next instruction to be executed is an assignment of myList2 to NULL, so we may want to check the state of both myList1 and myList2.
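There are 2 commands to tell us useful things: print, which shows the value of an expression once, and display, which shows it again every time the program stops.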
I'm going to display things about myList1 and myList2:
    (gdb) display myList1
    1: myList1 = (List *) 0x6c1f50
    (gdb) display *myList1
    2: *myList1 = {lead = 0x6c2170, last = 0x6c1fb0}
    (gdb) display *myList2
    3: *myList2 = {lead = 0x0, last = 0x0}
    (gdb) display myList2
    4: myList2 = (List *) 0x6c21b0
    (gdb)
As we can see here, myList1 and myList2 point to different list structures in memory, and when we dereference them, their contents differ as well (myList2 is currently an empty list, which we are about to set to NULL):
    (gdb) n
    108         cplist(myList1, &myList2);
    1: myList1 = (List *) 0x6c1f50
    2: *myList1 = {lead = 0x6c2170, last = 0x6c1fb0}
    3: *myList2 = <error: Cannot access memory at address 0x0>
    4: myList2 = (List *) 0x0
    (gdb)
Now, myList2 is NULL, and we're ABOUT to call cplist().
Another piece of useful information to have would be a display output of both lists (since we are about to copy myList1). I'm going to set two display values working with the dllX display() function:
    (gdb) display (void *) display (myList1, 0)
    5: (void *) display (myList1, 0) = 64 -> 48 -> 32 -> 16 -> 8 -> 4 -> 2 -> 1 -> NULL
    (void *) 0x10000
    (gdb) display (void *) display (myList2, 0)
    6: (void *) display (myList2, 0) = NULL
    (void *) 0x80000
    (gdb)
As a quick test of whether cplist() is the problem, we let it run as usual (stepping over it with n) and compare the results:
    (gdb) n
    109         myList2 -> lead -> right -> right -> right -> VALUE = 37;
    1: myList1 = (List *) 0x6c1f50
    2: *myList1 = {lead = 0x6c2170, last = 0x6c1fb0}
    3: *myList2 = {lead = 0x6c2170, last = 0x6c1fb0}
    4: myList2 = (List *) 0x6c21d0
    5: (void *) display (myList1, 0) = 64 -> 48 -> 32 -> 16 -> 8 -> 4 -> 2 -> 1 -> NULL
    (void *) 0x10000
    6: (void *) display (myList2, 0) = 64 -> 48 -> 32 -> 16 -> 8 -> 4 -> 2 -> 1 -> NULL
    (void *) 0x10000
    (gdb)
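Take a look here: both myList1 and myList2 show the same node values.
BUT: look at the pointer addresses. The list structures themselves sit at different addresses (0x6c1f50 and 0x6c21d0), yet the lead and last pointers of BOTH lists are identical.
This means cplist() isn't actually copying the list contents: both lists share the same nodes. That explains the test 8 output, since the assignment of 37 on line 109 changes a node seen by both lists, so compare() reports CMP_EQUALITY.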
Okay, so there's a potential problem with cplist(). What do we do?
We start by re-running the program, and this time we step into cplist() rather than next over it:
    (gdb) run
    The program being debugged has been started already.
    Start it from the beginning? (y or n) y
    Starting program: /home/wedge/src/repos/mfaucet2/2020/data/dll1/bin/unit-compare
    ==========================================
    UNIT TEST: list library compare() function
    ==========================================
    ...
    Test 7: Comparing populated list against empty list ...
    List1: 64 -> 48 -> 32 -> 16 -> 8 -> 4 -> 2 -> 1 -> NULL
    List2: -> NULL
    you have: CMP_L2_EMPTY
    should be: CMP_L2_EMPTY
    Breakpoint 1, main () at unit-compare.c:106
    106         fprintf(stdout, "\nTest %d: Comparing populated lists that differ ...\n", testno++);
    1: myList1 = (List *) 0x6c1f50
    2: *myList1 = {lead = 0x6c2170, last = 0x6c1fb0}
    3: *myList2 = {lead = 0x0, last = 0x0}
    4: myList2 = (List *) 0x6c21b0
    5: (void *) display (myList1, 0) = 64 -> 48 -> 32 -> 16 -> 8 -> 4 -> 2 -> 1 -> NULL
    (void *) 0x10000
    6: (void *) display (myList2, 0) = -> NULL
    (void *) 0x200000
    (gdb) n
    Test 8: Comparing populated lists that differ ...
    107         myList2 = NULL;
    1: myList1 = (List *) 0x6c1f50
    2: *myList1 = {lead = 0x6c2170, last = 0x6c1fb0}
    3: *myList2 = {lead = 0x0, last = 0x0}
    4: myList2 = (List *) 0x6c21b0
    5: (void *) display (myList1, 0) = 64 -> 48 -> 32 -> 16 -> 8 -> 4 -> 2 -> 1 -> NULL
    (void *) 0x10000
    6: (void *) display (myList2, 0) = -> NULL
    (void *) 0x200000
    (gdb) n
    108         cplist(myList1, &myList2);
    1: myList1 = (List *) 0x6c1f50
    2: *myList1 = {lead = 0x6c2170, last = 0x6c1fb0}
    3: *myList2 = <error: Cannot access memory at address 0x0>
    4: myList2 = (List *) 0x0
    5: (void *) display (myList1, 0) = 64 -> 48 -> 32 -> 16 -> 8 -> 4 -> 2 -> 1 -> NULL
    (void *) 0x10000
    6: (void *) display (myList2, 0) = NULL
    (void *) 0x80000
    (gdb) s
    cplist (oldList=0x6c1f50, newList=0x7fffffffe168) at cp.c:28
    28          code_t code = DLL_ERROR;
    (gdb)
We're now into cplist(), and can carry on similar actions as needed (displaying, setting breakpoints, stepping/nexting) to determine the source of the problem.
Even if we next through until we encounter the return, we can see what logic the function follows, which can help us determine if what we WANT to happen is what is ACTUALLY happening, and do further testing if need be.