gdb program flow

Another viable use of gdb is helping you get a detailed picture of what a program is actually doing.

Rather than only using it to hunt down segfaults, we will more commonly use it to chase down logic errors in our code.

The following example explores why the data/dll1 unit-compare unit test reports the wrong result, specifically in test 8:

Test 8: Comparing populated lists that differ ...
List1: 64 -> 48 -> 32 -> 37 -> 8 -> 4 -> 2 -> 1 -> NULL
List2: 64 -> 48 -> 32 -> 37 -> 8 -> 4 -> 2 -> 1 -> NULL
 you have: CMP_EQUALITY
should be: CMP_L2_GREATER | CMP_L1_LESS

compile with debugging support

For the best gdb experience, always compile with debugging support. If you are using a Makefile, a debug recipe has typically been set up to take care of this for you (essentially, it adds -g to the compiler command line).

getting in position: setting a breakpoint

We are often not interested in ALL the processing that takes place from start to finish, but in a particular sub-section of code. So, for our sanity, instead of manually stepping through EACH and EVERY line from the program's commencement, we will let it start up as usual and run until it reaches a point we designate.

We do this by setting a breakpoint, which stops automatic execution and allows us a more fine-grained, step-by-step look at everything.

So, we load the program into gdb:

dll1$ gdb bin/unit-compare

At the (gdb) prompt, we will list the main function and navigate to the code in question:

(gdb) list
1       #include <stdio.h>
2       #include "list.h"
3       #include "support.h"
4
5       int main()
6       {
7           //////////////////////////////////////////////////////////////////
8           //
9           // Declare variables
10          //

Pressing Enter at an empty prompt repeats the last command, which in this case lists the next 10 lines each time.

We are interested in test #8, so we scan the fprintf() messages for the “Comparing populated lists that differ” message, which occurs here:

(gdb)
101         lscodes(result);
102         fprintf(stdout, "should be: ");
103         lscodes(CMP_L2_EMPTY);
104         fflush (stdout);
105
106         fprintf(stdout, "\nTest %d: Comparing populated lists that differ ...\n", testno++);
107         myList2 = NULL;
108         cplist(myList1, &myList2);
109         myList2 -> lead -> right -> right -> right -> VALUE = 37;
110         result                    = compare(myList1, myList2, &pos);

Line 106 would be an excellent stopping point, don't you think?

So, we'll set a breakpoint there:

(gdb) break 106
Breakpoint 1 at 0x40124b: file unit-compare.c, line 106.
(gdb)

running the program as usual

Now, we run the program as usual and let it get us to the breakpoint:

(gdb) run
Starting program: /home/wedge/src/repos/mfaucet2/2020/data/dll1/bin/unit-compare
==========================================
UNIT TEST: list library compare() function
==========================================
...
Test 7: Comparing populated list against empty list ...
List1: 64 -> 48 -> 32 -> 16 -> 8 -> 4 -> 2 -> 1 -> NULL
List2: -> NULL
 you have: CMP_L2_EMPTY
should be: CMP_L2_EMPTY

Breakpoint 1, main () at unit-compare.c:106
106         fprintf(stdout, "\nTest %d: Comparing populated lists that differ ...\n", testno++);
(gdb) 

OF IMPORTANCE: it has STOPPED at line 106, but it has NOT YET EXECUTED line 106.

proceeding step by step

There are 2 gdb commands for proceeding to the next line of source:

  • step
  • next

Both will execute the current line and proceed to the next, but there is an important difference between them: step will single step into a function called on that line, whereas next will step over it (letting it run as usual).

Since the next line to be run calls fprintf(), a function we did not write, we do NOT want to step into it (no debugging symbols are there for it), so we will step OVER it with next (n for short). That prints the Test 8 banner and leaves us stopped at line 107.
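To make the difference between step and next concrete, here is a minimal sketch to practice on. The names sum_to() and report() are made up for illustration and are not part of the dll1 project.

```c
#include <stdio.h>

/* Hypothetical helper: using "step" on a line that calls this function
 * takes us inside it; using "next" runs it to completion. */
int sum_to(int n)
{
    int total = 0;

    for (int i = 1; i <= n; i++)
        total += i;

    return total;
}

/* Hypothetical function to break on (for example: break report). */
int report(int n)
{
    printf("summing 1..%d\n", n);    /* library call: step over with next */
    int result = sum_to(n);          /* our own code: step enters sum_to() */
    printf("result: %d\n", result);  /* library call: step over with next */

    return result;
}
```

If this is compiled with -g and report() is called from a main(), a breakpoint on report stops at its first line. From there, next runs each printf() call to completion, while step on the sum_to() line lands inside sum_to().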

getting gdb to tell us things

Another valuable use of gdb is showing us the current values of things, so that we can determine whether they are correct at this point (did the problem occur before now, or will it occur after now?).

The next line to be executed assigns NULL to myList2, so we may want to check the state of both myList1 and myList2.

There are 2 commands to tell us useful things:

  • print - one-time reveal of information
  • display - repeated reveal of information (shown again every time the program stops)

I'm going to display things about myList1 and myList2:

(gdb) display myList1
1: myList1 = (List *) 0x6c1f50
(gdb) display *myList1
2: *myList1 = {lead = 0x6c2170, last = 0x6c1fb0}
(gdb) display *myList2
3: *myList2 = {lead = 0x0, last = 0x0}
(gdb) display myList2
4: myList2 = (List *) 0x6c21b0
(gdb) 

As we can see here, myList1 and myList2 point to different list structures in memory, and when we dereference them, their contents differ as well: myList1 has nodes, while myList2 is currently an empty list (lead and last are both 0x0). We are about to set myList2 itself to NULL:

(gdb) n
108         cplist(myList1, &myList2);
1: myList1 = (List *) 0x6c1f50
2: *myList1 = {lead = 0x6c2170, last = 0x6c1fb0}
3: *myList2 = <error: Cannot access memory at address 0x0>
4: myList2 = (List *) 0x0
(gdb) 

Now, myList2 is NULL, and we're ABOUT to call cplist().

Another useful piece of information would be a printout of both lists (since we are about to copy myList1). gdb can call functions in the program being debugged, so I'm going to set up two display expressions that call the dllX display() function:

(gdb) display (void *) display (myList1, 0)
5: (void *) display (myList1, 0) = 64 -> 48 -> 32 -> 16 -> 8 -> 4 -> 2 -> 1 -> NULL
(void *) 0x10000
(gdb) display (void *) display (myList2, 0)
6: (void *) display (myList2, 0) = NULL
(void *) 0x80000
(gdb) 
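The display() calls above work because gdb can evaluate function calls in the program being debugged. The helper below is a rough sketch of that kind of function. dump_list() and the Node/List layouts are assumptions for illustration, not the course library's actual display().

```c
#include <stdio.h>

/* Assumed layouts: only lead, last, right, and VALUE are visible in the
 * session above; everything else here is hypothetical. */
typedef struct node {
    int          VALUE;
    struct node *left;
    struct node *right;
} Node;

typedef struct list {
    Node *lead;
    Node *last;
} List;

/* Print the list on one line and return how many nodes were visited,
 * or -1 for a NULL list pointer. */
int dump_list(const List *list)
{
    int count = 0;

    if (list == NULL) {
        printf("NULL\n");
        fflush(stdout);
        return -1;
    }

    for (const Node *cur = list->lead; cur != NULL; cur = cur->right) {
        printf("%d -> ", cur->VALUE);
        count++;
    }
    printf("NULL\n");
    fflush(stdout);   /* flush so the output shows up before the next prompt */

    return count;
}
```

With a function like this linked into the program, an expression such as display dump_list(myList1) at the (gdb) prompt would print the list every time the program stops.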

stepping OVER the function

As a quick test to see if cplist() is the problem, we'll let it run as usual with next and compare the results:

(gdb) n
109         myList2 -> lead -> right -> right -> right -> VALUE = 37;
1: myList1 = (List *) 0x6c1f50
2: *myList1 = {lead = 0x6c2170, last = 0x6c1fb0}
3: *myList2 = {lead = 0x6c2170, last = 0x6c1fb0}
4: myList2 = (List *) 0x6c21d0
5: (void *) display (myList1, 0) = 64 -> 48 -> 32 -> 16 -> 8 -> 4 -> 2 -> 1 -> NULL
(void *) 0x10000
6: (void *) display (myList2, 0) = 64 -> 48 -> 32 -> 16 -> 8 -> 4 -> 2 -> 1 -> NULL
(void *) 0x10000
(gdb) 

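Take a look here: both myList1 and myList2 print the same values, as we would expect after a copy.

BUT: look at the pointer addresses. myList2 points to a new list structure (0x6c21d0), yet its lead and last pointers are identical to those of myList1 (0x6c2170 and 0x6c1fb0).

This means cplist() isn't actually copying the list contents: both lists share the same nodes. So when line 109 writes 37 into a node through myList2, myList1 sees the change too, which is why test 8 shows two identical lists and compare() reports CMP_EQUALITY.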
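One way to end up with this symptom is a copy that duplicates the List structure but not the nodes behind it. The sketch below reproduces that behavior in miniature. shallow_copy(), append_value(), and the struct layouts are assumptions for illustration; we have not yet seen what cplist() really does.

```c
#include <stdlib.h>

/* Assumed layouts: only lead, last, right, and VALUE are visible in the
 * session above; everything else here is hypothetical. */
typedef struct node {
    int          VALUE;
    struct node *left;
    struct node *right;
} Node;

typedef struct list {
    Node *lead;
    Node *last;
} List;

/* Append a value to the end of the list; returns 0 on success, -1 if
 * memory could not be allocated. */
int append_value(List *list, int value)
{
    Node *n = malloc(sizeof *n);

    if (n == NULL)
        return -1;

    n->VALUE = value;
    n->right = NULL;
    n->left  = list->last;

    if (list->last != NULL)
        list->last->right = n;
    else
        list->lead = n;
    list->last = n;

    return 0;
}

/* A "copy" that allocates a new List structure but reuses the old nodes. */
List *shallow_copy(const List *old)
{
    List *copy = malloc(sizeof *copy);

    if (copy != NULL)
        *copy = *old;   /* copies the lead/last pointers, not the nodes */

    return copy;
}
```

Changing a VALUE through a list returned by shallow_copy() changes the original as well, since both lists walk the very same nodes. In gdb this looks like what we just saw: two different List addresses whose lead and last pointers match.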

stepping INTO the function

Okay, so there's a potential problem with cplist(). What do we do?

We'll start by re-running the program (our breakpoint and display expressions carry over), and this time step into cplist() instead of using next to go over it:

(gdb) run
The program being debugged has been started already.
Start it from the beginning? (y or n) y
Starting program: /home/wedge/src/repos/mfaucet2/2020/data/dll1/bin/unit-compare
==========================================
UNIT TEST: list library compare() function
==========================================
...
Test 7: Comparing populated list against empty list ...
List1: 64 -> 48 -> 32 -> 16 -> 8 -> 4 -> 2 -> 1 -> NULL
List2: -> NULL
 you have: CMP_L2_EMPTY
should be: CMP_L2_EMPTY

Breakpoint 1, main () at unit-compare.c:106
106         fprintf(stdout, "\nTest %d: Comparing populated lists that differ ...\n", testno++);
1: myList1 = (List *) 0x6c1f50
2: *myList1 = {lead = 0x6c2170, last = 0x6c1fb0}
3: *myList2 = {lead = 0x0, last = 0x0}
4: myList2 = (List *) 0x6c21b0
5: (void *) display (myList1, 0) = 64 -> 48 -> 32 -> 16 -> 8 -> 4 -> 2 -> 1 -> NULL
(void *) 0x10000
6: (void *) display (myList2, 0) = -> NULL
(void *) 0x200000
(gdb) n

Test 8: Comparing populated lists that differ ...
107         myList2 = NULL;
1: myList1 = (List *) 0x6c1f50
2: *myList1 = {lead = 0x6c2170, last = 0x6c1fb0}
3: *myList2 = {lead = 0x0, last = 0x0}
4: myList2 = (List *) 0x6c21b0
5: (void *) display (myList1, 0) = 64 -> 48 -> 32 -> 16 -> 8 -> 4 -> 2 -> 1 -> NULL
(void *) 0x10000
6: (void *) display (myList2, 0) = -> NULL
(void *) 0x200000
(gdb) n
108         cplist(myList1, &myList2);
1: myList1 = (List *) 0x6c1f50
2: *myList1 = {lead = 0x6c2170, last = 0x6c1fb0}
3: *myList2 = <error: Cannot access memory at address 0x0>
4: myList2 = (List *) 0x0
5: (void *) display (myList1, 0) = 64 -> 48 -> 32 -> 16 -> 8 -> 4 -> 2 -> 1 -> NULL
(void *) 0x10000
6: (void *) display (myList2, 0) = NULL
(void *) 0x80000
(gdb) s
cplist (oldList=0x6c1f50, newList=0x7fffffffe168) at cp.c:28
28              code_t code = DLL_ERROR;
(gdb) 

We're now inside cplist(), and can carry on with similar actions as needed (displaying values, setting breakpoints, using step and next) to determine the source of the problem.

Even if we just next through the function until we reach its return, we can see what logic it follows, which helps us determine whether what we WANT to happen is what is ACTUALLY happening, and do further testing if need be.
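As a reference point while reading through cplist(), here is a rough sketch of what copying the contents (rather than the pointers) looks like. It is not the course's cplist(): the deep_copy() and free_list() names, the return-a-pointer interface, and the struct layouts are all assumptions for illustration.

```c
#include <stdlib.h>

/* Assumed layouts: only lead, last, right, and VALUE are visible in the
 * session above; everything else here is hypothetical. */
typedef struct node {
    int          VALUE;
    struct node *left;
    struct node *right;
} Node;

typedef struct list {
    Node *lead;
    Node *last;
} List;

/* Release every node in the list, then the list structure itself. */
void free_list(List *list)
{
    if (list == NULL)
        return;

    Node *cur = list->lead;
    while (cur != NULL) {
        Node *after = cur->right;
        free(cur);
        cur = after;
    }
    free(list);
}

/* Build a new list whose nodes are freshly allocated copies of the old
 * ones. Returns NULL for a NULL input or if allocation fails. */
List *deep_copy(const List *old)
{
    if (old == NULL)
        return NULL;

    List *copy = malloc(sizeof *copy);
    if (copy == NULL)
        return NULL;
    copy->lead = NULL;
    copy->last = NULL;

    for (const Node *cur = old->lead; cur != NULL; cur = cur->right) {
        Node *n = malloc(sizeof *n);
        if (n == NULL) {
            free_list(copy);      /* clean up the partial copy */
            return NULL;
        }
        n->VALUE = cur->VALUE;    /* copy the data, not the pointer */
        n->right = NULL;
        n->left  = copy->last;

        if (copy->last != NULL)
            copy->last->right = n;
        else
            copy->lead = n;
        copy->last = n;
    }

    return copy;
}
```

After a copy like this, displaying both lists in gdb should show matching values but different lead and last addresses, and writing 37 through the copy leaves the original untouched.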

haas/fall2020/common/projects/gdbprogramflow.txt · Last modified: 2020/10/15 10:34 by wedge