This shows you the differences between two versions of the page.
Next revision | Previous revision | ||
haas:spring2014:data:projects:nodes [2014/01/27 00:35] – created wedge | haas:spring2014:data:projects:nodes [2014/01/27 10:58] (current) – [Submission] wedge | ||
---|---|---|---|
Line 11: | Line 11: | ||
To review structs and pointers, and see how these two concepts, when combined, produces the core element of our class explorations. | To review structs and pointers, and see how these two concepts, when combined, produces the core element of our class explorations. | ||
+ | =====Reference===== | ||
+ | You absolutely, positively, MUST watch this video: http:// | ||
=====Structures===== | =====Structures===== | ||
As we learned in C, there are two main composite data types available to us: | As we learned in C, there are two main composite data types available to us: | ||
Line 21: | Line 23: | ||
And what's more, structs are the basis for classes (they are essentially structs with some relaxed rules (i.e. they can have functions) and additional syntax and abilities (constructors, | And what's more, structs are the basis for classes (they are essentially structs with some relaxed rules (i.e. they can have functions) and additional syntax and abilities (constructors, | ||
- | Structure elements are accessed with either the " | + | Structure elements are accessed with either the " |
- | Structs declared as pointers use the " | + | Structs declared as pointers use the "< |
For example: | For example: | ||
Line 58: | Line 60: | ||
</ | </ | ||
- | |||
=====Pointers===== | =====Pointers===== | ||
+ | Pointers are commonly referred to as one of the main features that gives C its power (and, by relation, C++ too). | ||
+ | |||
+ | Pointers are actually just another type of variable: a memory variable. | ||
+ | |||
+ | Now you might be one to say- but aren't all variables " | ||
+ | |||
+ | The distinction lies in the primary purpose of the variable, which we will take a closer look at. In essence, variables have 3 attributes: | ||
+ | |||
+ | * name | ||
+ | * contents | ||
+ | * address | ||
+ | |||
+ | Whether we use it or not, everything has an address. This is how the computer keeps track of things. We use the name, but that gets translated back to the address. | ||
+ | |||
+ | A pointer merely adds in another capability- the ability to dereference. You see, since pointers are " | ||
+ | |||
+ | However, because C is a typed language (variables have specific data types), certain properties are applied to a declared variable, and therefore memory. So, instead of having a unique data type for a pointer, it ends up being a modification of an existing variable declaration (an integer pointer, a char pointer, a pointer to an array of short ints, a pointer to a struct, and even- a function pointer). | ||
+ | |||
+ | By being associated with an existing type, the compiler knows what rules to apply to some destination memory address. | ||
+ | |||
+ | So, with a pointer we have a name, we have the contents (which instead of a number in accordance with a data type, we instead store a memory address), and we have the address. | ||
+ | |||
+ | To make pointers useful, we **dereference** them, or access the memory address stored in the contents of our pointer. This may be described as " | ||
+ | |||
+ | There are two operators in C associated with memory access: | ||
+ | |||
+ | * < | ||
+ | * & - address of | ||
+ | |||
+ | We've used both of these before-- to pass by address to a function, we pass the address of a variable (effectively turning it into a pointer for the uses of the called function). We dereference to get at the contents of what a pointer points to, for if we forgot or didn't dereference, | ||
+ | |||
+ | To declare a pointer, we merely specify a < | ||
+ | |||
+ | <code c> | ||
+ | int *var; | ||
+ | </ | ||
+ | |||
+ | Now, a common mistake is to try and either store a value in our variable (again, as a pointer it is a memory variable-- we don't store values, we store memory addresses), or to dereference it before it has been assigned. | ||
+ | |||
+ | As memory access is one of C's strengths, and memory protection incorporated into most modern operating systems, it isn't uncommon to experience some unique errors when dealing with pointers inappropriately in C. Specifically, | ||
+ | |||
+ | All a Segmentation Fault (or Seg Fault, SIGSEGV) happens to be is an intervention from the operating system informing us we tried to access memory that we did not have control over. | ||
+ | |||
+ | When initializing a pointer, we will either set it equal to the address of another variable, or, as will commonly be the case in Data Structures, we will allocate a block of memory and set our pointer' | ||
+ | |||
+ | =====Nodes===== | ||
+ | Now let us look at the basic underlying unit that we will be spending the semester playing with in Data Structures; the node. | ||
+ | |||
+ | There' | ||
+ | |||
+ | nodes are the building blocks of linked lists, a peer to the array. | ||
+ | |||
+ | But as we know, arrays are limited to a fixed size- we must allocate the size by first usage. If don't use it all, memory is wasted; if we don't use enough, we're out of luck (short of allocating a new array that is larger, and copying all values from old to new-- some languages have this as a more basic feature, and tend to call it a dynamic array or vector). | ||
+ | |||
+ | The linked list, on the other hand, is only allocated a node at a time. It therefore allows a best fit for the data set we are working with. For this reason (and because we are allocating memory) linked lists are associated with dynamic memory programming... as the actual memory needs are determined during runtime. | ||
+ | |||
+ | ====What is / What is in the node==== | ||
+ | So what exactly **IS** this node thing? | ||
+ | |||
+ | For one, it is something we create. A self-created variable, eh? Anything that might come to mind as far as a packaging agent? A variable that can contain other variables? | ||
+ | |||
+ | If you are thinking **struct**, you'd be absolutely right. | ||
+ | |||
+ | We will use struct to package its contents. | ||
+ | |||
+ | What is in the node? Again, this answer depends on our particular needs. For now, lacking any specific application, | ||
+ | |||
+ | The node, to make it practical and useful, also needs the ability to reference other nodes. Pointers are they key for referencing other variables (via memory address storage and dereferencing), | ||
+ | |||
+ | What type of pointer? Well, a pointer to the type that defines our node- our struct. | ||
+ | |||
+ | Here it is: | ||
+ | |||
+ | <code c> | ||
+ | struct node { | ||
+ | int value; | ||
+ | struct node *next; | ||
+ | }; | ||
+ | </ | ||
+ | |||
+ | To make our lives easier, we'll also follow up our node struct definition with a **typedef**, | ||
+ | |||
+ | <code c> | ||
+ | struct node { | ||
+ | int value; | ||
+ | struct node *next; | ||
+ | }; | ||
+ | typedef struct node Node; | ||
+ | </ | ||
+ | |||
+ | So now, when we wish to create a new node (let's say first, which we'll also make a pointer), we have two ways we can declare it: | ||
+ | |||
+ | <code c> | ||
+ | struct node *first; | ||
+ | </ | ||
+ | |||
+ | or: | ||
+ | |||
+ | < | ||
+ | Node *first; | ||
+ | </ | ||
+ | |||
+ | as I said, identical... just one is more convenient for us to type, due to the added typedef. | ||
+ | |||
+ | ====Allocation==== | ||
+ | Now, for any given node to be useful, we need to allocate memory for it (otherwise, the first attempted access to any of its members is a segfault waiting to happen). | ||
+ | |||
+ | To allocate memory, we use **malloc()**, | ||
+ | |||
+ | < | ||
+ | first = (Node *) malloc (sizeof(Node)); | ||
+ | </ | ||
+ | |||
+ | We have to cast the return type of **malloc()** because it is somewhat type generic- it returns a pointer to raw memory, and in C that needs to be associated with a specific format. Since we're allocating memory to a node, we'll have the compiler apply the rules of the node struct we just created. | ||
+ | |||
+ | ====Putting a value in the node==== | ||
+ | Assigning (and retrieving) information from the node is merely a struct operation: | ||
+ | |||
+ | < | ||
+ | // to assign | ||
+ | first -> value = 12; // first is a pointer to a struct, so we use the structure pointer arrow | ||
+ | |||
+ | // to retrieve | ||
+ | printf(" | ||
+ | |||
+ | // dealing with next- set it to an initial sane value | ||
+ | first -> next = NULL; | ||
+ | </ | ||
+ | |||
+ | ====Linking our node to another node==== | ||
+ | The key to everything is linking one node to the next. This is also the part where confusion sets in, due to the level of abstraction at play. | ||
+ | |||
+ | I highly recommend drawing pictures to help you in tracing out what is going on, especially as you are first learning this... and likely throughout the semester. And by drawing pictures, I mean take out a sheet of paper and pen/pencil, and draw nodes (use circles), write the " | ||
+ | |||
+ | At present, we have a node called first that we've allocated memory to and altered its contents, that diagram would look as follows: | ||
+ | |||
+ | {{ : | ||
+ | |||
+ | Notice how all the elements of the node are dealt with (both value and the next pointer). And first, being a mere pointer to a struct, is a name that points to (because it merely contains the address of) the memory region we **malloc()**' | ||
+ | |||
+ | Now,to link to another node (and put in, say, a 37 for its value) we'd do something along these lines: | ||
+ | |||
+ | <code c> | ||
+ | first -> next = (Node *) malloc (sizeof(Node)); | ||
+ | first -> next -> value = 37; | ||
+ | first -> next -> next = NULL; | ||
+ | </ | ||
+ | |||
+ | And don't forget to update your diagram: | ||
+ | |||
+ | {{ : | ||
+ | |||
+ | ====Use variables to make it easier to traverse the list==== | ||
+ | Since we need to keep a placeholder on our allocated memory, **first** is more or less immovable in the current context (it is our link to everything). | ||
+ | |||
+ | You may be noticing the potential for some very long code about to happen (what if we wanted to add a third nod... those next's would become next -> next, and so on). But there' | ||
+ | |||
+ | Here is that same node construction logic, redone using an additional **tmp** node pointer, and also adding in a third node (containing the value 8): | ||
+ | |||
+ | < | ||
+ | Node *first, *tmp = NULL; | ||
+ | |||
+ | first = (Node *) malloc (sizeof(Node)); | ||
+ | tmp = first; | ||
+ | |||
+ | tmp -> value = 12; | ||
+ | tmp -> next = NULL; | ||
+ | |||
+ | tmp -> next = (Node *) malloc (sizeof(Node)); | ||
+ | tmp = tmp -> next; | ||
+ | |||
+ | tmp -> value = 37; | ||
+ | tmp -> next = NULL; | ||
+ | |||
+ | tmp -> next = (Node *) malloc (sizeof(Node)); | ||
+ | tmp = tmp -> next; | ||
+ | |||
+ | tmp -> value = 8; | ||
+ | tmp -> next = NULL; | ||
+ | </ | ||
+ | |||
+ | Our node diagram now looks as follows (but ideally, you'd have been updating it line by line as this program went along-- do not wait "until the end" | ||
+ | |||
+ | {{ : | ||
+ | |||
+ | =====Procedure===== | ||
+ | I would like for you to write a C program utilizing nodes to construct a simple list, joining new values onto the end of the last node. | ||
+ | |||
+ | The program should do the following: | ||
+ | |||
+ | * implement a node containing signed char's (for **value**) | ||
+ | * prompt the user to enter a value (-1 to exit) | ||
+ | * allocate a node (or the next node-- you'll have to deal with first vs. remaining case logic) | ||
+ | * obtain input from the user and assign it to the current node | ||
+ | * once a -1 has been provided, display the list from beginning to end and then exit | ||
+ | |||
+ | Sample output would look as follows: | ||
+ | |||
+ | <cli> | ||
+ | lab46: | ||
+ | Enter a value (-1 to quit): 12 | ||
+ | Enter a value (-1 to quit): 37 | ||
+ | Enter a value (-1 to quit): 8 | ||
+ | Enter a value (-1 to quit): 59 | ||
+ | Enter a value (-1 to quit): 86 | ||
+ | Enter a value (-1 to quit): 4 | ||
+ | Enter a value (-1 to quit): -1 | ||
+ | |||
+ | List: 12 -> 37 -> 8 -> 59 -> 86 -> 4 -> NULL | ||
+ | lab46: | ||
+ | </ | ||
+ | |||
+ | NOTE: This is just example input. Not only should your program work with this, but lists of any length, containing any arrangement of valid values. | ||
+ | |||
+ | =====Submission===== | ||
+ | To successfully complete this project, the following criteria must be met: | ||
+ | |||
+ | * Handwritten diagram fully detailed for an 8 node list uploaded to your Opus | ||
+ | * insert the image into your journal, and have related commentary on what is going on | ||
+ | * submit the URL (to that week's journal entry) along with your program | ||
+ | * Code must compile cleanly (no warnings or errors) | ||
+ | * Executed program must display in a manner similar to provided output. | ||
+ | * Output must be correct | ||
+ | * Code must be nicely and consistently indented (you may use the **indent** tool) | ||
+ | * Code must be commented | ||
+ | * have a properly filled-out comment banner at the top | ||
+ | * have at least 20% of your program consist of **< | ||
+ | * Track/ | ||
+ | * Submit a copy of your source code to me using the **submit** tool. | ||
+ | |||
+ | To submit this program to me using the **submit** tool, run the following command at your lab46 prompt: | ||
+ | |||
+ | <cli> | ||
+ | $ submit data nodes node.c http:// | ||
+ | Submitting data project " | ||
+ | -> node.c(OK) | ||
+ | -> http:// | ||
+ | |||
+ | SUCCESSFULLY SUBMITTED | ||
+ | </ | ||
+ | |||
+ | You should get some sort of confirmation indicating successful submission if all went according to plan. If not, check for typos and or locational mismatches. |