Corning Community College

CSCS2320 Data Structures

Project: DLS0

Errata

This section will document any updates applied to the project since original release:

  • revision #: <description> (DATESTAMP)

Objective

In this project, we resume our conceptual journey and explore another data structure: stacks.

Background

A stack is considered one of the most important data structures, along with queues (next week's project) and trees. This is largely because of how often we find them playing out in nature or our day-to-day lives.

The word “stack” is defined as:

  • (generically): a pile of objects, typically one that is neatly arranged
  • (computing): a set of storage locations that store data in such a way that the most recently stored item is the first to be retrieved

Additionally, when viewing it as a verb (an action), we can also find a useful computing application within a less reputable card-playing usage:

  • shuffle or arrange (a deck of cards) dishonestly so as to gain an unfair advantage

Or, to distill it out:

  • arrange so as to gain advantage

Combining with our previous definitions, we have:

  • a set of storage locations arranged in such a way as to give us an advantage: the most recently stored item (the last to be placed onto the stack) is the first to be retrieved.

Lists and Nodes

So, how does all this list and node stuff play into our stack implementation?

Well, we're going to build the stack ON TOP OF lists (which are composed of nodes).

Therefore, a stack is a data structure that stores its data in a list (which consists of nodes), and we apply various rules/restrictions on our access of that list data.

The concept of restricting access is a very important one, which we applied to our list as well (limiting our access to the list through the use of append(), insert(), and obtain() versus manipulating the next/prev pointers manually all the time). By limiting how we access the data, we give ourselves certain algorithmic advantages:

  • error reduction: if we have a small set of operations that can do one thing, and do their one thing extremely well (insert(), append(), and obtain() again, for instance), we can then rely on them to do the low-level grunt work, freeing us up to accomplish higher level tasks (such as sorting or swapping), or even things like determining if a word is a palindrome.
  • performance: by restricting our available choices, the edge cases we have to check for are reduced, and in ideal situations, the average case moves closer to the best case.

conceptualizing a stack

It is common to think of a stack as a vertical object, much like a pile of papers that need to be processed (or a pile of anything we need to work with).

Although we've commonly viewed lists horizontally (from left to right), there is absolutely nothing requiring this positional orientation.

Similarly, stacks possess no mandatory orientation, but we do usually visualize them as vertical entities, largely because that's how the piles of paper that accumulate on our desks tend to grow.

the stack

The stack data structure presents certain advantages that encourage its use in solving problems (why do we stack a bunch of papers all in the same place to create piles? Why is that more advantageous than giving each one its own unique desk space?), and we accomplish that by its compositional definition:

  • a stack has a top, basically a node pointer that constantly points to the top node in the stack (equivalent to the underlying list's last pointer).
  • to put an item on the stack, we push it there. So one of the functions we'll be implementing is push(), which will take the node we wish to place on the given stack, and push will handle all the necessary coordination with its underlying list (i.e. it should call existing list functions to manipulate the list)
  • to get an item off of the stack, we pop it. In our pop() function, we grab the top node off the stack (this also translates into a set of list-level transactions that our pop() function will handle for us).

These qualities cause the stack to be described as a LIFO (or FILO) structure:

  • LIFO: Last In First Out
  • FILO: First In Last Out

And that describes what is conceptually going on– if we can ONLY access our data through one location (the top), the data most immediately available to us is that which we most recently placed there (hence the last one we pushed in would be the first one we get back when popping it).

This concept is very important, and being aware of it can be of significant strategic importance when going about solving problems (and seeing its pattern proliferate in nature).

With that said, the existence of top, along with the core push() and pop() functions, defines the minimal necessary requirements to interface with a stack. Sometimes we'll see additional actions sneak in. While these may be commonly associated with stacks, they should not be confused as core requirements of a stack:

  • peek: the ability to gain access to the top node without removing it from the stack
  • is the stack empty?: the ability to query the stack and determine if it is empty or non-empty (or perhaps if non-empty, how full is it?)

While we may be implementing these supplemental functions, it should be noted that not only are they in no way necessary for using a stack, they could be detrimental (just as relying on counting can be a crutch).

Their inclusion should ONLY be viewed as a means of convenience (in certain scenarios they may result in less code needing to be written), but NOT as something you should routinely make use of.

size can matter

With a stack, there sometimes exists a need to cap its total size (especially in applications on the computer, where we may have only allocated a fixed amount of space and cannot exceed it). For this reason, we will need to maintain a count of nodes in the stack (i.e., the underlying list).

This is why dll2 exists: to introduce qty back into the list struct.

Additionally, the stack will have a configured maximum size: once the quantity of nodes in the list has reached the configured size of the stack, we should prevent any additional pushes.

It should also be pointed out that in other applications, a stack need not have a maximum size, in which case it can theoretically grow an indefinite amount. We will explore both conditions (unbounded and bounded stacks) in this project.

stack error conditions

There are two very important operational error conditions a stack can experience:

  • stack overflow: this is the situation where the quantity of the list is equal to the configured stack size (in the case of a bounded stack), and we try to push another node onto the stack (the stack is only so high, and pushing one more time will cause it to overflow)
  • stack underflow: this is the situation where the stack is empty, yet we still try to pop a value from it.

Project Overview

For this project, we're going to be implementing the stack data structure atop our recently re-implemented linked list (the doubly linked list).

Should you be having any lingering issues with your doubly-linked list implementation, remember that the test reference implementation is (and has been) available. With this, you don't have to worry about all the supporting node and list functions that aren't the focus of the project.

inc/stack.h

To implement a stack, we'll be creating a new type of struct. Continuing our previous pattern, we'll isolate that specific information in its own header file:

#ifndef _STACK_H
#define _STACK_H
 
//////////////////////////////////////////////////////////////////////
//
// Stack relies on list (which relies on node) to work.
// See the layers?
//
#include "list.h"
 
//////////////////////////////////////////////////////////////////////
//
// Define the stack struct
//
struct stack {
    Node              *top;            // pointer to top of stack
    List              *data;           // pointer to stack data
    ulli               size;           // size of stack
};
 
code_t  mkstack(Stack **, ulli);       // create new stack of size
code_t  cpstack(Stack  *, Stack **);   // duplicate stack
code_t  rmstack(Stack **);             // deallocate stack
 
code_t  push   (Stack **, Node   *);   // add new node onto stack
code_t  pop    (Stack **, Node  **);   // grab node off of stack
code_t  peek   (Stack  *, Node  **);   // show top node of stack
 
code_t  isempty(Stack  *);             // check stack emptiness
 
#endif

As indicated, with stacks, suddenly a lot of the underlying details start to be abstracted away. And the total number of unique functions being created also tends to decrease.

For our stack implementation, just as with our doubly-linked list implementation, we will make use of the double pointer in order to achieve passing parameters by address.

This is necessary so that we can free up the return value of push() and pop() to be used for status (i.e., to look out for stack overflows and underflows).

peek() and isempty() are being implemented as an exercise to aid in your understanding of stacks. Again, avoid their use except as a means of convenience (or to further optimize your code). The general rule of thumb is that the use of peek() and isempty() should result in shortening your code in a clear or clever way.

If you cannot think of how to solve a problem without the use of peek()/isempty(), that is a strong clue that you shouldn't be using them.

Also, while nothing is stopping you from doing so, the idea here is that things like size and the underlying list qty in stack transactions will NOT be accessed outside of the push() and pop() functions. Just like my warnings about using qty in your list solutions– do not consider size as a variable for your general use (push() will probably be the only place it is used).

In object-oriented programming, both size and qty would be private member variables of their respective classes, unable to be used by anything other than their respective member functions.

inc/data.h

With stacks, the following new information has been added to data.h:

//////////////////////////////////////////////////////////////////////
//
// Status codes for the doubly linked stack implementation
//
#define  DLS_SUCCESS         0x0000000001000000
#define  DLS_CREATE_FAIL     0x0000000002000000
#define  DLS_NULL            0x0000000004000000
#define  DLS_EMPTY           0x0000000008000000
#define  DLS_OVERFLOW        0x0000000010000000
#define  DLS_UNDERFLOW       0x0000000020000000
#define  DLS_ERROR           0x0000000040000000
#define  DLS_INVALID         0x0000000080000000
#define  DLS_DEFAULT_FAIL    0x0000000000804000

Technical note: Due to space constraints (there are 9 stack status codes), you'll notice DLS_DEFAULT_FAIL is not a unique number, but a combination of two previous values. This is made possible by using two values that should never be regularly occurring, and especially not in combination: DLN_DEFAULT_FAIL and DLL_DEFAULT_FAIL. I had to employ a similar trick with queues, which you'll see in next week's project.

stack library

In src/stack/, you will find skeletons of the above prototyped functions, hollowed out in anticipation of being made operational.

Figure out what is going on, the connections, and make sure you understand it.

Again, your stack is to utilize the list for its underlying data storage operations. This is what the stack's data list pointer is to be used for.

stack operation status codes

You'll notice the presence of a set of stack-related #define's in the data.h header file. These are intended to be used to report on various states of stack status after performing various operations.

They are not exclusive- in some cases, multiple states can be applied. The intent is that you will OR together all pertinent states and return that from the function.

  • DLS_SUCCESS - everything went according to plan, no errors encountered, average case
  • DLS_CREATE_FAIL - memory allocation failed (considered in error)
  • DLS_NULL - result is NULL (probably in error)
  • DLS_EMPTY - result is an empty list/stack (may or may not be in error)
  • DLS_OVERFLOW - operation exceeds allocated size of list (may be considered an error)
  • DLS_UNDERFLOW - operation cannot proceed due to lack of data (may be considered an error)
  • DLS_DEFAULT_FAIL - default state of unimplemented functions (default error)
  • DLS_ERROR - some error occurred
  • DLS_INVALID - invalid state (pointer to stack does not exist)

For example, in the case of “DLS_CREATE_FAIL”, there are actually a total of three states raised:

  • DLS_ERROR (a problem has occurred)
  • DLS_CREATE_FAIL (a problem has occurred when using malloc())
  • DLS_NULL (no memory allocated, so stack cannot be anything but NULL)

ALL THREE states must be returned from the function in question should such an occurrence take place (in addition, various underlying list and node status codes may be present as well– see the unit tests for more information).

Stack library unit tests

In testing/stack/unit/, you will find these files:

  • unit-mkstack.c - unit test for mkstack() library function
  • unit-cpstack.c - unit test for cpstack() library function
  • unit-rmstack.c - unit test for rmstack() library function
  • unit-push.c - unit test for push() library function
  • unit-pop.c - unit test for pop() library function
  • unit-peek.c - unit test for peek() library function
  • unit-isempty.c - unit test for isempty() library function

There are also corresponding verify-FUNCTION.sh scripts that will output a “MATCH”/“MISMATCH” to confirm overall conformance with the pertinent stack functionality.

These are complete runnable programs (when compiled, and linked against the stack library, which is all handled for you by the Makefile system in place).

Of particular importance, I want you to take a close look at:

  • the source code to each of these unit tests
    • the purpose of these programs is to validate the correct functionality of the respective library functions
    • follow the logic
    • make sure you understand what is going on
    • ask questions to get clarification!
  • the output from these programs once compiled and run
    • analyze the output
    • make sure you understand what is going on
    • ask questions to get clarification!

stack testing applications

palindrome-stack

Now that we've completed our stack functionality, we can use these individual functions to piece together solutions to various everyday problems where a stack could be effective (and even compare approaches to when we didn't have the benefit of a stack in solving the problem). After all, that's a big aspect to learning data structures- they open doors to new algorithms and problem solving capabilities.

Our task (once again) will be that of palindromes (i.e., words/phrases that, when reversed, spell the same thing).

This implementation will be considered an extra credit opportunity, so as to offer those who have fallen behind (but are working to get caught up) a reprieve on some of the credit they've lost.

It is also highly recommended to undertake as it will give you further experience working with these concepts.

Note this is a DIFFERENT approach than you would have taken in the program with sll2 and dll1- you're to use stack functionality to aid you with the heavy lifting. You should not be directly using any list functions in the implementation of this solution, except perhaps in the initial building of the input string (otherwise use the stack, and let the stack use the list functions).

Expected Results

To assist you in verifying a correct implementation, you can check your implementation against the results of my implementation:

stack library

Here is what you should get for stack:

lab46:~/src/data/dls0$ make check
======================================================
=    Verifying Doubly-Linked Stack Functionality     =
======================================================
   [mkstack] Total:   9, Matches:   9, Mismatches:   0
      [push] Total:  18, Matches:  18, Mismatches:   0
       [pop] Total:  19, Matches:  19, Mismatches:   0
   [cpstack] Total:  11, Matches:  11, Mismatches:   0
      [peek] Total:  20, Matches:  20, Mismatches:   0
   [isempty] Total:   5, Matches:   5, Mismatches:   0
   [rmstack] Total:  10, Matches:  10, Mismatches:   0
======================================================
   [RESULTS] Total:  92, Matches:  92, Mismatches:   0
====================================================== 
lab46:~/src/data/dls0$ 

Submission

Project Submission

When you are done with the project and are ready to submit it, you simply run make submit:

lab46:~/src/data/PROJECT$ make submit
...

Submission Criteria

To be successful in this project, the following criteria must be met:

  • Project must be submitted on time, by the posted deadline.
    • Early submissions will earn 1 bonus point per full day in advance of the deadline.
      • Bonus eligibility requires an honest attempt at performing the project (no blank efforts accepted)
    • Late submissions will lose 25% credit per day, with the submission window closing on the 4th day following the deadline.
      • To clarify: if a project is due on Wednesday (before its end), it would then be 25% off on Thursday, 50% off on Friday, 75% off on Saturday, and worth 0% once it becomes Sunday.
      • Certain projects may not have a late grace period, and the due date is the absolute end of things.
  • all requested functions must be implemented in the related library
  • all requested functionality must conform to stated requirements (either on this project page or in comment banner in source code files themselves).
  • Output generated must conform to any provided requirements and specifications (be it in writing or sample output)
    • output obviously must also be correct based on input.
  • Processing must be correct based on input given and output requested
  • Project header files are NOT to be altered. During evaluation the stock header files will be copied in, which could lead to compile-time problems.
  • Code must compile cleanly.
    • Each source file must compile cleanly (worth 3 total points):
      • 3/3: no compiler warnings, notes or errors.
      • 2/3: one of warning or note present during compile
      • 1/3: two of warning or note present during compile
      • 0/3: compiler errors present (code doesn't compile)
  • Code must be nicely and consistently indented (you may use the indent tool)
    • You are free to use your own coding style, but you must be consistent
    • Avoid unnecessary blank lines (some are good for readability, but do not go overboard; double-spacing your code will get points deducted).
    • Indentation will be rated on the following scale (worth 3 total points):
      • 3/3: Aesthetically pleasing, pristine indentation, easy to read, organized
      • 2/3: Mostly consistent indentation, but some distractions (superfluous or lacking blank lines, or some sort of “busy” ness to the code)
      • 1/3: Some indentation issues, difficult to read
      • 0/3: Lack of consistent indentation (didn't appear to try)
  • Unless fundamentally required, none of your code should perform any inventory or manual counting. Basing your algorithms off such fixed numbers complicates things, and is demonstrative of a more controlling nature.
  • Code must be commented
    • Any “to be implemented” comments MUST be removed
      • these “to be implemented” comments, if still present at evaluation time, will result in points being deducted.
    • Commenting will be rated on the following scale (worth 3 total points):
      • 3/3: Aesthetically pleasing (comments aligned or generally not distracting), easy to read, organized
      • 2/3: Mostly consistent, some distractions or gaps in comments (not explaining important things)
      • 1/3: Light commenting effort, not much time or energy appears to have been put in.
      • 0/3: No original comments
      • should I deserve nice things, my terminal is usually 90 characters wide. So if you'd like to format your code not to exceed 90 character wide terminals (and avoid line wrapping comments), at least as reasonably as possible, those are two sure-fire ways of making a good impression on me with respect to code presentation and comments.
    • Sufficient comments explaining the point of provided logic MUST be present
  • Code must be appropriately modified
    • Appropriate modifications will be rated on the following scale (worth 3 total points):
      • 3/3: Complete attention to detail, original-looking implementation- also is not unnecessarily reinventing existing functionality
      • 2/3: Lacking some details (like variable initializations), but otherwise complete (still conforms, or conforms mostly to specifications), and reinvents some wheels
      • 1/3: Incomplete implementation (typically lacking some obvious details/does not conform to specifications)
      • 0/3: Incomplete implementation to the point of non-functionality (or was not started at all)
    • Implementation must be accurate with respect to the spirit/purpose of the project (if the focus is on exploring a certain algorithm to produce results, but you avoid the algorithm yet still produce the same results– that's what I'm talking about here).. worth 3 total points:
      • 3/3: Implementation is in line with spirit of project
      • 2/3: Some avoidance/shortcuts taken (note this does not mean optimization– you can optimize all you want, so long as it doesn't violate the spirit of the project).
      • 1/3: Generally avoiding the spirit of the project (new, different things, resorting to old and familiar, despite it being against the directions)
      • 0/3: entirely avoiding.
    • Error checking must be adequately and appropriately performed, according to the following scale (worth 3 total points):
      • 3/3: Full and proper error checking performed for all reasonable cases, including queries for external resources and data.
      • 2/3: Enough error checking performed to pass basic project requirements and work for most operational cases.
      • 1/3: Minimal error checking, code is fragile (code may not work in full accordance with project requirements)
      • 0/3: No error checking (code likely does not work in accordance with project requirements)
  • Any and all non-void functions written must have, at most, 1 return statement
    • points will be lost for solutions containing multiple return statements in a function.
  • Absolutely, positively NO (as in ZERO) use of goto statements.
    • points will most definitely be lost for solutions employing such things.
  • Track/version the source code in a repository
  • Filling out any submit-time questionnaires
  • Submit a copy of your source code to me using the submit tool (make submit will do this) by the deadline.
haas/fall2019/data/projects/dls0.txt · Last modified: 2017/10/24 07:50 by 127.0.0.1