
Corning Community College

CSCS2330 Discrete Structures

Project: Binary Data Tool (bdt1)

Objective

To apply our binary data knowledge and interactions in the creation of a useful tool for aiding us in the debugging of our dcfX endeavors.

Background

With our recent xxd(1) tool implementation in bdt0, we are going to continue down this rabbit hole a bit by writing a tool of particular value to our dcfX endeavors.

One of the things I noticed while helping debug was a frequent loss of place while looking at the hex output of encoded data. Seconds were lost relocating the points of difference, and over time, those lost seconds add up.

It would be uniquely useful if we had a way to highlight the first point (byte) of difference in two files, so we can then focus on why/how they are different, vs. devoting far too much time to discovering what is different.

Task

Your task is to write a custom binary difference visualizer, in a format not unlike that of xxd(1), but certainly different from the output format we strove for in bdt0.

The idea is to take 2 files as input, parse through those (ideally similar) files, until the first point of difference is found, at which point your tool will display:

Thought empowerment vs. thought slavery

Something I've noticed with many who are so used to conforming and following authority is that the thought of “questioning why things are” rarely comes into the picture.

I've certainly seen plenty of examples… of people messing something up, and then proceeding to live with the mistake, maybe bothered by the inconvenience, but seemingly powerless to fix it.

The thing is, we are very much in control, and if the universe doesn't conform to our demands, we must simply realign the universe.

So here, while debugging binary data… instead of just going with the flow and inconveniencing ourselves, losing our place and wasting time elongating our debugging process, we will be writing a specialized tool that should assist us greatly in the dcfX debugging process.

The key is to identify an inconvenience. If we have a tool that helps, but is limited, is that a limitation we can live with, or can we improve our overall process by improving the tool (either by extending it, or by writing a new tool altogether)?

We've done this a bit with pipes… xxd(1) doesn't natively support capping its lines of display, so we've been using UNIX pipes to have commands like head(1) and tail(1) greatly enhance the utility of our xxd(1) output (versus haplessly scrolling through hundreds of lines of hex values). Thing is, how many would have done this if I never showed you examples?

So please, be on the lookout for limitations in the process, ANY process. Sometimes there is nothing we can really do, but other times, we definitely can. Don't just go with some mindless flow; constantly evaluate whatever process you are following:

There are constantly opportunities for enhancement of process. It is our job to identify strategic ones that can make significant gains. That's why we automate things with shell scripts, that's why we learn to solve problems, that's why we learn about different approaches to algorithms.

So these bdt# projects are a specific foray into this special case study of writing our own custom tool that can get a certain job done faster, reducing OUR particular need to keep tabs on something the computer is very much better at doing.

Implementation Restrictions

As our goal is not only to explore the more subtle concepts of computing but to promote different methods of thinking (and arriving at solutions seemingly in different ways), one of the themes I have been harping on is the stricter adherence to the structured programming philosophy. It isn't good enough just to be able to crank out a solution if you remain blind to the many nuances of the tools we are using, so we will at times be going out of our way to emphasize focus on certain areas that may see less exposure (or be avoided for being less familiar).

As such, the following implementation restrictions are also in place:

Basically, I am going to loosen my implementation restriction grip for this project: I would like you NOT to disappoint me. Write clean, effective code… show me that you have learned something from this class.

Program Specifications

For this project, I am looking for a minimum subset of functionality. But there are many potential improvements that can be made, which I would consider for bonus points.

Basic functionality

Your program should:

The focus is the FIRST byte of difference. The algorithm could get considerably trickier when dealing with additional differences (especially if extra bytes are involved in the difference).
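To make the "first byte of difference" idea concrete, here is a minimal sketch of one way to locate it, reading both files a byte at a time. The function name first_difference and the choice to return -1 for identical files are assumptions made for illustration, not requirements of the project. The sketch also assumes that a file ending early counts as a difference at that offset.

```c
#include <stdio.h>

/*
 * Hypothetical helper (name and return convention are assumptions):
 * returns the offset of the first byte at which the two streams differ,
 * or -1 if both streams hold identical content.
 * If one stream ends before the other, that offset counts as the difference.
 */
long first_difference(FILE *a, FILE *b)
{
    long offset = 0;
    int  ca     = 0;
    int  cb     = 0;

    for (;;)
    {
        ca = fgetc(a);
        cb = fgetc(b);

        if (ca != cb)
            return offset;    /* bytes differ, or one file ended first */

        if (ca == EOF)
            return -1;        /* both ended together: no difference    */

        offset++;
    }
}
```

Both streams would typically be opened with fopen(3) in "rb" mode so that no byte values get translated on the way in. Once the offset is known, the program can seek back and display the surrounding bytes.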

Bonus opportunities

Some ideas to enhance your program for potential bonus points:

Output

A basic mockup (pictures coming soon) of desired output:

lab46:~/src/discrete/bdt1$ ./bdt1 in/sample0.txt in/sample0.off
00000090: 0011 2233 4455 6677 8899 aabb ccdd eeff | 0011 2233 4455 6677 8899 aabb ccdd eeff
000000a0: 55aa 66bb 0401 77cc 88dd 99ee aaff 89af | 55aa 66bb 0501 77cc 88dd 99ee aaff 89af
000000b0: 9988 7766 5544 3322 1100 ffee ddcc bbaa | 9988 7766 5544 3322 1100 ffee ddcc bbaa
lab46:~/src/discrete/bdt1$ 
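As a rough sketch of how one row of the mockup above could be produced: an 8-digit hex offset, the bytes from the first file in 2-byte groups, a " | " separator, then the same for the second file. The names row_start and print_row are hypothetical, and the 16-bytes-per-row width is inferred from the mockup rather than stated as a requirement. Highlighting of the differing byte is deliberately left out.

```c
#include <stdio.h>
#include <stddef.h>

#define ROW_WIDTH 16   /* bytes per row, inferred from the mockup (assumption) */

/* Hypothetical helper: offset of the first byte of the row containing `offset`. */
long row_start(long offset)
{
    return offset - (offset % ROW_WIDTH);
}

/*
 * Hypothetical helper: writes one side-by-side row to `out`.
 * `left` and `right` each hold `count` bytes taken from the same
 * offset in the first and second file respectively.
 */
void print_row(FILE *out, unsigned long offset,
               const unsigned char *left, const unsigned char *right,
               size_t count)
{
    size_t i = 0;

    fprintf(out, "%08lx:", offset);

    /* a space goes before every 2-byte group */
    for (i = 0; i < count; i++)
        fprintf(out, "%s%02x", (i % 2 == 0) ? " " : "", left[i]);

    fprintf(out, " |");

    for (i = 0; i < count; i++)
        fprintf(out, "%s%02x", (i % 2 == 0) ? " " : "", right[i]);

    fputc('\n', out);
}
```

In the mockup, the differing byte (04 vs. 05) sits at offset 0xa4, so row_start gives 0xa0 for its row; the rows at 0x90 and 0xb0 look like one row of context on either side.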

Submission

To successfully complete this project, the following criteria must be met:

To submit this program to me using the submit tool, run the following command at your lab46 prompt:

$ submit discrete bdt1 bdt1.c
Submitting discrete project "bdt1":
    -> bdt1.c(OK)

SUCCESSFULLY SUBMITTED

You should get some sort of confirmation indicating successful submission if all went according to plan. If not, check for typos and/or locational mismatches.