projects

  • cci0 (due 20160127)
  • mms0 (due 20160203)
  • dow0 (due 20160210)
  • mbe0 (due 20160224)
  • pnc0 (due 20160302)
  • mbe1 (due 20160309)
  • cos0 (due 20160316)
  • sam0 (due 20160323)
  • cbf0 (due 20160406)
  • afn0 (due 20160413)
  • gfo0 (due 20160420)

Corning Community College

CSCS1320 C/C++ Programming


Project: C Binary Fun (cbf0)

Objective

To practice manipulating binary data in a C program (for fun and glory).

Background

With the UNIX people exploring binary data and using hex editors, it only makes sense to steer some of our activities toward the manipulation of binary data as well: one cannot effectively solve a whole domain of problems without knowing how to work with binary data.

This project aims to ameliorate that.

Binary data merely refers to data as the computer stores it. The computer is a binary device, so its raw data (as it exists on various forms of storage and media) is often referred to as binary data, to reflect the 1s and 0s being represented.

The data we have become familiar with is textual data. We read from and write to files with the express purpose of storing text in them. And with the use of various text processing tools, we can easily manipulate these text files.

But: did you know that all text data is also binary data?

The trick to remember is that the opposite is not always true: not all binary data is text. In fact, most of it isn't. Text represents a very narrow range of possible data values, and then only within a certain context. You may “see” random letters when viewing binary data, but there is no continuity. The data values that we utilize when interacting with text are also valid combinations of binary values, which can mean almost anything.

So, text is really ONE (of many) possible representations of binary data. We need to gain a wider perspective and get more familiar with this more expansive and general notion of binary data.
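To make the point concrete, here is a minimal sketch that renders the bytes behind a piece of text as hex values. The helper name text_to_hex and its buffer convention are assumptions made for illustration; they are not part of the assignment.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (illustration only): write the bytes of `text`
 * into `out` as space-separated two-digit hex values.
 * Returns the number of bytes converted, or 0 if `text` is empty
 * or `out` is too small. */
size_t text_to_hex(const char *text, char *out, size_t outsize)
{
    size_t len = strlen(text);
    size_t i;

    /* each byte needs 3 characters: two hex digits plus ' ' or '\0' */
    if (len == 0 || outsize < len * 3)
    {
        if (outsize > 0)
            out[0] = '\0';
        return 0;
    }

    for (i = 0; i < len; i++)
    {
        /* go through unsigned char so values above 0x7f stay two digits */
        sprintf(out + i * 3, "%02x", (unsigned int) (unsigned char) text[i]);
        out[i * 3 + 2] = (i + 1 < len) ? ' ' : '\0';
    }

    return len;
}
```

Given the text "Hi!", the buffer ends up holding "48 69 21": the same three bytes, just viewed as numbers instead of letters.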

The computer works in units of bytes, which these days means groups of 8 bits. C has the ability to arbitrarily read and write individual bytes of data, and we will want to make use of that to aid us in our current task.
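As a rough sketch of byte-level reading, the following uses fread() with an element size of 1 to pull raw bytes out of an already-open stream, 16 at a time. The function name count_bytes is made up for this example; a real hex viewer would process each chunk rather than only counting it.

```c
#include <stdio.h>

/* Hypothetical helper (illustration only): read a stream in 16-byte
 * chunks with fread() and return the total number of bytes seen. */
long count_bytes(FILE *fp)
{
    unsigned char chunk[16];   /* unsigned char holds raw values 0x00-0xff */
    size_t got;
    long total = 0;

    /* with an element size of 1, fread() returns a byte count;
     * a zero count signals end of file or an error */
    while ((got = fread(chunk, 1, sizeof chunk, fp)) > 0)
        total += (long) got;

    return total;
}
```

Storing the data in unsigned char matters: a plain char may be signed, in which case a byte such as 0xff would print as a negative value or with extra leading digits.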

Task

Your task is to write a hex viewer and information highlighter, under the data theme of hard drive partition tables (not so much a focus in this project as it'll be in the planned sequel to this project: cbf1).

On lab46, in the cbf0/ subdirectory of the CPROG Public Directory, are a number of files ending in .mbr; most are copies of Master Boot Records (MBRs) from various installed Operating Systems (so real, actual data).

Please copy these files into a working directory for your cbf0 endeavors. Assuming you have a ~/src/cprog/cbf0/ directory already existing and ready to go, you can run the following command:

lab46:~$ cp /var/public/spring2016/cprog/cbf0/*.mbr ~/src/cprog/cbf0/
lab46:~$ 

If you get a prompt back (no errors), then you were likely successful. Change into your project directory and begin work.

Your task is to write a C program that takes a file name as a command-line argument, opens that file, reads its contents, and displays that data (in hex) to the screen according to various criteria:

Your program must:

  • Require that the user supply a file via the command-line
    • if the file specified does not exist/cannot be opened, display an error message and exit.
      • error message should be of the form: Error: Could not open 'filename' for reading!
        • Where filename is the name of the file specified on the command-line (make sure the quotes surround it in the output).
    • no further processing should be done if the file is not able to be accessed.
  • Detect the current size of the terminal (see “Detecting Terminal Size” section below), and record the lines and columns into variables for use in your program.
    • If the terminal your program is being run in is less than 80 columns, display an error message and exit.
      • error message should be of the form: Error: Terminal width is less than 80 columns!
      • Your program will only be displaying to an area 80 characters wide, so a wider terminal will not influence program output.
    • Similarly, if the number of lines in the terminal is less than 20, display a similar error message and exit.
      • error message should be of the form: Error: Terminal height is less than 20 lines!
      • Unlike the width, the height can impact program output (taller terminals, if not otherwise throttled by a second command-line argument, can auto-expand if there is more room and data to display).
  • The second command-line argument is a sizing throttle (controlling the number of lines your program will display). If a 0 is given, assume autosize (use the detected height as your maximum in your calculations).
  • Display an ASCII header identifying the various fields (offset, hex, ascii), surrounded by dashed lines, running 79 characters in width. See below for more details.
  • Each row after the header will display:
    • an 8-digit hex offset (with leading 0x indicator)
    • followed by a single space
    • then 4 single-spaced bytes of data
    • two spaces
    • another 4 single-spaced bytes of data, etc.
      • (for a total of 4 such columns, resulting in 16 total bytes being displayed per line).
    • Finally, two spaces after the last column, and:
    • a 16-character ASCII representation field (no separating spaces between the values)
      • all printable characters should be displayed.
      • all non-printable (and various whitespace) characters should be substituted with a '.'
    • No line will exceed 79 characters.
    • No line will be shorter than 79 characters.
    • A newline will be the character at position 80.
  • Finally, another dashed heading line (this time a footer line) will be displayed, capping the output nicely.
  • The hex values and rendered ASCII displayed will be sourced from the file specified on the command-line. While the target files for this project are all 512 bytes, your program should be able to handle larger and smaller files, and update its display accordingly.
  • If a line throttle is given, your program is to stop output of data once the limit is reached and display the terminating footer (the footer is included in the line count, just as the header is, so 4 lines are allocated for header and footer)
    • on a 20-line terminal, this would leave 16 lines for data display, which would result in a total of 256 bytes being displayed (offsets 0x00 - 0xff).
  • Once the data in the file has been exhausted, you need to wrap up as appropriate; finish the current line (even if you have to pad spaces), display the corresponding ascii field (padding spaces as appropriate), and display the closing footer.
  • Don't forget to fclose() any open file pointers! And free() any malloc()'ed or calloc()'ed memory.
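The row layout described in the list above can be sketched as a single formatting function. This is one possible approach, not the required one; the name format_row, its parameters, and the two constants are assumptions for illustration. Bytes in the range 0x20 through 0x7e are treated as printable, which matches the sample output (a space is shown as a space); everything else becomes a '.'.

```c
#include <stdio.h>
#include <string.h>

#define BYTES_PER_ROW 16
#define ROW_WIDTH     79

/* Hypothetical helper (illustration only): format one display row into
 * `out`, which must hold at least ROW_WIDTH + 1 characters.
 * `count` (0 to 16) is how many bytes of `data` are valid; the rest of
 * the row is padded with spaces so every row is exactly 79 wide. */
void format_row(char *out, unsigned long offset,
                const unsigned char *data, size_t count)
{
    size_t i;
    int pos;

    /* 8-digit hex offset with leading 0x: 10 characters */
    pos = sprintf(out, "0x%08lx", offset);

    for (i = 0; i < BYTES_PER_ROW; i++)
    {
        /* two spaces between groups of four bytes, one space otherwise */
        const char *gap = (i > 0 && i % 4 == 0) ? "  " : " ";

        if (i < count)
            pos += sprintf(out + pos, "%s%02x", gap, (unsigned int) data[i]);
        else
            pos += sprintf(out + pos, "%s  ", gap);   /* pad missing byte */
    }

    /* two spaces, then the 16-character ASCII field */
    pos += sprintf(out + pos, "  ");
    for (i = 0; i < BYTES_PER_ROW; i++)
    {
        char c = ' ';                                  /* pad past the data */

        if (i < count)
            c = (data[i] >= 0x20 && data[i] <= 0x7e) ? (char) data[i] : '.';
        out[pos++] = c;
    }
    out[pos] = '\0';
}
```

The widths add up as 10 (offset) + 50 (hex columns with their gaps) + 2 + 16 (ASCII) = 78 plus the single space after the offset, giving 79. The caller would print the row followed by a newline.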

Detecting Terminal Size

To detect the current size of your terminal, you may make use of the following code (provided in the form of a complete program for you to test, and then adapt into your code as appropriate):

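#include <sys/ioctl.h>
#include <stdio.h>
 
int main (void)
{
    struct winsize terminal;

    // query the terminal on file descriptor 0 (standard input) for its size
    if (ioctl (0, TIOCGWINSZ, &terminal) == -1)
    {
        perror ("ioctl");   // e.g. standard input is not a terminal
        return (1);
    }
 
    // ws_row and ws_col hold the height and width in characters
    printf ("lines:   %d\n", terminal.ws_row);
    printf ("columns: %d\n", terminal.ws_col);
    return (0);
}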

ioctl(2) is a system call for manipulating the underlying device parameters of special files (for the UNIX people: everything is a file, including your keyboard and terminal screen).

Here we are accessing the information on our terminal file, retrieving the width and height so that we can make use of them productively in our programs.

Compile and run the above code to see how it works. Try it in different size terminals. Then incorporate the logic into your hex viewer for this project.
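Once the dimensions are known, the two size checks from the task list could be factored into a small function like the sketch below. The name terminal_size_error and the constants are hypothetical; checking the width before the height simply follows the order in which the requirements are listed.

```c
#include <stdio.h>

#define MIN_COLS  80
#define MIN_LINES 20

/* Hypothetical helper (illustration only): return NULL when the terminal
 * is large enough, otherwise the error message to display before exiting. */
const char *terminal_size_error(unsigned int lines, unsigned int cols)
{
    if (cols < MIN_COLS)
        return "Error: Terminal width is less than 80 columns!";

    if (lines < MIN_LINES)
        return "Error: Terminal height is less than 20 lines!";

    return NULL;
}
```

A caller would pass terminal.ws_row and terminal.ws_col, and if the result is not NULL, print it and exit. Whether the message belongs on stdout or stderr is not stated in the requirements, so that choice is left open here.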

Example output

When running the program, its output should match the following examples precisely (I'll be evaluating in part on the exactness of it matching my version).

Unthrottled display (512 bytes)

lab46:~/src/cprog$ ./hexview win7.mbr 0
-------------------------------------------------------------------------------
offset      0  1  2  3   4  5  6  7   8  9  a  b   c  d  e  f  ascii
-------------------------------------------------------------------------------
0x00000000 33 c0 8e d0  bc 00 7c 8e  c0 8e d8 be  00 7c bf 00  3.....|......|..
0x00000010 06 b9 00 02  fc f3 a4 50  68 1c 06 cb  fb b9 04 00  .......Ph.......
0x00000020 bd be 07 80  7e 00 00 7c  0b 0f 85 0e  01 83 c5 10  ....~..|........
0x00000030 e2 f1 cd 18  88 56 00 55  c6 46 11 05  c6 46 10 00  .....V.U.F...F..
0x00000040 b4 41 bb aa  55 cd 13 5d  72 0f 81 fb  55 aa 75 09  .A..U..]r...U.u.
0x00000050 f7 c1 01 00  74 03 fe 46  10 66 60 80  7e 10 00 74  ....t..F.f`.~..t
0x00000060 26 66 68 00  00 00 00 66  ff 76 08 68  00 00 68 00  &fh....f.v.h..h.
0x00000070 7c 68 01 00  68 10 00 b4  42 8a 56 00  8b f4 cd 13  |h..h...B.V.....
0x00000080 9f 83 c4 10  9e eb 14 b8  01 02 bb 00  7c 8a 56 00  ............|.V.
0x00000090 8a 76 01 8a  4e 02 8a 6e  03 cd 13 66  61 73 1c fe  .v..N..n...fas..
0x000000a0 4e 11 75 0c  80 7e 00 80  0f 84 8a 00  b2 80 eb 84  N.u..~..........
0x000000b0 55 32 e4 8a  56 00 cd 13  5d eb 9e 81  3e fe 7d 55  U2..V...]...>.}U
0x000000c0 aa 75 6e ff  76 00 e8 8d  00 75 17 fa  b0 d1 e6 64  .un.v....u.....d
0x000000d0 e8 83 00 b0  df e6 60 e8  7c 00 b0 ff  e6 64 e8 75  ......`.|....d.u
0x000000e0 00 fb b8 00  bb cd 1a 66  23 c0 75 3b  66 81 fb 54  .......f#.u;f..T
0x000000f0 43 50 41 75  32 81 f9 02  01 72 2c 66  68 07 bb 00  CPAu2....r,fh...
0x00000100 00 66 68 00  02 00 00 66  68 08 00 00  00 66 53 66  .fh....fh....fSf
0x00000110 53 66 55 66  68 00 00 00  00 66 68 00  7c 00 00 66  SfUfh....fh.|..f
0x00000120 61 68 00 00  07 cd 1a 5a  32 f6 ea 00  7c 00 00 cd  ah.....Z2...|...
0x00000130 18 a0 b7 07  eb 08 a0 b6  07 eb 03 a0  b5 07 32 e4  ..............2.
0x00000140 05 00 07 8b  f0 ac 3c 00  74 09 bb 07  00 b4 0e cd  ......<.t.......
0x00000150 10 eb f2 f4  eb fd 2b c9  e4 64 eb 00  24 02 e0 f8  ......+..d..$...
0x00000160 24 02 c3 49  6e 76 61 6c  69 64 20 70  61 72 74 69  $..Invalid parti
0x00000170 74 69 6f 6e  20 74 61 62  6c 65 00 45  72 72 6f 72  tion table.Error
0x00000180 20 6c 6f 61  64 69 6e 67  20 6f 70 65  72 61 74 69   loading operati
0x00000190 6e 67 20 73  79 73 74 65  6d 00 4d 69  73 73 69 6e  ng system.Missin
0x000001a0 67 20 6f 70  65 72 61 74  69 6e 67 20  73 79 73 74  g operating syst
0x000001b0 65 6d 00 00  00 63 7b 9a  98 a8 b3 d9  00 00 80 01  em...c{.........
0x000001c0 01 00 0b 7f  3f 03 3f 00  00 00 c1 7d  00 00 00 00  ....?.?....}....
0x000001d0 01 04 07 6d  ed df 00 7e  00 00 80 8d  79 00 00 00  ...m...~....y...
0x000001e0 00 00 00 00  00 00 00 00  00 00 00 00  00 00 00 00  ................
0x000001f0 00 00 00 00  00 00 00 00  00 00 00 00  00 00 55 aa  ..............U.
-------------------------------------------------------------------------------
lab46:~/src/cprog$ 
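The dashed lines and the column labels seen in the output above could be produced along these lines. The function names are hypothetical. One assumption to flag: the label line as shown ends at "ascii" (68 characters), and this sketch pads it with spaces to 79 to satisfy the fixed-width rule; trailing spaces are invisible in the sample, so confirm the expected behavior before relying on it.

```c
#include <stdio.h>
#include <string.h>

#define ROW_WIDTH 79

/* Hypothetical helper: fill `out` (ROW_WIDTH + 1 chars) with 79 dashes. */
void format_rule(char *out)
{
    memset(out, '-', ROW_WIDTH);
    out[ROW_WIDTH] = '\0';
}

/* Hypothetical helper: build the "offset ... ascii" label line.
 * Each label digit sits above the low hex digit of its byte column. */
void format_labels(char *out)
{
    int i;
    int pos;

    pos = sprintf(out, "%-10s", "offset");   /* same width as 0x%08lx */

    for (i = 0; i < 16; i++)
    {
        /* same gaps as a data row: two spaces between groups of four */
        const char *gap = (i > 0 && i % 4 == 0) ? "  " : " ";

        pos += sprintf(out + pos, "%s%2x", gap, (unsigned int) i);
    }

    /* ASSUMPTION: pad the ascii label to 16 characters (79 total) */
    sprintf(out + pos, "  %-16s", "ascii");
}
```

Printing the rule, the labels, and the rule again gives the three header lines; the footer is one more call to format_rule().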

20 line Throttled display (on a file of 512 bytes)

Here we force our program to cut output short, with a 20 line display cap (note: 4 lines for header/footer, 16 lines for data).

lab46:~/src/cprog$ ./hexview juicebox.mbr 20
-------------------------------------------------------------------------------
offset      0  1  2  3   4  5  6  7   8  9  a  b   c  d  e  f  ascii
-------------------------------------------------------------------------------
0x00000000 eb 3c 90 4f  70 65 6e 42  53 44 00 00  02 02 00 00  .<.OpenBSD......
0x00000010 00 00 00 00  00 f8 00 00  00 00 00 00  10 00 00 00  ................
0x00000020 00 00 00 00  00 00 29 00  00 00 00 55  4e 49 58 20  ......)....UNIX
0x00000030 4c 41 42 45  4c 55 46 53  20 34 2e 34  00 00 ea 48  LABELUFS 4.4...H
0x00000040 00 c0 07 b0  58 e9 37 01  31 c0 8e d0  bc fc 7b 0e  ....X.7.1.....{.
0x00000050 1f be e4 01  88 d6 b4 02  cd 16 0c 00  a8 03 74 03  ..............t.
0x00000060 4e 30 f6 e8  56 01 f6 c6  80 74 1e 52  bb aa 55 b4  N0..V....t.R..U.
0x00000070 41 cd 13 5a  72 13 81 fb  55 aa 75 0d  f6 c1 01 74  A..Zr...U.u....t
0x00000080 08 c7 06 d1  01 84 01 eb  1a 52 b4 08  cd 13 72 b3  .........R....r.
0x00000090 88 36 55 01  80 e1 3f 74  aa 88 0e 4c  01 b0 3b e8  .6U...?t...L..;.
0x000000a0 25 01 5a 66  b8 18 00 00  00 bb e0 07  ff 16 d1 01  %.Zf............
0x000000b0 66 be 28 06  00 00 bf 03  00 89 f9 83  f9 0c 72 03  f.(...........r.
0x000000c0 b9 0c 00 bb  00 40 b0 2e  e8 fc 00 fc  66 ad 66 60  .....@......f.f`
0x000000d0 ff 16 d1 01  66 61 81 c3  00 04 4f e2  e9 09 ff 74  ....fa....O....t
0x000000e0 22 b8 49 00  08 e4 0f 85  95 00 fe 06  e3 00 66 ad  ".I...........f.
0x000000f0 53 bb e0 07  ff 16 d1 01  5b 66 be 00  02 00 00 89  S.......[f......
-------------------------------------------------------------------------------
lab46:~/src/cprog$ 
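The arithmetic behind the throttle can be sketched as follows. The function name data_lines is hypothetical, and the handling of a throttle of 4 or less (no data lines at all) is an assumption, since the requirements do not cover that case.

```c
#define FRAME_LINES 4   /* 3 header lines + 1 footer line */

/* Hypothetical helper (illustration only): how many data rows may be
 * shown. A throttle of 0 means autosize, i.e. use the detected
 * terminal height as the limit. */
unsigned int data_lines(unsigned int throttle, unsigned int terminal_lines)
{
    unsigned int limit = (throttle == 0) ? terminal_lines : throttle;

    /* ASSUMPTION: a limit too small for header and footer leaves no rows */
    if (limit <= FRAME_LINES)
        return 0;

    return limit - FRAME_LINES;
}
```

With a throttle of 20 this yields 16 data rows, and 16 rows of 16 bytes each is 256 bytes, matching offsets 0x00 through 0xff in the output above.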

30 line throttled display (on a 217 byte file)

This example demonstrates the scenario where there isn't enough data to fill the specified number of lines, or even to complete the final line being displayed. In such a case, we pad the line with spaces out to the end (both data and ascii fields) so that everything lines up (note the display of the ascii for whatever data was present on the line).

lab46:~/src/cprog$ ./hexview shortfall.mbr 30
-------------------------------------------------------------------------------
offset      0  1  2  3   4  5  6  7   8  9  a  b   c  d  e  f  ascii
-------------------------------------------------------------------------------
0x00000000 eb 63 90 10  8e d0 bc 00  b0 b8 00 00  8e d8 8e c0  .c..............
0x00000010 fb be 00 7c  bf 00 06 b9  00 02 f3 a4  ea 21 06 00  ...|.........!..
0x00000020 00 be be 07  38 04 75 0b  83 c6 10 81  fe fe 07 75  ....8.u........u
0x00000030 f3 eb 16 b4  02 b0 01 bb  00 7c b2 80  8a 74 01 8b  .........|...t..
0x00000040 4c 02 cd 13  ea 00 7c 00  00 eb fe 00  00 00 00 00  L.....|.........
0x00000050 00 00 00 00  00 00 00 00  00 00 00 80  01 00 00 00  ................
0x00000060 00 00 00 00  ff fa eb 07  f6 c2 80 75  02 b2 80 ea  ...........u....
0x00000070 74 7c 00 00  31 c0 8e d8  8e d0 bc 00  20 fb a0 64  t|..1....... ..d
0x00000080 7c 3c ff 74  02 88 c2 52  be 80 7d e8  1c 01 be 05  |<.t...R..}.....
0x00000090 7c f6 c2 80  74 48 b4 41  bb aa 55 cd  13 5a 52 72  |...tH.A..U..ZRr
0x000000a0 3d 81 fb 55  aa 75 37 83  e1 01 74 32  31 c0 89 44  =..U.u7...t21..D
0x000000b0 04 40 88 44  ff 89 44 02  c7 04 10 00  66 8b 1e 5c  .@.D..D.....f..\
0x000000c0 7c 66 89 5c  08 66 8b 1e  60 7c 66 89  5c 0c c7 44  |f.\.f..`|f.\..D
0x000000d0 06 00 70 b4  42 cd 13 72  05                        ..p.B..r.       
-------------------------------------------------------------------------------
lab46:~/src/cprog$ 
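The row count and the size of the final, partial row follow directly from the file size. A small sketch, using hypothetical helper names:

```c
#include <stddef.h>

#define BYTES_PER_ROW 16

/* Hypothetical helper: rows needed to show `nbytes` bytes (rounds up). */
size_t rows_needed(size_t nbytes)
{
    return (nbytes + BYTES_PER_ROW - 1) / BYTES_PER_ROW;
}

/* Hypothetical helper: number of real bytes on the last row
 * (0 for an empty file, otherwise 1 to 16). */
size_t last_row_bytes(size_t nbytes)
{
    size_t rest;

    if (nbytes == 0)
        return 0;

    rest = nbytes % BYTES_PER_ROW;
    return (rest == 0) ? BYTES_PER_ROW : rest;
}
```

For the 217-byte file this gives 14 rows (offsets 0x00 through 0xd0) with 9 bytes on the last one, so 7 byte positions and 7 ASCII positions are padded with spaces. A 512-byte file gives 32 full rows.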

Bonus Opportunities

The following can be considered a bonus point opportunity:

  • Enhance the program to accept up to 6 additional pairs of values (an offset followed by a length), where the bytes starting at each offset and running for the given length will be colored using ANSI text escape sequences.
  • For any line containing this colorized text, highlight the address in bold white.
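One way to approach the bonus is to parse the extra arguments into a small table and then ask, for each byte position, whether it falls inside any range. The struct and function names below are hypothetical. strtoul() with a base of 0 accepts both 0x1be and plain decimal; this sketch does no validation of malformed input.

```c
#include <stdlib.h>

#define MAX_HIGHLIGHTS 6

/* Hypothetical type: one highlighted region */
struct highlight
{
    unsigned long offset;   /* first byte to color */
    unsigned long length;   /* number of bytes to color */
};

/* Hypothetical helper: parse offset/length pairs from `args`
 * (`count` strings). Returns the number of pairs stored, at most 6.
 * A trailing unpaired value is ignored. */
int parse_highlights(int count, char *args[], struct highlight *out)
{
    int n = 0;
    int i;

    for (i = 0; i + 1 < count && n < MAX_HIGHLIGHTS; i += 2)
    {
        out[n].offset = strtoul(args[i], NULL, 0);
        out[n].length = strtoul(args[i + 1], NULL, 0);
        n++;
    }

    return n;
}

/* Hypothetical helper: index of the range containing byte `pos`,
 * or -1 if the byte is not highlighted. */
int highlight_index(const struct highlight *hl, int n, unsigned long pos)
{
    int i;

    for (i = 0; i < n; i++)
        if (pos >= hl[i].offset && pos - hl[i].offset < hl[i].length)
            return i;

    return -1;
}
```

In a program whose first two arguments are the file name and the throttle, the call would look like parse_highlights(argc - 3, argv + 3, table). The returned index could then select a color per range.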

Sample output

As an example, the program could be invoked with highlight arguments like this:

lab46:~/src/cprog$ ./hexview win7.mbr 0 0x1be 1 0x1c2 1 0x1c6 4 0x1ca 4 0x1fe 2

ANSI escape sequences for color

This probably isn't very portable, and depending on the terminal, it may not work for some people.

It may be most convenient to set up preprocessor #define statements near the top of your code, as follows:

#define  ANSI_RESET             "\x1b[0m"
#define  ANSI_BOLD              "\x1b[1m"
#define  ANSI_FG_BLACK          "\x1b[30m"
#define  ANSI_FG_RED            "\x1b[31m"
#define  ANSI_FG_GREEN          "\x1b[32m"
#define  ANSI_FG_YELLOW         "\x1b[33m"
#define  ANSI_FG_BLUE           "\x1b[34m"
#define  ANSI_FG_MAGENTA        "\x1b[35m"
#define  ANSI_FG_CYAN           "\x1b[36m"
#define  ANSI_FG_WHITE          "\x1b[37m"
#define  ANSI_BG_BLACK          "\x1b[40m"
#define  ANSI_BG_RED            "\x1b[41m"
#define  ANSI_BG_GREEN          "\x1b[42m"
#define  ANSI_BG_YELLOW         "\x1b[43m"
#define  ANSI_BG_BLUE           "\x1b[44m"
#define  ANSI_BG_MAGENTA        "\x1b[45m"
#define  ANSI_BG_CYAN           "\x1b[46m"
#define  ANSI_BG_WHITE          "\x1b[47m"

To use them, output them ahead of the text they should affect:

fprintf(stdout, ANSI_FG_GREEN);
fprintf(stdout, "This text is green\n");
fprintf(stdout, ANSI_RESET);

You have to remember to turn the color or setting off (resetting it) to revert back to the original color.

You can mix and match as well:

fprintf(stdout, ANSI_FG_YELLOW);
fprintf(stdout, ANSI_BG_BLUE);
fprintf(stdout, ANSI_BOLD);
fprintf(stdout, "This text is bold yellow on blue\n");
fprintf(stdout, ANSI_RESET);

While there are 8 available foreground colors, bolding can double that range to 16.
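For the bold white address mentioned in the bonus section, the sequences can be combined into one string before printing. This is only a sketch; the helper name format_bold_white is made up, and the three macros are repeated so the example stands alone.

```c
#include <stdio.h>

#define  ANSI_RESET             "\x1b[0m"
#define  ANSI_BOLD              "\x1b[1m"
#define  ANSI_FG_WHITE          "\x1b[37m"

/* Hypothetical helper (illustration only): wrap `text` in bold white and
 * a trailing reset. Returns what snprintf() returns: the number of
 * characters the full result needs, excluding the terminating '\0'. */
int format_bold_white(char *out, size_t outsize, const char *text)
{
    return snprintf(out, outsize, "%s%s%s%s",
                    ANSI_BOLD, ANSI_FG_WHITE, text, ANSI_RESET);
}
```

Note that the escape sequences occupy bytes in the string but no columns on screen: wrapping a 10-character address this way produces 23 bytes that still display as 10 characters. Any width bookkeeping should therefore count visible characters, not string length.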

Submission

To successfully complete this project, the following criteria must be met:

  • Code must compile cleanly (no warnings or errors)
    • Use the -Wall flag when compiling.
  • Code must be nicely and consistently indented (you may use the indent tool)
  • Code must utilize the algorithm/approach presented above
  • Output must match the specifications presented above (when given the same inputs)
  • Code must be commented
    • have a properly filled-out comment banner at the top
    • have at least 20% of your program consist of //-style descriptive comments
  • Track/version the source code in a repository
  • Submit a copy of your source code to me using the submit tool.

To submit this program to me using the submit tool, run the following command at your lab46 prompt:

$ submit cprog cbf0 cbf0.c
Submitting cprog project "cbf0":
    -> cbf0.c(OK)

SUCCESSFULLY SUBMITTED

You should get some sort of confirmation indicating successful submission if all went according to plan. If not, check for typos and/or location mismatches.

haas/spring2016/cprog/projects/cbf0.txt · Last modified: 2016/03/22 19:56 by wedge