In the previous week, we ran into a variety of errors while developing our code for the RLE0 encoder and decoder. One of the most frequent was a mismatch between the bytes in the files produced by our programs and the bytes in the sample files. To debug those issues, some of us resorted to comparing the MD5 checksums of both files, or to inspecting both binaries in a hex editor. In other words, we depended on existing tools to track down our bugs. However, these tools have one flaw: they are not RLE-specific debugging tools. If only we could find the perfect tool that meets our needs, or better yet, if only we could develop our own debugging tools for an improved RLE algorithm development experience. (I feel like I managed to channel my inner Matt there :P)
Therefore, this week, for bdb0, we are tasked with writing a debug tool that will be used in future rleX projects. This debug tool will compare two binary files: one is the output of our encoder/decoder, and the other is some reference file.
bdb0 takes two arguments, the two files that will be compared to each other:
./bdb0    FILE1.txt  sample1.txt
argv[0]   argv[1]    argv[2]
These files do not have to be RLE-specific; you can use bdb0 to compare any two files.
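As a minimal sketch of the argument handling described above: the helper name open_inputs and the exact usage message are assumptions for illustration, not something the project page specifies. It checks that exactly two file names were given and opens both in binary mode.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: validate the argument count and open both files
 * for binary reading. Returns 0 on success and 1 on any failure. */
int open_inputs(int argc, char **argv, FILE **first, FILE **second)
{
    assert(first != NULL && second != NULL);

    if (argc != 3) {
        fprintf(stderr, "usage: %s FILE1 FILE2\n", argc > 0 ? argv[0] : "bdb0");
        return 1;
    }

    *first = fopen(argv[1], "rb");      /* "rb": read the raw bytes */
    if (*first == NULL) {
        perror(argv[1]);
        return 1;
    }

    *second = fopen(argv[2], "rb");
    if (*second == NULL) {
        perror(argv[2]);
        fclose(*first);                 /* do not leak the first handle */
        *first = NULL;
        return 1;
    }
    return 0;
}
```

A main function would typically pass its own argc and argv straight through and exit with the returned value when it is non-zero.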
The main focus of this program is to help us compare binaries. Like a hex editor, it displays data as hexadecimal values, but it shows only the mismatched data. Hence, the output should be the line containing the first mismatched byte, along with the previous and following 16 bytes and the addresses of those bytes.
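One possible way to turn 16 bytes into hex-editor-style text is sketched here. The grouping into two-byte pairs follows the sample output on this page; the function name format_hex_row and the buffer size constant are assumptions made for this example.

```c
#include <assert.h>
#include <stddef.h>

#define ROW_BYTES    16
#define ROW_TEXT_LEN 40   /* 32 hex digits + 7 spaces + terminating NUL */

/* Hypothetical helper: write 16 bytes as lowercase hex, grouped in pairs
 * of bytes separated by a single space, e.g. "5325 2525 ...". */
void format_hex_row(const unsigned char bytes[ROW_BYTES], char out[ROW_TEXT_LEN])
{
    static const char digits[] = "0123456789abcdef";
    size_t pos = 0;

    assert(bytes != NULL && out != NULL);

    for (int i = 0; i < ROW_BYTES; i++) {
        if (i > 0 && i % 2 == 0)
            out[pos++] = ' ';                 /* space between byte pairs */
        out[pos++] = digits[bytes[i] >> 4];   /* high nibble */
        out[pos++] = digits[bytes[i] & 0x0f]; /* low nibble */
    }
    out[pos] = '\0';
}
```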
*Our task is to ask questions about the functionality of this project on Discord or in class, and to document our findings collaboratively on this wiki page.
WIKI EDGE PAGE *For anybody interested in editing the wiki page, here is the dokuwiki user guide: https://www.dokuwiki.org/wiki:syntax#basic_text_formatting -Ash
When verifying that your project functions as intended, bear in mind that the verify program included in the grabit does not work, so it is ultimately up to you to check the program. Ways to do this include using the diff and xxd commands, and testing your project with the files included in the in sub-directory. I myself found it useful to create a few custom files to make sure certain edge cases occurred so that they could be tested (remember that this project is a debugger meant to assist us in future projects, so leaving flaws in it is likely sub-optimal).
This is an example of the output of the program:
lab46:~/src/SEMESTER/DESIG/PROJECT$ ./bdb0 example0 example1
==============================================================================================================
|000000e0 | 5325 2525 2525 2574 2525 7474 2574 7474 || 5325 2525 2525 2574 2525 7474 2574 7474 | 000000e0|
|000000f0 | 5252 5252 5252 5245 4545 4545 4545 4545 || 5525 1b5b 303b 353b 3337 3b34 366d 401b | 000000f0|
|00000100 | 5b30 3b31 3b33 363b 3437 6d53 1b5b 303b || 5b30 3b31 3b33 363b 3437 6d53 1b5b 303b | 00000100|
==============================================================================================================
This is the structure:
The middle line, called mismatch, is the focus of the program, and it should be displayed in a different color with the use of ANSI escape codes.
The program is to display 16 bytes prior to, 16 bytes of, and 16 bytes after the first byte of difference detected in the two files. If part of the 16 bytes before or the 16 bytes after the mismatch is not data from the file (that is, you have run past the start or the end of the file and there is no data to display), you may display it as a NULL character (0x00, '\0'). You can use other characters instead, such as dots or dashes, to represent the missing data. However, make sure the placeholder cannot be mistaken for data that occurs elsewhere in the file.
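The three-line window and its padding could be gathered along these lines. This is only a sketch under assumptions: the name read_window, the NO_DATA marker, and the idea of storing each byte as an int so that missing data can be told apart from a real 0x00 are choices made for this example, not requirements of the project.

```c
#include <assert.h>
#include <stdio.h>

#define LINE_BYTES   16
#define WINDOW_BYTES (3 * LINE_BYTES)
#define NO_DATA      (-1)   /* marks a position with no byte in the file */

/* Hypothetical helper: collect the line before, the line at, and the line
 * after line_start. Positions before the start of the file or past its end
 * are left as NO_DATA, so the caller can print any placeholder it likes. */
void read_window(FILE *f, long line_start, int window[WINDOW_BYTES])
{
    long begin = line_start - LINE_BYTES;   /* may be negative */
    long first = begin < 0 ? 0 : begin;     /* first real offset to read */

    assert(f != NULL && window != NULL);

    for (int i = 0; i < WINDOW_BYTES; i++)
        window[i] = NO_DATA;

    if (fseek(f, first, SEEK_SET) != 0)
        return;

    for (long pos = first; pos < begin + WINDOW_BYTES; pos++) {
        int c = fgetc(f);
        if (c == EOF)
            break;                          /* ran off the end of the file */
        window[pos - begin] = c;
    }
}
```

Keeping the "no data" marker outside the 0 to 255 range means the decision about how to draw it (NULLs, dots, dashes) can be made at print time.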
Example: In this example you see a mismatch happening right on the very first bytes of the file. Since there is no previous data to fetch, this lack of data is then displayed as NULL.
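The address portion is the offset of the first byte on each line, in hexadecimal format; in the sample output above it advances by 0x10 (16 bytes) from one line to the next.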
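Formatting an address as eight zero-padded hexadecimal digits, as in the sample output, might look like the sketch below. The function name format_address is an assumption for this example; the call it wraps is the standard snprintf.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: write an address as 8 lowercase hex digits with
 * leading zeros, e.g. 240 becomes "000000f0". Returns what snprintf
 * returns: the number of characters the full text needs. */
int format_address(unsigned long offset, char *dst, size_t size)
{
    assert(dst != NULL);
    return snprintf(dst, size, "%08lx", offset);
}
```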
The colors you use are up to you, but some coloring has to happen. So far, three parts of the output are required to be displayed in different colors: the addresses (on both sides), the data (on both sides), and the line of mismatched bytes.
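A small sketch of how the three colors could be applied with ANSI escape codes follows. The particular colors, the macro names, and the helper colorize are all arbitrary choices for illustration; any three distinguishable colors would satisfy the requirement as described here.

```c
#include <assert.h>
#include <stdio.h>

/* Example color choices only; the project leaves the colors up to you. */
#define ANSI_RESET     "\033[0m"
#define COLOR_ADDRESS  "\033[36m"   /* cyan  */
#define COLOR_DATA     "\033[37m"   /* white */
#define COLOR_MISMATCH "\033[31m"   /* red   */

/* Hypothetical helper: wrap text in a color code followed by a reset, so
 * the color does not leak into whatever is printed next. Returns the
 * length snprintf reports for the full string. */
int colorize(char *dst, size_t size, const char *color, const char *text)
{
    assert(dst != NULL && color != NULL && text != NULL);
    return snprintf(dst, size, "%s%s%s", color, text, ANSI_RESET);
}
```

Resetting after every colored piece is a little verbose, but it keeps the terminal from staying colored if the program exits early.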
Your program should print only the FIRST mismatch, NOT the entire document and NOT every mismatched line of bytes. JUST THE FIRST ONE, along with the previous and next bytes. Also make sure that you deal with edge cases in your program. Your program should be able to handle mismatches at the very start of the compared files and at the very end of the compared files.
When running the program with two DIFFERENT files, the program should display the line containing the first mismatched byte, along with the preceding and following 16 bytes.
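Locating the first mismatch can be sketched as a byte-by-byte scan. The function name first_mismatch and the convention of returning -1 for identical files are assumptions of this example; treating "one file ends before the other" as a mismatch at that offset is also a design choice, not something the project page states.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: return the offset of the first byte that differs
 * between the two streams, or -1 if they are identical. Both streams are
 * read from their current positions. */
long first_mismatch(FILE *a, FILE *b)
{
    long offset = 0;

    assert(a != NULL && b != NULL);

    for (;;) {
        int ca = fgetc(a);
        int cb = fgetc(b);

        if (ca != cb)
            return offset;   /* also true when only one stream hit EOF */
        if (ca == EOF)
            return -1;       /* both ended together: no mismatch */
        offset++;
    }
}
```

With this convention, a caller that receives -1 simply prints nothing, and any other value is the offset to build the output around.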
When running the program with two IDENTICAL files, the program should display nothing. The normal process of using make check to check your program against the desired result won't work in this case. The verify file is from a different year, when different specifications were desired. Running it this time around will lead you to many mismatches and headaches. You can check your output against the pictures provided above as a general guideline, but beyond that, if you have any questions it is essential to ask on Discord instead of guessing and hoping your output is what is desired. Most of the desired output is already described on the project page.
I think there are two approaches people have taken with the display of information with this project:
The main thing of note here is the position of the first mismatched byte in its line. In the first implementation, bytes are displayed sequentially, with the address of the leftmost byte always being a multiple of 16. The second implementation always places the mismatched byte in the leftmost position, never in the middle of a line. The approach you take comes down to personal preference, as with a lot of this project, since we don't have a strict verify script to check output against.
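The difference between the two approaches comes down to where the mismatch line starts. As a rough sketch (the function name line_start and the aligned flag are made up for this example):

```c
#include <assert.h>

#define LINE_BYTES 16

/* Hypothetical helper: choose the address of the leftmost byte on the
 * mismatch line.
 *   aligned != 0: first approach, round down to a multiple of 16.
 *   aligned == 0: second approach, the mismatched byte itself is leftmost. */
long line_start(long mismatch, int aligned)
{
    assert(mismatch >= 0);
    return aligned ? mismatch - (mismatch % LINE_BYTES) : mismatch;
}
```

For a mismatch at offset 0xf3, the first approach starts the line at 0xf0 and the second at 0xf3; the previous and following lines are then 16 bytes before and after that address in either case.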