Corning Community College
CSCS2330 Discrete Structures
To continue to explore the realm of algorithmic encoding/decoding of information, potentially achieving data compression in ideal scenarios, and collaboratively authoring and documenting the project and its specifications.
In rle0, you implemented a fixed encoding with a single-byte stride.
In rle1, you implemented a fixed encoding with a variable byte stride that stays constant throughout a given encoding.
In rle2, we explore the implementation of a control byte: one of the 256 possible byte values is reserved to kick off a conditional encoding sequence, again using a variable byte stride that stays constant throughout a given encoding.
To assist with consistency across all implementations, data files for use with this project are available on lab46 via the grabit tool. Be sure to obtain them and ensure your implementation works properly with the provided data.
lab46:~/src/SEMESTER/DESIG$ grabit DESIG PROJECT
You will want to go here to edit and fill in the various sections of the document:
Version 3 of our RLE algorithm.
This time we will be focusing on conditional encoding and decoding of runs of identical data longer than one.
Run Length Encoding (RLE) Data Compression Algorithm
Run-length encoding (RLE) is a lossless compression technique for runs of data, in which each repeated sequence is stored as a single data value and its count (how many times it repeats consecutively). RLE is commonly used in JPEG, TIFF, and PDF files, to name a few examples.
If the control byte appears in your data as ordinary data (that is, it does not introduce an encoded run), you will have to make modifications to your encode/decode process. When the encoder meets this case, the output should be: control_byte 01 01 control_byte. Your decoder should recognize this sequence and decode it back to the single byte “control_byte”, where control_byte is whatever control byte belongs to that file.
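As a point of reference for the description above, here is a minimal sketch of plain run-length encoding with a stride of one byte and no control byte. The function name rle_encode_simple and the count-then-value ordering are assumptions made for illustration; this is not the rle2 format itself.

```c
#include <stddef.h>

/*
 * Minimal run-length encoder: each run of identical bytes becomes a
 * (count, value) pair. Runs longer than 255 are split, since the count
 * must fit in one byte. Returns the number of bytes written to out,
 * which must have room for 2 * n bytes in the worst case.
 */
size_t rle_encode_simple(const unsigned char *in, size_t n, unsigned char *out)
{
    size_t i = 0, o = 0;

    while (i < n) {
        unsigned char value = in[i];
        size_t run = 1;

        /* Extend the run while the next byte matches and the count fits. */
        while (i + run < n && in[i + run] == value && run < 255)
            run++;

        out[o++] = (unsigned char)run;  /* count first (assumed order) */
        out[o++] = value;               /* then the repeated value */
        i += run;
    }
    return o;
}
```

With the six input bytes "aaabcc" this writes 03 'a' 01 'b' 02 'c'. The lone 'b' costs two bytes instead of one, which is the overhead a control byte is meant to avoid.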
STRIDES and CONTROL BYTES for given files:
sample0.txt: Stride: 1, Control Byte: 0x62(b)
sample1.txt: Stride: 30, Control Byte: 0xff(255)
sample2.bmp: Stride: 6, Control Byte: 0x25(37)
sample3.wav: Stride: 73, Control Byte: 0x17(23)
sample5.txt: Stride: 4, Control Byte: 0x29(41)
Please reference the image below to find the hexadecimal value of the ASCII symbols:
*Our task is to ask questions on Discord or in class and document our findings on this wiki page collaboratively, regarding the functionality of this project.
*For anybody interested in editing the wiki page, here is the dokuwiki user guide: https://www.dokuwiki.org/wiki:syntax#basic_text_formatting -Ash
Determine a unified means of output so that all submissions have an identical format.
./encode INFILE  OUTPATH STRIDE  CONTROL
argv[0]  argv[1] argv[2] argv[3] argv[4]
./encode sample0.txt some/other/directory/ 2 5
./decode INFILE  OUTPATH
argv[0]  argv[1] argv[2]
./decode sample0.txt some/other/directory/
NOTE: The control value must fit in a single byte (0-255). When running encode for this project, the control value is typed as the hexadecimal digits of the byte: '61' for a, '62' for b, and so on. The argument arrives as a string, so rle2 needs to convert that text (for example "61") into the byte value 0x61. One way to do this is sscanf() with the %x conversion. The same is true of the stride.
NOTE: To convert a string argument that holds a decimal number, assign the result of atoi() to your variable, passing it the string argument you wish to convert. The result can be stored in an unsigned char as well as an int, and once it is in an unsigned char it is easy to put it into the header array with a simple assignment. Keep in mind that atoi() reads decimal digits, so atoi("61") yields 61 (0x3d), not 0x61.
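The two notes above can be compared with a small sketch. It assumes the argument is typed as hexadecimal digits, as in the '61' for a example; the helper name parse_hex_byte is hypothetical and not part of the project template.

```c
#include <stdio.h>

/*
 * Parse a command-line argument as hexadecimal digits ("61" -> 0x61).
 * Returns 0 on success and stores the byte in *out; returns -1 if the
 * text is not valid hex or does not fit in one byte.
 */
int parse_hex_byte(const char *text, unsigned char *out)
{
    unsigned int value = 0;
    char extra;

    /* The trailing %c catches junk after the number, e.g. "6z". */
    if (sscanf(text, "%x%c", &value, &extra) != 1 || value > 0xFF)
        return -1;

    *out = (unsigned char)value;
    return 0;
}
```

Called as parse_hex_byte(argv[4], &control), the string "61" gives 0x61 ('a'), whereas atoi("61") would give decimal 61. Whichever convention the class settles on, the encoder and the documentation should agree on it.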
Input-file → The name of the file that is being encoded/decoded
Outpath → The path to the output file, NOT including the filename (i.e. '~/rle2/x.rle' → '~/rle2/')
Stride → How many bytes are bundled together as one unit of a run (e.g. 1ab is a stride of 2, 1abc is a stride of 3, etc.)
Control → A byte value chosen to signal that the bytes that follow are compressed, and thus will need to be decoded. Preferably the least common byte in the file, as false positives are a possibility
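Since Outpath is a directory without a filename, the encoder has to assemble the full output name itself. The sketch below is one possible way to do that; the helper name make_output_path is hypothetical, and the ".rle" suffix is an assumption based on the sample0.txt.rle name used elsewhere on this page.

```c
#include <stdio.h>
#include <string.h>

/*
 * Join an output directory and a file name, adding ".rle" and a '/'
 * separator if the directory lacks one. Returns 0 on success, -1 if
 * the result does not fit in dest.
 */
int make_output_path(char *dest, size_t size, const char *outpath, const char *name)
{
    size_t len = strlen(outpath);
    const char *sep = (len > 0 && outpath[len - 1] != '/') ? "/" : "";
    int written = snprintf(dest, size, "%s%s%s.rle", outpath, sep, name);

    /* snprintf reports the length it wanted; reject truncated results. */
    return (written < 0 || (size_t)written >= size) ? -1 : 0;
}
```

For example, make_output_path(buf, sizeof buf, "some/other/directory/", "sample0.txt") fills buf with some/other/directory/sample0.txt.rle.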
(mostly the same as rle0)
Header Format:
byte 0: 0x72
byte 1: 0x6c
byte 2: 0x65
byte 3: 0x58
byte 4: 0x20
byte 5: 0x52
byte 6: 0x4c
byte 7: 0x45
byte 8: 0x(control byte)
byte 9: 0x03 (version)
byte 10: 0x(stride value) – changes depending on input
byte 11: 0x(name length) – the length of the source file name as given in argv[1], not including the NULL terminator
byte 12 onward: the name of the source file, not including the NULL terminator
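The header layout above can be sketched as a single function that fills a buffer. The name build_header and the RLE_VERSION macro are hypothetical; the byte values come from the list above, and the check on the name length is an assumption that follows from byte 11 being a single byte.

```c
#include <stddef.h>
#include <string.h>

#define RLE_VERSION 0x03

/*
 * Fill buf with the rle2 header described above and return its length
 * (12 fixed bytes plus the file name), or 0 if the name cannot be
 * described by a one-byte length. buf needs room for 12 + 255 bytes.
 */
size_t build_header(unsigned char *buf, unsigned char control,
                    unsigned char stride, const char *name)
{
    static const unsigned char magic[8] =
        { 0x72, 0x6c, 0x65, 0x58, 0x20, 0x52, 0x4c, 0x45 }; /* "rleX RLE" */
    size_t name_len = strlen(name);

    if (name_len == 0 || name_len > 255)
        return 0;

    memcpy(buf, magic, sizeof magic);       /* bytes 0-7 */
    buf[8]  = control;                      /* byte 8    */
    buf[9]  = RLE_VERSION;                  /* byte 9    */
    buf[10] = stride;                       /* byte 10   */
    buf[11] = (unsigned char)name_len;      /* byte 11   */
    memcpy(buf + 12, name, name_len);       /* byte 12 on, no terminator */

    return 12 + name_len;
}
```

For sample0.txt with control byte 0x62 and stride 1, this produces a 23-byte header: the 12 fixed bytes followed by the 11 characters of the name.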
An eval script is included with the project files; however, it doesn't seem to work and segfaults when run. To verify manually, you can compare the checksums of your output files against the provided files. Example: the output of ./decode sample0.txt.rle should have the same checksum as the sample0.txt in the provided data. You can do this for all given files.
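Until the eval script is fixed, a direct byte-for-byte comparison gives the same answer as comparing checksums. The sketch below is a substitute for the checksum step, not the course's eval script, and the function name files_identical is hypothetical.

```c
#include <stdio.h>

/*
 * Compare two files byte by byte.
 * Returns 1 if identical, 0 if they differ, -1 if either cannot be opened.
 */
int files_identical(const char *path_a, const char *path_b)
{
    FILE *a = fopen(path_a, "rb");
    FILE *b = fopen(path_b, "rb");
    int same = -1;

    if (a != NULL && b != NULL) {
        int ca, cb;

        /* Read in lockstep until the bytes differ or one file ends. */
        do {
            ca = fgetc(a);
            cb = fgetc(b);
        } while (ca == cb && ca != EOF);

        same = (ca == cb);  /* both hit EOF together only if identical */
    }

    if (a != NULL) fclose(a);
    if (b != NULL) fclose(b);
    return same;
}
```

A round trip passes when the decoded file and the original sample compare as identical; the same check works for comparing your encoded output against a provided .rle file.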
Derive a set of tests that all submissions should perform to ascertain correctness (state the tests, the inputs, and the expected outputs). In conjunction with conforming output specifications, all submissions should match (this is the basis for writing a verification script that can automate the process).
That being said: once output specifications and verification tests have been established, anyone who writes a verification script to automate this may be eligible to receive bonus points.
To be successful in this project, the following criteria (or their equivalent) must be met:
Once you have completed work on the project and are ready to submit, you would do the following (assuming you have a program called uom0.c):
lab46:~/src/SEMESTER/DESIG/PROJECT$ make submit
You should get some sort of confirmation indicating successful submission if all went according to plan. If not, check for typos and/or location mismatches.
I'll be evaluating the project based on the following criteria:
156:rle2:final tally of results (156/156)
*:rle2:used grabit to obtain project by the Sunday prior to duedate [13/13]
*:rle2:clean compile, no compiler messages [13/13]
*:rle2:implementation passes verification tests [39/39]
*:rle2:adequate modifications to code from template [39/39]
*:rle2:program operations conform to project specifications [39/39]
*:rle2:code tracked in lab46 semester repo [13/13]