Lab46 Wiki

This is an old revision of the document!

BACKGROUND

Run Length Encoding (RLE) Data Compression Algorithm

Run-length encoding (RLE) algorithm is a lossless compression of runs of data, where repeated sequences of data are stored as a representation of a single data value and its count (how many times it repeats consecutively). RLE encoding is commonly used in JPEG, TIFF, and PDF files, to name a few examples.

For anybody interested in editing the wiki page, here is the dokuwiki user guide: https://www.dokuwiki.org/wiki:syntax#basic_text_formatting -Ash

ALGORITHM: RLE

The RLE encoding algorithm first takes a file that contains a run of data in the following way:

Original:

aaaabbcdefgggg

Suppose this data sequence is contained within a .txt file.

The RLE algorithm will encode this data by replacing the repeated characters with a count number and a single value.

4a2b1c1d1e1f5g

There are 4 a's (0x61), 2 b's(0x62), 1 c(0x63), 1 d(0x64), 1 e(0x65), 1 f (0x66), and 5 g's (0x67).

Encoded:

04 61 02 62 01 63 01 64 01 65 01 66 05 67
^  ^  ^  ^  ^  ^  ^  ^  ^  ^  ^  ^  ^  ^
4  a  2  b  1  c  1  d  1  e  1  f  5  g

Please reference the image below to find the hexadecimal value of the ASCII symbols:

SPECIFICATIONS

Our task is to ask questions on Discord or in class, document our findings on this wiki page, and implement the RLE encoding/decoding algorithms in our own encoder and decoder programs.

While fetching data from the source file, use the function feof() from stdlib.h to ensure the end of file has been found; This will help you avoid any false positives resulting in unpredictable behavior.

Example:

while ( !feof( in ) )
{
     data = fgetc( in );
     fprintf ( out, "%c", data);
}

DATA HEADER SPECIFICATIONS

Header Format:

byte 0: 0x72
byte 1: 0x6c
byte 2: 0x65
byte 3: 0x58
byte 4: 0x20
byte 5: 0x52
byte 6: 0x4c
byte 7: 0x45
byte 8: 0x00 (reserved)
byte 9: 0x01 (version)
byte 10: 0x01 (stride value)
byte 11: 0xArgv The length of the source file name, not including NULL terminator
(how many characters in Argv - 1)
byte 12: The name of the source file, not including the NULL terminator

Info

Byte 8 - Reserved for a future project.
Byte 9 - Version number, this week is version 1.
Byte A - Width of comparison. 1 byte for this week's project.
Byte B - 'bob' would be 3.

In the case of 'bob', byte C would be 0x62('b'), byte D would be 0x6F('o'), and byte E would be 0x62('b').

PROGRAM

Encode program will take two arguments:

./encode INFILE  OUTFILE
 argv[0] argv[1] argv[2]

./encode sample0.txt sample0.txt.rle

Encoder will output a file with a name equivalent to the second argument.

Decode program will take one argument:

./decode INFILE
 argv[0] arfv[1]

./decode sample0.txt.rle

Decode will output a file of similar name without the .rle extension after running your decoder.

OUTPUT SPECIFICATIONS

Program should be able to encode a file such as sample0.txt and successfully output sample0.txt.rle in a similar manner to the file given to you when you initially grabbed the project. Checksums should be identical to each other. Similar to encoding, decoding should be identical with an input of sample0.txt.rle for example, it should output a file sample0.txt with a checksum equivalent to that of sample0.txt when you first grabbed the project.

Output some data after a successful run of the program with information of how many characters were encoded and how many characters were printed into the output file with a comparison percentage in regards to file compression.

Encoded: xxx bytes
Decoded: xxx bytes
File compression: %

VERIFICATION

All programs should be able to both encode/decode with interoperability in mind such that person_a's decoder must be able to decode a file that has been encoded by person_b's encoder, and vise versa.

eg.

some_file = "Hello World"
person_a -> encode(some_file)
person_b -> decode(some_file)

print(some_file)

>> Hello World

Lab46 Wiki

Sidebar

Table of Contents

BACKGROUND

ALGORITHM: RLE

SPECIFICATIONS

DATA HEADER SPECIFICATIONS

Info

PROGRAM

OUTPUT SPECIFICATIONS

VERIFICATION

Lab46 Wiki

User Tools

Site Tools

Sidebar

Table of Contents

BACKGROUND

ALGORITHM: RLE

SPECIFICATIONS

DATA HEADER SPECIFICATIONS

Info

PROGRAM

OUTPUT SPECIFICATIONS

VERIFICATION

Page Tools