User Tools

Site Tools


notes:discrete:fall2022:projects:rle2

This is an old revision of the document!


BACKGROUND

Version 3 of our RLE algorithm.

This time we will be focusing on conditional encoding and decoding of runs of similar data greater than one.


Run Length Encoding (RLE) Data Compression Algorithm

Run-length encoding (RLE) algorithm is a lossless compression of runs of data, where repeated sequences of data are stored as a representation of a single data value and its count (how many times it repeats consecutively). RLE encoding is commonly used in JPEG, TIFF, and PDF files, to name a few examples.

SPECIFICATIONS

Please reference the image below to find the hexadecimal value of the ASCII symbols:

*Our task is to ask questions on Discord or in class and document our findings on this wiki page collaboratively, regarding the functionality of this project.

*For anybody interested in editing the wiki page, here is the dokuwiki user guide: https://www.dokuwiki.org/wiki:syntax#basic_text_formatting -Ash

OUTPUT SPECIFICATIONS

Determine a unified means of output so that all submissions have an identical format.

PROGRAM

Arguments

./encode INFILE  OUTPATH STRIDE  CONTROL
 argv[0] argv[1] argv[2] argv[3] argv[4]

./encode sample0.txt some/other/directory/ 2 5
./decode INFILE  OUTPATH
 argv[0] argv[1] argv[2]

./decode sample0.txt some/other/directory/

Explanation

Input-file→the name of the file that is being encoded/decoded Outpath→??? Stride→How long the chain of bundled characters is (ie. 1ab is 2, 1abc is 3, etc)
Control→A value chosen to signal that the following bits are compressed, and thus will need to be decoded. Preferably the least common bit, as false positives are a possibility

DATA HEADER SPECIFICATIONS

(mostly the same as rle0)

Header Format:

byte 0: 0x72
byte 1: 0x6c
byte 2: 0x65
byte 3: 0x58
byte 4: 0x20
byte 5: 0x52
byte 6: 0x4c
byte 7: 0x45
byte 8: 0x(control byte)
byte 9: 0x03 (version)
byte 10: 0x(stride value) – changes depending on input
byte 11: 0xArgv The length of the source file name, not including NULL terminator
(how many characters in Argv - 1)
byte 12: The name of the source file, not including the NULL terminator

EXAMPLES

VERIFICATION

Derive a set of tests that all submissions should perform to ascertain correctness (state the tests, the inputs, and the expected outputs). In conjunction with conforming output specifications, all submissions should match (this is the basis for writing a verification script that can automate the process).

Which, being said: once output specifications and verification tests have been established, anyone writing a verification script to automate this can be eligible to receive bonus points.

PSEUDOCODE

notes/discrete/fall2022/projects/rle2.1667013826.txt.gz · Last modified: 2022/10/29 03:23 by abarbcal