
Corning Community College

CSCS2330 Discrete Structures

PROJECT: Run-Length Encoding (RLE2)

OBJECTIVE

To continue to explore the realm of algorithmic encoding/decoding of information, potentially achieving data compression in ideal scenarios, and collaboratively authoring and documenting the project and its specifications.

OVERVIEW

In rle0, you implemented fixed encoding with a single-byte stride.

In rle1, you implemented fixed encoding with a variable byte stride that stays consistent throughout the encoding.

In rle2, we explore the implementation of a control byte: one of the 256 possible byte values, reserved to kick off a conditional encoding sequence. As in rle1, the byte stride is variable but stays consistent throughout the encoding.

GRABIT

To assist with consistency across all implementations, data files for use with this project are available on lab46 via the grabit tool. Be sure to obtain it and ensure your implementation properly works with the provided data.

lab46:~/src/SEMESTER/DESIG$ grabit DESIG PROJECT


BACKGROUND

This is version 3 of our RLE algorithm.

This time we will be focusing on conditional encoding and decoding: only runs of repeated data longer than one are encoded.


Run Length Encoding (RLE) Data Compression Algorithm

Run-length encoding (RLE) is a lossless compression algorithm for runs of data: a sequence that repeats consecutively is stored as a single data value and its count (how many times it repeats). RLE is commonly used in JPEG, TIFF, and PDF files, to name a few examples.
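
As a rough illustration of the value-and-count idea described above, here is a minimal sketch for a stride of 1 that always emits (count, value) pairs. It is not the rle2 file format (it has no header and no control byte), and the function name rle_pairs is made up for this example.

```c
#include <stddef.h>

/* Encode src[0..len) as (count, value) byte pairs into dst.
 * dst must have room for 2 * len bytes (worst case: no repeats).
 * Returns the number of bytes written to dst. */
size_t rle_pairs(const unsigned char *src, size_t len, unsigned char *dst)
{
    size_t in  = 0;
    size_t out = 0;

    while (in < len)
    {
        unsigned char value = src[in];
        unsigned char count = 0;

        /* Count consecutive repeats, capping at 255 so the count fits in one byte */
        while (in < len && src[in] == value && count < 255)
        {
            count++;
            in++;
        }

        dst[out++] = count;
        dst[out++] = value;
    }

    return out;
}
```

Given the six bytes "aaabcc", this writes 03 'a' 01 'b' 02 'c'. Note that data without repeats doubles in size, which is the cost that conditional encoding with a control byte is meant to avoid.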

SPECIFICATIONS

If the control byte occurs in your data as ordinary data (that is, it is not the start of an encoded sequence), both your encoder and your decoder need special handling. When the encoder meets this case, its output should be: control_byte 01 01 control_byte. The decoder should recognize that sequence and decode it as a single literal control_byte, where control_byte is whatever control byte belongs to that file.
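
The literal control byte case could be handled with two small helpers like the ones sketched below. This covers only the four-byte sequence stated above; it makes no assumption about what the two 01 bytes mean or about the layout of a general encoded run. The function names are hypothetical.

```c
#include <stddef.h>

/* Write the four-byte sequence for a literal control byte:
 * control_byte 01 01 control_byte.
 * dst must have room for 4 bytes. Returns the number of bytes written. */
size_t emit_literal_control(unsigned char control, unsigned char *dst)
{
    dst[0] = control;
    dst[1] = 0x01;
    dst[2] = 0x01;
    dst[3] = control;

    return 4;
}

/* Return 1 if the next bytes at src are the literal-control sequence,
 * 0 otherwise. remaining is how many bytes are readable at src. */
int is_literal_control(const unsigned char *src, size_t remaining,
                       unsigned char control)
{
    return remaining >= 4
        && src[0] == control
        && src[1] == 0x01
        && src[2] == 0x01
        && src[3] == control;
}
```

With a control byte of 0x62, emit_literal_control() writes 62 01 01 62, and a decoder that sees is_literal_control() return 1 would output the single byte 0x62 and skip ahead four bytes.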

REFERENCES

STRIDES and CONTROL BYTES for given files:
sample0.txt: Stride: 1, Control Byte: 0x62(b)
sample1.txt: Stride: 30, Control Byte: 0xff(255)
sample2.bmp: Stride: 6, Control Byte: 0x25(37)
sample3.wav: Stride: 73, Control Byte: 0x17(23)
sample5.txt: Stride: 4, Control Byte: 0x29(41)

Please reference the image below to find the hexadecimal value of the ASCII symbols:

*Our task is to ask questions on Discord or in class and document our findings on this wiki page collaboratively, regarding the functionality of this project.

*For anybody interested in editing the wiki page, here is the dokuwiki user guide: https://www.dokuwiki.org/wiki:syntax#basic_text_formatting -Ash

OUTPUT SPECIFICATIONS

Determine a unified means of output so that all submissions have an identical format.

PROGRAM

Arguments
./encode INFILE  OUTPATH STRIDE  CONTROL
 argv[0] argv[1] argv[2] argv[3] argv[4]

./encode sample0.txt some/other/directory/ 2 5
./decode INFILE  OUTPATH
 argv[0] argv[1] argv[2]

./decode sample0.txt some/other/directory/

NOTE The control value is a single byte (0-255). When running encode for this project, enter it as its hexadecimal digits: '61' for a, '62' for b, and so on. The argument arrives as a string, so rle2 must convert the text "61" into the byte value 0x61 (97 in decimal). One way to do this is sscanf() with the %x conversion. The stride argument is also a string and likewise needs to be converted to a number.

NOTE To convert a string argument holding a decimal number into a usable value, assign the result of atoi() to your variable, passing in the string argument you wish to convert. atoi() returns an int, so the result can be stored in an int or, for values from 0 to 255, in an unsigned char. Once the value is in an unsigned char, it is easy to put it into the header array with a simple assignment.
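
The two notes above might translate into helpers like the following sketch. It assumes the control argument is typed as hexadecimal digits (as in the '61' for a example) and the stride as a decimal number from 1 to 255 so that it fits in one header byte. strtol() is used in place of atoi() only because it allows error checking. The function names are hypothetical.

```c
#include <stdio.h>
#include <stdlib.h>

/* Parse a decimal string such as "30" into a one-byte stride.
 * Assumption: valid strides are 1..255.
 * Returns 0 on success, -1 on bad input. */
int parse_stride(const char *text, unsigned char *stride)
{
    char *end  = NULL;
    long value = strtol(text, &end, 10);

    /* Reject empty input, trailing junk, and out-of-range values */
    if (end == text || *end != '\0' || value < 1 || value > 255)
    {
        return -1;
    }

    *stride = (unsigned char) value;
    return 0;
}

/* Parse hexadecimal digits such as "61" into a control byte.
 * Returns 0 on success, -1 on bad input. */
int parse_control(const char *text, unsigned char *control)
{
    unsigned int value = 0;
    char trailing;

    /* Exactly one conversion must succeed: %c matching means trailing junk */
    if (sscanf(text, "%x%c", &value, &trailing) != 1 || value > 0xff)
    {
        return -1;
    }

    *control = (unsigned char) value;
    return 0;
}
```

In main(), these would be called as parse_stride(argv[3], &stride) and parse_control(argv[4], &control), printing a usage message and exiting if either returns -1.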

Explanation

Input-file→The name of the file that is being encoded/decoded

Outpath→The path to the output file NOT including the filename (i.e. '~/rle2/x.rle' → '~/rle2/')

Stride→How many data bytes are bundled together as one unit (e.g. 1ab is 2, 1abc is 3, etc.)

Control→A byte value chosen to signal that the bytes that follow are encoded, and thus will need to be decoded. Preferably the least common byte in the input, as false positives are a possibility
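
One possible way to find the least common byte for a given input is to count how often each of the 256 byte values occurs. The sketch below assumes the data has already been read into memory; the function name and the tie-breaking rule (lowest byte value wins) are choices made for this example only.

```c
#include <stddef.h>

/* Return the byte value that occurs least often in data[0..len).
 * Ties go to the lowest byte value. */
unsigned char least_common_byte(const unsigned char *data, size_t len)
{
    size_t counts[256] = { 0 };
    size_t index       = 0;
    int    candidate   = 0;
    int    best        = 0;

    /* Build a histogram of byte values */
    for (index = 0; index < len; index++)
    {
        counts[data[index]]++;
    }

    /* Strictly-less comparison keeps the lowest value on ties */
    for (candidate = 1; candidate < 256; candidate++)
    {
        if (counts[candidate] < counts[best])
        {
            best = candidate;
        }
    }

    return (unsigned char) best;
}
```

A byte that never appears in the input is the ideal control byte, since the literal control byte case then never triggers.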

DATA HEADER SPECIFICATIONS

(mostly the same as rle0)

Header Format:

byte 0: 0x72 ('r')
byte 1: 0x6c ('l')
byte 2: 0x65 ('e')
byte 3: 0x58 ('X')
byte 4: 0x20 (space)
byte 5: 0x52 ('R')
byte 6: 0x4c ('L')
byte 7: 0x45 ('E')
byte 8: the control byte
byte 9: 0x03 (version)
byte 10: the stride value (changes depending on input)
byte 11: the length of the source file name, not including the NULL terminator
(how many characters are in the file name argument from argv)
byte 12 onward: the name of the source file, not including the NULL terminator
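
The header layout above could be assembled in memory as in this sketch before being written to the output file with fwrite(). The function name build_header and the choice to build into a caller-supplied buffer are assumptions made for this example.

```c
#include <stddef.h>
#include <string.h>

/* Fill dst with the rle2 header for the given control byte, stride,
 * and source file name. cap is the size of dst in bytes.
 * Returns the header length, or 0 if the name is too long for the
 * one-byte length field or dst is too small. */
size_t build_header(unsigned char *dst, size_t cap, unsigned char control,
                    unsigned char stride, const char *name)
{
    /* Bytes 0-7: the ASCII text "rleX RLE" */
    static const unsigned char magic[8] =
        { 0x72, 0x6c, 0x65, 0x58, 0x20, 0x52, 0x4c, 0x45 };
    size_t name_len = strlen(name);

    if (name_len > 255 || cap < 12 + name_len)
    {
        return 0;
    }

    memcpy(dst, magic, 8);
    dst[8]  = control;                   /* byte 8:  control byte         */
    dst[9]  = 0x03;                      /* byte 9:  version              */
    dst[10] = stride;                    /* byte 10: stride               */
    dst[11] = (unsigned char) name_len;  /* byte 11: file name length     */
    memcpy(dst + 12, name, name_len);    /* byte 12 onward: name, no NULL */

    return 12 + name_len;
}
```

For sample0.txt with control byte 0x62 and stride 1, the header is 23 bytes long: 12 fixed bytes plus the 11 characters of the file name.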

EXAMPLES

VERIFICATION

An eval script is included, but it does not seem to work and segfaults when run. To verify manually, you can compare the checksums of your output files against the provided files. Example: the file produced by ./decode sample0.txt.rle should have the same checksum as the sample0.txt provided with the project data. You can do this for all provided files.
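
As an alternative to comparing checksums by hand, a byte-for-byte comparison of two files can be done in a few lines of C. This is only a sketch of one possible check, not the provided eval script; the function name files_identical is hypothetical.

```c
#include <stdio.h>

/* Compare two files byte for byte.
 * Returns 1 if identical, 0 if they differ, -1 if either cannot be opened. */
int files_identical(const char *path_a, const char *path_b)
{
    FILE *a      = fopen(path_a, "rb");
    FILE *b      = fopen(path_b, "rb");
    int   result = 1;
    int   byte_a = 0;
    int   byte_b = 0;

    if (a == NULL || b == NULL)
    {
        if (a != NULL) fclose(a);
        if (b != NULL) fclose(b);
        return -1;
    }

    /* Read both files in step; stop at the first mismatch or at EOF.
     * A length difference shows up as EOF on one side only. */
    do
    {
        byte_a = fgetc(a);
        byte_b = fgetc(b);

        if (byte_a != byte_b)
        {
            result = 0;
        }
    } while (result == 1 && byte_a != EOF);

    fclose(a);
    fclose(b);

    return result;
}
```

A round-trip test would encode a sample file, decode the result, and then call files_identical() on the original and the decoded copy.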

Derive a set of tests that all submissions should perform to ascertain correctness (state the tests, the inputs, and the expected outputs). In conjunction with conforming output specifications, all submissions should match (this is the basis for writing a verification script that can automate the process).

That being said: once output specifications and verification tests have been established, anyone writing a verification script to automate this can be eligible to receive bonus points.

PSEUDOCODE

 

SUBMISSION

To be successful in this project, the following criteria (or their equivalent) must be met:

  • Project must be submitted on time, by the deadline.
    • Late submissions will lose 33% credit per day, with the submission window closing on the 3rd day following the deadline.
  • All code must compile cleanly (no warnings or errors)
    • Compile with the -Wall and -std=gnu18 compiler flags
    • all requested functionality must conform to stated requirements (either on this document or in a comment banner in source code files themselves).
  • Executed programs must display in a manner similar to provided output
    • output formatting, where applicable, must match that of project requirements
  • Processing must be correct based on input given and output requested
  • Output, if applicable, must be correct based on values input
  • Code must be nicely and consistently indented
  • Code must be consistently written, to strive for readability from having a consistent style throughout
  • Code must be commented
    • Any “to be implemented” comments MUST be removed
      • these “to be implemented” comments, if still present at evaluation time, will result in points being deducted.
      • Sufficient comments explaining the point of provided logic MUST be present
  • No global variables (without instructor approval), no goto statements, no calling of main()!
  • Track/version the source code in your lab46 semester repository
  • Submit a copy of your source code to me using the submit tool (make submit on lab46 will do this) by the deadline.

Submit Tool Usage

Once you have completed work on the project and are ready to submit, you would do the following:

lab46:~/src/SEMESTER/DESIG/PROJECT$ make submit

You should get some sort of confirmation indicating successful submission if all went according to plan. If not, check for typos and/or location mismatches.

RUBRIC

I'll be evaluating the project based on the following criteria:

156:rle2:final tally of results (156/156)
*:rle2:used grabit to obtain project by the Sunday prior to due date [13/13]
*:rle2:clean compile, no compiler messages [13/13]
*:rle2:implementation passes verification tests [39/39]
*:rle2:adequate modifications to code from template [39/39]
*:rle2:program operations conform to project specifications [39/39]
*:rle2:code tracked in lab46 semester repo [13/13]

Pertaining to the collaborative authoring of project documentation

  • each class member is to participate in the contribution of relevant information and formatting of the documentation
    • minimal member contributions consist of:
      • near the class average edits (a value of at least four productive edits)
      • near the class content change average (a value of at least 256 bytes (absolute value of data content change))
      • near the class content contribution average (a value of at least 1kiB)
      • no adding in one commit then later removing in its entirety for the sake of satisfying edit requirements
    • adding and formatting data in an organized fashion, aiming to create an informative and readable document that anyone in the class can reference
    • content contributions will be factored into a documentation coefficient, a value multiplied against your actual project submission to influence the end result:
      • no contributions, coefficient is 0.50
      • less than minimum contributions is 0.75
      • met minimum contribution threshold is 1.00

Additionally

  • Solutions not abiding by the spirit of the project will be subject to a 50% overall deduction
  • Solutions not utilizing descriptive why and how comments will be subject to a 25% overall deduction
  • Solutions not utilizing indentation to promote scope and clarity or otherwise maintaining consistency in code style and presentation will be subject to a 25% overall deduction
  • Solutions not organized and easy to read (assume a terminal at least 90 characters wide, 40 characters tall) are subject to a 25% overall deduction
haas/fall2024/discrete/projects/rle2.txt · Last modified: 2022/11/06 15:21 by 127.0.0.1