This shows you the differences between two versions of the page.
haas:fall2017:discrete:projects:dcf1 [2017/09/08 12:42] – [Other specification details] wedge | haas:fall2017:discrete:projects:dcf1 [2017/10/09 14:21] (current) – wedge | ||
---|---|---|---|
Line 3: | Line 3: | ||
< | < | ||
</ | </ | ||
- | |||
- | ~~TOC~~ | ||
======Project: | ======Project: | ||
+ | |||
+ | =====Errata===== | ||
+ | Any changes that have been made. | ||
+ | |||
+ | * Revision 0.1: Updated the dcfX v2 spec and added some additional implementation constraints (20170907) | ||
+ | * Revision 0.2: Finalized project data files, adapted included ' | ||
+ | * Revision 0.3: Updated check script so it no longer reports false negatives. Run **make getdata** to grab the updated copy (20170921) | ||
=====Objective===== | =====Objective===== | ||
Line 55: | Line 60: | ||
====Header==== | ====Header==== | ||
It is actually **identical** to the specifications of last week, save for four changes: | It is actually **identical** to the specifications of last week, save for four changes: | ||
- | - we're no longer hard-coding the **stride** value to 1 (byte 10) | + | - we're no longer hard-coding the **stride** value to 1 (byte 10); instead it is obtained from the command line (argv[3]) and may be any value between 1 and 255 (inclusive). |
- we're placing a 2 in the version byte (byte 9) | - we're placing a 2 in the version byte (byte 9) | ||
- the embedded source file name will now be stripped of any path (ie " | - the embedded source file name will now be stripped of any path (ie " | ||
- the destination argument (argv[2]) is now merely a path, NOT a path+filename (ie " | - the destination argument (argv[2]) is now merely a path, NOT a path+filename (ie " | ||
* the destination file is a combination of the destination path + source filename + " | * the destination file is a combination of the destination path + source filename + " | ||
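The stride and path changes above can be sketched in C as follows. This is only a minimal illustration, not the required implementation: the function names **parse_stride** and **strip_path** are hypothetical, and the choice of returning -1 for a bad stride is an assumption (the spec decides how errors are actually reported).

```c
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: convert a command-line string (such as argv[3])
 * to a stride value. Returns the stride (1..255) on success, or -1 if
 * the text is not a whole decimal number in that range. */
int parse_stride(const char *text)
{
    char *end = NULL;
    long value;

    if (text == NULL || *text == '\0')
        return -1;

    errno = 0;
    value = strtol(text, &end, 10);

    if (errno != 0 || *end != '\0')   /* overflow, or trailing junk */
        return -1;
    if (value < 1 || value > 255)     /* outside the range that fits byte 10 */
        return -1;

    return (int) value;
}

/* Hypothetical helper: return a pointer to the file name portion of a
 * path, i.e. whatever follows the last '/'. The returned pointer refers
 * into the original string; nothing is copied. */
const char *strip_path(const char *path)
{
    const char *slash = strrchr(path, '/');

    return (slash != NULL) ? slash + 1 : path;
}
```

In an encoder's **main()**, something like `int stride = parse_stride(argv[3]);` (after checking **argc**) would then yield either a usable stride or a clear signal to report an error and exit.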
+ | |||
+ | And specifically for **decode**, the source filename will be retrieved from the post-header information at the start of the encoded file. | ||
Every RL-encoded file will start with the following 12-byte header: | Every RL-encoded file will start with the following 12-byte header: | ||
Line 162: | Line 169: | ||
‘/ | ‘/ | ||
‘/ | ‘/ | ||
- | ‘/ | + | ... |
- | ‘/ | + | |
- | ‘/ | + | |
- | ‘/ | + | |
- | ‘/ | + | |
- | ‘/ | + | |
- | ‘/ | + | |
- | ‘/ | + | |
make: Leaving directory '/ | make: Leaving directory '/ | ||
lab46: | lab46: | ||
Line 227: | Line 226: | ||
<cli> | <cli> | ||
- | lab46: | + | lab46: |
- | input name length: | + | |
- | | + | input filename: |
- | output filename: | + | embedded name length: 11 bytes |
- | | + | embedded file name: sample3.txt |
- | read in: 250934 | + | |
- | | + | output filename: |
- | | + | stride value: |
+ | | ||
+ | data written out: 78 bytes | ||
+ | total written | ||
+ | compression rate: 4.88% | ||
lab46: | lab46: | ||
</ | </ | ||
- | With various formats, you'll likely want to play with the stride in order to find better compression | + | Similarly, if we were to encode the **sample2.bmp** data file from dcf0 with the right stride, we can actually achieve a notable amount of compression (unlike our results from dcf0 with a stride fixed at 1 byte): |
+ | |||
+ | < | ||
+ | lab46: | ||
+ | input name length: 22 bytes | ||
+ | input filename: ../ | ||
+ | embedded name length: 11 bytes | ||
+ | embedded file name: sample2.bmp | ||
+ | output name length: 19 bytes | ||
+ | | ||
+ | stride value: 37 bytes | ||
+ | read in: 250934 bytes | ||
+ | data written out: 183730 bytes | ||
+ | total written out: 183753 bytes | ||
+ | compression rate: 26.78% | ||
+ | lab46: | ||
+ | </ | ||
+ | |||
+ | With various formats, you'll likely want to play with the stride in order to find better compression | ||
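For reference, here is one way the **compression rate** line in the sample output could be computed. The formula is an assumption inferred from the numbers shown (it is not quoted from the spec): the figures for sample2.bmp only work out to 26.78% when the encoded data size is used, without the header and embedded name. The function name **compression_rate** is hypothetical.

```c
/* Percentage of space saved, computed from the input size and the
 * encoded data size (header and embedded file name not counted).
 * Assumption: formula inferred from the sample output above.
 * Returns 0.0 for an empty input to avoid dividing by zero. */
double compression_rate(unsigned long bytes_read, unsigned long data_written)
{
    if (bytes_read == 0)
        return 0.0;

    return (1.0 - (double) data_written / (double) bytes_read) * 100.0;
}
```

Displayed with `printf("compression rate: %.2f%%\n", compression_rate(250934UL, 183730UL));`, this reproduces the 26.78% shown for sample2.bmp.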
====Decode==== | ====Decode==== | ||
<cli> | <cli> | ||
- | lab46: | + | lab46: |
- | input filename: | + | input filename: |
output name length: 11 bytes | output name length: 11 bytes | ||
| | ||
Line 254: | Line 275: | ||
</ | </ | ||
=====Check Results===== | =====Check Results===== | ||
+ | |||
A good way to test that both encode and decode are working is to encode data then immediately turn around and decode that same data. If the decoded file is in the same state as the original, pre-encoded file, you know things are working. | A good way to test that both encode and decode are working is to encode data then immediately turn around and decode that same data. If the decoded file is in the same state as the original, pre-encoded file, you know things are working. | ||
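To make the round-trip idea concrete, here is a small in-memory sketch of stride-based run-length encoding and decoding. It is an illustration under assumptions, not the project's reference algorithm: the record layout (one count byte followed by one stride-sized chunk, with a short final chunk stored as its own record with a count of 1) is inferred from the sample3.txt hex dump on this page, and the names **rle_encode**, **rle_decode**, and **RLE_ERROR** are hypothetical. It also skips the 12-byte header and embedded file name entirely.

```c
#include <stddef.h>
#include <string.h>

#define RLE_ERROR ((size_t) -1)   /* hypothetical error marker */

/* Encode len bytes from in, stride bytes at a time. Each record is a
 * count byte (1..255) followed by the chunk it applies to.
 * Returns bytes written to out, or RLE_ERROR on bad stride/no room. */
size_t rle_encode(const unsigned char *in, size_t len, size_t stride,
                  unsigned char *out, size_t cap)
{
    size_t pos = 0, used = 0;

    if (stride < 1 || stride > 255)
        return RLE_ERROR;

    while (pos < len) {
        size_t chunk = (len - pos < stride) ? len - pos : stride;
        size_t count = 1;

        /* count back-to-back repeats of this (full-sized) chunk */
        while (chunk == stride && count < 255 &&
               pos + (count + 1) * stride <= len &&
               memcmp(in + pos, in + pos + count * stride, stride) == 0)
            count++;

        if (used + 1 + chunk > cap)
            return RLE_ERROR;

        out[used++] = (unsigned char) count;
        memcpy(out + used, in + pos, chunk);
        used += chunk;
        pos += count * chunk;
    }
    return used;
}

/* Reverse of rle_encode: expand each (count, chunk) record.
 * Returns bytes written to out, or RLE_ERROR on malformed input. */
size_t rle_decode(const unsigned char *in, size_t len, size_t stride,
                  unsigned char *out, size_t cap)
{
    size_t pos = 0, used = 0;

    if (stride < 1 || stride > 255)
        return RLE_ERROR;

    while (pos < len) {
        size_t count = in[pos++];
        size_t chunk = (len - pos < stride) ? len - pos : stride;
        size_t i;

        if (count == 0 || chunk == 0)       /* malformed record */
            return RLE_ERROR;
        if (count * chunk > cap - used)     /* output would overflow */
            return RLE_ERROR;

        for (i = 0; i < count; i++) {
            memcpy(out + used, in + pos, chunk);
            used += chunk;
        }
        pos += chunk;
    }
    return used;
}
```

Encoding a buffer, decoding the result with the same stride, and comparing against the original with **memcmp()** is the in-memory equivalent of the file-level check described here.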
- | If you'd like to verify your implementations beyond simply encoding | + | ====diff compare==== |
+ | A quick way to check if two files are identical is to run the **diff(1)** command on them, so assuming | ||
- | Run it on the original unencoded | + | < |
+ | lab46: | ||
+ | lab46: | ||
+ | </ | ||
+ | |||
+ | Getting your prompt back with no output indicates that no differences were found. | ||
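If you are curious what an "are these two files identical?" check boils down to, here is a minimal C sketch. It is not how **diff(1)** is implemented (diff does considerably more); it only answers the yes/no question. The function name **files_identical** and its return codes are hypothetical.

```c
#include <stdio.h>

/* Compare two files byte by byte.
 * Returns 1 if identical, 0 if they differ (in content or length),
 * -1 if either file cannot be opened. */
int files_identical(const char *path_a, const char *path_b)
{
    FILE *a = fopen(path_a, "rb");
    FILE *b = fopen(path_b, "rb");
    int result = 1;
    int ca, cb;

    if (a == NULL || b == NULL) {
        if (a != NULL)
            fclose(a);
        if (b != NULL)
            fclose(b);
        return -1;
    }

    /* read one byte from each file until they disagree or both end */
    do {
        ca = fgetc(a);
        cb = fgetc(b);
        if (ca != cb)
            result = 0;
    } while (result == 1 && ca != EOF);

    fclose(a);
    fclose(b);
    return result;
}
```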
+ | |||
+ | ====MD5sum compare==== | ||
+ | If you'd like to be REALLY sure, generate MD5sum hashes and compare: | ||
+ | |||
+ | < | ||
+ | lab46: | ||
+ | 10f9bc85023dcf37be2b04638cb45ee2 | ||
+ | 10f9bc85023dcf37be2b04638cb45ee2 | ||
+ | lab46: | ||
+ | </ | ||
+ | |||
+ | As you can see, both hashes match (the MD5sum hashes are analyzing | ||
+ | |||
+ | ====Hex Dump/ | ||
+ | You may want to check and see what exactly your program is generating. | ||
+ | |||
+ | This can be done by performing a hex data dump (or visualization) of the raw data in the output file. | ||
+ | |||
+ | The tool I'd recommend for quick viewing is **xxd(1)**; please see the following example: | ||
+ | |||
+ | < | ||
+ | lab46: | ||
+ | 0000000: 6463 6658 2052 4c45 0002 030b 7361 6d70 dcfX RLE....samp | ||
+ | 0000010: 6c65 332e 7478 7401 6162 6201 6363 6301 le3.txt.abb.ccc. | ||
+ | 0000020: 6464 6401 6465 6501 6565 6502 6666 6602 ddd.dee.eee.fff. | ||
+ | 0000030: 6767 6701 6768 6802 6868 6803 6969 6902 ggg.ghh.hhh.iii. | ||
+ | 0000040: 6a6a 6a01 6a6a 6b02 6b6b 6b02 6c6c 6c01 jjj.jjk.kkk.lll. | ||
+ | 0000050: 6d6d 6d01 6d6d 6e01 6e6e 6e01 6f6f 6f01 mmm.mmn.nnn.ooo. | ||
+ | 0000060: 7070 7101 0a | ||
+ | lab46: | ||
+ | </ | ||
+ | |||
+ | With this output, we can confirm, byte-by-byte, | ||
+ | |||
+ | * leftmost: byte offset (from start of file) | ||
+ | * middle: hex data (in pairs; big endian by default, so it reads in the order you would expect) | ||
+ | * rightmost: the ASCII-ized representation of the middle data | ||
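The first line of the dump above can also be picked apart in code. The following is a sketch under assumptions: the version (byte 9) and stride (byte 10) positions come from the header notes on this page, while the "dcfX RLE" signature in bytes 0-7 and the name length in byte 11 are read off the dump itself rather than quoted from the header table. The struct and function names are hypothetical.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical container for the header fields of interest. */
struct dcf_header {
    unsigned char version;      /* byte 9 */
    unsigned char stride;       /* byte 10 */
    unsigned char name_length;  /* byte 11 (position inferred from the dump) */
};

/* Fill hdr from the first 12 bytes of an encoded file held in buf.
 * Returns 0 on success, -1 if the buffer is too short, the leading
 * signature is not "dcfX RLE", or the embedded name would not fit. */
int dcf_parse_header(const unsigned char *buf, size_t len,
                     struct dcf_header *hdr)
{
    if (len < 12 || memcmp(buf, "dcfX RLE", 8) != 0)
        return -1;

    hdr->version     = buf[9];
    hdr->stride      = buf[10];
    hdr->name_length = buf[11];

    /* the embedded file name must fit in what follows the header */
    if ((size_t) hdr->name_length > len - 12)
        return -1;

    return 0;
}
```

Fed the bytes from the dump, this yields version 2, stride 3, and a name length of 11, with the 11 bytes starting at offset 12 spelling out sample3.txt.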
+ | |||
+ | =====Verify Results===== | ||
+ | If you'd like to verify your implementations, | ||
+ | |||
+ | To run it, you need functioning **encode** and **decode** programs (although it does its best otherwise). | ||
+ | |||
+ | It runs through four separate tests, storing the results in a corresponding **o#/** directory (sometimes, if applicable, intermediate results in a corresponding **m#/** directory): | ||
+ | |||
+ | * test 0: takes the raw data files in **in/** and encodes them (**o0/**) | ||
+ | * test 1: takes pre-encoded data files in **in/** and decodes them (**o1/**) | ||
+ | * test 2: takes the raw data files in **in/**, encodes them (**m2/**), then decodes them (**o2/**) | ||
+ | * test 3: takes pre-encoded data files in **in/**, decodes them (**m3/**), then encodes them (**o3/**) | ||
+ | |||
+ | How it works: | ||
+ | |||
+ | - depending | ||
+ | * if single step, result is in **o#/** directory | ||
+ | * if multi-step, result is in **m#/** directory, then second operation puts its result into **o#/** | ||
+ | - A checksum is taken of the original file in **in/** | ||
+ | - Another checksum is taken of the new file in **o#/** | ||
+ | - The checksums are compared. If they match, " | ||
+ | |||
+ | ====Successful operation==== | ||
+ | If all goes according to plan, you'll see " | ||
+ | |||
+ | < | ||
+ | lab46: | ||
+ | ================================================= | ||
+ | = PHASE 0: Raw -> Encode data verification test = | ||
+ | ================================================= | ||
+ | in/ | ||
+ | in/ | ||
+ | in/ | ||
+ | in/ | ||
+ | in/ | ||
+ | in/ | ||
+ | in/ | ||
+ | in/ | ||
+ | in/ | ||
+ | in/ | ||
+ | in/ | ||
+ | in/ | ||
+ | in/ | ||
+ | in/ | ||
+ | in/ | ||
+ | in/ | ||
+ | ... | ||
+ | </ | ||
+ | |||
+ | ====Unsuccessful operation==== | ||
+ | Should something not work correctly, you'll see a " | ||
+ | |||
+ | < | ||
+ | lab46: | ||
+ | ================================================= | ||
+ | = PHASE 0: Raw -> Encode data verification test = | ||
+ | ================================================= | ||
+ | in/ | ||
+ | in/ | ||
+ | in/ | ||
+ | in/ | ||
+ | in/ | ||
+ | in/ | ||
+ | in/ | ||
+ | in/ | ||
+ | in/ | ||
+ | in/ | ||
+ | in/ | ||
+ | in/ | ||
+ | in/ | ||
+ | in/ | ||
+ | in/ | ||
+ | in/ | ||
+ | ... | ||
+ | </ | ||
+ | |||
+ | ====Incomplete operation==== | ||
+ | Should something not work at all (like a decode binary that is missing or fails to compile), you'll see a " | ||
+ | |||
+ | < | ||
+ | lab46: | ||
+ | ... | ||
+ | ================================================= | ||
+ | = PHASE 1: Decode -> Raw data verification test = | ||
+ | ================================================= | ||
+ | Missing ' | ||
+ | ... | ||
+ | </ | ||
- | The **diff(1)** tool will also likely work well enough for our endeavors here. | ||
=====Submission===== | =====Submission===== | ||
- | ====Project Submission==== | + | To successfully complete this project, the following criteria must be met: |
+ | |||
+ | * Code must compile cleanly (no warnings or errors) | ||
+ | * Output must be correct, and match the form given in the sample output above. | ||
+ | * Implementations must comply with the dcfX v2 spec, and pass all tests in the check tool. | ||
+ | * Code must be nicely and consistently indented (you may use the **indent** tool) | ||
+ | * Code must implement the algorithm(s) presented above. | ||
+ | * **encode.c** | ||
+ | * **decode.c** | ||
+ | * indicated error conditions are identified and reported, along with expected program behavior | ||
+ | * Code must be commented | ||
+ | * comments must be meaningful and descriptive of the process (tell me how/why you're doing what you're doing) | ||
+ | * have a properly filled-out comment banner at the top | ||
+ | * be sure to include any compiling instructions, | ||
+ | * Track/ | ||
+ | * Submit a copy of your source code to me using the **submit** tool. | ||
To submit this program to me using the **submit** tool, run the following command at your lab46 prompt: | To submit this program to me using the **submit** tool, run the following command at your lab46 prompt: | ||
Line 286: | Line 454: | ||
You should get some sort of confirmation indicating successful submission if all went according to plan. If not, check for typos and/or locational mismatches. | You should get some sort of confirmation indicating successful submission if all went according to plan. If not, check for typos and/or locational mismatches. | ||
- | |||
- | ====Submission Criteria==== | ||
- | To be successful in this project, the following criteria must be met: | ||
- | |||
- | * Project must be submitted on time, by the posted deadline. | ||
- | * Early submissions will earn 1 bonus point per full day in advance of the deadline. | ||
- | * Bonus eligibility requires an honest attempt at performing the project (no blank efforts accepted) | ||
- | * Late submissions will lose 25% credit per day, with the submission window closing on the 4th day following the deadline. | ||
- | * To clarify: if a project is due on Wednesday (before its end), it would then be 25% off on Thursday, 50% off on Friday, 75% off on Saturday, and worth 0% once it becomes Sunday. | ||
- | * Certain projects may not have a late grace period, and the due date is the absolute end of things. | ||
- | * all requested functionality must conform to stated requirements (either on this project page or in comment banner in source code files themselves). | ||
- | * code resulting in two binaries must be submitted: | ||
- | * source code that when compiled produces the **encode** program | ||
- | * if you're only using one file for the encode, that source file should be called **encode.c** | ||
- | * source code that when compiled produces the **decode** program | ||
- | * if you're only using one file for the decode, that source file should be called **decode.c** | ||
- | * Output generated must conform to any provided requirements and specifications (be it in syntax or sample output) | ||
- | * output obviously must also be correct based on input. | ||
- | * Processing must be correct based on input given and output requested | ||
- | * Specification details are NOT to be altered. This project will be evaluated according to the specifications laid out in this document. | ||
- | * Code must compile cleanly. | ||
- | * Each source file must compile cleanly (worth 3 total points): | ||
- | * 3/3: no compiler warnings, notes or errors. | ||
- | * 2/3: one of warning or note present during compile | ||
- | * 1/3: two of warning or note present during compile | ||
- | * 0/3: compiler errors present (code doesn' | ||
- | * Code must be nicely and consistently indented (you may use the **indent** tool) | ||
- | * You are free to use your own coding style, but you must be **consistent** | ||
- | * Avoid unnecessary blank lines (some are good for readability, | ||
- | * Indentation will be rated on the following scale (worth 3 total points): | ||
- | * 3/3: Aesthetically pleasing, pristine indentation, | ||
- | * 2/3: Mostly consistent indentation, | ||
- | * 1/3: Some indentation issues, difficult to read | ||
- | * 0/3: Lack of consistent indentation (didn' | ||
- | * Code must be commented | ||
- | * Commenting will be rated on the following scale (worth 4 total points): | ||
- | * 4/4: Not only aesthetically pleasing, but also adequately explains the WHY behind what you are doing | ||
- | * 3/4: Aesthetically pleasing (comments aligned or generally not distracting), | ||
- | * 2/4: Mostly consistent, some distractions or gaps in comments (not explaining important things) | ||
- | * 1/4: Light commenting effort, not much time or energy appears to have been put in. | ||
- | * 0/4: No original comments | ||
- | * should I deserve nice things, my terminal is usually 90 characters wide. So if you'd like to format your code not to exceed 90 character wide terminals (and avoid line wrapping comments), at least as reasonably as possible, those are two sure-fire ways of making a good impression on me with respect to code presentation and comments. | ||
- | * Sufficient comments explaining the point of provided logic **MUST** be present | ||
- | * Code must be appropriately modified | ||
- | * Appropriate modifications will be rated on the following scale (worth 3 total points): | ||
- | * 3/3: Complete attention to detail, original-looking implementation | ||
- | * 2/3: Lacking some details (like variable initializations), | ||
- | * 1/3: Incomplete implementation (typically lacking some obvious details/ | ||
- | * 0/3: Incomplete implementation to the point of non-functionality (or was not started at all) | ||
- | * Implementation must be accurate with respect to the spirit/ | ||
- | * 3/3: Implementation is in line with spirit of project | ||
- | * 2/3: Some avoidance/ | ||
- | * 1/3: Generally avoiding the spirit of the project (new, different things, resorting to old and familiar, despite it being against the directions) | ||
- | * 0/3: entirely avoiding. | ||
- | * Error checking must be adequately and appropriately performed, according to the following scale (worth 3 total points): | ||
- | * 3/3: Full and proper error checking performed for all reasonable cases, including queries for external resources and data. | ||
- | * 2/3: Enough error checking performed to pass basic project requirements and work for most operational cases. | ||
- | * 1/3: Minimal error checking, code is fragile (code may not work in full accordance with project requirements) | ||
- | * 0/3: No error checking (code likely does not work in accordance with project requirements) | ||
- | * Track/ | ||
- | * Submit a copy of your source code to me using the **submit** tool (**make submit** will do this) by the deadline. | ||