Corning Community College

CSCS1730 UNIX/Linux Fundamentals

Project: BINARY DATA PROCESSING (bdp0)

Errata

Typos and bug fixes:

  • <description> (DATESTAMP)

Objective

Use your UNIX skills and the tools at hand to solve a problem in the realm of raw data management and data recovery.

Background

As a side job to help you through school, you've become employed at a local microblogging and meme archival firm as their head UNIX IT lead. Your run-of-the-mill tasks include setting up single-purpose web pages and web-browsable images to aid the researchers in tracking the evolution of memes.

Everything was going fine until one day a researcher with a freshly obtained meme (from a multi-seeded BitTorrent transfer) experienced a hard drive failure.

Preservation of this meme is downright critical to on-going research, and with seconds to spare before the system locks up, you manage to do a memory dump of the region of RAM containing the downloaded meme data, and transfer it to another system before it becomes unresponsive and unavailable.

The last thing you see on the screen before the system locks up is a hex address of the table of contents and its octal length, which was stored in a memdump.status file.

For example, you may see something like:

  • address (in hex): 0x1ced3
  • length (in octal): 130
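
Both values need to be in decimal before most tools can use them. A minimal sketch of the conversion, assuming a POSIX-style shell whose arithmetic expansion treats a 0x prefix as hexadecimal and a leading 0 as octal (the variable names are made up for illustration):

```shell
# Values copied from the example above (your own memdump.status will differ).
toc_addr_hex=0x1ced3
toc_len_oct=130

# 0x marks a hexadecimal constant; a leading 0 marks an octal constant.
toc_addr=$(($toc_addr_hex))
toc_len=$((0$toc_len_oct))

echo "address: $toc_addr (decimal)"
echo "length:  $toc_len (decimal)"
```

For the example values this reports 118483 and 88. bc(1) offers the same conversions through its ibase variable.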

With the hard drive replaced and the OS reinstalling on the researcher's computer, your next task is of equal importance: pick out the file fragments from the raw memory dump and assemble them all into one file, meeting the specifications laid out by the researchers and chief meme archivist.

The air is thick with anticipation.

This is the moment you've been working towards your whole life.

You pause and do a quick tai chi exercise to calm the mind and gather some inner energy. Eyes closed. Deep breath in. Deep breath out. Your eyes snap open and shine with a fierceness and determination that would make any obfuscated data quiver.

It is go time.

Obtain the file

This week's project is located in the bdp0/ subdirectory of the UNIX Public Directory, in files called: memdump.ram and memdump.status, both located in a directory by the name of your lab46 username.

There is a companion file called dectohex.c, which may be of some value, directly or indirectly.

Make a copy of these into your home directory somewhere and set to work.

NOTE: Hopefully it has been standard practice to locate project files in their own unique subdirectory, such as under src/unix/, where you can then add/commit/push the results to your repository (you ARE regularly putting stuff in your repository, aren't you?)

Process

The file you seek has been broken up into separate parts, each potentially encoded or encapsulated in some way.

To make matters more interesting, the file fragments are located in a raw memory dump, which you'll have to perform some minor data recovery techniques on to get them out and further massage them.

There is a table of contents index located within this memory dump… it is of the following format:

-toc-filename:offset,length;filename2:offset,length;...;-toc-

To make things more interesting, the offset is stored as a hexadecimal value.

The length is recorded in octal. It represents the total number of bytes contained in the entry (including the start).

Be mindful of the base.

Luckily, you know where to get the table of contents from memory. From there, you can reconstruct the means to access the remaining file fragments.
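
To make the format concrete, here is a rough sketch that builds a tiny fake dump, pulls the table of contents out of it with dd, and converts each entry to decimal. Everything in it is hypothetical: the filenames, offsets, and lengths are invented, and it assumes the hex offsets in the table carry no 0x prefix, which may not match the real dump.

```shell
# Build a small stand-in for memdump.ram: filler, a table of contents, more filler.
dump=$(mktemp)
toc='-toc-part1:40,12;part2:80,20;-toc-'
dd if=/dev/zero of="$dump" bs=1 count=100 2>/dev/null
printf '%s' "$toc" >> "$dump"
dd if=/dev/zero bs=1 count=50 2>/dev/null >> "$dump"

# Pretend the status file reported address 0x64 (hex) and length 42 (octal).
toc_text=$(dd if="$dump" bs=1 skip=$((0x64)) count=$((042)) 2>/dev/null)
echo "$toc_text"

# Strip the -toc- markers, split on ';', then split each entry on ':' and ','.
entries=${toc_text#-toc-}
entries=${entries%-toc-}
parsed=$(printf '%s\n' "$entries" | tr ';' '\n' |
    while IFS=':,' read -r name off len; do
        [ -n "$name" ] || continue          # skip the empty trailing piece
        echo "$name offset=$((0x$off)) length=$((0$len))"
    done)
echo "$parsed"
rm -f "$dump"
```

The decimal offset and length of each entry are exactly the kind of numbers dd's skip= and count= options expect when the block size is 1.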

Useful tools

You may want to become familiar with the manual pages of the following tools (in addition to tools you've already encountered):

  • dd(1)
  • bc(1)
  • netpbm(1)
  • pnmscale(1)

Additionally, looking through any companion files provided in this project may offer you some unique value.

quick dd primer

Those with little patience and low observation skills are often quick to label dd a difficult or weird tool. While it is true that dd is no ls, it is a powerful tool, quite useful in its particular domain.

dd is referred to as a data dump or data duplicator… namely, its task is to copy information, and to do so very well.

In some respects, it is like cp, only vastly more capable, because it gives you control over individual parts of a file. cp works in units of files, so it just copies the whole thing; dd, on the other hand, sees a file as a sequence of bytes.

In other respects, it is like cat (then again, one can also perform file copies with cat)… in that it reads input and produces output.

Unlike cp and cat, dd specializes in byte-level operations. Both cat and cp are limited to operating on the entire file as a basic operation. dd goes a bit deeper.

Example 0: dd as cp

If we had a file, /etc/motd, and we wanted to make a copy of it in our current working directory under the name thing, we could do this:

lab46:~/src/unix/bdp0$ dd if=/etc/motd of=thing
1+1 records in
1+1 records out
859 bytes (859 B) copied, 0.000149696 s, 5.7 MB/s
lab46:~/src/unix/bdp0$ 

As previously stated, dd specializes in byte level operations. So it is a far more articulate file copy.

We see two options being used with dd:

  • if= this specifies what the input file will be
  • of= this specifies what the output file will be

And with this, dd will read from /etc/motd and output to thing (in current directory).

As it is, dd was able to operate on the file as one chunk (similar to how cp or cat would work), but we can go deeper.

Example 1: Fine-Grained cp with dd

For example, to specifically copy the file byte-by-byte:

lab46:~/src/unix/bdp0$ dd if=/etc/motd of=thing bs=1
859+0 records in
859+0 records out
859 bytes (859 B) copied, 0.00400408 s, 215 kB/s
lab46:~/src/unix/bdp0$ 

Notice the number of records has changed to match the file size (there are 859 bytes in the file, so a byte-by-byte copy results in 859 records). Also note the transfer speed went down… 859 one-byte transfers are a lot more expensive than 1 larger transfer.

That new option, bs, allows for the setting of the block size. In this case, we're setting the block size of that dd transaction to 1 byte from its default.
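
The "records in/out" figures read as full blocks + partial blocks. A small sketch of that arithmetic, assuming dd's usual default block size of 512 bytes (the variable names are only for illustration):

```shell
size=859   # file size in bytes, from the examples above
bs=512     # dd's default block size when bs= is not given

full=$((size / bs))        # complete blocks
leftover=$((size % bs))    # bytes in a final, partial block
partial=0
[ "$leftover" -gt 0 ] && partial=1

echo "bs=$bs: ${full}+${partial} records ($leftover bytes in the partial block)"
```

With bs=1 the same arithmetic gives 859 full blocks and no partial block, matching the 859+0 shown above.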

Example 2: dd as cat

To simulate cat using dd, we merely instruct it where to send its data:

lab46:~/src/unix/bdp0$ dd if=/etc/motd of=/dev/tty
##############################################################################
##  __         _     _ _   __
## |  |   __ _| |__ / | |_/ /   LAIR Public Shell Machine
## |  |__/ _` | '_ \\_  _/ _ \
## |_____\__,_|_.__/  |_|\___/  Lab46 is the CCC Computer & Information
## ---------------------------  Science public shell box for student course-
## c o r n i n g - c c . e d u  work, projects, and skills exploration.
##
## PLEASE USE THE SYSTEM, LAIR, AND RELATED RESOURCES RESPONSIBLY!
##
## LAB46 RESOURCES:
##    website:         http://lab46.corning-cc.edu/
##    help form:       http://lab46.corning-cc.edu/help_request
##    help contact:    haas@corning-cc.edu or wedge@lab46.corning-cc.edu
##
## USAGE INFORMATION:
##    basic usage:     type 'usage' at the prompt
##    check mail:      type 'alpine' at the prompt; broken? type 'fixmail'
##
1+1 records in
1+1 records out
859 bytes (859 B) copied, 0.000158424 s, 5.4 MB/s
lab46:~/src/unix/bdp0$ 

Due to “everything being a file”, displaying on the screen is merely a matter of specifying a file that corresponds to your terminal (here, /dev/tty).

And if you wanted to redirect it to a file? Leave off the of= option and dd writes to STDOUT instead, so your I/O redirectors will work as expected.
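
When of= is omitted, dd writes its data to STDOUT and its record report to STDERR. A brief sketch of what that allows, using throwaway temporary files rather than /etc/motd (the variable names are arbitrary):

```shell
src=$(mktemp)                      # throwaway input file for the demo
out=$(mktemp)
printf 'hello from dd\n' > "$src"

# No of= given, so the data goes to STDOUT and can be redirected...
dd if="$src" 2>/dev/null > "$out"

# ...or piped into another tool.
shouted=$(dd if="$src" 2>/dev/null | tr 'a-z' 'A-Z')
echo "$shouted"

copied=$(cat "$out")
rm -f "$src" "$out"
```

The 2>/dev/null only hides the "records in/out" report; drop it if you want to see those lines.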

There are many additional options in dd, so it is highly recommended you read through the manual page and experiment.

Example 3: Grabbing the Last 200 Bytes

Let's say we wanted to only grab the last 200 bytes of that 859 byte file.

cp and cat would have some difficulty easily doing this on their own… perhaps with other tools this could be facilitated, but it falls within the capabilities of what dd can do (it does what it does extremely well).

It turns out there is a skip option to dd that lets it skip ahead some number of blocks in the input before it starts processing. We want to use that to grab the last 200 bytes in the file.

First, we need to figure out how much to skip… knowing the file is 859 bytes, and desiring only the last 200 bytes, we use a little math:

 859
-200
 ===
 659

So, let's give it a shot:

lab46:~/src/unix/bdp0$ dd if=/etc/motd of=/dev/tty skip=659
dd: ‘/etc/motd’: cannot skip to specified offset
0+0 records in
0+0 records out
0 bytes (0 B) copied, 0.000252174 s, 0.0 kB/s
lab46:~/src/unix/bdp0$ 

Wut rohs! Seems there is some problem.

This is what tends to throw beginners for a loop with dd: they forget that dd does not assume a block size of 1 byte, but something larger. Its default block size is 512 bytes (which is why our first example reported 1+1 records for an 859 byte file: one full block plus one partial block). We just told dd to skip 659 blocks into the file, or 659 × 512 = 337,408 bytes, which is far past the end of an 859 byte file, so this is an unreasonable/nonsensical request.

So let us fix it, by specifying a block size of 1:

lab46:~/src/unix/bdp0$ dd if=/etc/motd of=/dev/tty bs=1 skip=659
s@corning-cc.edu or wedge@lab46.corning-cc.edu
##
## USAGE INFORMATION:
##    basic usage:     type 'usage' at the prompt
##    check mail:      type 'alpine' at the prompt; broken? type 'fixmail'
##
200+0 records in
200+0 records out
200 bytes (200 B) copied, 0.00165582 s, 121 kB/s
lab46:~/src/unix/bdp0$ 
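
The same idea generalizes: compute the skip from the file size, and add count= when you want a slice out of the middle rather than everything up to the end. A small sketch on a made-up 26-byte file (not /etc/motd), with illustrative variable names:

```shell
sample=$(mktemp)
printf 'ABCDEFGHIJKLMNOPQRSTUVWXYZ' > "$sample"   # 26-byte stand-in file

# Last 5 bytes: skip everything except the final 5.
size=$(wc -c < "$sample")
want=5
skip=$((size - want))
last=$(dd if="$sample" bs=1 skip="$skip" 2>/dev/null)
echo "$last"

# A slice from the middle: skip 10 bytes, then copy only 4.
middle=$(dd if="$sample" bs=1 skip=10 count=4 2>/dev/null)
echo "$middle"
rm -f "$sample"
```

With bs=1, skip= and count= are both measured in bytes, which keeps the arithmetic simple.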

Submission

Successful completion will result in the following criteria being met:

  • Resulting image has been scaled approximately 2x to a resolution of 414×418
  • Image has been converted to PNG format and named meme0531.png
  • Image has been placed in your Lab46 webspace, in a unix/bdp0/ directory which is searchable to the web server (world search); image is world readable.
    • No superfluous permissions should be present for group/other. User obviously needs adequate permissions for you to manipulate it.
    • All parent directories need to also be world searchable in order to function
      • Setting all permissions could result in your home directory being accessed by third parties. ONLY set the minimum required permissions.
    • Aside from user permission, group should have no permissions set.
    • ONLY the indicated permission for world should be set for impacted files.
    • Be sure you can view said image in a web browser.
  • When all is said and done, you will submit 2 files and 1 URL:
    • bdp0steps, which contains:
      • line 1: the full URL to view your file in a web browser
      • line 2: the phrase encountered when viewing this image
      • lines 3-: the command lines you used to undertake this project (you can exclude initial copying and end submission commands). Be sure to mention offsets/lengths/sizes of things.
    • meme0531.png, which should conform to the resolution and format specifications above, and be correctly reassembled.
    • The working URL to the meme0531.png file hosted in your lab46 webspace.
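
One possible reading of the permission requirements above, sketched against a temporary directory instead of a real webspace. The directory layout and the octal modes are assumptions to check against your own setup, not a prescribed answer:

```shell
web=$(mktemp -d)                       # stand-in for your webspace directory
mkdir -p "$web/unix/bdp0"
: > "$web/unix/bdp0/meme0531.png"      # empty placeholder, not a real image

# Directories: full access for user, search (x) only for world, nothing for group.
chmod 701 "$web" "$web/unix" "$web/unix/bdp0"
# Image: read/write for user, read only for world, nothing for group.
chmod 604 "$web/unix/bdp0/meme0531.png"

dir_mode=$(ls -ld "$web/unix/bdp0" | cut -c1-10)
file_mode=$(ls -l "$web/unix/bdp0/meme0531.png" | cut -c1-10)
echo "$dir_mode  unix/bdp0/"
echo "$file_mode  meme0531.png"
rm -rf "$web"
```

Checking the result with ls -l before loading the URL in a browser makes it easier to tell a permissions problem from a wrong path.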

Submit

Please submit as follows:

lab46:~/src/unix/bdp0$ submit unix bdp0 bdp0steps meme0531.png http://lab46.corning-cc.edu/~USERNAME/unix/bdp0/meme0531.png
Submitting unix project "bdp0":
    -> bdp0steps(OK)
    -> meme0531.png(OK) 
    -> http://lab46.corning-cc.edu/~USERNAME/unix/bdp0/meme0531.png(OK)

SUCCESSFULLY SUBMITTED
lab46:~/src/unix/bdp0$ 
haas/fall2020/unix/projects/bdp0.txt · Last modified: 2018/03/05 19:47 by 127.0.0.1