
This is an old revision of the document!


Corning Community College

CSCS1730 UNIX/Linux Fundamentals

Project: UNIX DATA RECOVERY (udr2)

Errata

Typos and bug fixes:

  • no fixes of note

Objective

Continuing our “1337 haxxing” series of projects, we've found that considerable self-imposed conceptual roadblocks keep us from applying otherwise simple computing principles (that data is a series of bytes and, ultimately, that everything is a file).

We resume our exploration with another practical example, this time based on real data generated by an EEG device. Working at the intersection of hardware, software, and logic plays a vital role in problem solving (even if it just enables analysts to make more educated guesses), yet it seems to be a skill that is increasingly taken for granted, and increasingly alien.

Background

An electroencephalogram (EEG) is a test that detects electrical activity in your brain. Brain cells communicate via electrical impulses and are active all the time, even when you're asleep. This activity can be visualized as wavy lines on an EEG recording, but ultimately is sourced from raw bytes sampled from the device performing the data acquisition.

Sleep is a common area of study where this is particularly applicable, and is even somewhat of a modern-day fad: everything from smartphone apps to special wristbands can be used to monitor aspects of our sleep quality, and more products are coming to market all the time.

We will be analyzing data generated by a consumer-grade EEG headset: basically a device one wears when going to sleep that, via conductive pads in contact with the skin on the forehead, monitors the wearer's brainwaves and can determine their level of activity (especially whether the wearer is asleep, and at what level of sleep).

The data was obtained from a live session (me, sleeping) during my initial polyphasic sleep adaptation a few years ago, so there'll be opportunities to see “normal, boring” sleep patterns, transitions, and even more optimized sleep sessions (including rather restful 20-minute power naps).

The device used generated bytes of raw data, which I captured into individual data files. We will be learning how that data is structured so that we may parse it, and ultimately derive information such as sleep duration, type of sleep, etc.

Like udr0 and udr1… we're just manipulating (reading/writing) bytes of data, and applying specific rules and methods to how we interpret various bytes, or sequences of bytes.

Once again there is a conceptual as well as practical angle… some people will struggle more with one over the other, and as always: questions are not just encouraged, they are expected for success!

EEG data packet format

EEG data is represented in the form of data packets: collections of bytes that can be decoded to convey a particular meaning (state of sleep, timestamp, signal strength, etc.). The format of the data packet is as follows:

Field        Length (bytes)  Description
'A' (0x41)   1               character starting the packet
4 (0x04)     1               the protocol “version”, of which only 4 is currently supported
checksum     1               a one byte checksum formed by summing the identifier byte and all the data bytes
msglen       2               a two byte message length (little endian). This length includes the size of the data payload plus the identifier
inv_msglen   2               the inverse of msglen, sent for redundancy. If msglen does not match ~inv_msglen, we can start looking for the next packet immediately, instead of reading some arbitrary number of bytes, based on a bad length
time_sec     1               the lower 8 bits of the current unix time (when session was recorded)
sub_sec      2               the 16-bit sub-second (runs through 0xFFFF in 1 second), LSB first
seqnum       1               the 8-bit sequence number
datatype     1               the datatype (see table below)
datablock    variable        the array of binary data
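
As a rough illustration of how the table maps onto bytes, the sketch below decodes the fixed 12-byte header of a single packet using od(1) and shell arithmetic. The function name parse_header and the sample packet are made up for this example. It also assumes inv_msglen is stored little endian like msglen (the table does not say), and it makes no attempt to verify the checksum.

```bash
#!/bin/bash
# Sketch only: decode the 12 header bytes that precede the datablock.
# parse_header is a hypothetical helper, not something supplied with the project.

parse_header() {
    file=$1          # file holding raw packet bytes
    offset=${2:-0}   # byte offset where the packet is believed to start

    # Dump 12 bytes as unsigned decimal values and load them into $1..$12.
    set -- $(od -An -v -tu1 -j "$offset" -N 12 "$file")
    [ "$#" -eq 12 ] || { echo "short read at offset $offset" >&2; return 1; }

    # Byte 0 must be 'A' (0x41 = 65) and byte 1 must be protocol version 4.
    [ "$1" -eq 65 ] || { echo "no 'A' at offset $offset" >&2; return 1; }
    [ "$2" -eq 4 ]  || { echo "unsupported version $2" >&2; return 1; }

    checksum=$3
    msglen=$(( $4 | ($5 << 8) ))        # little endian: low byte first
    inv_msglen=$(( $6 | ($7 << 8) ))    # ASSUMPTION: same byte order as msglen
    time_sec=$8
    sub_sec=$(( $9 | (${10} << 8) ))    # LSB first
    seqnum=${11}
    datatype=${12}

    # msglen must equal the bitwise inverse of inv_msglen (kept to 16 bits).
    [ "$msglen" -eq $(( ~inv_msglen & 0xFFFF )) ] ||
        { echo "msglen/inv_msglen mismatch at offset $offset" >&2; return 1; }

    echo "checksum=$checksum msglen=$msglen time_sec=$time_sec sub_sec=$sub_sec seqnum=$seqnum datatype=$datatype"
}

# Made-up sample packet: header plus a two-byte datablock (octal escapes).
sample=$(mktemp)
printf '\101\004\146\003\000\374\377\020\000\200\007\001\252\273' > "$sample"
parse_header "$sample" 0
rm -f "$sample"
```

When the length check fails, the advice in the table applies: resume scanning for the next 'A' byte rather than trusting msglen.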

Obtain the file

This week's project is located in the spring2015/udr1/ directory of the UNIX Public Directory, in a file called: data.file

Make a copy of this into your home directory somewhere and set to work.

NOTE: Hopefully it has been standard practice to locate project files in their own unique subdirectory, such as under src/unix/, where you can then add/commit/push the results to your repository (you ARE regularly putting stuff in your repository, aren't you?)

Process

The data you seek (2 files) is obfuscated and contained within this file.

Plain text directions give clues on how to find both pieces of information, and it is up to you to use your skills to extract the necessary data.

Some additional information:

  • The first file should be named udr1.text and be properly oriented.
  • The second (big) file runs from the starting point until the very end of the file
  • It should be named 'gizmo', and reside in your current working directory.
  • gizmo is binary data and is entirely reversed; you need to get its bytes back in order (the last byte should be first, the 2nd to last should be 2nd, etc.)
    • You are to write a shell script to perform the de-reversal of the data, reading from data.file and through whatever processing is needed, produce the file called gizmo.
  • The urev tool has some additional constraints with respect to gizmo… running it should notify you of any details you are lacking.
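
To make the idea of "reversed bytes" concrete, here is a minimal sketch of one possible way to flip the byte order of a small file, shown on a three-byte sample. It is not the expected solution: the helper name reverse_bytes and the sample file are invented for illustration, it relies on tac(1) from GNU coreutils, and it says nothing about where gizmo starts inside data.file.

```bash
#!/bin/bash
# Sketch only: reverse the byte order of a small file.
# reverse_bytes is a hypothetical helper name.

reverse_bytes() {
    in=$1
    out=$2
    # 1. od dumps every byte as a three-digit octal number
    # 2. tr puts one number on each line; grep drops the empty lines
    # 3. tac reverses the order of the lines
    # 4. printf turns each octal number back into a raw byte
    od -An -v -to1 "$in" | tr -s ' ' '\n' | grep . | tac |
        while read -r oct; do
            printf "\\$oct"
        done > "$out"
}

# Three-byte sample: 'A', a NUL byte, 'B' (hex 41 00 42).
work=$(mktemp -d)
printf 'A\000B' > "$work/forward"
reverse_bytes "$work/forward" "$work/backward"
od -An -tx1 "$work/backward"    # hex dump of the reversed sample
rm -r "$work"
```

Because this runs one printf per byte, it is slow on a large file; treat it as a way to check your understanding on small inputs before deciding how your own script should work.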

Useful tools

You may want to become familiar with the manual pages of the following tools (in addition to tools you've already encountered):

  • dd(1)
  • bc(1)
  • du(1)
  • bash(1) shell scripting
  • od(1)
  • bvi(1)
  • hexedit(1)

… along with other tools previously encountered.
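
As a hedged example of how a few of these tools can work together, the sketch below converts a hexadecimal offset to decimal with bc(1), carves a byte range out of a file with dd(1), and inspects the result with od(1). The helper names hex_to_dec and carve, the sample file, and the offsets are all invented for this illustration; none of them come from data.file.

```bash
#!/bin/bash
# Sketch only: carve a byte range out of a file at a known offset.
# hex_to_dec and carve are hypothetical helper names.

hex_to_dec() {
    # Hex editors usually show offsets in hex, while dd wants decimal.
    if command -v bc >/dev/null 2>&1; then
        # bc expects upper-case hex digits when ibase=16.
        echo "ibase=16; $(echo "$1" | tr 'a-f' 'A-F')" | bc
    else
        echo $(( 0x$1 ))    # fallback: shell arithmetic
    fi
}

carve() {
    # $1 = input file, $2 = output file,
    # $3 = decimal start offset, $4 = decimal number of bytes
    # bs=1 makes skip= and count= count single bytes.
    dd if="$1" of="$2" bs=1 skip="$3" count="$4" 2>/dev/null
}

# Made-up sample: 16 filler bytes, then the 7 bytes we want, then a tail.
work=$(mktemp -d)
printf '0123456789abcdefpayload-tail' > "$work/sample.bin"

start=$(hex_to_dec 10)                  # hex 10 is decimal 16
carve "$work/sample.bin" "$work/piece.bin" "$start" 7

od -An -c "$work/piece.bin"             # show the carved bytes
wc -c < "$work/piece.bin"               # confirm the size in bytes
rm -r "$work"
```

Using bs=1 keeps the arithmetic simple at the cost of speed; a larger block size is faster, but then skip and count are measured in blocks rather than bytes.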

Submission

Successful completion will result in the following criteria being met:

  • The resulting file, with proper settings, should enable you to run the urev tool.
  • You have completed all weekly exercises (96, I think) before the deadline, being mindful of the intentionally-paced nature of urev.
    • Bonus opportunity: while still performing a minimum of 3 distinct urev sessions, how could you get around the urev-imposed time limit? (Without copying/changing urev).
  • When all is said and done, you will submit 3 files:
    • udr1.text
      • Append the dd line(s) as well as any other command lines needed to extract and properly re-orient the file. Also be sure to indicate what is in the file you found (content, not just type of data).
    • your bash script enabling the processing of data.file to produce gizmo
      • Be sure to include comments indicating the reasoning behind actions taken
    • Your extracted/processed gizmo file

Submit

Please submit as follows:

lab46:~/src/unix/udr1$ submit unix udr1 udr1.text getgizmo.bash gizmo
Submitting unix project "udr1":
    -> udr1.text(OK) 
    -> getgizmo.bash(OK)
    -> gizmo(OK) 

SUCCESSFULLY SUBMITTED
lab46:~/src/unix/udr1$ 
haas/spring2015/unix/projects/udr2.1426507574.txt.gz · Last modified: 2015/03/16 12:06 by wedge