Corning Community College
CSCS1730 UNIX/Linux Fundamentals
Typos and bug fixes:
Continuing our “1337 haxxing” series of projects, we've found that considerable self-imposed conceptual roadblocks keep us from applying otherwise simple computing properties (that data is a series of bytes, and ultimately, that everything is a file).
We resume our exploration with another practical example, this time based on real data generated by an EEG device. The intersection of hardware, software, and logic plays a vital role in problem-solving activities (even if it just enables analysts to make more educated guesses), yet working at that intersection seems to be a skill increasingly taken for granted and alien.
An electroencephalogram (EEG) is a test that detects electrical activity in your brain. Brain cells communicate via electrical impulses and are active all the time, even when you're asleep. This activity can be visualized as wavy lines on an EEG recording, but ultimately is sourced from raw bytes sampled from the device performing the data acquisition.
Sleep is a common area of study where this is particularly applicable, and is even somewhat of a modern-day fad: everything from smartphone apps to special wristbands can be used to monitor aspects of our sleep quality, and more products are coming to market all the time.
We will be analyzing data generated by a consumer-grade EEG headset: basically a device one wears when going to sleep that, via conductive pads in contact with the skin on the forehead, monitors the wearer's brainwaves and can determine their level of activity (especially whether the wearer is asleep, and at what level of sleep).
The data was obtained from a live session (me, sleeping) during my initial polyphasic sleep adaptation a few years ago, so there'll be opportunities to see “normal, boring” sleep patterns, transitions, and even more optimized sleep sessions (including rather restful 20-minute power naps).
The device used generated bytes of raw data, which I captured into individual data files. We will be learning how that data is structured so that we may parse it, and ultimately derive information such as sleep duration, type of sleep, etc.
Like udr0 and udr1… we're just manipulating (reading/writing) bytes of data, and applying specific rules and methods to how we interpret various bytes, or sequences of bytes.
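As a small illustration of that idea, the Python sketch below reads the same four bytes under a few different rules. The byte values are made up for the example and are not taken from the project data.

```python
# Four arbitrary bytes, chosen only for illustration.
raw = bytes([0x41, 0x04, 0x34, 0x12])

# One byte read as an ASCII character.
as_char = chr(raw[0])

# One byte read as a small unsigned number.
as_number = raw[1]

# The same two bytes read as a 16-bit value, in both byte orders.
as_le16 = int.from_bytes(raw[2:4], "little")  # low byte stored first
as_be16 = int.from_bytes(raw[2:4], "big")     # high byte stored first

print(as_char, as_number, hex(as_le16), hex(as_be16))
```

The bytes never change; only the rule applied to them does, which is why knowing the format of a file matters as much as being able to read it.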
Once again there is a conceptual as well as practical angle… some people will struggle more with one over the other, and as always: questions are not just encouraged, they are expected for success!
EEG data is represented in the form of data packets– collections of bytes that can be decoded to convey a particular meaning (state of sleep, timestamp, signal strength, etc.). The format of the data packet is as follows:
Field | Length (bytes) | Description |
---|---|---|
'A' (0x41) | 1 | character starting the packet |
4 (0x04) | 1 | the protocol “version”, of which only 4 is currently supported |
checksum | 1 | a one byte checksum formed by summing the identifier byte and all the data bytes |
msglen | 2 | a two byte message length (little endian). This length includes the size of the data payload plus the identifier |
inv_msglen | 2 | the bitwise inverse of msglen, sent for redundancy. If msglen does not match ~inv_msglen, we can start looking for the next packet immediately instead of reading some arbitrary number of bytes based on a bad length |
time_sec | 1 | the lower 8 bits of the current unix time (when session was recorded) |
sub_sec | 2 | the 16-bit sub-second (runs through 0xFFFF in 1 second), LSB first |
seqnum | 1 | the 8-bit sequence number |
datatype | 1 | the datatype (see table below) |
datablock | variable | the array of binary data |
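The table above can be turned into a decoder fairly directly. The following Python sketch is one possible reading of it, not a reference implementation, and it rests on assumptions the table does not spell out: that the "identifier" mentioned for checksum and msglen is the datatype byte (so the datablock holds msglen - 1 bytes), that inv_msglen is stored little endian like msglen, that the checksum is kept modulo 256 because it must fit in one byte, and that sub_sec can be scaled to a fraction of a second by dividing by 65536. The names parse_packet and scan_packets are hypothetical.

```python
import struct

HEADER_LEN = 12  # bytes before the datablock: 1+1+1+2+2+1+2+1+1


def parse_packet(buf, start=0):
    """Try to decode one packet at buf[start].

    Returns (fields, next_offset) on success, or None if any check fails.
    """
    if len(buf) - start < HEADER_LEN:
        return None
    # "<" = little endian, no padding; B = 1 byte, H = 2 bytes.
    (magic, version, checksum, msglen, inv_msglen,
     time_sec, sub_sec, seqnum, datatype) = struct.unpack_from(
        "<BBBHHBHBB", buf, start)

    if magic != 0x41 or version != 4:
        return None
    # Redundancy check: msglen must equal the bitwise inverse of inv_msglen.
    if msglen != (~inv_msglen & 0xFFFF):
        return None
    if msglen < 1:
        return None

    # Assumption: msglen counts the datatype byte plus the datablock.
    data_len = msglen - 1
    end = start + HEADER_LEN + data_len
    if end > len(buf):
        return None
    data = buf[start + HEADER_LEN:end]

    # Assumption: checksum is the sum of datatype and data bytes, modulo 256.
    if (datatype + sum(data)) & 0xFF != checksum:
        return None

    fields = {
        "time_sec": time_sec,
        "subsec_fraction": sub_sec / 65536,  # assumed scaling to seconds
        "seqnum": seqnum,
        "datatype": datatype,
        "data": data,
    }
    return fields, end


def scan_packets(buf):
    """Walk the buffer, skipping one byte whenever a packet fails to parse."""
    packets = []
    pos = 0
    while pos < len(buf):
        result = parse_packet(buf, pos)
        if result is None:
            pos += 1  # resynchronise: look for the next 'A' start byte
        else:
            fields, pos = result
            packets.append(fields)
    return packets
```

Given the contents of a capture file as bytes, scan_packets(buf) returns one dictionary per packet that passed every check. Advancing a single byte after a failed check mirrors the purpose of inv_msglen: a bad length is rejected early instead of being trusted to skip ahead.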
This week's project is located in the spring2015/udr1/ directory of the UNIX Public Directory, in a file called: data.file
Make a copy of this into your home directory somewhere and set to work.
NOTE: Hopefully it has been standard practice to locate project files in their own unique subdirectory, such as under src/unix/, where you can then add/commit/push the results to your repository (you ARE regularly putting stuff in your repository, aren't you?)
The data you seek (2 files) is obfuscated and contained within this file.
Plain text directions give clues on how to find both pieces of information, and it is up to you to use your skills to extract the necessary data.
Some additional information:
You may want to become familiar with the manual pages of the following tools (in addition to tools you've already encountered):
… along with other tools previously encountered.
Successful completion will result in the following criteria being met:
Please submit as follows:
lab46:~/src/unix/udr1$ submit unix udr1 udr1.text getgizmo.bash gizmo
Submitting unix project "udr1":
    -> udr1.text(OK)
    -> getgizmo.bash(OK)
    -> gizmo(OK)

SUCCESSFULLY SUBMITTED
lab46:~/src/unix/udr1$