======dtr0 tips and tricks======
=====Declaring variables=====
A variable is defined as: an element, feature, or factor that is liable to vary or change.
On the computer, a variable is a memory location containing binary data. Oftentimes, additional rules of interpretation are applied to that binary data, according to a certain format (integer data, floating point data, memory address data).
In C, variables need to be declared and ideally initialized before we use them. All variables need to be identified with some data format (otherwise known as a **data type**).
As talked about in class, and on the dtr0 project, there are a number of possible data types to choose from. For dtr0, we will be focusing on the integer types:
* char
* short int
* int
* long int
* long long int
Each of these can store whole numbers, of differing ranges. The idea is to use the data type that "best fits" the scenario you are using it for.
For example, if you know you will ONLY need to use the values 0-100, then while ALL of these integer types can accommodate such a request, there's one which can accommodate it while wasting the least amount of storage.
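As a rough sketch of that "best fit" idea, the helper below reports how many bytes the smallest unsigned integer type able to hold a given maximum value occupies. The name smallest_fit_bytes is made up for illustration and is not part of the project. Sizes differ between systems, which is why it asks sizeof and <limits.h> rather than assuming.

```c
#include <stddef.h>
#include <limits.h>

/* Hypothetical helper: size in bytes of the smallest unsigned integer
 * type that can hold every value from 0 through max_value. */
size_t smallest_fit_bytes(unsigned long long max_value)
{
    if (max_value <= UCHAR_MAX) return sizeof(unsigned char);
    if (max_value <= USHRT_MAX) return sizeof(unsigned short int);
    if (max_value <= UINT_MAX)  return sizeof(unsigned int);
    if (max_value <= ULONG_MAX) return sizeof(unsigned long int);
    return sizeof(unsigned long long int);
}
```

Calling smallest_fit_bytes(100) and displaying the result with %zu is one way to check your own reasoning about which type wastes the least storage.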
There is an additional qualifier we can associate with each of the integer types, and that is the indication of whether or not we wish to interact with negative numbers. The two qualifiers are:
* signed - allow positive AND negative numbers
* unsigned - no negative numbers at all
If you omit the signedness qualifier, the type defaults to **signed** (the one exception is plain **char**, whose default signedness is left up to the compiler).
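Since the default signedness of a plain **char** is up to the implementation, it is safer to check than to assume. A minimal sketch (plain_char_is_signed is a hypothetical name) that asks <limits.h>:

```c
#include <limits.h>

/* Returns 1 if plain char can hold negative values on this system,
 * 0 if plain char behaves like unsigned char. */
int plain_char_is_signed(void)
{
    return CHAR_MIN < 0;
}
```

Writing **signed char** or **unsigned char** explicitly sidesteps the question entirely.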
To declare a short integer type named **number** with unsigned qualities:
unsigned short int number;
It is good programming practice to be as specific as possible. As such, when we declare variables, it is also a good idea to then initialize them to a known starting value (don't assume!). Here, we will initialize **number** to 0:
number = 0;
We can combine both of these steps into one:
unsigned short int number = 0;
=====Naming your source file=====
Remember to name your source file with a ".c" at the end: **dtr0.c**
This matters because we will be creating an executable named **dtr0**: if your source code lives in a file that is also named **dtr0** (rather than in a separately named **dtr0.c**), the names conflict, and if you are not careful, building the executable can overwrite your source code.
=====Binary representation=====
The computer is a binary device. That means it stores and transacts all its data in base 2, where there are only two counting digits (bits): 0 and 1.
Think of this as a light switch, which can be either turned off, or turned on. That's it.
The computer will group a number of these bits together when it stores, retrieves, or manipulates information. The common unit of transaction is the byte, which on modern systems is eight bits.
The computer accesses data in units of bytes (typically some power of two of them at a time), and the number of bytes (and therefore bits) within that collection of data determines the number of different combinations which can be represented (one at a time).
For example, take a single bit. That "light switch". How many possible combinations can exist with one light switch?
Two, right? It can either be ON, or it can be OFF. Nothing else.
We can use a mathematical expression to aid us in determining the number of possibilities, based on how many bits are present.
It is: 2^x=y
Where **x** is the number of bits, and **y** is the number of possibilities.
^ # bits ^ number of possibilities |
^ 0 | 1 |
^ 1 | 2 |
^ 2 | 4 |
^ 3 | 8 |
^ 4 | 16 |
^ 5 | 32 |
^ 6 | 64 |
^ 7 | 128 |
^ 8 | 256 |
^ 9 | 512 |
If there are eight bits in a byte, a byte can store any one of 256 unique possibilities (but of course, it can only be ONE of those possibilities at any given time).
If we had two bytes, we would have sixteen total bits. How many total possibilities can sixteen bits represent?
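The 2^x=y relationship can be sketched in C with a left shift, since shifting a 1 left by x positions yields 2 to the power of x. The function name possibilities is only an illustration:

```c
/* Number of distinct values that can be represented with the given
 * number of bits: 2 raised to the power of bits.
 * Assumes bits is less than 64 so the shift stays within range. */
unsigned long long possibilities(unsigned int bits)
{
    return 1ULL << bits;
}
```

Displaying possibilities(16) with %llu is a quick way to check your answer to the question above.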
=====Hexadecimal as a binary shorthand=====
While the computer is a binary device, we will often use base 16 (hexadecimal) as a convenient shorthand when interacting with the computer's information.
This is because 16 is itself a power of two (2^4), so there are certain conveniences we can leverage.
Mainly, as it takes four binary bits to represent sixteen unique possibilities, one hexadecimal counting symbol reflects four binary bits. Hexadecimal values will be much shorter than their binary counterparts, while both storing the same information, and not needing any complicated conversion process.
Take the following table:
^ base 2 ^ base 8 ^ base 10 ^ base 16 |
^ (binary) ^ (octal) ^ (decimal) ^ (hexadecimal) |
| 0000 | 00 | 0 | 0x0 |
| 0001 | 01 | 1 | 0x1 |
| 0010 | 02 | 2 | 0x2 |
| 0011 | 03 | 3 | 0x3 |
| 0100 | 04 | 4 | 0x4 |
| 0101 | 05 | 5 | 0x5 |
| 0110 | 06 | 6 | 0x6 |
| 0111 | 07 | 7 | 0x7 |
| 1000 | 010 | 8 | 0x8 |
| 1001 | 011 | 9 | 0x9 |
| 1010 | 012 | 10 | 0xA |
| 1011 | 013 | 11 | 0xB |
| 1100 | 014 | 12 | 0xC |
| 1101 | 015 | 13 | 0xD |
| 1110 | 016 | 14 | 0xE |
| 1111 | 017 | 15 | 0xF |
We can see here that 1011 binary is 0xB hexadecimal.
And 0111 binary is 0x7 hexadecimal.
We can easily convert between the two bases, in the case of binary to hexadecimal, by grouping together four bits at a time (from the right, or least significant bit of the number):
10110111 binary, when put into groups of four bits, is:
1011 0111
And then, using the table, what is 1011 in hex? 0xB
What is 0111 in hex? 0x7.
So, 10110111 binary in hexadecimal is 0xB7.
Consequently, it takes two hex digits to represent one byte.
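The grouping process above can be sketched in C. This is only one possible approach, working on a string of '0' and '1' characters; binary_to_hex is a hypothetical helper name, and it assumes the number of bits is a multiple of four:

```c
#include <stddef.h>
#include <string.h>

/* Convert a string of '0'/'1' characters to uppercase hex digits.
 * Assumes the input length is a multiple of four and that out has
 * room for strlen(binary) / 4 characters plus the terminating '\0'. */
void binary_to_hex(const char *binary, char *out)
{
    static const char digits[] = "0123456789ABCDEF";
    size_t length = strlen(binary);
    size_t i, j;

    for (i = 0; i < length; i += 4)          /* one group of four bits */
    {
        unsigned int nibble = 0;
        for (j = 0; j < 4; j++)              /* fold the four bits into 0-15 */
            nibble = (nibble << 1) | (unsigned int) (binary[i + j] - '0');
        out[i / 4] = digits[nibble];         /* look up the hex digit */
    }
    out[length / 4] = '\0';
}
```

Passing "10110111" fills the output buffer with "B7", matching the worked example.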
We can go the opposite way as well:
0xDEADBEEF (yes, a valid, actual hexadecimal value)
0x D E A D B E E F
We merely need to look up each hex digit's corresponding binary representation, and write them out:
D is 1101, E is 1110, A is 1010, etc.
0xDEADBEEF in binary is 11011110101011011011111011101111.
Also, with 0xDEADBEEF, since we know it takes two hexadecimal digits to represent a byte, 0xDE AD BE EF is a four byte quantity (and with eight bits in a byte, eight bits times four bytes is a total of 32 bits).
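Going from hexadecimal to binary is the same table lookup, one digit at a time. Here is a minimal sketch of that lookup; hex_to_binary is a made-up name, and it assumes the input contains only valid uppercase hex digits with no 0x prefix:

```c
#include <string.h>

/* Expand a string of hex digits into '0'/'1' characters by looking up
 * each digit's four-bit pattern. Assumes only valid uppercase hex digits
 * and that out has room for 4 * strlen(hex) characters plus '\0'. */
void hex_to_binary(const char *hex, char *out)
{
    static const char digits[] = "0123456789ABCDEF";
    static const char *patterns[16] = {
        "0000", "0001", "0010", "0011", "0100", "0101", "0110", "0111",
        "1000", "1001", "1010", "1011", "1100", "1101", "1110", "1111"
    };

    out[0] = '\0';
    for (; *hex != '\0'; hex++)
    {
        const char *where = strchr(digits, *hex);   /* position = digit's value */
        strcat(out, patterns[where - digits]);      /* append its four bits */
    }
}
```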
When we work in bases that are NOT powers of 2 (for example, base 10), values do not map onto binary directly or easily. Instead, a more involved conversion process is needed.
Take the number 175 decimal, and say we want to convert it to hexadecimal:
^ place ^ value (in decimal) |
| 16^0 | 1 |
| 16^1 | 16 |
| 16^2 | 256 |
As we can see, 256 is much larger than 175, so there are ZERO 256s in the number.
16 is next in line, and 175 is much larger than 16, so there are some number of 16s in 175:
^ number of 16s ^ value (in decimal) |
| 0 | 0 |
| 1 | 16 |
| 2 | 32 |
| 3 | 48 |
| 4 | 64 |
| 5 | 80 |
| 6 | 96 |
| 7 | 112 |
| 8 | 128 |
| 9 | 144 |
| 10 | 160 |
| 11 | 176 |
It appears that 10 is our best fit (11 sixteens would be 176, which is too large): **there are 10 sixteens** in 175. What is decimal 10 in hex? Looking at our table: 0xA
175 - 160 = 15
We are now down to the one's place. There are 15 ones in 15. What is decimal 15 in hex? Once again, the table says: 0xF.
Taking the first and second hex value, and concatenating them together, we get: 0xAF.
Therefore, decimal 175 is hexadecimal AF.
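The place-value process we just walked through can be sketched in C as follows. The function name decimal_to_hex is hypothetical, and in practice the %X format specifier does this work for you; the point here is only to mirror the steps by hand:

```c
/* Convert value to uppercase hex digits using the place-value method:
 * find the largest power of 16 that fits, then count how many of each
 * place are present. out needs room for 2 * sizeof(unsigned int)
 * characters plus the terminating '\0'. */
void decimal_to_hex(unsigned int value, char *out)
{
    static const char digits[] = "0123456789ABCDEF";
    unsigned int place = 1;

    while (value / place >= 16)              /* largest place that fits */
        place *= 16;

    while (place > 0)
    {
        unsigned int count = value / place;  /* how many of this place */
        *out++ = digits[count];
        value -= count * place;              /* what is left over */
        place /= 16;                         /* move to the next place down */
    }
    *out = '\0';
}
```

For 175, the first loop stops at the 16s place, the second loop finds 10 sixteens (A) with 15 left over (F), and the buffer ends up holding "AF".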
=====Bitwise AND=====
The logical AND operation is used to evaluate the true or false nature of two statements (A and B). An AND is true if BOTH A and B are true, and false otherwise.
A Bitwise-AND (the & operator in C), can be used to perform a bit-by-bit evaluation of two binary values:
     1101      1001
   & 0100    & 0100
     ====      ====
     0100      0000
It is a good way of isolating a true value at a certain bit position, and masking out the rest.
char A = 13;
char B = 4;
char C = A & B;
fprintf (stdout, "A is: 0x%X\n", A); // %X means "display as hexadecimal"
fprintf (stdout, "B is: 0x%X\n", B);
fprintf (stdout, "C is: 0x%X\n", C);
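To illustrate the masking idea, here is a small sketch that tests whether one particular bit is set. The name bit_is_set and the convention that position 0 is the least significant bit are assumptions made for this example:

```c
/* Returns 1 if the bit at the given position (0 = least significant)
 * is set in value, 0 otherwise. Assumes position is between 0 and 7. */
int bit_is_set(unsigned char value, unsigned int position)
{
    unsigned char mask = (unsigned char) (1u << position);  /* e.g. position 2 gives 0100 */
    return (value & mask) != 0;      /* AND masks out every other bit */
}
```

With the values from the diagram, bit_is_set(13, 2) is true (1101 & 0100 leaves 0100), while bit_is_set(9, 2) is false (1001 & 0100 leaves 0000).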
=====Bitwise OR=====
The logical (inclusive) OR operation is used to evaluate the true or false nature of two statements (A and B). An OR is true if EITHER or BOTH A and B are true, and false otherwise.
A Bitwise-OR (the | operator in C), can be used to perform a bit-by-bit evaluation of two binary values:
     1101      1000
   | 0100    | 0100
     ====      ====
     1101      1100
It is a good way of accumulating desired bits: every bit that is set in the mask ends up set in the result, while all the other bits are left exactly as they were.
char A = 13;
char B = 4;
char C = A | B;
fprintf (stdout, "A is: 0x%X\n", A); // %X means "display as hexadecimal"
fprintf (stdout, "B is: 0x%X\n", B);
fprintf (stdout, "C is: 0x%X\n", C);
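As a companion sketch, OR can be used to turn on one particular bit without disturbing the rest. Again, set_bit is a hypothetical name, with position 0 assumed to be the least significant bit:

```c
/* Returns value with the bit at the given position (0 = least
 * significant) turned on; all other bits are left unchanged.
 * Assumes position is between 0 and 7. */
unsigned char set_bit(unsigned char value, unsigned int position)
{
    return (unsigned char) (value | (1u << position));
}
```

With the values from the diagram, set_bit(8, 2) turns 1000 into 1100, while set_bit(13, 2) leaves 1101 as it is because that bit was already on.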
=====Unsigned values=====
If we have an unsigned 4-bit value, that means all four bits are dedicated to representing the value (ranging from 0 to 15, seeing as 4 bits give us 16 possibilities). Basically, see the big table above, comparing the binary column against the decimal column.
=====Signed values=====
When a signed value is desired, one of the bits is utilized as a so-called "sign bit" (the most-significant, or leftmost, bit of the value).
It isn't precisely that, however. If the sign bit were nothing more than a flag (0 is positive, 1 is negative), we'd end up with a numerical problem:
| 0 000 | +0 |
| 0 001 | +1 |
| 0 010 | +2 |
| 0 011 | +3 |
| 0 100 | +4 |
| 0 101 | +5 |
| 0 110 | +6 |
| 0 111 | +7 |
| 1 000 | -0 |
| 1 001 | -1 |
| 1 010 | -2 |
| 1 011 | -3 |
| 1 100 | -4 |
| 1 101 | -5 |
| 1 110 | -6 |
| 1 111 | -7 |
Do you see the problem with this approach? We'd end up with TWO zero values: positive zero, and negative zero.
That sort of breaks the universe in nasty ways.
So what do we do? Computers have adopted a scheme known as "two's complement" for the encoding of negative values. To obtain the two's complement of a value (that is, its negative), we invert the bits, then add one:
Let's do -1. As +1 is: 0 001, we invert that:
0 001 -> 1 110, then add one:
1 110 + 1 = 1 111.
-2: 0 010 -> 1 101 + 1 = 1 110
-3: 0 011 -> 1 100 + 1 = 1 101
See a pattern yet? We're going backwards:
Two's complement:
| 0 000 | +0 |
| 0 001 | +1 |
| 0 010 | +2 |
| 0 011 | +3 |
| 0 100 | +4 |
| 0 101 | +5 |
| 0 110 | +6 |
| 0 111 | +7 |
| 1 000 | -8 |
| 1 001 | -7 |
| 1 010 | -6 |
| 1 011 | -5 |
| 1 100 | -4 |
| 1 101 | -3 |
| 1 110 | -2 |
| 1 111 | -1 |
So now we have values ranging from -8 through +7 (still a quantity of 16).
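The "invert the bits, then add one" rule, along with reading a pattern back as a signed value, can be sketched for our 4-bit model like so. Both function names are invented for this example, and the 4-bit width is enforced with a mask because C has no 4-bit integer type:

```c
/* 4-bit two's complement negation: invert the bits, add one, and keep
 * only the low four bits. */
unsigned int negate4(unsigned int pattern)
{
    return (~pattern + 1u) & 0xFu;
}

/* Interpret a 4-bit pattern as a signed value (-8 through +7):
 * if the sign bit (0x8) is on, the value is the pattern minus 16. */
int decode4(unsigned int pattern)
{
    pattern &= 0xFu;
    return (pattern & 0x8u) ? (int) pattern - 16 : (int) pattern;
}
```

For instance, negate4(1) yields 0xF (1 111), and decode4(0xF) reads that pattern back as -1, matching the table.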
Furthermore, due to the "reversal" of the negative values, look how it plays into our arithmetic so naturally:
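signed char number = 0; // binary 0000
number = number - 1; // in our 4-bit model: 0000 - 1 wraps around to 1111, which is -1
fprintf (stdout, "number is: %hhd\n", number); // displays -1
number = 7; // binary 0111
number = number + 1; // in our 4-bit model: 0111 + 1 is 1000, which is -8
fprintf (stdout, "number is: %hhd\n", number); // a real signed char has 8 bits, so this displays 8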
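So if we had a 4-bit signed variable, and we wanted to ensure it contained the LOWEST possible value, looking at the table, that means we'd want a one in the sign bit, and zeros in all the other bits.
Is there a way, when we have our signed char number variable, to force it to that state? Treating it as if it were only 4 bits wide (in a real signed char the sign bit sits further to the left, so the mask has to be scaled up to match the width of the type):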
number = 0;
number = number | 0x8;
And for the highest? Would that not be a 0 in the sign bit, followed by all 1s?
number = 0;
number = number | 0x7;
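The same two patterns can be built for other widths. The sketch below is only an illustration under the assumption of two's complement encoding; lowest_pattern and highest_pattern are hypothetical names, and each returns the raw bit pattern rather than the signed value it stands for:

```c
/* Bit pattern of the lowest value in a two's complement type of the
 * given width: a one in the sign bit, zeros everywhere else.
 * Assumes bits is between 1 and 64. */
unsigned long long lowest_pattern(unsigned int bits)
{
    return 1ULL << (bits - 1);
}

/* Bit pattern of the highest value: a zero in the sign bit, ones
 * everywhere else (one less than the lowest pattern). */
unsigned long long highest_pattern(unsigned int bits)
{
    return (1ULL << (bits - 1)) - 1;
}
```

For our 4-bit model these give 0x8 and 0x7, the same masks used above. Displaying the results for other widths with %llX shows how the masks grow with the type.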