Computer Organization FAQ

A list of frequently asked questions and their answers.

Brian Nichols - Jump Instructions and Input Data

Jump Instructions

When should I use jump statements?
-Use jump instructions whenever you need to repeat a block of code (a loop) or choose between different paths through your program based on a condition.

How do I perform jump statements?
-First, work out which kind of jump your situation calls for. There are several, for example Jump if Equal (je), Jump if Greater (jg), Jump if Less (jl), Jump on Overflow (jo), Jump if Zero (jz), and so on. This step can be tricky; I have already chosen the wrong jump twice. Make sure the one you choose behaves the way you want.

How does it know when to jump?
-A conditional jump decides whether to jump by looking at the flags register. The compare instruction (cmp) sets the flags the way the jump needs them, so always place the compare right before the jump. If other arithmetic instructions run in between, they may change the flags and the jump will test the wrong result.

Can I see a jump code example?
-Sure! Here's one to ponder over. Hope it helps!

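equal:
mov ebx,[input]   ; load the data stored at input (brackets = contents, not the address)
mov ecx,outmsg    ; address of the output message
mov eax,4         ; system call number for write
and ebx, 0xff     ; keep only the low byte (the first character)
cmp ebx,0xa       ; is it a newline (0xa)?
jne equal         ; if not, jump back to the label equal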
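
As a complementary sketch, here is a complete loop whose counter changes on every pass, so the jump eventually falls through. It assumes 32-bit Linux and NASM; the file name loop.asm and the labels star, newline and again are made up for this illustration.

```nasm
; loop.asm (hypothetical file name)
; assemble: nasm -f elf loop.asm
; link:     ld -m elf_i386 -o loop loop.o

section .data
star:    db '*'
newline: db 10

section .text
global _start
_start:
    mov esi, 0          ; esi = loop counter (int 0x80 only changes eax)
again:
    mov eax, 4          ; system call number for write
    mov ebx, 1          ; file descriptor 1 = standard output
    mov ecx, star       ; address of the character to print
    mov edx, 1          ; print one byte
    int 0x80
    inc esi             ; count this pass
    cmp esi, 5          ; set the flags from esi - 5
    jl  again           ; jump back while esi is less than 5

    mov eax, 4          ; print the newline once the loop is done
    mov ebx, 1
    mov ecx, newline
    mov edx, 1
    int 0x80

    mov eax, 1          ; system call number for exit
    mov ebx, 0          ; exit code 0
    int 0x80
```

The cmp sits directly before the jl, so nothing else can disturb the flags in between. Run as written, this should print five stars followed by a newline.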

Input Data

How do I make memory available for input data?
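-To do this, you need to reserve a block of memory for that data and specify how much you need. resb reserves bytes and resd reserves doublewords (4 bytes each). Here is an example:

input1 resd 4     ; 4 doublewords = 16 bytes
input2 resd 20    ; 20 doublewords = 80 bytes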

How do I ask a user for data?
-The simplest way of asking a user for data is outputting a question for them to answer. For example, “Please type a number between 1 and 5: ”. Then after that, you would prepare to acquire data from the user.

How do I accept input from the user?
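-Here is an example of data being read from the user and stored into input1:

mov eax,3         ; system call number for read
mov edx,4         ; read at most 4 bytes
mov ecx,input1    ; address of the buffer to fill
mov ebx,0         ; file descriptor 0 = standard input (the keyboard)
int 128           ; call the kernel (128 = 0x80)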

How do I manipulate the data?
-The most straightforward way of manipulating data is with other instructions, such as add, sub, mul, and, and so on. But to do so, you first need to place the data in a register so that you can edit it. You can do this by:

mov eax,input1
mov ebx,[eax]

^Now you are able to manipulate the data, because it is located in ebx. eax points to the beginning memory location of input1, and ebx holds the 4 bytes that eax points to.
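
Putting the input steps together, the following is a minimal sketch of a whole program, assuming 32-bit Linux and NASM. It reads up to 4 bytes, adds 1 to the first character, and prints the result. It works on a single byte (bl) rather than all 4 bytes at once, which is a choice made for this example and not something the steps above require.

```nasm
section .bss
input1: resb 4          ; room for 4 bytes of input

section .text
global _start
_start:
    mov eax, 3          ; system call number for read
    mov ebx, 0          ; file descriptor 0 = standard input
    mov ecx, input1     ; where to store the input
    mov edx, 4          ; read at most 4 bytes
    int 0x80            ; on return, eax = number of bytes read
    mov esi, eax        ; save that count for the write below

    mov eax, input1     ; eax points to the start of input1
    mov bl, [eax]       ; bl = first character typed
    add bl, 1           ; for example 'a' becomes 'b'
    mov [eax], bl       ; store the changed character back

    mov eax, 4          ; system call number for write
    mov ebx, 1          ; file descriptor 1 = standard output
    mov ecx, input1
    mov edx, esi        ; write only as many bytes as were read
    int 0x80

    mov eax, 1          ; system call number for exit
    mov ebx, 0
    int 0x80
```

If the user types more than 4 bytes, the extra characters are simply left unread by this program.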

Mike Gough

Q: What is a system call?
A: System calls are used in assembly to perform functions that require operating system support, such as selecting the display device during output and receiving input.

An example of a system call would be write, which prints to the screen:
  mov eax, 4      ; System call number for write
  mov ebx, 1      ; File descriptor 1 = standard output (the display)
  mov ecx, var    ; What to print, a pre-defined variable in this case
  mov edx, varlen ; The length of the variable to print
  int 80h         ; Software interrupt that calls the kernel

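Q: Why is assembly language system dependent?
A: It's all about how the system “thinks”. Every processor architecture has its own instruction set and registers, so assembly written for one will not assemble for another. Even the way data is laid out in memory differs: x86 uses a “little endian” byte order, so if you write code that relies on that and move it to a big endian machine, such as the SPARC or POWER processors that Solaris and AIX commonly run on, things will not turn out as you expect.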

Q: Ok, what exactly is big/little endian?
A: Endianness is the concept of byte ordering on a specific architecture. A multi-byte value is stored with its bytes in the opposite order on machines with different endianness: little endian machines store the least significant byte first, and big endian machines store the most significant byte first. The term apparently came from Gulliver's Travels, where Lilliput required its citizens to crack their boiled eggs on the small end, and in Blefuscu they cracked them on the large end. Lilliput was little-endian and Blefuscu was big-endian!
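
To see the byte order for yourself, here is a small sketch assuming 32-bit Linux and NASM. The label value and the file name endian.asm are made up for the example. It stores a 4-byte number and returns the byte at the lowest address as the exit code.

```nasm
; endian.asm (hypothetical file name)
section .data
value:  dd 0x12345678           ; on x86, stored in memory as 78 56 34 12

section .text
global _start
_start:
    movzx ebx, byte [value]     ; ebx = byte at the lowest address
    mov eax, 1                  ; system call number for exit
    int 0x80                    ; exit code = that byte
```

On a little endian x86 machine, running the program and then typing echo $? should print 120, which is 0x78, the least significant byte of the number.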

Q: How do I obtain the memory size for a variable to use in EDX during a print call?
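A: When you define your memory storage variable, you can define a constant right after it that holds its size.

section .bss
  input1  resd 10
  input1Len equ  $ - input1   ; size of input1 in bytes (40), computed at assembly time

Note that this is the size of the reserved block, not the length of the data the user actually entered; the read system call returns that length in eax.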

Q: How do I put a comment in an assembly program?
A: That's easy, comments are preceded by a semicolon (;)

;//////////////////////////////
;//  This program does this and that
;//////////////////////////////

or

mov eax, 4 ; System call for print

Q: What is an opcode?
A: An opcode is an instruction for the computer to perform a certain action. Opcodes are referenced by mnemonics such as MOV, ADD, JNE etc. A complete list for the x86 can be found here: http://www.jegerlehner.ch/intel/IntelCodeTable.pdf

Q: What is a system interrupt?
A: A system interrupt (software interrupt) is a routine, provided by the BIOS or the operating system, that your code calls with the int instruction to perform a specific action. Interrupts are architecture dependent and VERY dependent on the host OS, so care must be taken when you want to port to other operating systems. On Linux, for example, system calls go through int 80h.

Some examples of interrupts…

Int 34  - FLOATING POINT EMULATION - OPCODE D8h <- This interrupt is used to emulate floating-point instructions with an opcode of D8h
Int 62  - HP 95LX - USED BY CALCULATOR <- This is a vendor specific interrupt!

A full list of PC interrupts can be found here: http://www.ctyme.com/intr/int.htm

Q: What kind of numeric data can I use as arguments of opcodes?
A: Anything you want! To use binary, write 0011b for example; the b appended to the end tells the assembler that the number is binary. To use hex, append an h instead, for example 0FFh (a hex number written this way must start with a digit, so add a leading 0 if needed).
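
As a quick check, the sketch below (assuming 32-bit Linux and NASM; the labels differ and done are made up) loads the same number written four ways and exits with code 0 only if all four registers ended up equal.

```nasm
section .text
global _start
_start:
    mov eax, 255            ; decimal
    mov ebx, 0xff           ; hex with a 0x prefix
    mov ecx, 0ffh           ; hex with an h suffix (leading 0 required)
    mov edx, 11111111b      ; binary with a b suffix

    cmp eax, ebx
    jne differ
    cmp eax, ecx
    jne differ
    cmp eax, edx
    jne differ
    mov ebx, 0              ; all four matched: exit code 0
    jmp done
differ:
    mov ebx, 1              ; something differed: exit code 1
done:
    mov eax, 1              ; system call number for exit
    int 0x80
```

Since all four literals are the value 255, the program should exit with code 0.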

Jeff Jansen

Compiling: (section not finished)

Ricky Moses

Q: How do I add numbers whose sum is higher than 9? A:

      First, you need to AND out the newline character.
      Second, add the two digits together.
      Do a compare to see if the sum is greater than 9.
      If it is, subtract 10 from the sum and add one to the next (tens) digit.
      Afterward you need to switch the order of the digits, by saving the value to two different registers, then ANDing the beginning off of one and the end off of the other,
      then adding them together again.
      Finally, it is ready to print out.
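
One possible way to code the idea above is sketched here, assuming 32-bit Linux and NASM. The labels digit1, digit2, result and store are made up, and the two digits are hard-coded instead of read from the user. This version writes each digit to its own byte of the result, which avoids the register swapping step; it is an alternative, not the only way to do it.

```nasm
section .data
digit1: db '7'              ; first digit, as an ASCII character
digit2: db '8'              ; second digit, as an ASCII character
result: db "00", 10         ; two ASCII digits plus a newline

section .text
global _start
_start:
    movzx eax, byte [digit1]
    and   eax, 0x0f         ; '7' (0x37) becomes the number 7
    movzx ebx, byte [digit2]
    and   ebx, 0x0f         ; '8' (0x38) becomes the number 8
    add   eax, ebx          ; eax = 15
    mov   ebx, 0            ; ebx will hold the tens digit (the carry)
    cmp   eax, 10
    jl    store             ; sum is 0 to 9, so there is no carry
    sub   eax, 10           ; keep only the ones digit
    mov   ebx, 1            ; carry one into the tens digit
store:
    add   ebx, '0'          ; turn both numbers back into ASCII
    add   eax, '0'
    mov   [result], bl      ; tens digit goes first
    mov   [result + 1], al  ; ones digit goes second

    mov   eax, 4            ; system call number for write
    mov   ebx, 1            ; standard output
    mov   ecx, result
    mov   edx, 3            ; two digits and the newline
    int   0x80

    mov   eax, 1            ; system call number for exit
    mov   ebx, 0
    int   0x80
```

With the digits 7 and 8, this should print 15.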

Q: What about bigger numbers, like adding two 2-digit numbers together? A:

      First, move each digit into its own register.
      Then, add the two ones digits together.
      After that, add the two tens digits together.
      Test if the sum of the ones digits is greater than 9.
      If it is, subtract 10 from it and add one to the sum of the tens digits.
      Switch the tens digit around, and switch the ones digit around,
      orienting them in the registers so that the ones digit is in the front and the tens digit is in the back.
      Send them back into a sum location,
      and they are ready to print out.

Jesse Short

Q: What assembler should I use?
A: It depends on the OS and hardware, but for Linux on x86 (Intel), NASM (the Netwide Assembler) is probably the smartest choice. But look around; there are many others such as MASM, FASM, YASM, etc.

Q: What is x86 assembly language?
A: It is a backwards compatible language for the x86 class of processors (such as the Intel Pentium series and AMD's Athlon series). It uses mnemonics: short, easy to remember words that resemble an abbreviation of their full-length word, e.g. mov = move, jmp = jump, sub = subtract, etc. These mnemonics represent operations that the CPU can perform. Like all other assembly languages, this one is machine specific.

Q: What is a register?
A: It's a high speed storage area located on the CPU. A register can hold the data itself, or the address of a memory location where the data is stored.

Q: Is there a limited number of these registers?
A: Yes, it depends on the CPU. For instance, 32-bit x86 has 8 general purpose registers, 4 of which (the A, B, C and D registers) are the ones you will use most. There are also a few others that are function specific.

Q: How big are these registers?
A: The size varies. The basic general purpose registers, such as ax, are 16 bits wide. If an 'e' (e = extended) is prefixed to one of them, as in eax, it is 32 bits wide and can therefore manipulate 32 bits of data.

Q: What are the general purpose registers?
A: The four main ones are ax, bx, cx, and dx. The other four are si, di, bp, and sp.

Q: What is assembly language good for?
A: Assembly is good for optimizing code when something has too much overhead, and for programming bootloaders, device drivers, or a kernel.

Q: What is the stack?
A: The stack is a region of memory that can be used to store data temporarily. It is naturally a last in, first out (LIFO) concept. You can think of the stack as literally stacking things on top of each other: the last thing you put on is the first thing you take off.

Mike Short

Let's expand on the registers a little bit more.

General Purpose Registers:

   As the title says, the general registers are the ones we use most of the time. Most instructions operate on these registers. Each of them has a 16-bit lower half, and the first four can be broken down further into 8-bit registers.

The “H” and “L” suffix on the 8 bit registers stand for high byte and low byte. With this out of the way, let's see their individual main use:

   EAX,AX,AH,AL : Called the Accumulator register. 
                  It is used for I/O port access, arithmetic, interrupt calls,
                  etc...

   EBX,BX,BH,BL : Called the Base register
                  It is used as a base pointer for memory access
                  Gets some interrupt return values
                  
   ECX,CX,CH,CL : Called the Counter register
                  It is used as a loop counter and for shifts
                  Gets some interrupt values

   EDX,DX,DH,DL : Called the Data register
                  It is used for I/O port access, arithmetic, some interrupt 
                  calls.
                  
   ESP/SP       : Stack Pointer register.
                  Pointer to the top of the stack.

   EBP/BP       : Stack Base Pointer register.
                  Used to point to the base of the stack.

   ESI/SI       : Source register.
                  Used as a pointer to a source in stream operations.

   EDI/DI       : Destination register.
                  Used as a pointer to a destination in stream operations.

Segment Registers:

   Segment registers hold the segment address of various items. They are only available as 16-bit values. They can only be set from a general register or by special instructions. Some of them are critical for the correct execution of the program, so you may want to wait until you are ready for multi-segment programming before playing with them.

             CS : Holds the Code segment in which your program runs.
                  Changing its value might make the computer hang.

             DS : Holds the Data segment that your program accesses.
                  Changing its value might give erroneous data.

       ES,FS,GS : These are extra segment registers available for
                  far pointer addressing like video memory and such.

             SS : Holds the Stack segment your program uses.
                  Sometimes has the same value as DS.
                  Changing its value can give unpredictable results,
                  mostly data related.

Indexes and Pointers:

   Indexes and pointers are the offset part of an address. They have various uses, but each register has a specific function. They are sometimes used with a segment register to point to a far address (in a 1 MB range). The registers with an "E" prefix are the 32-bit versions and are normally used in protected mode.
  
  ES:EDI EDI DI : Destination index register
                  Used for string, memory array copying and setting and
                  for far pointer addressing with ES

  DS:ESI ESI SI : Source index register
                  Used for string and memory array copying

  SS:EBP EBP BP : Stack Base pointer register
                  Holds the base address of the stack
              
  SS:ESP ESP SP : Stack pointer register
                  Holds the top address of the stack

  CS:EIP EIP IP : Instruction pointer
                  Holds the offset of the next instruction
                  It cannot be accessed directly with mov; jumps, calls and returns change it

EFLAGS Register:

   The EFLAGS register holds the state of the processor. It is modified by many instructions and is used for comparisons, conditional loops and conditional jumps. Each bit holds the state of a specific parameter of the last instruction. Here is a listing:

  

Bit   Label    Description
---------------------------
0      CF      Carry flag
2      PF      Parity flag
4      AF      Auxiliary carry flag
6      ZF      Zero flag
7      SF      Sign flag
8      TF      Trap flag
9      IF      Interrupt enable flag
10     DF      Direction flag
11     OF      Overflow flag
12-13  IOPL    I/O Privilege level
14     NT      Nested task flag
16     RF      Resume flag
17     VM      Virtual 8086 mode flag
18     AC      Alignment check flag (486+)
19     VIF     Virtual interrupt flag
20     VIP     Virtual interrupt pending flag
21     ID      ID flag

Operation Suffixes:

   GAS assembly instructions are generally suffixed with the letters "b", "s", "w", "l", "q" or "t" to determine what size operand is being manipulated.

b = byte (8 bit)
s = short (16 bit integer) or single (32-bit floating point)
w = word (16 bit)
l = long (32 bit integer or 64-bit floating point)
q = quad (64 bit)
t = ten bytes (80-bit floating point)

Hello World:

   Everyone who learns a new programming language writes a "Hello World" program, partly for kicks and giggles and partly for the more serious matter of beginning to understand roughly how the code works. I'm also including one because it's very hard to find helpful examples out there on the interwebz. There are many different ways of doing this, but here's mine as an example:

     
; hello.asm
;
; assemble: nasm -f elf -l hello.lst  hello.asm
; link:     gcc -o hello  hello.o
; run:      ./hello
; output:   Hello World

SECTION .data               ; data section
msg:    db "Hello World",10 ; the string to print, 10=lf (newline)
len:    equ $-msg           ; "$" means "here"

                            ; len is a value, not an address

SECTION .text               ; code section
global main                 ; make label available to linker
main:                       ; standard  gcc  entry point

    mov edx,len             ; arg3, length of string to print
    mov ecx,msg             ; arg2, pointer to string
    mov ebx,1               ; arg1, where to write, 1 = stdout (screen)
    mov eax,4               ; system call number 4 = write
    int 0x80                ; interrupt 80 hex, call kernel

    mov ebx,0               ; exit code, 0=normal
    mov eax,1               ; exit command to kernel
    int 0x80                ; interrupt 80 hex, call kernel

Having Trouble Getting Started?

   Seriously, who doesn't? Especially with this stuff >.< but anyway...
   
        ====Appetite Whetter====
             
             Register Addressing
                  mov eax, ebx  ; moves contents of register ebx into eax
             Immediate Addressing
                  mov eax, 1     ; moves value of 1 into register eax
             Direct memory addressing
                  mov eax, [102h] ; Actual address is DS:0 + 102h
                  
        ====2 Modes to Program In====
        
             Real Mode
             
                  You generally won't need to know anything about it (unless you are programming for a DOS-based system or, most likely, writing a boot loader that is directly called by the BIOS).
                  
                  In Real Mode, a segment and an offset register are used together to yield a final memory address. The value in the segment register is multiplied by 16 (or shifted 4 bits to the left) and the offset is added to the result. This provides a usable space of 1 MB.
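
                  The real mode calculation can be tried out with ordinary 32-bit arithmetic. The sketch below assumes 32-bit Linux and NASM, and the segment 0xB800 and offset 0x0010 are just example values picked for the illustration. It only computes the address as a number; it does not switch the processor into real mode.

```nasm
section .text
global _start
_start:
    mov eax, 0xB800     ; example segment value
    shl eax, 4          ; multiply by 16: eax = 0xB8000
    add eax, 0x0010     ; add the example offset: eax = 0xB8010

    cmp eax, 0xB8010    ; is it the address we expected?
    jne wrong
    mov ebx, 0          ; yes: exit code 0
    jmp done
wrong:
    mov ebx, 1          ; no: exit code 1
done:
    mov eax, 1          ; system call number for exit
    int 0x80
```

                  So segment 0xB800 with offset 0x0010 refers to the final address 0xB8010, and the program should exit with code 0.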
             
             Protected Mode
             
                  If programming in a modern operating system (such as Linux, Windows), you are basically programming in flat 32-bit mode. Any register can be used in addressing, and it is generally more efficient to use a full 32-bit register instead of a 16-bit register part. Additionally, segment registers are generally unused in flat mode, and it is generally a bad idea to touch them.

Nate Webb

Q: What is debugging?

A: The official definition of Debugging is, “a methodical process of finding and reducing the number of bugs, or defects, in a computer program or a piece of electronic hardware, thus making it behave as expected.”

Q: How do you initialize the debug process?

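A: After you've assembled and linked your program, you can start debugging by using the 'gdb' command followed by the name of the program you wish to debug. Ex. 'gdb test2'. Note that you give gdb the name of the executable, so no file extension is required. If you link with ld -s, the labels are stripped out of the executable and gdb will not be able to find them, so leave the -s off when you plan to debug.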

Q: What does the 'gdb' program actually show?

A: The debugging program allows the user to look at specific details about the lines of code that were requested. So the first step is to identify the lines of code that you want to look at. To do this, you set up “breakpoints” by putting labels on those lines in the actual code. An example follows…

  mov ecx, in1
  mov edx, [ecx]

a: sub edx, 48
b: sub edx, edx
c: add edx, 48
d: mov [ecx], edx
e: add ecx, 1

  mov edx, 10
  mov ecx, 0

Those labels (a:, b:, c:, etc.) mark the lines of code that can be loaded into the debugger by name. Without the labels, the lines cannot be referred to by name in the debugging program.

Q: How do you load those lines of code into the debugging program?

A: Once you have the specific lines marked and the debugging program open, the next step is to load the lines. To do this, type break followed by the corresponding label. For example, since the code segment above has labels “a” through “e”, you would type a break statement for each letter (a - e) to load each of those lines into the program. Ex.

 break a
 break b
 break c
 etc...

Q: How can you look at the info of those specific lines?

A: After the lines are loaded, the program is ready to be run. To run the program, simply type run. The program runs as it normally would, except that under the debugger it stops at the first breakpoint, which gives you the opportunity to check the progress at each of those steps. To bring up that info, use the “info registers” command, or simply “i r”, which lists the registers at that breakpoint. An example follows..

   eax            0x2      2
   ecx            0x80491e8        134517224
   edx            0x4      4
   ebx            0x1      1
   esp            0xffffbce0       0xffffbce0
   ebp            0x0      0x0
   esi            0x0      0
   edi            0x0      0
   eip            0x80480dd        0x80480dd <a>
   eflags         0x202    [ IF ]
   cs             0x23     35
   ss             0x2b     43
   ds             0x2b     43
   es             0x2b     43
   fs             0x0      0
   gs             0x0      0

As you can see, the Info Registers command shows a LOT of information about the state of the program at the current line. Most notable are the first four lines, which show the four main registers and the values they hold.

Q: How do you move to the next break point?

A: Once the first line of code has been examined, it's time for the next line to be viewed. Typing “c” or “continue” runs the program until it reaches the next breakpoint and makes it available to be viewed just like the previous one. To advance one instruction at a time instead, use “stepi” (or “si”); “s” (step) moves by source line and needs debugging information to do so.

Q: What are the benefits of debugging?

A: As can be seen in the previous Q/As, debugging is a great way to check for, and possibly find solutions to, otherwise undetectable logic errors. Since assembly language can already be difficult to troubleshoot, all the help you can get is greatly appreciated, and debugging is an extremely valuable tool. Even without errors, it can really help you understand which instructions do what and gain a better grasp of how assembly works in general.

Q: What are some other handy features of the 'gdb' program?

A: In case there is any confusion, here or in general, the help menu can be very effective at clearing up doubts. At any time after starting the 'gdb' program, simply type 'help' and it will load a self-explanatory menu that can help you with any commands or information you need.

Matthew Taft

Q: How do I store variables into memory?
A: Put the memory location for the variable into a register such as EAX, then put the value you want to store in another register like EBX. After that use mov DWORD [eax], ebx

Q: How should I end my assembly program?
A: It should be ended with an exit system call (sys_exit) to the kernel, which can be set up by using:

mov eax, 1
mov ebx, 0
int 0x80

Q: What do the numbers for the system calls mean?
A: On 32-bit x86 Linux, the first few system call numbers are as follows:

1 - Exit; used to tell the kernel to terminate a process.
2 - Fork; used to tell the kernel to create a process.
3 - Read; used to read data from the specified device (such as the keyboard).
4 - Write; used to write data to the specified device (such as the monitor).
5 - Open; used to access a file on the system.
6 - Close; used to close a file when it is done being used.

These are the most often used calls in assembly.

Q: How do I use the stack?
A: The instructions push and pop allow access to the stack. Push, along with a register argument, decrements the stack pointer (the stack grows toward lower addresses) and copies the contents of the register onto the top of the stack. Pop takes the last item off of the stack, puts it into the register used as the argument, and increments the stack pointer.
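
A short sketch of the last in, first out behaviour, assuming 32-bit Linux and NASM. The values 7 and 9 are arbitrary. Two registers are pushed and then popped in the opposite order, which swaps their contents.

```nasm
section .text
global _start
_start:
    mov eax, 7
    mov ebx, 9
    push eax            ; stack (top first): 7
    push ebx            ; stack (top first): 9, 7
    pop eax             ; eax = 9, the last value pushed
    pop ebx             ; ebx = 7, the first value pushed

    mov eax, 1          ; system call number for exit
    int 0x80            ; exit code = ebx = 7
```

Running this and then typing echo $? should print 7, showing that ebx received the value that was pushed first.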

Q: How do I get to be able to operate on the contents of a variable?
A: Use mov to put the address of the variable in a register (mov eax, varname), then use whatever instruction you wish on it, but instead of the plain register name, use a size and the register name in brackets, for example byte [eax]. Make sure to keep the brackets. The instruction will then operate on the contents of the address that the register is pointing to, not on the contents of the register.

Macros

Q: What is a macro?

A: A macro is a small piece of code that can be reused later.

Q: How do I create one in ASM?

A: Here is an example of a macro to exit the program:

%macro         exitProgram 0
   mov         eax, 1
   mov         ebx, 0
   int         0x80
%endmacro

A macro starts with %macro followed by the name of the macro and the number of parameters it takes (0 here). Then comes the macro's code. To end the macro you use %endmacro.

Q: Where do I put the macro in my ASM file?

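A: A macro definition does not generate any code or data by itself, so it does not belong to a particular section. It only has to appear before the first place it is used, so macros are typically defined at the top of the file, before the sections.

%macro      exitProgram 0
   mov      eax, 1
   mov      ebx, 0
   int      0x80
%endmacro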

Q: How do I call a macro?

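A: To call our exitProgram macro from before, we just use the name of the macro as its call.

_start:
   enter       0, 0
   pusha
   [some code]
   exitProgram            ; expands to the exit instructions; the program ends here
   popa                   ; never reached
   leave                  ; never reached
hlt                       ; never reached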

Q: Can I pass information to a macro?

A: Sure, you just define how many parameters are going to be passed right after the name of the macro. These values would then be accessed inside the macro using % followed by the number corresponding to its place in the passed parameters list, starting from 1.

%macro         print 2
   mov         eax, 0x4
   mov         ebx, 0x1
   mov         ecx, %1
   mov         edx, %2
   int         0x80
%endmacro

This could then be called using:

   print       msg1, msg1len

As a result, the contents of msg1, of length defined by msg1len would be displayed to stdout.
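
For reference, here is a minimal sketch of a complete program built from the two macros, assuming 32-bit Linux and NASM. The message text is made up. Note the int 0x80 inside print, which is what actually asks the kernel to do the write.

```nasm
%macro print 2
    mov eax, 4          ; system call number for write
    mov ebx, 1          ; standard output
    mov ecx, %1         ; first parameter: address of the text
    mov edx, %2         ; second parameter: length of the text
    int 0x80
%endmacro

%macro exitProgram 0
    mov eax, 1          ; system call number for exit
    mov ebx, 0          ; exit code 0
    int 0x80
%endmacro

section .data
msg1:    db "Hello from a macro", 10
msg1len: equ $ - msg1

section .text
global _start
_start:
    print msg1, msg1len
    exitProgram
```

Each place a macro name appears, the assembler pastes in the macro's instructions with %1 and %2 replaced by the arguments, so this program should print the message and then exit.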

Q: Can a macro make use of another macro?

A: It sure can. Just call it inside of the macro like you would in your program.

segment  .data
   newline  db 0xA, 0x0

%macro      printNewline 0
   print    newline, 0x1      ; print only the line feed, not the 0x0 after it
%endmacro

Q: Is it possible to put a whole bunch of macros into an external file and include them in multiple programs?

A: Yes, you would use the %include directive at the top of your asm file. Typically macro files will use a .mac extension.

%include       "myMacros.mac"

Your macro file would then contain just your macro definitions:

%macro         exitProgram 0
   mov         eax, 1
   mov         ebx, 0
   int         0x80
%endmacro
 
%macro         print 2
   mov         eax, 0x4
   mov         ebx, 0x1
   mov         ecx, %1
   mov         edx, %2
   int         0x80
%endmacro

Q: Can I include multiple macro files?

A: Just use another %include statement.

%include       "file1.mac"
%include       "file2.mac"

Tyrone Riley

Q: What is the .data section?

A: This is the initialized data section, used for defining data that has a value when the program starts, such as messages, file names, and buffer sizes. Directives used in this section are DB, DW, DD, DQ, DT and EQU.

section .data
    message:    db     'hello'    ; this is the message "hello" without quotes
    messageLen: equ    $-message  ; this is the length of the message

Q: What is the .bss section?

A: This section is used to reserve space for uninitialized variables. It can be omitted if you don't need any. Directives used in this section are RESB, RESW, RESD, RESQ and REST.

section .bss
    number:     resb    1           ;this reserves 1 byte for number

Q: What is the .text section?

A: Main program code goes here. This section begins with global _start, which makes the entry point visible to the linker (like int main in C).

Q: Where is the start of a .asm program?

A: Within the .text section, the label _start marks the start of the code.

section .text
    global _start
 
    _start:

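Q: How do you assemble and link a .asm program?

A: For a program named hello.asm:

nasm -f elf hello.asm
ld -s -o hello hello.o
./hello

On a 64-bit Linux system, ld will likely also need the -m elf_i386 option to link the 32-bit object file.

Q: How do you assemble and link in one step?

A: By using the make program with a Makefile that lists both commands.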

Q: What's on the stack when the program first starts?

A: The first thing on the stack is the number of arguments on the command line, including the name of the program. The next thing on the stack is a pointer to the program name, followed by pointers to each of the other words that were on the command line. For example, after running

./hello 12 text stuff

the stack holds:

1 → 4
2 → ./hello
3 → 12
4 → text
5 → stuff
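
A tiny sketch that shows the argument count, assuming 32-bit Linux and NASM. The file name argc.asm is made up. It pops the first item off the stack and uses it as the exit code.

```nasm
; argc.asm (hypothetical file name)
section .text
global _start
_start:
    pop ebx             ; first item on the stack: the number of arguments
    mov eax, 1          ; system call number for exit
    int 0x80            ; exit code = number of arguments
```

Running ./argc 12 text stuff and then typing echo $? should print 4: the program name plus three arguments.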

Q: How do I insert a line feed on the end of a string?

A: When declaring the string, add a ,10 after it.

section .data
    message:    db     'hello',10    ; this is the message "hello" without quotes and a '\n'
