Chapter Two Low-Level Control Structures
2.1 Chapter Overview
This chapter discusses "pure" assembly language control statements. The last section of this chapter discusses hybrid control structures that combine the features of HLA's high level control statements with the 80x86 control instructions.
2.2 Low Level Control Structures
Until now, most of the control structures you've seen and used in your programs have been very similar to the control structures found in high level languages like Pascal, C++, and Ada. While these control structures make learning assembly language easy, they are not true assembly language statements. Instead, the HLA compiler translates these control structures into a sequence of "pure" machine instructions that achieve the same result as the high level control structures. This text uses the high level control structures so that you do not have to learn too much all at once. Now, however, it's time to put aside these high level language control structures and learn how to write your programs in real assembly language, using low-level control structures.
2.3 Statement Labels
HLA low level control structures make extensive use of labels within your code. A low level control structure usually transfers control from one point in your program to another point in your program. You typically specify the destination of such a transfer using a statement label. A statement label consists of a valid (unique) HLA identifier and a colon, e.g.,
    aLabel:

Of course, like procedure, variable, and constant identifiers, you should attempt to choose descriptive and meaningful names for your labels. The identifier "aLabel" is hardly descriptive or meaningful.
Statement labels have one important attribute that differentiates them from most other identifiers in HLA: you don't have to declare a label before you use it. This is important because low-level control structures must often transfer control to a label that appears later in the code; the label may therefore not yet be defined at the point you reference it.
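The following is a minimal sketch of such a forward reference, assuming the same Standard Library include used by the other programs in this chapter. The program name fwdRef and the label skipOver are made up for illustration. The JMP refers to skipOver before HLA has seen its definition:

```hla
program fwdRef;
#include( "stdlib.hhf" );

begin fwdRef;

    stdout.put( "Before the jump" nl );
    jmp skipOver;               // skipOver has not been defined yet at this point

    stdout.put( "Never printed" nl );

skipOver:                       // the definition appears after the reference
    stdout.put( "After the jump" nl );

end fwdRef;
```

No declaration of skipOver is needed ahead of the JMP; HLA resolves the reference once it reaches the label.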
You can do three things with labels: transfer control to a label via a jump (goto) instruction, call a label via the CALL instruction, and take the address of a label. There is very little else you can directly do with a label (of course, there is very little else you would want to do with a label, so this is hardly a restriction). The following program demonstrates two ways to take the address of a label in your program and print out the address (using the LEA instruction and using the "&" address-of operator):
    program labelDemo;
    #include( "stdlib.hhf" );

    begin labelDemo;

        lbl1:

            lea( ebx, lbl1 );       // Address of lbl1 via the LEA instruction.
            mov( &lbl2, eax );      // Address of lbl2 via the "&" operator.
            stdout.put( "&lbl1=$", ebx, " &lbl2=", eax, nl );

        lbl2:

    end labelDemo;

Program 2.1 Displaying the Address of Statement Labels in a Program

HLA also allows you to initialize dword variables with the addresses of statement labels. However, there are some restrictions on labels that appear in the initialization portions of variable declarations. The most important restriction is that you must define the statement label at the same lex level as the variable declaration. That is, if you reference a statement label in the initialization section of a variable declaration appearing in the main program, the statement label must also be in the main program. Conversely, if you take the address of a statement label in a local variable declaration, that symbol must appear in the same procedure as the local variable. The following program demonstrates the use of statement labels in variable initialization:
    program labelArrays;
    #include( "stdlib.hhf" );

    static
        labels: dword[2] := [ &lbl1, &lbl2 ];

        procedure hasLabels;
        static
            stmtLbls: dword[2] := [ &label1, &label2 ];

        begin hasLabels;

            label1:

                // HLA does not scale constant array indexes, so [4] is the
                // byte offset of the second dword element.

                stdout.put
                (
                    "stmtLbls[0]= $", stmtLbls[0], nl,
                    "stmtLbls[1]= $", stmtLbls[4], nl
                );

            label2:

        end hasLabels;

    begin labelArrays;

        hasLabels();

        lbl1:

            stdout.put( "labels[0]= $", labels[0], " labels[1]=", labels[4], nl );

        lbl2:

    end labelArrays;

Program 2.2 Initializing DWORD Variables with the Address of Statement Labels

Once in a really great while, you'll need to refer to a label that is not within the current procedure. The need for this is sufficiently rare that this text will not describe all the details. However, you can look up the details in the section on HLA's LABEL declaration in the HLA documentation should the need ever arise.
2.4 Unconditional Transfer of Control (JMP)
The JMP (jump) instruction unconditionally transfers control to another point in the program. There are three forms of this instruction: a direct jump, and two indirect jumps. These instructions take one of the following three forms:
    jmp label;
    jmp( reg32 );
    jmp( mem32 );

For the first (direct) jump above, you normally specify the target address using a statement label (see the previous section for a discussion of statement labels). The statement label is usually on the same line as an executable machine instruction or appears by itself on a line preceding an executable machine instruction. The direct jump instruction is the most commonly used of these three forms. It is completely equivalent to a GOTO statement in a high level language [1]. Example:
        << statements >>
        jmp laterInPgm;
            .
            .
            .
    laterInPgm:
        << statements >>

The second form of the JMP instruction above, "jmp( reg32 );", is a register indirect jump instruction. This instruction transfers control to the instruction whose address appears in the specified 32-bit general purpose register. To use this form of the JMP instruction you must load the specified register with the address of some machine instruction prior to the execution of the JMP. You could use this instruction to implement a state machine (see "State Machines and Indirect Jumps" on page 654) by loading a register with the address of some label at various points throughout your program; then, arriving along different paths, a point in the program can determine what path it arrived upon by executing the indirect jump. The following short sample program demonstrates how you could use the JMP in this manner:
    program regIndJmp;
    #include( "stdlib.hhf" );

    static
        i: int32;

    begin regIndJmp;

        // Read an integer from the user and set EBX to
        // denote the success or failure of the input.

        try

            stdout.put( "Enter an integer value between 1 and 10: " );
            stdin.get( i );
            mov( i, eax );
            if( eax in 1..10 ) then

                mov( &GoodInput, ebx );

            else

                mov( &valRange, ebx );

            endif;

          exception( ex.ConversionError )

            mov( &convError, ebx );

          exception( ex.ValueOutOfRange )

            mov( &valRange, ebx );

        endtry;

        // Okay, transfer control to the appropriate
        // section of the program that deals with
        // the input.

        jmp( ebx );

        valRange:
            stdout.put( "You entered a value outside the range 1..10" nl );
            jmp Done;

        convError:
            stdout.put( "Your input contained illegal characters" nl );
            jmp Done;

        GoodInput:
            stdout.put( "You entered the value ", i, nl );

        Done:

    end regIndJmp;

Program 2.3 Using Register Indirect JMP Instructions

The third form of the JMP instruction is a memory indirect JMP. This form of the JMP instruction fetches a dword value from the specified memory location and transfers control to the instruction at the address specified by the contents of the memory location. This is similar to the register indirect JMP except the address appears in a memory location rather than in a register. The following program demonstrates a rather trivial use of this form of the JMP instruction:
    program memIndJmp;
    #include( "stdlib.hhf" );

    static
        LabelPtr: dword := &stmtLabel;

    begin memIndJmp;

        stdout.put( "Before the JMP instruction" nl );
        jmp( LabelPtr );

        stdout.put( "This should not execute" nl );

        stmtLabel:

            stdout.put( "After the LabelPtr label in the program" nl );

    end memIndJmp;

Program 2.4 Using Memory Indirect JMP Instructions

Warning: unlike the HLA high level control structures, the low-level JMP instructions can get you into a lot of trouble. In particular, if you do not initialize a register with the address of a valid instruction and you jump indirect through that register, the results are undefined (though this will usually cause a general protection fault). Similarly, if you do not initialize a dword variable with the address of a legal instruction, jumping indirect through that memory location will probably crash your program.
2.5 The Conditional Jump Instructions
Although the JMP instruction provides transfer of control, it does not allow you to make any serious decisions. The 80x86's conditional jump instructions handle this task. The conditional jump instructions are the basic tool for creating loops and other conditionally executable statements like the IF..ENDIF statement.
The conditional jumps test one or more flags in the flags register to see if they match some particular pattern (just like the SETcc instructions). If the flag settings match, control transfers to the target location. If the match fails, the CPU ignores the conditional jump and execution continues with the next instruction. Some conditional jump instructions simply test the setting of the sign, carry, overflow, and zero flags. For example, after the execution of a SHL instruction, you could test the carry flag to determine if the SHL shifted a one out of the H.O. bit of its operand. Likewise, you could test the zero flag after a TEST instruction to see if any specified bits were one. Most of the time, however, you will probably execute a conditional jump after a CMP instruction. The CMP instruction sets the flags so that you can test for less than, greater than, equality, etc.
The conditional JMP instructions take the following form:
Jcc label;
The "cc" in Jcc indicates that you must substitute some character sequence that specifies the type of condition to test. These are the same characters the SETcc instruction uses. For example, "JS" stands for "jump if the sign flag is set." A typical JS instruction looks like this:

    js ValueIsNegative;

In this example, the JS instruction transfers control to the ValueIsNegative statement label if the sign flag is currently set; control falls through to the next instruction following the JS instruction if the sign flag is clear.
Unlike the unconditional JMP instruction, the conditional jump instructions do not provide an indirect form. The only form they allow is a branch to a statement label in your program. Conditional jump instructions have a restriction that the target label must be within 32,768 bytes of the jump instruction. However, since this generally corresponds to somewhere between 8,000 and 32,000 machine instructions, it is unlikely you will ever encounter this restriction.
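As a small illustration of jumps that test individual flags, the following sketch checks the carry flag after a SHL and the zero flag after a TEST, the two cases mentioned earlier in this section. The program name, label names, and constants are invented for this example:

```hla
program flagJumps;
#include( "stdlib.hhf" );

begin flagJumps;

    mov( $8000_0000, eax );     // H.O. bit of EAX is set
    shl( 1, eax );              // shifts that bit into the carry flag
    jc CarrySet;                // taken when the carry flag is set

    stdout.put( "SHL shifted out a zero" nl );
    jmp TestBits;

CarrySet:
    stdout.put( "SHL shifted out a one" nl );

TestBits:
    mov( %0000_0100, al );
    test( %0000_0110, al );     // ANDs the operands, sets flags, stores nothing
    jz NoBitsSet;               // zero flag is set only if no tested bit is one

    stdout.put( "At least one tested bit is set" nl );
    jmp Done;

NoBitsSet:
    stdout.put( "None of the tested bits are set" nl );

Done:

end flagJumps;
```

Each conditional jump immediately follows the instruction whose flags it tests; placing other instructions in between risks changing the flags before the jump examines them.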
Note: Intel's documentation defines various synonyms or instruction aliases for many conditional jump instructions. The following tables list all the aliases for a particular instruction. These tables also list out the opposite branches. You'll soon see the purpose of the opposite branches.
One brief comment about the "opposites" column is in order. In many instances you will need to be able to generate the opposite of a specific branch instruction (lots of examples of this appear throughout the remainder of this chapter). With only two exceptions, a very simple rule completely describes how to generate an opposite branch:
- If the second letter of the Jcc instruction is not an "n", insert an "n" after the "j". E.g., JE becomes JNE and JL becomes JNL.
- If the second letter of the Jcc instruction is an "n", then remove that "n" from the instruction. E.g., JNG becomes JG and JNE becomes JE.
The two exceptions to this rule are JPE (jump if parity is even) and JPO (jump if parity is odd). These exceptions cause few problems because (a) you'll hardly ever need to test the parity flag, and (b) you can use the aliases JP and JNP as synonyms for JPE and JPO. The "N/No N" rule applies to JP and JNP.
Though you know that JGE is the opposite of JL, get in the habit of using JNL rather than JGE as the opposite jump instruction for JL. It's too easy in an important situation to start thinking "greater is the opposite of less" and substitute JG instead. You can avoid this confusion by always using the "N/No N" rule.
The 80x86 conditional jump instructions give you the ability to split program flow into one of two paths depending upon some logical condition. Suppose you want to increment the AX register if BX is equal to CX. You can accomplish this with the following code:
        cmp( bx, cx );
        jne SkipStmts;
        inc( ax );
    SkipStmts:

The trick is to use the opposite branch to skip over the instructions you want to execute if the condition is true. Always use the "opposite branch (N/no N)" rule given earlier to select the opposite branch.
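The same technique applies to signed comparisons. The sketch below negates a value only when it is less than zero; because the condition is "less than" (JL), the "N/no N" rule selects JNL as the branch that skips the body. The program name, the variable value, and the label NotNegative are hypothetical names chosen for this example:

```hla
program oppositeBranch;
#include( "stdlib.hhf" );

static
    value: int32 := -5;

begin oppositeBranch;

    // High level intent: if( value < 0 ) then negate value; endif;

    mov( value, eax );
    cmp( eax, 0 );
    jnl NotNegative;        // JNL is the "N/no N" opposite of JL
    neg( eax );             // executes only when value < 0
    mov( eax, value );

NotNegative:
    stdout.put( "Absolute value = ", value, nl );

end oppositeBranch;
```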
You can also use the conditional jump instructions to synthesize loops. For example, the following code sequence reads a sequence of characters from the user and stores each character in successive elements of an array until the user presses the Enter key (carriage return):
        mov( 0, edi );
    RdLnLoop:
        stdin.getc();               // Read a character into the AL register.
        mov( al, Input[ edi ] );    // Store away the character.
        inc( edi );                 // Move on to the next character.
        cmp( al, stdio.cr );        // See if the user pressed Enter.
        jne RdLnLoop;

For more information concerning the use of the conditional jumps to synthesize IF statements, loops, and other control structures, see "Implementing Common Control Structures in Assembly Language" on page 629.
Like the SETcc instructions, the conditional jump instructions come in two basic categories: those that test specific processor flags (e.g., JZ, JC, JNO) and those that test some condition (less than, greater than, etc.). When testing a condition, the conditional jump instructions almost always follow a CMP instruction. The CMP instruction sets the flags so you can use a JA, JAE, JB, JBE, JE, or JNE instruction to test for unsigned greater than, greater than or equal, less than, less than or equal, equality, or inequality, respectively. Simultaneously, the CMP instruction sets the flags so you can also do a signed comparison using the JL, JLE, JE, JNE, JG, and JGE instructions.
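To see why the choice between the unsigned and signed jumps matters, consider the following sketch, which compares the same two register values both ways. The program and label names are invented for illustration, and the code assumes that stdout.put preserves EAX and EBX, as HLA Standard Library routines generally do:

```hla
program signedVsUnsigned;
#include( "stdlib.hhf" );

begin signedVsUnsigned;

    mov( $FFFF_FFFF, eax );     // 4,294,967,295 if unsigned, -1 if signed
    mov( 1, ebx );

    cmp( eax, ebx );
    ja UnsignedAbove;           // unsigned test: is EAX above EBX?
    stdout.put( "Unsigned: EAX is not above EBX" nl );
    jmp SignedTest;

UnsignedAbove:
    stdout.put( "Unsigned: EAX is above EBX" nl );

SignedTest:
    cmp( eax, ebx );            // repeated because stdout.put may alter the flags
    jl SignedLess;              // signed test: is EAX less than EBX?
    stdout.put( "Signed: EAX is not less than EBX" nl );
    jmp Done;

SignedLess:
    stdout.put( "Signed: EAX is less than EBX" nl );

Done:

end signedVsUnsigned;
```

The CMP instruction is identical in both tests; only the conditional jump that follows it decides whether the operands are treated as unsigned or signed values.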
The conditional jump instructions only test flags; they do not affect any of the 80x86 flags.
2.6 "Medium-Level" Control Structures: JT and JF
HLA provides two special conditional jump instructions: JT (jump if true) and JF (jump if false). These instructions take the following syntax:
    jt( boolean_expression ) target_label;
    jf( boolean_expression ) target_label;

The boolean_expression is the standard HLA boolean expression allowed by IF..ENDIF and other HLA high level language statements. These instructions evaluate the boolean expression and jump to the specified label if the expression evaluates true (JT) or false (JF).
These are not real 80x86 instructions. HLA compiles them into a sequence of one or more 80x86 machine instructions that achieve the same result. In general, you should not use these two instructions in your main code; they offer few benefits over using an IF..ENDIF statement and they are no more readable than the pure assembly language sequences they compile into. HLA provides these "medium-level" instructions so that you may create your own high level control structures using macros (see the chapters on Macros, the HLA Run-Time Language, and Domain Specific Languages for more details).
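Purely to illustrate the syntax, here is a minimal sketch of JT and JF in a short program. The program name, labels, and constants are hypothetical, and the code assumes stdout.put leaves EAX unchanged between the two tests:

```hla
program jtjfDemo;
#include( "stdlib.hhf" );

begin jtjfDemo;

    mov( 7, eax );

    jf( eax > 5 ) NotGreater;   // skips the next statement when eax <= 5
    stdout.put( "EAX is greater than 5" nl );

NotGreater:
    jt( eax = 7 ) IsSeven;      // jumps when the expression is true
    stdout.put( "EAX is not 7" nl );
    jmp Done;

IsSeven:
    stdout.put( "EAX is 7" nl );

Done:

end jtjfDemo;
```

The JF line does the work of a CMP followed by the opposite conditional jump, which is why these instructions are convenient inside macros that build higher level control structures.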
[1] Unlike high level languages, where your instructors usually forbid you to use GOTO statements, you will find that the use of the JMP instruction in assembly language is absolutely essential.