        if (x = y) then
                if (I >= J) then writeln('At point 1')
                else writeln('At point 2')
        else write('Error condition');
To convert this nested if..then..else to assembly language, start with the outermost if, convert it to assembly, then work on the innermost if:
; if (x = y) then

                mov     ax, X
                cmp     ax, Y
                jne     Else0

;     Put innermost IF here

                jmp     IfDone0

; Else write('Error condition');

Else0:          print
                byte    "Error condition",0

IfDone0:
As you can see, the above code handles the "if (X=Y)..." instruction, leaving a spot for the second if. Now add in the second if as follows:
; if (x = y) then

                mov     ax, X
                cmp     ax, Y
                jne     Else0

;     IF ( I >= J) then writeln('At point 1')

                mov     ax, I
                cmp     ax, J
                jnge    Else1
                print
                byte    "At point 1",cr,lf,0
                jmp     IfDone1

;     Else writeln ('At point 2');

Else1:          print
                byte    "At point 2",cr,lf,0
IfDone1:

                jmp     IfDone0

; Else write('Error condition');

Else0:          print
                byte    "Error condition",0

IfDone0:
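To check the label-and-jump structure without an assembler, the same control flow can be sketched in C using goto. This is only an illustrative model, not part of the original listing: the function name nested_if is hypothetical, and it returns the message rather than printing it so the result is easy to inspect. The labels match those in the assembly code above.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper (not from the text): returns the message that the
   nested if would print, following the same labels as the assembly code. */
const char *nested_if(int x, int y, int i, int j)
{
    const char *msg = NULL;

    if (x != y) goto Else0;         /* cmp ax, Y / jne Else0   */

    /* Innermost if: one entry here, one exit at IfDone1. */
    if (!(i >= j)) goto Else1;      /* cmp ax, J / jnge Else1  */
    msg = "At point 1";
    goto IfDone1;
Else1:
    msg = "At point 2";
IfDone1:
    goto IfDone0;                   /* skip the outer else     */

Else0:
    msg = "Error condition";
IfDone0:
    assert(msg != NULL);            /* every path assigns a message */
    return msg;
}
```

Calling nested_if(1, 1, 5, 3) follows the "At point 1" path, nested_if(1, 1, 2, 3) follows the "At point 2" path, and any call with x different from y lands on Else0.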
The nested if appears in italics above just to help it stand out.
There is an obvious optimization which you do not really want to make until speed becomes a real problem. Note in the innermost if statement above that the jmp IfDone1 instruction simply jumps to a jmp instruction which transfers control to IfDone0. It is very tempting to replace the first jmp by one which jumps directly to IfDone0. Indeed, when you go in and optimize your code, this would be a good optimization to make. However, you shouldn't make such optimizations to your code unless you really need the speed. Doing so makes your code harder to read and understand. Remember, we would like all our control structures to have one entry and one exit. Changing this jump as described would give the innermost if statement two exit points.
The for loop is another commonly nested control structure. Once again, the key to building up nested structures is to construct the outside object first and fill in the inner members afterwards. As an example, consider the following nested for loops which add the elements of a pair of two dimensional arrays together:
        for i := 0 to 7 do
                for j := 0 to 7 do
                        A [i,j] := B [i,j] + C [i,j];
As before, begin by constructing the outermost loop first. This code assumes that dx will be the loop control variable for the outermost loop (that is, dx is equivalent to "i"):
; for dx := 0 to 7 do

                mov     dx, 0
ForLp0:         cmp     dx, 7
                jnle    EndFor0

;     Put innermost FOR loop here

                inc     dx
                jmp     ForLp0
EndFor0:
Now add the code for the nested for loop. Note the use of the cx register for the loop control variable on the innermost for loop of this code.
; for dx := 0 to 7 do

                mov     dx, 0
ForLp0:         cmp     dx, 7
                jnle    EndFor0

;     for cx := 0 to 7 do

                mov     cx, 0
ForLp1:         cmp     cx, 7
                jnle    EndFor1

;          Put code for A[dx,cx] := b[dx,cx] + C [dx,cx] here

                inc     cx
                jmp     ForLp1
EndFor1:

                inc     dx
                jmp     ForLp0
EndFor0:
Once again the innermost for loop is in italics in the above code to make it stand out. The final step is to add the code which performs the actual computation.
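As a rough model of that final step, the loop skeleton can be written in C with goto, keeping the same labels and using dx and cx as the loop variables. The sketch assumes the 8x8 arrays are stored row-major in flat arrays, so element [dx,cx] lives at index dx*8 + cx. That layout, the COLS constant, and the function name add_arrays are assumptions made for illustration, not something the text specifies.

```c
#include <assert.h>

#define COLS 8   /* assumed row length: indices 0..7 in each dimension */

/* Hypothetical sketch: a, b and c each point to 64 ints in row-major order. */
void add_arrays(int *a, const int *b, const int *c)
{
    int dx, cx, idx;

    assert(a && b && c);

    dx = 0;                             /* mov dx, 0                 */
ForLp0:
    if (dx > 7) goto EndFor0;           /* cmp dx, 7 / jnle EndFor0  */

    cx = 0;                             /* mov cx, 0                 */
ForLp1:
    if (cx > 7) goto EndFor1;           /* cmp cx, 7 / jnle EndFor1  */

    idx = dx * COLS + cx;               /* row-major element index   */
    a[idx] = b[idx] + c[idx];           /* A[dx,cx] := B[dx,cx] + C[dx,cx] */

    cx++;                               /* inc cx / jmp ForLp1       */
    goto ForLp1;
EndFor1:

    dx++;                               /* inc dx / jmp ForLp0       */
    goto ForLp0;
EndFor0:
    return;
}
```

In assembly the index computation would also have to scale by the element size (two bytes for word arrays) before it could be used as an offset.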
for i := 1 to 10000 do ;
In assembly, you might see a comparable loop:
                mov     cx, 8000h
DelayLp:        loop    DelayLp
By carefully choosing the number of iterations, you can obtain a relatively accurate delay interval. There is, however, one catch. That relatively accurate delay interval is only going to be accurate on your machine. If you move your program to a different machine with a different CPU, clock speed, number of wait states, different sized cache, or half a dozen other features, you will find that your delay loop takes a completely different amount of time. Since there is better than a hundred to one difference in speed between the high end and low end PCs today, it should come as no surprise that the loop above will execute 100 times faster on some machines than on others.
The fact that one CPU runs 100 times faster than another does not reduce the need to have a delay loop which executes some fixed amount of time. Indeed, it makes the problem that much more important. Fortunately, the PC provides a hardware based timer which operates at the same speed regardless of the CPU speed. This timer maintains the time of day for the operating system, so it's very important that it run at the same speed whether you're on an 8088 or a Pentium. In the chapter on interrupts you will learn to actually patch into this device to perform various tasks. For now, we will simply take advantage of the fact that this timer chip forces the CPU to increment a 32-bit memory location (40:6ch) about 18.2 times per second. By looking at this variable we can determine the speed of the CPU and adjust the count value for an empty loop accordingly.
The basic idea of the following code is to watch the BIOS timer variable until it changes. Once it changes, start counting the number of iterations through some sort of loop until the BIOS timer variable changes again. Having noted the number of iterations, if you execute a similar loop the same number of times it should require about 1/18.2 seconds to execute.
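The calibration idea can be modeled in C without touching real hardware. In the sketch below, the timer is replaced by a fake tick source that advances once every fixed number of reads. Everything here (tick_fn, fake_timer, fake_tick, calibrate) is a hypothetical stand-in for illustration; a real DOS program would read the BIOS variable instead.

```c
#include <assert.h>

/* Hypothetical tick source: stands in for reading the BIOS timer variable. */
typedef unsigned (*tick_fn)(void *state);

/* Fake timer for demonstration: the tick advances once every `period` reads. */
struct fake_timer { unsigned reads; unsigned period; };

unsigned fake_tick(void *state)
{
    struct fake_timer *t = (struct fake_timer *)state;
    assert(t->period != 0);
    return t->reads++ / t->period;
}

/* Returns how many loop iterations fit into one full tick period. */
unsigned long calibrate(tick_fn read_tick, void *state)
{
    unsigned start, now;
    unsigned long count = 0;

    assert(read_tick);

    /* Step 1: wait for the tick to change, since we may be starting
       in the middle of a tick period. */
    start = read_tick(state);
    do { now = read_tick(state); } while (now == start);

    /* Step 2: count iterations until the tick changes again. */
    start = now;
    do { count++; now = read_tick(state); } while (now == start);

    return count;
}
```

With a fake timer whose period is 100 reads, calibrate returns 100 whether the timer starts at a tick boundary or partway through a period, which is the reason for the initial wait.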
The following program demonstrates how to create such a Delay routine:
                .xlist
                include         stdlib.a
                includelib      stdlib.lib
                .list

; PPI_B is the I/O address of the keyboard/speaker control
; port. This program accesses it simply to introduce a
; large number of wait states on faster machines. Since the
; PPI (Programmable Peripheral Interface) chip runs at about
; the same speed on all PCs, accessing this chip slows most
; machines down to within a factor of two of the slower
; machines.

PPI_B           equ     61h

; RTC is the address of the BIOS timer variable (40:6ch).
; The BIOS timer interrupt code increments this 32-bit
; location about every 55 ms (1/18.2 seconds). The code
; which initializes everything for the Delay routine
; reads this location to determine when 1/18th seconds
; have passed.

RTC             textequ <es:[6ch]>

dseg            segment para public 'data'

; TimedValue contains the number of iterations the delay
; loop must repeat in order to waste 1/18.2 seconds.

TimedValue      word    0

; RTC2 is a dummy variable used by the Delay routine to
; simulate accessing a BIOS variable.

RTC2            word    0

dseg            ends

cseg            segment para public 'code'
                assume  cs:cseg, ds:dseg

; Main program which tests out the DELAY subroutine.

Main            proc
                mov     ax, dseg
                mov     ds, ax

                print
                byte    "Delay test routine",cr,lf,0

; Okay, let's see how long it takes to count down 1/18th
; of a second. First, point ES as segment 40h in memory.
; The BIOS variables are all in segment 40h.
;
; This code begins by reading the memory timer variable
; and waiting until it changes. Once it changes we can
; begin timing until the next change occurs. That will
; give us 1/18.2 seconds. We cannot start timing right
; away because we might be in the middle of a 1/18.2
; second period.

                mov     ax, 40h
                mov     es, ax
                mov     ax, RTC
RTCMustChange:  cmp     ax, RTC
                je      RTCMustChange

; Okay, begin timing the number of iterations it takes
; for an 18th of a second to pass. Note that this
; code must be very similar to the code in the Delay
; routine.

                mov     cx, 0
                mov     si, RTC
                mov     dx, PPI_B
TimeRTC:        mov     bx, 10
DelayLp:        in      al, dx
                dec     bx
                jne     DelayLp
                cmp     si, RTC
                loope   TimeRTC

                neg     cx                      ;CX counted down!
                mov     TimedValue, cx          ;Save away

                mov     ax, ds
                mov     es, ax

                printf
                byte    "TimedValue = %d",cr,lf
                byte    "Press any key to continue",cr,lf
                byte    "This will begin a delay of five "
                byte    "seconds",cr,lf,0
                dword   TimedValue

                getc

                mov     cx, 90
DelayIt:        call    Delay18
                loop    DelayIt

Quit:           ExitPgm                         ;DOS macro to quit program.
Main            endp

; Delay18-This routine delays for approximately 1/18th sec.
; Presumably, the variable "TimedValue" in DS has
; been initialized with an appropriate count down
; value before calling this code.

Delay18         proc    near
                push    ds
                push    es
                push    ax
                push    bx
                push    cx
                push    dx
                push    si

                mov     ax, dseg
                mov     es, ax
                mov     ds, ax

; The following code contains two loops. The inside
; nested loop repeats 10 times. The outside loop
; repeats the number of times determined to waste
; 1/18.2 seconds. This loop accesses the hardware
; port "PPI_B" in order to introduce many wait states
; on the faster processors. This helps even out the
; timings on very fast machines by slowing them down.
; Note that accessing PPI_B is only done to introduce
; these wait states, the data read is of no interest
; to this code.
;
; Note the similarity of this code to the code in the
; main program which initializes the TimedValue variable.

                mov     cx, TimedValue
                mov     si, es:RTC2
                mov     dx, PPI_B

TimeRTC:        mov     bx, 10
DelayLp:        in      al, dx
                dec     bx
                jne     DelayLp
                cmp     si, es:RTC2
                loope   TimeRTC

                pop     si
                pop     dx
                pop     cx
                pop     bx
                pop     ax
                pop     es
                pop     ds
                ret
Delay18         endp

cseg            ends

sseg            segment para stack 'stack'
stk             word    1024 dup (0)
sseg            ends
                end     Main
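The main program calls Delay18 90 times to approximate a five second delay. The arithmetic behind such a count can be sketched in C as follows, using the 18.2 ticks per second figure from the text. The function name delay_calls is hypothetical, and the rounding choice is an assumption.

```c
#include <assert.h>

/* Hypothetical helper: number of 1/18.2-second delays needed to cover
   the requested time, rounded to the nearest whole call. */
unsigned delay_calls(unsigned long milliseconds)
{
    /* 18.2 ticks per second is 182 ticks per 10,000 ms.  Adding half
       the divisor (5,000) before dividing rounds to nearest. */
    unsigned long calls = (milliseconds * 182UL + 5000UL) / 10000UL;

    assert(calls <= 0xFFFFUL);      /* must fit a 16-bit loop counter like CX */
    return (unsigned)calls;
}
```

For 5000 ms this yields 91 calls, so the count of 90 used in the program is within about one percent of five seconds. For 1000 ms it yields 18.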