                mov     ah, 3dh         ;Open the file
                mov     al, 0           ;Open for reading
                lea     dx, Filename    ;Presume DS points at filename
                int     21h             ; segment
                jc      BadOpen
                mov     FHndl, ax       ;Save file handle

LP:             mov     ah, 3fh         ;Read data from the file
                lea     dx, Buffer      ;Address of data buffer
                mov     cx, 1           ;Read one byte
                mov     bx, FHndl       ;Get file handle value
                int     21h
                jc      ReadError
                cmp     ax, cx          ;EOF reached?
                jne     EOF
                mov     al, Buffer      ;Get character read
                putc                    ;Print it (IOSHELL call)
                jmp     LP              ;Read next byte

EOF:            mov     bx, FHndl
                mov     ah, 3eh         ;Close file
                int     21h
                jc      CloseError
There isn't much to this program at all. Now consider the same example rewritten to use blocked I/O:
Example: This example opens a file and reads it to the EOF using blocked I/O
                mov     ah, 3dh         ;Open the file
                mov     al, 0           ;Open for reading
                lea     dx, Filename    ;Presume DS points at filename
                int     21h             ; segment
                jc      BadOpen
                mov     FHndl, ax       ;Save file handle

LP:             mov     ah, 3fh         ;Read data from the file
                lea     dx, Buffer      ;Address of data buffer
                mov     cx, 256         ;Read 256 bytes
                mov     bx, FHndl       ;Get file handle value
                int     21h
                jc      ReadError
                cmp     ax, cx          ;EOF reached?
                jne     EOF
                mov     si, 0           ;Note: CX=256 at this point.
PrtLp:          mov     al, Buffer[si]  ;Get character read
                putc                    ;Print it
                inc     si
                loop    PrtLp
                jmp     LP              ;Read next block

; Note, just because the number of bytes read doesn't equal 256,
; don't get the idea we're through, there could be up to 255 bytes
; in the buffer still waiting to be processed.

EOF:            mov     cx, ax
                jcxz    EOF2            ;If CX is zero, we're really done.
                mov     si, 0           ;Process the last block of data read
Finis:          mov     al, Buffer[si]  ; from the file which contains
                putc                    ; 1..255 bytes of valid data.
                inc     si
                loop    Finis

EOF2:           mov     bx, FHndl
                mov     ah, 3eh         ;Close file
                int     21h
                jc      CloseError
This example demonstrates one major hassle with blocked I/O: when you reach the end of file, you haven't necessarily processed all of the data in the file. If the block size is 256 and there are 255 bytes left in the file, DOS will return an EOF condition (the number of bytes read doesn't match the request). In this case, we've still got to process the characters that were read. The code above does this in a rather straightforward manner, using a second loop to finish up when the EOF is reached. You've probably noticed that the two print loops are virtually identical. This program can be reduced in size somewhat using the following code, which is only a little more complex:
Example: This example opens a file and reads it to the EOF using blocked I/O
                mov     ah, 3dh         ;Open the file
                mov     al, 0           ;Open for reading
                lea     dx, Filename    ;Presume DS points at filename
                int     21h             ; segment.
                jc      BadOpen
                mov     FHndl, ax       ;Save file handle

LP:             mov     ah, 3fh         ;Read data from the file
                lea     dx, Buffer      ;Address of data buffer
                mov     cx, 256         ;Read 256 bytes
                mov     bx, FHndl       ;Get file handle value
                int     21h
                jc      ReadError
                mov     bx, ax          ;Save byte count for later
                mov     cx, ax
                jcxz    EOF             ;Nothing read, so we're done
                mov     si, 0           ;Note: CX=bytes read (1..256) here.
PrtLp:          mov     al, Buffer[si]  ;Get character read
                putc                    ;Print it
                inc     si
                loop    PrtLp
                cmp     bx, 256         ;Reach EOF yet?
                je      LP              ;Full block read, go get another

EOF:            mov     bx, FHndl
                mov     ah, 3eh         ;Close file
                int     21h
                jc      CloseError
Blocked I/O works best on sequential files. That is, those files opened only for reading or writing (no seeking). When dealing with random access files, you should read or write whole records at one time using the DOS read/write commands to process the whole record. This is still considerably faster than manipulating the data one byte at a time.
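As a rough sketch of reading one whole record from a random access file, the fragment below first positions the file pointer with DOS function 42h (move file pointer) and then reads the record with a single call to function 3Fh. The names RecNum, RecSize, RecBuf, SeekError, and ShortRecord are hypothetical and exist only for this illustration; FHndl and ReadError are used as in the examples above.

```asm
; Assumptions: RecSize is a hypothetical equate giving the record size
; in bytes, RecNum is a hypothetical word variable holding a zero-based
; record number, and RecBuf is a hypothetical buffer of RecSize bytes.

                mov     ax, RecNum      ;Record number
                mov     dx, RecSize     ;Bytes per record
                mul     dx              ;DX:AX = byte offset of the record
                mov     cx, dx          ;DOS wants the offset in CX:DX
                mov     dx, ax
                mov     bx, FHndl       ;Get file handle value
                mov     ax, 4200h       ;AH=42h (seek), AL=0 (from file start)
                int     21h
                jc      SeekError

                mov     ah, 3fh         ;Read the whole record in one call
                mov     bx, FHndl
                mov     cx, RecSize
                lea     dx, RecBuf
                int     21h
                jc      ReadError
                cmp     ax, cx          ;Did we get a complete record?
                jne     ShortRecord
```

One DOS call per record keeps the overhead far below that of a call per byte, even though the file pointer has to be moved before each read.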
Offset  Length  Description
------  ------  -----------------------------------------
0       2       An INT 20h instruction is stored here
2       2       Program ending address
4       1       Unused, reserved by DOS
5       5       Call to DOS function dispatcher
0Ah     4       Address of program termination code
0Eh     4       Address of break handler routine
12h     4       Address of critical error handler routine
16h     22      Reserved for use by DOS
2Ch     2       Segment address of environment area
2Eh     34      Reserved by DOS
50h     3       INT 21h, RETF instructions
53h     9       Reserved by DOS
5Ch     16      Default FCB #1
6Ch     20      Default FCB #2
80h     1       Length of command line string
81h     127     Command line string
Note: locations 80h..FFh are used for the default DTA.
Most of the information in the PSP is of little use to a modern MS-DOS assembly language program. Buried in the PSP, however, are a couple of gems that are worth knowing about. For completeness, we'll take a look at all of the fields in the PSP.
The first field in the PSP contains an int 20h instruction. Int 20h is an obsolete mechanism used to terminate program execution. Back in the early days of DOS v1.0, your program would execute a jmp to this location in order to terminate. Nowadays, of course, we have DOS function 4Ch which is much easier (and safer) than jumping to location zero in the PSP. Therefore, this field is obsolete.
Field number two contains a segment value which points at the last paragraph allocated to your program. By subtracting the segment address of the PSP from this value, you can determine the amount of memory allocated to your program, measured in 16-byte paragraphs (and quit if there is insufficient memory available).
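A minimal sketch of that subtraction follows. It assumes DS points at your data segment and that the PSP segment address was saved in a word variable named PSP, as in the startup code later in this section. MinParas and NoMemory are hypothetical names for the program's own memory requirement and its error handler.

```asm
                mov     es, PSP                 ;ES points at the PSP
                mov     ax, word ptr es:[2]     ;Field two: ending segment address
                sub     ax, PSP                 ;AX = size in 16-byte paragraphs
                cmp     ax, MinParas            ;MinParas: hypothetical minimum
                jb      NoMemory                ;NoMemory: hypothetical handler
```

Both values are segment addresses, so the difference counts paragraphs rather than bytes.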
The third field is the first of many "holes" left in the PSP by Microsoft. Why they're here is anyone's guess.
The fourth field is a call to the DOS function dispatcher. The purpose of this (now obsolete) DOS calling mechanism was to allow some additional compatibility with CP/M-80 programs. For modern DOS programs, there is absolutely no need to worry about this field.
The next three fields are used to store special addresses during the execution of a program. These fields contain the default terminate vector, break vector, and critical error handler vector. These are the values normally stored in the interrupt vectors for int 22h, int 23h, and int 24h. Because DOS stores a copy of these vector values in the PSP, you can change the vectors so that they point into your own code. When your program terminates, DOS restores those three vectors from these three fields in the PSP. For more details on these interrupt vectors, please consult the DOS technical reference manual.
The eighth field in the PSP record is another reserved field, currently unavailable for use by your programs.
The ninth field is another real gem. It's the address of the environment strings area. This is a two-byte pointer which contains the segment address of the environment storage area. The environment strings always begin at offset zero within this segment. The environment string area consists of a sequence of zero-terminated strings. It uses the following format:
string1 0 string2 0 string3 0 ... 0 stringn 0 0
That is, the environment area consists of a list of zero terminated strings, the list itself being terminated by a string of length zero (i.e., a zero all by itself, or two zeros in a row, however you want to look at it). Strings are (usually) placed in the environment area via DOS commands like PATH, SET, etc. Generally, a string in the environment area takes the form
name = parameters
For example, the "SET IPATH=C:\ASSEMBLY\INCLUDE" command copies the string "IPATH=C:\ASSEMBLY\INCLUDE" into the environment string storage area.
Many languages scan the environment storage area to find default filename paths and other pieces of default information set up by DOS. Your programs can take advantage of this as well.
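The following fragment is a sketch of how a program might walk the environment area and print each string on its own line. It assumes DS points at your data segment, that the PSP segment address is held in a word variable named PSP (as in the startup code later in this section), and that putc is the same character output routine used in the file examples above. The labels are arbitrary.

```asm
                mov     es, PSP                 ;ES points at the PSP
                mov     es, word ptr es:[2Ch]   ;ES = environment segment
                xor     si, si                  ;Strings begin at offset zero

NextStr:        cmp     byte ptr es:[si], 0     ;Zero-length string ends the list
                je      EnvDone
PrtChr:         mov     al, es:[si]             ;Get next character
                inc     si                      ;Step past it (and past the zero)
                cmp     al, 0                   ;End of this string?
                je      EndStr
                putc                            ;Print the character
                jmp     PrtChr

EndStr:         mov     al, 0Dh                 ;Finish the line with CR/LF
                putc
                mov     al, 0Ah
                putc
                jmp     NextStr

EnvDone:
```

A program looking for one particular name would compare each string against that name up to the equal sign instead of printing it.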
The next field in the PSP is another block of reserved storage, currently undefined by DOS.
The 11th field in the PSP is another call to the DOS function dispatcher. Why this call exists (when the one at location 5 in the PSP already exists and nobody really uses either mechanism to call DOS) is an interesting question. In general, this field should be ignored by your programs.
The 12th field is another block of unused bytes in the PSP which should be ignored.
The 13th and 14th fields in the PSP are the default FCBs (File Control Blocks). File control blocks are another archaic data structure carried over from CP/M-80. FCBs are used only with the obsolete DOS v1.0 file handling routines, so they are of little interest to us. We'll ignore these FCBs in the PSP.
Locations 80h through the end of the PSP contain a very important piece of information: the command line parameters typed on the DOS command line along with your program's name. If the following is typed on the DOS command line:
MYPGM parameter1, parameter2
the following is stored into the command line parameter field:
23, " parameter1, parameter2", 0Dh
Location 80h contains 23 (decimal), the length of the parameters following the program name. Locations 81h through 97h contain the characters making up the parameter string. Location 98h contains a carriage return. Notice that the carriage return character is not figured into the length of the command line string.
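As a small illustration of this layout, the sketch below prints the parameter string using the length byte at location 80h. It assumes DS points at your data segment, that the PSP segment address is held in a word variable named PSP (as in the startup code later in this section), and that putc is the character output routine used in the file examples above. The labels are arbitrary.

```asm
                mov     es, PSP                 ;ES points at the PSP
                mov     cl, es:[80h]            ;Length of the parameter string
                mov     ch, 0                   ;Zero-extend the length into CX
                jcxz    NoParms                 ;Nothing was typed after the name
                mov     si, 81h                 ;First character of the string
CmdLp:          mov     al, es:[si]             ;Get next character
                putc                            ;Print it
                inc     si
                loop    CmdLp
NoParms:
```

Because the loop is driven by the length byte, it stops before the trailing carriage return.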
Processing the command line string is such an important facet of assembly language programming that this process will be discussed in detail in the next section.
Locations 80h..FFh in the PSP also comprise the default DTA. Therefore, if you don't use DOS function 1Ah to change the DTA and you execute a FIND FIRST FILE, the filename information will be stored starting at location 80h in the PSP.
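If the command line must survive a directory search, one approach is to move the DTA before searching. The sketch below assumes DS points at your data segment. MyDTA (a buffer of at least 128 bytes), FileSpec (a zero-terminated file specification), and NoFiles are hypothetical names used only for this illustration.

```asm
                mov     ah, 1Ah         ;Set DTA address
                lea     dx, MyDTA       ;DS:DX points at the new DTA (hypothetical)
                int     21h

                mov     ah, 4Eh         ;Find first file
                mov     cx, 0           ;Search for normal files only
                lea     dx, FileSpec    ;DS:DX points at the pattern (hypothetical)
                int     21h
                jc      NoFiles         ;Carry set: no match or an error

; On success, the zero-terminated filename found by DOS starts at
; offset 1Eh within MyDTA. Locations 80h..FFh in the PSP are untouched.
```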
One important detail we've omitted until now is exactly how you access data in the PSP. Although the PSP is loaded into memory immediately before your program, that doesn't necessarily mean that it appears 100h bytes before your code. Your data segments may have been loaded into memory before your code segments, thereby invalidating this method of locating the PSP. The segment address of the PSP is passed to your program in the ds register. To store the PSP address away in your data segment, your programs should begin with the following code:
                push    ds              ;Save PSP value
                mov     ax, seg DSEG    ;Point DS and ES at our data
                mov     ds, ax          ; segment.
                mov     es, ax
                pop     PSP             ;Store PSP value into "PSP"
                                        ; variable.
                 .
                 .
                 .
Another way to obtain the PSP address, in DOS 5.0 and later, is to make a DOS call. If you load ah with 51h and execute an int 21h instruction, MS-DOS will return the segment address of the current PSP in the bx register.
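In code, that call might look like the following sketch, which assumes DS already points at your data segment and that PSP is a word variable declared there:

```asm
                mov     ah, 51h         ;Get the current PSP address
                int     21h
                mov     PSP, bx         ;BX = segment address of the PSP
```

Unlike the push ds / pop PSP sequence, this call does not have to be made at the very start of the program.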
There are lots of tricky things you can do with the data in the PSP. Peter Norton's Programmer's Guide to the IBM PC lists all kinds of tricks. Such operations won't be discussed here because they're a little beyond the scope of this manual.