10.8 Writing Class Methods, Iterators, and Procedures
For each class procedure, method, and iterator prototype appearing in a class definition, there must be a corresponding procedure, method, or iterator appearing within the program (for the sake of brevity, this section will use the term routine to mean procedure, method, or iterator from this point forward). If the prototype does not contain the EXTERNAL option, then the code must appear in the same compilation unit as the class declaration. If the EXTERNAL option does follow the prototype, then the code may appear in the same compilation unit or a different compilation unit (as long as you link the resulting object file with the code containing the class declaration). Like external (non-class) procedures and iterators, if you fail to provide the code the linker will complain when you attempt to create an executable file. To reduce the size of the following examples, they will all define their routines in the same source file as the class declaration.
HLA class routines must always follow the class declaration in a compilation unit. If you are compiling your routines in a separate unit, the class declaration must still precede the routines' code in that unit (usually via an #INCLUDE file). If you haven't defined the class by the time you define a routine like point.distance, HLA doesn't know that point is a class and, therefore, doesn't know how to handle the routine's definition.
Consider the following declarations for a point2D class:
    type
        point2D: class

            const
                UnitDistance: real32 := 1.0;

            var
                x: real32;
                y: real32;

            static
                LastDistance: real32;

            method distance( fromX: real32; fromY: real32 ); returns( "st0" );
            procedure InitLastDistance;

        endclass;

The distance function for this class should compute the distance from the object's point to (fromX, fromY). The following formula describes this computation:

$$\text{distance} = \sqrt{(x - \text{fromX})^2 + (y - \text{fromY})^2}$$
A first pass at writing the distance method might produce the following code:
    method point2D.distance( fromX:real32; fromY:real32 ); nodisplay;
    begin distance;

        fld( x );           // Note: this doesn't work!
        fld( fromX );       // Compute (x-fromX)
        fsub();
        fld( st0 );         // Duplicate value on TOS.
        fmul();             // Compute square of difference.

        fld( y );           // This doesn't work either.
        fld( fromY );       // Compute (y-fromY)
        fsub();
        fld( st0 );         // Compute the square of the difference.
        fmul();

        fadd();             // Sum the two squares.
        fsqrt();

    end distance;

This code probably looks like it should work to someone who is familiar with an object-oriented programming language like C++ or Delphi. However, as the comments indicate, the instructions that push the x and y variables onto the FPU stack don't work - HLA doesn't automatically define the symbols associated with the data fields of a class within that class' routines.
To learn how to access the data fields of a class within that class' routines, we need to back up a moment and discover some very important implementation details concerning HLA's classes. To do this, consider the following variable declarations:
    var
        Origin:    point2D;
        PtInSpace: point2D;

Remember, whenever you create two objects like Origin and PtInSpace, HLA reserves storage for the x and y data fields for both of these objects. However, there is only one copy of the point2D.distance method in memory. Therefore, were you to call Origin.distance and PtInSpace.distance, the system would call the same routine for both method invocations. Once inside that method, one has to wonder what an instruction like "fld( x );" would do. How does it associate x with Origin.x or PtInSpace.x? Worse still, how would this code differentiate between the data field x and a global object x? In HLA, the answer is "it doesn't." You do not specify the data field names within a class routine by simply using their names as though they were common variables.
To differentiate Origin.x from PtInSpace.x within class routines, HLA automatically passes a pointer to an object's data fields whenever you call a class routine. Therefore, you can reference the data fields indirectly off this pointer. HLA passes this object pointer in the ESI register. This is one of the few places where HLA-generated code will modify one of the 80x86 registers behind your back: anytime you call a class routine, HLA automatically loads the ESI register with the object's address. Obviously, you cannot count on ESI's value being preserved across class routine calls, nor can you pass parameters to the class routine in the ESI register (though it is perfectly reasonable to specify "@USE ESI;" to allow HLA to use the ESI register when setting up other parameters). For class methods and iterators (but not procedures), HLA will also load the EDI register with the address of the class' virtual method table (see section 10.9.1, "Virtual Method Tables"). While the virtual method table address isn't as interesting as the object address, keep in mind that HLA-generated code will overwrite any value in the EDI register when you call a method or an iterator. Again, "EDI" is a good choice for the @USE operand for methods since HLA will wipe out the value in EDI anyway.
Upon entry into a class routine, ESI contains a pointer to the (non-static) data fields associated with the class. Therefore, to access fields like x and y (in our point2D example), you could use an address expression like the following:
    (type point2D [esi]).x

Since you use ESI as the base address of the object's data fields, it's a good idea not to disturb ESI's value within the class routines (or, at least, preserve ESI's value if you need to access the object's data fields after some point where you must use ESI for some other purpose). Note that if you call an iterator or a method you do not have to preserve EDI (unless, for some reason, you need access to the virtual method table, which is unlikely).
Accessing the fields of a data object within a class' routines is such a common operation that HLA provides a shorthand notation for casting ESI as a pointer to the class object: THIS. Within a class in HLA, the reserved word THIS automatically expands to a string of the form "(type classname [esi])" substituting, of course, the appropriate class name for classname. Using the THIS keyword, we can (correctly) rewrite the previous distance method as follows:
    method point2D.distance( fromX:real32; fromY:real32 ); nodisplay;
    begin distance;

        fld( this.x );
        fld( fromX );       // Compute (x-fromX)
        fsub();
        fld( st0 );         // Duplicate value on TOS.
        fmul();             // Compute square of difference.

        fld( this.y );
        fld( fromY );       // Compute (y-fromY)
        fsub();
        fld( st0 );         // Compute the square of the difference.
        fmul();

        fadd();             // Sum the two squares.
        fsqrt();

    end distance;

Don't forget that calling a class routine wipes out the value in the ESI register. This isn't obvious from the syntax of the routine's invocation. It is especially easy to overlook when calling one class routine from inside another: the inner call overwrites ESI, and on return from that call ESI no longer points at the original object. Always push and pop ESI (or otherwise preserve ESI's value) in this situation, e.g.,
         .
         .
         .
        fld( this.x );          // ESI points at current object.
         .
         .
         .
        push( esi );            // Preserve ESI across this method call.
        SomeObject.SomeMethod();
        pop( esi );
         .
         .
         .
        lea( ebx, this.x );     // ESI points at original object here.

The THIS keyword provides access to the class variables you declare in the VAR section of a class. You can also use THIS to call other class routines associated with the current object, e.g.,
    this.distance( 5.0, 6.0 );

To access class constants and STATIC data fields you generally do not use the THIS pointer. HLA associates constant and static data fields with the whole class, not a specific object. To access these class members, just use the class name in place of the object name. For example, to access the UnitDistance constant in the point2D class you could use a statement like the following:
    fld( point2D.UnitDistance );

As another example, if you wanted to update the LastDistance field in the point2D class each time you computed a distance, you could rewrite the point2D.distance method as follows:
    method point2D.distance( fromX:real32; fromY:real32 ); nodisplay;
    begin distance;

        fld( this.x );
        fld( fromX );       // Compute (x-fromX)
        fsub();
        fld( st0 );         // Duplicate value on TOS.
        fmul();             // Compute square of difference.

        fld( this.y );
        fld( fromY );       // Compute (y-fromY)
        fsub();
        fld( st0 );         // Compute the square of the difference.
        fmul();

        fadd();             // Sum the two squares.
        fsqrt();

        fst( point2D.LastDistance );    // Update shared (STATIC) field.

    end distance;

To understand why you use the class name when referring to constants and static objects but you use THIS to access VAR objects, check out the next section.
Class procedures are also static objects, so it is possible to call a class procedure by specifying the class name rather than an object name in the procedure invocation, e.g., both of the following are legal:
    Origin.InitLastDistance();
    point2D.InitLastDistance();

There is, however, a subtle difference between these two class procedure calls. The first call above loads ESI with the address of the Origin object prior to actually calling the InitLastDistance procedure. The second call, however, is a direct call to the class procedure without referencing an object; therefore, HLA doesn't know what object address to load into the ESI register. In this case, HLA loads NULL (zero) into ESI prior to calling the InitLastDistance procedure. Because you can call class procedures in this manner, it's always a good idea to check the value in ESI within your class procedures to verify that ESI contains an object address. Checking the value in ESI is a good way to determine which calling mechanism is in use. Later, this chapter will discuss constructors and object initialization; there you will see a good use for static procedures and calling those procedures directly (rather than through the use of an object).
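The ESI check described above might look like the following minimal sketch. It assumes the point2D declaration shown earlier; what InitLastDistance actually does is not specified in this section, so the body here (zeroing LastDistance and, when an object is present, zeroing that object's x and y fields) is purely illustrative.

```hla
procedure point2D.InitLastDistance; nodisplay;
begin InitLastDistance;

    // STATIC fields belong to the class, so this store is safe
    // no matter how the procedure was called.

    fldz();
    fstp( point2D.LastDistance );

    // ESI is NULL (zero) when the call was made through the class name,
    // e.g., point2D.InitLastDistance(). Only touch VAR fields when ESI
    // holds a real object address.

    if( esi <> 0 ) then

        fldz();
        fst( this.x );      // Hypothetical behavior: reset the object's point.
        fstp( this.y );

    endif;

end InitLastDistance;
```

Without the test on ESI, a call made through the class name would dereference a NULL pointer as soon as the procedure touched this.x.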
10.9 Object Implementation
In a high level object-oriented language like C++ or Delphi, it is quite possible to master the use of objects without really understanding how the machine implements them. One of the reasons for learning assembly language programming is to fully comprehend low-level implementation details so one can make educated decisions concerning the use of programming constructs like objects. Further, since assembly language allows you to poke around with data structures at a very low-level, knowing how HLA implements objects can help you create certain algorithms that would not be possible without a detailed knowledge of object implementation. Therefore, this section, and its corresponding subsections, explains the low-level implementation details you will need to know in order to write object-oriented HLA programs.
HLA implements objects in a manner quite similar to records. In particular, HLA allocates storage for all VAR objects in a class in a sequential fashion, just like records. Indeed, if a class consists of only VAR data fields, the memory representation of that class is nearly identical to that of a corresponding RECORD declaration. Consider the Student record declaration taken from Volume Three and the corresponding class:
    type
        student: record
            Name:     char[65];
            Major:    int16;
            SSN:      char[12];
            Midterm1: int16;
            Midterm2: int16;
            Final:    int16;
            Homework: int16;
            Projects: int16;
        endrecord;

        student2: class
            var
                Name:     char[65];
                Major:    int16;
                SSN:      char[12];
                Midterm1: int16;
                Midterm2: int16;
                Final:    int16;
                Homework: int16;
                Projects: int16;
        endclass;
Figure 10.1 Student RECORD Implementation in Memory
Figure 10.2 Student CLASS Implementation in Memory
If you look carefully at these two figures, you'll discover that the only difference between the class and the record implementations is the inclusion of the VMT (virtual method table) pointer field at the beginning of the class object. This field, which is always present in a class, contains the address of the class' virtual method table which, in turn, contains the addresses of all the class' methods and iterators. The VMT field, by the way, is present even if a class doesn't contain any methods or iterators.
As pointed out in previous sections, HLA does not allocate storage for STATIC objects within the object's storage. Instead, HLA allocates a single instance of each static data field that all objects share. As an example, consider the following class and object declarations:
    type
        tHasStatic: class

            var
                i: int32;
                j: int32;
                r: real32;

            static
                c: char[2];
                b: byte;

        endclass;

    var
        hs1: tHasStatic;
        hs2: tHasStatic;

Figure 10.3 shows the storage allocation for these two objects in memory.
Figure 10.3 Object Allocation with Static Data Fields
Of course, CONST, VAL, and #MACRO objects do not have any run-time memory requirements associated with them, so HLA does not allocate any storage for these fields. Like the STATIC data fields, you may access CONST, VAL, and #MACRO fields using the class name as well as an object name. Hence, even if tHasStatic has these types of fields, the memory organization for tHasStatic objects would still be the same as shown in Figure 10.3.
Other than the presence of the virtual method table pointer (VMT), the presence of methods, iterators, and procedures has no impact on the storage allocation of an object. Of course, the machine instructions associated with these routines do appear somewhere in memory. So in a sense the code for the routines is quite similar to static data fields insofar as all the objects share a single instance of the routine.
10.9.1 Virtual Method Tables
When HLA calls a class procedure, it directly calls that procedure using a CALL instruction, just like any normal non-class procedure call. Methods and iterators are another story altogether. Each object in the system carries a pointer to a virtual method table which is an array of pointers to all the methods and iterators appearing within the object's class.
Figure 10.4 Virtual Method Table Organization
Each iterator or method you declare in a class has a corresponding entry in the virtual method table. That dword entry contains the address of the first instruction of that iterator or method. To call a class method or iterator is a bit more work than calling a class procedure (it requires one additional instruction plus the use of the EDI register). Here is a typical calling sequence for a method:
    mov( ObjectAdrs, ESI );         // All class routines do this.
    mov( [esi], edi );              // Get the address of the VMT into EDI
    call( (type dword [edi+n]));    // "n" is the offset of the method's entry
                                    //  in the VMT.

For a given class there is only one copy of the VMT in memory. This is a static object so all objects of a given class type share the same VMT. This is reasonable since all objects of the same class type have exactly the same methods and iterators (see Figure 10.5).
Figure 10.5 All Objects That are the Same Class Type Share the Same VMT
Although HLA builds the VMT record structure as it encounters methods and iterators within a class, HLA does not automatically create the actual run-time virtual method table for you. You must explicitly declare this table in your program. To do this, you include a statement like the following in a STATIC or READONLY declaration section of your program, e.g.,
readonly VMT( classname );
Since the addresses in a virtual method table should never change during program execution, the READONLY section is probably the best choice for declaring VMTs. It should go without saying that changing the pointers in a VMT is, in general, a really bad idea. So putting VMTs in a STATIC section is usually not a good idea.
A declaration like the one above defines the variable classname._VMT_. In section 10.10, "Constructors and Object Initialization," you will see that you need this name when initializing object variables. The class declaration automatically defines the classname._VMT_ symbol as an external static variable. The declaration above just provides the actual definition of this external symbol.
The declaration of a VMT uses a somewhat strange syntax because you aren't actually declaring a new symbol with this declaration; you're simply supplying the data for a symbol that you previously declared implicitly by defining a class. That is, the class declaration defines the static table variable classname._VMT_; all you're doing with the VMT declaration is telling HLA to emit the actual data for the table. If, for some reason, you would like to refer to this table using a name other than classname._VMT_, HLA does allow you to prefix the declaration above with a variable name, e.g.,
readonly myVMT: VMT( classname );
In this declaration, myVMT is an alias of classname._VMT_. As a general rule, you should avoid aliases in a program because they make the program more difficult to read and understand. Therefore, it is unlikely that you would ever really need to use this type of declaration.
Like any other global static variable, there should be only one instance of a VMT for a given class in a program. The best place to put the VMT declaration is in the same source file as the class' method, iterator, and procedure code (assuming they all appear in a single file). This way you will automatically link in the VMT whenever you link in the routines for a given class.
10.9.2 Object Representation with Inheritance
Up to this point, the discussion of the implementation of class objects has ignored the possibility of inheritance. Inheritance only affects the memory representation of an object by adding fields that are not explicitly stated in the class declaration.
Adding inherited fields from a base class to another class must be done carefully. Remember, an important attribute of a class that inherits fields from a base class is that you can use a pointer to the base class to access the inherited fields from that base class in another class. As an example, consider the following classes:
    type
        tBaseClass: class
            var
                i: uns32;
                j: uns32;
                r: real32;

            method mBase;
        endclass;

        tChildClassA: class inherits( tBaseClass );
            var
                c: char;
                b: boolean;
                w: word;

            method mA;
        endclass;

        tChildClassB: class inherits( tBaseClass );
            var
                d: dword;
                c: char;
                a: byte[3];
        endclass;

Since both tChildClassA and tChildClassB inherit the fields of tBaseClass, these two child classes include the i, j, and r fields as well as their own specific fields. Furthermore, whenever you have a pointer variable whose base type is tBaseClass, it is legal to load this pointer with the address of any child class of tBaseClass; therefore, it is perfectly reasonable to load such a pointer with the address of a tChildClassA or tChildClassB variable, e.g.,
    var
        B1:  tBaseClass;
        CA:  tChildClassA;
        CB:  tChildClassB;
        ptr: pointer to tBaseClass;
         .
         .
         .
        lea( ebx, B1 );
        mov( ebx, ptr );
        << Use ptr >>
         .
         .
         .
        lea( eax, CA );
        mov( eax, ptr );
        << Use ptr >>
         .
         .
         .
        lea( eax, CB );
        mov( eax, ptr );
        << Use ptr >>

Since ptr points at an object of type tBaseClass, you may legally (from a semantic sense) access the i, j, and r fields of the object where ptr is pointing. It is not legal to access the c, b, w, or d fields of the tChildClassA or tChildClassB objects since at any one given moment the program may not know exactly what object type ptr references.
In order for inheritance to work properly, the i, j, and r fields must appear at the same offsets in all child classes as they do in tBaseClass. This way, an instruction of the form "mov((type tBaseClass [ebx]).i, eax);" will correctly access the i field even if EBX points at an object of type tChildClassA or tChildClassB. Figure 10.6 shows the layout of the child and base classes:
Figure 10.6 Layout of Base and Child Class Objects in Memory
Note that the new fields in the two child classes bear no relation to one another, even if they have the same name (e.g., field c in the two child classes does not lie at the same offset). Although the two child classes share the fields they inherit from their common base class, any new fields they add are unique and separate. Two fields in different classes share the same offset only by coincidence.
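As a small sketch of what this layout rule buys you, the fragment below reads inherited fields through a base-class pointer that actually refers to a child object. It assumes the B1, CA, CB, and ptr declarations from the earlier example; the register choices are arbitrary.

```hla
    lea( eax, CA );                         // ptr now refers to a tChildClassA object.
    mov( eax, ptr );

    mov( ptr, ebx );
    mov( (type tBaseClass [ebx]).i, eax );  // Inherited fields sit at the same offsets
    mov( (type tBaseClass [ebx]).j, ecx );  //  in the child as in tBaseClass.

    // A cast to the child type is only meaningful when the program
    // knows that EBX really points at a tChildClassA object:

    mov( (type tChildClassA [ebx]).w, dx );
```

Had ptr held the address of CB instead, the first two field accesses would still be valid, but the final instruction would read whatever tChildClassB happens to store at that offset.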
All classes (even those that aren't related to one another) place the pointer to the virtual method table at offset zero within the object. There is a single VMT associated with each class in a program; even classes that inherit fields from some base class have a VMT that is (generally) different than the base class' VMT. Figure 10.7 shows how objects of type tBaseClass, tChildClassA and tChildClassB point at their specific VMTs:
Figure 10.7 Virtual Method Table References from Objects
A virtual method table is nothing more than an array of pointers to the methods and iterators associated with a class. The address of the first method or iterator appearing in a class is at offset zero, the address of the second appears at offset four, etc. You can determine the offset value for a given iterator or method by using the @offset function. If you want to call a method or iterator directly (using 80x86 syntax rather than HLA's high level syntax), you could use code like the following:
    var
        sc: tBaseClass;
         .
         .
         .
        lea( esi, sc );         // Get the address of the object (& VMT).
        mov( [esi], edi );      // Put address of VMT into EDI.
        call( (type dword [edi+@offset( tBaseClass.mBase )]) );

Of course, if the method has any parameters, you must push them onto the stack before executing the code above. Don't forget, when making direct calls to a method, that you must load ESI with the address of the object. Any field references within the method will probably depend upon ESI containing this address. The choice of EDI to contain the VMT address is nearly arbitrary. Unless you're doing something tricky (like using EDI to obtain run-time type information), you could use any register you please here. As a general rule, you should use EDI when simulating class iterator/method calls because this is the convention that HLA employs and most programmers will expect this.
Whenever a child class inherits fields from some base class, the child class' VMT also inherits entries from the base class' VMT. For example, the VMT for class tBaseClass contains only a single entry - a pointer to method tBaseClass.mBase. The VMT for class tChildClassA contains two entries: pointers to tBaseClass.mBase and tChildClassA.mA. Since tChildClassB doesn't define any new methods or iterators, tChildClassB's VMT contains only a single entry, a pointer to the tBaseClass.mBase method. Note that tChildClassB's VMT is identical to tBaseClass' VMT. Nevertheless, HLA produces two distinct VMTs. This is a critical fact that we will make use of a little later. Figure 10.8 shows the relationship between these VMTs:
Figure 10.8 Virtual Method Tables for Inherited Classes
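To make the inherited-entry idea concrete, here is a hedged sketch of a direct call to mBase on a child object. It assumes CA: tChildClassA is declared as in the earlier example, that the object's VMT pointer has already been initialized, and that the inherited mBase entry occupies the same slot in the child's table as in the base class' table (which is what the description above implies).

```hla
    lea( esi, CA );         // ESI = address of the child object.
    mov( [esi], edi );      // EDI = address of tChildClassA's VMT (offset zero).

    // The offset is computed from the base class, yet it indexes
    // the child's table because the inherited entry keeps its slot.

    call( (type dword [edi+@offset( tBaseClass.mBase )]) );
```

Because the code never names tChildClassA in the call itself, the same three instructions would work unchanged with ESI pointing at a tBaseClass or tChildClassB object.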
Although the VMT always appears at offset zero in an object (and, therefore, you can access the VMT using the address expression "[ESI]" if ESI points at an object), HLA actually inserts a symbol into the symbol table so you may refer to the VMT symbolically. The symbol _pVMT_ (pointer to Virtual Method Table) provides this capability. So a more readable way to access the VMT pointer (as in the previous code example) is
    lea( esi, sc );
    mov( (type tBaseClass [esi])._pVMT_, edi );
    call( (type dword [edi+@offset( tBaseClass.mBase )]) );

If you need to access the VMT directly, there are a couple of ways to do this. Whenever you declare a class object, HLA automatically includes a field named _VMT_ as part of that class. _VMT_ is a static array of double word objects. Therefore, you may refer to the VMT using an identifier of the form classname._VMT_. Generally, you shouldn't access the VMT directly, but as you'll see shortly, there are some good reasons why you need to know the address of this object in memory.
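One such use, sketched here under assumptions, is storing the table's address into an object's _pVMT_ field so that method calls on that object can find their targets. The fragment assumes the sc: tBaseClass variable from the earlier example and a "readonly VMT( tBaseClass );" declaration somewhere in the program; the later section on constructors covers the proper place to do this.

```hla
    // Store the address of the class' table in the object's
    // VMT pointer field (offset zero of the object).

    mov( &tBaseClass._VMT_, sc._pVMT_ );

    // With _pVMT_ set, a high-level method call can fetch the
    // method's address from the table.

    sc.mBase();
```

Until the _pVMT_ field holds a valid table address, any method or iterator call on the object would jump through whatever value happens to be in that field.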