10.10 Constructors and Object Initialization
If you've tried to get a little ahead of the game and write a program that uses objects prior to this point, you've probably discovered that the program inexplicably crashes whenever you attempt to run it. We've covered a lot of material in this chapter thus far, but you are still missing one crucial piece of information - how to properly initialize objects prior to use. This section will put the final piece into the puzzle and allow you to begin writing programs that use classes.
Consider the following object declaration and code fragment:
    var
        bc: tBaseClass;
            .
            .
            .
        bc.mBase();

Remember that variables you declare in the VAR section are uninitialized at run-time. Therefore, when the program containing these statements gets around to executing bc.mBase, it executes the three-statement sequence you've seen several times already:
    lea( esi, bc );
    mov( [esi], edi );
    call( (type dword [edi+@offset( tBaseClass.mBase )]) );

The problem with this sequence is that it loads EDI with an undefined value, assuming you haven't previously initialized the bc object. Since EDI contains a garbage value, attempting to call a subroutine at address "[EDI+@offset(tBaseClass.mBase)]" will likely crash the system. Therefore, before using an object, you must initialize the _pVMT_ field with the address of that object's VMT. One easy way to do this is with the following statement:
    mov( &tBaseClass._VMT_, bc._pVMT_ );

Always remember: before using an object, be sure to initialize the virtual method table pointer for that object.
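To see that statement in context, here is a minimal sketch of a complete program. The program name vmtInit and the message text are illustrative assumptions, as is the VMT( tBaseClass ) declaration, which this sketch assumes is how the class's virtual method table gets emitted.

```hla
program vmtInit;
#include( "stdlib.hhf" )

type
    tBaseClass: class
        method mBase;
    endclass;

readonly
    // Assumption: this declaration emits the VMT for tBaseClass.
    VMT( tBaseClass );

var
    bc: tBaseClass;     // Uninitialized at run-time, including _pVMT_.

method tBaseClass.mBase; @nodisplay;
begin mBase;

    stdout.put( "tBaseClass.mBase called", nl );

end mBase;

begin vmtInit;

    // Without this statement, bc._pVMT_ holds garbage and the
    // method call below would jump through an undefined pointer.

    mov( &tBaseClass._VMT_, bc._pVMT_ );

    bc.mBase();

end vmtInit;
```

Removing the mov statement from the main program should reproduce the kind of crash described above, since the method call would then fetch its target address through an uninitialized _pVMT_ field.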
Although you must initialize the virtual method table pointer for all objects you use, this may not be the only field you need to initialize in those objects. Each specific class may have its own application-specific initialization that is necessary. Although the initialization may vary by class, you need to perform the same initialization on each object of a specific class that you use. If you ever create more than a single object from a given class, it is probably a good idea to create a procedure to do this initialization for you. This is such a common operation that object-oriented programmers have given these initialization procedures a special name: constructors.
Some object-oriented languages (e.g., C++) use a special syntax to declare a constructor. Others (e.g., Delphi) simply use existing procedure declarations to define a constructor. One advantage to employing a special syntax is that the language knows when you define a constructor and can automatically generate code to call that constructor for you (whenever you declare an object). Languages like Delphi require that you explicitly call the constructor; this can be a minor inconvenience and a source of defects in your programs. HLA does not use a special syntax to declare constructors - you define constructors using standard class procedures. As such, you will need to explicitly call the constructors in your program; however, you'll see an easy method for automating this in a later section of this chapter.
Perhaps the most important fact you must remember is that constructors must be class procedures. You must not define constructors as methods (or iterators). The reason is quite simple: one of the tasks of the constructor is to initialize the pointer to the virtual method table and you cannot call a class method or iterator until after you've initialized the VMT pointer. Since class procedures don't use the virtual method table, you can call a class procedure prior to initializing the VMT pointer for an object.
By convention, HLA programmers use the name Create for the class constructor. There is no requirement that you use this name, but by doing so you will make your programs easier to read and follow by other programmers.
As you may recall, you can call a class procedure via an object reference or a class reference. E.g., if clsProc is a class procedure of class tClass and Obj is an object of type tClass, then the following two class procedure invocations are both legal:
    tClass.clsProc();
    Obj.clsProc();

There is a big difference between these two calls. The first one calls clsProc with ESI containing zero (NULL) while the second invocation loads the address of Obj into ESI before the call. We can use this fact to determine, within a class procedure, which calling mechanism the caller used.
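The difference between the two calls can be observed directly by testing ESI on entry. The following is a minimal sketch, not a definitive implementation; the program name esiDemo, the message strings, and the VMT( tClass ) declaration are assumptions made for illustration.

```hla
program esiDemo;
#include( "stdlib.hhf" )

type
    tClass: class
        procedure clsProc;
    endclass;

readonly
    // Assumption: every class gets a VMT declaration like this one.
    VMT( tClass );

var
    Obj: tClass;

procedure tClass.clsProc; @nodisplay;
begin clsProc;

    // ESI is NULL for a call through the class name and holds
    // the object's address for a call through an object.

    if( esi = 0 ) then

        stdout.put( "Called through the class name (ESI is NULL)", nl );

    else

        stdout.put( "Called through an object (ESI holds its address)", nl );

    endif;

end clsProc;

begin esiDemo;

    tClass.clsProc();   // Class reference: ESI contains zero.
    Obj.clsProc();      // Object reference: ESI contains the address of Obj.

end esiDemo;
```

Because clsProc is a class procedure, neither call goes through the virtual method table, so the sketch does not need to initialize Obj._pVMT_ first.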
10.10.1 Dynamic Object Allocation Within the Constructor
As it turns out, most programs allocate objects dynamically using malloc and refer to those objects indirectly using pointers. This adds one more step to the initialization process - allocating storage for the object. The constructor is the perfect place to allocate this storage. Since you probably won't need to allocate all objects dynamically, you'll need two types of constructors: one that allocates storage and then initializes the object, and another that simply initializes an object that already has storage.
Another constructor convention is to merge these two constructors into a single constructor and differentiate the type of constructor call by the value in ESI. On entry into the class' Create procedure, the program checks the value in ESI to see if it contains NULL (zero). If so, the constructor calls malloc to allocate storage for the object and returns a pointer to the object in ESI. If ESI does not contain NULL upon entry into the procedure, then the constructor assumes that ESI points at a valid object and skips over the memory allocation statements. At the very least, a constructor initializes the pointer to the VMT; therefore, the minimalist constructor will look like the following:
    procedure tBaseClass.Create; @nodisplay;
    begin Create;

        if( ESI = 0 ) then

            push( eax );        // malloc returns its result in EAX, so save EAX.
            malloc( @size( tBaseClass ));
            mov( eax, esi );    // Put pointer into ESI.
            pop( eax );

        endif;

        // Initialize the pointer to the VMT:
        // (remember, "this" is shorthand for "(type tBaseClass [esi])")

        mov( &tBaseClass._VMT_, this._pVMT_ );

        // Other class initialization would go here.

    end Create;

After you write a constructor like the one above, you choose an appropriate calling mechanism based on whether your object's storage is already allocated. For pre-allocated objects (i.e., those you've declared in VAR, STATIC, or STORAGE sections[1] or those you've previously allocated storage for via malloc) you simply load the address of the object into ESI and call the constructor. For those objects you declare as a variable, this is very easy - just call the appropriate Create constructor:
    var
        bc0: tBaseClass;
        bcp: pointer to tBaseClass;
            .
            .
            .
        bc0.Create();   // Initializes pre-allocated bc0 object.
            .
            .
            .
        malloc( @size( tBaseClass ));   // Allocate storage for bcp object.
        mov( eax, bcp );
            .
            .
            .
        bcp.Create();   // Initializes pre-allocated bcp object.

Note that although bcp is a pointer to a tBaseClass object, the Create procedure does not automatically allocate storage for this object. The program has already allocated the storage earlier. Therefore, when the program calls bcp.Create it loads ESI with the address contained within bcp; since this is not NULL, the tBaseClass.Create procedure does not allocate storage for a new object. By the way, the call to bcp.Create emits the following sequence of machine instructions:
    mov( bcp, esi );
    call tBaseClass.Create;

Until now, the code examples for a class procedure call always began with an LEA instruction. This is because all the examples to this point have used object variables rather than pointers to object variables. Remember, a class procedure (method/iterator) call passes the address of the object in the ESI register. For object variables HLA emits an LEA instruction to obtain this address. For pointers to objects, however, the actual object address is the value of the pointer variable; therefore, to load the address of the object into ESI, HLA emits a MOV instruction that copies the value of the pointer into the ESI register.
In the example above, the program preallocates the storage for an object prior to calling the object constructor. While there are several reasons for preallocating object storage (e.g., you're creating a dynamic array of objects), you can achieve most simple object allocations like the one above by calling a standard Create procedure (i.e., one that allocates storage for an object if ESI contains NULL). The following example demonstrates this:
    var
        bcp2: pointer to tBaseClass;
            .
            .
            .
        tBaseClass.Create();    // Calls Create with ESI=NULL.
        mov( esi, bcp2 );       // Save pointer to new class object in bcp2.

Remember, a call to a tBaseClass.Create constructor returns a pointer to the new object in the ESI register. It is the caller's responsibility to save the pointer this function returns into the appropriate pointer variable; the constructor does not automatically do this for you.
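Putting the pieces of this subsection together, the following sketch shows one Create procedure serving both a statically declared object and a dynamically allocated one. It is illustrative only: the program name createDemo, the field i, and the show method are hypothetical, and the syntax mirrors this section's listings (returns, malloc, free), whereas newer HLA releases spell some of these differently (for example @returns and mem.alloc).

```hla
program createDemo;
#include( "stdlib.hhf" )

type
    tBaseClass: class
        var
            i: uns32;                   // Hypothetical data field.
        procedure Create(); returns( "esi" );
        method show;                    // Hypothetical method.
    endclass;

readonly
    VMT( tBaseClass );

var
    bc0:  tBaseClass;
    bcp2: pointer to tBaseClass;

procedure tBaseClass.Create; @nodisplay;
begin Create;

    if( esi = 0 ) then          // Called through the class name?

        push( eax );            // malloc returns its result in EAX.
        malloc( @size( tBaseClass ));
        mov( eax, esi );
        pop( eax );

    endif;

    mov( &tBaseClass._VMT_, this._pVMT_ );
    mov( 0, this.i );

end Create;

method tBaseClass.show; @nodisplay;
begin show;

    stdout.put( "i = ", this.i, nl );

end show;

begin createDemo;

    bc0.Create();           // ESI = address of bc0: initialize only.
    bc0.show();

    tBaseClass.Create();    // ESI = NULL: allocate, then initialize.
    mov( esi, bcp2 );       // The caller must save the returned pointer.
    bcp2.show();

    mov( bcp2, eax );       // Release the dynamically allocated object.
    free( eax );

end createDemo;
```

Both method calls are safe only because Create ran first and stored the VMT address; the second call also depends on the caller saving ESI into bcp2 immediately after the constructor returns.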
10.10.2 Constructors and Inheritance
Constructors for derived (child) classes that inherit fields from a base class represent a special case. Each class must have its own constructor but needs the ability to call the base class constructor. This section explains the reasons for this and how to do this.
A derived class inherits the Create procedure from its base class. However, you must override this procedure in a derived class because the derived class probably requires more storage than the base class and, therefore, you will probably need to use a different call to malloc to allocate storage for a dynamic object. Hence, it is very unusual for a derived class not to override the definition of the Create procedure.
However, overriding a base class' Create procedure has problems of its own. When you override the base class' Create procedure, you take the full responsibility of initializing the (entire) object, including all the initialization required by the base class. At the very least, this involves putting duplicate code in the overridden procedure to handle the initialization usually done by the base class constructor. In addition to making your program larger (by duplicating code already present in the base class constructor), this also violates information hiding principles since the derived class must be aware of all the fields in the base class (including those that are logically private to the base class). What we need here is the ability to call a base class' constructor from within the derived class' constructor and let that call do the lower-level initialization of the base class' fields. Fortunately, this is an easy thing to do in HLA.
Consider the following class declarations (which do things the hard way):
    type
        tBase: class
            var
                i: uns32;
                j: int32;

            procedure Create(); returns( "esi" );
        endclass;

        tDerived: class inherits( tBase );
            var
                r: real64;

            override procedure Create(); returns( "esi" );
        endclass;

    procedure tBase.Create; @nodisplay;
    begin Create;

        if( esi = 0 ) then

            push( eax );
            mov( malloc( @size( tBase )), esi );
            pop( eax );

        endif;
        mov( &tBase._VMT_, this._pVMT_ );
        mov( 0, this.i );
        mov( -1, this.j );

    end Create;

    procedure tDerived.Create; @nodisplay;
    begin Create;

        if( esi = 0 ) then

            push( eax );
            mov( malloc( @size( tDerived )), esi );
            pop( eax );

        endif;

        // Initialize the VMT pointer for this object:

        mov( &tDerived._VMT_, this._pVMT_ );

        // Initialize the "r" field of this particular object:

        fldz();
        fstp( this.r );

        // Duplicate the initialization required by tBase.Create:

        mov( 0, this.i );
        mov( -1, this.j );

    end Create;

Let's take a closer look at the tDerived.Create procedure above. Like a conventional constructor, it begins by checking ESI and allocates storage for a new object if ESI contains NULL. Note that the size of a tDerived object includes the size required by the inherited fields, so this properly allocates the necessary storage for all fields in a tDerived object.
Next, the tDerived.Create procedure initializes the VMT pointer field of the object. Remember, each class has its own VMT and, specifically, derived classes do not use the VMT of their base class. Therefore, this constructor must initialize the _pVMT_ field with the address of the tDerived VMT.
After initializing the VMT pointer, the tDerived constructor initializes the value of the r field to 0.0 (remember, FLDZ loads zero onto the FPU stack). This concludes the tDerived-specific initialization.
The remaining instructions in tDerived.Create are the problem. These statements duplicate some of the code appearing in the tBase.Create procedure. The problem with code duplication becomes really apparent when you decide to modify the initial values of these fields; if you've duplicated the initialization code in derived classes, you will need to change the initialization code in more than one Create procedure. More often than not, this results in defects in the derived class Create procedures, especially if those derived classes appear in different source files than the base class.
Another problem with burying base class initialization in derived class constructors is the violation of the information hiding principle. Some fields of the base class may be logically private. Although HLA does not explicitly support the concept of public and private fields in a class (as, say, C++ does), well-disciplined programmers will still partition the fields as private or public and then only use the private fields in class routines belonging to that class. Initializing these private fields in derived classes is not acceptable to such programmers. Doing so will make it very difficult to change the definition and implementation of some base class at a later date.
Fortunately, HLA provides an easy mechanism for calling the inherited constructor within a derived class' constructor. All you have to do is call the base class' Create procedure from within tDerived.Create while ESI still points at the object. Because a call through the bare class name, tBase.Create(), loads ESI with NULL (as described earlier), coerce the object pointer to the base type instead, e.g., (type tBase [esi]).Create(). By calling the base class constructor, your derived class constructors can initialize the base class fields without worrying about the exact implementation (or initial values) of the base class.
Unfortunately, there are two types of initialization that every (conventional) constructor does that will affect the way you call a base class constructor: all conventional constructors allocate memory for the class if ESI contains zero and all conventional constructors initialize the VMT pointer. Fortunately, it is very easy to deal with these two problems.
The memory required by an object of a base class is usually less than the memory required for an object of a class you derive from that base class (because the derived classes usually add more fields). Therefore, you cannot allow the base class constructor to allocate the storage when you call it from inside the derived class' constructor. This problem is easily solved by checking ESI within the derived class constructor and allocating any necessary storage for the object before calling the base class constructor.
The second problem is the initialization of the VMT pointer. When you call the base class' constructor, it will initialize the VMT pointer with the address of the base class' virtual method table. A derived class object's _pVMT_ field, however, must point at the virtual method table for the derived class. Calling the base class constructor will always initialize the _pVMT_ field with the wrong pointer; to properly initialize the _pVMT_ field with the appropriate value, the derived class constructor must store the address of the derived class' virtual method table into the _pVMT_ field after the call to the base class constructor (so that it overwrites the value written by the base class constructor).
The tDerived.Create constructor, rewritten to call the tBase.Create constructor, follows:
    procedure tDerived.Create; @nodisplay;
    begin Create;

        if( esi = 0 ) then

            push( eax );
            mov( malloc( @size( tDerived )), esi );
            pop( eax );

        endif;

        // Call the base class constructor to do any initialization
        // needed by the base class. Note that this call must follow
        // the object allocation code above (so ESI will always contain
        // a pointer to an object at this point and tBase.Create will
        // never allocate storage). The call goes through ESI rather
        // than the bare class name because a tBase.Create() call
        // would load ESI with NULL.

        (type tBase [esi]).Create();

        // Initialize the VMT pointer for this object. This code
        // must always follow the call to the base class constructor
        // because the base class constructor also initializes this
        // field and we don't want the initial value supplied by
        // tBase.Create.

        mov( &tDerived._VMT_, this._pVMT_ );

        // Initialize the "r" field of this particular object:

        fldz();
        fstp( this.r );

    end Create;

This solution solves all the above concerns with derived class constructors.
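One way to check that the derived constructor leaves the right VMT pointer in place is to give both classes a method and call it through a base class pointer. The sketch below is a hedged illustration, not the book's listing: the whoAmI method, the program name derivedDemo, and the pointer p are hypothetical, the override prototypes are written without parameter lists, and the base constructor is invoked through ESI on the assumption that a call through the bare class name would pass NULL.

```hla
program derivedDemo;
#include( "stdlib.hhf" )

type
    tBase: class
        var
            i: uns32;
            j: int32;
        procedure Create(); returns( "esi" );
        method whoAmI;                  // Hypothetical method.
    endclass;

    tDerived: class inherits( tBase );
        var
            r: real64;
        override procedure Create;
        override method whoAmI;
    endclass;

readonly
    VMT( tBase );
    VMT( tDerived );

var
    p: pointer to tBase;

procedure tBase.Create; @nodisplay;
begin Create;
    if( esi = 0 ) then
        push( eax );
        mov( malloc( @size( tBase )), esi );
        pop( eax );
    endif;
    mov( &tBase._VMT_, this._pVMT_ );
    mov( 0, this.i );
    mov( -1, this.j );
end Create;

method tBase.whoAmI; @nodisplay;
begin whoAmI;
    stdout.put( "tBase", nl );
end whoAmI;

procedure tDerived.Create; @nodisplay;
begin Create;
    if( esi = 0 ) then
        push( eax );
        mov( malloc( @size( tDerived )), esi );
        pop( eax );
    endif;
    (type tBase [esi]).Create();            // Base fields; ESI is non-NULL.
    mov( &tDerived._VMT_, this._pVMT_ );    // Overwrite the tBase VMT pointer.
    fldz();
    fstp( this.r );
end Create;

method tDerived.whoAmI; @nodisplay;
begin whoAmI;
    stdout.put( "tDerived: i=", this.i, " j=", this.j, nl );
end whoAmI;

begin derivedDemo;
    tDerived.Create();      // ESI = NULL on entry, so Create allocates.
    mov( esi, p );
    p.whoAmI();             // Dispatches through the object's VMT pointer.
    mov( p, eax );
    free( eax );
end derivedDemo;
```

If the VMT store in tDerived.Create were omitted or placed before the base constructor call, the object would keep the tBase VMT pointer and the call through p would reach tBase.whoAmI instead of the derived version.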
10.10.3 Constructor Parameters and Procedure Overloading
None of the constructor examples to this point have had any parameters. However, there is nothing special about constructors that prevents the use of parameters. Constructors are procedures; therefore, you can specify any number and types of parameters you choose. You can use these parameter values to initialize certain fields or control how the constructor initializes the fields. Of course, you may use constructor parameters for any purpose you'd use parameters in any other procedure. In fact, about the only issue you need concern yourself with is the use of parameters whenever you have a derived class. This section deals with those issues.
The first, and probably most important, problem with parameters in derived class constructors actually applies to all overridden procedures, iterators, and methods: the parameter list of an overridden routine must exactly match the parameter list of the corresponding routine in the base class. In fact, HLA doesn't even give you the chance to violate this rule because OVERRIDE routine prototypes don't allow parameter list declarations - they automatically inherit the parameter list of the base routine. Therefore, you cannot use a special parameter list in the constructor prototype for one class and a different parameter list for the constructors appearing in base or derived classes. Sometimes it would be nice if this weren't the case, but there are some sound and logical reasons why HLA does not support this[2].
Some languages, like C++, support function overloading letting you specify several different constructors whose parameter list specifies which constructor to use. HLA does not directly support procedure overloading in this manner, but you can use macros to simulate this language feature (see "Simulating Function Overloading with Macros" on page 990). To use this trick with constructors you would create a macro with the name Create. The actual constructors could have names that describe their differences (e.g., CreateDefault, CreateSetIJ, etc.). The Create macro would parse the actual parameter list to determine which routine to call.
HLA does not support macro overloading. Therefore, you cannot override a macro in a derived class to call a constructor unique to that derived class. In certain circumstances you can create a small workaround by defining empty procedures in your base class that you intend to override in some derived class (this is similar to an abstract method, see "Abstract Methods" on page 1091). Presumably, you would never call the procedure in the base class (in fact, you would probably want to put an error message in the body of the procedure just in case you accidentally call it). By putting the empty procedure declaration in the base class, the macro that simulates function overloading can refer to that procedure and you can use that in derived classes later on.
[1] You generally do not declare objects in READONLY sections because you cannot initialize them.
[2] Calling virtual methods and iterators would be a real problem since you don't really know which routine a pointer references. Therefore, you couldn't know the proper parameter list. While the problems with procedures aren't quite as drastic, there are some subtle problems that could creep into your code if base or derived classes allowed overridden procedures with different parameter lists.