10.11 Destructors
A destructor is a class routine that cleans up an object once a program finishes using that object. As with constructors, HLA provides no special syntax for creating destructors, nor does it call a destructor automatically. Unlike constructors, a destructor is usually a method rather than a procedure (since virtual destructors make a lot of sense while virtual constructors do not).
A typical destructor will close any files opened by the object, free the memory allocated during the use of the object, and, finally, free the object itself if it was created dynamically. The destructor also handles any other clean-up chores the object may require before it ceases to exist.
By convention, most HLA programmers name their destructors Destroy. Destructors generally do not have any parameters, so the issue of overloading the parameter list rarely arises. About the only code that most destructors have in common is the code to free the storage associated with the object. The following destructor demonstrates how to do this:

    procedure tBase.Destroy; nodisplay;
    begin Destroy;

        push( eax );  // isInHeap uses this

        // Place any other clean up code here.
        // The code to free dynamic objects should always appear last
        // in the destructor.

            /*************/

        // The following code assumes that ESI still contains the address
        // of the object.

        if( isInHeap( esi )) then

            free( esi );

        endif;
        pop( eax );

    end Destroy;

The HLA Standard Library routine isInHeap returns true if its parameter is an address that malloc returned. Therefore, this code automatically frees the storage associated with the object if the program originally allocated storage for the object by calling malloc. Obviously, on return from this method call, ESI will no longer point at a legal object in memory if you allocated it dynamically. Note that this code will not affect the value in ESI nor will it modify the object if the object wasn't one you've previously allocated via a call to malloc.
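The following is a minimal sketch of how such a destructor might be used with one static object and one heap object. It is an illustration under stated assumptions, not code from the original text: the program name DestroyDemo, the field i, the pBase type, and the variables staticObj and dynObj are hypothetical; Destroy is declared as a method, as this section recommends; and the constructor follows the usual convention of allocating storage only when ESI contains NULL on entry. The routine options use this chapter's spelling (nodisplay); newer HLA releases may expect @nodisplay instead.

```hla
program DestroyDemo;
#include( "stdlib.hhf" )

type
    tBase:
        class
            var
                i: int32;                       // hypothetical data field
            procedure Create; returns( "esi" );
            method Destroy;
        endclass;

    pBase: pointer to tBase;

static
    VMT( tBase );           // storage for tBase's virtual method table
    staticObj: tBase;       // lives in the static section, not on the heap
    dynObj:    pBase;       // will hold the address of a heap object

// Constructor: allocates storage only if ESI is NULL on entry.
procedure tBase.Create; nodisplay;
begin Create;

    push( eax );
    if( esi = 0 ) then

        malloc( @size( tBase ));
        mov( eax, esi );

    endif;
    mov( &tBase._VMT_, this._pVMT_ );
    mov( 0, this.i );
    pop( eax );

end Create;

// Destructor: frees the object only if malloc supplied its storage.
method tBase.Destroy; nodisplay;
begin Destroy;

    push( eax );            // isInHeap uses EAX
    if( isInHeap( esi )) then

        free( esi );

    endif;
    pop( eax );

end Destroy;

begin DestroyDemo;

    staticObj.Create();     // ESI = &staticObj, so no allocation occurs
    tBase.Create();         // ESI = NULL on entry, so Create calls malloc
    mov( esi, dynObj );

    staticObj.Destroy();    // isInHeap is false: storage is left alone
    dynObj.Destroy();       // isInHeap is true: storage is freed
    mov( 0, dynObj );       // the old address is no longer a legal object

end DestroyDemo;
```

Clearing dynObj after the call is a defensive habit rather than a requirement; it simply keeps a stale address from being used by accident.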
10.12 HLA's "_initialize_" and "_finalize_" Strings
Although HLA does not automatically call constructors and destructors associated with your classes, HLA does provide a mechanism whereby you can cause these calls to happen automatically: by using the _initialize_ and _finalize_ compile-time string variables (i.e., VAL constants) HLA automatically declares in every procedure.
Whenever you write a procedure, iterator, or method, HLA automatically declares several local symbols in that routine. Two such symbols are _initialize_ and _finalize_. HLA declares these symbols as follows:

    val
        _initialize_: string := "";
        _finalize_: string := "";

HLA emits the _initialize_ string as text at the very beginning of the routine's body, i.e., immediately after the routine's BEGIN clause [1]. Similarly, HLA emits the _finalize_ string at the very end of the routine's body, just before the END clause. This is comparable to the following:

    procedure SomeProc;
        << declarations >>
    begin SomeProc;

        @text( _initialize_ );

        << procedure body >>

        @text( _finalize_ );

    end SomeProc;

Since _initialize_ and _finalize_ initially contain the empty string, these expansions have no effect on the code that HLA generates unless you explicitly modify the value of _initialize_ prior to the BEGIN clause or you modify _finalize_ prior to the END clause of the procedure. So if you modify either of these string objects to contain a machine instruction, HLA will compile that instruction at the beginning or end of the procedure. The following example demonstrates how to use this technique:

    procedure SomeProc;
        ?_initialize_ := "mov( 0, eax );";
        ?_finalize_ := "stdout.put( eax );";
    begin SomeProc;

        // HLA emits "mov( 0, eax );" here in response to the _initialize_
        // string constant.

        add( 5, eax );

        // HLA emits "stdout.put( eax );" here.

    end SomeProc;

Of course, these examples don't save you much. It would be easier to type the actual statements at the beginning and end of the procedure than to assign a string containing these statements to the _initialize_ and _finalize_ compile-time variables. However, if we could automate the assignment of some string to these variables, so that you don't have to explicitly assign them in each procedure, then this feature might be useful. In a moment, you'll see how we can automate the assignment of values to the _initialize_ and _finalize_ strings. For the time being, consider the case where we load the name of a constructor into the _initialize_ string and we load the name of a destructor into the _finalize_ string. By doing this, the routine will "automatically" call the constructor and destructor for that particular object.
The example above has a minor problem. If we can automate the assignment of some value to _initialize_ or _finalize_, what happens if these variables already contain some value? For example, suppose we have two objects we use in a routine and the first one loads the name of its constructor into the _initialize_ string; what happens when the second object attempts to do the same thing? The solution is simple: don't directly assign any string to the _initialize_ or _finalize_ compile-time variables. Instead, always concatenate your strings to the end of the existing string in these variables. The following is a modification to the above example that demonstrates how to do this:

    procedure SomeProc;
        ?_initialize_ := _initialize_ + "mov( 0, eax );";
        ?_finalize_ := _finalize_ + "stdout.put( eax );";
    begin SomeProc;

        // HLA emits "mov( 0, eax );" here in response to the _initialize_
        // string constant.

        add( 5, eax );

        // HLA emits "stdout.put( eax );" here.

    end SomeProc;

When you assign values to the _initialize_ and _finalize_ strings, HLA almost guarantees that the _initialize_ sequence will execute upon entry into the routine. Sadly, the same is not true for the _finalize_ string upon exit. HLA simply emits the code for the _finalize_ string at the end of the routine, immediately before the code that cleans up the activation record and returns. Unfortunately, "falling off the end of the routine" is not the only way that one could return from that routine. One could explicitly return from somewhere in the middle of the code by executing a RET instruction. Since HLA only emits the _finalize_ string at the very end of the routine, returning from that routine in this manner bypasses the _finalize_ code. Unfortunately, other than manually emitting the _finalize_ code, there is nothing you can do about this [2]. Fortunately, this mechanism for exiting a routine is completely under your control; if you never exit a routine except by "falling off the end" then you won't have to worry about this problem (note that you can use the EXIT control structure to transfer control to the end of a routine if you really want to return from that routine from somewhere in the middle of the code).
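As a rough sketch of the EXIT approach just described, the routine below leaves early by transferring control to the end of the procedure rather than executing a RET instruction. The names ExitDemo and ShowSum and the parameter n are hypothetical, and the sketch relies on the behavior this section describes, namely that control which reaches the end of the routine passes through the _finalize_ text.

```hla
program ExitDemo;
#include( "stdlib.hhf" )

procedure ShowSum( n: uns32 );
    ?_initialize_ := _initialize_ + "mov( 0, eax );";
    ?_finalize_ := _finalize_ + "stdout.put( eax, nl );";
begin ShowSum;

    // HLA emits "mov( 0, eax );" here.

    if( n = 0 ) then

        // A ret() at this point would skip the _finalize_ text.
        // EXIT transfers control to the end of ShowSum instead.

        exit ShowSum;

    endif;
    add( n, eax );

    // HLA emits "stdout.put( eax, nl );" here.

end ShowSum;

begin ExitDemo;

    ShowSum( 0 );   // takes the early-exit path
    ShowSum( 5 );   // falls off the end of the routine

end ExitDemo;
```

Both calls are intended to leave ShowSum through the same point, which is the property that makes EXIT preferable to an explicit return when _finalize_ is in use.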
Another way to prematurely exit a routine, one over which you unfortunately have no control, is by raising an exception. Your routine could call some other routine (e.g., a standard library routine) that raises an exception and then transfers control immediately to whoever called your routine. Fortunately, you can easily trap and handle exceptions by putting a TRY..ENDTRY block in your procedure. Here is an example that demonstrates this:

    procedure SomeProc;
        << declarations that modify _initialize_ and _finalize_ >>
    begin SomeProc;

        << HLA emits the code for the _initialize_ string here. >>

        try // Catch any exceptions that occur:

            << Procedure Body Goes Here >>

          anyexception

            push( eax );          // Save the exception #.
            @text( _finalize_ );  // Execute the _finalize_ code here.
            pop( eax );           // Restore the exception #.
            raise( eax );         // Reraise the exception.

        endtry;

        // HLA automatically emits the _finalize_ code here.

    end SomeProc;

Although the code above handles some problems that exist with _finalize_, by no means does it handle every possible case. Always be on the lookout for ways your program could inadvertently exit a routine without executing the code found in the _finalize_ string. You should explicitly expand _finalize_ if you encounter such a situation.
There is one important place you can get into trouble with respect to exceptions: within the code the routine emits for the _initialize_ string. If you modify the _initialize_ string so that it contains a constructor call and the execution of that constructor raises an exception, this will probably force an exit from that routine without executing the corresponding _finalize_ code. You could bury the TRY..ENDTRY statement directly into the _initialize_ and _finalize_ strings but this approach has several problems, not the least of which is the fact that one of the first constructors you call might raise an exception that transfers control to the exception handler that calls the destructors for all objects in that routine (including those objects whose constructors you have yet to call). Although no single solution that handles all problems exists, probably the best approach is to put a TRY..ENDTRY block around each constructor call if it is possible for that constructor to raise some exception that is possible to handle (i.e., doesn't require the immediate termination of the program).
Thus far this discussion of _initialize_ and _finalize_ has failed to address one important point: why use this feature to implement the "automatic" calling of constructors and destructors, since it apparently involves more work than simply calling the constructors and destructors directly? Clearly there must be a way to automate the assignment of the _initialize_ and _finalize_ strings or this section wouldn't exist. The way to accomplish this is by using a macro to define the class type. So now it's time to take a look at another HLA feature that makes it possible to automate this activity: the FORWARD keyword.
You've seen how to use the FORWARD reserved word to create procedure and iterator prototypes (see "Forward Procedures" on page 501). It turns out that you can declare forward CONST, VAL, TYPE, and variable declarations as well. The syntax for such declarations takes the following form:

    ForwardSymbolName: forward( undefinedID );

This declaration is completely equivalent to the following:

    ?undefinedID: text := "ForwardSymbolName";

Especially note that this expansion does not actually define the symbol ForwardSymbolName. It just converts this symbol to a string and assigns this string to the specified TEXT object (undefinedID in this example).
Now you're probably wondering how something like the above is equivalent to a forward declaration. The truth is, it isn't. However, FORWARD declarations let you create macros that simulate type names by allowing you to defer the actual declaration of an object's type until some later point in the code. Consider the following example:

    type
        myClass: class
            var
                i:int32;

            procedure Create; returns( "esi" );
            procedure Destroy;
        endclass;

    #macro _myClass: varID;
        forward( varID );
        ?_initialize_ := _initialize_ + @string:varID + ".Create(); ";
        ?_finalize_ := _finalize_ + @string:varID + ".Destroy(); ";
        varID: myClass
    #endmacro;

Note, and this is very important, that a semicolon does not follow the "varID: myClass" declaration at the end of this macro. You'll find out why this semicolon is missing in a little bit.
If you have the class and macro declarations above in your program, you can now declare variables of type _myClass that automatically invoke the constructor and destructor upon entry and exit of the routine containing the variable declarations. To see how, take a look at the following procedure shell:

    procedure HasmyClassObject;
    var
        mco: _myClass;
    begin HasmyClassObject;

        << do stuff with mco here >>

    end HasmyClassObject;

Since _myClass is a macro, the procedure above expands to the following text during compilation:

    procedure HasmyClassObject;
    var
        mco:    // Expansion of the _myClass macro:
            forward( _0103_ );  // _0103_ symbol is an HLA-supplied text symbol
                                // that expands to "mco".

            ?_initialize_ := _initialize_ + "mco" + ".Create(); ";
            ?_finalize_ := _finalize_ + "mco" + ".Destroy(); ";
            mco: myClass;

    begin HasmyClassObject;

        mco.Create();   // Expansion of the _initialize_ string.

        << do stuff with mco here >>

        mco.Destroy();  // Expansion of the _finalize_ string.

    end HasmyClassObject;

You might notice that a semicolon appears after the "mco: myClass" declaration in the example above. This semicolon is not actually a part of the macro; instead, it is the semicolon that follows the "mco: _myClass;" declaration in the original code.
If you want to create an array of objects, you could legally declare that array as follows:

    var
        mcoArray: _myClass[10];

Because the last statement in the _myClass macro doesn't end with a semicolon, the declaration above will expand to something like the following (almost correct) code:

    mcoArray:   // Expansion of the _myClass macro:
        forward( _0103_ );  // _0103_ symbol is an HLA-supplied text symbol
                            // that expands to "mcoArray".

        ?_initialize_ := _initialize_ + "mcoArray" + ".Create(); ";
        ?_finalize_ := _finalize_ + "mcoArray" + ".Destroy(); ";
        mcoArray: myClass[10];

The only problem with this expansion is that it only calls the constructor for the first object of the array. There are several ways to solve this problem; one is to append a macro name to the end of _initialize_ and _finalize_ rather than the constructor name. That macro would check the object's name (mcoArray in this example) to determine if it is an array. If so, that macro could expand to a loop that calls the constructor for each element of the array (the implementation appears as a programming project at the end of this chapter).
Another solution to this problem is to use a macro parameter to specify the dimensions for arrays of myClass. This scheme is easier to implement than the one above, but it does have the drawback of requiring a different syntax for declaring object arrays (you have to use parentheses around the array dimension rather than square brackets).
The FORWARD directive is quite powerful and lets you achieve all kinds of tricks. However, there are a few problems of which you should be aware. First, since HLA emits the _initialize_ and _finalize_ code transparently, you can be easily confused if there are any errors in the code appearing within these strings. If you start getting error messages associated with the BEGIN or END statements in a routine, you might want to take a look at the _initialize_ and _finalize_ strings within that routine. The best defense here is to always append very simple statements to these strings so that you reduce the likelihood of an error.
Fundamentally, HLA doesn't support automatic constructor and destructor calls. This section has presented several tricks to attempt to automate the calls to these routines. However, the automation isn't perfect and, indeed, the aforementioned problems with the _finalize_ strings limit the applicability of this approach. The mechanism this section presents is probably fine for simple classes and simple programs. However, one piece of advice is probably worth following: if your code is complex or correctness is critical, it's probably a good idea to explicitly call the constructors and destructors manually.
10.13 Abstract Methods
An abstract base class is one that exists solely to supply a set of common fields to its derived classes. You never declare variables whose type is an abstract base class; you always use one of the derived classes. The purpose of an abstract base class is to provide a template for creating other classes, nothing more. As it turns out, the only difference in syntax between a standard base class and an abstract base class is the presence of at least one abstract method declaration. An abstract method is a special method that does not have an actual implementation in the abstract base class. Any attempt to call that method will raise an exception. If you're wondering what possible good an abstract method could be, well, keep on reading...
Suppose you want to create a set of classes to hold numeric values. One class could represent unsigned integers, another class could represent signed integers, a third could implement BCD values, and a fourth could support real64 values. While you could create four separate classes that function independently of one another, doing so passes up an opportunity to make this set of classes more convenient to use. To understand why, consider the following possible class declarations:

    type
        uint: class
            var
                TheValue: dword;

            method put;
            << other methods for this class >>
        endclass;

        sint: class
            var
                TheValue: dword;

            method put;
            << other methods for this class >>
        endclass;

        r64: class
            var
                TheValue: real64;

            method put;
            << other methods for this class >>
        endclass;

The implementation of these classes is not unreasonable. They have fields for the data, they have a put method (which, presumably, writes the data to the standard output device), and presumably they have other methods and procedures to implement various operations on the data. There are, however, two problems with these classes, one minor and one major, both occurring because these classes do not inherit any fields from a common base class.
The first problem, which is relatively minor, is that you have to repeat the declaration of several common fields in these classes. For example, the put method declaration appears in each of these classes [3]. This duplication of effort results in a harder-to-maintain program because it doesn't encourage you to use a common name for a common function; it's easy to use a different name in each of the classes.
A bigger problem with this approach is that it is not generic. That is, you can't create a generic pointer to a "numeric" object and perform operations like addition, subtraction, and output on that value (regardless of the underlying numeric representation).
We can easily solve these two problems by turning the previous class declarations into a set of derived classes. The following code demonstrates an easy way to do this:

    type
        numeric: class
            method put;
            << Other common methods shared by all the classes >>
        endclass;

        uint: class inherits( numeric )
            var
                TheValue: dword;

            override method put;
            << other methods for this class >>
        endclass;

        sint: class inherits( numeric )
            var
                TheValue: dword;

            override method put;
            << other methods for this class >>
        endclass;

        r64: class inherits( numeric )
            var
                TheValue: real64;

            override method put;
            << other methods for this class >>
        endclass;

This scheme solves both problems. First, by inheriting the put method from numeric, this code encourages the derived classes to always use the name put, thereby making the program easier to maintain. Second, because this example uses derived classes, it's possible to create a pointer to the numeric type and load this pointer with the address of a uint, sint, or r64 object. That pointer can invoke the methods found in the numeric class to do functions like addition, subtraction, or numeric output. Therefore, the application that uses this pointer doesn't need to know the exact data type; it only deals with numeric values in a generic fashion.
One problem with this scheme is that it's possible to declare and use variables of type numeric. Unfortunately, such numeric variables don't have the ability to represent any type of number (notice that the data storage for the numeric fields actually appears in the derived classes). Worse, because you've declared the put method in the numeric class, you've actually got to write some code to implement that method even though one should never really call it; the actual implementation should only occur in the derived classes. While you could write a dummy method that prints an error message (or, better yet, raises an exception), there shouldn't be any need to write "dummy" procedures like this. Fortunately, there is no reason to do so - if you use abstract methods.
The ABSTRACT keyword, when it follows a method declaration, tells HLA that you are not going to provide an implementation of the method for this class. Instead, it is the responsibility of all derived classes to provide a concrete implementation for the abstract method. HLA will raise an exception if you attempt to call an abstract method directly. The following is the modification to the numeric class to convert put to an abstract method:

    type
        numeric: class
            method put; abstract;
            << Other common methods shared by all the classes >>
        endclass;

An abstract base class is a class that has at least one abstract method. Note that you don't have to make all methods abstract in an abstract base class; it is perfectly legal to declare some standard methods (and, of course, provide their implementation) within the abstract base class.
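To make the generic-pointer idea concrete, here is a small sketch that calls put through a pointer to numeric. It is an illustration only: the program name NumericDemo, the variables u, s, and pNum, and the sample values are hypothetical, and the objects are initialized by storing their VMT addresses by hand rather than through constructors in order to keep the example short. The routine options use this chapter's spelling (nodisplay); newer HLA releases may expect @nodisplay instead.

```hla
program NumericDemo;
#include( "stdlib.hhf" )

type
    numeric:
        class
            method put; abstract;   // no implementation in this class
        endclass;

    uint:
        class inherits( numeric )
            var
                TheValue: dword;
            override method put;
        endclass;

    sint:
        class inherits( numeric )
            var
                TheValue: dword;
            override method put;
        endclass;

static
    VMT( numeric );
    VMT( uint );
    VMT( sint );

    u:    uint;
    s:    sint;
    pNum: pointer to numeric;   // generic pointer to any numeric object

// Concrete implementation: treats TheValue as an unsigned integer.
method uint.put; nodisplay;
begin put;

    stdout.put( (type uns32 this.TheValue) );

end put;

// Concrete implementation: treats TheValue as a signed integer.
method sint.put; nodisplay;
begin put;

    stdout.put( (type int32 this.TheValue) );

end put;

begin NumericDemo;

    // Stand-in for constructor calls: set up each object's VMT pointer.

    mov( &uint._VMT_, u._pVMT_ );
    mov( 42, u.TheValue );
    mov( &sint._VMT_, s._pVMT_ );
    mov( -42, s.TheValue );

    mov( &u, pNum );
    pNum.put();         // dispatches through the VMT to uint.put
    stdout.newln();

    mov( &s, pNum );
    pNum.put();         // dispatches through the VMT to sint.put
    stdout.newln();

end NumericDemo;
```

The calling code never names uint or sint at the point of the call; the object's VMT pointer selects the implementation. Pointing pNum at an object whose VMT pointer is &numeric._VMT_ and calling put would reach the abstract entry and raise an exception instead.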
Abstract method declarations provide a mechanism by which a base class enforces the methods that the derived classes must implement. In theory, all derived classes must provide concrete implementations of all abstract methods or those derived classes are themselves abstract base classes. In practice, it's possible to bend the rules a little and use abstract methods for a slightly different purpose.
A little earlier, you read that one should never create variables whose type is an abstract base class, because any attempt to execute an abstract method immediately raises an exception to complain about the illegal method call. In practice, you actually can declare variables of an abstract base type and get away with this as long as you don't call any abstract methods. We can use this fact to provide a better form of method overloading (that is, providing several different routines with the same name but different parameter lists). Remember, the standard trick in HLA to overload a routine is to write several different routines and then use a macro to parse the parameter list and determine which actual routine to call (see "Simulating Function Overloading with Macros" on page 860). The problem with this technique is that you cannot override a macro definition in a class, so if you want to use a macro to override a routine's syntax, then that macro must appear in the base class. Unfortunately, you may not need a routine with a specific parameter list in the base class (for that matter, you may only need that particular version of the routine in a single derived class), so implementing that routine in the base class and in all the other derived classes is a waste of effort. This isn't a big problem. Just go ahead and define the abstract method in the base class and only implement it in the derived class that needs that particular method. As long as you don't call that method in the base class or in the other derived classes that don't override the method, everything will work fine.
One problem with using abstract methods to support overloading is that this trick does not apply to procedures - only methods and iterators. However, you can achieve the same effect with procedures by declaring a (non-abstract) procedure in the base class and overriding that procedure only in the class that actually uses it. You will have to provide an implementation of the procedure in the base class, but that is a minor issue (the procedure's body, by the way, should simply raise an exception to indicate that you should have never called it).
An example of routine overloading in a class appears in this chapter's sample program.
[1] If the routine automatically emits code to construct the activation record, HLA emits _initialize_'s text after the code that builds the activation record.
[2] Note that you can manually emit the _finalize_ code using the statement "@text( _finalize_ );".
[3] Note, by the way, that TheValue is not a common field because this field has a different type in the r64 class.