1.3 HLA Support for Data Alignment
In order to write the fastest running programs, you need to ensure that your data objects are properly aligned in memory. Data becomes misaligned whenever you allocate storage for different sized objects in adjacent memory locations. Since it is nearly impossible to write a (large) program that uses objects that are all the same size, some other facility is necessary in order to realign data that would normally be unaligned in memory.
Consider the following HLA variable declarations:
    static
        dw:  dword;
        b:   byte;
        w:   word;
        dw2: dword;
        w2:  word;
        b2:  byte;
        dw3: dword;
The first static declaration in a program (running under Windows, Linux, and most 32-bit operating systems) places its variables at an address that is an even multiple of 4096 bytes. Since 4096 is a power of two, whatever variable first appears in the static declaration is guaranteed to be aligned on a reasonable address. Each successive variable is allocated at an address that is the sum of the sizes of all the preceding variables plus the starting address. Therefore, assuming the above variables are allocated at a starting address of 4096, then each variable will be allocated at the following addresses:
                        // Start Adrs   Length
        dw:  dword;     //    4096        4
        b:   byte;      //    4100        1
        w:   word;      //    4101        2
        dw2: dword;     //    4103        4
        w2:  word;      //    4107        2
        b2:  byte;      //    4109        1
        dw3: dword;     //    4110        4
With the exception of the first variable (which is aligned on a 4K boundary) and the byte variables (whose alignment doesn't matter), all of these variables are misaligned in memory. The w, w2, and dw2 variables start at odd addresses, and the dw3 variable starts at an even address that is not an even multiple of four.
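The address arithmetic above can be checked with a few lines of C. This is only a sketch that models the back-to-back allocation described here; it is not HLA code, and the function names layout_packed and is_aligned are invented for the illustration. It also assumes that "properly aligned" means the address is a multiple of the variable's size, which matches the byte, word, and dword cases discussed in this section.

```c
#include <assert.h>
#include <stddef.h>

/* Model of packed allocation: variable i starts where variable i-1 ended.
   Stores the start address of each variable in addr[0..n-1]. */
void layout_packed(unsigned base, const unsigned *size, size_t n, unsigned *addr)
{
    unsigned next = base;               /* next free address */
    for (size_t i = 0; i < n; i++) {
        addr[i] = next;
        next += size[i];                /* no padding is inserted */
    }
}

/* Assumption for this sketch: a variable is aligned when its address is a
   multiple of its size (1, 2, or 4 bytes in the examples here). */
int is_aligned(unsigned addr, unsigned size)
{
    assert(size != 0);
    return addr % size == 0;
}
```

Calling layout_packed with a base of 4096 and the sizes 4, 1, 2, 4, 2, 1, 4 reproduces the addresses in the table, and is_aligned reports w, dw2, w2, and dw3 as misaligned.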
An easy way to guarantee that your variables are aligned on an appropriate address is to put all the dword variables first, the word variables second, and the byte variables last in the declaration:
    static
        dw:  dword;
        dw2: dword;
        dw3: dword;
        w:   word;
        w2:  word;
        b:   byte;
        b2:  byte;
This organization produces the following addresses in memory (again, assuming the first variable is allocated at address 4096):
                        // Start Adrs   Length
        dw:  dword;     //    4096        4
        dw2: dword;     //    4100        4
        dw3: dword;     //    4104        4
        w:   word;      //    4108        2
        w2:  word;      //    4110        2
        b:   byte;      //    4112        1
        b2:  byte;      //    4113        1
As you can see, these variables are all aligned at reasonable addresses.
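The largest-first ordering works because every size involved is a power of two: once all the dwords are placed, the next free address is still a multiple of four, and therefore of two and one as well. The following C sketch illustrates that reasoning. It is a model rather than HLA code, the function names are hypothetical, and it assumes that the starting address is a multiple of the largest size.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* qsort comparator: orders sizes from largest to smallest. */
int by_size_descending(const void *a, const void *b)
{
    unsigned x = *(const unsigned *)a;
    unsigned y = *(const unsigned *)b;
    return (x < y) - (x > y);
}

/* Sorts the sizes largest first, packs them back to back from `base`, and
   returns how many variables land on an address that is not a multiple of
   their own size. The array is left in sorted order. */
size_t count_misaligned_after_sort(unsigned base, unsigned *size, size_t n)
{
    size_t misaligned = 0;
    unsigned next = base;

    qsort(size, n, sizeof size[0], by_size_descending);
    for (size_t i = 0; i < n; i++) {
        assert(size[i] != 0);
        if (next % size[i] != 0)
            misaligned++;
        next += size[i];
    }
    return misaligned;
}
```

For the seven variables of this example (sizes 4, 1, 2, 4, 2, 1, 4) and a base of 4096, the sorted order is 4, 4, 4, 2, 2, 1, 1 and the function returns zero.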
Unfortunately, it is rarely possible for you to arrange your variables in this manner. While there are lots of technical reasons that make this arrangement impossible, a good practical reason for not doing this is that it doesn't let you organize your variable declarations by logical function (that is, you probably want to keep related variables next to one another regardless of their size).
To resolve this problem, HLA provides two solutions. The first is an alignment option that you can attach to a static section. If you follow the static keyword by an integer constant inside parentheses, HLA will align the very next variable declaration at an address that is an even multiple of the specified constant, e.g.,
    static( 4 )
        dw:  dword;
        b:   byte;
        w:   word;
        dw2: dword;
        w2:  word;
        b2:  byte;
        dw3: dword;
Of course, if you have only a single static section in your entire program, this declaration doesn't buy you much because the first declaration in the section is already aligned on a 4096 byte boundary. However, HLA does allow you to put multiple static sections into your program, so you can specify an alignment constant for each static section:
    static( 4 )
        dw:  dword;
        b:   byte;
    static( 2 )
        w:   word;
    static( 4 )
        dw2: dword;
        w2:  word;
        b2:  byte;
    static( 4 )
        dw3: dword;
This particular sequence guarantees that all double word variables are aligned on addresses that are multiples of four and all word variables are aligned on even addresses (note that a special section was not created for w2 since its address is going to be an even multiple of four).
While the alignment parameter to the static directive is useful on occasion, there are two problems with it. The first problem is that inserting so many static directives into the middle of your variable declarations tends to disrupt their readability. Part of this problem can be overcome by simply placing a static directive before every variable declaration:
    static( 4 ) dw:  dword;
    static( 1 ) b:   byte;
    static( 2 ) w:   word;
    static( 4 ) dw2: dword;
    static( 2 ) w2:  word;
    static( 1 ) b2:  byte;
    static( 4 ) dw3: dword;
While this approach can, arguably, make a program easier to read, it certainly involves more typing and it doesn't address the second problem: variables appearing in separate static sections are not guaranteed to be allocated in adjacent memory locations. Once in a while it is very important to ensure that two variables are allocated in adjacent memory cells and most programmers assume that variables declared next to one another in the source code are allocated in adjacent memory cells. The mechanism above does not guarantee this.
The second facility HLA provides to help align variables in adjacent memory locations is the align directive. The align directive uses the following syntax:
align( integer_constant );
The integer constant must be one of the following small unsigned integer values: 1, 2, 4, 8, or 16. If HLA encounters the align directive in a static section, it will align the very next variable on an address that is an even multiple of the specified alignment constant. The previous example could be rewritten, using the align directive, as follows:
    static( 4 )
        dw:  dword;
        b:   byte;
        align( 2 );
        w:   word;
        align( 4 );
        dw2: dword;
        w2:  word;
        b2:  byte;
        align( 4 );
        dw3: dword;
If you're wondering how the align directive works, it's really quite simple. If HLA determines that the current address is not an even multiple of the specified value, HLA will quietly emit extra bytes of padding after the previous variable declaration until the current address in the static section is an even multiple of the specified value. This has the effect of making your program slightly larger (by a few bytes) in exchange for faster access to your data; given that your program will only grow by a small number of bytes when you use this feature, this is a good trade-off.
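The padding rule just described comes down to a remainder calculation. The C sketch below models that calculation; it is an illustration of the arithmetic, not a description of HLA's actual implementation, and the function names padding_for and align_up are hypothetical.

```c
#include <assert.h>

/* Number of padding bytes needed to advance `addr` to the next address that
   is a multiple of `alignment`. Returns 0 if `addr` is already a multiple. */
unsigned padding_for(unsigned addr, unsigned alignment)
{
    assert(alignment != 0);
    return (alignment - addr % alignment) % alignment;
}

/* The address at which the next variable would be placed after padding. */
unsigned align_up(unsigned addr, unsigned alignment)
{
    return addr + padding_for(addr, alignment);
}
```

Applied to the declarations above with a starting address of 4096, b ends at 4100, so align( 2 ) finds the current address at 4101 and one byte of padding moves w to 4102. The align( 4 ) before dw2 finds the address already at 4104 and adds nothing. The final align( 4 ) finds the address at 4111 and one byte of padding moves dw3 to 4112. Under these assumptions the whole section grows by only two bytes.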