4.2.13 Extended Precision I/O
Once you can compute using extended precision arithmetic, the next problem is how to get extended precision values into your program and how to display them to the user. HLA's Standard Library provides routines for unsigned decimal, signed decimal, and hexadecimal I/O for values that are 8, 16, 32, or 64 bits in length. So as long as you're working with values that are no larger than 64 bits, you can use the Standard Library code. If you need to input or output values that are larger than 64 bits, you will need to write your own procedures to handle the operation. This section discusses the strategies you will need to write such routines.
The examples in this section work specifically with 128-bit values. The algorithms are perfectly general and extend to any number of bits (indeed, the 128-bit algorithms in this section are really nothing more than an extension of the algorithms the HLA Standard Library uses for 64-bit values). If you need a set of 128-bit unsigned I/O routines, you will probably be able to use the following code as-is. If you need to handle larger values, simple modifications to the following code are all that should be necessary.
The following examples all assume a common data type for 128-bit values. The HLA type declaration for this data type is one of the following, depending on the type of value:

    type
        bits128: dword[4];
        uns128:  bits128;
        int128:  bits128;

4.2.13.1 Extended Precision Hexadecimal Output
Extended precision hexadecimal output is very easy. All you have to do is output each double word component of the extended precision value from the H.O. double word to the L.O. double word using a call to the stdout.putd routine. The following procedure does exactly this to output a bits128 value:
    procedure putb128( b128: bits128 ); @nodisplay;
    begin putb128;

        stdout.putd( b128[12] );
        stdout.putd( b128[8] );
        stdout.putd( b128[4] );
        stdout.putd( b128[0] );

    end putb128;

Since HLA provides the stdout.putq procedure, you can shorten the code above by calling stdout.putq just twice:

    procedure putb128( b128: bits128 ); @nodisplay;
    begin putb128;

        stdout.putq( (type qword b128[8]) );
        stdout.putq( (type qword b128[0]) );

    end putb128;

Note that this code outputs the two quad words with the H.O. quad word output first and the L.O. quad word output second.
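The output order can be checked with a small model. The sketch below is Python rather than HLA, chosen only because it is easy to run. The helper names (to_limbs, hex128) are hypothetical, and the assumption is that each double word prints as eight hexadecimal digits.

```python
MASK32 = 0xFFFFFFFF

def to_limbs(value, count=4):
    """Split a non-negative integer into `count` 32-bit limbs, L.O. limb first."""
    return [(value >> (32 * i)) & MASK32 for i in range(count)]

def hex128(limbs):
    """Format the limbs as hex text, emitting the H.O. limb first."""
    return "".join(f"{limb:08X}" for limb in reversed(limbs))

print(hex128(to_limbs(0x0123456789ABCDEF_FEDCBA9876543210)))
# prints 0123456789ABCDEFFEDCBA9876543210
```

Reversing the limb order is the whole trick: each hexadecimal digit depends on exactly four bits, so the limbs can be converted independently of one another.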
4.2.13.2 Extended Precision Unsigned Decimal Output
Decimal output is a little more complicated than hexadecimal output because the H.O. bits of a binary number affect the L.O. digits of the decimal representation (this was not true for hexadecimal values which is why hexadecimal output is so easy). Therefore, we will have to create the decimal representation for a binary number by extracting one decimal digit at a time from the number.
The most common solution for unsigned decimal output is to successively divide the value by ten until the result becomes zero. The remainder after the first division is a value in the range 0..9 and this value corresponds to the L.O. digit of the decimal number. Successive divisions by ten (and their corresponding remainder) extract successive digits in the number.
Iterative solutions to this problem generally allocate storage for a string of characters large enough to hold the entire number. Then the code extracts the decimal digits in a loop and places them in the string one by one. At the end of the conversion process, the routine prints the characters in the string in reverse order (remember, the divide algorithm extracts the L.O. digits first and the H.O. digits last, the opposite of the way you need to print them).
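The iterative approach just described can be sketched in a few lines. This is a Python model rather than HLA, used only because it is easy to run. The function name is hypothetical, and Python's arbitrary-size integers stand in for the extended precision divide.

```python
def to_decimal_iterative(value):
    """Collect decimal digits L.O. digit first, then reverse them for printing."""
    digits = []
    while True:
        value, remainder = divmod(value, 10)      # remainder is in 0..9
        digits.append(chr(ord("0") + remainder))
        if value == 0:
            break
    return "".join(reversed(digits))

print(to_decimal_iterative(1234567890))   # prints 1234567890
```

The loop always runs at least once, so a value of zero still produces the single digit "0".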
In this section, we will employ a recursive solution because it is a little more elegant. The recursive solution begins by dividing the value by 10 and saving the remainder in a local variable. If the quotient was not zero, the routine recursively calls itself to print any leading digits first. On return from the recursive call (which prints all the leading digits), the recursive algorithm prints the digit associated with the remainder to complete the operation. Here's how the operation works when printing the decimal value "123":
- (1) Divide 123 by 10. Quotient is 12, remainder is 3.
- (2) Save the remainder (3) in a local variable and recursively call the routine with the quotient.
- (3) [Recursive Entry 1] Divide 12 by 10. Quotient is 1, remainder is 2.
- (4) Save the remainder (2) in a local variable and recursively call the routine with the quotient.
- (5) [Recursive Entry 2] Divide 1 by 10. Quotient is 0, remainder is 1.
- (6) Save the remainder (1) in a local variable. Since the quotient is zero, don't call the routine recursively.
- (7) Output the remainder value saved in the local variable (1). Return to the caller (Recursive Entry 1).
- (8) [Return to Recursive Entry 1] Output the remainder value saved in the local variable in recursive entry 1 (2). Return to the caller (original invocation of the procedure).
- (9) [Original invocation] Output the remainder value saved in the local variable in the original call (3). Return to the original caller of the output routine.
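The steps above can be modeled as follows. This is a Python sketch, not HLA. The names divide_by_10 and put_unsigned are hypothetical, the value is assumed to be stored as 32-bit limbs with the L.O. limb first, and the divide works limb by limb from the H.O. end, carrying each remainder into the next division.

```python
def divide_by_10(limbs):
    """Divide a multi-limb value (32-bit limbs, L.O. first) by 10.

    Returns (quotient_limbs, remainder). Each step divides a 64-bit
    partial value (previous remainder : current limb) by ten.
    """
    quotient = [0] * len(limbs)
    remainder = 0
    for i in reversed(range(len(limbs))):
        partial = (remainder << 32) | limbs[i]
        quotient[i], remainder = divmod(partial, 10)
    return quotient, remainder

def put_unsigned(limbs, out):
    """Append the decimal digits of `limbs` to the list `out`, H.O. digit first."""
    quotient, remainder = divide_by_10(limbs)
    if any(quotient):                  # leading digits remain
        put_unsigned(quotient, out)    # emit them first
    out.append(chr(ord("0") + remainder))

digits = []
put_unsigned([123, 0, 0, 0], digits)
print("".join(digits))   # prints 123
```

Because the remainder carried into each step is less than ten, every partial quotient fits in a single 32-bit limb.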
The only operation that requires extended precision calculation in this entire algorithm is the "divide by 10" step. Everything else is simple and straightforward. We are in luck with this algorithm: because we are dividing an extended precision value by a value that easily fits into a double word, we can use the fast (and easy) extended precision division algorithm that uses the DIV instruction (see "Extended Precision Division" on page 864). The following program implements a 128-bit decimal output routine utilizing this technique.
    program out128;
    #include( "stdlib.hhf" );

    // 128-bit unsigned integer data type:

    type
        uns128: dword[4];

    // DivideBy10-
    //
    // Divides "dividend" by 10 using the fast
    // extended precision division algorithm
    // that employs the DIV instruction.
    //
    // Returns the quotient in "quotient".
    // Returns the remainder in EAX.
    // Trashes EBX, EDX, and EDI.

    procedure DivideBy10( dividend:uns128; var quotient:uns128 ); @nodisplay;
    begin DivideBy10;

        mov( quotient, edi );
        xor( edx, edx );
        mov( dividend[12], eax );
        mov( 10, ebx );
        div( ebx, edx:eax );
        mov( eax, [edi+12] );

        mov( dividend[8], eax );
        div( ebx, edx:eax );
        mov( eax, [edi+8] );

        mov( dividend[4], eax );
        div( ebx, edx:eax );
        mov( eax, [edi+4] );

        mov( dividend[0], eax );
        div( ebx, edx:eax );
        mov( eax, [edi+0] );
        mov( edx, eax );

    end DivideBy10;

    // Recursive version of putu128.
    // A separate "shell" procedure calls this so that
    // this code does not have to preserve all the registers
    // it uses (and DivideBy10 uses) on each recursive call.

    procedure recursivePutu128( b128:uns128 ); @nodisplay;
    var
        remainder: byte;

    begin recursivePutu128;

        // Divide by ten and get the remainder (the char to print).

        DivideBy10( b128, b128 );
        mov( al, remainder );      // Save away the remainder (0..9).

        // If the quotient (left in b128) is not zero, recursively
        // call this routine to print the H.O. digits.

        mov( b128[0], eax );       // If we logically OR all the dwords
        or( b128[4], eax );        // together, the result is zero if and
        or( b128[8], eax );        // only if the entire number is zero.
        or( b128[12], eax );
        if( @nz ) then

            recursivePutu128( b128 );

        endif;

        // Okay, now print the current digit.

        mov( remainder, al );
        or( '0', al );             // Converts 0..9 -> '0'..'9'.
        stdout.putc( al );

    end recursivePutu128;

    // Non-recursive shell to the above routine so we don't bother
    // saving all the registers on each recursive call.

    procedure putu128( b128:uns128 ); @nodisplay;
    begin putu128;

        push( eax );
        push( ebx );
        push( edx );
        push( edi );

        recursivePutu128( b128 );

        pop( edi );
        pop( edx );
        pop( ebx );
        pop( eax );

    end putu128;

    // Code to test the routines above:

    static
        b0: uns128 := [0, 0, 0, 0];             // decimal = 0
        b1: uns128 := [1234567890, 0, 0, 0];    // decimal = 1234567890
        b2: uns128 := [$8000_0000, 0, 0, 0];    // decimal = 2147483648
        b3: uns128 := [0, 1, 0, 0 ];            // decimal = 4294967296

        // Largest uns128 value
        // (decimal=340,282,366,920,938,463,463,374,607,431,768,211,455):

        b4: uns128 := [$FFFF_FFFF, $FFFF_FFFF, $FFFF_FFFF, $FFFF_FFFF ];

    begin out128;

        stdout.put( "b0 = " );
        putu128( b0 );
        stdout.newln();

        stdout.put( "b1 = " );
        putu128( b1 );
        stdout.newln();

        stdout.put( "b2 = " );
        putu128( b2 );
        stdout.newln();

        stdout.put( "b3 = " );
        putu128( b3 );
        stdout.newln();

        stdout.put( "b4 = " );
        putu128( b4 );
        stdout.newln();

    end out128;

Program 4.4 128-bit Extended Precision Decimal Output Routine

4.2.13.3 Extended Precision Signed Decimal Output
Once you have an extended precision unsigned decimal output routine, writing an extended precision signed decimal output routine is very easy. The basic algorithm takes the following form:
- Check the sign of the number. If it is non-negative, call the unsigned output routine to print it.
- If the number is negative, print a minus sign. Then negate the number and call the unsigned output routine to print it.
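The two steps above can be sketched as follows. This is a Python model, not HLA. The names negate, limbs_to_int, and format_signed are hypothetical, the value is assumed to be held as 32-bit limbs with the L.O. limb first, and Python's str() stands in for the unsigned output routine.

```python
MASK32 = 0xFFFFFFFF

def negate(limbs):
    """Return 0 - limbs (32-bit limbs, L.O. first), propagating the borrow."""
    result, borrow = [], 0
    for limb in limbs:
        diff = 0 - limb - borrow
        borrow = 1 if diff < 0 else 0
        result.append(diff & MASK32)
    return result

def limbs_to_int(limbs):
    """Reassemble the limbs into one unsigned Python integer."""
    return sum(limb << (32 * i) for i, limb in enumerate(limbs))

def format_signed(limbs):
    """Return the signed decimal text for a two's complement limb array."""
    if limbs[-1] & 0x8000_0000:                  # H.O. bit set: negative
        return "-" + str(limbs_to_int(negate(limbs)))
    return str(limbs_to_int(limbs))

print(format_signed([MASK32] * 4))   # prints -1
```

Negating the most negative value leaves the bit pattern unchanged, but because the result is then printed as an unsigned number, the magnitude still comes out correctly.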
To check the sign of an extended precision integer, of course, you simply test the H.O. bit of the number. To negate a large value, probably the best solution is to subtract that value from zero. Here's a quick version of puti128 that uses the putu128 routine from the previous section.

    procedure puti128( i128: int128 ); @nodisplay;
    begin puti128;

        if( (type int32 i128[12]) < 0 ) then

            stdout.put( '-' );

            // Extended Precision Negation:

            push( eax );
            mov( 0, eax );
            sub( i128[0], eax );
            mov( eax, i128[0] );

            mov( 0, eax );
            sbb( i128[4], eax );
            mov( eax, i128[4] );

            mov( 0, eax );
            sbb( i128[8], eax );
            mov( eax, i128[8] );

            mov( 0, eax );
            sbb( i128[12], eax );
            mov( eax, i128[12] );
            pop( eax );

        endif;
        putu128( (type uns128 i128) );

    end puti128;

4.2.13.4 Extended Precision Formatted I/O
The code in the previous two sections prints signed and unsigned integers using the minimum number of necessary print positions. To create nicely formatted tables of values you will need the equivalent of a puti128Size or putu128Size routine. Once you have the "unformatted" versions of these routines, implementing the formatted versions is very easy.
The first step is to write "i128Size" and "u128Size" routines that compute the minimum number of print positions needed to display the value. The algorithm to accomplish this is very similar to the numeric output routines. In fact, the only difference is that you initialize a counter to zero upon entry into the routine (e.g., in the non-recursive shell routine) and you increment this counter rather than outputting a digit on each recursive call. (Don't forget to increment the counter inside "i128Size" if the number is negative; you must allow for the output of the minus sign.) After the calculation is complete, these routines should return the size of the operand in the EAX register.
Once you have the "i128Size" and "u128Size" routines, writing the formatted output routines is very easy. Upon initial entry into puti128Size or putu128Size, these routines call the corresponding "size" routine to determine the number of print positions the number requires. If the value that the "size" routine returns is greater than or equal to the absolute value of the minimum size parameter (passed into puti128Size or putu128Size), all you need to do is call the put routine to print the value; no other formatting is necessary. If the absolute value of the size parameter is greater than the value i128Size or u128Size returns, then the program must compute the difference between these two values and print that many spaces (or other filler character) before printing the number (if the size parameter is positive) or after printing the number (if the size parameter is negative). The actual implementation of these two routines is left as an exercise at the end of the volume. If you have any further questions about how to do this, you can take a look at the HLA Standard Library code for routines like stdout.putu32Size.
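As a rough illustration of the size and padding logic, here is a Python model (not HLA, and not the Standard Library's implementation). The function names and the fill parameter are hypothetical, and Python integers stand in for the 128-bit operands.

```python
def u128_size(value):
    """Count decimal digits by repeated division by ten (at least one digit)."""
    count = 1
    value //= 10
    while value:
        count += 1
        value //= 10
    return count

def i128_size(value):
    """Digits plus one print position for the minus sign when negative."""
    return u128_size(abs(value)) + (1 if value < 0 else 0)

def format_i128_size(value, width, fill=" "):
    """Pad to abs(width): fill before the number if width > 0, after if width < 0."""
    text = str(value)
    padding = fill * max(0, abs(width) - i128_size(value))
    return padding + text if width >= 0 else text + padding

print("[" + format_i128_size(-123, 8) + "]")    # prints [    -123]
print("[" + format_i128_size(-123, -8) + "]")   # prints [-123    ]
```

When the number needs at least as many positions as the requested width, the padding is empty and the value prints unchanged.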
4.2.13.5 Extended Precision Input Routines
There are a couple of fundamental differences between the extended precision output routines and the extended precision input routines. First of all, numeric output generally occurs without possibility of error[1]; numeric input, on the other hand, must handle the very real possibility of an input error such as illegal characters and numeric overflow. Also, HLA's Standard Library and run-time system encourage a slightly different approach to input conversion. This section discusses those issues that differentiate input conversion from output conversion.
Perhaps the biggest difference between input and output conversion is the fact that output conversion is unbracketed. That is, when converting a numeric value to a string of characters for output, the output routine does not concern itself with the characters preceding the output string, nor does it concern itself with the characters following the numeric value in the output stream. Numeric output routines convert their data to a string and print that string without considering the context (i.e., the characters before and after the string representation of the numeric value). Numeric input routines cannot be so cavalier; the contextual information surrounding the numeric string is very important.
A typical numeric input operation consists of reading a string of characters from the user and then translating this string of characters into an internal numeric representation. For example, a statement like "stdin.get(i32);" typically reads a line of text from the user and converts a sequence of digits appearing at the beginning of that line of text into a 32-bit signed integer (assuming i32 is an int32 object). Note, however, that the stdin.get routine skips over certain characters in the string that may appear before the actual numeric characters. For example, stdin.get automatically skips any leading spaces in the string. Likewise, the input string may contain additional data beyond the end of the numeric input (for example, it is possible to read two integer values from the same input line); therefore, the input conversion routine must somehow determine where the numeric data ends in the input stream. Fortunately, HLA provides a simple mechanism that lets you easily determine the start and end of the input data: the Delimiters character set.
The Delimiters character set is a variable, internal to HLA, that contains the set of legal characters that may precede or follow a legal numeric value. By default, this character set includes the end of string marker (a zero byte), a tab character, a line feed character, a carriage return character, a space, a comma, a colon, and a semicolon. Therefore, HLA's numeric input routines will automatically ignore any characters in this set that occur on input before a numeric string. Likewise, characters from this set may legally follow a numeric string on input (conversely, if any non-delimiter character follows the numeric string, HLA will raise an ex.ConversionError exception).
The Delimiters character set is a private variable inside the HLA Standard Library. Although you do not have direct access to this object, the HLA Standard Library does provide two accessor functions, conv.setDelimiters and conv.getDelimiters, that let you access and modify the value of this character set. These two functions have the following prototypes (found in the "conv.hhf" header file):

    procedure conv.setDelimiters( Delims:cset );
    procedure conv.getDelimiters( var Delims:cset );

The conv.setDelimiters procedure copies the value of the Delims parameter into the internal Delimiters character set. Therefore, you can use this procedure to change the character set if you want to use a different set of delimiters for numeric input. The conv.getDelimiters call returns a copy of the internal Delimiters character set in the variable you pass as a parameter to the conv.getDelimiters procedure. We will use the value returned by conv.getDelimiters to determine the end of numeric input when writing our own extended precision numeric input routines.
When reading a numeric value from the user, the first step will be to get a copy of the Delimiters character set. The second step is to read and discard input characters from the user as long as those characters are members of the Delimiters character set. Once a character is found that is not in the Delimiters set, the input routine must check this character and verify that it is a legal numeric character. If not, the program should raise an ex.IllegalChar exception if the character's value is outside the range $00..$7f, or it should raise the ex.ConversionError exception if the character is not a legal numeric character. Once the routine encounters a numeric character, it should continue reading characters as long as they are valid numeric characters; while reading the characters, the conversion routine should be translating them to the internal representation of the numeric data. If, during conversion, an overflow occurs, the procedure should raise the ex.ValueOutOfRange exception.
Conversion to numeric representation should end when the procedure encounters the first delimiter character at the end of the string of digits. However, it is very important that the procedure does not consume the delimiter character that ends the string. That is, the following is incorrect:
    static
        Delimiters: cset;
         .
         .
         .
        conv.getDelimiters( Delimiters );

        // Skip over leading delimiters in the string:

        while( stdin.getc() in Delimiters ) do /* getc did the work */ endwhile;
        while( al in {'0'..'9'} ) do

            // Convert character in AL to numeric representation and
            // accumulate result...

            stdin.getc();

        endwhile;
        if( al not in Delimiters ) then

            raise( ex.ConversionError );

        endif;

The first WHILE loop reads a sequence of delimiter characters. When this first WHILE loop ends, the character in AL is not a delimiter character. So far, so good. The second WHILE loop processes a sequence of decimal digits. First, it checks the character read in the previous WHILE loop to see if it is a decimal digit; if so, it processes that digit and reads the next character. This process continues until the call to stdin.getc (at the bottom of the loop) reads a non-digit character. After the second WHILE loop, the program checks the last character read to ensure that it is a legal delimiter character for a numeric input value.
The problem with this algorithm is that it consumes the delimiter character after the numeric string. For example, the colon symbol is a legal delimiter in the default Delimiters character set. If the user types the input "123:456" and executes the code above, this code will properly convert "123" to the numeric value one hundred twenty-three. However, the very next character read from the input stream will be the character "4", not the colon character (":"). While this may be acceptable in certain circumstances, most programmers expect numeric input routines to consume only leading delimiter characters and the numeric digit characters. They do not expect the input routine to consume any trailing delimiter characters (e.g., many programs will read the next character and expect a colon as input if presented with the string "123:456"). Since stdin.getc consumes an input character, and there is no way to "put the character back" onto the input stream, some other way of reading input characters from the user, one that doesn't consume those characters, is needed[2].
The HLA Standard Library comes to the rescue by providing the stdin.peekc function. Like stdin.getc, the stdin.peekc routine reads the next input character from HLA's internal buffer. There are two major differences between stdin.peekc and stdin.getc. First, stdin.peekc will not force the input of a new line of text from the user if the current input line is empty (or you've already read all the text from the input line). Instead, stdin.peekc simply returns zero in the AL register to indicate that there are no more characters on the input line. Since #0 is (by default) a legal delimiter character for numeric values, and the end of line is certainly a legal way to terminate numeric input, this works out rather well. The second difference between stdin.getc and stdin.peekc is that stdin.peekc does not consume the character read from the input buffer. If you call stdin.peekc several times in a row, it will always return the same character; likewise, if you call stdin.getc immediately after stdin.peekc, the call to stdin.getc will generally return the same character as returned by stdin.peekc (the only exception being the end of line condition). So although we cannot put characters back onto the input stream after we've read them with stdin.getc, we can peek ahead at the next character on the input stream and base our logic on that character's value. A corrected version of the previous algorithm might be the following:
    static
        Delimiters: cset;
         .
         .
         .
        conv.getDelimiters( Delimiters );

        // Skip over leading delimiters in the string:

        while( stdin.peekc() in Delimiters ) do

            // If at the end of the input buffer, we must explicitly read a
            // new line of text from the user. stdin.peekc does not do this
            // for us.

            if( al = #0 ) then

                stdin.readLn();

            else

                stdin.getc();   // Remove delimiter from the input stream.

            endif;

        endwhile;
        while( stdin.peekc() in {'0'..'9'} ) do

            stdin.getc();       // Remove the input character from the input stream.

            // Convert character in AL to numeric representation and
            // accumulate result...

        endwhile;
        if( al not in Delimiters ) then

            raise( ex.ConversionError );

        endif;

Note that the call to stdin.peekc in the second WHILE does not consume the delimiter character when the expression evaluates false. Hence, the delimiter character will be the next character read after this algorithm finishes.
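The difference between the two loops can be demonstrated with a small model. This is a Python sketch, not HLA. The Line class and both reader functions are hypothetical, the delimiter set is the default one described earlier, and the model is simplified in that it never fetches a new line when the buffer is empty.

```python
DELIMITERS = set("\0\t\n\r ,:;")
DIGITS = set("0123456789")

class Line:
    """A hypothetical one-line input buffer with getc/peekc semantics."""
    def __init__(self, text):
        self.text, self.pos = text, 0
    def peekc(self):
        # Returns "\0" at end of line and never consumes anything.
        return self.text[self.pos] if self.pos < len(self.text) else "\0"
    def getc(self):
        ch = self.peekc()
        if ch != "\0":
            self.pos += 1
        return ch

def read_with_getc(line):
    ch = line.getc()
    while ch in DELIMITERS and ch != "\0":
        ch = line.getc()
    value = 0
    while ch in DIGITS:
        value = value * 10 + int(ch)
        ch = line.getc()            # also swallows the trailing delimiter
    if ch not in DELIMITERS:
        raise ValueError("conversion error")
    return value

def read_with_peekc(line):
    while line.peekc() in DELIMITERS and line.peekc() != "\0":
        line.getc()
    value = 0
    while line.peekc() in DIGITS:
        value = value * 10 + int(line.getc())
    if line.peekc() not in DELIMITERS:
        raise ValueError("conversion error")
    return value

a, b = Line("123:456"), Line("123:456")
print(read_with_getc(a), repr(a.peekc()))    # prints 123 '4'
print(read_with_peekc(b), repr(b.peekc()))   # prints 123 ':'
```

Both readers return 123, but only the peek-based one leaves the colon in the buffer for whatever code reads next.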
The only remaining comment to make about numeric input is to point out that the HLA Standard Library input routines allow arbitrary underscores to appear within a numeric string. The input routines ignore these underscore characters. This allows the user to input strings like "FFFF_F012" and "1_023_596", which are a little more readable than "FFFFF012" or "1023596". Allowing underscores (or any other symbol you choose) within a numeric input routine is quite simple; just modify the second WHILE loop above as follows:

    while( stdin.peekc() in {'0'..'9', '_'} ) do

        stdin.getc();   // Read the character from the input stream.

        // Ignore underscores while processing numeric input.

        if( al <> '_' ) then

            // Convert character in AL to numeric representation and
            // accumulate result...

        endif;

    endwhile;

4.2.13.6 Extended Precision Hexadecimal Input
As was the case for numeric output, hexadecimal input is the easiest numeric input routine to write. The basic algorithm for hexadecimal string to numeric conversion is the following:
- Initialize the extended precision value to zero.
- For each input character that is a valid hexadecimal digit, do the following:
  - Convert the hexadecimal character to a value in the range 0..15 ($0..$F).
  - If the H.O. four bits of the extended precision value are non-zero, raise an exception.
  - Multiply the current extended precision value by 16 (i.e., shift left four bits).
  - Add the converted hexadecimal digit value to the accumulator.
- Check the last input character to ensure it is a valid delimiter. Raise an exception if it is not.
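The steps above can be sketched as follows. This is a Python model rather than HLA. The names shift_left_4 and parse_hex128 are hypothetical, the accumulator is assumed to be four 32-bit limbs with the L.O. limb first, and Python's built-in exceptions stand in for the HLA exceptions. Underscores are skipped here as a convenience.

```python
MASK32 = 0xFFFFFFFF
HEX_DIGITS = set("0123456789abcdefABCDEF")
DELIMITERS = set("\0\t\n\r ,:;")

def shift_left_4(limbs, digit):
    """Shift the limbs left four bits and place `digit` in the L.O. nibble."""
    carry = digit
    result = []
    for limb in limbs:
        result.append(((limb << 4) | carry) & MASK32)
        carry = limb >> 28          # the four bits that move to the next limb
    return result

def parse_hex128(text):
    """Return (limbs, index of the first unconsumed character)."""
    limbs, pos = [0, 0, 0, 0], 0
    while pos < len(text) and text[pos] in DELIMITERS:
        pos += 1
    while pos < len(text) and (text[pos] in HEX_DIGITS or text[pos] == "_"):
        ch = text[pos]
        pos += 1
        if ch == "_":
            continue
        if limbs[-1] >> 28:         # H.O. four bits already in use
            raise OverflowError("value out of range")
        limbs = shift_left_4(limbs, int(ch, 16))
    if pos < len(text) and text[pos] not in DELIMITERS:
        raise ValueError("conversion error")
    return limbs, pos

text = "  FFFF_F012:rest"
limbs, pos = parse_hex128(text)
print(f"{limbs[0]:08X}", text[pos:])   # prints FFFFF012 :rest
```

The overflow test comes before the shift, so a 33rd significant digit is rejected without disturbing the value accumulated so far.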
The following program implements this extended precision hexadecimal input routine for 128-bit values.
    program Xin128;
    #include( "stdlib.hhf" );

    // 128-bit unsigned integer data type:

    type
        b128: dword[4];

    procedure getb128( var inValue:b128 ); @nodisplay;
    const
        HexChars := {'0'..'9', 'a'..'f', 'A'..'F', '_'};

    var
        Delimiters: cset;
        LocalValue: b128;

    begin getb128;

        push( eax );
        push( ebx );

        // Get a copy of the HLA standard numeric input delimiters:

        conv.getDelimiters( Delimiters );

        // Initialize the numeric input value to zero:

        xor( eax, eax );
        mov( eax, LocalValue[0] );
        mov( eax, LocalValue[4] );
        mov( eax, LocalValue[8] );
        mov( eax, LocalValue[12] );

        // By default, #0 is a member of the HLA Delimiters
        // character set. However, someone may have called
        // conv.setDelimiters and removed this character
        // from the internal Delimiters character set. This
        // algorithm depends upon #0 being in the Delimiters
        // character set, so let's add that character in
        // at this point just to be sure.

        cs.unionChar( #0, Delimiters );

        // If we're at the end of the current input
        // line (or the program has yet to read any input),
        // force the input of an actual character.

        if( stdin.peekc() = #0 ) then

            stdin.readLn();

        endif;

        // Skip the delimiters found on input. This code is
        // somewhat convoluted because stdin.peekc does not
        // force the input of a new line of text if the current
        // input buffer is empty. We have to force that input
        // ourselves in the event the input buffer is empty.

        while( stdin.peekc() in Delimiters ) do

            // If we're at the end of the line, read a new line
            // of text from the user; otherwise, remove the
            // delimiter character from the input stream.

            if( al = #0 ) then

                stdin.readLn();   // Force a new input line.

            else

                stdin.getc();     // Remove the delimiter from the input buffer.

            endif;

        endwhile;

        // Read the hexadecimal input characters and convert
        // them to the internal representation:

        while( stdin.peekc() in HexChars ) do

            // Actually read the character to remove it from the
            // input buffer.

            stdin.getc();

            // Ignore underscores, process everything else.

            if( al <> '_' ) then

                // Masking all of EAX (rather than just AL) also clears
                // the H.O. bits of EAX, which the ADD below relies on.

                if( al in '0'..'9' ) then

                    and( $f, eax );   // '0'..'9' -> 0..9

                else

                    and( $f, eax );   // 'a'/'A'..'f'/'F' -> 1..6
                    add( 9, eax );    // 1..6 -> 10..15

                endif;

                // Conversion algorithm is the following:
                //
                // (1) LocalValue := LocalValue * 16.
                // (2) LocalValue := LocalValue + al
                //
                // Note that "* 16" is easily accomplished by
                // shifting LocalValue to the left four bits.
                //
                // Overflow occurs if the H.O. four bits of LocalValue
                // contain a non-zero value prior to this operation.

                // First, check for overflow:

                test( $F0, (type byte LocalValue[15]) );
                if( @nz ) then

                    raise( ex.ValueOutOfRange );

                endif;

                // Now multiply LocalValue by 16 and add in
                // the current hexadecimal digit (in EAX).

                mov( LocalValue[8], ebx );
                shld( 4, ebx, LocalValue[12] );
                mov( LocalValue[4], ebx );
                shld( 4, ebx, LocalValue[8] );
                mov( LocalValue[0], ebx );
                shld( 4, ebx, LocalValue[4] );
                shl( 4, ebx );
                add( eax, ebx );
                mov( ebx, LocalValue[0] );

            endif;

        endwhile;

        // Okay, we've encountered a non-hexadecimal character.
        // Let's make sure it's a valid delimiter character.
        // Raise the ex.ConversionError exception if it's invalid.

        if( al not in Delimiters ) then

            raise( ex.ConversionError );

        endif;

        // Okay, this conversion has been a success. Let's store
        // away the converted value into the output parameter.

        mov( inValue, ebx );
        mov( LocalValue[0], eax );
        mov( eax, [ebx] );

        mov( LocalValue[4], eax );
        mov( eax, [ebx+4] );

        mov( LocalValue[8], eax );
        mov( eax, [ebx+8] );

        mov( LocalValue[12], eax );
        mov( eax, [ebx+12] );

        pop( ebx );
        pop( eax );

    end getb128;

    // Code to test the routines above:

    static
        b1: b128;

    begin Xin128;

        stdout.put( "Input a 128-bit hexadecimal value: " );
        getb128( b1 );
        stdout.put
        (
            "The value is: $",
            b1[12], '_',
            b1[8], '_',
            b1[4], '_',
            b1[0],
            nl
        );

    end Xin128;

Program 4.5 Extended Precision Hexadecimal Input

Extending this code to handle objects that are not 128 bits long is very easy.
There are only three changes necessary: you must zero out the whole object at the beginning of the getb128 routine; when checking for overflow (the "test( $F0, (type byte LocalValue[15]));" instruction) you must test the H.O. four bits of the new object you're processing; and you must modify the code that multiplies LocalValue by 16 (via SHLD) so that it multiplies your object by 16 (i.e., shifts it to the left four bits).
4.2.13.7 Extended Precision Unsigned Decimal Input
The algorithm for extended precision unsigned decimal input is nearly identical to that for hexadecimal input. In fact, the only difference (beyond only accepting decimal digits) is that you multiply the extended precision value by 10 rather than 16 for each input character (in general, the algorithm is the same for any base; just multiply the accumulating value by the input base). The following code demonstrates how to write a 128-bit unsigned decimal input routine.
    program Uin128;
    #include( "stdlib.hhf" );

    // 128-bit unsigned integer data type:

    type
        u128: dword[4];

    procedure getu128( var inValue:u128 ); @nodisplay;
    var
        Delimiters: cset;
        LocalValue: u128;
        PartialSum: u128;

    begin getu128;

        push( eax );
        push( ebx );
        push( ecx );
        push( edx );

        // Get a copy of the HLA standard numeric input delimiters:

        conv.getDelimiters( Delimiters );

        // Initialize the numeric input value to zero:

        xor( eax, eax );
        mov( eax, LocalValue[0] );
        mov( eax, LocalValue[4] );
        mov( eax, LocalValue[8] );
        mov( eax, LocalValue[12] );

        // By default, #0 is a member of the HLA Delimiters
        // character set. However, someone may have called
        // conv.setDelimiters and removed this character
        // from the internal Delimiters character set. This
        // algorithm depends upon #0 being in the Delimiters
        // character set, so let's add that character in
        // at this point just to be sure.

        cs.unionChar( #0, Delimiters );

        // If we're at the end of the current input
        // line (or the program has yet to read any input),
        // force the input of an actual character.

        if( stdin.peekc() = #0 ) then

            stdin.readLn();

        endif;

        // Skip the delimiters found on input. This code is
        // somewhat convoluted because stdin.peekc does not
        // force the input of a new line of text if the current
        // input buffer is empty. We have to force that input
        // ourselves in the event the input buffer is empty.

        while( stdin.peekc() in Delimiters ) do

            // If we're at the end of the line, read a new line
            // of text from the user; otherwise, remove the
            // delimiter character from the input stream.

            if( al = #0 ) then

                stdin.readLn();   // Force a new input line.

            else

                stdin.getc();     // Remove the delimiter from the input buffer.

            endif;

        endwhile;

        // Read the decimal input characters and convert
        // them to the internal representation. The underscore
        // is accepted here so that the test below has something
        // to skip:

        while( stdin.peekc() in {'0'..'9', '_'} ) do

            // Actually read the character to remove it from the
            // input buffer.

            stdin.getc();

            // Ignore underscores, process everything else.

            if( al <> '_' ) then

                // Masking all of EAX (rather than just AL) also
                // clears the H.O. bits of EAX before we store it.

                and( $f, eax );              // '0'..'9' -> 0..9
                mov( eax, PartialSum[0] );   // Save to add in later.

                // Conversion algorithm is the following:
                //
                // (1) LocalValue := LocalValue * 10.
                // (2) LocalValue := LocalValue + al
                //
                // First, multiply LocalValue by 10:

                mov( 10, eax );
                mul( LocalValue[0], eax );
                mov( eax, LocalValue[0] );
                mov( edx, PartialSum[4] );

                mov( 10, eax );
                mul( LocalValue[4], eax );
                mov( eax, LocalValue[4] );
                mov( edx, PartialSum[8] );

                mov( 10, eax );
                mul( LocalValue[8], eax );
                mov( eax, LocalValue[8] );
                mov( edx, PartialSum[12] );

                mov( 10, eax );
                mul( LocalValue[12], eax );
                mov( eax, LocalValue[12] );

                // Check for overflow. This occurs if EDX
                // contains a non-zero value.

                if( edx <> 0 ) then

                    raise( ex.ValueOutOfRange );

                endif;

                // Add in the partial sums (including the
                // most recently converted character).

                mov( PartialSum[0], eax );
                add( eax, LocalValue[0] );

                mov( PartialSum[4], eax );
                adc( eax, LocalValue[4] );

                mov( PartialSum[8], eax );
                adc( eax, LocalValue[8] );

                mov( PartialSum[12], eax );
                adc( eax, LocalValue[12] );

                // Another check for overflow. If there
                // was a carry out of the extended precision
                // addition above, we've got overflow.

                if( @c ) then

                    raise( ex.ValueOutOfRange );

                endif;

            endif;

        endwhile;

        // Okay, we've encountered a non-decimal character.
        // Let's make sure it's a valid delimiter character.
        // Raise the ex.ConversionError exception if it's invalid.

        if( al not in Delimiters ) then

            raise( ex.ConversionError );

        endif;

        // Okay, this conversion has been a success. Let's store
        // away the converted value into the output parameter.

        mov( inValue, ebx );
        mov( LocalValue[0], eax );
        mov( eax, [ebx] );

        mov( LocalValue[4], eax );
        mov( eax, [ebx+4] );

        mov( LocalValue[8], eax );
        mov( eax, [ebx+8] );

        mov( LocalValue[12], eax );
        mov( eax, [ebx+12] );

        pop( edx );
        pop( ecx );
        pop( ebx );
        pop( eax );

    end getu128;

    // Code to test the routines above:

    static
        b1: u128;

    begin Uin128;

        stdout.put( "Input a 128-bit decimal value: " );
        getu128( b1 );
        stdout.put
        (
            "The value is: $",
            b1[12], '_',
            b1[8], '_',
            b1[4], '_',
            b1[0],
            nl
        );

    end Uin128;

Program 4.6 Extended Precision Unsigned Decimal Input

As with hexadecimal input, extending this decimal input routine to some number of bits beyond 128 is fairly easy. All you need to do is modify the code that zeros out the LocalValue variable and the code that multiplies LocalValue by ten (overflow checking is done in this same code, so there are only two spots in this code that require modification).
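The multiply-and-add step is the part that changes with the operand size, so here is a small model of it. This is a Python sketch, not HLA. The names times_10_plus and parse_u128 are hypothetical, the limbs are assumed to be 32 bits wide with the L.O. limb first, and OverflowError stands in for ex.ValueOutOfRange. The function works for any number of limbs.

```python
MASK32 = 0xFFFFFFFF

def times_10_plus(limbs, digit):
    """Return limbs * 10 + digit, or raise OverflowError if it does not fit.

    Each limb is multiplied on its own; the high half of every product is
    set aside and added back in one limb higher, together with the digit.
    """
    low, partial = [], [digit]
    for limb in limbs:
        product = limb * 10
        low.append(product & MASK32)
        partial.append(product >> 32)
    if partial[len(limbs)]:          # high half of the H.O. product
        raise OverflowError("value out of range")
    result, carry = [], 0
    for a, b in zip(low, partial):
        total = a + b + carry
        result.append(total & MASK32)
        carry = total >> 32
    if carry:                        # carry out of the H.O. limb
        raise OverflowError("value out of range")
    return result

def parse_u128(digits):
    """Convert a string of decimal digits (underscores allowed) to four limbs."""
    limbs = [0, 0, 0, 0]
    for ch in digits:
        if ch != "_":
            limbs = times_10_plus(limbs, int(ch))
    return limbs

print(parse_u128("4_294_967_296"))   # prints [0, 1, 0, 0]
```

Both overflow checks are needed: the first catches a product that is already too wide, the second catches a carry produced only when the partial sums are added.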
4.2.13.8 Extended Precision Signed Decimal Input
Once you have an unsigned decimal input routine, writing a signed decimal input routine is easy. The following algorithm describes how to accomplish this:
- Consume any delimiter characters at the beginning of the input stream.
- If the next input character is a minus sign, consume this character and set a flag noting that the number is negative.
- Call the unsigned decimal input routine to convert the rest of the string to an integer.
- Check the return result to make sure its H.O. bit is clear. Raise the ex.ValueOutOfRange exception if the H.O. bit of the result is set.
- If the sign flag was set in step two above, negate the result.
The actual code is left as a programming exercise at the end of this volume.
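As a starting point for that exercise, the listed steps might be modeled as follows. This is a Python sketch and not a solution in HLA. The name parse_i128 is hypothetical, a plain Python integer stands in for the unsigned input routine, and Python's built-in exceptions stand in for the HLA exceptions.

```python
DELIMITERS = set("\0\t\n\r ,:;")
DIGITS = "0123456789"

def parse_i128(text):
    """Return (value, index of the first unconsumed character)."""
    pos = 0
    while pos < len(text) and text[pos] in DELIMITERS:    # skip delimiters
        pos += 1
    negative = pos < len(text) and text[pos] == "-"       # note the sign
    if negative:
        pos += 1
    start, magnitude = pos, 0
    while pos < len(text) and text[pos] in DIGITS:        # unsigned conversion
        magnitude = magnitude * 10 + int(text[pos])
        pos += 1
    if pos == start:
        raise ValueError("conversion error")
    if pos < len(text) and text[pos] not in DELIMITERS:
        raise ValueError("conversion error")
    if magnitude >> 127:                                   # H.O. bit must be clear
        raise OverflowError("value out of range")
    return (-magnitude if negative else magnitude), pos

print(parse_i128(" -42,7"))   # prints (-42, 4)
```

Followed literally, the H.O. bit check also rejects the most negative 128-bit value, whose magnitude has only that bit set. A full implementation may want to treat that input as a special case.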
[1] Technically speaking, this isn't entirely true. It is possible for a device error (e.g., disk full) to occur. The likelihood of this is so low that we can effectively ignore this possibility.
[2] The HLA Standard Library routines actually buffer up input lines in a string and process characters out of the string. This makes it easy to "peek" ahead one character when looking for a delimiter to end the input value. Your code can also do this; however, the code in this chapter will use a different approach.