[Next] [Art of Assembly] [Randall Hyde] [WEBster Home Page]


Art of Assembly Language: Chapter Twenty



Art of Assembly/Win32 Edition is now available. Let me read that version.




Important Notice: As you have probably discovered by now, I am no longer updating this document. The reason is quite simple: I'm working on a Windows version of "The Art of Assembly Language Programming". In the past I have encouraged individuals to send me corrections to this text. However, as I am no longer updating this material, don't expect those corrections to appear in a future release. I am collecting errata that I will post to Webster someday, so feel free to continue sending corrections to AoA/DOS (16-bit) to rhyde@cs.ucr.edu. If you're more interested in leading edge material, please see the information about the Win/32 edition, above.




Chapter 20 - The PC Keyboard
20.1 - Keyboard Basics
20.2 - The Keyboard Hardware Interface
20.3 - The Keyboard DOS Interface
20.4 - The Keyboard BIOS Interface
20.5 - The Keyboard Interrupt Service Routine
20.6 - Patching into the INT 9 Interrupt Service Routine
20.7 - Simulating Keystrokes
20.7.1 - Stuffing Characters in the Type Ahead Buffer
20.7.2 - Using the 80x86 Trace Flag to Simulate IN AL, 60H Instructions
20.7.3 - Using the 8042 Microcontroller to Simulate Keystrokes



Chapter 20 The PC Keyboard


The PC's keyboard is the primary human input device on the system. Although it seems rather mundane, most software relies on it for input, so learning how to program the keyboard properly is very important to application developers.

IBM and countless keyboard manufacturers have produced numerous keyboards for PCs and compatibles. Most modern keyboards provide at least 101 different keys and are reasonably compatible with the IBM PC/AT 101 Key Enhanced Keyboard. Those that do provide extra keys generally program those keys to emit a sequence of other keystrokes or allow the user to program a sequence of keystrokes on the extra keys. Since the 101 key keyboard is ubiquitous, we will assume its use in this chapter.

When IBM first developed the PC, they used a very simple interface between the keyboard and the computer. When IBM introduced the PC/AT, they completely redesigned the keyboard interface. Since the introduction of the PC/AT, almost every keyboard has conformed to the PC/AT standard. Even when IBM introduced the PS/2 systems, the changes to the keyboard interface were minor and upwards compatible with the PC/AT design. Therefore, this chapter will also limit its attention to PC/AT compatible devices since so few PC/XT keyboards and systems are still in use.

There are five main components to the keyboard we will consider in this chapter - basic keyboard information, the DOS interface, the BIOS interface, the int 9 keyboard interrupt service routine, and the hardware interface to the keyboard. The last section of this chapter will discuss how to fake keyboard input into an application.


20.1 Keyboard Basics


The PC's keyboard is a computer system in its own right. Buried inside the keyboard's case is an 8042 microcontroller chip that constantly scans the switches on the keyboard to see if any keys are down. This processing goes on in parallel with the normal activities of the PC, hence the keyboard never misses a keystroke because the 80x86 in the PC is busy.

A typical keystroke starts with the user pressing a key on the keyboard. This closes an electrical contact in the switch so the microcontroller can sense that you've pressed the switch. Alas, switches (being the mechanical things that they are) do not always close (make contact) so cleanly. Often, the contacts bounce off one another several times before coming to rest and making a solid contact. If the microcontroller chip reads the switch constantly, these bouncing contacts will look like a very quick series of key presses and releases. This could send multiple keystrokes to the main computer, a phenomenon known as keybounce, common to many cheap and old keyboards. But even on the most expensive and newest keyboards, keybounce is a problem if you look at the switch a million times a second; mechanical switches simply cannot settle down that quickly. Most keyboard scanning algorithms, therefore, control how often they scan the keyboard. A typical inexpensive key will settle down within five milliseconds, so if the keyboard scanning software only looks at the key every ten milliseconds or so, the controller will effectively miss the keybounce.
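The effect of the scan rate on keybounce can be illustrated with a small simulation. This is only a sketch in Python, not keyboard firmware: the function names `sample_switch` and `count_presses`, the bounce pattern, and the timing values are all assumptions chosen for illustration.

```python
def sample_switch(raw_states, period_ms):
    """Return the switch states seen when sampling every period_ms.

    raw_states[t] is the contact state (0 = open, 1 = closed) at
    millisecond t. This models the scan rate only, not real firmware.
    """
    return [raw_states[t] for t in range(0, len(raw_states), period_ms)]


def count_presses(samples):
    """Count open-to-closed transitions in a list of samples."""
    presses = 0
    previous = 0
    for state in samples:
        if state == 1 and previous == 0:
            presses += 1
        previous = state
    return presses


# Hypothetical key press: the contacts bounce for the first 5 ms, stay
# closed for 35 ms, and are then released for 10 ms.
bouncy = [1, 0, 1, 0, 1] + [1] * 35 + [0] * 10

fast = sample_switch(bouncy, 1)    # look at the switch every millisecond
slow = sample_switch(bouncy, 10)   # look at the switch every 10 ms

print(count_presses(fast))   # the bounce shows up as extra presses
print(count_presses(slow))   # the bounce falls between samples
```

With this made-up waveform the one-millisecond scan reports several presses while the ten-millisecond scan reports a single press, which is the behavior the paragraph above describes.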

Simply noting that a key is pressed is not sufficient reason to generate a key code. A user may hold a key down for many tens of milliseconds before releasing it. The keyboard controller must not generate a new key sequence every time it scans the keyboard and finds a key held down. Instead, it should generate a single key code value when the key goes from an up position to the down position (a down key operation). Upon detecting a down key stroke, the microcontroller sends a keyboard scan code to the PC. The scan code is not related to the ASCII code for that key; it is an arbitrary value IBM chose when they first developed the PC's keyboard.

The PC keyboard actually generates two scan codes for every key you press. It generates a down code when you press a key and an up code when you release the key. The 8042 microcontroller chip transmits these scan codes to the PC where they are processed by the keyboard's interrupt service routine. Having separate up and down codes is important because certain keys (like shift, control, and alt) are only meaningful when held down. By generating up codes for all the keys, the keyboard ensures that the keyboard interrupt service routine knows which keys are pressed while the user is holding down one of these modifier keys. The following table lists the scan codes that the keyboard microcontroller transmits to the PC:

PC Keyboard Scan Codes (in hex)
Key    Down  Up    Key      Down  Up    Key         Down  Up    Key           Down      Up
Esc    1     81    [ {      1A    9A    , <         33    B3    center (kpd)  4C        CC
1 !    2     82    ] }      1B    9B    . >         34    B4    right (kpd)   4D        CD
2 @    3     83    Enter    1C    9C    / ?         35    B5    + (kpd)       4E        CE
3 #    4     84    Ctrl     1D    9D    R shift     36    B6    end (kpd)     4F        CF
4 $    5     85    A        1E    9E    * PrtSc     37    B7    down (kpd)    50        D0
5 %    6     86    S        1F    9F    alt         38    B8    pgdn (kpd)    51        D1
6 ^    7     87    D        20    A0    space       39    B9    ins (kpd)     52        D2
7 &    8     88    F        21    A1    CAPS        3A    BA    del (kpd)     53        D3
8 *    9     89    G        22    A2    F1          3B    BB    / (kpd)       E0 35     B5
9 (    0A    8A    H        23    A3    F2          3C    BC    enter (kpd)   E0 1C     9C
0 )    0B    8B    J        24    A4    F3          3D    BD    F11           57        D7
- _    0C    8C    K        25    A5    F4          3E    BE    F12           58        D8
= +    0D    8D    L        26    A6    F5          3F    BF    ins           E0 52     D2
Bksp   0E    8E    ; :      27    A7    F6          40    C0    del           E0 53     D3
Tab    0F    8F    ' "      28    A8    F7          41    C1    home          E0 47     C7
Q      10    90    ` ~      29    A9    F8          42    C2    end           E0 4F     CF
W      11    91    L shift  2A    AA    F9          43    C3    pgup          E0 49     C9
E      12    92    \ |      2B    AB    F10         44    C4    pgdn          E0 51     D1
R      13    93    Z        2C    AC    NUM         45    C5    left          E0 4B     CB
T      14    94    X        2D    AD    SCRL        46    C6    right         E0 4D     CD
Y      15    95    C        2E    AE    home (kpd)  47    C7    up            E0 48     C8
U      16    96    V        2F    AF    up (kpd)    48    C8    down          E0 50     D0
I      17    97    B        30    B0    pgup (kpd)  49    C9    R alt         E0 38     B8
O      18    98    N        31    B1    - (kpd)     4A    CA    R ctrl        E0 1D     9D
P      19    99    M        32    B2    left (kpd)  4B    CB    Pause         E1 1D 45  E1 9D C5

Keys marked (kpd) are found on the numeric keypad. Note that certain keys transmit two or more scan codes to the system. The keys that transmit more than one scan code were new keys added to the keyboard when IBM designed the 101 key enhanced keyboard.
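In the scan code table, each up code equals the corresponding down code with bit seven set (for example, 1E and 9E for the A key). The following Python sketch uses that observation to track which keys are currently held. It is a host-side model only: the names `decode_scan_code` and `track_keys` are hypothetical, and simply skipping the E0 and E1 prefix bytes is a simplifying assumption, so this model cannot tell the keys that send prefixes apart from their unprefixed twins.

```python
PREFIXES = (0xE0, 0xE1)   # prefix bytes sent by the keys added for the 101-key layout


def decode_scan_code(code):
    """Split a one-byte scan code into (down_code, is_release)."""
    return code & 0x7F, bool(code & 0x80)


def track_keys(scan_codes):
    """Return the set of down codes for keys still held after the stream."""
    held = set()
    for code in scan_codes:
        if code in PREFIXES:
            continue                  # simplification: ignore prefix bytes
        key, released = decode_scan_code(code)
        if released:
            held.discard(key)         # up code: the key is no longer held
        else:
            held.add(key)             # down code: the key is now held
    return held


# Left shift down (2A), A down (1E), A up (9E): shift is still held.
print(sorted(hex(k) for k in track_keys([0x2A, 0x1E, 0x9E])))
```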

When the scan code arrives at the PC, a second microcontroller chip receives the scan code, does a conversion on the scan code, makes the scan code available at I/O port 60h, and then interrupts the processor and leaves it up to the keyboard ISR to fetch the scan code from the I/O port.

The keyboard (int 9) interrupt service routine reads the scan code from the keyboard input port and processes the scan code as appropriate. Note that the scan code the system receives from the keyboard microcontroller is a single value, even though some keys on the keyboard represent up to four different values. For example, the "A" key on the keyboard can produce A, a, ctrl-A, or alt-A. The actual code the system yields depends upon the current state of the modifier keys (shift, ctrl, alt, capslock, and numlock). For example, if an A key scan code comes along (1Eh) and the shift key is down, the system produces the ASCII code for an uppercase A. If the user is pressing multiple modifier keys the system prioritizes them from low to high as follows:

    numlock/capslock (same precedence, lowest priority)
    shift
    ctrl
    alt (highest priority)

Numlock and capslock affect different sets of keys, so there is no ambiguity resulting from their equal precedence in the above chart. If the user is pressing two modifier keys at the same time, the system only recognizes the modifier key with the highest priority above. For example, if the user is pressing the ctrl and alt keys at the same time, the system only recognizes the alt key. The numlock, capslock, and shift keys are a special case. If numlock or capslock is active, pressing the shift key makes it inactive. Likewise, if numlock or capslock is inactive, pressing the shift key effectively "activates" these modifiers.
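The priority rule can be sketched as a simple cascade of tests. The Python below is an illustration only, not the BIOS code: the function names are hypothetical, and ranking shift below ctrl is an assumption (the text above states explicitly only that alt wins over ctrl). The second function models the shift/capslock interaction for letter keys as an exclusive-or.

```python
def effective_modifier(alt=False, ctrl=False, shift=False):
    """Pick the single modifier the system honors, highest priority first.

    Assumed ordering: alt > ctrl > shift.
    """
    if alt:
        return "alt"
    if ctrl:
        return "ctrl"
    if shift:
        return "shift"
    return "none"


def letter_is_uppercase(shift_down, capslock_on):
    """Shift inverts the capslock state for letter keys."""
    return shift_down != capslock_on


print(effective_modifier(alt=True, ctrl=True))   # alt outranks ctrl
print(letter_is_uppercase(True, True))           # shift cancels capslock
```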

Not all modifiers are legal for every key. For example, ctrl-8 is not a legal combination. The keyboard interrupt service routine ignores all keypresses combined with illegal modifier keys. For some unknown reason, IBM decided to make certain key combinations legal and others illegal. For example, ctrl-left and ctrl-right are legal, but ctrl-up and ctrl-down are not. You'll see how to fix this problem a little later.

The shift, ctrl, and alt keys are active modifiers. That is, modification to a keypress occurs only while the user holds down one of these modifier keys. The keyboard ISR keeps track of whether these keys are down or up by setting an associated bit upon receiving the down code and clearing that bit upon receiving the up code for shift, ctrl, or alt. In contrast, the numlock, scroll lock, and capslock keys are toggle modifiers. The keyboard ISR inverts an associated bit every time it sees a down code followed by an up code for these keys.
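The difference between active and toggle modifiers can be modeled in a few lines. This Python sketch is not the BIOS ISR: `update_flags` and `run` are hypothetical names, the bit positions follow the KbdFlags1 layout given later in this section, and the scan codes come from the scan code table. Following the description above, a toggle bit is inverted when an up code arrives for a key whose down code was seen earlier.

```python
# down code -> bit mask in the modifier flags byte
ACTIVE = {0x2A: 0x02,   # left shift
          0x36: 0x01,   # right shift
          0x1D: 0x04,   # ctrl
          0x38: 0x08}   # alt
TOGGLE = {0x3A: 0x40,   # capslock
          0x45: 0x20,   # numlock
          0x46: 0x10}   # scroll lock


def update_flags(flags, scan_code, pending):
    """Return the new flags byte after one scan code.

    pending is a set of toggle keys whose down code has been seen but
    whose up code has not yet arrived; it is updated in place.
    """
    key = scan_code & 0x7F
    released = bool(scan_code & 0x80)
    if key in ACTIVE:
        if released:
            flags &= ~ACTIVE[key] & 0xFF   # clear the bit on the up code
        else:
            flags |= ACTIVE[key]           # set the bit on the down code
    elif key in TOGGLE:
        if not released:
            pending.add(key)
        elif key in pending:
            pending.discard(key)
            flags ^= TOGGLE[key]           # invert after down then up
    return flags


def run(scan_codes, flags=0):
    """Feed a list of scan codes through update_flags."""
    pending = set()
    for code in scan_codes:
        flags = update_flags(flags, code, pending)
    return flags


print(hex(run([0x2A])))          # left shift held
print(hex(run([0x3A, 0xBA])))    # capslock pressed and released once
```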

Most of the keys on the PC's keyboard correspond to ASCII characters. When the keyboard ISR encounters such a character, it translates it to a 16 bit value whose L.O. byte is the ASCII code and the H.O. byte is the key's scan code. For example, pressing the "A" key with no modifier, with shift, and with control produces 1E61h, 1E41h, and 1E01h, respectively ("a", "A", and ctrl-A). Many key sequences do not have corresponding ASCII codes. For example, the function keys, the cursor control keys, and the alt key sequences do not have corresponding ASCII codes. For these special extended codes, the keyboard ISR stores a zero in the L.O. byte (where the ASCII code typically goes) and the extended code goes in the H.O. byte. The extended code is usually, though certainly not always, the scan code for that key.

The only problem with this extended code approach is that the value zero is a legal ASCII character (the NUL character). Therefore, you cannot directly enter NUL characters into an application. If an application must input NUL characters, IBM has set aside the extended code 0300h (ctrl-2, that is, ctrl-@) for this purpose. Your application must explicitly convert this extended code to the NUL character (actually, it need only recognize the H.O. value 03, since the L.O. byte already is the NUL character). Fortunately, very few programs need to allow the input of the NUL character from the keyboard, so this problem is rarely an issue.
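The 16 bit key code layout described above, and the special case for NUL, can be sketched as follows. The helper names are hypothetical and the code is a Python model of the encoding, not something the BIOS provides.

```python
NUL_KEY_CODE = 0x0300   # extended code reserved for entering NUL


def make_key_code(scan_code, ascii_code):
    """Pack a scan (or extended) code into the H.O. byte, ASCII into the L.O. byte."""
    return ((scan_code & 0xFF) << 8) | (ascii_code & 0xFF)


def split_key_code(key_code):
    """Return (H.O. byte, L.O. byte) of a 16 bit key code."""
    return (key_code >> 8) & 0xFF, key_code & 0xFF


def is_extended(key_code):
    """A zero L.O. byte marks an extended code."""
    return (key_code & 0xFF) == 0


def to_character(key_code):
    """Return the character for a key code, or None for other extended codes."""
    if key_code == NUL_KEY_CODE:
        return "\x00"                 # the application converts 0300h to NUL
    if is_extended(key_code):
        return None
    return chr(key_code & 0xFF)


print(hex(make_key_code(0x1E, ord("a"))))   # the "A" key with no modifier
print(is_extended(0x3B00))                  # F1 has no ASCII code
```

An application that never needs NUL input can drop the `NUL_KEY_CODE` test and treat every code with a zero L.O. byte as an extended code.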

The following table lists the scan and extended key codes the keyboard ISR generates for applications in response to a keypress with various modifiers. Extended codes appear as complete four-digit values ending in 00 (for example, 3B00). All other values (except the scan code column) represent the L.O. eight bits of the 16 bit code; for these, the H.O. byte comes from the scan code column. A dash means the combination produces no code.

Keyboard Codes (in hex)
Key  Scan Code  ASCII  Shift  Ctrl  Alt  Num  Caps  Shift+Caps  Shift+Num
Esc 01 1B 1B 1B - 1B 1B 1B 1B
1 ! 02 31 21 - 7800 31 31 31 31
2 @ 03 32 40 0300 7900 32 32 32 32
3 # 04 33 23 - 7A00 33 33 33 33
4 $ 05 34 24 - 7B00 34 34 34 34
5 % 06 35 25 - 7C00 35 35 35 35
6 ^ 07 36 5E 1E 7D00 36 36 36 36
7 & 08 37 26 - 7E00 37 37 37 37
8 * 09 38 2A - 7F00 38 38 38 38
9 ( 0A 39 28 - 8000 39 39 39 39
0 ) 0B 30 29 - 8100 30 30 30 30
- _ 0C 2D 5F 1F 8200 2D 2D 5F 5F
= + 0D 3D 2B - 8300 3D 3D 2B 2B
Bksp 0E 08 08 7F - 08 08 08 08
Tab 0F 09 0F00 - - 09 09 0F00 0F00
Q 10 71 51 11 1000 71 51 71 51
W 11 77 57 17 1100 77 57 77 57
E 12 65 45 05 1200 65 45 65 45
R 13 72 52 12 1300 72 52 72 52
T 14 74 54 14 1400 74 54 74 54
Y 15 79 59 19 1500 79 59 79 59
U 16 75 55 15 1600 75 55 75 55
I 17 69 49 09 1700 69 49 69 49
O 18 6F 4F 0F 1800 6F 4F 6F 4F
P 19 70 50 10 1900 70 50 70 50
[ { 1A 5B 7B 1B - 5B 5B 7B 7B
] } 1B 5D 7D 1D - 5D 5D 7D 7D
enter 1C 0D 0D 0A - 0D 0D 0A 0A
ctrl 1D - - - - - - - -
A 1E 61 41 01 1E00 61 41 61 41
S 1F 73 53 13 1F00 73 53 73 53
D 20 64 44 04 2000 64 44 64 44
F 21 66 46 06 2100 66 46 66 46
G 22 67 47 07 2200 67 47 67 47
H 23 68 48 08 2300 68 48 68 48
J 24 6A 4A 0A 2400 6A 4A 6A 4A
K 25 6B 4B 0B 2500 6B 4B 6B 4B
L 26 6C 4C 0C 2600 6C 4C 6C 4C
; : 27 3B 3A - - 3B 3B 3A 3A
' " 28 27 22 - - 27 27 22 22
` ~ 29 60 7E - - 60 60 7E 7E
Lshift 2A - - - - - - - -
\ | 2B 5C 7C 1C - 5C 5C 7C 7C
Z 2C 7A 5A 1A 2C00 7A 5A 7A 5A
X 2D 78 58 18 2D00 78 58 78 58
C 2E 63 43 03 2E00 63 43 63 43
V 2F 76 56 16 2F00 76 56 76 56
B 30 62 42 02 3000 62 42 62 42
N 31 6E 4E 0E 3100 6E 4E 6E 4E
M 32 6D 4D 0D 3200 6D 4D 6D 4D
, < 33 2C 3C - - 2C 2C 3C 3C
. > 34 2E 3E - - 2E 2E 3E 3E
/ ? 35 2F 3F - - 2F 2F 3F 3F
Rshift 36 - - - - - - - -
* PrtSc 37 2A INT 5 10 - 2A 2A INT 5 INT 5
alt 38 - - - - - - - -
space 39 20 20 20 - 20 20 20 20
caps 3A - - - - - - - -
F1 3B 3B00 5400 5E00 6800 3B00 3B00 5400 5400
F2 3C 3C00 5500 5F00 6900 3C00 3C00 5500 5500
F3 3D 3D00 5600 6000 6A00 3D00 3D00 5600 5600
F4 3E 3E00 5700 6100 6B00 3E00 3E00 5700 5700
F5 3F 3F00 5800 6200 6C00 3F00 3F00 5800 5800
F6 40 4000 5900 6300 6D00 4000 4000 5900 5900
F7 41 4100 5A00 6400 6E00 4100 4100 5A00 5A00
F8 42 4200 5B00 6500 6F00 4200 4200 5B00 5B00
F9 43 4300 5C00 6600 7000 4300 4300 5C00 5C00
F10 44 4400 5D00 6700 7100 4400 4400 5D00 5D00
num 45 - - - - - - - -
scrl 46 - - - - - - - -
home 47 4700 37 7700 - 37 4700 37 4700
up 48 4800 38 - - 38 4800 38 4800
pgup 49 4900 39 8400 - 39 4900 39 4900
- (kpd) 4A 2D 2D - - 2D 2D 2D 2D
left 4B 4B00 34 7300 - 34 4B00 34 4B00
center 4C 4C00 35 - - 35 4C00 35 4C00
right 4D 4D00 36 7400 - 36 4D00 36 4D00
+ (kpd) 4E 2B 2B - - 2B 2B 2B 2B
end 4F 4F00 31 7500 - 31 4F00 31 4F00
down 50 5000 32 - - 32 5000 32 5000
pgdn 51 5100 33 7600 - 33 5100 33 5100
ins 52 5200 30 - - 30 5200 30 5200
del 53 5300 2E - - 2E 5300 2E 5300


The 101-key keyboards generally provide an enter key and a "/" key on the numeric keypad. Unless you write your own int 9 keyboard ISR, you will not be able to differentiate these keys from the ones on the main keyboard. The separate cursor control pad also generates the same extended codes as the numeric keypad, except it never generates numeric ASCII codes. Otherwise, you cannot differentiate these keys from the equivalent keys on the numeric keypad (assuming numlock is off, of course).

The keyboard ISR provides a special facility that lets you enter the ASCII code for a keystroke directly from the keyboard. To do this, hold down the alt key and type the decimal ASCII code (0..255) for a character on the numeric keypad. The keyboard ISR will convert these keystrokes to an eight-bit value, attach an H.O. byte of zero to the character, and use that as the character code.
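One plausible way to perform this conversion is to accumulate the digits as a decimal number, keeping only the low eight bits. The Python sketch below assumes that multiply-by-ten scheme; the function name `alt_keypad_code` is hypothetical and the wrap-around behavior for values above 255 is an assumption, not something stated in the text.

```python
def alt_keypad_code(digits):
    """Build the 16 bit key code for an alt-keypad sequence.

    digits is the sequence of keypad digits (0..9) typed while alt is
    held. The result has an H.O. byte of zero and the accumulated
    eight-bit value in the L.O. byte.
    """
    value = 0
    for digit in digits:
        value = (value * 10 + digit) & 0xFF   # assumed: keep eight bits
    return value


print(hex(alt_keypad_code([6, 5])))   # alt-65 yields the code for "A"
```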

The keyboard ISR inserts the 16 bit value into the PC's type ahead buffer. The system type ahead buffer is a circular queue that uses the following variables:

40:1A - HeadPtr word ?
40:1C - TailPtr word ?
40:1E - Buffer  word 16 dup (?)

The keyboard ISR inserts data at the location pointed at by TailPtr. The BIOS keyboard function removes characters from the location pointed at by the HeadPtr variable. These two pointers almost always contain an offset into the Buffer array. If these two pointers are equal, the type ahead buffer is empty. If the value in HeadPtr is two greater than the value in TailPtr (or HeadPtr is 1Eh and TailPtr is 3Ch), then the buffer is full and the keyboard ISR will reject any additional keystrokes.

Note that the TailPtr variable always points at the next available location in the type ahead buffer. Since there is no "count" variable providing the number of entries in the buffer, we must always leave one entry free in the buffer area; this means the type ahead buffer can only hold 15 keystrokes, not 16.
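The head and tail arithmetic can be modeled as shown below. This is a Python sketch of the queue discipline only, with hypothetical names; it stores the words in a dictionary keyed by offset rather than touching memory at segment 40h, and it assumes the buffer occupies offsets 1Eh through 3Dh.

```python
BUF_START = 0x1E   # offset of the first word of Buffer
BUF_END = 0x3E     # offset just past the last word of Buffer


class TypeAheadBuffer:
    """Model of the 16-word circular type ahead queue."""

    def __init__(self):
        self.head = BUF_START   # next word the BIOS will remove
        self.tail = BUF_START   # next free word for the keyboard ISR
        self.words = {}         # offset -> 16 bit key code

    @staticmethod
    def _advance(ptr):
        """Step a pointer by one word, wrapping at the end of the buffer."""
        ptr += 2
        return BUF_START if ptr == BUF_END else ptr

    def is_empty(self):
        return self.head == self.tail

    def is_full(self):
        return self._advance(self.tail) == self.head

    def insert(self, key_code):
        """What the ISR does: store at tail, or reject if the buffer is full."""
        if self.is_full():
            return False
        self.words[self.tail] = key_code & 0xFFFF
        self.tail = self._advance(self.tail)
        return True

    def remove(self):
        """What the BIOS read function does: fetch from head, or None if empty."""
        if self.is_empty():
            return None
        key_code = self.words[self.head]
        self.head = self._advance(self.head)
        return key_code


buf = TypeAheadBuffer()
accepted = sum(buf.insert(0x1E61) for _ in range(16))
print(accepted, hex(buf.head), hex(buf.tail))   # only 15 of 16 keystrokes fit
```

Because "full" is detected by comparing the advanced tail against the head, one slot always stays unused, which is why the buffer holds 15 keystrokes.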

In addition to the type ahead buffer, the BIOS maintains several other keyboard-related variables in segment 40h. The following table lists these variables and their contents:

Keyboard Related BIOS Variables
Name                          Address  Size  Description
KbdFlags1 (modifier flags)    40:17    Byte  This byte maintains the current status of the modifier keys on the keyboard. The bits have the following meanings:
    bit 7: Insert mode toggle
    bit 6: Capslock toggle (1=capslock on)
    bit 5: Numlock toggle (1=numlock on)
    bit 4: Scroll lock toggle (1=scroll lock on)
    bit 3: Alt key (1=alt is down)
    bit 2: Ctrl key (1=ctrl is down)
    bit 1: Left shift key (1=left shift is down)
    bit 0: Right shift key (1=right shift is down)
KbdFlags2 (Toggle keys down)  40:18    Byte  Specifies if a toggle key is currently down.
    bit 7: Insert key (currently down if 1)
    bit 6: Capslock key (currently down if 1)
    bit 5: Numlock key (currently down if 1)
    bit 4: Scroll lock key (currently down if 1)
    bit 3: Pause state locked (ctrl-Numlock) if one
    bit 2: SysReq key (currently down if 1)
    bit 1: Left alt key (currently down if 1)
    bit 0: Left ctrl key (currently down if 1)
AltKpd                        40:19    Byte  BIOS uses this to compute the ASCII code for an alt-Keypad sequence.
BufStart                      40:80    Word  Offset of start of keyboard buffer (1Eh). Note: this variable is not supported on many systems, be careful if you use it.
BufEnd                        40:82    Word  Offset of end of keyboard buffer (3Eh). See the note above.
KbdFlags3                     40:96    Byte  Miscellaneous keyboard flags.
    bit 7: Read of keyboard ID in progress
    bit 6: Last char is first kbd ID character
    bit 5: Force numlock on reset
    bit 4: 1 if 101-key kbd, 0 if 83/84 key kbd.
    bit 3: Right alt key pressed if 1
    bit 2: Right ctrl key pressed if 1
    bit 1: Last scan code was E0h
    bit 0: Last scan code was E1h
KbdFlags4                     40:97    Byte  More miscellaneous keyboard flags.
    bit 7: Keyboard transmit error
    bit 6: Mode indicator update
    bit 5: Resend receive flag
    bit 4: Acknowledge received
    bit 3: Must always be zero
    bit 2: Capslock LED (1=on)
    bit 1: Numlock LED (1=on)
    bit 0: Scroll lock LED (1=on)


One comment is in order about KbdFlags1 and KbdFlags4. Bits zero through two of the KbdFlags4 variable are the BIOS' current settings for the LEDs on the keyboard. Periodically, the BIOS compares the values for capslock, numlock, and scroll lock in KbdFlags1 against these three bits in KbdFlags4. If they do not agree, the BIOS will send an appropriate command to the keyboard to update the LEDs and it will change the values in the KbdFlags4 variable so the system is consistent. Therefore, if you mask in new values for numlock, scroll lock, or capslock in KbdFlags1, the BIOS will automatically adjust KbdFlags4 and set the LEDs accordingly.
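The comparison the BIOS makes can be sketched as follows. This is a Python model of the bit manipulation only, using hypothetical function names; it does not send any command to a keyboard. It relies on the bit layouts listed in the table above, where the three toggle bits in KbdFlags1 (bits 4 through 6) appear in the same order as the LED bits in KbdFlags4 (bits 0 through 2).

```python
def led_bits_from_flags1(flags1):
    """Extract scroll lock, numlock, and capslock toggles as LED bits 0..2."""
    return (flags1 >> 4) & 0x07


def sync_leds(flags1, flags4):
    """Return (new_flags4, update_needed).

    update_needed is True when the LED bits in flags4 disagreed with the
    toggle bits in flags1, the case in which the BIOS would send an
    LED update command to the keyboard.
    """
    wanted = led_bits_from_flags1(flags1)
    if (flags4 & 0x07) == wanted:
        return flags4, False
    return (flags4 & 0xF8) | wanted, True


# Capslock toggle set in KbdFlags1 (bit 6) while the LED bits are all off.
print(sync_leds(0x40, 0x10))
```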



Art of Assembly: Chapter Twenty - 29 SEP 1996
