Hi there. In this article, I will continue with the VGA topic, and I have a fantastic example from 1994 that I want to show you. I wanted to write about this code in detail, so I didn't want to cram it into the previous post.
I uploaded the code I mentioned to my github account. I must have downloaded it in the late 90s; as the comment header indicates, it's BASIC code written in 1994 (Dear William Yu, if you're reading this, feel free to reach out to me). There are two points I want to highlight in this code. The first one is the code snippet on line 12, which writes to the ninth CRT controller register (index 09h, the Maximum Scan Line Register). Port 3D5h is the CRTC data port; the register index is selected through port 3D4h:
OUT &H3D5, 1
This register consists of the following bits [1]:
| 7 | 6 | 5 | 4-0 |
| SD | LC9 | SVB9 | Maximum Scan Line |
The "Maximum Scan Line" field, which the code modifies, repeats each pixel row on the vertical axis in graphics modes; the number of repetitions is one more than the value in this field. If its value is 1, each pixel appears twice as tall, as if the pixel just below it were also set. If the value were 9, the pixels would be 10 times taller. Since the number of scan lines on the screen is constant (480 in our example, as line 11 sets mode 12h with 640x480 pixels), doubling the pixel height halves the visible vertical resolution: only 640x240 pixels are visible. Had we stretched the pixels by 10, we would have 640x48 visible pixels. The video memory size obviously does not change; the pixel rows stored below the first half of the frame just aren't displayed. They become visible again only when 0 is written to this field. In standard text mode, this field contains the value 15, which is one less than the height of a standard character in scan lines (16), as I mentioned in the previous article. If greater values are written to this field, the text rows get spaced out. With much smaller values, the characters get jumbled together and the screen becomes unreadable.
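The arithmetic in this paragraph can be sketched in a few lines of portable C. This is only an illustration of the "value plus one" rule; the function names `visible_rows` and `set_max_scan_line` are made up for this example and do not come from the BASIC program.

```c
/* Number of distinct pixel rows that fit on the screen when every row
   is repeated (msl + 1) times.
   total_scanlines: scan lines shown by the mode (480 in mode 12h).
   msl: value of the Maximum Scan Line field (bits 4-0, so 0..31). */
unsigned int visible_rows(unsigned int total_scanlines, unsigned int msl)
{
    msl &= 0x1Fu;                      /* the field is only 5 bits wide */
    return total_scanlines / (msl + 1u);
}

/* Replace only the Maximum Scan Line field of a register value and
   keep the upper three bits (SD, LC9, SVB9) as they were.
   Hypothetical helper: the BASIC code simply writes 1 to the register. */
unsigned char set_max_scan_line(unsigned char old_value, unsigned int msl)
{
    return (unsigned char)((old_value & 0xE0u) | (msl & 0x1Fu));
}
```

For example, `visible_rows(480, 1)` gives 240 and `visible_rows(480, 9)` gives 48, matching the 640x240 and 640x48 figures above.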
As the calculations in this code are based on 640x240 pixels (e.g. lines 30 and 35), this line cannot simply be commented out. Because the pixels are doubled in height, the text on the screen is also twice as tall (see right). The loop between lines 13 and 20 prints stars on the screen, and the planet is drawn between lines 22 and 27. The section up to line 35 draws a triangle (the spacecraft), and the section up to line 43 moves this triangle across the screen as an image block with the GET and PUT commands. The remaining graphic effects are not that important. The most important part is the EarthQuake SUB procedure (lines 87 to 94). Here, a sequence of values is written to the eighth register (index 08h):
FOR X = 1 TO Delay
OUT &H3D4, 8: OUT &H3D5, X
NEXT X
That's the deal with the eighth register [1]:
| 7 | 6-5 | 4-0 |
| (unused) | Byte Panning | Preset Row Scan |
In my observation, the "Preset Row Scan" field has either no effect in graphics mode, or DosBOX cannot emulate it properly. Normally, this field shifts the origin of the text screen with pixel precision, and it works flawlessly in DosBOX. In other words, if I write 8 to this field, the upper half of the first character line will disappear and another half character line appears at the bottom of the screen. Basically, the screen is shifted up by the number of pixels written in this field. This is the exact register, which I also used for smooth scrolling. The "Byte Panning" field shifts the screen one character wide to the left. Therefore, the screen can be scrolled (but not smoothly) horizontally by 1, 2 or 3 characters wide, depending on the value of this field. For mode 12h, this is 8 pixels per character (640 pixels / 80 characters).
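To make the two fields concrete, here is a small sketch that packs and unpacks a value for this register. It assumes the layout given in [1], with Byte Panning in bits 6-5 and Preset Row Scan in bits 4-0; the helper names are hypothetical and the code does not touch any hardware.

```c
/* Build a value for CRTC register 08h from its two fields.
   byte_panning: 0..3 characters, row_scan: 0..31 scan lines. */
unsigned char make_preset_row_scan(unsigned int byte_panning,
                                   unsigned int row_scan)
{
    return (unsigned char)(((byte_panning & 0x03u) << 5) |
                           (row_scan & 0x1Fu));
}

/* Extract the Byte Panning field (bits 6-5). */
unsigned int get_byte_panning(unsigned char reg)
{
    return (reg >> 5) & 0x03u;
}

/* Extract the Preset Row Scan field (bits 4-0). */
unsigned int get_row_scan(unsigned char reg)
{
    return reg & 0x1Fu;
}
```

Shifting the text screen up by half a character, as in the example above, corresponds to `make_preset_row_scan(0, 8)`, i.e. the value 08h.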
Other than this register, there is another register pair, I'd like to mention. These are Start Address Register Low and High.
| 7-0 |
| Start Address Low |

| 7-0 |
| Start Address High |
These don't have any bit fields. Normally, the top left corner of the screen is the origin, i.e. the (0, 0) point of the screen, and its memory location is offset 0x0 in VGA regardless of text or graphics mode. But sometimes, it may be more convenient to take the center of the screen as the origin. In Mode 13h (320 x 200), taking the center of the screen as the origin also means drawing the first image pixel at the point (160, 100). The linear address of this pixel is 320 * 100 + 160 = 32160 = 7DA0h.
OUT &H3D4, &HC: OUT &H3D5, &H7D
With this code snippet, the origin is moved to the center of the screen (the snippet sets only the high byte, 7Dh, through register 0Ch; the low byte, A0h, goes to register 0Dh in the same way). The value written to offset zero of the video memory will appear in the center of the screen from this point on. This is roughly what the WINDOW command in QBasic does. Of course, QBasic will also convert the negative coordinates given to PSET, LINE etc. by itself, whereas in lower level programming languages this task is left to the programmer.
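The address calculation above can be double-checked with a short sketch. It only reproduces the arithmetic and the split into the two bytes held by the Start Address High (index 0Ch) and Low (index 0Dh) registers; the function names are invented for this example, and the unit in which a given video mode interprets the start address is not modeled here.

```c
/* Linear offset of pixel (x, y) in a 320-pixel-wide frame (Mode 13h). */
unsigned int mode13h_offset(unsigned int x, unsigned int y)
{
    return y * 320u + x;
}

/* Upper 8 bits of a 16-bit value: the byte for Start Address High. */
unsigned char high_byte(unsigned int value)
{
    return (unsigned char)((value >> 8) & 0xFFu);
}

/* Lower 8 bits of a 16-bit value: the byte for Start Address Low. */
unsigned char low_byte(unsigned int value)
{
    return (unsigned char)(value & 0xFFu);
}
```

`mode13h_offset(160, 100)` yields 32160 (7DA0h), which splits into the high byte 7Dh and the low byte A0h.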
If sequentially increasing values are written to this register, it will be seen as a left scrolling effect on the screen. Obviously, it is not the characters that are actually being scrolled, but the origin. Similarly, if values are written to the register in increments equal to the width of the screen (the number of characters per row for text modes, or the number of pixels on the X-axis for graphics modes), it will be seen as an upwards scrolling effect on the screen. The characters aren't moved from one memory block to another. The processor is just busy writing some values through the ports. This is, in theory, the most efficient way to scroll the entire screen in VGA. However, to copy the characters that disappear from the screen to the part that is about to appear on the screen, moving memory blocks is inevitable.
VGA Text Mode Structure
VGA text mode is pretty simple. Here, I'll explain the standard 80x25 text mode. Even though the logic of the 40x25 text mode is quite similar, some addresses need to be recalculated for that mode. Monochrome mode video memory starts at segment 0xB000 and ends at 0xB7FF. Color video memory starts at segment 0xB800 and ends at 0xBFFF, spanning 32 KB each. Each character is one word, i.e. 16 bits. The low byte of this word contains the ASCII code of the character, and the high byte holds the color codes for the character foreground and background [3]. The four lower bits of the color byte are the color of the character. There are 8 standard VGA colors; 0: black, 1: blue, 2: green, 3: cyan, 4: red, 5: magenta, 6: brown/yellow, 7: gray. Adding 8 to these values yields the high intensity versions of these colors. Bits 4, 5 and 6 hold the background color, and the most significant bit (bit 7) makes the character blink. Here is an example for this. The characters 'u' and 'g' normally blink.
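The bit layout of a character cell described above can be written down as two small helpers. This is a sketch in plain C with hypothetical names (`make_attr`, `make_cell`); it only builds the 16-bit value and does not write to video memory.

```c
/* Build the color (attribute) byte:
   bit 7 = blink, bits 6-4 = background (0..7), bits 3-0 = foreground (0..15). */
unsigned char make_attr(unsigned int fg, unsigned int bg, int blink)
{
    return (unsigned char)((blink ? 0x80u : 0x00u) |
                           ((bg & 0x07u) << 4) |
                           (fg & 0x0Fu));
}

/* Build the 16-bit cell: ASCII code in the low byte, attribute in the high byte. */
unsigned short make_cell(unsigned char ch, unsigned char attr)
{
    return (unsigned short)(((unsigned int)attr << 8) | ch);
}
```

For instance, a bright yellow (6 + 8 = 14) 'A' on a blue (1) background has the attribute 1Eh and the cell value 1E41h.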
VGA text mode offers 8 pages. As I already mentioned, the visible area of the screen consists of 80 x 25 = 2000 characters, i.e. 2000 words, so the visible screen is 4000 bytes (0x0FA0) long. Since the video memory is 32 KB, it can be divided into 8 pages. The first page starts at address 0xB800:0, and the memory right after the visible screen starts at 0xB800:0x0FA0. Note that the BIOS rounds the page size up to 4096 bytes (0x1000) in this mode, so the pages selected with Int 10h/AH=05h start at 0xB800:0x1000, 0xB800:0x2000 and so on. The number of pages for each screen mode can be seen in this table, and you can switch between pages using Int 10h/AH=05h.
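The byte offsets used in this section follow from the fact that each cell is one word. A minimal sketch, assuming the 80-column mode and using a made-up function name:

```c
/* Byte offset of the character cell at (row, col) from the start of
   text video memory in an 80-column mode: two bytes per cell. */
unsigned int cell_offset(unsigned int row, unsigned int col)
{
    return (row * 80u + col) * 2u;
}
```

`cell_offset(1, 0)` is 160, the length of one text row in bytes, and `cell_offset(25, 0)` is 4000 (0x0FA0), the first byte after the 25 visible rows.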
With that much background info, I can now move on to my own smooth scrolling code.
SCROLL.C: Smooth Scrolling in VGA Text Mode
First of all, why didn't I use the start address register for scrolling? The short answer is, I could have. If I copied the zeroth (actual) page to the first and second pages and switched to the first page, I would have a copy of the visible screen above and below. Then, I could implement scrolling by increasing or decreasing the start address by 80 (one text row; in text mode the start address counts character cells, not bytes). I'll leave this for another post for now.
My code, OTOH, interferes with the other pages as little as possible (just a single line). This is the difference from the method described above. I uploaded the code to my github account as usual.
I wrote it in Turbo C v3.0. It compiles and runs without problems. As I mentioned in the previous article, everything is done in DosBOX. Since Turbo C does not have true and false built in, I define them as shown on lines seven and eight. I also define two pointers on lines sixteen and seventeen. The first one is the VGA video memory pointer for text mode [4]. The second one is a pointer to a local copy of the screen memory; even though I named it DoubleBuff, it doesn't really do double buffering [2], IMO. I'll get back to double buffering in a future article. This is a pointer to short, and 80x25 words of memory are allocated for it on line 23. Why words? There are two reasons. First, processing data word by word is faster than byte by byte: characters are moved together with their color codes in a single instruction. The second reason is code readability.
Scan codes of the up and down arrow keys are fetched, and based on the key, either ScrollUp() or ScrollDown() is assigned to the 'fp' function pointer. Each function scrolls the screen by just a single line. Calling the selected function 25 times in a for loop scrolls the full screen.
ScrollUp
While the screen scrolls up as the value in the preset row scan (PRS) field increases, the first row of the next page, which is normally invisible, becomes partially visible. Therefore, on line 60 of the code, I copy the top row of the visible screen to the top row of the next page before it becomes visible. Then I fill the DoubleBuff array: the first row of the screen goes to the bottom row of the buffer, the second row of the screen goes to the first row of the buffer, and each subsequent row goes one row above its screen position. In other words, line N+1 on the screen goes to line N in the buffer.
Next, I save the initial value of the PRS field and increment it one by one from 0 to 15 (line 75). This makes the screen appear to scroll up pixel by pixel. At the same time, the row I copied from the top on line 60, which was initially invisible, begins to appear at the bottom of the screen. On line 80, I copy DoubleBuff to the visible screen using inline assembly. I used assembly here for speed, because the C code I wrote for the same job runs slower and causes flickering on the screen.
Finally, I write the old value of the PRS back. Here, I would actually need to write zero rather than its old value.
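The way the buffer is filled for ScrollUp() can be tried out on ordinary arrays. The following is a sketch in portable C, not the code from SCROLL.C: the function name `build_scrolled_up` is hypothetical, plain arrays stand in for video memory and DoubleBuff, and the PRS writes are left out.

```c
#include <string.h>

#define COLS 80
#define ROWS 25

/* Fill 'buffer' with the screen contents moved up by one row:
   screen row N+1 goes to buffer row N, and the top screen row
   wraps around to the bottom row of the buffer. */
void build_scrolled_up(const unsigned short *screen, unsigned short *buffer)
{
    /* Rows 1..24 of the screen become rows 0..23 of the buffer. */
    memcpy(buffer, screen + COLS,
           (ROWS - 1) * COLS * sizeof(unsigned short));

    /* Row 0 of the screen becomes row 24 of the buffer. */
    memcpy(buffer + (ROWS - 1) * COLS, screen,
           COLS * sizeof(unsigned short));
}
```

Since the source and destination are separate arrays here, the copy direction does not matter, unlike the in-place move in ScrollDown().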
ScrollDown
In ScrollDown(), I used a different approach than in ScrollUp(). Since the PRS only scrolls the screen upward, for the downward scrolling effect I first copy the bottom row into a local array (line 108), as this row is going to disappear from the screen during the scroll. After moving the characters in the video memory, I write the row I saved to the top row and finally write the highest possible value, which is 15, to the PRS field of the register (line 113).
I moved the video memory using assembly again for speed, as in ScrollUp(). SI holds 4000 and DI holds 4160. This means SI points to the first character of the first page (the invisible bottom page), and DI points to the first character on the second row of the first page. With CX as the counter at 2000 (the 4000 in CX is divided by two with shr on line 136), the last character would not get copied, because copying starts with the first character of the first page itself. Decreasing SI and DI by two (one word) each would take 4 bytes of code (the dec instructions on lines 137 .. 140). Instead, I increase CX and copy one extra character, which takes a single one-byte inc instruction on line 141 rather than four. I could have decreased the value on line 132, too, but in that case (if I'm not mistaken) I would have had to increase CX later, because I decremented the counter. Finally, I set the direction flag and copy backwards, so that the pointer values decrease. Since I'm scrolling downwards, copying in the forward direction would overwrite data that I still need for the next row. After the copy (rep movsw, line 148) is done, I reset the direction flag to its original state.
Immediately after this, since the top row is now free after the copy, I write the row I saved in TempLine at the beginning of the function to it (line 156. Why didn't I do this with assembly?). Since I've already written the largest value to the PRS field, only one pixel row of this text row is visible yet (and it is largely black as well), so it doesn't look too bad. On line 160, I complete the scrolling effect by slowly decreasing the value in the PRS field.
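The reason for copying backwards can be seen in a plain C version of the same move. This is a sketch under assumptions, not the assembly from SCROLL.C: `scroll_down_prepare` is a made-up name, an ordinary array of 26 rows stands in for video memory, and a simple loop replaces `std` / `rep movsw`.

```c
#include <string.h>

#define COLS 80
#define ROWS 25

/* 'mem' must hold ROWS + 1 rows (the visible screen plus one row below it).
   Moves rows 0..24 down to rows 1..25 in place, then puts the old bottom
   row of the screen into row 0. */
void scroll_down_prepare(unsigned short *mem)
{
    unsigned short temp_line[COLS];
    int i;

    /* Save the bottom visible row before it is overwritten. */
    memcpy(temp_line, mem + (ROWS - 1) * COLS, sizeof temp_line);

    /* Source and destination overlap and the destination is higher,
       so copy from the end towards the start: every word is read
       before the slot it lives in gets overwritten. */
    for (i = ROWS * COLS - 1; i >= 0; i--)
        mem[i + COLS] = mem[i];

    /* The top row is free now; restore the saved row there. */
    memcpy(mem, temp_line, sizeof temp_line);
}
```

A forward loop over the same range would copy row 0 into row 1 first and then keep propagating that one row all the way down, which is exactly the overwrite problem described above.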
waitlinefull and waitlinehalf
CRT monitors create images by scanning the screen. Electron guns normally send electrons to the center of the screen. To display an image on the screen, this electron beam is deflected by vertical and horizontal deflection coils (or yokes, to be more specific) in color monitors. Scanning starts from the upper left corner of the screen and moves to the upper right first. During that, the voltage in the vertical coil stays constant, and a sawtooth wave is applied to the horizontal coil. This draws the first row of pixels on the screen. There is actually no direct correspondence between a pixel and a point on the phosphor coating of the screen, but let's still treat it as a pixel for simplicity. After the first line is done, the voltage on the vertical coil is increased, and this process is repeated for each row until scanning reaches the bottom right corner of the screen. So basically both coils get a sawtooth wave, each with a different frequency. Of course, during this scan, the electron guns don't send electrons continuously; they turn on and off, depending on the picture. If they remained continuously on, a blank screen would appear.
Figure: Deflection Yokes (Source: Wikipedia)
When scanning reaches the bottom right corner, the beam is switched off and returned to the top left corner; this interval is called "vertical retrace" (VR) in the terminology. If the video memory is changed without regard to this cycle, the image may appear to flicker. To check the scanning status, the VGA Input Status #1 register at port 0x3DA comes to the rescue. Bit 3 of this register is set during VR. The programmer can (and should) check this bit before writing anything to the screen. This check is made in the waitlinefull() and waitlinehalf() functions.
In waitlinefull(), if there is no VR, execution waits in the loop on line 173, because even if VR isn't active at that moment, it can start at any time before the video memory operations are complete, and the image may still flicker. If VR is already in progress, this line has no effect; execution continues with the next line and waits in that loop until VR is complete.
In waitlinehalf(), execution waits in the loop only if VR is already in progress; it doesn't wait for the next VR cycle. In waitlinefull(), waiting until the next VR cycle starts wastes too many CPU cycles on fast computers, so this step is skipped in waitlinehalf(). It waits for a shorter period, but it may sometimes fail to prevent flickering.
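The two waiting strategies can be sketched as follows. This is my reading of the description above, not a copy of SCROLL.C: the names `wait_full`, `wait_half` and `status_reader` are hypothetical, and the status byte is obtained through a function pointer so that the loops can be tried without hardware. On the real target, the reader would presumably wrap a port read of 0x3DA (for example with Turbo C's inportb()).

```c
#define VR_BIT 0x08u   /* bit 3 of the status byte read from port 0x3DA */

/* Anything that returns the current status byte. */
typedef unsigned char (*status_reader)(void);

/* Wait for a retrace to begin, then wait for it to end. */
void wait_full(status_reader read_status)
{
    while (!(read_status() & VR_BIT)) { }  /* no VR yet: keep polling   */
    while (read_status() & VR_BIT) { }     /* VR active: wait until over */
}

/* Wait only if a retrace is in progress right now. */
void wait_half(status_reader read_status)
{
    while (read_status() & VR_BIT) { }
}

/* Fake status source for experiments: two polls without VR,
   two polls with VR, then no VR again. */
static const unsigned char fake_trace[] = { 0x00, 0x00, 0x08, 0x08, 0x00, 0x00 };
static unsigned int fake_pos = 0;

unsigned char fake_status(void)
{
    unsigned char value = fake_trace[fake_pos];
    if (fake_pos + 1 < sizeof fake_trace / sizeof fake_trace[0])
        fake_pos++;
    return value;
}
```

With the fake source, `wait_full(fake_status)` polls through the whole retrace pulse before returning, while `wait_half(fake_status)` returns after a single poll when no retrace is active.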
I put both functions into my code, and after completing the skeleton of the effect, I tried them out on various lines until I achieved a smooth effect. Since waitlinehalf() takes quite a short time, I used waitlinefull() everywhere. As I mentioned in the previous post, DosBOX is just a VGA emulation, so these procedures would need to be readjusted for real hardware. I actually chose waitlinefull() in the code just because it provides a stable wait time that doesn't vary much from machine to machine.
So, I have only scratched the surface of the VGA topic here. The smooth scrolling effect looks quite good, but this is just a small example of what can be done with VGA. By playing around with these registers, it's possible to create a wide variety of effects. In future posts, if I find time, I would like to explain a few more effects. And finally, here is a video of the effect:
[1]: http://www.osdever.net/FreeVGA/vga/crtcreg.htm#09
[2]: http://wiki.osdev.org/Double_Buffering
[3]: https://en.wikipedia.org/wiki/VGA_text_mode#Text_buffer
[4]: https://stackoverflow.com/questions/47588486/cannot-write-to-screen-memory-in-c



