Thursday, August 21, 2025

Programming VGA: Smooth Scrolling in Text Mode #1


Hi there. The title of this article probably sounds like a chapter title from an 80s computer book set in a monospace font, because this time I'll indeed be discussing VGA, a technology from the 80s, with a fairly low-level approach.

80s Tech (illustrative image) [1]

More precisely, in this article I explain how to create a smooth scrolling effect in VGA text mode, and how to access video memory and VGA registers directly without using interrupts. In this first part, I give a brief introduction to VGA and explain some basic registers. I also added some nostalgia in between. As a result, this post turned out to be longer than I expected, so I had to divide it into two parts. The essence of smooth scrolling will be covered in the next article, as I want to discuss the double buffering technique along with smooth scrolling, which is also a relatively long topic.

Disclaimer: Writing improper values to VGA registers may cause permanent damage to the hardware. The information provided here might not be accurate, as it has not been tested on a real CRT monitor, so you use it solely at your own risk. If there is any chance of damage, the author of this article has warned you and accepts no responsibility for anything that may happen.

Original VGA Graphics Card (wikipedia)
Well, let me explain the warning above a little bit. Forcing a VGA card to operate at a frequency that a CRT doesn't support can indeed damage the monitor [2]. I haven't seen this happen myself, but I know it is possible. On the other hand, I threw my last CRT monitor away maybe ten years ago (is there anybody out there still using a CRT?). My graphics card isn't a real VGA card, just VGA-compatible, so even the registers I've accessed aren't real. I developed, ran and optimized my code in DOSBox. So the timings would likely be off on a real CRT, and I'd need to readjust the waitretrace routines (waitretrace will be covered in the next article). TBH, I don't even know how this code would run on a real 80386.

Introduction

We usually access the VGA card through the int 10h BIOS interface. Everything from selecting the screen mode to moving the cursor around or adjusting its size can be done with this interrupt. What this interrupt really does is access the VGA registers in a "correct way". The VGA BIOS can overwrite the int 10h routines when needed.

I also do not think that int 10h causes a significant slowdown (except for a few routines like putpixel), and one advantage of it is that it lets us avoid complexity in the code. VGA has a lot of registers [3]*. Understanding the function of some of these registers requires some knowledge of how CRTs work. On the other hand, as I mentioned, I used DOSBox for this article. Although DOSBox can correctly emulate many VGA features, it still cannot display some (non-standard) effects properly. It is known that some DOS games don't work well in DOSBox (configuration errors aside). Yet DOSBox still deserves credit; it is more compatible with DOS than VMware or VirtualBox (and yes, I compared apples to oranges here: the first one is dedicated to DOS and just DOS, while the others are generic virtualization solutions).

* The source mentioned here refers to more than 300 registers, but not even 100 of them are documented there. This number probably includes non-standard registers added by different vendors. There are approximately 60 standard VGA registers, which is still a big number.


VGA Registers and Accessing the Card

As I mentioned above, there are a lot of VGA registers. In [3], these are grouped into six categories, formed according to the hardware port numbers used for access. In this article, I mainly worked with the CRT controller (CRTC) registers.

The registers are accessed via roughly six pairs of hardware ports. In other words, not every register has been assigned a port of its own. A list of ports is given in [3]. Generally speaking, you write the number of the register you want to access to port 0x3DX, and then read the register's value from port 0x3DX+1, or write a new value to it. There is an exception for port 0x3C0, but I won't go into it here. I'll provide plenty of examples of these registers in the following sections.

I didn't want to mention all the registers here and turn this article into a reference manual, so I'll focus only on the interesting parts. For example, the CRTC is accessed via ports 0x3D4 and 0x3D5. 0x3D4 is the address register and 0x3D5 is the data register [5].
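The address/data pairing can be illustrated with a toy model. This is only a sketch in Python: the FakeCRTC class is hypothetical, it touches no hardware, and it merely mimics how a value written to 0x3D4 selects which register the next access to 0x3D5 reaches.

```python
class FakeCRTC:
    """Toy model of the CRTC address/data port pair (no real hardware access)."""

    ADDRESS_PORT = 0x3D4
    DATA_PORT = 0x3D5

    def __init__(self):
        self.index = 0               # last value written to the address port
        self.registers = [0] * 256   # simulated register file

    def outb(self, port, value):
        value &= 0xFF
        if port == self.ADDRESS_PORT:
            self.index = value                   # select a register
        elif port == self.DATA_PORT:
            self.registers[self.index] = value   # write the selected register
        else:
            raise ValueError(f"unmodelled port {port:#x}")

    def inb(self, port):
        if port == self.ADDRESS_PORT:
            return self.index
        if port == self.DATA_PORT:
            return self.registers[self.index]    # read the selected register
        raise ValueError(f"unmodelled port {port:#x}")


crtc = FakeCRTC()
crtc.outb(0x3D4, 0x0A)   # "I want to talk to register 0x0A"
crtc.outb(0x3D5, 0x20)   # the value lands in register 0x0A
print(hex(crtc.registers[0x0A]))
```

The real ports behave the same way from the programmer's point of view: two port accesses per register access, the first one selecting the target.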


Cursor Start Register (Index 0Ah)
Bit:    7   6   5    4  3  2  1  0
Field:  -   -   CD   Cursor Scan Line Start


Now, let's take a look at the code for disabling the cursor in [4]:

void disable_cursor()
{
    outb(0x3D4, 0x0A);
    outb(0x3D5, 0x20);
}

First, we write 0xA to 0x3D4 to tell the card that we are going to access register 0xA. Then we write the value 0x20 to this register via port 0x3D5. This value sets bit 5 of the register, which is the Cursor Disable (CD) bit [5]. Pretty easy.
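To see why 0x20 is the right value, the register layout can be decoded with two masks. The following is a small Python sketch; the function name decode_cursor_start is my own and is not taken from any of the references.

```python
CURSOR_DISABLE = 1 << 5      # bit 5: CD (Cursor Disable)
SCAN_LINE_MASK = 0x1F        # bits 4-0: Cursor Scan Line Start


def decode_cursor_start(value):
    """Split a Cursor Start Register value into (disabled, start_scan_line)."""
    return bool(value & CURSOR_DISABLE), value & SCAN_LINE_MASK


print(decode_cursor_start(0x20))   # the value written by disable_cursor()
print(decode_cursor_start(0x0E))   # cursor enabled, starting at scan line 14
```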

The Cursor Scan Line Start bits hold the pixel row (scan line) at which the cursor starts. In the standard 80x25 character mode (mode 3), each character, and the cursor itself, is actually an 8 x 16 pixel (px) image. This is why custom fonts can be loaded in DOS by changing these images. In the screenshot below, which I took from a font editor for DOS, an example character can be seen up close, so that you can count the pixels. The font table is located in the VGA BIOS (check Int 10h / 1130h, if you're interested) and custom fonts are written to this area temporarily. After a reboot, the standard font comes back. As fonts are a complicated topic, I don't want to go into further detail.

Back to the topic: In the DOS edit environment, the cursor starts at the 14th pixel row and ends at the 15th, but when the Insert key is pressed, the cursor gets bigger: it starts at the 0th pixel row and ends at the 15th. The first element that produces this effect is the low five bits of the Cursor Start Register, and the second is register 0xB, the Cursor End Register, or more precisely its low five bits:


Cursor End Register (Index 0Bh)
Bit:    7   6  5          4  3  2  1  0
Field:  -   Cursor Skew   Cursor Scan Line End


This register holds the bottom pixel row of the cursor. But if a character is 16 px high, why 5 bits? VGA actually supports characters up to 32 px high in text mode [6]. Accordingly, the VGA font table is 8 KB in size: number of chars (256) * height (32 px) * width (8 px) / bits per byte (8). Therefore, 5 bits are allocated in each register for the cursor, but bit 4, the highest of the five, has no meaning in any standard text mode, as none of them uses chars higher than 16 pixels. The Cursor Skew field is reserved for EGA compatibility and has no meaning in VGA either.
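The arithmetic above is easy to check. This is a short Python sketch; the constant and function names are mine.

```python
CHARS = 256          # characters in the font table
MAX_HEIGHT = 32      # maximum character height in pixel rows
WIDTH = 8            # pixels per row, one bit each
BITS_PER_BYTE = 8


def font_table_bytes():
    """Size of the font table: 256 chars * 32 rows * 8 px / 8 bits per byte."""
    return CHARS * MAX_HEIGHT * WIDTH // BITS_PER_BYTE


def bits_for_scan_line(height):
    """Bits needed to address scan lines 0 .. height-1."""
    return (height - 1).bit_length()


print(font_table_bytes())        # 8192 bytes, i.e. 8 KB
print(bits_for_scan_line(32))    # 5 bits for 32-row characters
print(bits_for_scan_line(16))    # 4 bits suffice for 16-row characters
```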

Another easy-to-understand pair of registers is Cursor Location High (0xE) and Cursor Location Low (0xF). They hold the linear position of the cursor. When this value is divided by the number of character columns (80 in our case), the quotient is the y-position and the remainder is the x-position of the cursor. Or, in the opposite direction: D = Y * 80 + X. Since these registers are byte-sized, the high byte of D is written to 0xE and the low byte to 0xF.

Cursor Location High Register (Index 0Eh)
Bit:    7  6  5  4  3  2  1  0
Field:  Cursor Location High

Cursor Location Low Register (Index 0Fh)
Bit:    7  6  5  4  3  2  1  0
Field:  Cursor Location Low
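The conversion between (X, Y) coordinates and the two register bytes can be sketched in Python as follows. The function names are my own, and an 80-column screen is assumed as in the text.

```python
COLUMNS = 80   # character columns in mode 3


def cursor_to_registers(x, y, columns=COLUMNS):
    """Return (high, low): the bytes for registers 0xE and 0xF."""
    d = y * columns + x                # linear position D = Y * 80 + X
    return (d >> 8) & 0xFF, d & 0xFF


def registers_to_cursor(high, low, columns=COLUMNS):
    """Inverse direction: recover (x, y) from the two register bytes."""
    d = (high << 8) | low
    y, x = divmod(d, columns)          # quotient is Y, remainder is X
    return x, y


high, low = cursor_to_registers(10, 10)   # D = 810 = 0x032A
print(high, low)
print(registers_to_cursor(high, low))
```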


Back to the 80s: QBasic

Now, I am going to do a little demo with these two pairs of registers, and interestingly, I'm going to use QBasic for that. Like many others, I started programming with Basic: just a little C64 Basic, then GW-Basic (big thanks to TRT (Turkish State Television) at that time, especially the computer programming courses from the Open Education Faculty on Channel 4 (TRT4)) and finally QBasic. And I claim that everyone who was born in the 80s and had a computer in the 90s has seen the IDE below at least once. I used to write my scripts in QB when .bat files were inadequate for a specific task. Later, when QBasic's notorious slowness became a visible problem for me, it pushed me to learn C and Assembly. (There was also a short Pascal period somewhere in between.) BTW, QBasic was an interpreter, and being unable to compile .exe files was another huge drawback for me. Even though I had the chance to work with Quick Basic v4.5 at almost the same time I started learning C, the horizons C opened up were completely different. Additionally, finding the Quick Basic 4.5 IDE at that time was quite hard, at least for me.

That said, neither QBasic nor Quick Basic (QB for short) was an incapable programming language. With QB, I could do anything that I could do with C (except for speed). Looking at my old code, I wrote a QB - Int 33h interface for the mouse, and an int 13h interface for low-level disk operations. There is also incredible QB code written by others. But aside from its slowness, the programming logic in QB was also a bit "different", and IMHO it was diverging from where the programming paradigm was heading. I also tried my luck with Visual Basic, but I realized more or less around that time that Windows, in general, wasn't for me. In the early 2000s, I tried Win32 Assembly for visual programming, but it felt too cumbersome to me.

And even though I've been blogging for more than ten years and have written many code snippets in various programming languages in these posts, I realized that I haven't given a single QB example. However, I believe QB is well suited to such simple code snippets, because you neither need as many lines as in Assembly to do a simple thing, nor have to worry about things like including headers, type casts or paying attention to buffers, pointers etc. Anyway, enough retrospective. I wrote the following example for this post in QB:


DECLARE SUB ENABLECURSOR (CURSTART%, CUREND%)
DECLARE SUB DISABLECURSOR ()
DECLARE SUB MOVECURSOR (CURSORX%, CURSORY%)

FOR X% = 0 TO 15
  FOR Y% = X% TO 15
    CALL ENABLECURSOR(X%, Y%)
    SLEEP 1
    CALL DISABLECURSOR
    SLEEP 1
  NEXT Y%
NEXT X%

CALL ENABLECURSOR(0, 15)

FOR Y% = 0 TO 10
  FOR X% = 0 TO 10
    CALL MOVECURSOR(X%, Y%)
    SLEEP 1
  NEXT X%
NEXT Y%

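' Hide the cursor by setting the Cursor Disable bit (bit 5) of register 0Ah.
SUB DISABLECURSOR
    OUT &H3D4, &HA
    OUT &H3D5, &H20
END SUB

' Show the cursor and set its first and last scan line (0 to 15 here).
SUB ENABLECURSOR (CURSTART%, CUREND%)
    ' Keep bits 7-6 of register 0Ah, clear bit 5 (CD), set the start line.
    OUT &H3D4, &HA
    CS1% = INP(&H3D5)
    OUT &H3D5, (CS1% AND &HC0) OR CURSTART%

    ' Keep bits 7-5 of register 0Bh (incl. Cursor Skew), set the end line.
    OUT &H3D4, &HB
    CE1% = INP(&H3D5)
    OUT &H3D5, (CE1% AND &HE0) OR CUREND%
END SUB

' Move the cursor to column CURSORX%, row CURSORY% on an 80-column screen.
SUB MOVECURSOR (CURSORX%, CURSORY%)
    POSITION% = CURSORY% * 80 + CURSORX%

    ' Low byte goes to register 0Fh, high byte to register 0Eh.
    OUT &H3D4, &HF
    OUT &H3D5, POSITION% AND 255
    OUT &H3D4, &HE
    OUT &H3D5, POSITION% \ 256
END SUB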


The code is a bit long, but it is basically the same code as in [4]. In the first part, all combinations of cursor start and end lines are set in a for loop. The parameters are sent to the relevant VGA registers in the ENABLECURSOR subroutine. The delay (SLEEP) can be skipped by holding down the CTRL key.
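The read-modify-write masking done in ENABLECURSOR can be checked without touching any port. Below is a Python sketch of just the bit arithmetic; the function names are mine, and masking the new value with 0x1F is an extra safety step that the QB code does not perform.

```python
def new_cursor_start(old, start):
    """Keep bits 7-6, clear Cursor Disable (bit 5), set the start scan line."""
    return (old & 0xC0) | (start & 0x1F)


def new_cursor_end(old, end):
    """Keep bits 7-5 (including Cursor Skew), set the end scan line."""
    return (old & 0xE0) | (end & 0x1F)


# A disabled cursor (0x20) becomes enabled with start line 14.
print(hex(new_cursor_start(0x20, 14)))

# Number of (start, end) pairs visited by the first pair of FOR loops.
combinations = [(x, y) for x in range(16) for y in range(x, 16)]
print(len(combinations))
```

The first loop therefore shows 136 cursor shapes; with two SLEEP 1 calls per shape, an uninterrupted run takes about four and a half minutes.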

After the first for loop, I set the cursor to its biggest size so that it can be seen easily. Then I move it around the top-left 11 x 11 section of the screen (coordinates 0 to 10) using the MOVECURSOR subroutine. Assuming that the screen is 80 columns wide, I calculate the linear position of the cursor from the (X, Y) coordinates in the MOVECURSOR subroutine.

In the following article, I will continue discussing VGA registers, primarily those required for smooth scrolling, and provide additional examples in QB. However, since the scroll operation requires high speed, I wrote it in Assembly + C and used the waitretrace function that I mentioned at the beginning of this article.



[1]: DEC PDP8 Family User's Guide TSS/8 (1970). Link
[2]: https://retrocomputing.stackexc....damage-my-vga-card-by-programming-it-in-assembly-throu
[3]: http://wiki.osdev.org/VGA_Hardware
[4]: http://wiki.osdev.org/Text_Mode_Cursor
[5]: http://www.osdever.net/FreeVGA/vga/crtcreg.htm
[6]: https://en.wikipedia.org/wiki/VGA_text_mode#Fonts

Thursday, July 10, 2025

Add Captions or Labels to Images Using Python

Hi there. In this blog post, I'll be addressing a problem that arose from a practical need: adding labels, or any kind of text, to all images in a directory in bulk, for example a copyright note or an address. I think this can be solved by creating a stencil in GIMP. But what if the text to be added is not constant? My actual problem was adding sequence numbers to images: consecutive numbers, starting from one, in the corner of each photo. In this case, a stencil wouldn't be a solution. And even though it can be done by editing each and every image with GIMP, that wouldn't be very practical for several hundred images. The most practical solution is writing a simple script for this. By the way, I could have done this sequencing with the filenames, but I didn't want to touch them, because I also want to preserve the timestamps of the files. And let's assume I want to use these photos on a web page, where their filenames won't be visible at first glance. Finally, with such a script, you can generate time stamps similar to those of the cameras from the 90s.

When I talk about scripting, bash is the first thing that comes to (my) mind; however, it's unfortunately not the best tool for this job. AFAIK, there is no image library usable with bash. Of course, nothing is impossible. It can be done, but just as you don't use a hammer to knock down a wall when you can have a sledgehammer, choosing the appropriate tool for the job is the first step of any solution. Python lovers may get angry about this, but Python, the second best scripting language after bash, has a library for exactly this purpose, called the Python Imaging Library (PIL). I used a fork of PIL called Pillow in my script. This library is easily installed with the pip install pillow command.

Once again, I uploaded the code to my GitHub account to keep the article short. It's a small script with just a few tricks in it. First of all, like in any other Python script, there are import statements at the very beginning. The Image module of the library contains the image-related functions, including the rotation I need in my script. ImageDraw contains simple 2D drawing functions, which are needed to put the text on the image, and finally ImageFont is for fonts and other text effects.


Exif Data and Orientation

The first challenge of this project is that the images cannot be viewed on the computer as easily as we see them on the phone. Let's take the image below, which I took with my cell phone, as an example:


Above, it appears vertically in the browser. In Gwenview, it also appears in vertical format. But when I open it with the following code

>>> from PIL import Image
>>> img = Image.open("image.jpg")
>>> img.show()

it appears horizontally. And when I open it with GIMP, a strange dialog box says that the image contains "Exif orientation data" and asks if I want to rotate it. But why?

When you take a photo while holding a cell phone horizontally*, the phone doesn't actually rotate the pixel data. When I open photos taken with a cell phone or a digital camera in Python, those not taken in the camera's default position are shown on the screen exactly as the sensor recorded them. The camera saves the orientation info (thanks to its gravity sensor) inside the picture, and viewers rotate the picture accordingly while displaying it. If the orientation were not saved, we would have to turn the phone to the exact orientation in which the photo was originally taken, each time.

*: The default position of some cameras is horizontal, while for others it is vertical.

From the above, it's clear that an image doesn't consist only of pixel data. There is a block called Exif, where the metadata of an image is stored, and today most cameras, as well as common image formats such as JPEG, support Exif. There is a table on Wikipedia about the data stored in Exif. Typical fields are the manufacturer and the model of the camera, the image orientation, the time and date the photo was taken, the image resolution etc. For example, saving the timestamp of the photo allows you to find out when the photo was actually taken and to sort photos by date, even if the file is renamed afterwards. On the other hand, some phones store the coordinates from the phone's GPS there, which reveals where the photo was taken, and some platforms can then automatically tag the location of the photo when it is shared. These are cases that will send shivers down the spine of those who are sensitive about the privacy of personal data. According to legend, Ukrainians asked Russian soldiers online for their photos, and used the coordinate data of these photos to launch attacks.

In Linux, a tool called exiftool can be used to review this information (exiftool <filename>), to manipulate it, or to wipe it completely (exiftool -all= <filename>). For the example image above, there is a difference of around 77 KB between the original image and the image with all Exif data wiped.

After a long (and unnecessary) explanation about Exif, let's go back to the image orientation. In Pillow, there is an Image.getexif() function [1] to read and parse Exif data. When I print the output of this function to the screen (lines 22 to 24, commented out), I can see the orientation info of the .jpg files in the directory. As mentioned in [1], the values 2, 4, 5 and 7 are rare, and they did not occur in my images. Neither did 8, so I didn't implement 8 in my code either, although it would be easy. For the value 1, I used "else" in the if structure (line 31), so for 8 and all other values, the image is not rotated.
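Reading the orientation could look roughly like the sketch below. It is not the script from my repository; the helper names are made up for this example, and Pillow is imported inside the function only so that the lookup helper can be tried without it. The tag number 0x0112 is the standard Exif Orientation tag.

```python
ORIENTATION_TAG = 0x0112   # Exif tag 274: Orientation


def orientation_from_exif(exif):
    """Return the orientation value from a mapping of Exif tags, or None."""
    return exif.get(ORIENTATION_TAG)


def read_orientation(path):
    """Open an image with Pillow and return its Exif orientation, or None."""
    from PIL import Image   # requires: pip install pillow

    with Image.open(path) as img:
        # getexif() returns a dict-like object; it is empty if there is no Exif.
        return orientation_from_exif(img.getexif())


# The lookup itself works on any mapping, e.g. a plain dict:
print(orientation_from_exif({ORIENTATION_TAG: 6}))
print(orientation_from_exif({}))
```

Calling read_orientation("image.jpg") on each .jpg file in a directory and printing the result gives an overview like the one I described above.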

For rotation, Pillow has the Image.rotate() function [2]. Using that, I rotated the image 180 degrees for Orientation=3, and 270 degrees for Orientation=6. I have to make a side note here about line 27: if there is no orientation field in Exif, or if there is no Exif information in the image at all, the code will throw an error. For this reason, best practice is to check the return value of the getexif() function first and rotate only if the orientation is present. Since all my images have Exif, I did not have any problem and kicked the can down the road. That doesn't mean you won't have any issues either.
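A more defensive variant could look like the following sketch. It uses the mapping commonly given for Exif orientation (3: 180 degrees, 6: 270 degrees, 8: 90 degrees, all counter-clockwise as Image.rotate() expects) and leaves the image untouched when the tag is missing. The names rotation_angle and upright are hypothetical and are not taken from my script.

```python
ORIENTATION_TAG = 0x0112   # Exif Orientation tag

# Counter-clockwise angles for Image.rotate(); mirrored variants are ignored.
ROTATION_FOR_ORIENTATION = {3: 180, 6: 270, 8: 90}


def rotation_angle(orientation):
    """Angle to rotate by; 0 for orientation 1, missing or unhandled values."""
    return ROTATION_FOR_ORIENTATION.get(orientation, 0)


def upright(img):
    """Return a copy of a Pillow image rotated according to its Exif data."""
    angle = rotation_angle(img.getexif().get(ORIENTATION_TAG))
    if angle == 0:
        return img.copy()
    # expand=True grows the canvas so that a 90/270 degree turn is not cropped.
    return img.rotate(angle, expand=True)


print(rotation_angle(6))      # 270
print(rotation_angle(None))   # 0, no Exif orientation present
```

Pillow also ships ImageOps.exif_transpose(), which handles all eight orientation values, including the mirrored ones.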

So far, I've corrected the image orientation. I used [3] to solve the main problem. It demonstrates how to add text in a simple way; I only adjusted the parameters for my system. In line 35, an ImageDraw object is created to add text. The next line creates a font object with the ImageFont.truetype() function. As the font used in [3] isn't on my machine (and as I don't check the return value of the function for errors), I chose one of my fonts under /usr/share/fonts/ . The second parameter is the font size. I found its value by trial and error. My photos were relatively large (8 MP), so even 128 generated a barely visible caption. In the next line, at the given coordinate of the image (25, 25 - top left), I added the sequence number "sirano" in red, with the font I created in the previous line. At this step, the date and time of the photo could also be added to the image automatically, either from its filename or from its Exif data. An example of an enumerated photo is below:
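The captioning step could be sketched like this. The function name add_caption and the font path are assumptions for this example (any TrueType font under /usr/share/fonts/ will do), and the fallback to Pillow's built-in font is an addition that my script does not have.

```python
def add_caption(img, text,
                font_path="/usr/share/fonts/truetype/dejavu/DejaVuSans.ttf",
                size=128, position=(25, 25), color="red"):
    """Return a copy of a Pillow image with text drawn near the top-left corner."""
    from PIL import ImageDraw, ImageFont   # requires: pip install pillow

    try:
        font = ImageFont.truetype(font_path, size)
    except OSError:
        # Font file not found: fall back to Pillow's small built-in font.
        font = ImageFont.load_default()

    out = img.copy()                        # leave the original untouched
    draw = ImageDraw.Draw(out)
    draw.text(position, text, font=font, fill=color)
    return out
```

A call such as add_caption(img, str(sirano)) then puts the sequence number on the photo; the same function could draw a timestamp string instead.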


Finally, line 40 can be uncommented to show the image on the screen and/or line 43 can be uncommented to save the tagged image with the "_enum" suffix. While working on this script, I ran it in a directory with several hundred images, and I wanted neither to display nor to save that many images on each run, so I commented these lines out.
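The numbering and the "_enum" naming can be sketched with the standard library alone. The helper names are hypothetical, and sorting by filename is an assumption about the desired order; my script may order the files differently.

```python
from pathlib import Path


def enum_name(path):
    """photo.jpg -> photo_enum.jpg, in the same directory."""
    p = Path(path)
    return p.with_name(f"{p.stem}_enum{p.suffix}")


def numbered(paths):
    """Pair each path with a sequence number starting from one."""
    return list(enumerate(sorted(paths), start=1))


for sirano, name in numbered(["b.jpg", "a.jpg", "c.jpg"]):
    print(sirano, name, "->", enum_name(name))
```

Since the tagged copy is saved under a new name, the original file and its timestamp stay untouched.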


Note: Alternatively, it's possible to do the same thing in OpenCV with the cv2.putText() function, but I'm keeping this for another post.


[1]: https://jdhao.github.io/2019/07/31/image_rotation_exif_info/
[2]: https://note.nkmk.me/en/python-pillow-rotate/
[3]: https://www.geeksforgeeks.org/python/adding-text-on-image-using-python-pil/