Sunday, August 11, 2019

What is the Disk System and What isn't? #2: MBR


Hi there. I mentioned hard disk hardware in the previous article. This article will be a bit more software oriented, or at least as software oriented as an article about interrupts and IO ports can be. Let's start with the most fundamental data structure of hard disks.


Master Boot Record (MBR)
The MBR is a structure in the first sector of a hard disk; it holds the partitioning scheme and contains some code. When a computer is powered on, the BIOS takes over execution and, after it finishes its own job, reads the first sector from one of the accessible storage devices according to the boot sequence setting in CMOS. This process has not changed much since the 1980s. If the boot medium is a floppy disk, that first sector is the boot sector; if it is a hard disk, the first sector is the MBR. To the BIOS, there is no difference between a boot sector and an MBR: it simply copies the first sector of the first available drive to memory location 0000:7C00 and makes a far jump there to boot. Obviously, none of this involves protected mode yet, since I was talking about the 80s and using real mode addressing.

The code in the MBR determines which partition is set for booting, reads the first sector of that partition into memory and branches there. The MBR keeps the disk partitions in a table.

I used the HxD disk editor throughout this article to read the MBR under Windows. I ran HxD as administrator, selected "Open Disk" (Ctrl + Shift + D) and then selected one of the physical disks. I recommend checking "Open as Readonly".

Linux users are lucky, because the following command is sufficient:

sudo dd if=/dev/sda count=1 bs=512 | hexdump -C -v

Back when I was trying to learn assembly, being able to view the MBR after 3-4 months was a great success for me. Just mentioning a blast from the past.


The screenshot from HxD is above. I colored the image to make the areas easy to distinguish. The red area is code, the blue area is the partition table and the green area is the MBR signature. Even if the rest of the sector is intact, the BIOS will not boot without that 0x55 0xAA signature. So why this number? Because it is (0101 0101 1010 1010)2.

The same output under Linux:


Since GRUB is installed on both machines, both outputs above contain GRUB MBR code. Before examining the code, let's take a look at the partition table colored in blue. The same code and signature can easily be seen in the example I gave with debug.exe in the previous article.


Partition Table
The partition table is a 4 * 16 byte field starting at offset 01BEh of the MBR. There is a 16-byte record for each primary partition. If I exclude extended partitions for now, it follows that there can be at most 4 primary partitions. The table below shows the structure of the records:

Offset | Size    | Description
-------+---------+----------------------------
0x00   | byte    | Bootable flag
0x01   | 3 bytes | Starting CHS address
0x04   | byte    | System ID
0x05   | 3 bytes | Ending CHS address
0x08   | dword   | Starting LBA address
0x0C   | dword   | Total sectors in partition

The bootable flag indicates whether the system will boot from that partition at startup. There can be one and only one partition with the bootable flag set in each MBR. If this value is 00h, the partition is not bootable. If it is 80h, the partition is bootable, but some MBR code checks only the seventh bit. Thus the values in the range [01h, 7Fh] are invalid.

I will skip the CHS address fields for now and come back to them later.

Although the system ID field specifies the type of filesystem used in that partition, this byte is not that important; some OSes and OS installers do not even check it. The field also has no real standard. There were some values used by Microsoft and IBM in the 80s. While Minix and Linux were being developed, values not used by Microsoft or IBM at the time were picked as their ID numbers and subsequently became de facto standards. Detailed information about this and the list of these IDs are available on Wikipedia: https://en.wikipedia.org/wiki/Partition_type

The starting LBA address field is the value obtained by translating the starting CHS address into LBA. The purpose of the total sectors field can be understood from its name. Multiplying this value by 512 bytes gives the size of the partition, but this is the partition's raw space, not its formattable space. I will explain the difference between the two in the boot sector article.

CHS addresses are stored with a complex scheme. The first byte of this three-byte field is the head number. The low six bits of the second byte hold the sector number. The third byte holds the low eight bits of the cylinder number, and the high two bits of the second byte hold the two high bits of the cylinder number (which makes 10 bits in total). I tried to visualise this below:

1st byte                  2nd byte                  3rd byte
H7 H6 H5 H4 H3 H2 H1 H0   C9 C8 S5 S4 S3 S2 S1 S0   C7 C6 C5 C4 C3 C2 C1 C0

Okay, does it look familiar? Yes, int 13h: the first byte goes into DH, the second into CL and the third into CH, et voilà, it fits perfectly.
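To make the bit shuffling concrete, here is a short Python sketch (my own illustration, not code from any tool mentioned in this article) that packs and unpacks this 3-byte field:

```python
def unpack_chs(b: bytes):
    """Decode the 3-byte CHS field of a partition entry into (cylinder, head, sector)."""
    head = b[0]
    sector = b[1] & 0x3F                    # low 6 bits of the 2nd byte
    cylinder = ((b[1] & 0xC0) << 2) | b[2]  # high 2 bits of 2nd byte + 3rd byte
    return cylinder, head, sector

def pack_chs(cylinder: int, head: int, sector: int) -> bytes:
    """Encode (cylinder, head, sector) back into the on-disk 3-byte form."""
    return bytes([head,
                  ((cylinder >> 2) & 0xC0) | (sector & 0x3F),
                  cylinder & 0xFF])

print(unpack_chs(bytes([0x0F, 0x3F, 0xFF])))  # (255, 15, 63)
```

Note that these three bytes are exactly what Int 13h expects in its registers: the first byte goes to DH, the second to CL and the third to CH.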

In the previous article, I wrote that CHS addressing is insufficient for hard disks over 8 GB:
With a more optimistic approach, without fuss; assuming that the MBR code does not need end CHS address because to load the boot sector of a partition only starting CHS address is sufficient, it can be concluded that bootable disk partitions can only be created within the first 8 GB of hard disks. It is impossible to boot from partitions beyond the 8 GB limit using CHS!

OK, so how can a partition be created beyond 8 GB? There is a 32-bit starting LBA address field. If a boot loader checks this field instead of CHS, it can boot from partitions within the first 2^32 * 512 bytes = 2 TB of the disk. Additionally, since the total sectors field is also 32 bits, a 2 TB partition can be created just before the end of this 2 TB limit, so MBR can still be used on disks up to 4 TB, in theory. I am saying "in theory" because there is no guarantee that a 25-year-old MBR code will run successfully with these extreme values.
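The arithmetic behind these limits is easy to check (a quick sketch of my own):

```python
SECTOR = 512

# 32-bit starting LBA field: a bootable partition must start below this
start_limit = 2**32 * SECTOR
print(start_limit // 2**40, "TiB")        # 2 TiB

# 32-bit sector count: a partition starting just below 2 TiB can extend
# almost another 2 TiB, so an MBR disk can theoretically reach ~4 TiB
disk_limit = ((2**32 - 1) + (2**32 - 1)) * SECTOR
print(disk_limit / 2**40, "TiB")
```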

I have added the screenshot with debug.exe from the previous article below again. Let's try to understand the meaning of the bytes here:


80 | 01 01 00 | 06 | 0F 3F FF | 3F 00 00 00 | C1 EF 03 00
00 | 00 41 00 | 06 | 0F 7F FF | 00 F0 03 00 | 00 F0 03 00

There are two disk partitions above. The first one is bootable. Both system IDs (06h) correspond to FAT16. The CHS addresses of the first partition are (0, 1, 1) - (255, 15, 63); it starts at sector 63 and is 257 985 sectors long. The CHS addresses of the second partition are (256, 0, 1) - (511, 15, 63); it starts at sector 258 048 and is 258 048 sectors long.
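As a sanity check, here is a small Python sketch of mine that decodes a raw 16-byte entry; I feed it the second entry above:

```python
import struct

def parse_entry(e: bytes) -> dict:
    """Decode one 16-byte MBR partition table entry."""
    def chs(b):  # (cylinder, head, sector) from the packed 3-byte field
        return (((b[1] & 0xC0) << 2) | b[2], b[0], b[1] & 0x3F)
    lba_start, sectors = struct.unpack_from("<II", e, 8)  # little-endian dwords
    return {"bootable": e[0] == 0x80,
            "system_id": e[4],
            "chs_start": chs(e[1:4]),
            "chs_end": chs(e[5:8]),
            "lba_start": lba_start,
            "sectors": sectors}

entry = bytes.fromhex("00004100060f7fff00f0030000f00300")
print(parse_entry(entry))
```

The output reproduces the values decoded by hand above: system ID 06h, CHS (256, 0, 1) - (511, 15, 63), start LBA 258 048 and 258 048 sectors.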
 
Why is this table so important? If I were creating my own OS to run alone on a computer, it would have no importance at all. I could ignore all of this, claim that I have designed a better table and use mine. But all other OS developers have agreed on this structure, so if I used my own partition table, my OS would not be compatible with the others.


Damn the Partition Table and Let's Write Code...
Well yes, I still could not start writing code, but I will start giving examples in this section. First, I created two VMs for testing. In the previous article, I had installed DOS 6.22 on one of the VMs in VirtualBox for the first example. 32 MB of RAM and a 500 MB hard disk are enough for this VM. The machine comes with a network card after it is created; I set it to host-only (Settings -> Network -> Enable Network Adapter, Attached to: Host-only Adapter). DOS floppy images can be found on the Internet. I also used the Damn Small Linux (DSL) iso. Live distros like Ubuntu, Knoppix etc. do not run on 32 MB RAM, while the DSL iso is just 50 MB. Since it is no longer developed, it can be downloaded either here or here. TinyCore Linux, the closest alternative to DSL, has an iso of only 14 MB but needs 64 MB RAM.

After I booted the machine with the first MS-DOS installation disk, setup asked me to create a partition, but I quit setup and manually created a partition with fdisk:

1 - 1 - N - 25% - Esc


After doing that, the warning on the screen informed me that "No partitions are set active". This means there is no partition whose first byte (the bootable flag) is 80h. The warning disappears after pressing 2 - 1 - Esc (setting the partition active). Afterwards, when I tried to create another primary partition, I got an error: unfortunately, DOS fdisk cannot create more than one primary partition. An extended partition can be created, but I will explain extended partitions at the end of this article. I pressed Esc to exit the main menu, removed the floppy disk and booted the machine with DSL. By the way, after disk partitions are created or deleted, DOS is unable to recognize them until it is restarted.

I found the partition I had created by running the sudo fdisk -l command in a terminal, and I added another partition by running the following commands:

sudo fdisk /dev/hda
n - p - 2 - (Enter) - 512
t - 2 - 6
p
w

After booting the virtual machine again with the DOS floppies, the OS recognized and formatted C: and D:. A standard Microsoft installation is completed by pressing the Enter key repeatedly and swapping floppy disks when asked during the installation. And by the way, a warning to those who forgot the last floppy disk in the drive and got "Non-system disk error" after a reboot: in the past, when somebody booted a computer with a floppy disk whose boot sector was infected with a boot virus, the virus wrote itself into memory during the boot, showed this error and waited for the user to press a key to spread further. To avoid this, it was necessary to restart the computer instead of pressing a key. To clean boot viruses, the MBR code of hard disks could be rewritten with the fdisk /mbr command. Nowadays, it can be used to remove a GRUB installation from the hard disk.

After installing DOS, I checked the MBR again using debug.exe: 


I booted the machine again with DSL. The machine's IP address should appear on the right of the screen. I will use this IP to copy files from my host to the VM. Since my host runs Linux, I can copy files easily with scp. Windows users have to start sshd in the VM first: DSL -> System -> Daemons -> ssh -> start. Then WinSCP can be used (Cygwin works too).

The file I am going to copy is the disk editor from the Norton Utilities package. There are two different versions of the disk editor with the same file name, DISKEDIT.EXE. The first one is from the Norton Utilities v8.0 (1994) release and the other comes with the Norton Utilities 2000 release. The first one supports CHS and the second one supports LBA. The file from the Norton 2000 release is bigger (about 600K). It is really practical to work with a disk editor, so those who can find this file should consider themselves lucky. (Disclaimer: if these files are still copyrighted, downloading them could violate the law.)

I copied the disk editor files to the VM's disk, restarted the machine once again and started the editor. I listed the disks by pressing Alt+D, selected "Physical Disks" as the type and opened the hard disk. It showed the sectors like HxD does. I selected "View as Partition Table" under the View menu (or pressed F6), and all the information I had to decode manually in the previous section appeared. The disk editor will help a lot with the boot sector and FAT, so it is useful to keep it on disk.

If you have an iso file of Norton Utilities, you have to install a CD-ROM driver in DOS. Readers who haven't used DOS before may be surprised at the necessity of installing a CD-ROM driver, but well, those were the days. I will explain the installation in a separate article and give its link here.

According to the screenshot, the second partition ends at the end of cylinder 511. By pressing Alt+I and selecting "Drive Info" in that menu, the editor shows the properties of the hard disk (next figure).

Drive Number: 80 Hex (First physical disk)
Int 13x: Yes (BIOS supports LBA)
Sides: 16 (Heads)
Tracks: 1015 (Cylinders)
Sectors/track: 63
Total Sectors: 1 024 000

This information can also be retrieved using Int 13h/08 subroutine:

The number of cylinders is 3F6h, the number of sectors per track is 3Fh and the head number is 0Fh. The return values in the registers can be interpreted using Ralf Brown's interrupt list (Int 13h/08). As can be seen, DL = 02 in the output above, because I had added a second hard disk to the machine at that time for testing.

Now I will write "denizyildizi" (a totally random word meaning starfish in Turkish) into a random sector of the disk. Let it be CHS (600, 0, 1). This address corresponds to LBA 604 800 (using the CHS-to-LBA formula from the previous article). I opened the disk editor, pressed Alt+P and typed the LBA address in the dialog box. To be able to write to the disk, I unchecked the Read-Only box under Tools -> Configuration and pressed OK. I pressed the Tab key to switch to the right column, so I could insert text instead of hexadecimal numbers, and wrote what I wanted. Then I hit Alt+X and clicked the "Write" button in the "Write changes to disk" dialog box.
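The address I picked can be double-checked with the conversion formula from the previous article, using the geometry from the Drive Info screen (16 heads, 63 sectors per track); a tiny sketch of mine:

```python
HEADS, SPT = 16, 63            # geometry reported for this virtual disk

def chs_to_lba(c, h, s):
    return (c * HEADS + h) * SPT + (s - 1)

print(chs_to_lba(600, 0, 1))   # 604800
```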

To do the same with debug.exe (in the absence of a disk editor), I wrote a code snippet using subroutine Int 13h/43h. This subroutine, together with subroutine 42h, is referred to as the Int 13h extensions.

mov cx,000C      ; 12 characters
mov si,0120      ; From 0120h (source)
mov di,0200      ; to 0200h (dest)
repz movsb       ; copy
xor ax,ax
mov cl,FA        ; 0FAh words more
repz stosw       ; copy
inc si           ; DAP exist in next address
push ds
pop ax           ; Get DS
mov word ptr [si+06],ax ; and write it to DAP
mov ax,4300
mov dl,80
int 13           ; call int 13h
int 3
db 00
db 'denizyildizi',0
db 10 00 01 00 00 02 00 00 80 3A 09 00 00 00 00 00

The last line in the snippet contains the data structure named DAP (Disk Address Packet). The first byte, 10h, is the length of the DAP. 00 is reserved. 00 01 is the number of sectors to read/write and (0000:0200)16 is the buffer address. It could be misleading that the segment address is 0000 here, but the correct segment address is written there by the code. The (09 3A 80)16 value is the LBA address (stored as a qword). I ran this with the g command of debug.exe and the sector was written as expected. Next, I will read back the data I wrote, but this time using the IO ports of the disk controller with LBA addressing. I rewrote the snippet from the previous article with some minor differences (and added offset numbers again to ease reading):
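The same 16-byte DAP can be generated with Python's struct module; this sketch of mine reproduces the bytes in the snippet above:

```python
import struct

def make_dap(lba: int, count: int, seg: int, off: int) -> bytes:
    """Build a Disk Address Packet for Int 13h functions 42h/43h:
    size (10h), reserved, sector count, buffer offset, buffer segment, 64-bit LBA."""
    return struct.pack("<BBHHHQ", 0x10, 0, count, off, seg, lba)

dap = make_dap(lba=604800, count=1, seg=0x0000, off=0x0200)
print(dap.hex(" "))  # 10 00 01 00 00 02 00 00 80 3a 09 00 00 00 00 00
```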

0100    mov ax,0001
0103    mov dx,01F2
0106    out dx,al   ; Sector count = 1
0107    inc dx      ; dx = 01F3
0108    mov al,80
010A    out dx,al   ; LBA address1 = 80h
010B    inc dx      ; dx = 01F4
010C    mov al,3A
010E    out dx,al   ; LBA address2 = 03A
010F    inc dx      ; dx = 01F5
0110    mov al,09
0112    out dx,al   ; LBA address3 = 09h
0113    inc dx      ; dx = 01F6
0114    mov al,E0   ; LBA mode, master disk
0116    out dx,al   ; LBA address4 = 0h
0117    inc dx      ; dx = 1F7
0118    mov al,20
011A    out dx,al   ; 20h ATA read sector command
011B    in al,dx    ; Read status register
011C    test al,58  ; 0101 1000: Drive ready | Seek complete | Buffer ready
011E    jz 011B     ; Wait until disk is completely read
0120    mov dx,01F0
0123    mov bx,0200
0126    in ax,dx    ; Read data (word sized)
0127    mov [bx],ax ; Copy to the buffer
0129    inc bx
012A    inc bx
012B    cmp bx,0400 ; 0200h byte
012F    jnz 0126
0131    int 3       ; Breakpoint

When I checked again with d 0200 command in debug.exe, I verified the data written previously. 


Logical (Extended) Disk Partitions
Before starting this chapter, I created a second VM with 32 MB RAM and a 1 GB disk in VirtualBox (with a host-only network adapter like the previous one). I downloaded the FreeDOS CD iso from FreeDOS.org and inserted it into the virtual drive. The contents of the Legacy and CD images are the same; the only difference is their boot mode. I selected "Install to harddisk", then chose the language and selected "No - Return to DOS" to partition the disk manually, and ran fdisk. Unlike DOS 6.22's fdisk, FreeDOS' fdisk comes with FAT32 support. FAT32 appeared in 1996 and went into use with Win95 OSR2; I will go into the details of FAT32 in the next articles. A note on this part: fdisk tests the disk's LBA support while creating a partition. On disks that do not support LBA, the partition type field in the MBR is 06h for FAT16 and 0Bh for FAT32, while on disks that support LBA, these values are 0Eh and 0Ch respectively.


The question of whether FAT32 support should be enabled, shown in the figure above, is only important for disks larger than 2 GB. I pressed "N" and moved on. I created a 340 MB primary partition with the key combination "1 - 1 - N - 340 - (Enter) - (Esc)" and with option "2" in the main menu I activated the partition I had just created. Unlike in the previous example, I then created an extended partition with the key combination "1 - 2 - (Enter) - (Esc)". At this step, fdisk asked if I wanted to create a logical drive in the extended partition. I typed 340 MB here and hit Enter. fdisk asked again whether to create another drive in the remaining space, and I hit Enter again. Due to a bug at this step, fdisk thinks there are still 2 MB of unpartitioned space and asks the user once more. If Enter is pressed, the partition table is written incorrectly. Therefore, I pressed Esc.
 

When I selected the fourth option in fdisk, the drives were listed as above. I quit fdisk, rebooted the VM and continued the installation. I answered the "Do you want to format your drive" question with Yes, then selected "Full installation" and started the installation. When it finished, I removed the CD and restarted the machine.

Now I can see three hard disk drives (C:, D: and E:) in FreeDOS. However, when I check with the disk editor or debug.exe, there are only two partitions in the MBR. So where is the third? (By the way, if you have diskedit, copy it to this VM as well.)

When I checked the disk with the disk editor, there were 520 cylinders on the disk (I used the old version of the disk editor for this example, without any specific reason). The first partition ends at cylinder 173, so the primary partition is fine. The second partition is listed as EXTEND and there is another partition table at the address CHS (173, 0, 1). Pressing Enter while EXTEND is selected shows that sector formatted as a partition table.

This partition table looks like the previous one at address (0, 0, 1), but they are not the same. In the lower right corner of the screen I can read the address (173, 1, 1), so I am obviously not reading the MBR. When I switch to the hex view by pressing F2, there is no MBR code in this sector, only a partition table and the sector signature. The first entry of this partition table is a partition starting at (173, 1, 1) and ending at cylinder 346. The second entry is another extended (EXTEND) partition right after the point where the previous partition ends. When you select this extended partition in the editor and press Enter, another partition table without boot code can be seen in sector CHS (346, 0, 1). In this partition table, there is a partition starting at (346, 1, 1) and ending at the end of the disk. As can be seen, an extended partition is a structure that contains the logical partitions inside it and keeps their information in a linked list (of sorts) of partition tables.
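That linked-list traversal can be sketched in Python. This is a hypothetical model of mine (read_sector and the LBA arithmetic are my assumptions based on the EBR layout described above), not code from any tool in this article:

```python
import struct

def logical_partitions(read_sector, ext_start):
    """Walk the EBR chain of an extended partition starting at LBA ext_start.
    read_sector(lba) must return 512 bytes. Yields (abs_start_lba, sectors, system_id)."""
    ebr_lba = ext_start
    while True:
        ebr = read_sector(ebr_lba)
        assert ebr[510:512] == b"\x55\xaa"       # EBRs carry the signature too
        # 1st entry: the logical partition; its start is relative to THIS EBR
        sys_id = ebr[0x1BE + 4]
        rel, count = struct.unpack_from("<II", ebr, 0x1BE + 8)
        yield ebr_lba + rel, count, sys_id
        # 2nd entry: link to the next EBR, relative to the extended partition start
        next_rel, = struct.unpack_from("<I", ebr, 0x1CE + 8)
        if next_rel == 0:                        # a zero link ends the chain
            break
        ebr_lba = ext_start + next_rel
```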

As far as I know, there is no theoretical reason for extended partitions not to be bootable, but the code in the MBR would need to search for the active partition among the extended partitions if there is no active partition in the main table. This means an increase in the size of the code, which has to fit in 512 bytes. I am not sure whether this is what the standard makers wanted to avoid or whether there is some other reason. Further examples can be found in the Extended Boot Record article on Wikipedia.


How Does MBR Code Work?
To examine their MBR code, I booted both machines that I have set up so far with DSL and saved their MBR with the following command:

dd if=/dev/hda of=mbr.bin count=1 bs=512

The same command on a machine with a recent Linux distro would be;
 
dd if=/dev/sda of=mbr.bin count=1 bs=512

Then I disassembled these outputs on a Linux machine with the following command:

objdump -M intel -D -b binary -m i8086 --adjust-vma=0x7C00 mbr.bin > mbr.asm

In order to keep the article short, I uploaded the commented disassembly of the DOS 6.22 MBR and the FreeDOS MBR to my Google Drive instead of examining the code line by line here. Both codes are broadly the same: after setting the stack pointers, the code copies itself to address 0:0600h in DOS and 1FE0h:7C00h in FreeDOS, because the boot sector code that will be loaded right after it must be placed at the same address, 7C00h. As I previously mentioned, there is no difference between boot sector and MBR code from the BIOS perspective. The task of the BIOS is to read the first sector of the boot medium to 0:7C00h, check the 55h AAh bytes at the end and jump to this address. If the boot medium is a floppy, the boot sector is read to this address directly, since there is no MBR to read.

The DOS code checks that there is one and only one 80h bootable flag in the partition table. FreeDOS, however, simply reads the boot sector of the first partition table entry whose bootable flag has its highest bit set. If the boot sector cannot be read, a "read error" is thrown. There is no further check than that.

Regarding disk reading: DOS tries to read the boot sector, whose CHS address is taken from the partition table entry, five times, using a counter kept in the DI register. If the boot sector still cannot be read on the fifth attempt, it gives the error "Error loading OS". There is no LBA support in the DOS MBR code. FreeDOS, on the other hand, checks for LBA support first. If it is supported, the code reads the boot sector using the DAP stored at 7CCDh. If unsupported, almost the same code as in the DOS MBR runs. The signature check is done after the sector is read (in FreeDOS, this check is on the return of the disk read function) and the program jumps to 0:7C00h.
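The two active-partition scans described above can be modeled in a few lines of Python (a simplified illustration of my own, not a transcription of either disassembly):

```python
def choose_boot_entry(mbr: bytes, strict: bool = True) -> int:
    """Return the index of the partition entry to boot from.
    strict=True mimics the DOS check (exactly one 80h flag; 01h-7Fh invalid);
    strict=False mimics FreeDOS (first entry whose flag has the high bit set)."""
    if mbr[510:512] != b"\x55\xaa":
        raise ValueError("missing 55h AAh signature")
    flags = [mbr[0x1BE + 16 * i] for i in range(4)]
    if not strict:
        return next(i for i, f in enumerate(flags) if f & 0x80)
    if any(0x01 <= f <= 0x7F for f in flags):
        raise ValueError("invalid boot flag in partition table")
    active = [i for i, f in enumerate(flags) if f == 0x80]
    if len(active) != 1:
        raise ValueError("exactly one active partition expected")
    return active[0]
```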

In fact, I should discuss GRUB, GPT and UEFI as a continuation of this topic, but since UEFI is a technology that I do not know very well, I will summarize them in a future article. Although what I have discussed in this article is the technology of the 1980s, the code that runs today (unless UEFI is used) is the same, apart from minor differences of course. That finishes the MBR. In the next article, I am going to discuss the next step: the boot sector.

Monday, May 6, 2019

What is the Disk System and What isn't? #1: Hard Disks


Hi there. I had been thinking about writing on file systems, especially FAT, for a while, but it does not make any sense to write about FAT without mentioning the boot sector, or about the boot sector without mentioning the MBR and, more importantly, the hardware. Therefore I decided to deal with this topic from the very beginning. Speaking of the beginning, I will also mention where BIOS ends and MBR starts in the boot process, but I will not dig much deeper into BIOS (maybe in another article).


Addressing the Hard Disk: CHS, LBA and Int 13h
Cylinders, Heads and
Sectors on a Harddisk
CHS is an acronym for Cylinder, Head and Sector. It was standardized with very old disks (from the 1970s) and although the underlying disk structure has changed dramatically, CHS addressing continued to be used for about 30 more years due to backward compatibility. The internal structure of a rotating disk is shown in the next figure. This hard disk consists of four platters and eight heads for access. Each platter has concentric data rings; the stack of rings at the same radius is called a cylinder. Cylinders are also called "tracks", but I'll use the term "cylinder" throughout the article. Finally, when each cylinder is cut into slices at a certain angle (like the red triangle in the figure), the areas where the slices cut the cylinder are called sectors. Each sector holds 512 bytes. (Nowadays 4K sectors are used, but the disk controller reports 512 bytes to the OS for backward compatibility.) Let's assume we are in the 1980s for now (I wish). We need three numbers for CHS addressing. An important note: while cylinder and head numbers start from zero, sector numbers always start from one.

I find the image below more descriptive:

https://superuser.com/questions/974581/chs-to-lba-mapping-disk-storage

A cluster is a structure consisting of one or more physical sectors. It is the smallest element of the logical file system. I will not discuss it in this article, because I am still defining the hardware elements and will cover the logical elements later. In some sources the terms "cluster" and "sector" are used interchangeably, but that is simply wrong. A cluster can be equal to a sector (in some special cases, e.g. floppies), but it does not have to be.

5 MB IBM disk being
carried by forklift.
Source: thenextweb.com
CHS logic is simple but wasteful. Careful readers will notice that the outer sectors are larger than the inner ones, so CHS causes inefficient use of the outer sectors in terms of data density. In the 1970s, disks of a few tens of MBs, priced at a few tens of thousands of dollars, had this structure. These are called CAV (constant angular velocity) disks: since the angular velocity is constant, the disk head passes over each sector in a constant time. Having said this, I must also mention that CDs and DVDs are CLV (constant linear velocity) disks. When a CD was being read, it would spin slowly at first on the inner tracks and speed up on the outer tracks near the end of the recording. In the 1990s (actually near the end of the 80s), zone bit recording (ZBR) technology was developed for hard disks. With ZBR, the large outer cylinders are divided into more sectors, while the relatively smaller inner cylinders get fewer. For backward compatibility, the controller received CHS addresses from the OS and converted them into real physical sectors in the background using a formula.

By the way, let's come back to the present day for a moment. In SSDs, CHS as well as CAV/ZBR are completely irrelevant; it is easy to see that all of these technologies are obsolete now.

Another addressing method is LBA (Logical Block Addressing). In LBA, the disk is addressed using only a linear sector number (one single number). For example, a 1.44 MB high density (HD) floppy has 80 cylinders, 2 heads and 18 sectors per track (9 sectors per track on double density (DD) floppies). The CHS address of the last sector is (79, 1, 18). Its LBA address is 2879 (80 * 2 * 18 = 2880 sectors in total; since the first LBA sector is zero, the last one is 2879).
 
Let c be the cylinder number, h the head number and s the sector number. Let Nhead be the number of heads of the disk and NSPC the number of sectors per cylinder (track). The formula to translate a CHS address to LBA is the following:

LBA(c, h, s) = (c * Nhead + h) * NSPC + (s - 1)    [ 1 ]
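Formula [1] in Python, checked against the floppy example above (a sketch of my own):

```python
def chs_to_lba(c, h, s, n_heads, n_spc):
    """Formula [1]: LBA(c, h, s) = (c * Nhead + h) * NSPC + (s - 1)."""
    return (c * n_heads + h) * n_spc + (s - 1)

# last sector of a 1.44 MB HD floppy (80 cylinders, 2 heads, 18 sectors/track)
print(chs_to_lba(79, 1, 18, n_heads=2, n_spc=18))      # 2879

# the 2 GB disk example later in this article
print(chs_to_lba(100, 17, 17, n_heads=128, n_spc=63))  # 807487
```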

One advantage of LBA is that it allows disks to be addressed easily even with a complex sector layout like the one in the following image. But that's not the only advantage.

Source: https://venam.nixers.net/blog/unix/2017/11/05/unix-filesystem.html

Int 13h is an interrupt service routine (ISR) provided by the BIOS for disk access. This ISR was heavily used in the 80s but gradually fell out of favor because of constantly growing disk sizes. It has many functions, but I will focus on 02 and 03: function 02 is used for reading from disk to memory and function 03 for writing to disk. Before calling this interrupt, the input values must be as follows:

AH = 02h/03h (read/write)
AL = Number of sectors to read/write ( >0 )
CH = Low 8-bit of cylinder number
CL = Sector number 1-63 (bit 0 .. 5) + High 2-bit of cylinder number (bit 6 .. 7)
DH = Head number
DL = Drive number: bit 7 is set for hard disks.
Example: First floppy drive: 00h, second floppy drive: 01h
First hard drive: 80h, second hard drive: 81h
ES:BX: Pointer to data to be read/written

 
I created an example for the usage of Int 13h. Even though Windows XP contains debug.exe, it does not allow such low-level operations, so I installed DOS 6.22 in a virtual machine (downloaded from my university's server). I will explain how it is set up in the next article. For now, I just added screenshots:

Reading the disk using debug.exe

In the example above, I read the first sector of the hard disk. I will explain the output in the upcoming article. There are two screenshots above: one for the beginning and one for the end of the sector.

If I did not want to use Int 13h, I could also do this operation by accessing the IDE controller directly using hardware ports. I will give an example for this a few paragraphs later.

Let's do a calculation. There are 3 bytes reserved for CHS addressing in total: the registers CH, CL and DH. In terms of bits: 10 + 6 + 8 = 24 bits. Since a sector's capacity is 512 bytes, 2^24 * 512 bytes = 8 GB is addressable using the CHS scheme. Moreover, since sector numbers start at 1 instead of zero, the capacity is actually slightly less than 8 GB: 2^18 * 63 * 512 bytes = 7.875 GB. This means CHS is insufficient to address drives larger than 8 GB. The LBA scheme has a further advantage in addressing over CHS, but before digging into that, we need to take a look at the ATA standards first.
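The same calculation in a couple of lines (a sketch of mine):

```python
SECTOR = 512

# Int 13h CHS: 10 cylinder bits + 8 head bits + 6 sector bits = 24 bits
theoretical = 2**24 * SECTOR           # if all 24 bits were usable
actual = 2**10 * 2**8 * 63 * SECTOR    # the sector field only counts 1..63

print(theoretical / 2**30)  # 8.0
print(actual / 2**30)       # 7.875
```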

 
(Parallel) ATA Standard
In the previous section, I tried to give an overview of disk access from the BIOS (Int 13h) side. From that side, things may look a bit complicated. As if that were not enough, things are complicated on the hardware side, too. In 1986, when Western Digital announced the first IDE / ATA standard, 22 bits were reserved to address the hard disk: 10 bits for cylinders, 4 bits for heads and 8 bits for sectors, if I am not mistaken. In 1994, in the new EIDE / ATA-2 standard, the cylinder field grew from 10 to 16 bits, raising the total number of address bits from 22 to 28. Moreover, 28-bit LBA addresses were supported besides CHS. However, IBM had already designed the BIOS interrupts and other standards (like the MBR), and according to these standards, 10, 8 and 6 bits were reserved for cylinders, heads and sectors respectively. If we take the smallest common number of bits across the standards, we see that CHS could support disks of at most 512 MB, with 10 + 4 + 6 = 20 bits and 512 bytes per sector. In fact, even slightly less than that, because sector numbers start at 1 instead of 0.*

Changing the standards was nearly impossible due to backward compatibility. Fortunately, the foundations of LBA were being laid around those years, and a piece of "nonsense" in IBM's standard provided a workaround for this problem: IBM had allocated 8 bits to address disk heads without considering how to fit 128 platters (in other words, 256 IO heads) into a 1-inch-high hard disk with a 3.5-inch form factor. Because this was physically impossible, more heads than actually exist in the hard disk were reported to the OS, so that the cylinder count of disks larger than 512 MB remained addressable with 10 bits. In other words, the product of the physical numbers of cylinders and heads was equal to the product of the numbers of cylinders and heads reported to the OS. In the Int 13h code, these numbers were put into formula [1] and converted to LBA (if LBA was enabled in the BIOS, of course).**

*: Detailed info on this paragraph:
https://en.wikipedia.org/wiki/Logical_block_addressing#Enhanced_BIOS
https://en.wikipedia.org/wiki/Parallel_ATA

**: Detailed info on this paragraph:
https://en.wikipedia.org/wiki/Logical_block_addressing#LBA-assisted_translation


Example:
Suppose a 2 GB hard disk physically has 8 heads and 8320 cylinders, and each cylinder has 63 sectors. When the OS gets the disk parameters using Int 13h, the return value will be 128 heads, 520 cylinders and 63 sectors; the product has not changed. If the same OS had tried to access this disk directly via the IO ports, it would not have been able to write a head count greater than 16 to the IO port.
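A hypothetical sketch of this LBA-assisted translation (my own model of the idea, not any BIOS's actual code): halve the cylinder count and double the head count until the cylinders fit into 10 bits, keeping the cylinder-head product constant.

```python
def translate_geometry(cyls, heads, spt):
    """Report a fake geometry with at most 1024 cylinders, keeping
    cylinders * heads (and thus the total capacity) unchanged."""
    while cyls > 1024:
        cyls //= 2
        heads *= 2
    return cyls, heads, spt

print(translate_geometry(8320, 8, 63))  # (520, 128, 63)
```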

Now assume that the CHS address (100, 17, 17) on this disk needs to be accessed. Substituting the values into formula [1]:

LBA = (100 * 128 + 17) * 63 + (17 - 1) = 807 487. This is the sector to be accessed.

In the 1990s, the ATA-2 standard could support hard drives up to 128 GB using the extra cylinder bits; however, IBM's existing programming interface could only support disks up to 8 GB, even with LBA conversion. By the way, IBM reserved 32 bits for LBA in the MBR, which I will mention in an upcoming article. In the mid-1990s, IBM and Microsoft extended the Int 13h capabilities with functions such as 42h and 43h. On the other hand, since the LBA field in the MBR is 32 bits, the ATA standard also had room to progress. In 2003, with the ATA-6 standard, physical addressing of hard disks became 48 bits wide, and at the same time CHS addressing became history.


How Does BIOS Access the Disk? The Disk Controller
Documentation about the IDE/ATA disk controller ports can be found in the ports.b file in section D of the famous Ralf Brown Interrupt List. In the past, motherboards had two EIDE disk controllers. Two disks could be connected to each of these controllers, as master and slave. The controllers' base addresses were 01F0h and 0170h respectively. The ports are as follows:


Port | IO | Description
-----+----+----------------------------------------------------
01F0 | RW | Data register
01F1 | R- | Error register++
01F1 | -W | (Write Precompensation Cylinder divided by 4)+
01F2 | RW | Sector count
01F3 | RW | Sector number (CHS mode)
     |    | Address bits 7-0 (LBA mode)
01F4 | RW | Low byte of cylinder number (CHS mode)
     |    | Address bits 15-8 (LBA mode)
01F5 | RW | High byte of cylinder number (CHS mode)
     |    | Address bits 23-16 (LBA mode)
01F6 | RW | Drive and head number
     |    | Address bits 27-24 (LBA mode)
01F7 | R- | Status register++
01F7 | -W | Command register++


+: This value is unused on newer disks and can be zero. I will explain it later.
++: Please see the Ralf Brown Interrupt List for the meaning of the bits in these registers.

01F6h is a little more complicated than it looks. Bits 5 and 7 must always be set for backward compatibility (these bits were used with MFM disks whose sectors were not 512 bytes in size). If bit 6 is zero, the access is in CHS mode; if one, in LBA mode. If bit 4 is zero, the operation is performed on the master disk; if one, on the slave. Bits 3 to 0 are the head number (or, in LBA mode, address bits). The information so far is enough to write the code, but we haven't set up a place to run it yet. The reader can set up a DOS or FreeDOS VM and try it out.

I wrote offsets at the beginning of each line so that it is easy to read, and added comments on the right. The code must be entered in debug.exe with the a (Assemble) command:

0100    mov ax,0001
0103    mov dx,01F2
0106    out dx,al    ; Sector count = 1
0107    inc dx       ; dx = 01F3
0108    out dx,al    ; Sector number = 1
0109    inc dx       ; dx = 01F4
010A    dec ax       ; ax = 0000
010B    out dx,al    ; Cylinder Lo = 0
010C    inc dx       ; dx = 01F5
010D    out dx,al    ; Cylinder Hi = 0
010E    inc dx       ; dx = 01F6
010F    mov al,A0    ; CHS mode, master disk
0111    out dx,al
0112    inc dx       ; dx = 1F7
0113    mov al,20
0115    out dx,al    ; 20h ATA read sector command
0116    in al,dx     ; Read status register
0117    test al,58   ; 0101 1000: Drive ready | Seek complete | Buffer ready
0119    jz 0116      ; Wait until disk is completely read
011B    mov dx,01F0
011E    mov bx,0200
0121    in ax,dx     ; Read data (word sized)
0122    mov [bx],ax  ; Copy to the buffer
0124    inc bx
0125    inc bx
0126    cmp bx,0400  ; stop after 0200h bytes
012A    jnz 0121
012C    int 3        ; Breakpoint


To run the code, I give the g (Go) command, and when I check memory with the d (Dump) command, I see that the sector was read without problems (same as the previous Int 13h output):
 

This code can only run in a debugger. After replacing the "int 3" in the last line with "int 20", it can be saved as a file using the following commands:

rbx
0000
rcx
002E
nreadmbr.com
w

Of course, it should be noted that the code does not produce any output on the screen. To load the file again into debug.exe, the filename is either given as a command line parameter to debug.exe, or it is entered with the n command and then loaded with the l command. I am leaving the explanation of the LBA code to the next article.


More Theory
Write Precompensation Cylinder (WPC): At the beginning of the article, I mentioned that on non-ZBR CHS disks, the sectors on the outer tracks occupy more physical space. The manufacturer designates a cylinder on these disks, and the disk changes its read/write encoding on the cylinders beyond this WPC. I honestly don't know whether BIOS reports this cylinder to the programmer or the programmer has to find it; I never had such an old disk to test.

In very old disks, the heads were driven by a stepper motor. Since these motors are very sensitive to heat, a technology called the "voice coil motor" has been used instead. The video below will give an idea of how it works:

These motors drag the head over the platters in proportion to the applied current, and when the current is cut off, the heads are automatically pulled to an empty area by springs.

Landing Zone (LZ): Before turning off a computer with a stepper motor disk, the disk heads had to be moved to a special cylinder which doesn't contain data, called the LZ; in other words, "parked". If a hard disk was moved without parking the heads, they could drift over the platters and scratch the disk. Function 19h of Int 13h does this parking. Parking is obsolete today, and so is the LZ.
 
Voice coil motors were robust against heat, but they had less spatial sensitivity (to position the heads accurately), and manufacturers developed a solution for that. For example, they placed certain markings on one side of a three-platter (6 heads) disk to increase positioning accuracy. This side was not used to store data. The top head read the position markings, determined its accurate position according to them and corrected its position using feedback if necessary. Thus, interestingly, hard disks with odd numbers of heads appeared.

Source: https://www.brainbell.com/tutors/A+/Hardware/_Actuator_Arms.htm

Now I am done with the hardware part and in the next article I will write about MBR. 

Sunday, March 10, 2019

Streaming SIMD Extensions (SSE) and glibc


Hi there. This article will be like the "behind the scenes" of the previous floating point article. I had a strange problem while writing the article on floating point numbers. While compiling the code from the Oracle Blog, I realized from the assembly output that gcc compiles floating point operations with the SSE instruction set by default. (Platforms: Linux Mint 17.3, CentOS 6.x and 7.x)

...
40054d:  f2 0f 10 45 e8        movsd xmm0,QWORD PTR [rbp-0x18]
400552:  bf 18 06 40 00        mov edi,0x400618
400557:  b8 01 00 00 00        mov eax,0x1
40055c:  e8 af fe ff ff        call 400410 <printf@plt>
400561:  f2 0f 10 45 f8        movsd xmm0,QWORD PTR [rbp-0x8]
400566:  f2 0f 10 0d b2 00 00  movsd xmm1,QWORD PTR [rip+0xb2]
                               # 400620 <_IO_stdin_used+0x10>
40056d:  00
40056e:  f2 0f 5e c1           divsd xmm0,xmm1

The above code snippet contains the printf call and the division instruction.

When I first tried the "-mno-sse" compiler parameter, to force this code to run on the FPU, gcc generated the following code:

...
40054d:  bf 08 06 40 00      mov    edi,0x400608
400552:  b8 00 00 00 00      mov    eax,0x0
400557:  e8 b4 fe ff ff      call    400410 <printf@plt>
40055c:  dd 45 f8            fld QWORD PTR [rbp-0x8]
40055f:  dd 05 ab 00 00 00   fld QWORD PTR [rip+0xab]
                             # 400610 <_IO_stdin_used+0x10>
400565:  de f9               fdivrp st(1),st

Interestingly, the same code (compiled with different parameters, of course) prints nothing but 4.940656e-324 as its output. By the way, to make the output shorter, I replaced the statement double d = 1.0; with double d = pow(2.0, -1000);. The code then produces 75 lines of output instead of 1075.

The number of output lines does not change with or without -mno-sse, and the program terminates. This suggests that the problem is not in the calculation, but in how the result is printed on the screen. When the two code snippets are compared: in the first one, the result is written to xmm0, edi holds the pointer to "%e\n" and eax holds the number of vector-register arguments passed to printf. In the assembly output of the second version, the floating point value is not written to xmm0 because of the -mno-sse parameter; edi has the same value and, since no other argument is available, eax is zero (no extra arguments). The C equivalent of this code is effectively printf("%e\n"); and, since there are no extra arguments, a garbage value left in xmm0 from a previous calculation is written to the screen. By the way, when I check the result of the division with gdb, the divsd instruction works with correct values. That also proves that the error is in the printf call.

I copied the code snippets above from Linux Mint (glibc v2.19), but there was no difference in CentOS either. I made the experiments in CentOS virtual machines. I think there is no such problem with 32-bit code, since floating point arguments are not passed in xmm registers there (maybe they are not used at all?). But SSE instructions can be used in 32-bit code as well: https://stackoverflow.com/questions/24386541/gcc-wont-use-sse-in-32bit

CentOS 6.x uses glibc v2.12. When I tried CentOS 5.6 and 5.9 (glibc v2.5), the same thing happened again. CentOS 3.3 is the first 64-bit CentOS ever and uses glibc v2.3.2. When I tried to compile the same code on CentOS 3.3, -mno-sse had no effect at all, supposedly due to a bug in that gcc version (v3.2.3). In the assembly output, the result was stored in xmm0 prior to printf.

It looks like SSE instructions were used in the printf function during the development of glibc, and floating point arguments must enter printf in xmm0 only. Therefore, the -mno-sse parameter should never be used, because SSE code lives even inside glibc itself. Instead, using the -mfpmath=387 parameter to force the floating point operations onto the FPU will do the job perfectly.
 

On the other hand, I also mentioned that gcc has no -fns parameter, in contrast to the Oracle Blog article. There is a -ffast-math parameter in gcc which does the same thing (or maybe even more). This parameter consists of six different flags: -fno-math-errno, -funsafe-math-optimizations, -ffinite-math-only, -fno-rounding-math, -fno-signaling-nans and -fcx-limited-range (ref: man gcc). The -funsafe-math-optimizations flag is the one that primarily affects subnormal numbers, and code compiled with it contains the following function:

0000000000400440 <set_fast_math>:
  400440:  0f ae 5c 24 fc        stmxcsr DWORD PTR [rsp-0x4]
  400445:  81 4c 24 fc 40 80 00  or DWORD PTR [rsp-0x4],0x8040
  40044c:  00
  40044d:  0f ae 54 24 fc        ldmxcsr DWORD PTR [rsp-0x4]
  400452:  c3                    ret

MXCSR is a register* similar to the FPU Control Word (FCW), and it is used to control the execution of SSE instructions. The 15th and 6th bits of this register, namely "Flush to Zero (FZ)" and "Denormals are Zero (DAZ)" respectively, are set by the function above (0x8040 = bits 15 and 6). When the code is compiled with this parameter and run, the last line of the output of that Oracle code is 2.225074e-308, the smallest normalized double precision value. Since MXCSR is an SSE register, the -ffast-math parameter has no effect on the execution of 80387 instructions.

The 80387 FPU does not have a similar mechanism for subnormal numbers. In the previous article, I mentioned that the FCW has a DE (denormal exception mask) bit for subnormals. Using this, operations on subnormal values can be controlled. I wrote the following function to access the FCW:

void setfst()   {
    short FPUCW;
    asm("fstcw %0": "=m" (FPUCW));  /* store the FPU control word */
    FPUCW = FPUCW & ~0x02;          /* clear bit 1: unmask the denormal exception */
    asm("fldcw %0":: "m" (FPUCW));  /* load the modified control word */
}

Or another option:

void setfst2()   {
    short FPUCW;
    asm("fstcw %0\n\t"
        "and  %0,0xFFFD\n\t"   /* clear bit 1: unmask the denormal exception */
        "fldcw %0"             /* load the modified control word back */
        : "+m" (FPUCW));
}

This function must be called just before the while loop, and the code must be compiled with the "-mfpmath=387 -masm=intel" parameters. When the code runs, it prints 2.225074e-308 and ends with "Floating point exception". This means that manually unmasking the DE exception causes an exception whenever the result of an instruction is subnormal. If an exception handler is written for this, and if the result of a subnormal operation (which caused the exception) is rounded to zero in this handler, the function of the DAZ flag can be adapted to the FPU in that way.

Friday, January 18, 2019

Floating Point Numbers and Round-Off Errors #2

 
Hi there. In the previous article, I wrote about how floating point numbers are stored in the computer and mentioned that the constraints of computer architecture cause interesting behaviors. I will give further examples of these interesting behaviors.
 
The first interesting behavior is: a programmer has to test whether a floating point result, which is (expectedly) close to zero, is within an acceptable distance of zero, rather than comparing the result with zero. I will give an example of this further down the article, marked with an asterisk (*). First, I need to clarify some intermediate points.

The examples in the previous article were mostly with single precision numbers. This does not mean that double precision numbers do not have such problems. Since the coefficient is represented by more bits, the error in the calculation (the "1-0.1-0.1-..." operation from the previous article) occurred at the seventeenth decimal, whereas with single precision numbers it was at the ninth. The result in Excel (the first example in the previous article) had an error at the sixteenth decimal place. The error in the numerical representation of "0.1" is at the seventeenth decimal; since we subtracted it ten times, the error is also multiplied by ten. I have chosen examples with single precision numbers for only one reason: to show the calculations and their errors easily.

These errors do not only occur in operations resulting close to zero. Since the coefficient bits are limited, there are errors in the representation of big numbers as well. The fact that the coefficient field has only 23 bits also causes problems with numbers that can only be expressed with more than 23 binary fractions. Let's make an experiment in Hexpert. I opened a file, moved to offset 00h and entered 16 777 215 into the FP32 box. Then I moved to offset 04h and entered 16 777 216, and so on, entering all numbers up to 16 777 220.


As can be seen in the next image, all the coefficient bits are filled, because 16 777 215 can be written as a sum of 23 binary fractions (the proof is above). Its successor is expressed as 1 * 2^24. But the exponent has increased so much that there is no place left to write the last "1" of 2^24 + 1. Note that, even though the number at offset 08h is the successor of the number at 04h, they are actually represented by the same floating point value, just as the value at offset 0Ch is the same as at offset 10h. While storing numbers between 2^24 and 2^25, odd numbers cannot be represented. Similarly, numbers between 2^25 and 2^26 that are not multiples of four cannot be represented either. If small increments in those large numbers are considered insignificant, there is no problem; but depending on the problem itself, 2^24 may not be a very large number at all. By the way, using a 32-bit integer, I can still express 2^24 and many of its successors with absolute precision.

In short, since the bit fields in floating point numbers are finite, numbers must be truncated at some point. Examples of these errors are numerous. For instance, irrational numbers like π or e cannot be expressed as a sum of finitely many fractions (in other words, cannot be written with finitely many decimals), and therefore can never be fully expressed with floating point numbers. Moreover, it is impossible to represent an arbitrary real number from the set of real numbers, which is uncountably infinite, using a finite number of bits, at least in theory. In practice, we can still work with these numbers, and the only reason is that in engineering it is fine when the final result is "sufficiently" accurate after rounding. For example, 4218.14 kg can be rounded to 4200 kg or even 4000 kg. These kinds of errors, caused by truncated bits, are called truncation or round-off errors. Although these two errors are not the same in numerical analysis, the terms can be used interchangeably here, because the CPU's behavior is to round off these numbers. ("Occasionally, round-off error (the consequence of using finite precision floating point numbers on computers) is also called truncation error, especially if the number is rounded by truncation." -- ref: Truncation error (accessed on 05.09.2020))

Let's consider an example of comparing two floating point numbers when they are very close to zero*. We know that sin(30°) = sin(π / 6) = 0.5, thus sin(π / 6) - 0.5 = 0. The statement printf("%e\n", sin( M_PI / 6.0 )); prints "5.000000e-01". As I wrote previously, pi itself actually contains an approximation. There must also be an approximation in the sine function because of the truncation of the MacLaurin series (well, the FPU mainly uses CORDIC, but I am trying to simplify). As a result, it can be guessed that the result "5.000000e-01" is not exactly 2^-1. And as expected, the statement printf("%e\n", sin( M_PI / 6.0 ) - 0.5); prints -5.551115e-17. When comparing the result, one should check that the difference is "close enough" to zero, not equal to zero. For example, I could check whether the result is in a sufficiently small neighborhood of zero, like the machine epsilon, i.e.

if(fabs(sin(M_PI / 6.0) - 0.5) < DBL_EPSILON)

is a better comparison. I wrote above that the machine epsilon is only meaningful in the neighborhood of one; therefore this approach is not accurate enough either. Depending on the sensitivity of the operation, the machine epsilon could be large. The machine epsilon of double precision numbers is about 10^-16, whereas the smallest possible double precision number is 2.225 * 10^-308 (this number is actually 4.94 * 10^-324; I will explain this later, in the paragraph with double asterisks **). Choosing the correct "small enough" value is a difficult problem, as this value can differ along the number line. For example, 3 is a "small enough" number between 2^24 and 2^25, but it is not "small enough" between 2^25 and 2^26.

If the exponent is 00h or FFh, it carries a special meaning. If the exponent is FFh and the coefficient is zero, the value is positive or negative infinity, according to the sign bit. These infinities can come out of dividing a very large dividend by a small divisor, when the quotient doesn't fit into the format. If the exponent is FFh but the coefficient is non-zero, the value is called NaN, "Not a Number". This value comes out as the result of any indeterminate form in math, or if the operation to be performed in the FPU is undefined (like the square root of a negative number). All operations on NaNs also result in NaNs.

The second special case is that the exponent is zero. If the coefficient is zero as well (i.e. 00 00 00 00), the value is obviously zero. However, the case of exponent zero is more complicated than that. I previously mentioned that 1 is added to each coefficient during translation. If the coefficient were always greater than or equal to one, zero could never be obtained. Therefore, exponent zero is an exception: the value 127 is not subtracted from the exponent in this case. Instead, the exponent's value is assumed to be -126 and no extra 1 is added to the coefficient; in other words, d0 = 0 (please refer to the previous article for the definition of d0). When subtracting two numbers whose exponents are -126, the difference can be far smaller than 2^-126. For example:

1.5 * 2^-126 - 1.375 * 2^-126 = 0.125 * 2^-126 = 1 * 2^-129

To prevent an underflow, and in order to be able to represent the difference in the equation above, a class of "very small" numbers is needed in the FPU. Therefore, the range (0, 2^-126), with a step of 2^-149, has been added to the floating point standard. These numbers are called subnormal numbers (below the normalized numbers). When working with subnormal numbers, the FPU sets the DE flag. This means that there is an ongoing operation with DEnormal (subnormal) numbers and the least significant decimals of the result might be lost; in other words, the FPU cannot guarantee the accuracy of the result of subtractions and divisions on subnormal numbers. In the sentence above, which I marked with double asterisks (**), I noted two different smallest possible values of double precision numbers. The larger one is the smallest of the normal numbers and the other is the smallest of the subnormal numbers. (Subnormal numbers can be thought of as being intended to store results rather than to operate on.)

Example code:
I wrote following code to demonstrate operations on subnormal numbers:

#include<stdio.h>
#include<stdint.h>
union uSayi {
    unsigned char bayt[8];
    uint64_t q;
    double s;
};

int main()      {
    union uSayi d1, d2, d3;
    double t1, t2, t3;

    d1.q = 0x0018000000000000;
    d2.q = 0x0016000000000000;
    d3.q = 0x0014000000000000;
    //printf("%e %e %e\n", d1.s, d2.s, d3.s);

    // 1.5 * 2^-1022 - 1.25 * 2^-1022 = 0.25 * 2^-1022
    t1 = d1.s - d3.s;
    // 1.5 * 2^-1022 - 1.375 * 2^-1022 = 0.125 * 2^-1022
    t2 = d1.s - d2.s;
    // 0.25 * 2^-1022 - 0.125 * 2^-1022 = 0.125 * 2^-1022
    t3 = t1 - t2;

    return 0;
}

I defined a union so I can access the variable both as a double and as a 64-bit unsigned integer. Then I defined three variables d1, d2 and d3 of type union uSayi, assigned small values near the normalized floating point limit and subtracted them from each other, storing the intermediate results in the variables t1, t2 and t3. I compiled it as follows, since I will debug it in gdb:

gcc -mfpmath=387 -g kod2.c
gdb -tui a.out

gcc handles floating point numbers with SSE instructions by default. I forced gcc to generate 80387 code because I want to watch the values in the FPU.

I loaded the program with gdb's TUI. And since I prefer intel syntax in assembly, I gave following two commands:

(gdb) set disassembly-flavor intel
(gdb) layout asm


In the above screenshot, the assignments to d1, d2 and d3 can be clearly seen starting from offset 0x400574. The first subtraction is at 0x4005a4, the second at 0x4005af and the last one at 0x4005b7, using the fsub instruction. I put a breakpoint on the first subtraction with the break *main+71 command and ran the code with the run command. When the execution stopped, I checked the FPU stack using info float. The numbers I entered manually are ready to be processed in the FPU:


I executed the subtraction using the nexti command and checked the result in the FPU stack using info float. The next subtraction runs the same way, so I proceeded to the final subtraction using until *main+90. When I give the info float command at this step, I see that the FPU Status Word has changed from 0x3800 to 0x3802, which means the DE (DEnormalized) flag is set. I executed the last subtraction with a final nexti command; the result was +2.781342323134001729e-309. Then I completed the program with the cont command. In documents and on Wikipedia, it is stated that instructions run slower with denormalized values than with normal ones. Hence it is recommended to turn off operations on denormalized values in applications where speed is more important than accuracy.

At https://blogs.oracle.com/d/subnormal-numbers, it is mentioned that -fns compiler parameter allows the subnormal results to be rounded to zero during the execution however I could not find such a parameter in gcc.

Saturday, January 5, 2019

Floating Point Numbers and Round-Off Errors

 
Hi there. I had been thinking of writing about this topic for a long time, but I couldn't find enough time. Since it's a broad topic, it took time to tidy it up in my mind and start typing. For the very same reason, I split the article into two parts. In these parts, I will try to explain how floating point numbers are stored in computers, using single precision numbers (the simplest kind), and what kinds of errors can happen with this representation. First, I have added two interesting screen captures. The first is from Excel 2007, but the result can be reproduced in more recent versions:


The results of the formulas are in column A. I rewrote the formulas from column A as text in column B; i.e. the cell A1 contains =1-0.2-0.2-0.2-0.2-0.2 and the cell A2 contains =1-0.1-0.1-...-0.1 (ten times 0.1), etc. It can be seen that the results, which should be zero, deviate from zero; the results in A2 and A3 are interesting. Why did this happen? And the second interesting thing: why did this not happen in cell A4 or A5? The same results can be obtained in Octave/Matlab.



I will give a hint about this behavior: Octave's default data type is the double precision number. Although 1 is an integer, it is treated as a double (see the next screen capture). The final error increases when I specify the numbers as "single(0.2)" instead of just "0.2" (converting them from double to single precision) and subtract them from 1 consecutively. For example, for 0.2, the magnitude of the error rose from 10^-17 to 10^-8; however, the error is still zero for "single(0.0625)".

Scientific Notation of Numbers
Before digging deep into the reasons for the error, we should return to our school years and remember what normalized numbers and scientific notation are. To write a very big number, or a number with a lot of decimals, the meaningful digits of the number are written down first; then a factor such as 10^n is added next to them. For numbers close to zero, n is negative, and for huge numbers, n is positive.

For example:
6 720 000 000 = 6.72 * 10^9
0.000 000 007 51 = 7.51 * 10^-9

Assuming that m is a real number and n is an integer, scientific notation in the decimal number system can be generalized as m * 10^n. Let's name m the coefficient and n the exponent. I initially chose m as a real number, but I need to put some further constraints on it. Let's assume that the di are significant figures from {0, 1, ..., 9} and d0 is not equal to zero. Thus, m can be written as follows:


Assume that d0 were zero, for example: 0.672 * 10^10 = 6.72 * 10^9
Assume that the integer part were 10 or more: 67.2 * 10^8 = 6.72 * 10^9

Hence, under my assumptions, each number can be written uniquely. I especially underlined "decimal number system", because if I generalize it to any number system, the above expression becomes:

 
For the binary number system, b = 2, and considering the constraints, it can be seen that d0 can only be 1 in the binary system.

Representation of Decimals in Binary Number System
This topic may be outside high school math, but the logic behind it is simple. The positional values in the decimal system are:


When generalized to the binary number system, the positional values go like 2^0, 2^1, 2^2, ..., and the positional values after the point continue as 2^-1, 2^-2, 2^-3, ..., similarly.


The number above is 2^3 + 2^1 + 2^0 + 2^-2 + 2^-3 = 11.375. In short, binary numbers can be uniquely written using scientific notation. I previously wrote that the integer part of the coefficient will always be 1. Integers can be denoted as follows:

1 = 1 * 2^0
2 = 1 * 2^1
3 = 1.5 * 2^1
4 = 1 * 2^2
5 = 1.25 * 2^2
6 = 1.5 * 2^2
7 = 1.75 * 2^2
8 = 1 * 2^3
...

To interpret stored floating point numbers, I will use a hex editor named Hexpert. I have been using this editor ever since I met computers. It is my favorite editor, but unfortunately it is no longer in development. HxD would be my second favorite, and it is actively developed. I just noticed that a newer version of HxD was released; in this latest version the file editing capabilities are much better, so it can be used as well. I have taken the screenshots from Hexpert throughout this article. Below is a screenshot of Hexpert showing the content of a small file. For now, it is enough to create a text file with spaces in it and open it in Hexpert.

The bytes of the file are shown in the middle field. The boxes below contain the 8-, 16- and 32-bit values of the bytes under the cursor. Boxes starting with 'S' contain signed and those starting with 'U' contain unsigned interpretations of the same byte(s). The first bit of a signed number is the sign bit: it denotes a negative number if one and a positive number if zero. This bit is counted into the number if the byte is interpreted as unsigned. The boxes also allow the user to enter values manually. For example, the 8-bit value (00h) and the 16-bit value (00h 00h) at offset 30h are zero. Since this representation is Little Endian, when the four bytes (the 32-bit value) under the cursor are read in reverse order (i.e. 3Fh 80h 00h 00h), they yield the value 63 * 256^3 + 128 * 256^2 + 0 * 256 + 0 = 1 065 353 216 (please compare this with the values in the S32 and U32 boxes). What I am really interested in is the value in the FP32 box. Since floating point numbers are always handled by the computer as signed numbers, there are obviously no UFP32 or UFP64 boxes. FP32 means 32-bit floating point number (single precision) and FP64 means 64-bit floating point number, or double precision. I will not go into this right now; the only thing I can say for now is that offset 40h contains the value 1.0 in FP64 format.
 
This representation is determined by the IEEE-754 standard. The 1985 standard mentions the terms 'single' and 'double', while the 2008 revision refers to them as binary32 and binary64 respectively. According to this standard, the first bit of a single precision number is the sign bit, the next 8 bits [bits 30-23] are the exponent and the remaining 23 bits [bits 22-0] are the coefficient. Let's decompose the value 1.0 accordingly:

3F 80 00 00 = 0|011 1111 1|000 0000 0000 0000 0000 0000

[ You can switch to binary view with Alt + B keys. ]

Since the integer part is always one, there is no need to store it within the bits of the floating point value; it is added automatically in each calculation. Therefore, the coefficient here is actually 0 + 1 = 1. The exponent is 0111 1111b = 127, and the constant 127 is subtracted from this value according to the standard, so the real exponent becomes 127 - 127 = 0. The number is positive since the sign bit is 0. Calculating 1 * 2^(127-127) = 1 gives the stored value.

Two is written as "1 * 2^1". Subtracting 1 from the coefficient gives 1 - 1 = 0, and adding 127 to the exponent gives 1 + 127 = 128. Thus, 2 is expressed as: 0|100 0000 0|000 0000 0000 0000 0000 0000 = (40 00 00 00)h. Of course, these bytes must be written in reverse order when stored, since the format is Little Endian.

It should be noted that the coefficient bits actually represent binary fractions. For example, let's convert 200 to floating point. 200 = 128 + 64 + 8 = 2^7 + 2^6 + 2^3. The biggest exponent must be chosen as the exponent of the floating point value, which is 7. Therefore, 200 = 2^7 * (1 + 2^-1 + 2^-4) = 1.1001b * 2^7 = 1.5625 * 2^7. If I remove the leading "1" from the coefficient, the rest is ".1001". The floating point exponent is 7 + 127 = 134 = 1000 0110b. Putting it all together, the final result is: 0|100 0011 0|100 1000 0000 0000 0000 0000 = (43 48 00 00)h.

Now, using this information, I can also express numbers with decimals:

Example: 0.25 = 1 * 2^-2. The exponent is -2 + 127 = 125: 0|011 1110 1|000 0000 0000 0000 0000 0000

Let's remember repeating decimals. When we try to express divisions as decimal numbers instead of fractions, and the dividend cannot be divided by the divisor without a remainder, the decimals start to repeat somewhere after the period (not necessarily right after it). For example: 7 / 90 = 0.0777... Behind the computer representation of any floating point number, there are continuous divisions, expressing the coefficient as a sum of binary fractions. So, can such repeating digits occur in floating point numbers? Let's take 0.1 as an example. Its single precision floating point representation is: 0|011 1101 1|100 1100 1100 1100 1100 1101. The exponent is 123 - 127 = -4. If we separate the digits of the coefficient, multiply them by their positional values and sum them all, it looks like this:


If the lowest term, 2^-23 (on the last line), did not exist in this sum, the total would be 0.599 999 904 632 568 359 375. In other words, the value 0.6 exactly (actually 1.6) can never be obtained. Adding 1 to the full sum and multiplying the result by 2^-4 yields 0.100 000 001 490 116 119 384 765 625. But when Hexpert, Excel, Matlab or Octave shows only the first 6-7 digits of this number, we see 0.1 (I ignore, for now, that Matlab and Octave handle numbers as double precision by default). This small difference accumulates on each -0.1 operation (subtraction), and the accumulated error becomes really significant when the result is in the neighborhood of zero.

Please note that this should not be seen as a deficiency of the binary number system. If computers used the decimal system and humans used the duodecimal (base 12) system, the same situation would happen again. It is caused by the conversion between number systems itself, regardless of which systems are involved.

Non-zero digits appear from the ninth decimal place on in the number above. But when I subtract that number from 1 ten times, the actual operation is "1 - 1.000 000 014 901 161 193 847 656 25". The error accumulates and spills over into the eighth decimal place. I previously wrote that when the same process is repeated with single-precision numbers in Octave, the magnitude of the error is 10^-8.

>> X=single(0.1)
X =  0.10000
>> 1-X-X-X-X-X-X-X-X-X-X
ans =  -7.4506e-008
>>
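The same experiment can be reproduced outside Octave by forcing every intermediate result through single precision. The sketch below is my Python illustration: a `struct` pack/unpack round-trip rounds a Python double to the nearest binary32 value, which emulates what the FPU does in single-precision arithmetic:

```python
import struct

def single(x):
    """Round a Python double to the nearest IEEE 754 single-precision value."""
    return struct.unpack('<f', struct.pack('<f', x))[0]

x = single(0.1)
print(f'{x:.27f}')      # 0.100000001490116119384765625

r = single(1.0)
for _ in range(10):
    r = single(r - x)   # round every subtraction to single, like the FPU

print(r)                # -7.450580596923828e-08, matching Octave's -7.4506e-008
```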

Whether the last bit of the coefficient of a floating-point number is set or clear makes the smallest difference a computer can distinguish between two floating-point values. When this difference is taken between 1 and the first floating-point value greater than one, it is called the machine epsilon in scientific literature. In Matlab, the eps function returns this value:

>> eps
ans =   2.2204e-016
>> eps('single')
ans =   1.1921e-007
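Machine epsilon can also be found by repeated halving: keep shrinking eps until 1 + eps/2 is no longer distinguishable from 1. This is an illustrative Python sketch, with the single-precision case emulated through a pack/unpack round-trip:

```python
import struct

def single(x):
    """Round a Python double to the nearest IEEE 754 single-precision value."""
    return struct.unpack('<f', struct.pack('<f', x))[0]

# Double precision: native Python floats.
eps = 1.0
while 1.0 + eps / 2 > 1.0:
    eps /= 2
print(eps)      # 2.220446049250313e-16 = 2**-52

# Single precision: force each intermediate through binary32.
eps32 = 1.0
while single(1.0 + eps32 / 2) > 1.0:
    eps32 /= 2
print(eps32)    # 1.1920928955078125e-07 = 2**-23
```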

These values are consistent with the examples on Wikipedia:

Single-precision floating-point format:
0|011 1111 1|000 0000 0000 0000 0000 0001b = (3F 80 00 01)16 = 1 + 2^-23 ≈ 1.0000001192 (the smallest number larger than one)

Double-precision floating-point format:
0|011 1111 1111|0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0001b = (3F F0 00 00 00 00 00 01)16 = 1 + 2^-52 ≈ 1.0000000000000002 (the smallest number larger than one)

Okay, why was there no error in the subtractions with 0.0625? Because 0.0625 is actually 0.0001b, this number can be represented in the CPU (or the FPU, to be more precise) exactly as it is. Therefore no error occurs.
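And indeed, because 0.0625 = 2^-4 is exactly representable, a sixteen-step version of the subtraction experiment leaves no residue at all (again my illustrative Python, with single precision emulated via a pack/unpack round-trip):

```python
import struct

def single(x):
    """Round a Python double to the nearest IEEE 754 single-precision value."""
    return struct.unpack('<f', struct.pack('<f', x))[0]

x = single(0.0625)
print(x == 0.0625)     # True: 0.0625 = 2**-4 = 0.0001b is stored exactly

r = single(1.0)
for _ in range(16):
    r = single(r - x)  # every intermediate is an exact multiple of 2**-4

print(r)               # exactly 0.0 -- no rounding error accumulates
```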

In the second part of the article, I will examine the errors that occur in more detail and give further examples.