After a long time..!!

September 6, 2011

Well, I finished my thesis with a grade of 2.0 (German grading system). I would like to give special thanks to my guide, Dr. Stefan Lankes, for guiding me throughout the thesis. Along with my bachelor thesis, I also managed to pass all my exams held at RWTH, thus completing my B-Tech.

I came back from Germany on 30th July and was really busy completing the formalities at Amrita. It was a somewhat difficult and confusing situation, but all in all it's done now, and I got all my certificates and my degree. 🙂

My convocation was on 27th August. It was in pure traditional Kerala style with Kerala outfits. I was extremely happy to meet all my friends after exactly 1 year. It was kind of nostalgic to see you all, MY ITians. Good to see you guys after a really long time… 🙂

I have applied for my visa and I am expecting to fly back to Germany for my regular masters by mid-October. Oh, btw, my application for the MSc in Software Systems Engineering has been accepted by RWTH, so I will continue at RWTH for my masters. And and and… I will not be called an Erasmus student anymore… the Erasmus tag is removed, hehe 😀

For now I am going to visit my native place and am looking forward to enjoying time with all my cousins. Gonna enjoy my vacation till mid-October and then… GOD BLESS ME!!! 😀


Context Switch – Software vs Hardware Approach

A context switch (also known as a task switch or process switch) is the mechanism of storing and restoring the state of a CPU that is assigned to a particular process (or task). A context is the contents of the CPU’s registers and program counter at a given point in time. During a context switch, the kernel performs the following steps:

1. It suspends the currently executing process and stores the CPU’s state for that process somewhere in memory.

2. It retrieves the context of the next process from memory and loads the saved values into the CPU’s registers.

3. It then returns to the location indicated by the program counter in order to resume the process.

Multitasking is built on the basic concept of context switching. In a multitasking operating system, multiple processes execute on a single CPU seemingly simultaneously, without interfering with each other. The illusion of concurrency is achieved by context switches that occur within a fraction of a second.

The next task to run is chosen by the scheduler, based on a scheduling algorithm. Several scheduling algorithms are available, such as Round Robin, priority-based scheduling and priority-based Round Robin.
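As a rough illustration of the Round Robin idea, here is a minimal sketch in C. The `struct task`, the two task states and the function name `rr_pick_next` are all hypothetical names made up for this example, not taken from any particular kernel.

```c
#include <stddef.h>

/* Hypothetical task states for this sketch. */
enum task_state { TASK_READY, TASK_BLOCKED };

struct task {
    int id;
    enum task_state state;
};

/*
 * Round Robin selection: starting just after the current task, walk the
 * task table (wrapping around) and return the index of the first READY
 * task. If no task is ready, the current index is returned.
 * Assumes count > 0 and current < count.
 */
size_t rr_pick_next(const struct task *tasks, size_t count, size_t current)
{
    for (size_t step = 1; step <= count; step++) {
        size_t candidate = (current + step) % count;
        if (tasks[candidate].state == TASK_READY)
            return candidate;
    }
    return current; /* nothing runnable: keep the current task */
}
```

A real scheduler would call something like this from the timer interrupt and then context switch to the task it returns.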

A context switch can be performed using either a software mechanism or a hardware mechanism.

The hardware approach uses a special hardware feature of x86 processors called the Task State Segment (TSS). A TSS is a segment that holds the processor state associated with a particular task or process. Each task has its own execution state, such as the general purpose registers, segment registers, flags and task register. The TSS maintains this execution state for each task. A task switch can then be invoked explicitly using instructions such as CALL or JMP, and the CPU automatically loads the new state of the process from the TSS.
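To give an idea of how much state the TSS holds, here is a sketch of the 32-bit TSS layout as a C struct. The field names are my own choice; the layout (104 bytes, with 16-bit selectors stored in 32-bit slots) follows the Intel manuals as far as I know, so treat it as an illustration and check the manual before using it in a kernel.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Sketch of the 32-bit Task State Segment layout.
 * Selector fields (ss0, cs, ds, ...) are 16 bits wide in hardware; they
 * are declared as 32-bit slots here, with the upper half reserved.
 */
struct tss32 {
    uint32_t prev_task_link;        /* selector of the previous task      */
    uint32_t esp0, ss0;             /* stack for privilege level 0        */
    uint32_t esp1, ss1;             /* stack for privilege level 1        */
    uint32_t esp2, ss2;             /* stack for privilege level 2        */
    uint32_t cr3;                   /* page directory base                */
    uint32_t eip;                   /* instruction pointer                */
    uint32_t eflags;                /* flags register                     */
    uint32_t eax, ecx, edx, ebx;    /* general purpose registers          */
    uint32_t esp, ebp, esi, edi;
    uint32_t es, cs, ss, ds, fs, gs; /* segment selectors                 */
    uint32_t ldt_selector;          /* LDT segment selector               */
    uint16_t debug_trap;            /* bit 0: trap on task switch         */
    uint16_t iomap_base;            /* offset of the I/O permission map   */
};

/* The whole structure should come out at 104 bytes with no padding. */
_Static_assert(sizeof(struct tss32) == 104, "unexpected TSS size");
```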

The software approach saves and reloads only the state that needs to change. Its basic principle is to provide a function that saves the current stack pointer and loads the new stack pointer. When this function is called, the CALL instruction pushes the current instruction pointer (the return address) onto the old stack; when the function returns, after the stack pointer has been switched, the new instruction pointer is popped off the new stack. General purpose registers, flags, data segment registers and all other relevant registers must also be pushed onto the old stack and popped off the new stack.

Because the hardware approach saves almost all of the register state, it is slow compared to the software approach. With performance being a main goal of any operating system, software context switching is used in most modern operating systems. Moreover, the 64-bit architecture does not support hardware context switches and relies only on software context switching.
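A kernel does the software switch in assembly, but the same idea can be tried out in user space. The sketch below uses the POSIX `ucontext` functions (`getcontext`, `makecontext`, `swapcontext`) as a stand-in for the kernel's own switch routine; the names `task_entry`, `run_context_demo` and the `trace` array are made up for this example.

```c
#include <ucontext.h>

static ucontext_t main_ctx, task_ctx;
static char task_stack[64 * 1024];   /* private stack for the second task */

int trace[4];                        /* records the order of execution */
int trace_len;

/* Body of the second task: runs, yields to main, then runs again. */
static void task_entry(void)
{
    trace[trace_len++] = 2;
    swapcontext(&task_ctx, &main_ctx);   /* save task state, resume main */
    trace[trace_len++] = 4;
    /* returning here resumes uc_link, i.e. main_ctx */
}

/* Switches back and forth between main and the task; returns trace_len. */
int run_context_demo(void)
{
    trace_len = 0;

    getcontext(&task_ctx);                       /* initialise the context */
    task_ctx.uc_stack.ss_sp = task_stack;        /* give it its own stack  */
    task_ctx.uc_stack.ss_size = sizeof task_stack;
    task_ctx.uc_link = &main_ctx;                /* where to go on return  */
    makecontext(&task_ctx, task_entry, 0);

    trace[trace_len++] = 1;
    swapcontext(&main_ctx, &task_ctx);           /* save main, run task    */
    trace[trace_len++] = 3;
    swapcontext(&main_ctx, &task_ctx);           /* resume task once more  */
    return trace_len;
}
```

Each `swapcontext` call saves the registers and stack pointer of the running side and restores those of the other side, which is the same save/restore idea as the kernel routine, only without any privilege level or segment handling.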

Hope it helps.!! 🙂

Registers in 32 bit and 64 bit Architecture

April 30, 2011

After reading Bran’s kernel tutorial and understanding the design of a very basic kernel, it was time for me to begin my real work, i.e. the implementation of a 64-bit kernel. To begin with, I started reading about the different types of registers available in both the 32-bit and 64-bit architectures. Well, without wasting much time, let me go through them quickly.

There are the following four types of registers:

1. General Purpose Registers

2. Segment Registers

3. Program Status and Control Register

4. Instruction Pointer Register

1. General Purpose Registers

In 32-bit mode, the 8 general purpose registers are EAX, EBX, ECX, EDX, ESI, EDI, EBP and ESP.

In 64-bit mode, there are 16 general purpose registers, each 64 bits wide, and the default operand size is 32 bits. The registers RAX, RBX, RCX, RDX, RSI, RDI, RBP, RSP and R8-R15 are available.

All of them serve the following purposes:

Operands for logical and arithmetic operations, operands for address calculations, and memory pointers.

2. Segment Register

In 32-bit mode, there are 6 segment registers: CS, DS, SS, ES, FS and GS. Each holds a 16-bit segment selector, which points to a particular segment in memory. To access a segment, its segment selector has to be present in the appropriate segment register. Each segment register is associated with one of three types of storage: code, data or stack. The CS register points to the code segment, the DS, ES, FS and GS registers point to data segments, and the SS register points to the stack segment.

In 64-bit mode, CS, DS, ES and SS are treated as if each segment base is 0, regardless of the value of the associated segment descriptor base. This creates a flat address space for code, data and stack. FS and GS are exceptions: both can be used as additional base registers in linear address calculations.

3. Program Status and Control Register

The 32-bit EFLAGS register contains a group of status flags, a control flag and a group of system flags.

Status flags: bits 0, 2, 4, 6, 7 and 11 (CF, PF, AF, ZF, SF and OF) are the status flags; they indicate the results of arithmetic instructions.

Direction flag (DF): The direction flag (bit 10) controls the string instructions (e.g. MOVS, CMPS, SCAS, LODS and STOS). Setting the DF flag causes the string instructions to auto-decrement (processing strings from high addresses to low addresses), whereas clearing the DF flag causes them to auto-increment. The STD and CLD instructions set and clear the DF flag, respectively.

Apart from the above, there are a few system flags which control operating system operations and should not be modified by applications.

In 64-bit mode, EFLAGS is extended to 64 bits and is called RFLAGS. The upper 32 bits of RFLAGS are reserved and the lower 32 bits are the same as EFLAGS.
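The bit positions listed above can be written down as masks in C. This is just a small sketch for decoding a saved flags value; the macro and function names (`FLAG_CF`, `status_flags`, `string_step` and so on) are my own and not from any header.

```c
#include <stdint.h>

/* Status flags and the direction flag, by bit position. */
#define FLAG_CF (1u << 0)   /* carry            */
#define FLAG_PF (1u << 2)   /* parity           */
#define FLAG_AF (1u << 4)   /* auxiliary carry  */
#define FLAG_ZF (1u << 6)   /* zero             */
#define FLAG_SF (1u << 7)   /* sign             */
#define FLAG_DF (1u << 10)  /* direction        */
#define FLAG_OF (1u << 11)  /* overflow         */

#define STATUS_FLAGS_MASK \
    (FLAG_CF | FLAG_PF | FLAG_AF | FLAG_ZF | FLAG_SF | FLAG_OF)

/* Keep only the status flags of a saved EFLAGS/RFLAGS value. */
uint32_t status_flags(uint64_t rflags)
{
    return (uint32_t)rflags & STATUS_FLAGS_MASK;
}

/*
 * Pointer step a string instruction would apply for an operand of
 * `size` bytes: negative when DF is set, positive when DF is clear.
 */
int string_step(uint64_t rflags, int size)
{
    return (rflags & FLAG_DF) ? -size : size;
}
```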

4. Instruction Pointer Register

The instruction pointer register contains the offset in the current code segment of the next instruction to be executed. It cannot be accessed directly by software. It is controlled implicitly by control transfer instructions such as JMP, Jcc, CALL and RET.

In 32-bit mode, the instruction pointer is EIP, which is 32 bits long; in 64-bit mode, it is 64 bits long and is named RIP.

Reference : Intel and AMD Manuals.

Linker Scripts – Inner Concepts

March 11, 2011

Every link is controlled by a linker script, written in the linker command language. The main purpose of the linker script is to describe how the sections in the input files should be mapped into the output file and to control the memory layout of the output file. The linker always uses a linker script. If we do not specify one, it uses the default script that is compiled into the linker executable.

Basic Linker Script Concepts :

The linker combines input files into a single output file. Both the input files and the output file are object files; the output file is often an executable. Each object file contains a list of sections, each of which has its own name and size. A section may be marked as loadable, allocatable, or neither. A loadable section is one whose contents should be loaded into memory when the output file is run. An allocatable section has no contents; it means that a certain area in memory should be set aside, but nothing in particular should be loaded there. A section which is neither loadable nor allocatable typically contains some sort of debugging information.

Every loadable or allocatable section will have two addresses:

1. VMA – Virtual Memory Address. This is the address the section will have when the output file is run.

2. LMA – Load Memory Address. This is the address at which the section will be loaded.

In many cases the two addresses will be the same. An interesting example where the two addresses differ is when a data section is loaded into ROM and later copied into RAM when the program starts. Here the ROM address would be the LMA and the RAM address would be the VMA.
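The ROM-to-RAM case can be sketched in C as a tiny startup routine. In a real program the addresses would come from symbols defined in the linker script; here they are passed in through a hypothetical `struct image_layout` so the example can be run on a normal machine. Zeroing the allocatable (.bss-style) area is included because startup code usually does both steps together.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical description of where the sections live. */
struct image_layout {
    const unsigned char *data_lma; /* where the data bytes are stored (ROM) */
    unsigned char *data_vma;       /* where the program expects them (RAM)  */
    size_t data_size;
    unsigned char *bss_vma;        /* allocatable area with no contents     */
    size_t bss_size;
};

/*
 * Copy the initialized data from its load address (LMA) to its run-time
 * address (VMA), then clear the area reserved for uninitialized data.
 */
void startup_init_sections(const struct image_layout *img)
{
    memcpy(img->data_vma, img->data_lma, img->data_size);
    memset(img->bss_vma, 0, img->bss_size);
}
```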

Every object file also has a list of symbols called the symbol table. A symbol may be defined or undefined. Each symbol has a name and each defined symbol has an address. After compiling a C or C++ file, we get a defined symbol for every defined function and global or static variable. Every undefined function or global variable which is referenced in the input file becomes an undefined symbol.

Simple Linker Script Example :

The simplest possible linker script has just one command: ‘SECTIONS’. We use the ‘SECTIONS’ command to describe the memory layout of the output file.

Let’s assume our program consists only of code, initialized data and uninitialized data. These will be in the ‘.text’, ‘.data’ and ‘.bss’ sections respectively. Assume also that these are the only sections that appear in our input files.

Example: let’s say that the code should be loaded at address 0x10000 and that the data should start at address 0x8000000. The corresponding linker script would be:


1. SECTIONS
2. {
3.     . = 0x10000;
4.     .text : { *(.text) }
5.     . = 0x8000000;
6.     .data : { *(.data) }
7.     .bss : { *(.bss) }
8. }

‘SECTIONS’ is a keyword followed by a series of symbol assignments and output section descriptions enclosed in curly braces.

Line 3 sets the value of the special symbol ‘.’, which is the location counter. If the address of an output section is not specified, it is set from the current value of the location counter. The location counter is then incremented by the size of the output section. At the start of the ‘SECTIONS’ command, the location counter has the value 0.

Line 4 defines an output section ‘.text’. Within the curly braces, we list the names of the input sections which should be placed into this output section. The ‘*’ is a wildcard which matches any file name. The expression ‘*(.text)’ means all ‘.text’ input sections in all input files.

Since the location counter is 0x10000 when the output section ‘.text’ is defined, the linker will set the address of the ‘.text’ section in the output file to 0x10000.

The remaining lines define the ‘.data’ and ‘.bss’ sections in the output file. The linker will place the ‘.data’ output section at address 0x8000000. After that, the location counter will be 0x8000000 plus the size of the ‘.data’ output section. The linker will then place the ‘.bss’ output section immediately after the ‘.data’ output section in memory.

The linker will ensure that each output section has the required alignment, by increasing the location counter if necessary.

In the above example, the specified addresses for the ‘.text’ and ‘.data’ sections will probably satisfy any alignment constraints, but the linker may have to create a small gap between the ‘.data’ and ‘.bss’ sections.
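The way the location counter moves can be imitated with a few lines of C. This is only a model of the behaviour described above, not how the linker is implemented; `struct out_section`, `align_up` and `place_section` are made-up names, and alignments are assumed to be powers of two.

```c
#include <stdint.h>

/* Hypothetical record for one output section. */
struct out_section {
    const char *name;
    uint64_t size;    /* size of the section in bytes          */
    uint64_t align;   /* required alignment (power of two)     */
    uint64_t addr;    /* address assigned by place_section()   */
};

/* Round value up to the next multiple of align (power of two). */
uint64_t align_up(uint64_t value, uint64_t align)
{
    return (value + align - 1) & ~(align - 1);
}

/*
 * Place a section at the current location counter `dot`, bumping the
 * counter first if the alignment requires it. Returns the new value of
 * the location counter (section address plus section size).
 */
uint64_t place_section(struct out_section *s, uint64_t dot)
{
    s->addr = align_up(dot, s->align);
    return s->addr + s->size;
}
```

With a ‘.data’ section of 10 bytes at 0x8000000 and a ‘.bss’ section that needs 8-byte alignment, `place_section` would leave a 6-byte gap and put ‘.bss’ at 0x8000010, which is the kind of small gap mentioned above. The sizes and alignments here are invented for illustration.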

Reference : The GNU Linker

Keyboard – Kernel’s Perspective

The keyboard is the most common way for a user to provide input to a computer. Thus it is necessary to create a driver for handling and managing keyboard input.

A scancode is simply a key number. The keyboard assigns a number to each key; this is its scancode. Scancodes are generally numbered from top to bottom and left to right. A lookup table called a keymap is used to translate scancodes to ASCII values.

The keyboard is attached to the computer through a special microcontroller chip on the mainboard. This keyboard controller chip has 2 channels: one for the keyboard and one for the mouse. The keyboard controller has an address on the I/O bus that can be used for access and control. It has 2 main registers: a Data register and a Control register. Anything that the keyboard wants to send to the computer is stored in the Data register. The keyboard then raises IRQ1 to indicate that it has data ready to be read.

When the IRQ happens, the keyboard handler is called and reads from the corresponding port. This data is the keyboard’s scancode. In the tutorial’s example, the handler checks whether the key was pressed or released. If it was just pressed, the scancode is translated to ASCII and the character is printed out with one line of code.
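The scancode translation can be sketched in C as follows. The keymap here is deliberately tiny and only covers a handful of keys from scancode set 1 (US layout); `scancode_to_ascii` and `KEY_RELEASED` are names I picked for this example. In a real handler the scancode would be read from the keyboard data port (0x60 on a PC) rather than passed in as a parameter.

```c
#include <stdint.h>

/* In scancode set 1, bit 7 is set when a key is released. */
#define KEY_RELEASED 0x80u

/* Partial keymap: index = scancode, value = ASCII (0 = not mapped). */
static const char keymap[128] = {
    [0x02] = '1', [0x03] = '2', [0x04] = '3',
    [0x10] = 'q', [0x11] = 'w', [0x12] = 'e',
    [0x1C] = '\n',
    [0x1E] = 'a', [0x1F] = 's', [0x20] = 'd',
    [0x39] = ' ',
};

/*
 * Translate a scancode to ASCII. Returns 0 for key releases and for
 * keys that this small keymap does not cover.
 */
char scancode_to_ascii(uint8_t scancode)
{
    if (scancode & KEY_RELEASED)
        return 0;               /* key was released: nothing to print */
    return keymap[scancode];    /* key was pressed: look it up        */
}
```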

Reference : Bran’s Kernel Development Tutorial


March 8, 2011

IRQ – Interrupt Request

Interrupt Requests, or IRQs, are interrupts raised by hardware devices. Some devices generate an IRQ when they have data ready to be read, or when they finish a command, e.g. writing the contents of a buffer to disk. In short, a device such as a sound card, network card, mouse, keyboard or serial port will generate an IRQ whenever it wants the processor’s attention.

PIC – Programmable Interrupt Controllers

Any IBM PC/AT compatible computer has 2 chips that are used to manage IRQs. These chips are called Programmable Interrupt Controllers (PICs). One acts as the master IRQ controller and the other as the slave IRQ controller. The slave is connected to IRQ2 on the master controller, whereas the master controller is connected directly to the processor itself. Each PIC can handle 8 IRQs: the master PIC handles IRQ0 to IRQ7 and the slave PIC handles IRQ8 to IRQ15. Whenever a device signals an IRQ, the CPU pauses whatever it was doing and calls the ISR that handles the corresponding IRQ. Once the ISR has performed whatever actions were required, the CPU tells the PIC that it has finished executing the routine.
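Telling the PIC that the routine has finished is done by sending an "End of Interrupt" (EOI) command. The sketch below shows the master/slave logic with the usual PC port numbers (0x20 for the master, 0xA0 for the slave, EOI value 0x20). Since real port I/O cannot run in a normal program, `outportb` here is a mock that only records the writes; the log array and the name `pic_send_eoi` are assumptions for this example.

```c
#include <stdint.h>

#define PIC_MASTER_CMD 0x20   /* command port of the master PIC */
#define PIC_SLAVE_CMD  0xA0   /* command port of the slave PIC  */
#define PIC_EOI        0x20   /* End of Interrupt command       */

/* Mock I/O port: records writes instead of touching hardware. */
struct port_write { uint16_t port; uint8_t value; };
struct port_write port_log[8];
int port_log_len;

void outportb(uint16_t port, uint8_t value)
{
    if (port_log_len < 8) {
        port_log[port_log_len].port = port;
        port_log[port_log_len].value = value;
        port_log_len++;
    }
}

/*
 * Acknowledge an IRQ. IRQ8 to IRQ15 come through the slave, so the
 * slave must be acknowledged too; the master is always acknowledged.
 */
void pic_send_eoi(int irq)
{
    if (irq >= 8)
        outportb(PIC_SLAVE_CMD, PIC_EOI);
    outportb(PIC_MASTER_CMD, PIC_EOI);
}
```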

PIT – Programmable Interval Timer

The Programmable Interval Timer (PIT), also called the system clock, is a chip used for generating interrupts at regular time intervals. It has 3 channels. Channel 0 is mapped to IRQ0 and is used to interrupt the CPU at predictable and regular intervals. Channel 1 is system specific, and Channel 2 is connected to the system speaker and is used to make the computer beep. Of the three, the main channels to consider are Channel 0 and Channel 2. Channel 0 makes it possible to schedule new processes accurately later on, as well as to let the current task wait for a certain period of time.
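The rate at which Channel 0 fires is set by loading a 16-bit divisor into the PIT. The post does not go into this, so the sketch below rests on one outside fact: the PIT input clock on a PC runs at roughly 1,193,182 Hz (Bran's tutorial rounds it to 1193180). The function name `pit_divisor` and the clamping behaviour are my own choices.

```c
#include <stdint.h>

/* Approximate PIT input clock on a PC, in Hz. */
#define PIT_INPUT_HZ 1193182u

/*
 * Compute the divisor for a desired interrupt frequency in Hz.
 * The result is clamped to the range 1..65535 so that it fits the
 * 16-bit counter; a requested frequency of 0 gives the slowest rate.
 */
uint32_t pit_divisor(uint32_t hz)
{
    uint32_t divisor;

    if (hz == 0)
        return 65535u;
    divisor = PIT_INPUT_HZ / hz;
    if (divisor > 65535u)
        divisor = 65535u;
    if (divisor == 0)
        divisor = 1;
    return divisor;
}
```

The divisor is then sent to the PIT as a low byte followed by a high byte, i.e. `divisor & 0xFF` and `divisor >> 8`.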

Reference : Bran’s Kernel Development Tutorial


IDT – Interrupt Descriptor Table

The IDT is used to tell the processor which Interrupt Service Routine (ISR) to call to handle an exception. IDT entries are also invoked by interrupt requests whenever a device has completed a request and needs to be serviced. IDT entries are similar to GDT entries: both have a base address and access flags, and both are 64 bits long. The main difference lies in the meaning of the address fields. In an IDT entry, the base address is the address of the ISR that the processor should call when this interrupt is raised. An IDT entry doesn’t have a limit; instead it has a segment selector that needs to be specified. The segment must be the one in which the ISR is located. This allows the processor to pass control to the kernel through an interrupt even when it occurs while the processor is in a different ring, for example when an application is running.
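A 32-bit IDT entry can be sketched in C like this. The field and function names are in the style of Bran's tutorial but should be treated as illustrative; the values 0x08 (kernel code segment selector) and 0x8E (present, ring 0, 32-bit interrupt gate) used in the usage note are assumptions that depend on how the GDT was set up.

```c
#include <stdint.h>

/* One 64-bit entry of a 32-bit protected mode IDT. */
struct idt_entry {
    uint16_t base_lo;   /* lower 16 bits of the ISR address          */
    uint16_t sel;       /* code segment selector the ISR lives in    */
    uint8_t  always0;   /* reserved, must be zero                    */
    uint8_t  flags;     /* present bit, privilege level, gate type   */
    uint16_t base_hi;   /* upper 16 bits of the ISR address          */
};

/* The fields are naturally aligned, so no padding is expected. */
_Static_assert(sizeof(struct idt_entry) == 8, "IDT entry must be 8 bytes");

/* Fill one entry: the ISR address is split across base_lo and base_hi. */
void idt_set_gate(struct idt_entry *entry, uint32_t base,
                  uint16_t sel, uint8_t flags)
{
    entry->base_lo = (uint16_t)(base & 0xFFFF);
    entry->base_hi = (uint16_t)((base >> 16) & 0xFFFF);
    entry->sel = sel;
    entry->always0 = 0;
    entry->flags = flags;
}
```

In a kernel this would be called once per vector, for example `idt_set_gate(&idt[0], (uint32_t)isr0, 0x08, 0x8E)`, where `idt` and `isr0` are the kernel's own table and assembly stub.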

ISR – Interrupt Service Routine

ISRs are used to save the current processor state and set up the appropriate segment registers needed for kernel mode before the kernel’s C-level interrupt handler is called. To handle the right exception, the correct entry in the IDT must point to the correct ISR. An exception is a special case that the processor encounters when it cannot continue normal execution. For example, when dividing by zero, the result is undefined, so the processor throws an exception and the kernel can stop that process, avoiding any further problems. If the processor finds that a program is trying to access a piece of memory that it shouldn’t, it raises a General Protection Fault.

Some exceptions push an error code onto the stack. To reduce complexity, a dummy error code of 0 is pushed onto the stack by every ISR whose exception doesn’t push an error code already. This keeps the stack frame uniform. To identify the exception, the interrupt number is also pushed onto the stack. The assembler opcode ‘cli’ is used to disable interrupts and prevent an IRQ (Interrupt Request) from firing, which could cause conflicts in our kernel. Each ISR then jumps to the common routine ‘isr_common_stub’. This routine saves the processor state on the stack, pushes the current stack address onto the stack, calls the ‘fault_handler’ function and finally restores the state of the stack.
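The dummy error code trick can be modelled in C. The real stubs are written in assembly; this sketch only shows the decision they encode. The list of vectors that push an error code (8, 10 to 14, and 17) is taken from the Intel manuals as I understand them, and `struct fault_frame`, `cpu_pushes_error_code` and `make_uniform_frame` are hypothetical names.

```c
#include <stdbool.h>
#include <stdint.h>

/* The two values every stub leaves on the stack for the C handler. */
struct fault_frame {
    uint32_t int_no;     /* interrupt (exception) number              */
    uint32_t err_code;   /* CPU error code, or the dummy value 0      */
};

/* True for the classic exceptions where the CPU pushes an error code. */
bool cpu_pushes_error_code(unsigned vector)
{
    switch (vector) {
    case 8:  /* double fault             */
    case 10: /* invalid TSS              */
    case 11: /* segment not present      */
    case 12: /* stack-segment fault      */
    case 13: /* general protection fault */
    case 14: /* page fault               */
    case 17: /* alignment check          */
        return true;
    default:
        return false;
    }
}

/*
 * Build the uniform frame: keep the CPU's error code when there is one,
 * otherwise store the dummy error code 0.
 */
struct fault_frame make_uniform_frame(unsigned vector, uint32_t cpu_err_code)
{
    struct fault_frame frame;
    frame.int_no = vector;
    frame.err_code = cpu_pushes_error_code(vector) ? cpu_err_code : 0;
    return frame;
}
```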

Reference : Bran’s Kernel Development Tutorial.