After a long time..!!

September 6, 2011

Well, I finished my thesis with a 2.0 (German grading system). I would like to give special thanks to my guide, Dr. Stefan Lankes, for guiding me throughout the thesis. Along with my bachelor thesis, I also managed to pass all my exams held at RWTH, thus completing my B-Tech.

I came back from Germany on 30th July and was really busy completing the formalities at Amrita. It was a bit of a difficult and confusing situation, but all in all, it's done now and I got all my certificates and my degree. :)

My convocation was on 27th August. It was in pure traditional Kerala style with Kerala outfits. I was extremely happy to meet all my friends after exactly 1 year. It was kind of a nostalgic feeling..love you all MY ITians. Good to see you guys after a really long time… :)

I have applied for my visa and I am expecting to fly back to Germany for my regular masters by mid-October. Oh..btw, my application for the MSc in Software Systems Engineering has been accepted by RWTH, so I will continue at RWTH with my masters. And and and…I will not be called an Erasmus student anymore….Erasmus tag is removed..hehe :D

For now I am going to visit my native place and am looking forward to enjoying time with all my cousins. Gonna enjoy my vacations till mid-October and then….GOD BLESS ME!!! :D

Context Switch – Software vs Hardware Approach

A context switch (also known as a task switch or process switch) is the mechanism of storing and restoring the state of a CPU that is assigned to a particular process (or task). A context is the contents of the CPU’s registers and program counter at any point in time. During a context switch, the kernel performs the following steps:

1. It suspends the process that is currently executing and stores the CPU’s state for that process somewhere in memory.

2. It retrieves the context of the next process from memory and loads the CPU’s registers with the saved values.

3. It then returns to the location indicated by the program counter in order to resume the process.
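The three steps can be sketched in plain C on a simulated CPU. This is only a model to show the order of operations, not real kernel code: `struct cpu_context`, `struct task` and `context_switch` are names I made up for illustration, and a real kernel does the save and restore in assembly.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NUM_REGS 8  /* e.g. EAX..ESP on 32-bit x86 */

/* A context: the general-purpose registers plus the program counter. */
struct cpu_context {
    uint32_t regs[NUM_REGS];
    uint32_t pc;
};

/* Hypothetical task structure that holds a saved context in memory. */
struct task {
    int id;
    struct cpu_context saved;
};

/*
 * Step 1: store the CPU state of the outgoing task in memory.
 * Step 2: load the saved state of the incoming task into the CPU.
 * Step 3: return the program counter at which execution resumes.
 */
uint32_t context_switch(struct cpu_context *cpu,
                        struct task *prev, struct task *next)
{
    assert(cpu != NULL && prev != NULL && next != NULL);
    prev->saved = *cpu;   /* save the outgoing state */
    *cpu = next->saved;   /* restore the incoming state */
    return cpu->pc;       /* resume the new task here */
}
```

After a call such as `context_switch(&cpu, &a, &b)`, the simulated CPU holds the registers that task `b` had when it was last suspended, and task `a` can be resumed later from its own saved copy.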

Multitasking is built on the very basic concept of context switching. In a multitasking operating system, multiple processes execute on a single CPU seemingly simultaneously, without interfering with each other. The illusion of concurrency is achieved by means of context switches that occur within a fraction of a second.

The next task to run on the CPU is decided by the scheduler based on a scheduling algorithm. Several scheduling algorithms are available, such as Round Robin, priority-based, and priority-based Round Robin.
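To make the Round Robin idea concrete, here is a minimal sketch of how a scheduler could pick the next task. The `task_state` enum and the `rr_pick_next` function are hypothetical names, and I am assuming the tasks simply live in an array.

```c
#include <assert.h>
#include <stddef.h>

enum task_state { TASK_READY, TASK_BLOCKED };

/*
 * Round Robin: starting just after the current task, walk the task
 * array in a circle and return the first task that is ready to run.
 * If no other task is ready, the current index is returned.
 */
size_t rr_pick_next(const enum task_state *states, size_t count,
                    size_t current)
{
    assert(states != NULL && count > 0 && current < count);
    for (size_t step = 1; step <= count; step++) {
        size_t candidate = (current + step) % count;
        if (states[candidate] == TASK_READY)
            return candidate;
    }
    return current; /* nothing is ready: keep the current task */
}
```

A priority-based scheduler would replace the circular walk with a search for the highest-priority ready task, but the interface could stay the same.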

A context switch can be performed using either a software mechanism or a hardware mechanism.

The hardware approach uses a special hardware feature of x86 processors called the Task State Segment (TSS). A TSS is a data segment that contains the processor state associated with a particular task or process. Each task has its own execution state: general-purpose registers, segment registers, flags, the task register and so on. The TSS holds this execution state for each task. A task switch in this case can be invoked explicitly using instructions like CALL or JMP that reference a TSS. The CPU then automatically saves the state of the current task in its TSS and loads the state of the new task from that task's TSS.
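For reference, this is roughly what a 32-bit TSS looks like when written as a C structure. The field order follows the layout in the Intel manual as far as I know it (104 bytes in total); the struct name, the field names and the `tss32_init` helper are my own, and the selector and stack values passed to it are up to the kernel.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* 32-bit Task State Segment. 16-bit fields such as the segment
 * selectors occupy the low half of a 32-bit slot. */
struct tss32 {
    uint32_t prev_task_link;            /* selector of the previous task */
    uint32_t esp0, ss0;                 /* ring 0 stack */
    uint32_t esp1, ss1;                 /* ring 1 stack */
    uint32_t esp2, ss2;                 /* ring 2 stack */
    uint32_t cr3;                       /* page directory base */
    uint32_t eip, eflags;
    uint32_t eax, ecx, edx, ebx, esp, ebp, esi, edi;
    uint32_t es, cs, ss, ds, fs, gs;    /* segment selectors */
    uint32_t ldt_selector;
    uint16_t trap;                      /* bit 0: debug trap on switch */
    uint16_t iomap_base;                /* offset of the I/O bitmap */
};

/* Hypothetical helper: clear the TSS and set the kernel stack that
 * the CPU uses when entering ring 0 through this task. */
void tss32_init(struct tss32 *tss, uint16_t kernel_ss, uint32_t kernel_esp)
{
    assert(tss != NULL);
    memset(tss, 0, sizeof(*tss));
    tss->ss0 = kernel_ss;
    tss->esp0 = kernel_esp;
    /* An I/O map base at or beyond the TSS limit means "no I/O bitmap". */
    tss->iomap_base = (uint16_t)sizeof(*tss);
}
```

Seeing all these fields makes it clear why the hardware switch is heavy: the CPU stores and reloads the whole structure on every task switch.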

The software approach saves and reloads only the state that needs to be changed. Its basic principle is to provide a function that saves the current stack pointer and loads the new stack pointer. When this function is called, the current instruction pointer (the return address) is pushed onto the old stack, and when the function returns, the new instruction pointer is popped off the new stack. General-purpose registers, flags, data segment registers and all other relevant registers must also be pushed onto the old stack and popped off the new stack.

As the hardware approach saves almost all of the register state, it is slow when compared to the software approach. Performance being a main goal of any operating system, software context switching is used in most modern operating systems. Moreover, the 64-bit architecture does not support hardware context switches, so 64-bit kernels rely only on software context switching.

Hope it helps.!! :)

Registers in 32 bit and 64 bit Architecture

After reading Bran’s kernel tutorial and understanding the design of a very basic kernel, it was time for me to begin my real work, i.e. the implementation of a 64-bit kernel. To begin with, I started reading about the different types of registers available in both the 32-bit and 64-bit architectures. Well, without wasting much time, let me go through them quickly.

There are the following four types of registers:

1. General Purpose Registers

2. Segment Registers

3. Program Status and Control Register

4. Instruction Pointer Register

1. General Purpose Registers

In 32-bit mode, the 8 general-purpose registers are EAX, EBX, ECX, EDX, ESI, EDI, EBP and ESP.

In 64-bit mode, there are 16 general-purpose registers, each 64 bits wide, and the default operand size is 32 bits. The registers RAX, RBX, RCX, RDX, RSI, RDI, RBP, RSP and R8-R15 are available.

All of them serve the following purposes: operands for logical and arithmetic operations, operands for address calculations, and memory pointers.

2. Segment Registers

In 32-bit mode, there are 6 segment registers: CS, DS, SS, ES, FS and GS. Each holds a 16-bit segment selector, which points to a particular segment in memory. To access a particular segment, the corresponding segment selector has to be present in the appropriate segment register. Each of the segment registers is associated with one of three types of storage: code, data or stack. The CS register points to the code segment; the DS, ES, FS and GS registers point to data segments; and the SS register points to the stack segment.

In 64-bit mode, CS, DS, ES and SS are treated as if each segment base is 0, regardless of the value of the associated segment descriptor base. This creates a flat address space for code, data and stack. FS and GS are exceptions: both can be used as additional base registers in linear address calculations.

3. Program Status and Control Register

The 32-bit EFLAGS register contains a group of status flags, a control flag and a group of system flags.

Status flags: bits 0, 2, 4, 6, 7 and 11 are the status flags (CF, PF, AF, ZF, SF and OF), and they indicate the result of arithmetic instructions.

Direction flag (DF): the direction flag (bit 10) controls the string instructions (e.g. MOVS, CMPS, SCAS, LODS and STOS). Setting the DF flag causes the string instructions to auto-decrement (process strings from high addresses to low), whereas clearing the DF flag causes them to auto-increment. The STD and CLD instructions set and clear the DF flag, respectively.

Apart from the above, there are a few system flags which control operating-system operations and should not be modified by applications.

In 64-bit mode, EFLAGS is extended to 64 bits and is called RFLAGS. The upper 32 bits of RFLAGS are reserved and the lower 32 bits are the same as EFLAGS.
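The bit positions above translate directly into masks. Here is a small sketch for decoding a saved flags value; the macro and function names are my own, only the bit numbers come from the manuals.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Status flags */
#define FLAG_CF (1u << 0)   /* carry */
#define FLAG_PF (1u << 2)   /* parity */
#define FLAG_AF (1u << 4)   /* auxiliary carry */
#define FLAG_ZF (1u << 6)   /* zero */
#define FLAG_SF (1u << 7)   /* sign */
#define FLAG_OF (1u << 11)  /* overflow */
/* Control flag */
#define FLAG_DF (1u << 10)  /* direction */

/* Returns true if the given flag bit is set in a saved flags value. */
bool flag_is_set(uint64_t rflags, uint32_t mask)
{
    assert(mask != 0);
    return (rflags & mask) != 0;
}

/* Direction in which string instructions move their pointers:
 * +1 when DF is clear (auto-increment), -1 when DF is set. */
int string_step(uint64_t rflags)
{
    return flag_is_set(rflags, FLAG_DF) ? -1 : 1;
}

/* The lower 32 bits of RFLAGS are the same as EFLAGS. */
uint32_t eflags_from_rflags(uint64_t rflags)
{
    return (uint32_t)rflags;
}
```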

4. Instruction Pointer Register

The instruction pointer register contains the offset in the current code segment of the next instruction to be executed. It cannot be accessed directly by software; it is controlled implicitly by control transfer instructions like JMP, Jcc, CALL and RET.

In 32-bit mode, the instruction pointer is EIP, which is 32 bits long. In 64-bit mode, the instruction pointer is 64 bits long and is named RIP.

Reference : Intel and AMD Manuals.

Linker Scripts – Inner Concepts

March 11, 2011

Every link is controlled by a linker script, which is written in the linker command language. The main purpose of the linker script is to describe how the sections in the input files should be mapped into the output file and to control the memory layout of the output file. The linker always uses a linker script. If we do not specify one, it uses a default script that is compiled into the linker executable.

Basic Linker Script Concepts :

The linker combines input files into a single output file. Both the output file and the input files are object files; the output file is often an executable. Each object file contains a list of sections, each of which has a name and a size. A section may be marked as loadable, allocatable or neither. A loadable section's contents should be loaded into memory when the output file is run. An allocatable section has no contents; it means that a certain area in memory should be set aside, but nothing in particular should be loaded there. A section which is neither loadable nor allocatable typically contains some sort of debugging information.

Every loadable or allocatable section will have two addresses:

1. VMA – Virtual Memory Address. This is the address the section will have when the output file is run.

2. LMA – Load Memory Address. This is the address at which the section will be loaded.

In many cases the two addresses will be the same. An interesting example where the two addresses differ is when a data section is loaded into ROM and then later copied into RAM when the program starts. Here the ROM address would be the LMA and the RAM address would be the VMA.
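The ROM-to-RAM case can be modelled in a few lines of C. This is just a simulation over a byte array standing in for the address space; `struct section` and `relocate_section` are hypothetical names, and in real startup code the addresses would come from symbols defined in the linker script.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* One section, described by offsets into a simulated address space. */
struct section {
    size_t lma;   /* where the section was loaded (e.g. in ROM) */
    size_t vma;   /* where the running program expects it (e.g. in RAM) */
    size_t size;
};

/* Copy a section from its load address to its run-time address.
 * Nothing needs to be done when LMA and VMA are the same. */
void relocate_section(uint8_t *memory, const struct section *s)
{
    assert(memory != NULL && s != NULL);
    if (s->lma != s->vma)
        memmove(memory + s->vma, memory + s->lma, s->size);
}
```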

Every object file also has a list of symbols called the symbol table. A symbol may be defined or undefined. Each symbol has a name, and each defined symbol has an address. After compiling a C or C++ file, we get a defined symbol for every defined function and every global or static variable. Every undefined function or global variable which is referenced in the input file becomes an undefined symbol.

Simple Linker Script Example :

The simplest possible linker script has just one command: ‘SECTIONS’. We use the ‘SECTIONS’ command to describe the memory layout of the output file.

Let's assume our program consists only of code, initialized data and uninitialized data. These will be in the ‘.text’, ‘.data’ and ‘.bss’ sections respectively. Assume also that these are the only sections that appear in our input files.

Example: let's say that the code should be loaded at address 0x10000 and that the data should start at address 0x8000000. The corresponding linker script would be:

1. SECTIONS
2. {
3.     . = 0x10000;
4.     .text : { *(.text) }
5.     . = 0x8000000;
6.     .data : { *(.data) }
7.     .bss : { *(.bss) }
8. }

‘SECTIONS’ is a keyword followed by a series of symbol assignments and output section descriptions enclosed in curly braces.

Line 3 sets the value of the special symbol ‘.’, which is the location counter. If the address of an output section is not specified, it is set from the current value of the location counter. The location counter is then incremented by the size of the output section. At the start of the ‘SECTIONS’ command, the location counter has the value ‘0’.

Line 4 defines an output section ‘.text’. Within the curly braces, we list the names of the input sections which should be placed into this output section. The ‘*’ is a wildcard which matches any file name. The expression ‘*(.text)’ means all ‘.text’ input sections in all input files.

Since the location counter is 0x10000 when the output section ‘.text’ is defined, the linker will set the address of the ‘.text’ section in the output file to ‘0x10000’.

The remaining lines define the ‘.data’ and ‘.bss’ sections in the output file. The linker will place the ‘.data’ output section at address 0x8000000. After that, the location counter will be 0x8000000 plus the size of the ‘.data’ output section. The linker will then place the ‘.bss’ output section immediately after the ‘.data’ output section in memory.

The linker will ensure that each output section has the required alignment, by increasing the location counter if necessary.

In the above example, the specified addresses for the ‘.text’ and ‘.data’ sections will probably satisfy any alignment constraints, but the linker may have to create a small gap between the ‘.data’ and ‘.bss’ sections.
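The way the location counter moves, including the alignment gap, can be mimicked with a little arithmetic. This is only my own model of the behaviour described above, not the linker's actual code; `align_up` and `place_section` are made-up names, and the section sizes and alignments in the usage note are invented.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Round addr up to the next multiple of align (a power of two). */
uint64_t align_up(uint64_t addr, uint64_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (addr + align - 1) & ~(align - 1);
}

/*
 * Place one output section at the location counter: first bump the
 * counter up to the required alignment, then advance it by the size
 * of the section. Returns the address given to the section.
 */
uint64_t place_section(uint64_t *location_counter,
                       uint64_t size, uint64_t align)
{
    assert(location_counter != NULL);
    uint64_t address = align_up(*location_counter, align);
    *location_counter = address + size;
    return address;
}
```

For example, if ‘.data’ is 0x102 bytes long and starts at 0x8000000, the location counter ends up at 0x8000102. A ‘.bss’ section that needs 8-byte alignment would then be placed at 0x8000108, leaving a 6-byte gap.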

Reference : The GNU Linker

Keyboard – Kernel’s Perspective

The keyboard is the most common way for a user to give a computer input, so it is necessary to create a driver for handling and managing keyboard input.

A scancode is simply a key number. The keyboard assigns a number to each key; this is the key's scancode. The scancodes are generally numbered from top to bottom and left to right. A lookup table called a keymap is used to translate scancodes to ASCII values.

The keyboard is attached to the computer through a special microcontroller chip on the mainboard. This keyboard controller chip has 2 channels: one for the keyboard, and one for the mouse. The keyboard controller has an address on the I/O bus that can be used for access and control, and it has 2 main registers: a Data register and a Control register. Anything that the keyboard wants to send to the computer is stored in the Data register. The keyboard then generates IRQ1 to indicate that it has data ready to be grabbed.

When the IRQ happens, the keyboard handler is called and reads from the data port. This data is the keyboard’s scancode. The handler checks whether the key was pressed or released. If it was just pressed, the handler translates the scancode to ASCII and prints that character out.
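The translation step of the handler can be sketched like this. I am assuming scancode set 1, where the top bit of the scancode is set when a key is released, and the keymap is deliberately tiny (a real one fills all 128 entries). The function names are hypothetical, and reading the scancode from the data port is left out so that the sketch runs anywhere.

```c
#include <assert.h>
#include <stdint.h>

#define KEY_RELEASED 0x80  /* top bit set: the key was released */
#define KEYMAP_SIZE  128

/* Partial US keymap for illustration; unmapped keys stay 0. */
static const char keymap[KEYMAP_SIZE] = {
    [0x02] = '1', [0x03] = '2', [0x04] = '3',
    [0x10] = 'q', [0x11] = 'w', [0x12] = 'e',
    [0x1E] = 'a', [0x1F] = 's', [0x20] = 'd',
    [0x1C] = '\n', [0x39] = ' ',
};

/* Look up the ASCII value for a key-press scancode. */
char keymap_lookup(uint8_t make_code)
{
    assert(make_code < KEYMAP_SIZE);
    return keymap[make_code];
}

/* Returns the ASCII character for a key press, or 0 for a key
 * release or a key that has no entry in the keymap. */
char scancode_to_ascii(uint8_t scancode)
{
    if (scancode & KEY_RELEASED)
        return 0;
    return keymap_lookup(scancode);
}
```

In the real handler, the scancode would first be read from the keyboard controller's data port, and the returned character would be printed only when it is non-zero.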

Reference : Bran’s Kernel Development Tutorial

IRQ, PIC and PIT

March 8, 2011

IRQ – Interrupt Request

Interrupt Requests, or IRQs, are interrupts raised by hardware devices. Some devices generate an IRQ when they have data ready to be read, or when they finish a command, e.g. writing the contents of a buffer to disk. In short, any device, from a sound card to a network card, mouse, keyboard or serial port, will generate an IRQ whenever it wants the processor’s attention.

PIC – Programmable Interrupt Controllers

Any IBM PC/AT compatible computer has 2 chips that are used to manage IRQs. These chips are called PICs. One acts as the master IRQ controller and the other acts as the slave IRQ controller. The slave is connected to IRQ2 on the master controller, whereas the master controller is connected directly to the processor itself. Each PIC can handle 8 IRQs: the master PIC handles IRQ0 to IRQ7 and the slave PIC handles IRQ8 to IRQ15. Whenever a device signals an IRQ, the CPU pauses whatever it was doing and calls the ISR that handles the corresponding IRQ. The CPU then performs whatever actions are required and tells the PIC that it has finished executing the routine (an End Of Interrupt, or EOI).
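Because of the master/slave cascade, the "finished" message has to go to the right chips. Here is a sketch of that decision, using the usual command ports 0x20 (master) and 0xA0 (slave). Instead of doing real port writes, the hypothetical `pic_eoi_ports` function just reports which ports would receive the EOI command, so it can run outside a kernel.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PIC_MASTER_CMD 0x20  /* command port of the master PIC */
#define PIC_SLAVE_CMD  0xA0  /* command port of the slave PIC */
#define PIC_EOI        0x20  /* End Of Interrupt command byte */

/*
 * Fill `ports` with the command ports that must be sent PIC_EOI
 * after handling `irq`, and return how many there are. IRQ8-15
 * arrive through the slave, so both chips need to be told.
 */
int pic_eoi_ports(int irq, uint16_t ports[2])
{
    assert(ports != NULL && irq >= 0 && irq <= 15);
    int count = 0;
    if (irq >= 8)
        ports[count++] = PIC_SLAVE_CMD;
    ports[count++] = PIC_MASTER_CMD;
    return count;
}
```

In a kernel, each returned port would simply get an `outb(port, PIC_EOI)`-style write, where the port output routine is whatever the kernel provides.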

PIT – Programmable Interval Timer

The Programmable Interval Timer (PIT), also called the System Clock, is a chip used for generating interrupts at regular time intervals. It has 3 channels. Channel 0 is mapped to IRQ0 and is used to interrupt the CPU at predictable and regular times. Channel 1 is system-specific, and Channel 2 is connected to the system speaker and is used to make the computer beep. Of the three, the main channels to consider are Channel 0 and Channel 2. Channel 0 makes it possible to schedule processes accurately later on, as well as to let the current task wait for a certain period of time.
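The interrupt rate of Channel 0 is set by a 16-bit divisor of the PIT's input clock. A small sketch of the calculation follows, assuming the commonly quoted input frequency of about 1.193182 MHz (this number is not from the text above, and `pit_divisor` is a name I chose).

```c
#include <assert.h>
#include <stdint.h>

#define PIT_INPUT_HZ 1193182u  /* assumed PIT input clock in Hz */

/*
 * Divisor to program into Channel 0 so that IRQ0 fires roughly
 * `hz` times per second. The divisor must fit in 16 bits, which
 * is why very low rates (below 19 Hz) are rejected here.
 */
uint16_t pit_divisor(uint32_t hz)
{
    assert(hz >= 19 && hz <= PIT_INPUT_HZ);
    return (uint16_t)(PIT_INPUT_HZ / hz);
}
```

The divisor is sent to the chip as two bytes, low byte first: `divisor & 0xFF`, then `divisor >> 8`.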

Reference : Bran’s Kernel Development Tutorial

IDT and ISR

IDT – Interrupt Descriptor Table

The IDT is used to tell the processor which Interrupt Service Routine (ISR) to call to handle an exception. IDT entries are also invoked by Interrupt Requests, whenever a device has completed a request and needs to be serviced. IDT entries are similar to GDT entries: both have a base address and access flags, and both are 64 bits long. The main difference lies in the meaning of the fields. In an IDT entry, the base address is the address of the ISR that the processor should call when this interrupt is raised. An IDT entry doesn’t have a limit; instead it has a segment selector that needs to be specified. The segment must be the one in which the ISR is located. This allows the processor to pass control to the kernel when an interrupt occurs while the processor is in a different ring, for example when an application is running.
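Here is how a 32-bit IDT entry could be laid out and filled in C. The structure follows the 8-byte gate format; the field names and the `idt_set_gate` helper are in the spirit of the tutorial but written from memory, so treat them as a sketch rather than the tutorial's exact code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define IDT_ENTRIES 256

/* One 64-bit IDT entry: the ISR address is split into two halves. */
struct idt_entry {
    uint16_t base_lo;   /* lower 16 bits of the ISR address */
    uint16_t sel;       /* code segment selector of the ISR */
    uint8_t  always0;   /* reserved, always 0 */
    uint8_t  flags;     /* present bit, ring level and gate type */
    uint16_t base_hi;   /* upper 16 bits of the ISR address */
};

/* Point IDT entry `num` at the ISR located at `base`. */
void idt_set_gate(struct idt_entry *idt, unsigned num,
                  uint32_t base, uint16_t sel, uint8_t flags)
{
    assert(idt != NULL && num < IDT_ENTRIES);
    idt[num].base_lo = (uint16_t)(base & 0xFFFF);
    idt[num].base_hi = (uint16_t)(base >> 16);
    idt[num].sel     = sel;
    idt[num].always0 = 0;
    idt[num].flags   = flags;
}
```

A typical call would look like `idt_set_gate(idt, 0, isr0_address, 0x08, 0x8E)`, where 0x8E marks a present, ring 0, 32-bit interrupt gate and 0x08 is assumed to be the kernel code segment selector.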

ISR – Interrupt Service Routine

ISRs are used to save the current processor state and set up the appropriate segment registers needed for kernel mode before the kernel’s C-level interrupt handler is called. To handle the right exception, the correct entry in the IDT must point to the correct ISR. An exception is a special case that the processor encounters when it cannot continue normal execution. For example, when dividing by zero, the result is undefined, so the processor will throw an exception and the kernel can stop that process, avoiding any further problems. If the processor finds that the program is trying to access a piece of memory that it shouldn’t, it will raise a General Protection Fault.
Some exceptions push an error code onto the stack. To decrease the complexity, a dummy error code of 0 is pushed onto the stack by any ISR whose exception doesn’t push an error code already. This is done to keep a uniform stack frame. To track the exception, the interrupt number is also pushed onto the stack. The assembler opcode ‘cli’ is used to disable interrupts and prevent an IRQ (Interrupt Request) from firing, which could cause conflicts in our kernel. Each ISR then jumps to the common routine ‘isr_common_stub’. This routine saves the processor state on the stack, pushes the current stack address onto the stack, calls the ‘fault_handler’ function and finally restores the saved state from the stack.
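The dummy error code trick can be modelled in C like this. I am assuming the classic set of exceptions that push an error code themselves (vectors 8, 10 to 14 and 17, as I remember them from the Intel manual; newer processors define a few more). `struct isr_frame` and both functions are hypothetical names, and the real work is done by the assembly stubs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* The part of the stack frame that every ISR stub makes uniform. */
struct isr_frame {
    uint32_t int_no;    /* pushed by the ISR stub */
    uint32_t err_code;  /* pushed by the CPU, or a dummy 0 */
};

/* True if the CPU itself pushes an error code for this exception. */
bool exception_pushes_error_code(int vector)
{
    switch (vector) {
    case 8:  case 10: case 11: case 12:
    case 13: case 14: case 17:
        return true;
    default:
        return false;
    }
}

/* Build the uniform frame: keep the CPU's error code when there is
 * one, otherwise store the dummy error code 0. */
struct isr_frame make_frame(int vector, uint32_t cpu_error_code)
{
    assert(vector >= 0 && vector <= 255);
    struct isr_frame frame;
    frame.int_no = (uint32_t)vector;
    frame.err_code = exception_pushes_error_code(vector) ? cpu_error_code : 0;
    return frame;
}
```

With a uniform frame, one C-level handler can read `int_no` and `err_code` at the same position no matter which exception fired.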

Reference : Bran’s Kernel Development Tutorial.

GDT – Global Descriptor Table

March 6, 2011
The GDT is a data structure used by Intel x86 family processors to define the characteristics of the various memory areas used during program execution. It defines base access privileges for certain parts of memory. We can use an entry in the GDT to generate segment violation exceptions, which give the kernel an opportunity to end a process that is doing something it shouldn’t. Most modern operating systems use paging to do this instead, as it is a lot more versatile and flexible.
The GDT is a list of 64-bit entries. Each entry defines where in memory the allowed region starts, the limit of this region and the access privileges associated with the entry. Each entry also defines whether the segment is for system use (ring 0) or for application use (ring 3). Major operating systems today only use ring 0 and ring 3. An application causes an exception if it tries to access system (ring 0) data. This is mainly to prevent an application from crashing the kernel. As far as the GDT is concerned, the ring levels tell the processor whether it is allowed to execute special privileged instructions. Certain instructions are privileged, meaning that they can only be run at more privileged ring levels, e.g. ‘cli’, which disables interrupts, and ‘sti’, which enables them. If an application were allowed to use these instructions, it could effectively stop the kernel from running.
While creating a GDT, 3 entries are the most important:
First, a dummy descriptor at the beginning, which acts as the NULL segment for the processor’s memory protection features. Entry 0 is known as the NULL descriptor, and no segment register should be left pointing to it when memory is accessed, as this will cause a General Protection Fault; this is a protection feature of the processor.
Second, an entry for the code segment. The Code Segment (CS) register tells the processor at which offset into the GDT it will find the access privileges under which to execute the current code.
Third, an entry for the data segment. The Data Segment (DS) register defines the access privileges for the current data. ES, FS and GS are simply alternate DS registers and are not important as such.
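The 64-bit entry format is easier to see as a C structure. The sketch below splits the base and limit across the fields of an 8-byte descriptor; the field and function names are modelled on the tutorial from memory and should be taken as assumptions, as should the flat-model values in the usage note.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One 64-bit GDT entry: base and limit are scattered over the entry. */
struct gdt_entry {
    uint16_t limit_low;    /* limit bits 0-15 */
    uint16_t base_low;     /* base bits 0-15 */
    uint8_t  base_middle;  /* base bits 16-23 */
    uint8_t  access;       /* present bit, ring level, segment type */
    uint8_t  granularity;  /* limit bits 16-19 plus flag bits */
    uint8_t  base_high;    /* base bits 24-31 */
};

/* Fill entry `num` of a GDT that has `count` entries. */
void gdt_set_gate(struct gdt_entry *gdt, size_t count, size_t num,
                  uint32_t base, uint32_t limit,
                  uint8_t access, uint8_t gran)
{
    assert(gdt != NULL && num < count);
    gdt[num].base_low    = (uint16_t)(base & 0xFFFF);
    gdt[num].base_middle = (uint8_t)((base >> 16) & 0xFF);
    gdt[num].base_high   = (uint8_t)((base >> 24) & 0xFF);
    gdt[num].limit_low   = (uint16_t)(limit & 0xFFFF);
    gdt[num].granularity = (uint8_t)(((limit >> 16) & 0x0F) | (gran & 0xF0));
    gdt[num].access      = access;
}
```

The three entries described above could then be set up as `gdt_set_gate(gdt, 3, 0, 0, 0, 0, 0)` for the NULL descriptor, `gdt_set_gate(gdt, 3, 1, 0, 0xFFFFFFFF, 0x9A, 0xCF)` for a ring 0 code segment and `gdt_set_gate(gdt, 3, 2, 0, 0xFFFFFFFF, 0x92, 0xCF)` for a ring 0 data segment, both covering the whole 4GB address space.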

Reference : Bran’s Kernel Development Tutorial

Kernel Entry Point and Linker Script

KERNEL ENTRY POINT :

The kernel entry point is the piece of code that is executed first when the bootloader calls the kernel. This chunk of code is mainly written in assembly language, because things like setting up a new stack, loading a new Global Descriptor Table (GDT, a data structure used by Intel x86 family processors to define the characteristics of the various memory areas used during program execution), loading an Interrupt Descriptor Table (IDT, a data structure used by the x86 architecture to implement an interrupt vector table, which the processor uses to determine the correct response to interrupts and exceptions), or loading segment registers cannot be done in C code. Many kernels put all their assembler code in this one file and put the rest of the sources in several C source files.
As far as code is concerned, all this file does is load a new 8KB stack and then jump into an infinite loop. The stack is a small amount of memory used to store or pass arguments to functions in C. It can also hold local variables that are declared and used inside functions. Global variables are stored in the data and BSS sections. A BSS (Block Started by Symbol) section typically includes all uninitialized variables declared outside any function as well as uninitialized local variables declared with the static keyword. The program loader zero-initializes the memory allocated for the BSS section when it loads the program. An operating system may use a technique called zero-fill-on-demand to implement the BSS segment efficiently.

LINKER SCRIPT :

The linker is a tool that takes all the compiler and assembler output files and links them together into one binary file. A binary file can have many formats; the most common ones are FLAT, AOUT, COFF, PE and ELF. Regardless of which format we choose, there are always 3 sections in the output file. The ‘Text’ or ‘Code’ section is the executable code itself. The ‘Data’ section is for hard-coded values used in the code, e.g. suppose you declare a variable and set its value to 5; the value 5 would get stored in the ‘Data’ section. The third section is called the ‘BSS’ section. It is a virtual section: it doesn’t exist in the binary image, but exists in memory when your binary is loaded.

For more details on linker scripts, see the post ‘Linker Scripts – Inner Concepts’.

Combine The Two :
Now we must combine the two to create our kernel’s binary for GRUB to load. The simplest way to do this on Unix is to create a makefile to do the assembling, compiling and linking for us. However, if you are using Windows, you can create a batch file. A batch file is simply a collection of DOS commands that can be executed with one command.

Reference : Bran’s Kernel Development Tutorial

Kernel Development – An Introduction

February 15, 2011

A kernel is the essential center of a computer operating system, the core that provides basic services for all other parts of the operating system and manages the resources that the hardware has to offer. Developing a kernel means understanding how to create software that interacts with and manages the hardware. A few of the most important system resources that have to be taken care of are:

1. Processor or CPU
2. Memory
3. Hardware resources

1. Processor or CPU
This is managed mainly by allotting time to specific operations and interrupting a task or process when it is time for another scheduled event to happen. The simple concept of multitasking is applied here. The system timer is used to interrupt the current process and switch to a new process, which guarantees that each process will be given a chunk of time to run. Various scheduling algorithms can be used to decide which process will run next. The simplest is ‘Round Robin’. A more complicated scheduler involves ‘priorities’, where processes are switched based on their priorities: a higher-priority task is allowed more time to run than a low-priority task. An even more complicated algorithm is a ‘real-time scheduler’, which guarantees that a certain process will be allowed at least a set number of timer ticks to run.

2. Memory
There might be situations where memory is more precious than CPU time, probably due to memory limits, whereas CPU time is not scarce. We can either code the kernel to be memory-efficient, which requires a lot of CPU time, or CPU-efficient, by using memory to store caches and buffers that ‘remember’ commonly used items instead of looking them up every time. The best approach combines both, i.e. makes the best use of memory while preserving CPU time.

3. Hardware resources
These include Interrupt Requests (IRQs), Direct Memory Access (DMA) channels, and addresses on the I/O bus in the form of ports. IRQs are special signals generated by hardware devices that tell the CPU to execute a certain routine to handle the data. A DMA channel allows a device to lock the memory bus and transfer its data directly into system memory without halting the processor’s execution. A DMA-enabled device doesn’t bother the CPU during the transfer; it then generates an IRQ to the CPU indicating that the transfer is complete. This increases the performance of the system. The third important hardware resource is an address on the I/O bus in the form of a port. A device is configured, read or given data using its I/O port, and a device can use many I/O ports.

Reference : Bran’s Kernel Development Tutorial
