TechKnowledge

Thursday, December 8, 2016

Communication Techniques | Programmed I/O | Interrupt Driven I/O | Direct Memory Access (DMA)

Communication Techniques

Three techniques are possible foe I/O operations:

• Programmed I/O

• Interrupt Driven I/O

• Direct Memory Access (DMA)

Programmed I/O

When a processor is executing a program and encounters an instruction related to I/O, it executes the instruction by using a command to the appropriate I/O module.

In the case of programmed I/O, the I/O module performs the requested action and then set the appropriate bits in the I/O status register but takes no further action to alter the processor.

In particular, it does not interrupt the processor.

Thus, after the I/O instruction is invoke, the processor must take some action role in determining when the I/O instruction is completed.

For this purpose, the processor periodically checks the status of the I/O module until it finds that the operation is complete.

With this technique the processor is responsible for extracting data from main memory for output and storing data into memory for input.

Thus the instruction set includes the I/O instructions in the following categories:

• Control: Used to active an external device and tell it what to do.

• Status: Used to test various conditions associated with an I/O modules.

• Transfer: Used to read and/or write data between processor register and external devices.

Interrupt Driven I/O

With programmed I/O processor has to wait for a long time for the I/O module of concern to be read for either reception or transmission of more data.

The processor while waiting must interrogate the status of the I/O module.

As a result, the performance level of the entire system is degraded.

An alternative is for the processor to issue an I/O command to a module and then go no to do some other useful work.

The I/O module will then interrupt the processor to request service when it is ready to exchange the data with the processor.

The processor then executes the data transfer, as before, and then resumes its former processing.

Let us consider how this works, first from the point of view of the I/O module.

For input, the I/O module receives a READ command from the processor.

The I/O module then proceeds to read data in form an associated peripheral.

Once the data are in the module’s data register, the module signals an interrupt to the processor over a control line.

The module then waits until its data are requested by the processor.

When the request is made, the module places its data on the data bus and is then ready from another operation.

From the processor’s point of view, the action for input is as follows.

The processor issues a READ command. It then saves the context of the current program and goes off and does something else. At the end of each instruction cycle, the processor checks for interrupt.

When the interrupt from the I/O module occurs, the processor save the context of the program it is currently executing and begins to execute an interrupt-handling program that processes the interrupts.

The interrupt driven I/O is more efficient than programmed I/O because it eliminates needless waiting. However interrupt driven still consumes a lot of processor time, because every word of data that goes from memory to I/O module or from I/O module to memory must pass through the processor.

There are multiple I/O modules in a computer system, so mechanisms are needed to determine which device caused the interrupt and to decide, in case of multiple interrupts, which one to handle first.

Direct Memory Access (DMA)

Interrupt driven I/O, though more efficient then simple programmed I/O, still requires the active intervention of the processor to transfer data between memory and I/O module, and any data transfer must traverse a path through the processor.

Thus, both of these forms of I/O suffer from two drawbacks:

1. The I/O transfer rate is limited by the speed with which the processor can test and service the device.

2. The processor is tied up in managing an I/O transfers; a number of instructions must be executed for each I/O transfer.

When large volumes of data are to be moved, a more efficient technique is required: Direct Memory Access (DMA).

DMA function can be performed by a separate module on the system bus or it can be incorporated into an I/O module.

When the processor wishes to read or write a block of data, it issues a command to the DMA module by sending following details to DMA module:

• Whether a read or write is required.

• The address of the I/O device involved.

• The starting location in memory to read the data from or write data to.

• The number or words to be read or written.

The processor then continues with other network. It has delegated this I/O operation to the DMA module, and that module will take care of it.

The DMA module transfer the entire block of data or one word at a time directly to or from memory without going through the processor.

When the transfer is completed the DMA module sends an interrupt to the processor. The processor is involved at the beginning and at the end of transfer.

The DMA module needs to take control of bus to transfer the data to and from memory. Because of this competition of bus usage, there may be time when the processor needs the bus and wait for the DMA module. But this is not an interrupt; the processor does not save the context and do something else.

Sunday, August 28, 2016

Mutual Exclusion - Hardware Support, Interrupt Disabling, Special Machine Instructions, Properties of the Machine-Instruction Approach

Interrupt Disabling

In a uniprocessor system, concurrent processes cannot have overlapped execution;

they can only be interleaved. Furthermore, a process will continue to run until it invokes

an OS service or until it is interrupted. Therefore, to guarantee mutual exclusion,

it is sufficient to prevent a process from being interrupted. This capability can

be provided in the form of primitives defined by the OS kernel for disabling and enabling

interrupts. A process can then enforce mutual exclusion in the following way

while (true) {
/* disable interrupts */;
/* critical section */;
/* enable interrupts */;
/* remainder */;
}

Because the critical section cannot be interrupted, mutual exclusion is guaranteed.
The price of this approach, however, is high. The efficiency of execution could
be noticeably degraded because the processor is limited in its ability to interleave
processes. A second problem is that this approach will not work in a multiprocessor
architecture. When the computer includes more than one processor, it is possible
for more than one process to be executing at a time. In this case, disabled
interrupts do not guarantee mutual exclusion.

Special Machine Instructions

In a multiprocessor configuration, several processors share access to a common
main memory. In this case, there is not a master/slave relationship; rather the processors
behave independently in a peer relationship. There is no interrupt mechanism
between processors on which mutual exclusion can be based.

At the hardware level, as was mentioned, access to a memory location excludes
any other access to that same location. With this as a foundation, processor
designers have proposed several machine instructions that carry out two actions
atomically, such as reading and writing or reading and testing, of a single memory
location with one instruction fetch cycle. During execution of the instruction, access
to the memory location is blocked for any other instruction referencing that
location.

Compare&Swap Instruction The compare&swap instruction, also called a
compare and exchange instruction

int compare_and_swap (int *word, int testval, int newval)
{
int oldval;
oldval = *word
if (oldval == testval) *word = newval;
return oldval;
}

This version of the instruction checks a memory location (*word) against a
test value (testval). If the memory location’s current value is testval, it is replaced
with newval; otherwise it is left unchanged. The old memory value is always returned;
thus, the memory location has been updated if the returned value is the
same as the test value. This atomic instruction therefore has two parts: A compare is
made between a memory value and a test value; if the values differ a swap occurs.
The entire compare&swap function is carried out atomically; that is, it is not subject
to interruption.

Another version of this instruction returns a Boolean value: true if the swap
occurred; false otherwise. Some version of this instruction is available on nearly all
processor families, and most operating systems use this

instruction for support of concurrency.

Mutual Exclusion - Hardware Support, Interrupt Disabling, Special Machine Instructions, Properties of the Machine-Instruction Approach

Properties of the Machine-Instruction Approach

mutual exclusion has a number of Advantages

It is applicable to any number of processes on either a single processor or multiple
processors sharing main memory.

It is simple and therefore easy to verify.

It can be used to support multiple critical sections; each critical section can be
defined by its own variable.

Disadvantages

Busy waiting is employed. Thus, while a process is waiting for access to a critical
section, it continues to consume processor time.

Starvation is possible. When a process leaves a critical section and more than
one process is waiting, the selection of a waiting process is arbitrary. Thus,
some process could indefinitely be denied access

Deadlock is possible. Consider the following scenario on a single-processor
system. Process P1 executes the special instruction and enters its critical section.
P1 is then interrupted to give the processor to P2, which has higher priority.
If P2 now attempts to use the same resource as P1, it will be denied access because of
the mutual exclusion mechanism. Thus it will go into a busy waiting loop. However,

P1 will never be dispatched because it is of lower priority than another ready process, P2.

Thursday, August 18, 2016

Principles of Concurrency, Race Condition, Operating System Concerns, Process Interaction, Competition among Processes for Resources, Cooperation among Processes by Sharing, Cooperation among Processes by Communication, Requirements for Mutual Exclusion

Principles of Concurrency, Race Condition, Operating System Concerns, Process Interaction, Competition among Processes for Resources, Cooperation among Processes by Sharing, Cooperation among Processes by Communication, Requirements for Mutual Exclusion

In a single-processor multiprogramming system, processes are interleaved in time to yield the appearance of simultaneous execution. Even though actual parallel processing is not achieved, and even though there is a certain amount of overhead involved in switching back and forth between processes, interleaved execution provides major benefits in processing efficiency and in program structuring. In a multiple-processor system, it is possible not only to interleave the execution of multiple processes but also to overlap them

At first glance, it may seem that interleaving and overlapping represent fundamentally different modes of execution and present different problems. In fact, both techniques can be viewed as examples of concurrent processing, and both present the same problems. In the case of a uniprocessor, the problems stem from a basic characteristic of multiprogramming systems: The relative speed of execution of processes cannot be predicted. It depends on the activities of other processes, the way in which the OS handles interrupts, and the scheduling policies of the OS. The following difficulties arise:

1. The sharing of global resources is fraught with peril. For example, if two processes both make use of the same global variable and both perform reads and writes on that variable, then the order in which the various reads and writes are executed is critical.An example of this problem is shown in the following subsection.

2. It is difficult for the OS to manage the allocation of resources optimally. For example, process A may request use of, and be granted control of, a particular I/O channel and then be suspended before using that channel. It may be undesirable for the OS simply to lock the channel and prevent its use by other processes; indeed this may lead to a deadlock condition.

3. It becomes very difficult to locate a programming error because results are typically not deterministic and reproducible

All of the foregoing difficulties present themselves in a multiprocessor system as well, because here too the relative speed of execution of processes is unpredictable. A multiprocessor system must also deal with problems arising from the simultaneous execution of multiple processes. Fundamentally, however, the problems are the same as those for uniprocessor systems. This should become clear as the discussion proceeds.

A Simple Example

void echo()
{
chin = getchar();
chout = chin;
putchar(chout);
}

This procedure shows the essential elements of a program that will provide a character echo procedure; input is obtained from a keyboard one keystroke at a time. Each input character is stored in variable chin. It is then transferred to variable chout and sent to the display. Any program can call this procedure repeatedly to accept user input and display it on the user’s screen.

Now consider that we have a single-processor multiprogramming system supporting a single user. The user can jump from one application to another, and each application uses the same keyboard for input and the same screen for output. Because each application needs to use the procedure echo, it makes sense for it to be a shared procedure that is loaded into a portion of memory global to all applications. Thus, only a single copy of the echo procedure is used, saving space.

The sharing of main memory among processes is useful to permit efficient and close interaction among processes. However, such sharing can lead to problems.

1. Process P1 invokes the echo procedure and is interrupted immediately after getchar returns its value and stores it in chin. At this point, the most recently entered character, x, is stored in variable chin.

2. Process P2 is activated and invokes the echo procedure, which runs to conclusion, inputting and then displaying a single character, y, on the screen.

3. Process P1 is resumed. By this time, the value x has been overwritten in chin and therefore lost. Instead, chin contains y, which is transferred to chout and displayed.

Thus, the first character is lost and the second character is displayed twice. The essence of this problem is the shared global variable, chin. Multiple processes have access to this variable. If one process updates the global variable and then is interrupted, another process may alter the variable before the first process can use its value. Suppose, however, that we permit only one process at a time to be in that procedure.

1. Process P1 invokes the echo procedure and is interrupted immediately after the conclusion of the input function. At this point, the most recently entered character, x, is stored in variable chin.

2. Process P2 is activated and invokes the echo procedure. However, because P1 is still inside the echo procedure, although currently suspended, P2 is blocked from entering the procedure. Therefore, P2 is suspended awaiting the availability of the echo procedure.

3. At some later time, process P1 is resumed and completes execution of echo.The proper character, x, is displayed.

4. When P1 exits echo, this removes the block on P2. When P2 is later resumed, the echo procedure is successfully invoked.

This example shows that it is necessary to protect shared global variables (and other shared global resources) and that the only way to do that is to control the code that accesses the variable. If we impose the discipline that only one process at a time may enter echo and that once in echo the procedure must run to completion before it is available for another process, then the type of error just discussed will not occur.

This problem was stated with the assumption that there was a single-processor, multiprogramming OS.The example demonstrates that the problems of concurrency occur even when there is a single processor. In a multiprocessor system, the same problems of protected shared resources arise, and the same solution works. First, suppose that there is no mechanism for controlling access to the shared global variable:

1. Processes P1 and P2 are both executing, each on a separate processor. Both processes invoke the echo procedure.

2. The following events occur; events on the same line take place in parallel:

Process P1 Process P2
• •
chin = getchar(); •
• chin = getchar();
chout = chin; chout = chin;
putchar(chout); •
• putchar(chout);
• •

The result is that the character input to P1 is lost before being displayed, and the character input to P2 is displayed by both P1 and P2. Again, let us add the capability of enforcing the discipline that only one process at a time may be in echo.

1. Processes P1 and P2 are both executing, each on a separate processor. P1 invokes the echo procedure.

2. While P1 is inside the echo procedure, P2 invokes echo. Because P1 is still inside the echo procedure (whether P1 is suspended or executing), P2 is blocked from entering the procedure. Therefore, P2 is suspended awaiting the availability of the echo procedure.

3. At a later time, process P1 completes execution of echo, exits that procedure, and continues executing. Immediately upon the exit of P1 from echo, P2 is resumed and begins executing echo.

In the case of a uniprocessor system, the reason we have a problem is that an interrupt can stop instruction execution anywhere in a process. In the case of a multiprocessor system, we have that same condition and, in addition, a problem can be caused because two processes may be executing simultaneously and both trying to access the same global variable. However, the solution to both types of problem is the same: control access to the shared resource.

Race Condition

A race condition occurs when multiple processes or threads read and write data items so that the final result depends on the order of execution of instructions in the multiple processes. Let us consider two simple examples.

As a first example, suppose that two processes, P1 and P2, share the global variable a. At some point in its execution, P1 updates a to the value 1, and at some point in its execution, P2 updates a to the value 2.Thus, the two tasks are in a race to write variable a. In this example the “loser” of the race (the process that updates last) determines the final value of a.

For our second example, consider two process, P3 and P4, that share global variables b and c, with initial values b = 1 and c = 2. At some point in its execution, P3 executes the assignment b = b + c, and at some point in its execution, P4 executes the assignment c = b + c. Note that the two processes update different variables. However, the final values of the two variables depend on the order in which the two processes execute these two assignments. If P3 executes its assignment statement first, then the final values are b = 3 and c = 5. If P4 executes its assignment statement first, then the final values are b = 4 and c = 3.

Operating System Concerns

What design and management issues are raised by the existence of concurrency?

1. The OS must be able to keep track of the various processes.

2. The OS must allocate and deallocate various resources for each active process. At times, multiple processes want access to the same resource.These resources include

Processor time
Memory
Files
I/O devices

3. The OS must protect the data and physical resources of each process against unintended interference by other processes. This involves techniques that relate to memory, files, and I/O devices.

4. The functioning of a process, and the output it produces, must be independent of the speed at which its execution is carried out relative to the speed of other concurrent processes.

Process Interaction

• Processes unaware of each other: These are independent processes that are not intended to work together. The best example of this situation is the multiprogramming of multiple independent processes. These can either be batch jobs or interactive sessions or a mixture. Although the processes are not working together, the OS needs to be concerned about competition for resources. For example, two independent applications may both want to access the same disk or file or printer. The OS must regulate these accesses.

• Processes indirectly aware of each other: These are processes that are not necessarily aware of each other by their respective process IDs but that share access to some object, such as an I/O buffer. Such processes exhibit cooperation in sharing the common object.

• Processes directly aware of each other: These are processes that are able to communicate with each other by process ID and that are designed to work jointly on some activity. Again, such processes exhibit cooperation.

Competition among Processes for Resources

Concurrent processes come into conflict with each other when they are competing for the use of the same resource. In its pure form, we can describe the situation as follows. Two or more processes need to access a resource during the course of their execution. Each process is unaware of the existence of other processes, and each is to be unaffected by the execution of the other processes. It follows from this that each process should leave the state of any resource that it uses unaffected. Examples of resources include I/O devices, memory, processor time, and the clock.

There is no exchange of information between the competing processes. However, the execution of one process may affect the behavior of competing processes. In particular, if two processes both wish access to a single resource, then one process will be allocated that resource by the OS, and the other will have to wait. Therefore, the process that is denied access will be slowed down. In an extreme case, the blocked process may never get access to the resource and hence will never terminate successfully.

In the case of competing processes three control problems must be faced. First is the need for mutual exclusion. Suppose two or more processes require access to a single nonsharable resource, such as a printer. During the course of execution, each process will be sending commands to the I/O device, receiving status information, sending data, and/or receiving data. We will refer to such a resource as a critical resource, and the portion of the program that uses it a critical section of the program. It is important that only one program at a time be allowed in its critical section. We cannot simply rely on the OS to understand and enforce this restriction because the detailed requirements may not be obvious. In the case of the printer, for example, we want any individual process to have control of the printer while it prints an entire file. Otherwise, lines from competing processes will be interleaved.

/* PROCESS 1 */

void P1

{

while (true) {

/* preceding code /;

entercritical (Ra);

/* critical section */;

exitcritical (Ra);

/* following code */;

}

/* PROCESS 2 */

void P2

{

while (true) {

/* preceding code */;

entercritical (Ra);

/* critical section */;

exitcritical (Ra);

/* following code */;

}

..........

/* PROCESS n */
void Pn
{
while (true) {
/* preceding code */;

entercritical (Ra);

/* critical section */;

exitcritical (Ra);

/* following code */;
}
}

The enforcement of mutual exclusion creates two additional control problems. One is that of deadlock. For example, consider two processes, P1 and P2, and two resources, R1 and R2. Suppose that each process needs access to both resources to perform part of its function. Then it is possible to have the following situation: the OS assigns R1 to P2, and R2 to P1. Each process is waiting for one of the two resources. Neither will release the resource that it already owns until it has acquired the other resource and performed the function requiring both resources. The two processes are deadlocked.

A final control problem is starvation. Suppose that three processes (P1, P2, P3) each require periodic access to resource R. Consider the situation in which P1 is in possession of the resource, and both P2 and P3 are delayed, waiting for that resource. When P1 exits its critical section, either P2 or P3 should be allowed access to R. Assume that the OS grants access to P3 and that P1 again requires access before P3 completes its critical section. If the OS grants access to P1 after P3 has finished, and subsequently alternately grants access to P1 and P3, then P2 may indefinitely be denied access to the resource, even though there is no deadlock situation.

Control of competition inevitably involves the OS because it is the OS that allocates resources. In addition, the processes themselves will need to be able to express the requirement for mutual exclusion in some fashion, such as locking a resource prior to its use. Any solution will involve some support from the OS, such as the provision of the locking facility. The mutual exclusion mechanism in abstract terms. There are n processes to be executed concurrently. Each process includes (1) a critical section that operates on some resource Ra, and (2) additional code preceding and following the critical section that does not involve access to Ra. Because all processes access the same resource Ra, it is desired that only one process at a time be in its critical section. To enforce mutual exclusion, two functions are provided: entercritical and exitcritical. Each function takes as an argument the name of the resource that is the subject of competition. Any process that attempts to enter its critical section while another process is in its critical section, for the same resource, is made to wait.

It remains to examine specific mechanisms for providing the functions entercritical and exitcritical. For the moment, we defer this issue while we consider the other cases of process interaction.

Cooperation among Processes by Sharing

The case of cooperation by sharing covers processes that interact with other processes without being explicitly aware of them. For example, multiple processes may have access to shared variables or to shared files or databases. Processes may use and update the shared data without reference to other processes but know that other processes may have access to the same data. Thus the processes must cooperate to ensure that the data they share are properly managed. The control mechanisms must ensure the integrity of the shared data.

Because data are held on resources (devices, memory), the control problems of mutual exclusion, deadlock, and starvation are again present. The only difference is that data items may be accessed in two different modes, reading and writing, and only writing operations must be mutually exclusive.

However, over and above these problems, a new requirement is introduced: that of data coherence. As a simple example, consider a bookkeeping application in which various data items may be updated. Suppose two items of data a and b are to be maintained in the relationship a b.That is, any program that updates one value must also update the other to maintain the relationship.

P1:
a = a + 1;
b = b + 1;
P2:
b = 2 * b;
a = 2 * a;

If the state is initially consistent, each process taken separately will leave the shared data in a consistent state.

a = a + 1;
b = 2 * b;
b = b + 1;
a = 2 * a;

At the end of this execution sequence, the condition a b no longer holds. For example, if we start with a = b = 1, at the end of this execution sequence we have a 4 and b 3. The problem can be avoided by declaring the entire sequence in each process to be a critical section.

Thus we see that the concept of critical section is important in the case of cooperation by sharing. The same abstract functions of entercritical and exitcritical can be used here. In this case, the argument for the functions could be a variable, a file, or any other shared object. Furthermore, if critical sections are used to provide data integrity, then there may be no specific resource or variable that can be identified as an argument. In that case, we can think of the argument as being an identifier that is shared among concurrent processes to identify critical sections that must be mutually exclusive.

Cooperation among Processes by Communication

In the first two cases that we have discussed, each process has its own isolated environment that does not include the other processes. The interactions among processes are indirect. In both cases, there is a sharing. In the case of competition, they are sharing resources without being aware of the other processes. In the second case, they are sharing values, and although each process is not explicitly aware of the other processes, it is aware of the need to maintain data integrity. When processes cooperate by communication, however, the various processes participate in a common effort that links all of the processes. The communication provides a way to synchronize, or coordinate, the various activities.

Typically, communication can be characterized as consisting of messages of some sort. Primitives for sending and receiving messages may be provided as part of the programming language or provided by the OS kernel.

Because nothing is shared between processes in the act of passing messages, mutual exclusion is not a control requirement for this sort of cooperation. However, the problems of deadlock and starvation are still present. As an example of deadlock, two processes may be blocked, each waiting for a communication from the other. As an example of starvation, consider three processes, P1, P2, and P3, that exhibit the following behavior. P1 is repeatedly attempting to communicate with either P2 or P3, and P2 and P3 are both attempting to communicate with P1. A sequence could arise in which P1 and P2 exchange information repeatedly, while P3 is blocked waiting for a communication from P1. There is no deadlock, because P1 remains active, but P3 is starved.

Requirements for Mutual Exclusion

1. Mutual exclusion must be enforced: Only one process at a time is allowed into its critical section, among all processes that have critical sections for the same resource or shared object.

2. A process that halts in its noncritical section must do so without interfering with other processes.

3. It must not be possible for a process requiring access to a critical section to be delayed indefinitely: no deadlock or starvation.

4. When no process is in a critical section, any process that requests entry to its critical section must be permitted to enter without delay.

5. No assumptions are made about relative process speeds or number of processors.

6. A process remains inside its critical section for a finite time only.

There are a number of ways in which the requirements for mutual exclusion can be satisfied. One way is to leave the responsibility with the processes that wish to execute concurrently.Thus processes, whether they are system programs or application programs, would be required to coordinate with one another to enforce mutual exclusion, with no support from the programming language or the OS. We can refer to these as software approaches. Although this approach is prone to high processing overhead and bugs, it is nevertheless useful to examine such approaches to gain a better understanding of the complexity of concurrent processing.

Saturday, August 13, 2016

Microkernels, Microkernel Architecture, Benefits of a Microkernel Organization, Microkernel Performance, Microkernel Design, Low-Level Memory Management, Interprocess Communication, I/O and Interrupt Management

A microkernel is a small OS core that provides the foundation for modular extensions. The term is somewhat fuzzy, however, and there are a number of questions about microkernels that are answered differently by different OS design teams. These questions include how small a kernel must be to qualify as a microkernel, how to design device drivers to get the best performance while abstracting their functions from the hardware, whether to run nonkernel operations in kernel or user space, and whether to keep existing subsystem code or start from scratch.

The microkernel approach was popularized by its use in the Mach OS, which is now the core of the Macintosh Mac OS X operating system. In theory, this approach provides a high degree of flexibility and modularity. A number of products now boast microkernel implementations, and this general design approach is likely to be seen in most of the personal computer, workstation, and server operating systems developed in the near future.

Microkernel Architecture

Operating systems developed in the mid to late 1950s were designed with little concern about structure. No one had experience in building truly large software systems, and the problems caused by mutual dependence and interaction were grossly underestimated. In these monolithic operating systems, virtually any procedure can call any other procedure. Such lack of structure was unsustainable as operating systems grew to massive proportions. For example, the first version of OS/360 contained over a million lines of code; Multics, developed later, grew to 20 million lines of code [DENN84]. modular programming techniques were needed to handle this scale of software development. Specifically, layered operating systems were developed in which functions are organized hierarchically and interaction only takes place between adjacent layers. With the layered approach, most or all of the layers execute in kernel mode.

Kernel Architecture

Problems remain even with the layered approach. Each layer possesses considerable functionality. Major changes in one layer can have numerous effects, many difficult to trace, on code in adjacent layers. As a result, it is difficult to implement tailored versions of a base OS with a few functions added or subtracted. And security is difficult to build in because of the many interactions between adjacent layers.

The philosophy underlying the microkernel is that only absolutely essential core OS functions should be in the kernel. Less essential services and applications are built on the microkernel and execute in user mode. Although the dividing line between what is in and what is outside the microkernel varies from one design to the next, the common characteristic is that many services that traditionally have been part of the OS are now external subsystems that interact with the kernel and with each other; these include device drivers, file systems, virtual memory manager, windowing system, and security services.

A microkernel architecture replaces the traditional vertical, layered stratification of an OS with a horizontal one. OS components external to the microkernel are implemented as server processes; these interact with each other on a peer basis, typically by means of messages passed through the microkernel. Thus, the microkernel functions as a message exchange: It validates messages, passes them between components, and grants access to hardware. The microkernel also performs a protection function; it prevents message passing unless exchange is allowed.

For example, if an application wishes to open a file, it sends a message to the file system server. If it wishes to create a process or thread, it sends a message to the process server. Each of the servers can send messages to other servers and can invoke the primitive functions in the microkernel. This is a client/server architecture within a single computer.

Benefits of a Microkernel Organization

A number of advantages for the use of microkernels have been reported in the literature.

• Uniform interfaces
• Extensibility
• Flexibility
• Portability
• Reliability
• Distributed system support
• Support for object-oriented operating systems (OOOSS)

Microkernel design imposes a uniform interface on requests made by a process. Processes need not distinguish between kernel-level and user-level services because all such services are provided by means of message passing.

Any OS will inevitably need to acquire features not in its current design, as new hardware devices and new software techniques are developed.The microkernel architecture facilitates extensibility, allowing the addition of new services as well as the provision of multiple services in the same functional area. For example, there may be multiple file organizations for diskettes; each organization can be implemented as a user-level process rather than having multiple file services available in the kernel. Thus, users can choose from a variety of services the one that provides the best fit to the user’s needs. With the microkernel architecture, when a new feature is added, only selected servers need to be modified or added.The impact of new or modified servers is restricted to a subset of the system. Further, modifications do not require building a new kernel.

Related to the extensibility of the microkernel architecture is its flexibility. Not only can new features be added to the OS, but also existing features can be subtracted to produce a smaller, more efficient implementation. A microkernel-based OS is not necessarily a small system. Indeed, the structure lends itself to adding a wide range of features. But not everyone needs, for example, a high level of security or the ability to do distributed computing. If substantial features are made optional, the base product will appeal to a wider variety of users.

Intel’s near monopoly of many segments of the computer platform market is unlikely to be sustained indefinitely. Thus, portability becomes an attractive feature of an OS. In the microkernel architecture, all or at least much of the processorspecific code is in the microkernel.Thus, changes needed to port the system to a new processor are fewer and tend to be arranged in logical groupings.

The larger the size of a software product, the more difficult it is to ensure its reliability.Although modular design helps to enhance reliability, even greater gains can be achieved with a microkernel architecture. A small microkernel can be rigorously tested. Its use of a small number of application programming interfaces (APIs) improves the chance of producing quality code for the OS services outside the kernel. The system programmer has a limited number of APIs to master and limited means of interacting with and therefore adversely affecting other system components.

The microkernel lends itself to distributed system support, including clusters controlled by a distributed OS. When a message is sent from a client to a server process, the message must include an identifier of the requested service. If a distributed system is configured so that all processes and services have unique identifiers, then in effect there is a single system image at the microkernel level. A process can send a message without knowing on which computer the target service resides.We return to this point in our discussion of distributed systems in Part Six.

A microkernel architecture works well in the context of an object-oriented operating system. An object-oriented approach can lend discipline to the design of the microkernel and to the development of modular extensions to the OS. As a result, a number of microkernel design efforts are moving in the direction of object orientation. One promising approach to marrying the microkernel architecture with OOOS principles is the use of components. Components are objects with clearly defined interfaces that can be interconnected to form software in a building block fashion. All interaction between components uses the component interface. Other systems, such as Windows, do not rely exclusively or fully on object-oriented methods but have incorporated object-oriented principles into the microkernel design.

Microkernel Performance

A potential disadvantage of microkernels that is often cited is that of performance. It takes longer to build and send a message via the microkernel, and accept and decode the reply, than to make a single service call. However, other factors come into play so that it is difficult to generalize about the performance penalty, if any.

Much depends on the size and functionality of the microkernel. summarizes a number of studies that reveal a substantial performance penalty for what might be called first-generation microkernels. These penalties persisted despite efforts to optimize the microkernel code. One response to this problem was to enlarge the microkernel by reintegrating critical servers and drivers back into the OS. Prime examples of this approach are Mach and Chorus. Selectively increasing the functionality of the microkernel reduces the number of user-kernel mode switches and the number of address-space process switches. However, this workaround reduces the performance penalty at the expense of the strengths of microkernel design: minimal interfaces, flexibility, and so on.

Another approach is to make the microkernel not larger but smaller. argues that, properly designed, a very small microkernel eliminates the performance penalty and improves flexibility and reliability. To give an idea of the sizes involved, a typical first-generation microkernel consists of 300 Kbytes of code and 140 system call interfaces. An example of a small second-generation microkernel is L4 which consists of 12 Kbytes of code and 7 system calls. Experience with these systems indicates that they can perform as well or better than a layered OS such as UNIX.

Microkernel Design

Because different microkernels exhibit a range of functionality and size, no hardand-fast rules can be stated concerning what functions are provided by the microkernel and what structure is implemented. In this section, we present a minimal set of microkernel functions and services, to give a feel for microkernel design.

The microkernel must include those functions that depend directly on the hardware and those functions needed to support the servers and applications operating in user mode. These functions fall into the general categories of low-level memory management, interprocess communication (IPC), and I/O and interrupt management.

Low-Level Memory Management

The microkernel has to control the hardware concept of address space to make it possible to implement protection at the process level. As long as the microkernel is responsible for mapping each virtual page to a physical frame, the bulk of memory management, including the protection of the address space of one process from another and the page replacement algorithm and other paging logic, can be implemented outside the kernel. For example, a virtual memory module outside the microkernel decides when to bring a page into memory and which page already in memory is to be replaced; the microkernel maps these page references into a physical address in main memory.

The concept that paging and virtual memory management can be performed external to the kernel was introduced with Mach’s external pager. the operation of an external pager. When a thread in the application references a page not in main memory, a page fault occurs and execution traps to the kernel. The kernel then sends a message to the pager process indicating which page has been referenced.The pager can decide to load that page and allocate a page frame for that purpose. The pager and the kernel must interact to map the pager’s logical operations onto physical memory. Once the page is available, the pager sends a resume message to the application.

This technique enables a nonkernel process to map files and databases into user address spaces without invoking the kernel. Application-specific memory sharing policies can be implemented outside the kernel.

a set of just three microkernel operations that can support external paging and virtual memory management

• Grant: The owner of an address space can grant a number of its pages to another process. The kernel removes these pages from the grantor’s address space and assigns them to the designated process.

• Map: A process can map any of its pages into the address space of another process, so that both processes have access to the pages. This creates shared memory between the two processes. The kernel maintains the assignment of these pages to the original owner but provides a mapping to permit access by other processes.

Page Fault Processing

• Flush: A process can reclaim any pages that were granted or mapped to other processes.

To begin, the kernel allocates all available physical memory as resources to a base system process. As new processes are created, pages from the original total address space can be granted or mapped to the new process. Such a scheme could support multiple virtual memory schemes simultaneously.

Interprocess Communication

The basic form of communication between processes or threads in a microkernel OS is messages. A message includes a header that identifies the sending and receiving process and a body that contains direct data, a pointer to a block of data, or some control information about the process. Typically, we can think of IPC as being based on ports associated with processes. A port is, in essence, a queue of messages destined for a particular process; a process may have multiple ports. Associated with the port is a list of capabilities indicating what processes may communicate with this process. Port identities and capabilities are maintained by the kernel. A process can grant new access to itself by sending a message to the kernel indicating the new port capability.

A note about message passing is appropriate here. Message passing between separate processes with nonoverlapping address spaces involves memory-to-memory copying and thus is bounded by memory speeds and does not scale with processor speeds.Thus, current OS research reflects an interest in thread-based IPC and memorysharing schemes such as page remapping.

I/O and Interrupt Management

With a microkernel architecture, it is possible to handle hardware interrupts as messages and to include I/O ports in address spaces. The microkernel can recognize interrupts but does not handle them. Rather, it generates a message for the user-level process currently associated with that interrupt.Thus, when an interrupt is enabled, a particular user-level process is assigned to the interrupt and the kernel maintains the mapping. Transforming interrupts into messages must be done by the microkernel, but the microkernel is not involved in device-specific interrupt handling.

viewing hardware as a set of threads that have unique thread identifiers and send messages to associated software threads in user space.A receiving thread determines whether the message comes from an interrupt and determines the specific interrupt. The general structure of such user-level code is the following:

driver thread:
do
waitFor (msg, sender);
if (sender == my_hardware_interrupt) {
read/write I/O ports;
reset hardware interrupt;
}
else • • •;
while (true);

Friday, August 12, 2016

Symmetric Multiprocessing (SMP) in os

Traditionally, the computer has been viewed as a sequential machine. Most computer programming languages require the programmer to specify algorithms as sequences of instructions. A processor executes programs by executing machine instructions in sequence and one at a time. Each instruction is executed in a sequence of operations (fetch instruction, fetch operands, perform operation, store results).

This view of the computer has never been entirely true. At the microoperation level, multiple control signals are generated at the same time. Instruction pipelining, at least to the extent of overlapping fetch and execute operations, has been around for a long time. Both of these are examples of performing functions in parallel.

As computer technology has evolved and as the cost of computer hardware has dropped, computer designers have sought more and more opportunities for parallelism, usually to improve performance and, in some cases, to improve reliability. In this book, we examine the two most popular approaches to providing parallelism by replicating processors: symmetric multiprocessors (SMPs) and clusters.

SMP Architecture

It is useful to see where SMP architectures fit into the overall category of parallel processors. A taxonomy that highlights parallel processor systems first introduced by Flynn [FLYN72] is still the most common way of categorizing such systems. Flynn proposed the following categories of computer systems:

Single instruction single data (SISD) stream: A single processor executes a single instruction stream to operate on data stored in a single memory.

Single instruction multiple data (SIMD) stream: A single machine instruction controls the simultaneous execution of a number of processing elements on a lockstep basis. Each processing element has an associated data memory, so that each instruction is executed on a different set of data by the different processors. Vector and array processors fall into this category.

Multiple instruction single data (MISD) stream: A sequence of data is transmitted to a set of processors, each of which executes a different instruction sequence. This structure has never been implemented.

Multiple instruction multiple data (MIMD) stream: A set of processors simultaneously execute different instruction sequences on different data sets

With the MIMD organization, the processors are general purpose, because they must be able to process all of the instructions necessary to perform the appropriate data transformation. MIMDs can be further subdivided by the means in which the processors communicate. If the processors each have a dedicated memory, then each processing element is a self-contained computer. Communication among the computers is either via fixed paths or via some network facility. Such a system is known as a cluster, or multicomputer. If the processors share a common memory, then each processor accesses programs and data stored in the shared memory, and processors communicate with each other via that memory; such a system is known as a shared-memory multiprocessor.

One general classification of shared-memory multiprocessors is based on how processes are assigned to processors. The two fundamental approaches are master/ slave and symmetric. With a master/slave architecture, the OS kernel always runs on a particular processor. The other processors may only execute user programs and perhaps OS utilities. The master is responsible for scheduling processes or threads. Once a process/thread is active, if the slave needs service (e.g., an I/O call), it must send a request to the master and wait for the service to be performed. This approach is quite simple and requires little enhancement to a uniprocessor multiprogramming OS. Conflict resolution is simplified because one processor has control of all memory and I/O resources. The disadvantages of this approach are as follows:

A failure of the master brings down the whole system.
The master can become a performance bottleneck, because it alone must do all scheduling and process management.

Parallel Processor Architectures

In a symmetric multiprocessor (SMP), the kernel can execute on any processor, and typically each processor does self-scheduling from the pool of available processes or threads. The kernel can be constructed as multiple processes or multiple threads, allowing portions of the kernel to execute in parallel. The SMP approach complicates the OS. It must ensure that two processors do not choose the same process and that processes are not somehow lost from the queue. Techniques must be employed to resolve and synchronize claims to resources.

The design of both SMPs and clusters is complex, involving issues relating to physical organization, interconnection structures, interprocessor communication, OS design, and application software techniques. Our concern here, and later in our discussion of clusters, is primarily with OS design issues, although in both cases we touch briefly on organization.

SMP Organization

There are multiple processors, each of which contains its own control unit, arithmetic-logic unit, and registers. Each processor has access to a shared main memory and the I/O devices through some form of interconnection mechanism; a shared bus is a common facility. The processors can communicate with each other through memory (messages and status information left in shared address spaces). It may also be possible for processors to exchange signals directly. The memory is often organized so that multiple simultaneous accesses to separate blocks of memory are possible.

In modern computers, processors generally have at least one level of cache memory that is private to the processor. This use of cache introduces some new design considerations. Because each local cache contains an image of a portion of main memory, if a word is altered in one cache, it could conceivably invalidate a word in another cache. To prevent this, the other processors must be alerted that an update has taken place. This problem is known as the cache coherence problem and is typically addressed in hardware rather than by the OS.

Symmetric Multiprocessor Organization

Wednesday, June 15, 2016

Differentiate Hardware and Micro-programmed control unit

Hardware Control Unit:
-----------------------------
There are two major types of control organization: hardwired control and micro programmed control. In the hardwired organization, the control logic is implemented with gates, flip-flops, decoders, and other digital circuits. It has the advantage that it can be optimized to produce a fast mode of operation. In the micro programmed organization, the control information is stored in a control memory. The control memory is programmed to initiate the required sequence of micro operations. A hardwired control, as the name implies, requires changes in the wiring among the various components if the design has to be modified or changed. In the micro programmed control, any required changes or modifications can be done by updating the microprogram in control memory.

Micro Programed Control Unit:
---------------------------------------
The control memory is assumed to be a ROM, within which all control information is permanently stored. The control memory address register specifies the address of the microinstruction, and d the control data register holds the microinstruction read from memory.

The microinstruction contains a control word that specifies one or more micro-operations for the data processor. Once these operations are executed, the control must determine the next address. The location of the next microinstruction may be the one next in sequence, or it may be located somewhere else in the control memory. For this reason it is necessary to use some bits of the present microinstruction to control the generation of the address of the next microinstruction. The next address may also be a function of external input conditions. While the microoperations are being executed, the next address is computed in the next address generator circuit and then transferred into the control address register to read the next microinstruction. Thus a microinstruction contains bits for initiating microoperations in the data processor part and bits that determine the address sequence for the control memory.
The next address generator is sometimes called a microprogram sequencer, as it determines the address sequence that is read from control memory. The address of the next microinstruction can be specified in several ways, depending on the sequencer inputs. Typical functions of a microprogram sequencer are incrementing the control address register by one, loading into the control address register an address from control memory, transferring an external address, or loading an initial address to start the control operations.

The control data register holds the present microinstruction while the next address is computed and read from memory. The data register is some-times called a pipeline register. It allows the execution of the microoperations specified by the control word simultaneously with the generation of the next microinstruction. This configuration requires a two-phase clock, with one clock applied to the address register and the other to the data register.
The system can operate without the control data register by applying a single-phase clock to the address register. The control word and next-address information are taken directly from the control memory. It must be realized that a ROM operates as a combinational circuit, with the address value as the input and the corresponding word as the output. The content of the specified word in ROM remains in the output wires as long as its address value remains in the address register. No read signal is needed as in a random-access memory. Each clock pulse will execute the microoperations specified by the control word and also transfer a new address to the control address register. In the example that follows we assume a single-phase clock and therefore we do not use a control data register. In this way the address register is the only component in the control system that receives clock pulses. The other two components: the sequencer and the control memory are combinational circuits and do not need a clock.
The main advantage of the microprogrammed control is the fact that once the hardware configuration is established, there should be no need for further hardware or wiring changes. If we want to establish a different control sequence for the system, all we need to do is specify a different set of microinstructions for control memory. The hardware configuration should not be changed for different operations; the only thing that must be changed is the microprogram residing in control memory.

Friday, June 10, 2016

Dekker's Algorithm (Mutual Exclusion)

This is the first correct solution proposed for the two-process case. Originally developed by Dekker in a different context, it was applied to the critical section problem by Dijkstra.
CONCEPT: Both the turn variable and the status flags are combined in a careful way. The entry protocol begins as in Algorithm 3; we (the requesting process) set our flag and then check our neighbor's flag. If that flag is also set, the turn variable is used. There is no progress problem now because we know the other process is in its CS or entry protocol. If the turn is ours we wait for the flag of the other process to clear. No process will wait indefinitely with its flag set. If the turn belongs to the other process we wait, but we clear our flag before waiting to avoid blocking the other process. When the turn is given to us we reset our flag and proceed.

INITIALIZATION:

typedef char boolean;
...
shared boolean flags[n -1];
shared int turn;
...
turn = i ;
...
flags[i ] = FREE;
...
flags[j ] = FREE;
...
ENTRY PROTOCOL (for Process i ):
/* claim the resource */
flags[i ] = BUSY;

/* wait if the other process is using the resource */
while (flags[j ] == BUSY) {

/* if waiting for the resource, also wait our turn */
if (turn != i ) {

/* but release the resource while waiting */
flags[i ] = FREE;
while (turn != i ) {
}
flags[i ] = BUSY;
}

}
EXIT PROTOCOL (for Process i ):
/* pass the turn on, and release the resource */
turn = j ;
flags[i ] = FREE;

Google Search