OS By Example (OBE) is a collection of runnable examples that illustrate various operating system concepts. The collections is mainly developed to teach the students of RWTH Aachen University to get a deeper knowledge in the design of system software.

Introduction

TODO

In general, Andrew S. Tanenbaum et al. [5], Abraham Silberschatz et al. [3], and William Stalling [4] wrote excellence books with a broad view on system software and operating systems.

Nearly all examples are hackable, feel free to edit and to play around. With the compile&run button below the source code you are able to test the example.

Used programming languages

The collection of examples is mostly based on C, Rust, and Assembly. We assume that the basics of C are already known. For some parts, assembly is required because the examples use features which are only expressible on this level. These parts will be short and explained in detail. Rust is a relatively new system programming language which promises to be fast and memory safe. This collection of examples will not give a deep introduction into Rust but the examples are explained in detail. Consequently, it is not required to be a Rust expert to understand the examples. However, if you want to get a deeper introduction into the programming language Rust, I suggest the course from Lukas Kalbertodt.

Introduction into Rust (in German)

The seminar at Stanford University is more suitable for experienced programmers.

Stanford Seminar - The Rust Programming Language

Synchronization of processes and threads

Race condition or race hazard

Processes and threads have to share resources to sum up the result of a joint computation. The simplest kind of resource is a shared variable which serves as an example resource in this section. The challenges arising from shared resources are described by taking the example of the calculation of the number π. One possible way to compute the number π is to determine the integral from \$f(x) = 4 / (1+x^2)\$ between 0 and 1. Consequently, we have to determine the surface area below the graph f between 0 and 1. A technique to approximate the result is to draw rectangles below the graph `f and to sum the surface area of all rectangles. The accuracy of the result increases with a decreasing size of the rectangles.

Schematic representation of the algorithm to approximate π

Pi

Below, you find the sequential version of the π approximation. The variable nrecs defines the number of rectangles which will be drawn below the graph f. All rectangles have the same \$width = 1.0 / {nrecs}\$. To calculate the surface area of rectangle i, the height must be determined in the middle of the rectangle. In this case, \$(i + 0.5) * width\$ is inserted as \$x\$ in \$f(x) = 4 / (1+x^2)\$. All heights are then summed up and finally multiplied by the \$width\$ to get π as result.

Sequential version of the π approximation
#include <stdio.h> #include <stdlib.h> #include <sys/time.h> int main(int argc, char **argv) { double x, sum, width; int i, nrecs = 500000000; struct timeval start, end; if (argc > 1) nrecs = atoi(argv[1]); if (nrecs < 100) nrecs = 500000000; printf("nrecs = %d\n", (int)nrecs); gettimeofday(&start, NULL); sum = 0.0; width = 1.0 / (double)nrecs; for (i = 0; i < nrecs; i++) { x = (i + 0.5) * width; sum += 4.0 / (1.0 + x * x); } gettimeofday(&end, NULL); printf("Pi = %f\n", sum * width); printf("Time : %lf sec\n", (double) (end.tv_sec - start.tv_sec) + (double) (end.tv_usec - start.tv_usec)/1000000.0); return 0; }

The parallelization of the approximation is extremely simple. The calculation of the surface areas has to be distributed across multiple threads. For instance by using 2 threads, one thread determines the surface areas of all rectangles of the first half (\$0 ... {nrecs}/2-1\$), while the other thread determines the surface areas of the second half of rectangles (\${nrecs}/2\$ …​ nrecs).

Parallel (and faulty) version of the π approximation
#include <stdio.h> #include <stdlib.h> #include <pthread.h> #include <sys/time.h> #define MAX_THREADS 2 int nrecs = 500000000; double width, sum; typedef struct { int start, end; } thread_param; void *thread_func(void *arg) { thread_param *thr_arg = (thread_param *) arg; double x; int i; for (i = thr_arg->start; i < thr_arg->end; i++) { x = (i + 0.5) * width; sum += 4.0 / (1.0 + x * x); } return 0; } int main(int argc, char **argv) { int i; struct timeval start, end; pthread_t threads[MAX_THREADS]; thread_param thr_arg[MAX_THREADS]; if (argc > 1) nrecs = atoi(argv[1]); if (nrecs < 100) nrecs = 500000000; printf("nrecs = %d\n", (int)nrecs); gettimeofday(&start, NULL); sum = 0.0; width = 1.0 / (double)nrecs; /* Create MAX_THREADS worker threads. */ for (i = 0; i < MAX_THREADS; i++) { /* initialize arguments of the thread */ thr_arg[i].start = i * (nrecs / MAX_THREADS); thr_arg[i].end = (i + 1) * (nrecs / MAX_THREADS); pthread_create(&(threads[i]), NULL, thread_func, &(thr_arg[i])); } /* Wait until all threads have terminated */ for (i = 0; i < MAX_THREADS; i++) pthread_join(threads[i], NULL); gettimeofday(&end, NULL); printf("Pi = %f\n", sum * width); printf("Time : %lf sec\n", (double)(end.tv_sec - start.tv_sec) + (double)(end.tv_usec - start.tv_usec)/1000000.0); return 0; }

By testing the code it is easy to see that the solution proposal is faulty. The main problem is the global variable sum, which all threads use to add the partial results together. The following table visualizes the behavior of two threads. In this case, thread 2, who wrote the partial results \$44\$ to the memory (here the variable sum), wins and overwrites the results of the previous calculation. Consequently, the result is dependent on the timing of the threads. If thread 1 is, for some reason (e.g. more cache misses in comparsion to thread 2), delayed, its results \$46\$ will overwrite the result of thread 2. This behavior is called race condition or race hazard, where the output is dependent on the sequence or timing of other uncontrollable events.

Table 1. Example sequence of the π approximation

Time

Thread 1

Thread 2

\$t_0\$

read the value of \$x\$ (here \$0\$)

\$t_1\$

calculate high \$4 / (1 + 0 * 0) = 4.0\$

read the value of \$x\$ (here \$0\$)

\$t_2\$

read the value of sum (here \$42\$)

calculate high \$4 / (1 + 1 * 1) = 2.0\$

\$t_3\$

calculate \$42 + 4 = 46\$

read the value of sum (here \$42\$)

\$t_4\$

wrote \$46\$ to sum

calculate \$42 + 2 = 44\$

\$t_5\$

wrote \$44\$ to sum

In this case, the access to the global variable sum should be synchronized. Reading the value of sum, adding the partial results and writing back to memory should be an uninterruptible operation. A software mechanism to realize such an operation is called mutex (mutual exclusion). The Pthread library provides a mutex by the interface pthread_mutex_lock / pthread_mutex_unlock. The mutex is referenced by a variable of type pthread_mutex_t and locked by calling pthread_mutex_lock. If the mutex is already locked, the calling thread blocks until the mutex becomes available. In return, the pthread_mutex_unlock function releases the mutex and wake ups the blocked threads.

The following approach is a thread-safe version of the π approximation, where sum is protected by a mutex. However, the mutex interface induces overhead and serializes the parallel computation partly. The parallel part (calculation of \$f(x)\$) is in comparsion to the serial part (adding the partial result to sum) too small to get benefit from the parallel computation. The current approach is a valid and simple example to learn the behavior of a mutex. Later, we will show a faster approach for the π approximation.

Parallel version of the π approximation, global variable sum is protected by a mutex
#include <stdio.h> #include <stdlib.h> #include <pthread.h> #include <sys/time.h> #define MAX_THREADS 2 pthread_mutex_t cs_mutex = PTHREAD_MUTEX_INITIALIZER; int nrecs = 500000000; double width, sum; typedef struct { int start, end; } thread_param; void *thread_func(void *arg) { thread_param *thr_arg = (thread_param *) arg; double x; int i; for (i = thr_arg->start; i < thr_arg->end; i++) { x = (i + 0.5) * width; x = 4.0 / (1.0 + x * x); /* Enter the critical section */ pthread_mutex_lock(&cs_mutex); sum += x; /* Leave the critical section */ pthread_mutex_unlock(&cs_mutex); } return 0; } int main(int argc, char **argv) { int i; struct timeval start, end; pthread_t threads[MAX_THREADS]; thread_param thr_arg[MAX_THREADS]; if (argc > 1) nrecs = atoi(argv[1]); if (nrecs < 100) nrecs = 500000000; printf("nrecs = %d\n", (int)nrecs); fflush(stdout); gettimeofday(&start, NULL); sum = 0.0; width = 1.0 / (double)nrecs; /* Create MAX_THREADS worker threads. */ for (i = 0; i < MAX_THREADS; i++) { /* initialize arguments of the thread */ thr_arg[i].start = i * (nrecs / MAX_THREADS); thr_arg[i].end = (i + 1) * (nrecs / MAX_THREADS); pthread_create(&(threads[i]), NULL, thread_func, &(thr_arg[i])); } /* Wait until all threads have terminated */ for (i = 0; i < MAX_THREADS; i++) pthread_join(threads[i], NULL); gettimeofday(&end, NULL); printf("Pi = %f\n", sum * width); printf("Time : %lf sec\n", (double)(end.tv_sec - start.tv_sec) + (double)(end.tv_usec - start.tv_usec)/1000000.0); return 0; }

Synchronization primitives

In 1968, Edsger W. Dijkstra developed the scientific basis of mutexe [1]. Dijkstra was an early pioneer in computing science. He was a professor of mathematics at Eindhoven University of Technology (1962-1984) and held the Schlumberger Centennial Chair in Computer Sciences at University of Texas at Austin (1984-1999). Everyone who is able to read dutch should read his early notes. His complete archive is published at www.cs.utexas.edu/users/EWD/.

Dijkstra introduced semaphores as synchronization primitives as follows:

The semaphores are essentially non-negative integers; when only used to solve the mutual exclusion problem, the range of their values will even be restricted to "0" and "1". It is the merit of the Dutch physicist and computer designer Drs. C. S. Scholten to have demonstrated a considerable field of applicability for semaphores that can also take on larger values. When there is a need for distinction, we shall talk about "binary semaphores" and "general semaphores", respectively.

In addition, Dijkstra defines two indivisible operations on a semaphore, which he called P and V. P is an acronym for prolaag, which is Dutch for decreasing. The opposite operation is V, which is the acronym for verhoog and increases a semaphore. Per definition, is a semaphore a non-negative integer. Consequently, if the value is zero, the operation P blocks until the value becomes greater than zero. Per Brinch Hansen called these operations wait and signal [2], which describe the logical behavior between both operations better.

The pseudo code for a binary semaphore is listed below. In case of binary semaphores, only one execution thread is able to enter a region between indivisible operations P and V. However, to realize a semaphore, it is required to support indivisible operations at least for basic operations like P and V.

Pseudo code to realize a binary semaphore
void P(int* SEM)
{
	int test;

	do {
		test = (*SEM == 1);
		*SEM = 0;
	} while(!test);
}

void V(int* SEM)
{
	*SEM = 1;
}

void foo(void)
{
	int sem = 1;

	...
	P(&sem);

	/*
	 * only one execution thread is able to enter this code region
	 */

	V(&sem);
}

Most modern CPU architectures support hardware extensions to realize simple indivisible operations. For instance, the x86 processors introduce the instruction prefix lock. On multiprocessor (and also on singleprocessor) machines, the prefix guarantees that two concurrent processes using the same data are not able to modify it simultaneously. In combination with the instruction xchg, a binary semaphore could be realized as follow:

A binary semaphore with hardware support
void P(int* SEM)
{
	int test = 0;

	do {
		asm volatile ("xchgl %0, %1" : "=r"(test) : "m"(*SEM), "0"(test) : "memory");
	} while(!test);
}

void V(int* SEM)
{
	*SEM = 1;
}

This version works out-of-the-box in our π example (see source code below). However, this implementation of semaphores does not block. The function P spins actively to enter the region, which is protected by the semaphore. Consequently, the function P wastes time.

#include <stdio.h> #include <stdlib.h> #include <pthread.h> #include <sys/time.h> #include "mysem.h" #define MAX_THREADS 2 int cs_mutex = 1; int nrecs = 500000000; double width, sum; typedef struct { int start, end; } thread_param; void *thread_func(void *arg) { thread_param *thr_arg = (thread_param *) arg; double x; int i; for (i = thr_arg->start; i < thr_arg->end; i++) { x = (i + 0.5) * width; x = 4.0 / (1.0 + x * x); /* Enter the critical section */ P(&cs_mutex); sum += x; /* Leave the critical section */ V(&cs_mutex); } return 0; } int main(int argc, char **argv) { int i; struct timeval start, end; pthread_t threads[MAX_THREADS]; thread_param thr_arg[MAX_THREADS]; if (argc > 1) nrecs = atoi(argv[1]); if (nrecs < 100) nrecs = 500000000; printf("nrecs = %d\n", (int)nrecs); fflush(stdout); gettimeofday(&start, NULL); sum = 0.0; width = 1.0 / (double)nrecs; /* Create MAX_THREADS worker threads. */ for (i = 0; i < MAX_THREADS; i++) { /* initialize arguments of the thread */ thr_arg[i].start = i * (nrecs / MAX_THREADS); thr_arg[i].end = (i + 1) * (nrecs / MAX_THREADS); pthread_create(&(threads[i]), NULL, thread_func, &(thr_arg[i])); } /* Wait until all threads have terminated */ for (i = 0; i < MAX_THREADS; i++) pthread_join(threads[i], NULL); gettimeofday(&end, NULL); printf("Pi = %f\n", sum * width); printf("Time : %lf sec\n", (double)(end.tv_sec - start.tv_sec) + (double)(end.tv_usec - start.tv_usec)/1000000.0); return 0; }

Appendix

Assembler Hello World

Just a Hello World in Assembler

section .text global _start ;must be declared for linker (ld) _start: ;tell linker entry point mov edx,len ;message length mov ecx,msg ;message to write mov ebx,1 ;file descriptor (stdout) mov eax,4 ;system call number (sys_write) int 0x80 ;call kernel mov eax,1 ;system call number (sys_exit) int 0x80 ;call kernel section .data msg db 'Hello, world!',0xa ;our dear string len equ $ - msg ;length of our dear string

Exercise Test

Write a Hello World here by adding

printf("Hello World!\n");

Check the mark on the bottom right to verify your Code

#include "main.h" int main() { return 0; }

Rust version of the π approximation

Just a test to run Rust applications …​

use std::sync::mpsc; use std::thread; use std::time::Instant; const NTHREADS: u64 = 2; const NRECS: u64 = 100000000; fn term(start: u64, end: u64) -> f64 { let step = 1.0 / NRECS as f64; let mut sum = 0.0; for i in start..end { let x = (i as f64 + 0.5) * step; sum += 4.0 / (1.0 + x * x); } sum } fn main() { let (tx, rx) = mpsc::channel(); let mut sum = 0.0; let now = Instant::now(); for id in 0..NTHREADS { // The sender endpoint can be copied let thread_tx = tx.clone(); let start = (NRECS / NTHREADS) * id; let end = (NRECS / NTHREADS) * (id+1); // Each thread will send its partial sum via the channel thread::spawn(move || { let partial_sum = term(start, end); thread_tx.send(partial_sum).unwrap(); }); }; for _ in 0..NTHREADS { // The `recv` method picks a message from the channel // `recv` will block the current thread if there no messages available sum = sum + rx.recv().unwrap(); } let duration = now.elapsed(); println!("nrecs: {}", NRECS); println!("Time to calculate: {}", duration.as_secs() as f64 + (duration.subsec_nanos() as f64 / 1000000000.0)); println!("Pi: {}", sum * (1.0 / NRECS as f64)); }

Acknowledgments

This collection of tutorials and runnable examples is realized by the support of many students and colleagues of the Institute for Automation of Complex Power Systems and the former Chair for Operating Systems at RWTH Aachen University.

We have to thank especially the following people:

  • Nicolas Berr

  • Sonja Kolen

  • Kaspar Laaser

  • Simon Pickartz

  • Lukas Razik

  • Steffen Vogel

Bibliography

[1] E. W. Dijkstra, “Cooperating Sequential Processes,” in The Origin of Concurrent Programming, New York, NY: Springer, New York, NY, 1968, pp. 65–138.

[2] P. B. Hansen, Operating System Principles. Prentice Hall, 1973.

[3] A. Silberschatz, P. B. Galvin, and G. Gagne, Operating System Concepts, 8Th edition. John Wiley & Sons., 2009.

[4] W. Stallings, Operating Systems: Internals and Design Principles, 6Th edition. Pearson Education International, 2009.

[5] A. S. Tanenbaum and H. Bos, Modern Operating Systems, 4Th edition. Pearson Education International, 2014.

Imprint

Publisher

Stefan Lankes
RWTH Aachen, E.ON Energy Research Center
Mathieustraße 10
52074 Aachen, Germany
Email: slankes@eonerc.rwth-aachen.de

License

Unless stated otherwise, the associated texts and images are published under the Creative Commons Attribution 4.0 International License CC BY and the source code under the MIT license.

Collection, Processing and Use of Personal Data

On principle, you can use our website without revealing your identity. By sending any information to osbyexample.com (through a contact form, for example) you consent to this information being processed for our own purposes (e.g. sending of desired information) unless you declared an objection. You have the right to withdraw your consent at any time in the future, to immediate effect.

Upon a visit to our website the following data will be stored exclusively for statistical purposes, and the identity of the user is being kept anonymous:

  • the name of the file requested,

  • the date and time of the request,

  • the amount of data transferred,

  • IP number

We will not give any stored data to third parties - neither for commercial nor non-commercial purposes.

Responsibility for our Web Content

osbyexample.com makes all reasonable efforts to ensure that the contents of its web site are up-to-date, complete and accurate. Despite these efforts, the occurrence of errors or omissions cannot be completely ruled out.

Responsibility for External Content

Some hyperlinks on this website link to contents which are not operated by osbyexample.com but provided by third parties. Please note that osbyexample.com is not responsible for contents offered by other organizations.

This general disclaimer is part of the contents provided by the web site of osbyexample.com. If any of the terms and conditions is found to be invalid by reason of the relevant laws then the remaining terms and conditions shall remain in full effect.