Lab 2: Processes and Synchronization

Introduction

In this lab, you will add the first major OS abstraction to PIOS: processes. Like Unix, PIOS creates a single "root" process on startup and allows that and any subsequent process to create "child" processes, forming a process hierarchy. PIOS's process abstraction is much simpler than Unix's, however: for example, child processes cannot outlive their parents, and processes can directly interact only with their immediate parent and immediate children, not with arbitrary other processes in the system like Unix processes can via pipes or the file system. Philosophically, the PIOS process abstraction follows more of a recursive virtual machine or nested process model, in contrast with Unix's timesharing-oriented job control model.

This lab contains the following four main implementation components:

  1. Synchronization: Maintaining the consistency of data structures in a multiprocessor kernel.
  2. Scheduling: Determining which process to run on which CPU at a given moment, and running them.
  3. System calls: Implementing system calls allowing process to create and manage child processes.
  4. Preemption: Using asynchronous hardware interrupts to switch periodically among processes.

Software Setup

In this lab you will build on the kernel you started in lab 1. We will also provide you with some additional source code. To fetch the new source code, use Git to commit your Lab 1 source code if you haven't already, then fetch the latest updates from the master Git repository, and then create a local branch called lab2 based on our lab2 branch, which will be named origin/lab2 in your copy of the repository:

$ cd lab
$ git commit -am 'my solution to lab1'
Created commit 254dac5: my solution to lab1
 3 files changed, 31 insertions(+), 6 deletions(-)
$ git pull
Already up-to-date.
$ git checkout -b lab2 origin/lab2
Branch lab2 set up to track remote branch refs/remotes/origin/lab2.
Switched to a new branch "lab2"

The git checkout -b command shown above actually does two things: it first creates a local branch lab2 that is based on the origin/lab2 branch provided by the course staff, and second, it changes the contents of your lab directory to reflect the files stored on the lab2 branch. Git allows switching between existing branches using git checkout branch-name, though you should commit any outstanding changes on one branch before switching to a different one.

You will now need to merge the changes you made in your lab1 branch into the new lab2 branch, as follows:

$ git merge lab1
Merge made by recursive.
 kern/kdebug.c  |   11 +++++++++--
 kern/monitor.c |   19 +++++++++++++++++++
 lib/printfmt.c |    7 +++----
 3 files changed, 31 insertions(+), 6 deletions(-)

In some cases, Git may not be able to figure out how to merge your changes with the new lab assignment (e.g. if you modified some of the code that is changed in the second lab assignment). In that case, the git merge command will tell you which files are conflicted, and you should first resolve the conflict (by editing the relevant files) and then commit the resulting files with git commit -a.

Lab 2 contains the following new source files, which you should browse through and familiarize yourself with:

inc/syscall.h Definitions of PIOS system call conventions
kern/spinlock.{c,h} Spinlock-based mutual exclusion module template
kern/proc.{c,h} Process management and scheduling module template
kern/syscall.{c,h} System call handling module template
kern/mp.{c,h} Multiprocessor hardware identification
boot/bootother.S Boot code for CPUs other than CPU 0
dev/pic.{c,h} Legacy interrupt controller (PIC) driver
dev/lapic.{c,h} Local interrupt controller (LAPIC) driver
dev/ioapic.{c,h} I/O interrupt controller (IOAPIC) driver

Hand-In Procedure

The same as lab 1.

Part 1: Multiprocessor Synchronization

Like xv6, PIOS is a multiprocessor kernel, and will therefore have to use synchronization mechanisms to ensure that different physical CPUs running kernel code concurrently do not accidentally corrupt the kernel's data structures. For now, PIOS will use only one simple synchronization abstraction to protect kernel data structures: the spinlock. Since xv6 also uses spinlocks in the kernel, make sure you have read the xv6 chapter on locking and thoroughly understand how xv6 implements and uses spinlocks before you attempt this part of the lab.

PIOS's use of spinlocks will be different from xv6's in one important way, in terms of how spinlocks relate to interrupt handling. The xv6 kernel can take asynchronous device interrupts while running in either user or kernel mode, and servicing those interrupts often requires acquiring spinlocks. If an interrupt handler were to attempt to acquire a spinlock that the interrupted code already holds, however, then a deadlock would result: the interrupt handler will spin waiting for the "lock holder" to release the lock, but the lock holder is the kernel code that was preempted by the interrupt! So the xv6 spinlock code always disables hardware interrupts when the kernel first takes a spinlock, and re-enables interrupts only after unlocking the last spinlock it is holding, effectively preemption whenever kernel code is holding any spinlock. PIOS, in contrast, simply never enables interrupts in the first place while running in kernel mode, so the spinlock code in PIOS doesn't need to deal with this issue.

Basic Spinlock Functionality

PIOS spinlocks are declared in kern/spinlock.h, and support the following four operations:

You will need to implement these four functions. Use the xchg() function in inc/x86.h to provide the atomicity needed when acquiring the spinlock.

You should also use the atomic xchg() instruction to release the lock - in this case not because of the atomicity it provides, but because atomic instructions also prevent the processor from reordering other reads or writes around the atomic instruction, which modern processors often do with normal memory accesses, and such reordering can cause synchronization code to fail if not controlled carefully. See section 7.2, "Memory Ordering", in the IA-32 System Programming Guide for more information.

One further detail about the spin loop itself: while a spinlock will work if you just do nothing in the loop, Intel has defined a special instruction called PAUSE that notifies the processor that a spin loop is in progress and can improve system performance in such cases, especially on "hyper-threaded" processors that multiplex a single execution unit among multiple virtual CPUs. See the x86 manuals for details on this instruction. We have provided a pause() function in inc/x86.h for convenience.

Debugging Enhancements

While the main purpose of spinlocks is to ensure mutual exclusion, and that can be accomplished if spinlocks are nothing but a single word in memory, we can make debugging multiprocessor code much easier by instrumenting spinlocks a little. Design your spinlocks so that they:

These debugging facilities won't necessarily make writing multiprocessor code "easy" per se, but hopefully it may be a little easier than it would be otherwise!

Exercise 1. Implement spinlocks in PIOS as described above. Make sure your implementation passes the spinlock_check() test provided. But note that this test code only tests basic functionality; it does not verify that your spinlocks are properly atomic and race-free under multiprocessor operation!

Using Spinlocks

You will now need to use the spinlocks you have just implemented to provide mutual exclusion for the physical memory code from lab 1.

Exercise 2. Add mutual exclusion to protect the mem_freelist in kern/mem.c, so that different CPUs can safely call mem_alloc() and mem_free() concurrently without pages being lost or the free list being corrupted.

Part 2: Creating and Scheduling Processes

First familiarize yourself with kern/proc.h and the template code in kern/proc.c. This will be PIOS's main process management and scheduling module.

As mentioned earlier, PIOS processes form a strictly tree-structured process hierarchy. The proc structure contains a parent pointer and up to PROC_CHILDREN (256) child pointers to child processes. The parent pointer is NULL in the root process but valid for all other processes; any child pointer in a proc's child array may be NULL indicating that child does not exist.
PIOS Process State Machine

At a given moment a process may be in any of four states, as defined by the state field in the proc structure and illustrated in the figure to the right:

Scheduling and Running Processes

You will now need to design and implement a simple scheduler. The scheduler can be as simple as keeping all ready processes on a single queue and running them in round-robin order. The main requirements for purposes of this lab are: (a) the scheduler must work correctly, even on a multiprocessor system, and never lose processes or corrupt the ready queue for example; and (b) assuming the scheduler is invoked on a regular basis, it must not "starve" any process of CPU time forever. The second requirement means that while a simple queue structure will work, a stack structure would not for example, because processes pushed more recently on the stack could starve processes pushed earlier forever.

We have declared and provided skeletons for the following scheduling functions; you may add other helper functions and data structures as needed in kern/proc.c and kern/proc.h:

You will of course need to make sure that your implementations of these functions are multi-processor safe via appropriate use of spinlocks. Be sure to think carefully about what needs to be locked when. Do you ever need to acquire two or more spinlocks at once, e.g., the spinlocks in two different proc structures? If so, how can you ensure that you never deadlock?

Exercise 3. Implement the above scheduling functions. Then modify init() in kern/init.c so that instead of just conjuring up a trapframe and calling trap_return() directly to enter user space, it calls proc_alloc() to create a "root process", sets up that process's saved EIP and ESP to run the user() function on the user_stack, calls proc_ready() to make the process ready, and finally calls proc_sched() to schedule and run it.

You should now be able to get into user mode via your scheduler, although the root process won't yet be able to create other processes without the system calls to be implemented shortly. You may wish to test your scheduler a bit more thoroughly now by creating several "root" processes, starting them all at once, and ensuring that your schedule correctly picks a separate one to run on each physical CPU. But note that each user process will need its own stack!

Hint: In general, it may make debugging easier if you get your code working first in a uniprocessor environment, by removing the `-smp 2' option from the qemu command line, before trying to get it working in a multiprocessor environment. If you are experiencing a mysterious bug, always check to see if it goes away in uniprocessor mode: if it does, it probably means it's a locking issue; if it doesn't, it'll probably be easier to replicate and debug in the simpler uniprocessor environment.

Challenge: Make your scheduler more powerful by implementing one of the enormous number of prioritization or fair-share scheduling algorithms you can find in textbooks or in the research literature: for example, fixed-priority scheduling, Unix-style variable-priority scheduling, or lottery scheduling.

Challenge: Make your scheduler more scalable by giving each CPU its own dedicated ready queue, so that that CPU can insert and remove ready processes without contending with other CPUs in the common case. You will still need locking, however, because when one CPU's ready queue is empty that CPU's scheduler will have to check other CPUs' ready queues and "steal" processes from them. This technique is called work stealing.

Part 3: System Calls

Now that your kernel has a way to create and schedule processes, it needs to provide user-mode code a way to do something interesting. User-mode processes invoke the kernel explicitly through an x86 software interrupt (INT instruction) to vector 0x30 (T_SYSCALL).

Entering, Completing, and Restarting System Calls

Before we get into the details of individual system calls, we need to clarify a few general system call principles.

Entering System Calls: When the processor encounters an explicit INT instruction, it behaves in mostly the same way as if it had encountered an exception such as divide-by-zero, with two important differences:

Be sure you thoroughly understand both of these differences in behavior, and the reasons for them! Now we need to think about how PIOS actually processes system calls.

Completing System Calls: When the kernel handles a system call, in the "easy cases" it will complete whatever action the user process requested and then need to resume execution of the user mode process from where it left off. In PIOS, the system call handler does this by calling trap_return().

Given the above considerations regarding the value of EIP that the processor pushes when it successfully executes an INT instruction, does the system call handler need to do anything to the trapframe pushed on entry before calling trap_return() to resume the user process?

Blocking Processes and Restarting System Calls: In the "not so easy" system call handling cases, the kernel may not be able to complete the system call immediately, but instead must block the process until some future event occurs, typically as a result of an action of some other process or an external device. In a process-model kernel like xv6, which maintains a separate kernel stack for each user process, the kernel stack provides a convenient place where the kernel can maintain state pertaining to whatever system call was in progress when the process needed to block: when the necessary "wakeup" condition is satisfied, the process simply resumes executing in the kernel at the intermediate system call processing state represented by the contents of the process's kernel stack.

In an interrupt-model kernel such as PIOS, however, processes do not have kernel stacks, so a blocked process must entirely give up the use of the kernel stack so that the next process to run on the same CPU can use it. So how does the kernel keep track of what it was or is supposed to be doing on behalf of a blocked process? We could add explicit state to the proc structure describe exactly what the kernel was supposed to be doing - e.g., executing user-mode code, or blocked at a particular stage in a particular system call. Keeping track of this complicated state machine can be difficult, however. Another option is to design the kernel's system calls to be simple and semantically atomic - meaning the system call either runs to completion or blocks before changing any user-visible state at all. That way, if a process needs to block waiting for some event to occur, then once the process is woken by the occurrence of that event, the process simply restarts the original system call from scratch - beginning with the original INT instruction executed in user mode. In this design, the only process state that really matters is its user-mode register state, particularly its EIP.

PIOS takes the latter, system call restart approach, because it simplifies kernel state (and can have other kernel design advantages). To implement the restart model, however, you have to think carefully about exactly where you want the user-mode EIP to point when you actually complete a system call, versus what you want the saved EIP to point to when you block the process without completing the system call. Given the EIP that the processor saves in the trapframe when it executes the INT instruction, if the system call handler decides it needs to block the process, does it need to do anything to the trapframe before saving it into the proc structure for use when the processor eventually wakes up and resumes execution? Hint: how long exactly is an INT instruction? (Look it up in the IA-32 manuals if you don't know.)

PIOS System Calls

We now examine the specific system calls PIOS supports. PIOS for now implements only four system calls, which we list according to their C function prototypes as called from user space programs (see inc/syscall.h):

Exercise 4. Implement the above process management system calls. You will first need to modify kern/trap.c to set up a trap entrypoint for the T_SYSCALL vector and make trap() invoke syscall() when that trap occurs. Then fill out the syscall() function in kern/syscall.c; we have oh so generously implemented SYS_CPUTS for you. You will have to design the process management code carefully to interact with your scheduling code in kern/proc.c when putting processes to sleep or waking them up.

Test this code with the first part of proc_check(), which user() calls to check process management. With QEMU emulating 2 CPUs (the default when you type 'make qemu' with our GNUmakefile), your code should be able to get through the spawn/wait test with 2 child processes. The subsequent test with 4 child processes will hang, however, until you implement preemption in the next part (or you can increase the number of emulated CPUs to 4).

Challenge: Implement x87/MMX/XMM FPU state saving and restoring in the kernel, so that user processes can use floating-point arithmetic, and make a process's FPU state accessible to its parent process via sys_put and sys_get with the SYS_FPU flag, which we have defined in inc/syscall.h. We have also provided a structure, fxstate, that is suitable for holding FPU state and laid out in the format required by the processor's convenient FXSAVE and FXRSTOR instructions.

Part 4: Preemption and Trap Reflection

In this final part of the lab you will enhance the kernel's handling of traps other than explicit system calls to integrate with the scheduling and process management code above.

Preemption

If one process executes for too long, we would like to preempt the CPU and give other processes a chance to run.

To implement preemption, we need to arrange for an asynchronous timer interrupt to be delivered to each CPU periodically, so that the processor will return control to the kernel and the kernel can schedule and run a different process. In multiprocessor x86 systems, each CPU has a Local Advanced Programmable Interrupt Controller or LAPIC, which can generate such interrupts. We have provided code in dev/lapic.{c,h} to initialize each CPU's LAPIC and enable its periodic timer interrupt. The processor will simply ignore these interrupt requests, however, until the kernel enables interrupts by setting the IF bit in the EFLAGS register. Once interrupts are enabled, timer interrupts will arrive at vector T_LTIMER. In response, the kernel should place the current process back on the ready queue, and pick and run a new process from the ready queue.

When the kernel takes an asynchronous hardware interrupt, it must signal the hardware that generated the interrupt that the handling of the interrupt is complete: otherwise, the source of the interrupt (in this case the local APIC) would just keep the interrupt request asserted, interrupting the processor again immediately, as soon as its IF flag is set again. This "interrupt clearing" procedure is commonly called an End Of Interrupt or EOI. For interrupts generated by the local APIC, we have provided a function lapic_eoi() to signal the LAPIC that the timer interrupt has been handled.

As described in Section 8.9 of the IA-32 System Programming Guide, certain corner-case timing situations can cause the Local APIC to generate a spurious interrupt. Our local APIC setup code directs such interrupts to interrupt vector T_IRQ0+IRQ_SPURIOUS; you should catch interrupts arriving at this vector and simply do a trap_return() without an EOI, as specified in the IA-32 manual. You may want to print a message to a console on a spurious interrupt, since they're not likely to happen often and if/when it does you probably want to know so that you can tell whether your spurious interrupt handler is working.

Exercise 5. Implement preemption in PIOS. You will need to create entrypoints for the Timer IRQ and Spurious Interrupt vectors, and handle them in trap().

You will also need to turn on the Interrupt Flag in order to allow the processor to take timer interrupts. Keep in mind, however, that the PIOS kernel assumes that the processor takes asynchronous hardware interrupts only when running in user mode, never while while running in kernel mode (as xv6 allows). You must be careful to preserve this design convention, otherwise the kernel will get very confused when it tries to preempt itself. How do you cause the IF flag to be set only while the processor is running in user mode? (Hint: the write_eflags() function in inc/x86.h is not what you want.)

Once you get interrupts working, your code should be able to get through the part of proc_check() that uses 4 child processes, even when running on the default 2-CPU SMP configuration (or even a 1-CPU uniprocessor QEMU configuration).

Trap Reflection

While the sys_ret() system call gives a PIOS process an explicit way to "return" to its parent, a process can also implicitly return to its parent by causing any other processor-defined exception, such as a divide by zero or general protection fault. That is, whenever a process causes an exception in user mode that the PIOS kernel doesn't handle automatically in some way, the kernel reflects the trap to the process's parent in the form of an implicit "return": the process goes into the stopped state and sits passively until the parent collects its state via sys_get(). In this case, the tf_trapno in the trapframe structure that sys_get(SYS_REGS) returns to the parent will indicate which trap the child process caused, and the tf_eip will point at (not after, as in the explicit sys_ret system call) the instruction that caused the trap. This reflection facility gives user-mode processes the ability to "supervise" their children, handling traps caused by their children in any way appropriate. Of course, the kernel can't reflect traps caused by the root process because the root process doesn't have a parent to reflect them to.

Exercise 6. Implement reflection for processor exceptions in the x86 architecture-defined vector range of 0 through 31, but only if the exception occurs in user mode. (Exceptions in this range that occur in kernel mode still probably represent bugs and should cause a panic with a backtrace.) We have declared and provided a skeleton function proc_ret() in kern/proc.c, which is intended to handle either "explicit returns" via sys_ret() or "implicit returns" via other traps from user mode.

Your kernel should now be able to pass the entire proc_check() test, including its test of reflection for several types of traps. Run make grade to make sure the whole test suite for lab 2 comes out right.

This completes the lab.