Homework: running and debugging xv6

This lecture is the introduction to xv6, an x86-based re-implementation of Unix v6. Read the source code in the assigned files; The assigned chapter for today provides a commentary on the assigned files.

You won't have to understand the details yet; we will simply focus for now on getting xv6 up and running under QEMU and debugging it under GDB at the C source code level.

Hand-In Procedure

You are to turn in this homework during lecture. Please write up your answers to the exercises below and hand them in to a staff member at the beginning of lecture.

Assignment

Fetch and un-tar the xv6 source:

$ wget http://staff.ustc.edu.cn/~bjhua/courses/ats/2014/ref/xv6-rev7.tar.gz
$ tar xzvf xv6-rev7.tar.gz
xv6/
xv6/trap.c
xv6/sysfile.c
xv6/types.h
...
$

Build xv6:

$ cd xv6
$ make
gcc ... 
gcc ...
ld -m    elf_i386 -N -e start -Ttext 0x7C00 -o bootblock.o bootasm.o bootmain.o
objdump -S bootblock.o > bootblock.asm
objcopy -S -O binary -j .text bootblock.o bootblock
...
$

Running xv6 under QEMU

Now type:

$ make qemu
qemu -serial mon:stdio -hdb fs.img xv6.img -smp 2 -m 512
xv6...
cpu1: starting
cpu0: starting
init: starting sh
$

A separate window should appear containing the display of the virtual machine. After a few seconds, QEMU's virtual BIOS will load xv6's boot loader from a virtual hard drive image contained in the file xv6.img, and the boot loader will in turn load and run the xv6 kernel.

Note: For those who can't find QEMU executable, please check first if you have installed QEMU for i386 (not x86_64) architecture. Then change QEMU variable in Makefile to be your working QEMU executable (typically qemu-system-i386). Replace the line

#QEMU =

with

QEMU = /usr/bin/qemu-system-i386

Note: For those who are trying this by ssh'ing from a Mac to a Linux box and running QEMU under X, due to a QEMU bug, you have to edit the Makefile replacing the line

QEMUOPTS = -hdb fs.img xv6.img -smp $(CPUS)

with

QEMUOPTS = -hdb fs.img xv6.img -smp $(CPUS) -k en-us

After everything is loaded, you should get a '$' prompt in the xv6 display window and be able to enter commands into the rudimentary but functional xv6 shell. For example, try:

$ ls
.		1 1 512
..		1 1 512
README         2 2 1927
cat            2 3 10535
...
$ echo Hello!
Hello!
$ cat README
xv6 is a re-implementation of Dennis Ritchie's and Ken Thompson's Unix
Version 6 (v6).  xv6 loosely follows the structure and style of v6,
...
$ grep run README
To build xv6 on an x86 ELF machine (like Linux or FreeBSD), run "make".
Then run "make TOOLPREFIX=i386-jos-elf-".
...
$ cat README | grep run | wc
6 64 346
$ echo My New File >newfile
$ cat newfile
My New File

The small file system you're examining and modifying here is resides on a second virtual disk, whose initial contents QEMU initializes from the file fs.img. Later in the course you will examine how xv6 accesses and modifies this file system.

Now close this QEMU session, destroying the state of the xv6 virtual machine. You can do so either by pressing CTRL-A + C to enter QEMU monitor and then entering quit at the (qemu) prompt in the original window from which you started QEMU, or just by pressing CTRL-A + X in that window.

Remote Debugging xv6 under QEMU

The easiest way to debug xv6 under QEMU is to use GDB's remote debugging feature and QEMU's remote GDB debugging stub.

Remote debugging is a very important technique for kernel development in general: the basic idea is that the main debugger (GDB in this case) runs separately from the program being debugged (the xv6 kernel atop QEMU) - they could be on completely separate machines, in fact. The debugger and the target environment communicate over some simple communication medium, such as a network socket or a serial cable, and a small remote debugging stub handles the "immediate supervision" of the program being debugged in the target environment. This way, the main debugger can be a large, full-featured program running in a convenient environment for the developer atop a stable existing operating system, even if the kernel to be debugged is running directly on the bare hardware of some other physical machine and may not be capable of running a full-featured debugger itself. In this case, a small remote debugging stub is typically embedded into the kernel being debugged; the remote debugging stub implements a simple command language that the main debugger uses to inspect and modify the target program's memory, set breakpoints, start and stop execution, etc. Compared with the size of the main debugger, the remote debugging stub is typically miniscule, since it doesn't need to understand any details of the program being debugged such as high-level language source files, line numbers, or C types, variables, and expressions: it merely executes very low-level operations on behalf of the much smarter main debugger.

When we are doing kernel development using a virtual machine such as QEMU, remote debugging may not be quite as critical: for example, xv6 can also be run under the Bochs emulator, which is much slower than QEMU but has a debugger built-in and thus does not require the use of GDB remote debugging. On the other hand, while usable, the Bochs debugger is still not as complete as GDB, so we will primarily use GDB with QEMU's remote debugging stub in this course.

To run xv6 under QEMU and enable remote debugging, type:

$ make qemu-gdb
*** Now run 'gdb'.
qemu -serial mon:stdio -hdb fs.img xv6.img -smp 2 -m 512  -S -gdb tcp::26616

You will notice that while a window appears representing the virtual machine's display, nothing appears on that display: that is because QEMU initialized the virtual machine but stopped it before executing the first instruction, and is now waiting for an instance of GDB to connect to its remote debugging stub and supervise the virtual machine's execution. In particular, QEMU is listening for connections on a TCP network socket, at port 26616 in this example, because of the '-p 26616' in the qemu command line above. You might see a different port number: this is because xv6's Makefile tries to compute a port number that is likely to be unique and available even if multiple users are debugging xv6 under QEMU on the same machine (see the GDBPORT variable in xv6's Makefile).

To start the debugger and connect it to QEMU's waiting remote debugging stub, open a new, separate terminal window, change to the same xv6 directory, and type:

$ gdb kernel
GNU gdb (GDB) SUSE (7.3-41.1.2)
Copyright (C) 2011 Free Software Foundation, Inc.
...
Reading symbols from /home/accts/xxx/xv6/kernel...done.
+ target remote localhost:26616
The target architecture is assumed to be i8086
[f000:fff0]    0xffff0:	ljmp   $0xf000,$0xe05b
0x0000fff0 in ?? ()
+ symbol-file kernel

Several things are going on here. Note that we entered 'gdb kernel' just as if we were going to debug a program named kernel directly under this GDB instance - but actually trying to execute the xv6 kernel under GDB in this way wouldn't work at all, because GDB would provide an execution environment corresponding to a user-mode Linux process (or a process on whatever operating system you are running GDB on), whereas the kernel expects to be running in privileged mode on a "raw" x86 hardware environment. But even though we're not going to run the kernel locally under GDB, we still need to have GDB load the kernel's ELF program image so that it can extract the debugging information it will need, such as the addresses of C functions and other symbols in the kernel, and the correspondence between line numbers in xv6's C source code and the memory locations in the kernel image at which the corresponding compiled assembly language code resides. That is what GDB is doing when it reports "Reading symbols from ...".

Important: When remote debugging, always make sure that the program image you give to GDB is exactly the same as the program image running on the debugging target: if they get out of sync for any reason (e.g., because you changed and recompiled the kernel and restarted QEMU without also restarting GDB with the new image), then symbol addresses, line numbers, and other information GDB gives you will not make any sense. Fortunately keeping the target program and the debugger in sync is not too difficult when they are both loaded from the same directory in the same host machine as they are in this case, but synchronization can be a bit more of a challenge with "true" remote debugging, where one machine runs GDB and another machine runs the target kernel loaded from a separate media such as a local hard disk or USB stick.

The GDB command 'target remote' connects to a remote debugging stub, given the waiting stub's TCP host name and port number. In our case, the xv6 directory contains a small GDB script residing in the file .gdbinit, which gets run by GDB automatically when it starts from this directory. This script automatically tries to connect to the remote debugging stub on the same machine (localhost) using the appropriate port number: hence the "+ target remote localhost:26616" line output by GDB. If something goes wrong with the xv6 Makefile's port number selection (e.g., it accidentally picks a port number already in use by some other process on the machine), or if you wish to run GDB on a different machine from QEMU (try it!), you can comment out the 'target remote' command in .gdbinit and enter the appropriate command manually once GDB starts.

Once GDB has connected successfully to QEMU's remote debugging stub, it retrieves and displays information about where the remote program has stopped:

The target architecture is assumed to be i8086
[f000:fff0] 0xffff0:    ljmp   $0xf000,$0xe05b
0x0000fff0 in ?? ()

As mentioned earlier, QEMU's remote debugging stub stops the virtual machine before it executes the first instruction: i.e., at the very first instruction a real x86 PC would start executing after a power on or reset, even before any BIOS code has started executing. For backward compatibility, PCs today still start executing after reset in exactly the same way the very first 8086 processors did: namely in 16-bit, "real mode", starting at address 0xffff0 - 16 bytes short of the end of the BIOS and the top of the 1MB of total addressable memory in the original PC architecture.

We'll leave further exploration of the boot process for later; for now just type in the GDB window:

(gdb) b exec
Breakpoint 1 at 0x80100af0: file exec.c, line 12.
(gdb) c

These commands set a breakpoint at the entrypoint to the exec function in the xv6 kernel, and then continue the virtual machine's execution until it hits that breakpoint. You should now see QEMU's BIOS go through its startup process, after which GDB will stop again with output like this:

The target architecture is assumed to be i386
=> 0x80100af0 :	push   %ebp

Breakpoint 1, exec (path=0x1c "/init", argv=0x8dfffe98) at exec.c:12
12	{
(gdb)

Note: If you get outputs like:

      Breakpoint 1, exec (
	  path=<error reading variable: can't compute CFA for this frame>,
	  argv=<error reading variable: can't compute CFA for this frame>)
	  at exec.c:12

please apply this patch to Makefile and then rebuild xv6.

      git apply Makefile.patch
      make clean; make

At this point, the machine is running in 32-bit mode, the xv6 kernel has initialized itself, and it is just about to load and execute its first user-mode process, the /init program. You will learn more about exec and the init program later; for now, just continue execution:

(gdb) c
Continuing.
=> 0x80100af0 :	push   %ebp

Breakpoint 1, exec (path=0x8b3 "sh", argv=0x8dffee98) at exec.c:12
12	{
(gdb)

The second time the exec function gets called is when the /init program launches the first interactive shell, sh.

Now if you continue again, you should see GDB appear to "hang": this is because xv6 is waiting for a command (you should see a '$' prompt in the virtual machine's display), and it won't hit the exec function again until you enter a command and the shell tries to run it. Do so by typing something like:

$ cat README

You should now see in the GDB window:

=> 0x80100af0 :	push   %ebp

Breakpoint 1, exec (path=0x1a40 "[Bcat", argv=0x8dfbee98) at exec.c:12
12	{
(gdb)

GDB has now trapped the exec system call the shell invoked to execute the requested command.

Now let's inspect the state of the kernel a bit at the point of this exec command.

Turn in: the output of the following GDB 'print' or 'p' commands, with which we can inspect the arguments that the exec function was called with:

(gdb) p argv[0]
(gdb) p argv[1]
(gdb) p argv[2]

Turn in: the output of the GDB 'backtrace' or 'bt' command at this point, which traces and lists the chain of function calls that led to the current function: i.e., the function that made this call to exec, the function that called that function, etc.

Now go "up" one call frame so we can inspect the context from which exec was called:

(gdb) up
#1  0x80105e72 in sys_exec () at sysfile.c:394
394	  return exec(path, argv);
(gdb)

Turn in: the output of the GDB 'list' or 'l' command at this point, showing the source code around sys_exec's call to exec.

Further exploration: Look through the online GDB manual to learn about more of GDB's debugging features, and try them out on a running instance of the xv6 kernel.