Submit your solutions before the beginning of the next lecture to any course staff.
Make sure you understand what would happen if the kernel executed the following code snippet:
struct spinlock lk; initlock(&lk, "test lock"); acquire(&lk); acquire(&lk);(Feel free to use QEMU to find out.
acquire
is in spinlock.c
.)
An acquire
ensures interrupts are off
on the local processor using the cli
instruction
(via pushcli()
),
and interrupts remain off until the release
of the last lock held by that processor
(at which point they are enabled using sti
).
Let's see what happens if we turn on interrupts while
holding the ide
lock.
In iderw
in ide.c
, add a call
to sti()
after the acquire()
(line 3965),
and a call to cli()
just before the release()
(line 3982).
Rebuild the kernel and boot it in QEMU.
Chances are the kernel will panic soon after boot; try booting QEMU a few times
if it doesn't.
Turn in:
explain in a few sentences why the kernel panicked.
You may find it useful to look up the stack trace
(the sequence of %eip
values printed by panic
)
in the kernel.asm
listing.
sti()
and cli()
you added,
rebuild the kernel, and make sure it works again.
Now let's see what happens if we turn on interrupts
while holding the file_table_lock
.
This lock protects the table of file descriptors,
which the kernel modifies when an application opens or closes
a file.
In filealloc()
in file.c
, add
a call to
sti()
after the call to acquire()
,
and a cli()
just before each of the
release()
es.
You will also need to add
#include "x86.h"
at the top of the file after
the other #include
lines.
Rebuild the kernel and boot it in QEMU.
It will not panic.
Turn in:
explain in a few sentences why the kernel didn't panic.
Why do file_table_lock
and ide_lock
have
different behavior in this respect?
You do not need to understand anything about the details of the IDE hardware to answer this question, but you may find it helpful to look at which functions acquire each lock, and then at when those functions get called.
(There is a very small but non-zero chance that the kernel will panic
with the extra sti()
in filealloc()
.
If the kernel does panic, make doubly sure that
you removed the sti()
call from
iderw
. If it continues to panic and the
only extra sti()
is in filealloc()
,
then E-mail the staff
and think about buying a lottery ticket.)
Turn in:
Why does release()
clear
lk->pcs[0]
and lk->cpu
before clearing lk->locked
?
Why not wait until after?
Download ph.c and compile it on your laptop:
$ gcc -g -O2 ph.c -pthread $ ./a.out 2The 2 specifies the number of threads that execute put and get operations on the the hash table.
After running for a little while, the output will be something along the lines:
0: put time = 2.871728 1: put time = 2.957073 1: lookup time = 12.731078 1: 1 keys missing 0: lookup time = 12.731874 0: 1 keys missing completion time = 15.689165
Each thread runs in two phases. In the first phase it puts NKEYS/nthread keys into the hash table. In the second phase, it gets NKEYS from the hash table. The print statements tell you how long each phase took for each thread. The completion time at the bottom tells you the total runtime for the application. You see that that completion time is about 16 seconds. Each thread computed for about 16 seconds (~3 for put + ~13 for get). This indicates that we achieved perfect parallelism; the threads didn't interfere with each other.
When you run this application, you may see no parallelism if you are running on a machine with 1 core or if the machine is loaded with other applications.
Independent of whether you see speedup, you will likely observe that the code is incorrect. The application inserted 1 key in phase 1 that phase 2 couldn't find. Run the application with 4 threads:
2: put time = 1.516581 1: put time = 1.529754 0: put time = 1.816878 3: put time = 2.113230 2: lookup time = 15.635937 2: 21 keys missing 3: lookup time = 15.694796 3: 21 keys missing 1: lookup time = 15.714341 1: 21 keys missing 0: lookup time = 15.746386 0: 21 keys missing completion time = 17.866878
Two points: 1) The completion time is about the same as for 2 threads, but this run did twice as much work as with 2 threads; we are achieving good parallelism. 2) More keys are missing. In your runs, there may be more or fewer keys missing. There may be even 0 keys missing in some runs. If you run with 1 thread, there will never be any keys missing. Why are there missing keys with 2 or more threads, but not with 1 thread? Identify a sequence of events that can lead to keys missing for 2 threads.
To avoid this sequence of events, insert lock and unlock statements in put and get so that the number keys missing is always 0. The relevant pthread calls are (for more see the manual pages, man pthread):
pthread_mutex_t lock; // declare a lock pthread_mutex_init(&lock, NULL); // initialize the lock pthread_mutex_lock(&lock); // acquire lock pthread_mutex_unlock(&lock); // release lock
Test your code first with 1 thread, then test it with 2 threads. Is it correct (i.e. have you eliminated missing keys?)? Is the two-threaded version faster than the single-threaded version?
Modify your code so that get operations run in parallel while maintaining correctness. (Hint: are the locks in get necessary for correctness in this application?)
Modify your code so that some put operations run in parallel while maintaining correctness. (Hint: would a lock per bucket work?)
Turn in: your modified ph.c