Lab #1: Compiling, Linking, Libraries, and Runtime


Introduction

This lab is split into three parts. The first part familiarizes you with the whole process of producing executables from C programs: preprocessing, compiling, and linking. The second part examines a minimal C library. Finally, the third part delves into a minimal C runtime.

1. Preprocessing, Compiling, & Linking

In this problem, you only need to read the code we provide and familiarize yourself with the complete process of preprocessing, compiling, linking, and running a program. No hacking is required; just read the code and answer some questions.

First download this package of code, and uncompress it in any directory you like. Browse the code, which consists of 4 files:

area.h
area.c
main.c
Makefile
Please open the first 3 files in a text editor and read the code carefully. Make sure you understand what the code is doing. For now, ignore the file "Makefile".

Now open a command line prompt and go to the directory where your code resides. Type this command at the prompt:

cl /P area.c
What new file is generated by this command? Open that new file and read it.
Question #1: Be able to answer these questions:
  • What does the "#line ..." mean?
  • What's the difference between the files "area.c" and "area.i"?
  • Where has the file "area.h" gone to, and why?
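
As a concrete picture of what the preprocessor does, here is a minimal single-file sketch. It assumes that area.h defines PI as the integer 3 and declares the function area; this is only a guess based on the disassembly shown later in this lab, and the real files in the package may differ.

```c
/* Assumed contents of area.h: an integer stand-in for pi and a prototype.
   In the real area.c these two lines would arrive via #include "area.h". */
#define PI 3
int area(int r);

/* Assumed contents of area.c. After preprocessing, the macro PI below is
   replaced by its definition, so the compiler proper sees 3 * r * r. */
int area(int r)
{
    return PI * r * r;
}
```

Under these assumptions, the .i file contains the text of the header in place of the #include line, and the return statement reads 3 * r * r; the #define itself no longer appears.
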
Now type this command at the prompt:
cl /P main.c
Again, what new file is generated? Open that file and read the code in it.
Question #2: Be able to answer these questions:
  • In what directory does the file "stdio.h" reside?
  • Then find the file "stdio.h" in that directory, open it, and skim the code. (You don't need to understand everything in it. Just get a feel for what's there.)
This completes the preprocessing part of this problem, and next comes the compiling part.

Now type this command at the prompt:

cl /c area.c
What new file is generated? Open it and skim the content (don't worry about its meaning for now). Next, type this command at the prompt:
dumpbin /disasm area.obj >area.asm
What new file is generated? Open that file and check its content. You don't need to learn assembly in this course, but if you are really curious, please read Chap 17 of the Intel manual to find out what these instructions mean.
(Optional) Question #3: You may even want to try to answer these questions (only after you have read the dumpbin output and the Intel manual):
  • Where is the argument "r" of the function "area" stored?
  • How is the multiplication "PI * r * r" performed?
  • Where is the multiplication result put?
  • How is the result returned to the function "main"?
Note: dumpbin is an object file analysis tool from Microsoft. As we can see, it disassembles the binary contents of an object file into the corresponding assembly, which is easier to read. For more on its usage, type:
dumpbin /?
to see more arguments and options.

Now type this command at the prompt:

cl /c main.c
What new file is generated? Check this file's content using the tool dumpbin:
dumpbin /disasm main.obj >main.asm
just as we did above. In the file "main.asm", you will see this code (among others):
_main:
  00000000: 55                 push        ebp
  00000001: 8B EC              mov         ebp,esp
  00000003: 51                 push        ecx
  00000004: 6A 05              push        5
  00000006: E8 00 00 00 00     call        0000000B
  0000000B: 83 C4 04           add         esp,4
  0000000E: 89 45 FC           mov         dword ptr [ebp-4],eax
  00000011: 8B 45 FC           mov         eax,dword ptr [ebp-4]
  00000014: 50                 push        eax
  00000015: 68 00 00 00 00     push        offset _main
  0000001A: E8 00 00 00 00     call        0000001F
  0000001F: 83 C4 08           add         esp,8
  00000022: 33 C0              xor         eax,eax
  00000024: 8B E5              mov         esp,ebp
  00000026: 5D                 pop         ebp
  00000027: C3                 ret
Again, you don't need to understand these instructions in this course; just pay special attention to these three lines:
  00000006: E8 00 00 00 00     call        0000000B
  00000015: 68 00 00 00 00     push        offset _main
  0000001A: E8 00 00 00 00     call        0000001F
  
                       Fig.1: 3 lines of code
We will return to these three lines in the linking part.

This completes the compiling part of this problem, and finally comes the linking part.

As we discussed in class, linking combines all the object files and generates an executable. To see the details, type this command at the prompt:

link /out:a.exe main.obj area.obj
What new file is generated? Run it. What is the output? Is it correct?

Now type this command line in the prompt:

dumpbin /disasm a.exe >a.asm
What new file is generated? Open that file and you'll see a large body of code. The part we are interested in is this:
  00401000: 55                 push        ebp
  00401001: 8B EC              mov         ebp,esp
  00401003: 51                 push        ecx
  00401004: 6A 05              push        5
  00401006: E8 25 00 00 00     call        00401030
  0040100B: 83 C4 04           add         esp,4
  0040100E: 89 45 FC           mov         dword ptr [ebp-4],eax
  00401011: 8B 45 FC           mov         eax,dword ptr [ebp-4]
  00401014: 50                 push        eax
  00401015: 68 30 70 40 00     push        407030h
  0040101A: E8 20 00 00 00     call        0040103F
  0040101F: 83 C4 08           add         esp,8
  00401022: 33 C0              xor         eax,eax
  00401024: 8B E5              mov         esp,ebp
  00401026: 5D                 pop         ebp
  00401027: C3                 ret
  00401028: CC                 int         3
  00401029: CC                 int         3
  0040102A: CC                 int         3
  0040102B: CC                 int         3
  0040102C: CC                 int         3
  0040102D: CC                 int         3
  0040102E: CC                 int         3
  0040102F: CC                 int         3
  00401030: 55                 push        ebp
  00401031: 8B EC              mov         ebp,esp
  00401033: 8B 45 08           mov         eax,dword ptr [ebp+8]
  00401036: 6B C0 03           imul        eax,eax,3
  00401039: 0F AF 45 08        imul        eax,dword ptr [ebp+8]
  0040103D: 5D                 pop         ebp
  0040103E: C3                 ret
  0040103F: 53                 push        ebx
  
                    Fig.2: part of a.asm
(Optional) Question #4: Read the code in Fig.1 and Fig.2, and be able to answer these questions:
  • Where are Fig.1's 3 lines of code in Fig.2? Locate those 3 lines.
  • What's the difference between the two versions of these lines, and why?
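
When comparing the two figures, it helps to know how an x86 near call (opcode E8) is encoded: the four bytes after the opcode hold a displacement relative to the address of the next instruction, and the instruction is 5 bytes long. The sketch below works through that arithmetic; the function names callTarget and callDisplacement are made up for illustration and are not part of the lab code.

```c
#include <stdint.h>

/* Address reached by a 5-byte "E8 rel32" call located at callAddress.
   The displacement is added to the address of the following instruction. */
uint32_t callTarget(uint32_t callAddress, int32_t displacement)
{
    return callAddress + 5u + (uint32_t)displacement;
}

/* Displacement that must be stored in the call at callAddress so that
   it reaches target. This is the value a linker would patch in. */
int32_t callDisplacement(uint32_t callAddress, uint32_t target)
{
    return (int32_t)(target - (callAddress + 5u));
}
```

For example, callTarget(0x00000006, 0) gives 0x0000000B, which matches the placeholder target dumpbin prints for main.obj, while callTarget(0x00401006, 0x25) gives 0x00401030, the target shown in a.asm.
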
This completes the linking part of this problem.

One file is left: "Makefile". Open it and read its contents, and then type this command at the prompt:

nmake
What happened? (Note that we have no time to discuss makefiles in this course, but they are ubiquitous in system software development, so you should give them a try. A pretty good introduction to makefiles is here.)

2. (Optional) A Minimal Library

Now that you know the stages of building a program (preprocessing, compiling, and linking), your job in this problem is to implement a minimal C library, compile it, and link it with your programs. The library you'll implement is mainly an I/O library (along with a few other routines). Download this program to start with.

Browse the code we provided, which consists of:

myStdio.h
myStdio.c
sysCall.obj
main.c
Makefile
And now open a command line prompt and type:
nmake
which should generate an executable "a.exe". Run this .exe file and you should see this output on your screen:
hello, world
1
##@@ to do: fill in your code! @@##

##@@ to do: fill in your code! @@##

##@@ to do: fill in your code! @@##

##@@ to do: fill in your code! @@##

##@@ to do: fill in your code! @@##

finished!
num of characters: 
Open the file main.c, and investigate how the above output is produced. As you may notice, some of the output is incorrect. Why?

Now open the file myStdio.c, and read the function myPrintf in it. Make sure you understand why the output is incorrect. (Note that the only system call we may use is the function putchar, which prints a single character to standard output; all the rest of the code should be yours.)

Assignment #1: Read the function myPrintf, and find all the program points that call the function toDo. Delete all calls to toDo, and fill in your own code there. (Note: you may want to consult the textbook for a reference on the format characters.)

Assignment #2: In the file myStdio.c, read the function printInt. Can printInt handle negative integers or 0? If not, revise printInt so that it handles negative integers, 0, and positive integers.
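
One possible approach to printing any integer with putchar alone is sketched below. It is not the provided printInt; the names formatInt and printIntSketch are hypothetical, and the buffer sizes assume an int of at most 64 bits. The magnitude is handled as an unsigned value so that the most negative int does not overflow when negated.

```c
#include <stdio.h>

/* Hypothetical helper: writes the decimal text of n into buf, which must
   have room for at least 22 characters, and returns the text length. */
int formatInt(int n, char *buf)
{
    char digits[24];
    int count = 0;
    int len = 0;
    /* Unsigned magnitude: well defined even for the most negative int. */
    unsigned int mag = (n < 0) ? 0u - (unsigned int)n : (unsigned int)n;

    /* Collect digits least significant first. The do-while loop runs at
       least once, so n == 0 yields the single digit '0'. */
    do {
        digits[count++] = (char)('0' + mag % 10u);
        mag /= 10u;
    } while (mag != 0u);

    if (n < 0)
        buf[len++] = '-';
    while (count > 0)            /* reverse into most significant first */
        buf[len++] = digits[--count];
    buf[len] = '\0';
    return len;
}

/* Prints n using putchar only and returns the number of characters. */
int printIntSketch(int n)
{
    char buf[24];
    int len = formatInt(n, buf);
    int i;

    for (i = 0; i < len; ++i)
        putchar(buf[i]);
    return len;
}
```

Separating the formatting from the output keeps the digit logic easy to check by hand: formatInt(-42, buf) leaves the three characters '-', '4', '2' in buf.
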

Assignment #3: In the C library, what type does the function printf return, and what value should it return? Does our myPrintf return the correct value? If not, modify myPrintf to correct this.
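
As a reminder of the convention, the standard printf returns an int: the number of characters written, or a negative value if an output error occurs. The sketch below follows the same convention for a plain string using putchar only; putStringCounted is a made-up name and is not part of myStdio.c.

```c
#include <stdio.h>

/* Hypothetical helper: writes s one character at a time with putchar.
   Returns the number of characters written, or -1 on an output error,
   mirroring the return convention of the standard printf. */
int putStringCounted(const char *s)
{
    int count = 0;

    for (; *s != '\0'; ++s) {
        if (putchar((unsigned char)*s) == EOF)
            return -1;
        ++count;
    }
    return count;
}
```

Called with "hello, world\n", this returns 13, the same value the standard printf returns for that string.
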

You may also want to write your own versions of other parts of the C library, such as ctype.h.

3. (Optional) A Minimal Runtime

Why must every C program contain a unique entry function named main()? To answer this question, we'll write a minimal language runtime in this problem. Recall that a runtime is a collection of library code that supports user code (for example, by calling the user program and exiting when it finishes). Download this program to start with.
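
To make the idea of a runtime concrete, here is a minimal sketch of what an entry routine does: it prepares the arguments, calls the user's entry function, and receives its return value. This is not the provided runtime.c. The names userMain and runtimeEntry are hypothetical, and the argument list is hard-coded, whereas a real runtime would obtain it from the operating system.

```c
#include <stdio.h>

/* Stand-in for the user's main(). It has a different name only so that
   this sketch can be compiled alongside a real main(). */
int userMain(int argc, char *argv[])
{
    int i;

    for (i = 0; i < argc; ++i)
        printf("argv[%d] = %s\n", i, argv[i]);
    return 0;
}

/* Hypothetical entry routine: builds argc/argv, calls the user's entry
   function, and hands back the value that a real runtime would pass on
   to the operating system as the process exit code. */
int runtimeEntry(void)
{
    static char programName[] = "a.exe";
    char *argv[2];
    int exitCode;

    argv[0] = programName;
    argv[1] = NULL;              /* argv[argc] is a null pointer */
    exitCode = userMain(1, argv);
    return exitCode;
}
```

The name main is special only because the entry routine is written to call a function with that name; the user's return value ends up wherever the entry routine chooses to send it.
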

Browse the code we provided, which consists of:

myStdio.h
myStdio.c
sysCall.obj
runtime.c
main.c
Makefile
Note that, compared to problem 2, the only newly added file is runtime.c. Make sure that you have finished problem 2, and copy your solution code from problem 2 into the file myStdio.c so that the function myPrintf works properly.

And now open a command line prompt and type:

nmake
which should generate an executable "a.exe". Run that .exe file. What appears on your screen?

Now open the files main.c and runtime.c, read code in them, and investigate how your screen output is generated.

Be able to answer these questions:
Question #1: Who calls the function main? How are main's arguments argc and argv passed? Where is main's return value returned to?

Question #2: Change the code in main.c to make main return 1. Rebuild the program and run the executable. What is main's return value used for?

Question #3: Must every C program have an entry function named main? Why or why not?