Lab 1

Goals

Learn to work with Unix system calls related to process creation and control.

Credits

This lab was developed by Prof. L. Felipe Perrone based on materials created by Prof. Phil Kearns for CSCI 315 at The College of William & Mary. Permission to reuse this material in parts or in its entirety is granted provided that this credits note is not removed. Additional student files associated with this lab, as well as any existing solutions can be provided upon request by e-mail to: perrone[at]bucknell[dot]edu

It should go without saying that all the work that you will turn in for this lab will be yours. Do not use AI to do the work for you; do not surf the web to get inspiration for this assignment; do not include code that was not written by you. You should try your best to debug your code on your own: use gdb, use printf, use your reasoning. It’s fine to get help from a colleague as long as that means getting assistance to identify the problem and doesn’t go as far as receiving source code to fix it (in writing or orally).

Set Up

Create a text file called lab01.txt in which you will write responses to questions throughout this lab. Identify each answer by the number of the problem and the item to which it corresponds. For instance: when answering problem 1, item 1, identify your answer as (1.1).
You will work in your ~/csci315/Labs/Lab1 directory in this lab. All your submission files should be at this location. Copy into this directory all the source files from ~cs315/Labs/Lab1. (Try to use terminal commands for copying files around; if you don’t know how to do that, please ask!)

Unix Processes

In this lab, we start experimenting with a few Unix system calls. The first one we will look at is fork(2), which is used by a process to spawn an identical copy of itself. When we learn to use a system call or a library function, it is helpful to follow the simple workflow described below.

Start by reading the man page of the system call or library function in which you are interested. This will help you begin to understand how it works, but it will also show you some practical details that are essential to using it successfully. In the man page, pay close attention to the SYNOPSIS; it will tell you:

The files you must #include in your program.
The function prototype(s) with which you will work.

For instance, if we’re dealing with fork, you’ll see something like:

FORK(2)                Linux Programmer's Manual                FORK(2)

NAME
       fork - create a child process

SYNOPSIS
       #include <sys/types.h>
       #include <unistd.h>

       pid_t fork(void);

DESCRIPTION
       fork() creates a new process by duplicating the calling process.
       The new process is referred to as the child process.
       The calling process is referred to as the parent process.
       ...

From this we learn that any program calling fork will need to #include the files sys/types.h and unistd.h. The “angle brackets” indicate that these files reside in an include directory owned by the system (most often /usr/include).

We also learn that the fork call:

Returns a value of type pid_t (a signed integer type, so that it can also hold the error value -1), and
Does not take any input parameters, which is indicated by the parameter list void.

Once we have tried our best to understand that information, we should not be so bold as to throw code into a large program to see how things work out. It is often more productive to write a small program just to test that we have the right understanding about the behavior of the function. Once we have experimented a bit with this program and are convinced that the function does what we expect and that we have learned to use it effectively, we can use it in a larger context.

Here is a first experiment with fork aimed at understanding what a child process inherits from a parent.

#include   <unistd.h> // need this for fork
#include   <stdio.h> // need this for printf and fflush

int i = 7;
double x = 3.1415926;
int pid;

int main (int argc, char* argv[]) {

  int j = 2;
  double y = 0.12345;

  if (pid = fork()) {
    // parent code
    printf("parent process -- pid= %d\n", pid); fflush(stdout);
    printf("parent sees: i= %d, x= %lf\n", i, x); fflush(stdout);
    printf("parent sees: j= %d, y= %lf\n", j, y); fflush(stdout);
  } else {
    // child code
    printf("child process -- pid= %d\n", pid); fflush(stdout);
    printf("child sees: i= %d, x= %lf\n", i, x); fflush(stdout);
    printf("child sees: j= %d, y= %lf\n", j, y); fflush(stdout);
  }   

  return(0);
}

This code is provided to you in file fork-test.c. Looking at this code, you may be inclined to think that you can infer the order of execution of these lines of C code. For instance: you might say that the parent executes first and the child executes next; or you might say that the order of execution is the one in which the program was written.

Don’t make the mistake of thinking that you can predict the order of execution of the actions in your processes! The process scheduler in the kernel will determine what executes when and your code should not rely on any assumptions of order of execution.

Problem 0 (5 points)

Create a Makefile that builds all the programs you created or modified for Pre-Lab 1 and Lab 1. You will work on this file incrementally to build the code of every subsequent problem in this lab assignment. Add it to your git repo now, then commit and push as your Makefile grows.

Problem 1 (20 points)

Let’s start slowly by investigating what a child process may be inheriting from its parent process. First, let’s get this code to compile!

a) Take a look at the program given to you in file fork.c. Compile and execute the program. Add code to have both the child and the parent print out the value of the pid returned by the fork() system call.

int main(int argc, char *argv[]) {
  int pid; int num;
  if (-1 == (pid = fork())) {
    perror("something went wrong in fork");
    exit(-1);
  } else if (0 == pid) {
    for (num=0; num < 20; num++) {
      printf("child: %d\n", num); fflush(stdout);
      sleep(1);
    }
  } else {
    for (num=0; num < 20; num+=3) {
       printf("parent: %d\n", num); fflush(stdout);
       sleep(1);
    }
  }
}

b) The variable num is declared before the call to fork() as shown in this program. After the call to fork(), when a new process is spawned, does there exist only one instance of num in the memory space of the parent process shared by the two processes or do there exist two instances: one in the memory space of the parent and one in the memory space of the child? Discuss your conclusion in lab01.txt.

Now, let’s experiment with forcing a specific order of termination of the processes. As given to you, the code for this problem makes no guarantee that the child will terminate before the parent does! With the concepts we have covered so far in class, we can use a very basic mechanism to establish order in process creation (with fork) and in process termination (with wait or waitpid).

c) Copy fork.c to file fork-wait.c and modify it so that you can guarantee that the parent process will always terminate after the child process has terminated. Your solution cannot rely on the termination condition of the for loops or on the use of sleep. The right way to handle this is using a syscall such as wait or waitpid – read their man pages before jumping into this task. One more thing: Modify the child process so that it makes calls to getpid(2) and getppid(2) and prints out the values returned by these calls.

When you have completed the problem, do the following:

cd ~/csci315/Labs/Lab1
git pull
git add lab01.txt
git add fork.c
git add fork-wait.c
git commit -m "Lab1, problem 1 completed"
git push

Problem 2 (25 points)

This problem will help you remember some material you studied in CSCI 206 Computer Organization & Programming. Do you remember that every running program (a.k.a. process) defines four segments of memory: text, data, stack, and heap?

a) Read fork-data.c carefully, then compile and run it. In lab01.txt explain in which segment of your running program the following variables reside: pid, x, y, i, and j.

b) In lab01.txt, discuss whether running fork-data.c allows you to conclude: (1) whether the data segment and the stack segment of a parent process are copied over to the child process; (2) whether changes made to these variables by the child are seen by the parent. What you discover for (2) will tell you whether parent and child share the same memory for data and stack segments or whether they each have their own separate segments.

Next, this problem will get you to investigate more deeply what a child process may inherit from its parent. This time, we will be working with files rather than variables. If a parent process has opened a file and then goes on to spawn a child process, you should wonder if the child will see the same file in open state. That is, will the file descriptor that the parent received after a call to open be usable in the child? Furthermore, by reading from a file shared by inheritance, does a process affect the “state” of this file that another process may be reading?

c) Copy the file given to you as fork-data.c to a new file called fork-file.c. Modify your new program so that before the if/fork structure, main creates and opens a file called data.txt and writes into it the string “this is a test for processes created with fork\nthis is another line”.

Note on opening to read or write a file. Be sure to use open(2), read(2), and write(2). Close the file right before the fork call (this will guarantee that all the writes to the file are flushed to disk). Immediately following the close call, open the file again for reading. Inside the parent code section of the program, have the parent issue a single call to read to get 5 characters from the file and print them to the terminal. Inside the child code of the program, have the child process issue a single call to read to get 5 characters from the file and print them to the terminal. Compile your code and run it to observe what happens.

d) In case it is not obvious: the file you open in main is visible in both child and parent processes. Experiment with your modified fork-file.c and write in your lab01.txt file the answers to the following questions: (1) if one process closes the file, can the other still read from it?; (2) say the child process reads from the “inherited” file; does that affect what the parent will read from the same file descriptor?

And now for something completely different. Every time you make a syscall or invoke a library function, take a look at the RETURN VALUE section of its man page. If it says something like “On error, -1 is returned, and errno is set appropriately,” you should consider wrapping that call with a function of your own. By doing so, you can replace the calls that your program makes to these services with calls to your customized version of that function, which will react to errors in a standardized manner. In the remainder of this problem, you will write your first wrapper to a syscall. Many more will follow as we get into the semester.

e) Create a new function (outside main) in the file fork-file.c to wrap the call to fork and perform some basic error detection. This function should have the same prototype as the fork system call, but its name will start with a capital letter, that is, its prototype will be:

pid_t Fork(void);

In this new function, you will invoke the system call fork and check if the return value is -1. When that is the case, your function should invoke perror to print out a human-readable error message and then call the library function exit with argument -1 to abort the program. This will terminate the process that called Fork and pass a return code to the creator of that process. (Hint: make sure to read the man pages for perror, exit, and any other system or library calls used in this lab and to #include in your code the header files you need.) After you create a wrapper for ANY function in this class, be sure that your programs use YOUR wrappers instead of the original functions.

When you have verified that your program works, do:

git add lab01.txt
git add fork-file.c
git commit -m "Lab 1, problem 2 completed"
git push

Problem 3 (20 points)

Before you get into this problem, let’s talk about a process’ termination status. In Problem 2, you were asked to check the return value of the call to fork(2) and to terminate the parent process when it fails. The mechanism for termination we suggested was to invoke the exit library call with argument -1. By convention, when a Unix process terminates without error, it returns or exits with status 0. When the termination status is different from 0, this convention indicates that some kind of error condition arose.

You can see how this works out in practice with a little experiment that you can run from your bash shell. The cat system utility, which resides in the /bin directory, can be used to display the content of a text file on the terminal. Try this out:

$ /bin/cat fork-file.c

Note that we are using the absolute path to invoke the cat utility and we are doing it just so that you don’t execute any other program with the same name in your PATH. If the file you passed to cat via the command line (that is, fork-file.c) is in your current working directory, its contents will appear on your terminal screen. When that is the case, the program cat terminates successfully and exits with status 0. You can learn the termination status of the last program you executed by inspecting a shell variable, as follows:

$ echo $?

When all goes well in your execution of cat, this echo command will show 0 for termination status. Now, try to cat a file that doesn’t exist:

$ /bin/cat bogus.nuthin

You don’t have a file with that weird name, hopefully! See how you got a termination status different from 0? It was probably 1, in this case. Anyway, what this value in $? indicates is that something went wrong in the execution of the previous program. From now on, remember to use the termination status of a program to your advantage: make your processes use exit or have the main function in your programs return a non-zero value when things go so awfully wrong that you have to abort them.

You will practice exactly that in this problem. And you will also learn to create processes that are not identical clones of their parent. This latter part means you will practice using a function from the exec family.

a) Create a file called catcount.c in which you will write a program that receives one command line argument of type string: the name of a text file. Your program will work as described below.

Start out by spawning a child process.
The child process calls execlp(3) to run /bin/cat with the command line argument that the parent received. Read the man page of this library call to learn what arguments to pass to it. This replaces the binary executable in the child process, but the parent goes on with the code you wrote.
The parent process calls wait(2) so that it blocks until the child terminates and passes back its termination status.
If the child process terminates without error, the parent spawns another child and, again, calls wait so that it can block until the child terminates.
The new child calls execlp again, but this time it runs /usr/bin/wc on the same argument that the parent received from the command line (the file name passed to cat previously).
Once the parent learns that the child has terminated, it goes on to terminate also. If the parent gets to this point, it’s because all has gone well, so its termination status should be 0.

We wrap up this assignment with a little bit of practice in working with a double pointer (that is, a pointer to pointer kind of variable). If you read the man page for execlp, you will notice in the SYNOPSIS section a variable named environ. The C library defines this variable as:

char **environ;

In order to use it in your program, you will need to declare the variable as “extern”, as indicated below:

extern char **environ;

The keyword extern tells the compiler that this variable is defined in a separately compiled module (in a library, in this case), leaving it to the linker to resolve the reference. Because of the close relationship between pointers and arrays in C, you can view environ as an array of pointers, where each element points to a string that matches the following format:

KEYWORD=VALUE

For this problem, we don’t actually care about the format of each individual string. We are interested in being able to write code to print all these strings to the terminal. To achieve this goal, we have to traverse the array from first to last element, printing each string we find followed by the “\n” (newline) character. Since this is a variable-size array, the pointer value NULL is used as a sentinel to mark the end of the array.

b) In your catcount.c program, create a function with the prototype shown below.

void print_environment(void);

Have your program call this function at the very start of main. The code of the function should be a simple loop to iterate through the array of strings environ up until its last element, printing to the terminal each of the elements it finds (again, follow each of these strings with a “\n”).

When you are done with this problem, do:

git add catcount.c
git commit -m "Lab 1, problem 3 completed"
git push

Hand In

Before turning in your work for grading, create a text file in your Lab 1 directory called submission.txt. In this file, provide a list to indicate to the grader, problem by problem, if you completed the problem and whether it works to specification. Wrap everything up by turning in this file and your Makefile from Problem 0 (if you haven’t done so already):

git add Makefile
git add submission.txt
git commit -m "Lab 1 completed"
git push

A word about submitting your work: it is your responsibility to ensure that everything that is necessary to build your programs is added, committed, and pushed to your git repository. You never add object or executable files to a git repo, but everything else you want graders to see must be there. It is unfair to our graders to expect that they should hound you to submit all the files needed for them to compile and test your work.

Grading Rubric

Problem 0 [5 points total]

Submitted a Makefile that builds all the programs created or modified for the prelab and lab.

Problem 1 [20 points total]

[5 points] Added to fork.c code to have both child and parent print out the value of the pid returned by the fork system call. fork.c must compile and execute correctly; otherwise this item scores only 2 points.
[5 points] Submitted file lab01.txt explains whether there is one instance of variable num for the two processes or if there is an instance in the memory space of the parent process and another instance in the memory of the child process.
[7 points] Submitted file fork-wait.c demonstrates the use of wait or waitpid by making the parent process wait for the child to terminate before it can terminate itself.
[3 points] Submitted file fork-wait.c has the child process making calls to getpid and getppid to obtain and print its pid and its parent’s pid.

Problem 2 [25 points total]

[4 points] Submitted file lab01.txt explains which segment (text, data, heap, or stack) contains the variables pid, x, y, i, and j declared in file fork-data.c.
[5 points] Submitted file lab01.txt explains whether the data segment and the stack segment of the parent process are copied over to the child process (2 points) and whether changes the child makes to its variables can be seen by the parent process (3 points).
[6 points] Submitted file fork-file.c used the file descriptor Linux API to create a file called data.txt and to write into it the given string.
[6 points] Submitted file lab01.txt explains if a file closed by one of the processes can be read by the other (3 points) and if the child reading from the file affects the reads the parent makes to the same file (3 points).
[4 points] Submitted file fork-file.c contains wrapper Fork() that calls fork() and checks whether the returned value was -1; in that case, it calls perror to print a message to the terminal describing the error and then aborts the program with return code -1.

Problem 3 [20 points total]

[10 points] In submitted file catcount.c, the child process successfully calls execlp to run /bin/cat with the argument received from the command line (4 points); the parent process calls wait to block until the child terminates and passes back a return code indicating its termination status (2 points); if the first child terminates without error, the parent spawns another child and waits for its termination (2 points); the new child calls execlp to run /usr/bin/wc on the file name the parent received on its command line (2 points).
[10 points] Submitted file catcount.c implements function print_environment that compiles without warnings (5 points); the execution of that function produces output similar to that of program env(1) (5 points).