ECE344 - Operating Systems
Assignment 3: Synchronization
Due Date: Sunday February 25, 2001, 11:59:59pm.
Objectives
The purpose of this lab is to get an introductory exposure to the available
synchronization primitives.
Preparation
Read and understand the man pages for mutex and semaphore.
Problems
-
Write a simple program as follows: Initialize a global integer variable
to 0. For further reference, I will call this variable x, but you
can, of course, call it anything you want. Call the function compute()
14 times. This function is declared as follows:
void compute( int * );
The object code for this function can be found here.
The argument to compute() should be a pointer to x. What
is the value of x after the 14 calls?
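For reference, a minimal single-threaded driver might look like the sketch
below; it assumes the provided compute.o is linked in, and the final printf
is only one possible way of reporting the value of x.

#include <stdio.h>

void compute( int * );   /* provided as object code */

int x = 0;               /* global, initialized to 0 */

int main( void )
{
    int i;

    for ( i = 0; i < 14; i++ )
        compute( &x );   /* pass a pointer to x on every call */

    printf( "x = %d\n", x );
    return 0;
}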
-
Change one of the programs you wrote for Lab 2 to create 2 threads
that each call compute() 7 times. Again, x, a globally
declared integer initialized to 0, should be passed as the argument. What
is the result of running this program, and how does it differ from the
first program?
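A possible two-thread version, assuming the pthreads library and the same
compute.o, is sketched below; the worker() function name is not prescribed
by the handout. Remember to link with -lpthread.

#include <pthread.h>
#include <stdio.h>

void compute( int * );   /* provided as object code */

int x = 0;               /* shared global */

static void *worker( void *arg )
{
    int i;

    for ( i = 0; i < 7; i++ )
        compute( &x );
    return NULL;
}

int main( void )
{
    pthread_t t1, t2;

    pthread_create( &t1, NULL, worker, NULL );
    pthread_create( &t2, NULL, worker, NULL );
    pthread_join( t1, NULL );
    pthread_join( t2, NULL );

    printf( "x = %d\n", x );
    return 0;
}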
-
Perhaps compute() should be run in a critical section. Fix your
program so that it runs correctly by using the mutex lock and unlock
synchronization operations provided by the pthreads library.
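One way to add the critical section, assuming the two-thread structure from
the previous problem, is to wrap each call to compute() in a lock/unlock
pair; only the changed parts are sketched here.

#include <pthread.h>

void compute( int * );

int x = 0;
pthread_mutex_t x_lock = PTHREAD_MUTEX_INITIALIZER;

static void *worker( void *arg )
{
    int i;

    for ( i = 0; i < 7; i++ ) {
        pthread_mutex_lock( &x_lock );    /* enter critical section */
        compute( &x );
        pthread_mutex_unlock( &x_lock );  /* leave critical section */
    }
    return NULL;
}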
-
One problem with synchronization is that it can add overhead to the running
time of a program. You are to measure how much overhead a lock/unlock pair
introduces if no waiting is required. Note that we are not interested in
the overhead of the compute() function, only the overhead of the
lock/unlock pair.
Write a program that measures the time for a lock/unlock pair using
gettimeofday() (see man 3c gettimeofday). Your program
should measure the time for 1 lock/unlock pair (do this twice) and the
average time for 10, 100, 1000, and 100000 lock/unlock pairs. To obtain
each of the average values, perform a single timing measurement of the
appropriate number of pairs, then divide by the number of pairs measured.
The output of your program must have the following format:
Time for 1 pair = YOUR VALUE
Time for 1 pair = YOUR VALUE
10 pairs, avg time = YOUR VALUE
100 pairs, avg time = YOUR VALUE
1000 pairs, avg time = YOUR VALUE
100000 pairs, avg time = YOUR VALUE
What is the time unit of these numbers? Each of these numbers measures
the same thing, so explain why you get different answers. Which number
do you expect to be closest to the actual value?
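One possible timing skeleton is sketched below; it assumes an uncontended
pthread mutex, and the helper time_pairs() (not part of the handout) returns
the elapsed time reported by gettimeofday() for n lock/unlock pairs.

#include <pthread.h>
#include <stdio.h>
#include <sys/time.h>

static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;

/* Elapsed gettimeofday() time, in microseconds, for n lock/unlock pairs. */
static double time_pairs( int n )
{
    struct timeval start, end;
    int i;

    gettimeofday( &start, NULL );
    for ( i = 0; i < n; i++ ) {
        pthread_mutex_lock( &lock );
        pthread_mutex_unlock( &lock );
    }
    gettimeofday( &end, NULL );

    return ( end.tv_sec - start.tv_sec ) * 1e6 +
           ( end.tv_usec - start.tv_usec );
}

int main( void )
{
    printf( "Time for 1 pair = %f\n", time_pairs( 1 ) );
    printf( "Time for 1 pair = %f\n", time_pairs( 1 ) );
    printf( "10 pairs, avg time = %f\n", time_pairs( 10 ) / 10 );
    printf( "100 pairs, avg time = %f\n", time_pairs( 100 ) / 100 );
    printf( "1000 pairs, avg time = %f\n", time_pairs( 1000 ) / 1000 );
    printf( "100000 pairs, avg time = %f\n", time_pairs( 100000 ) / 100000 );
    return 0;
}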
-
What do you think compute() does? What other functions does compute()
appear to call?
-
a) Adapt your multiping program from Lab 2 so that it outputs
the hosts sorted by (latency) distance; i.e., the hosts that respond the
fastest should be listed first.
b) Make the following modifications to the multiping program above:
-
The new program must be called testping, and should accept one
command line parameter. The parameter will be the name of an input file
containing up to 10 host names or Internet addresses, one per line.
-
Instead of running indefinitely, testping should print out 5 reports
and then exit. As before, reports should be printed every 10 seconds or
so.
-
Modify the output so that each host line appears as secs:usecs: host, and
there is an empty line after each periodic report. Note: There should be
no calls to "curses" routines or printing of special control characters.
You do not need to clear the screen! (One way to produce this format,
together with the latency sort from part a), is sketched after the example
output below.)
For example, if the input file contains:
www.cs.toronto.edu
www.toronto.edu
www.eecg.toronto.edu
Then the output should look like:
0:030903: www.cs.toronto.edu
0:039859: www.toronto.edu
0:055107: www.eecg.toronto.edu
0:027880: www.cs.toronto.edu
0:056314: www.eecg.toronto.edu
0:945247: www.toronto.edu
0:037378: www.toronto.edu
0:052461: www.eecg.toronto.edu
0:054199: www.cs.toronto.edu
0:028432: www.cs.toronto.edu
0:045742: www.toronto.edu
0:064377: www.eecg.toronto.edu
0:027718: www.cs.toronto.edu
0:466636: www.toronto.edu
0:531626: www.eecg.toronto.edu
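The internal data structures of multiping are up to you; as one possibility,
the fragment below uses a hypothetical host_info record (its fields are not
prescribed by the handout), sorts the hosts by round-trip time with qsort(),
and prints each line in the secs:usecs: host format shown above.

#include <stdio.h>
#include <stdlib.h>
#include <sys/time.h>

/* Hypothetical per-host record. */
struct host_info {
    char           name[256];
    struct timeval rtt;            /* latest round-trip time */
};

/* qsort() comparator: fastest host (smallest RTT) first. */
static int by_rtt( const void *a, const void *b )
{
    const struct host_info *ha = a, *hb = b;

    if ( ha->rtt.tv_sec != hb->rtt.tv_sec )
        return ha->rtt.tv_sec < hb->rtt.tv_sec ? -1 : 1;
    if ( ha->rtt.tv_usec != hb->rtt.tv_usec )
        return ha->rtt.tv_usec < hb->rtt.tv_usec ? -1 : 1;
    return 0;
}

/* Print one report, sorted by latency, followed by an empty line. */
static void print_report( struct host_info *hosts, int nhosts )
{
    int i;

    qsort( hosts, nhosts, sizeof( hosts[0] ), by_rtt );
    for ( i = 0; i < nhosts; i++ )
        printf( "%ld:%06ld: %s\n",
                (long) hosts[i].rtt.tv_sec,
                (long) hosts[i].rtt.tv_usec,
                hosts[i].name );
    printf( "\n" );
}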
-
Consider the multiping program again. The time needed to ping
a host can vary with each ping. For some hosts, the variance of the time
needed can be quite small, but for others it can be quite large. To get
a better insight into the variance, for each host, construct a histogram
of ping times with four buckets. You are free to choose the time-spans
assigned to each bucket however you wish, as long as you do so intelligently!
After each ping, one of the buckets gets incremented by one. Adapt multiping
to output next to each host 4 numbers indicating the percentage of pings
to that host that fall into each bucket. In addition, keep a running count
of the total pings in each bucket across all hosts being pinged. At the
bottom of your output, display the total for each bucket and the overall
total. Note that the sum of the per-bucket totals should equal the overall
total.
Your program should be called multiping2.
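One way to keep the buckets, with hypothetical names and arbitrarily chosen
bucket boundaries (pick your own spans intelligently), is sketched below;
the per-host percentages are then host_bucket[h][b] * 100 divided by the
number of pings sent to host h.

#define NBUCKETS  4
#define MAXHOSTS  10

/* Hypothetical bucket boundaries, in microseconds. */
static const long bucket_limit_usec[NBUCKETS - 1] = { 50000, 200000, 1000000 };

static long host_bucket[MAXHOSTS][NBUCKETS];   /* per-host counts */
static long total_bucket[NBUCKETS];            /* running totals across hosts */

/* Classify one ping time (in microseconds) into a bucket. */
static void record_ping( int host, long usec )
{
    int b = 0;

    while ( b < NBUCKETS - 1 && usec > bucket_limit_usec[b] )
        b++;
    host_bucket[host][b]++;
    total_bucket[b]++;
}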
-
Remember Java, your favorite programming language? Write a Java program
that implements a circular buffer, using two concurrent producing
threads and two concurrent consuming threads. Although there is no
class "Monitor" in Java, each object has its own implicit monitor that
is automatically entered when one of the object's synchronized
methods is called. A tutorial on Java threads appears here.
Optional
-
On a multiprocessor, many read-modify-write instructions, such as Test&Set
and Exchange, have negative performance effects because of the memory and
bus bandwidth they consume. Consider the following realistic scenario.
A critical region is highly contended for, and many processes wish to enter
it. They all spin on a lock variable using, say, Test&Set. Each of
these processes will in fact access the lock variable many times before
it can enter the critical region. Because these memory accesses bypass
the cache, and because the memory and bus can service only a limited number
of accesses per second, each such access uses up a portion of that fixed
bandwidth.
With many processes spinning on the same lock variable, they can easily
consume over 50% of the memory and bus bandwidth. This has a negative effect
on the performance of the program, because it slows down the process currently
in the critical region, as its bus and memory accesses are now much slower.
This is exactly the opposite of what one should be aiming for, namely to
have the process in the critical region execute faster, allowing other
processes to enter.
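To make the scenario concrete, a naive spin lock of this kind might look
like the sketch below; GCC's __sync_lock_test_and_set() builtin is used here
only as a stand-in for the hardware Test&Set (atomically store 1 and return
the previous value). Every iteration of the while loop is another
read-modify-write that consumes memory and bus bandwidth.

volatile int lock_flag = 0;

void spin_lock( volatile int *flag )
{
    /* Each failed attempt is a full read-modify-write on the bus. */
    while ( __sync_lock_test_and_set( flag, 1 ) == 1 )
        ;   /* spin */
}

void spin_unlock( volatile int *flag )
{
    __sync_lock_release( flag );   /* store 0, releasing the lock */
}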
-
For this reason, a number of recent processors introduced new primitives
for synchronization, called load-linked and store-conditional.
Synchronization is performed using two instructions:
Load-Linked( *flag )
returns the value of the variable flag, and as a side effect,
sets a user-transparent bit called the load link bit (LLbit). This
LLbit forms a breakable link between the load-linked instruction
and the subsequent store-conditional instruction:
Store-Conditional( *flag, value )
Stores value into the variable flag if and only if the
value of the LLbit is still set. If the store was successful (i.e., the
LLbit was still set), SC returns 1; if the link was broken, it returns 0. A
link may be broken for a number of reasons, the most important one being
the case where the value of flag was changed since the last LL
instruction. With these LL/SC instructions, a process waiting on a locked
variable needs to access memory only when the value of flag changes,
rather than on every poll.
Your job is to write (on paper, not implement) the functions lock(flag)
and unlock(flag) so that they behave correctly using only the LL/SC
instructions, not Test&Set. (A usage sketch of LL/SC, separate from the
lock/unlock you are asked to write, appears below.)
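As a usage illustration only (not the lock()/unlock() you are asked to
write), the sketch below expresses an atomic increment with the two
instructions above, written as hypothetical C wrappers with the signatures
given in the problem statement.

/* Hypothetical C wrappers around the LL and SC instructions. */
int Load_Linked( volatile int *flag );                  /* returns *flag, sets the LLbit */
int Store_Conditional( volatile int *flag, int value ); /* 1 on success, 0 if the link broke */

/* Atomically increment *counter: retry whenever the link is broken,
 * i.e. whenever some other processor wrote the variable between the
 * LL and the SC. */
void atomic_increment( volatile int *counter )
{
    int old;

    do {
        old = Load_Linked( counter );
    } while ( Store_Conditional( counter, old + 1 ) == 0 );
}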
Submission Instructions
In your home directory, create a directory called ece344; create
a subdirectory called as3 and then in that directory create files
with your code. You should only submit source files. To submit the assignment
you should use the submission procedure described in the policies
section.
Name your programs for problems #1-4 progx.c where x is the
problem number. Your code for #6a should be named multiping.c,
for #6b testping.c, for #7 multiping2.c, and for #8
prog8.java.
You should have a Makefile that has targets for each of the programs
(prog1, prog2,
prog3, prog4, multiping,
testping, and multiping2).