

Processes
The Basics


Prof. David Bernstein
James Madison University

Computer Science Department
bernstdh@jmu.edu


Review
  • Program:
    • A file containing machine language instructions, the entry-point address, data, symbol and relocation tables, and other information
  • Process:
    • An instance of a program execution
Associated with a Process
  • Process ID:
    • A positive integer that uniquely identifies it
  • Memory:
    • Text (i.e., instructions), initialized data, uninitialized data, stack, and heap
  • Arguments to main():
    • The array of pointers and the strings they point to are stored in a contiguous area of memory above the stack
  • Environment:
    • An array of strings of the form name=value (that are stored in a contiguous area of memory above the stack)
Working with Process IDs - getpid()
pid_t getpid(void)
Purpose:
Get the process ID of the calling process.
Details:
Return The process ID of the caller
#include <unistd.h>
#include <sys/types.h>
Working with Process IDs - getppid()
pid_t getppid(void)
Purpose:
Get the process ID of the parent of the calling process.
Details:
Return The process ID of the caller's parent
#include <unistd.h>
#include <sys/types.h>
Shell Info. About Process Status: ps
  • What it Does:
    • Reports process status (including IDs, utilization, address, size, times, etc...)
  • Useful Flags:
    • -a all processes associated with terminals
    • -A all processes
    • -l generates a long listing
Process Memory
  • Code/Text Segment:
    • Instructions
    • Read-only data
  • Data Segment:
    • Initialized data (containing external variables that have been initialized by the programmer)
    • Uninitialized data/block started by symbol (BSS) segment (containing external variables and static local variables that have not been initialized by the programmer)
  • Heap:
    • Dynamically allocated memory
  • Stack:
    • A last-in, first-out block of memory that contains automatic/temporary variables and information needed for process execution (i.e., the stack frame/activation records)
Process Memory (cont.)
  • An Obvious Question:
    • Why are initialized and uninitialized variables in distinct areas?
  • One Answer:
    • The contents of initialized variables are written to and read from the object file
    • The contents of uninitialized variables do not need to be written to the object file (only their total size is recorded, and the kernel sets them to zero before the program starts running)
Process Memory (cont.)
  • The Stack and Heap:
    • Tend to grow in opposite directions
  • Other Details:
    • Depend on the operating system, compiler, linker, and loader
Process Memory (cont.)
process-memory-organization
Process Memory (cont.)
unixexamples/processes/memory/memory_layout.c
 
Shell Info. About Process Memory: size
  • What it Does:
    • Displays the size of the text segment, initialized data segment, and uninitialized data segment
  • Where it Gets the Information:
    • Object and/or executable files
Understanding the Environment
  • Source:
    • A process "inherits" a copy of its parent's environment
  • A Common Use:
    • If you put environment variables in the shell's environment then a copy will be given to processes it creates
Working with the Environment from a Shell
  • Adding Environment Variables:
    • Most Shells:
      name=value
      export name
    • bash and Korn Shell:
      export name=value
    • C Shell:
      setenv name value
  • Displaying the Environment:
    • printenv
  • Removing an Environment Variable:
    • unset (or unsetenv in the C shell)
Get an Environment Variable in a Program - getenv()
char *getenv(const char *name)
Purpose:
Get an environment variable
Details:
name A pointer to the name of the environment variable
Return A pointer to the string containing the value (or NULL if there is no such variable)
#include <stdlib.h>
Add an Environment Variable in a Program - setenv()
int setenv(const char *name, const char *value, int overwrite)
Purpose:
Set (or reset) an environment variable
Details:
name A pointer to the name of the environment variable
value A pointer to the value of the environment variable
overwrite 0 not to overwrite an existing variable; nonzero to always set/reset
Return 0 on success; -1 on error
#include <stdlib.h>

Note: There is also a putenv() function, but you must be much more careful when using it because it does not duplicate the string; it just points to it. Hence, the string must not be an automatic variable (i.e., a character array on the stack) or an array that might change.

Remove an Environment Variable in a Program - unsetenv()
int unsetenv(const char *name)
Purpose:
Remove an environment variable
Details:
name A pointer to the name of the environment variable
Return 0 on success; -1 on error
#include <stdlib.h>
Creating a New Process in a Program
  • Motivation:
    • Divide a task into multiple flows of control
  • What Happens?
    • A parent process creates a child process (which is an almost exact duplicate)
Creating a New Process - fork()
pid_t fork(void)
Purpose:
Create a child process
Details:
Return Parent: PID of the child or -1. Child: 0
#include <unistd.h>
#include <sys/types.h>

Note: fork() is called from the parent process but creates a duplicate child process (containing the same instructions). So, there is a return in both the parent and the child.

Memory in the Parent and Child
  • Conceptually:
    • Both processes share the same instructions (text segment)
    • Each process has its own data, stack and heap (that are identical copies at the time of the fork())
  • In Reality:
    • Since the text segment is read-only, the processes can share it
    • The kernel uses a copy-on-write scheme to avoid unnecessary copying of the data, stack and heap
Execution of the Parent and Child
  • The Return from fork():
    • Happens in both processes
    • The process can be identified using the return value (-1 means an error occurred, 0 means the return occurred in the child, a positive integer means the return occurred in the parent)
  • The Order of the Returns:
    • Which process is next scheduled to use the CPU is indeterminate (making the overall system nondeterministic)
  • Implications of the Indeterminate Order of the Returns:
    • System correctness must not depend on processes running in a particular order
Files in the Parent and Child
  • What Happens?
    • The child has duplicates of all of the parent's file descriptors
  • Implications:
    • The file offset and the file status flags are shared between the parent and child (both descriptors refer to the same open file description)
Files in the Parent and Child (cont.)
file-descriptors_fork
The Process Lifecycle
process-lifecycle
Terminating a Process: Returning from main()
  • Return Values:
    • 0 for successful completion; positive otherwise
  • Missing Return Statement or Return Value:
    • Behavior is undefined in some implementations
Terminating a Process: _exit()
void _exit(int status)
Purpose:
Terminate the calling process normally
Details:
status 0 for a successful termination; positive otherwise
Return Does not return
#include <unistd.h>

Note: Normally, the library call exit() is used rather than the system call _exit()

Terminating a Process (cont.): exit()
void exit(int status)
Purpose:
"Cleanup" and terminate the calling process normally
Details:
status 0 for a successful termination; positive otherwise
Return Does not return
#include <stdlib.h>

Note: exit() calls exit handlers (in reverse order of registration), flushes the standard I/O streams, and calls _exit().

Details of Termination (Normal and Abnormal)
  1. Open file descriptors are closed (and file locks are released)
  2. Other descriptors are closed (e.g., message catalog descriptors, conversion descriptors)
  3. Semaphores are closed
  4. Message queues are closed
  5. Memory locks are removed
  6. Memory mappings are unmapped
Exit Handlers: atexit()
int atexit(void (*func)(void));
Purpose:
Add a function to a list of functions to be called when a process terminates.
Details:
Return 0 on success; non-zero on error
#include <stdlib.h>
Doing the "Same Thing" in Two Processes
unixexamples/processes/fork/messy/grads.c
 
Shortcomings of the Previous Implementation
  • Difficult to Understand:
    • It's easy to miss the fact that there are two processes
  • Limits Re-Use:
    • The code that performs the calculations can't be used in any other programs
Doing "Different Things" in Two Processes
unixexamples/processes/fork/messy/summary.c
 
Shortcomings of the Previous Implementation
  • Difficult to Understand:
    • It's hard to see what each process is doing
  • Limits Re-Use:
    • The code that performs the calculations can't be used in any other programs
The Examples Revisited - The Library
unixexamples/processes/fork/functions/stats_lib.c
 
Doing the "Same Thing" Revisited
unixexamples/processes/fork/functions/grads.c
 
Doing "Different Things" Revisited
unixexamples/processes/fork/functions/summary.c
 
Doing More Than Two Things
unixexamples/processes/fork/functions/summary_multiple.c
 
Doing More Than Two Things (cont.)
  • How Many Processes Are There?
    • After the original fork() there are two. Then, both the parent and child call fork() resulting in two more.
  • What Is Calculated?
    • The parent first calculates commutes in 1990
    • The parent then calculates commutes in 2000
    • The first child first calculates grads in 1990
    • The first child then calculates commutes in 2000
    • The second child calculates grads in 2000
    • The grandchild calculates grads in 2000
Doing More Than Two Things (cont.)
  • Which Processes Have Copies of File Descriptors?
    • Parent and first child
    • Parent and second child
    • First child and grandchild
  • Which Processes Point to the Same Entry in the i-Node Table?
    • The second child and the grandchild both open commute_2000.dat
A Real World Complication
  • The Issue:
    • Sometimes programs (with main() functions) already exist for the tasks we want to perform in the different processes
  • Dealing With Such Situations:
    • Use the exec family
Executing a Program: execve()
int execve(const char *path, char *const argv[], char *const envp[]);
Purpose:
Load a new program into an existing process's memory (discarding the existing text/code, data, stack, heap, etc...)
Details:
path The path name of the new program
argv The NULL-terminated (so the length can be determined) array of arguments to be passed to the new program
envp The NULL-terminated (so the length can be determined) array of name-value pairs specifying the environment list
Return Does not return on success; returns -1 on error
#include <unistd.h>

Note: The process ID remains the same and the file descriptors are left open.

Executing a Program: errno
ENOENT The file doesn't exist
ENOEXEC The file isn't in a recognizable format
ETXTBSY The file is open for writing
E2BIG The argument list and/or environment space are too big
An Example of execve()
The Programs
unixexamples/processes/exec/college.c
 
unixexamples/processes/exec/commute.c
 
An Example of execve() (cont.)
unixexamples/processes/exec/census.c
 
Doing More Than Two Things Revisited
unixexamples/processes/exec/census_notmultiple.c
 
Doing More Than Two Things Revisited (cont.)
  • How Many Processes Are There?
    • After the original fork() there are two. Then the text/code segment of both processes is changed by execve() so the second call to fork() is never executed.
  • What Is Calculated?
    • The child calculates grads in 2000
    • The parent calculates commutes in 2000
Doing More Than Two Things Revisited Again
unixexamples/processes/exec/census_multiple.c
 
Doing More Than Two Things Revisited Again (cont.)
  • How Many Processes Are There?
    • Four: the parent plus one child created by each of the three calls to fork()
  • What Is Calculated?
    • The first child calculates commutes in 1990
    • The second child calculates commutes in 2000
    • The third child calculates grads in 1990
    • The parent calculates grads in 2000
The exec() Family
  • An Observation:
    • Sometimes the parameters required by execve() are inconvenient
  • The Resolution:
    • There are several exec functions with slightly different signatures (e.g., execle(), execlp(), execvp(), execv(), and execl())
Monitoring Child Processes
  • A Common Situation:
    • The parent needs to wait (stay in the stopped state) until one or more children terminate
  • The Solution:
    • waitpid()
Waiting for a Child to Terminate: waitpid()
pid_t waitpid(pid_t pid, int *status, int options);
Purpose:
Wait for a particular child process to terminate.
Details:
pid The process to wait for (see below)
status An outbound parameter used to indicate how the child terminated
options A bit mask
Return The PID of the child or 0; -1 on error
#include <sys/types.h>
#include <sys/wait.h>

Note: If pid is greater than 0 then this call returns when the child with that specific process ID terminates. If pid is 0 then this call returns when any child in the same process group as the caller terminates. If pid is less than -1 then this call returns when any child in the process group whose ID is the absolute value of pid terminates. If pid is -1 then it returns when any child terminates.

Waiting for a Child to Terminate: Options
WNOHANG Return immediately (with a value of 0) if no member of the wait set has already terminated.
WUNTRACED Suspend execution of the calling process until a running member of the wait set is terminated or stopped.
WCONTINUED Suspend execution of the calling process until a running member of the wait set is terminated or a stopped member has been resumed.
Waiting for a Child to Terminate: wait()
pid_t wait(int *status);
Purpose:
Wait for any child process to terminate.
Details:
status An outbound parameter used to indicate how the child terminated
Return The PID of the child; -1 on error
#include <sys/types.h>
#include <sys/wait.h>

Note: Calling wait(&status) is equivalent to calling waitpid(-1, &status, 0).

Macros for "Dissecting" the Status
WIFEXITED(status) Returns true if the child exited normally.
WIFSIGNALED(status) Returns true if the child was killed by a signal (and WTERMSIG(status) returns the signal number).
WIFSTOPPED(status) Returns true if the child that caused the return is stopped (and WSTOPSIG(status) then returns the signal number).
WIFCONTINUED(status) Returns true if the child that caused the return was resumed.
Orphans and Zombies
  • Orphan:
    • A child process that outlives the parent (and is "adopted" by init - the ancestor of all processes)
  • Zombie:
    • A child that terminates before the parent calls wait() (and has most of its resources returned to the system but the process ID and termination status are maintained)
Details of Zombies
  • Can't be Killed:
    • So the parent always has an opportunity to call wait()/waitpid()
  • Are Eventually Removed:
    • When the parent calls wait()/waitpid()
    • When the parent terminates (and they are adopted by init)
  • An Implication:
    • Long-lived parents that create numerous children should call wait()/waitpid() to ensure that they don't create long-lived zombies (this is known as reaping)
An Example with Zombies
unixexamples/processes/wait/zombies.c
 
An Example with an Orphan
unixexamples/processes/wait/orphan.c
 
An Example of Reaping
unixexamples/processes/wait/reap_one.c
 
Another Example of Reaping
unixexamples/processes/wait/reap_all.c
 
Process Groups
  • An Observation:
    • Every process belongs to exactly one process group (which is identified by a positive integer)
  • Default Process Group:
    • By default, a child process belongs to the same process group as its parent
Setting the Process Group: setpgid()
int setpgid(pid_t pid, pid_t pgid)
Purpose:
Set the process group ID
Details:
pid The ID of the process (or 0 for the current process)
pgid The ID of the group (or 0 to use the process ID)
Return 0 on success; -1 on error
#include <sys/types.h>
#include <unistd.h>
Getting the Process Group: getpgrp()
pid_t getpgrp(void)
Purpose:
Get the process group ID of the calling process
Details:
Return Process group ID of the calling process
#include <unistd.h>
There's Always More to Learn