
CSC220 :: Lecture Note :: Week 16
Assignments | Handouts | Resources | Email Thurman {Twitter::@compufoo Facebook::CSzero}

Overview

Assignment(s):


The Unix Philosophy

Here is the Unix Philosophy in a nutshell.

GDT::Bit:: The Unix Philosophy



What is a Process?

A program is an executable (or binary) file that resides on some sort of secondary storage device (e.g. hard drive, floppy disk, tape, CD-ROM). [Programs are sometimes called applications or commands.]

A program is typically generated by translating some sort of source code into a machine language that is readable by the CPU (Central Processing Unit). Some programs contain source code that is interpreted by some other program. These are commonly referred to as scripts.

Typically, program files are located in a common directory or collection of directories. On many Unix systems, programs are found in the following locations:

   /bin
   /usr/bin
   /usr/local/bin

Note that the term bin is short for the word binary. Programs are often called binary files. Binary files have a non-ASCII format and cannot be modified using a regular text editor. Not all binary files are programs -- in some cases, they are data files such as databases.

Sadly, a binary file generated on one version of Unix may not be executable on a different Unix system even though the systems may be using the same CPU. [There are different types of executable formats.]

To execute a program, the binary file must be loaded into the memory of the computer. Once this is accomplished, the program now becomes a process. Said another way: A process is an instance of a program.

The shell is usually the program responsible for getting a program loaded into memory. [The shell -- when executing -- is a process.]

Once a process has been created, it is assigned a PID or process identifier that is used to track the process while it executes.

Every process has a parent process to which it belongs. In many cases, that parent process is the shell.
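
As a small illustration of the PID and parent relationship, the following sketch prints the PID of the current shell and the PID of its parent. It assumes a POSIX-style shell, where the special parameter $$ holds the shell's own PID and the variable PPID holds the parent's PID. The names shell_pid and parent_pid are made up for this example.

```shell
# $$ expands to the PID of the shell running this code.
shell_pid=$$
# PPID holds the PID of the process that started this shell.
parent_pid=$PPID
echo "this shell has PID $shell_pid; its parent has PID $parent_pid"
```

The actual numbers differ from one run to the next, because the kernel assigns a fresh PID each time a process is created.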

The following are some crude notes that have not been incorporated into the lecture note. They are intended for the CSC178 -- Programming in the Unix Environment course.

   + process subsystem
      * process control
         + creation
         + termination
      * scheduling
      * memory management
         - swapping
         - paging
   + layout of a process
      * text: 
         machine instructions executed by the CPU typically, 
         shareable and read-only
      * initialized data segment: 
         data that is defined & initialized outside any 
         function (e.g. int i = 5;)
      * uninitialized data segment:  
         (bss) data in this segment is initialized by the kernel 
         to 0 (or null pointers) before the program starts executing 
         (bss - old assembler operator "block started by symbol")
      * stack: 
         where auto variables are stored along with info that is 
         saved each time a function is called
      * heap:  dynamic memory alloc'd from here

         +-------------+
         |cmdline args |
         |env vars     |
         ---------------
         | stack       |
         ---------------
         |   ...       |
         |   heap      |
         |   ...       |
         ---------------
         | uninit data |
         -----------------------+
         | init'd data |
         ---------------          read from program file by exec
         | text        |
         -----------------------+

   + pid = fork()   (fork is the only way for Unix to create a new process)
      * in parent, pid > 0; in the child, pid is 0
      * process 0, created internally by the kernel when the system is booted,
        is the only process not created via fork
      * process creation
         - slot alloc'd in process tbl
         - unique pid assigned 
         - logical copy of the parent process is made
            (text area may not be included in this copy)
         - file and inode tbl counters incremented
         - return pid of child to parent, and 0 to child
      * child inherits the following (incomplete list):  real/effective 
         uid/gid, current working directory, root directory, 
         file mode creation mask, environment, resource limits
   + the 'ps' command
      - pid and ppid
      - pid 0 is scheduler
      - pid 1 is init
      - pid 2 is pageout (on many systems)
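
The fork bookkeeping in these notes can be observed from the shell. In the sketch below (assuming a POSIX shell), the & operator makes the shell create a child process, $! holds the child's PID as seen by the parent, and wait collects the child's exit status. The exit value 7 and the variable names are arbitrary choices for the example.

```shell
# The shell forks a child to run the parenthesized command in the background.
( exit 7 ) &
child_pid=$!            # the parent learns the PID of the child
child_status=0
wait "$child_pid" || child_status=$?   # wait hands back the child's exit status
echo "child $child_pid exited with status $child_status"
```

This mirrors the fork() return values listed above: the parent receives the child's PID, while the child simply runs its own code.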



More on Processes

A program file loaded into memory becomes a process.

Every process is assigned a PID (process id) by the kernel. PIDs are assigned on a sequential basis and usually wrap when they reach the value 32,767 (2 bytes might be used to store the PID in the process table).

At the lowest level, a process is created by a fork system call. If the new process is to run a different program file, then an exec is invoked (fork and exec). [ Webopedia.com::system call]
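
The fork-and-exec pattern can be seen from the shell. In this sketch, which assumes a POSIX shell and an echo program on the PATH, the command substitution creates a child shell (the fork step) and the exec built-in replaces that child with the echo program (the exec step). The second echo never runs, because the shell that would have run it no longer exists.

```shell
result=$(
    # exec replaces the child shell with the echo program ...
    exec echo "printed by the exec'd program"
    # ... so this line is never executed.
    echo "never reached"
)
echo "$result"
```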

<side-bar>
Much of Unix is built on six primary system calls.

	open, read, write, close, fork, exec
</side-bar>

Every process has a parent process. For shell users, their shell is the parent process of the commands executed.

To aid with terminology, the following paragraph was taken from page 656 of the text book for the course.

There is a special metaphor that applies to processes in the Unix system. The processes have life: they are alive or dead; they are spawned (born) or die; they become zombies or they become orphaned. They are parents or children, and when you want to get rid of one, you kill it.



Process Related Commands

There are numerous process-related commands that come with Unix. Two of the more commonly used commands are ps and kill.

The ps Command

The ps (process status) command is used to see a list of active processes.

The default output of ps displays the PID, the terminal the process was invoked on, the amount of CPU time given to the process, and the process name.

   $ ps -l      # gives a long listing of your current processes
   $ ps aux     # displays all processes (BSD syntax; System V systems use ps -ef)

When you do a long form of the ps command, the process state (S) is displayed. The following are the various states that a process can be in:

   O     running
   S     sleeping
   R     runnable process in queue
   I     idle process, being created
   Z     zombie
   T     process stopped and being traced
   X     process waiting for more memory

The pstree command gives a "tree" diagram of the process table.

The top command gives a "real-time" image of processor activity.

The kill Command

The kill command is used to stop a running process. The simplest version of the command is as follows:

   $ kill 1234
where 1234 is the PID of the process you want to stop.

The kill command is used to send a signal to the process. In some cases, the program can choose to ignore signals. If you try to kill a process but it doesn't die, then execute the following:

   $ kill -9 1234
where "-9" is an unconditional kill signal (programs cannot choose to ignore this signal).

If you execute kill without specifying a signal value, then a 15 (SIGTERM) is sent by default.
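
The default signal can be tried out safely on a process of your own. The sketch below assumes a POSIX shell and uses sleep as a harmless stand-in for a long-running program; the variable names are illustrative.

```shell
sleep 30 &               # a long-running background process to practice on
target=$!
kill "$target"           # no signal named, so SIGTERM (15) is sent
term_status=0
wait "$target" || term_status=$?
# A status above 128 means the process was ended by a signal.
echo "process $target ended with status $term_status"
```

Many shells report 128 plus the signal number here, but the exact value is shell-dependent, so only "greater than 128" should be relied on.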

If you want to kill all the processes you may have running on a particular terminal session, then execute the following:

   $ kill 0

You are only allowed to kill your own processes. Only the super-user has the privilege to kill any process.

The pkill Command

To be completed.



Daemons

Daemon processes are used to extend the functionality of the OS. They are not part of the kernel, but they play important roles in providing applications not directly supported by the kernel. [webserver, telnet, mail, line printer, cron, etc.]

A daemon is a process that starts at boot-time and continues as long as the system is up. Some daemons start and stop on an as needed basis and some run at scheduled time periods.

Some daemons can be thought of as service providers.

Many daemon program names end with a dee ('d'). [httpd, inetd, lpd, crond, ...]

Some popular daemon programs.

   init, cron, inetd, lpd, sendmail
   paging daemon (pageout, kpiod, pagedaemon)
   swapping daemon (swapper, kswapd)

inetd is a super-daemon. [Note: inetd has been replaced with xinetd.]

About the Word: daemon

The term daemon was first used in computing during the early 1960s. At one time the term meant "an attendant spirit that influences one's character or personality. A daemon is neither good or evil; they are creatures of independent thought and will."



Running Commands in the Background

Unix is a multi-tasking system. Not only can it support multiple users, but each user can in turn have multiple tasks running.

Usually when you execute a command, the shell takes over your terminal and you have to wait for the command to end before you see the shell prompt again. In some instances, the command you need to execute will take a long time and you don't want to sit idle waiting for it to finish; in other words, you want to "spawn" off the command letting it run in the background while you go ahead and issue more commands. This can be accomplished by using an ampersand at the end of the command-line.

   $ make &
   939
   $ who
   ...
   $ ps
   ...
   $ date
   ...

      Typically, the 'make' command can take a long time to
      execute.  Therefore, we start off the command and use
      a & on the command-line to spin it off in the background.
      The shell displays the PID (process id) of the command
      and re-issues the shell prompt.  Now we can execute more
      commands while the 'make' program runs as a background job.

When you execute commands in the background, you have to be careful with respect to processing output. If all the commands write data to the standard output and/or standard error streams, then the data from the various commands will be mixed together. In many instances, commands executed in the background have their output re-directed into a file.

   $ make 1>make.out 2>make.err &
   1411
   $ who
   ...
   $ date
   ...

You need to be careful when executing interactive programs in the background. When you do, you can have multiple programs reading the standard input stream, and this doesn't work [at least two -- the shell and the program executed as a background process]. Typically, the standard input stream is re-directed (i.e. input is obtained from an object other than the keyboard) when interactive programs are executed in the background.
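
One way to re-direct standard input is to prepare the answers in a file ahead of time. In the sketch below, the small sh -c program and the file names are invented for the example; the point is that the background program reads its input from the file, leaving the keyboard to the shell.

```shell
answers=$(mktemp)
printf 'alice\n' >"$answers"       # the reply that would otherwise be typed
greeting_file=$(mktemp)
# The background program reads from the file, not from the terminal.
sh -c 'read name; echo "hello $name"' <"$answers" >"$greeting_file" &
wait $!
cat "$greeting_file"
```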



The nohup Command

When you exit the system, a SIGHUP signal (value 1) is sent to all of the programs you have running. By default, a program dies upon receiving this signal.

The nohup command can be used to keep a program running after you exit the system (i.e. the program is started so that it ignores the SIGHUP signal).

   general syntax:

      nohup your_command_line &

   example:

      nohup make 1>make.out 2>make.err &

When you use nohup, you should always re-direct command output. If you don't, then nohup re-directs both output streams to a file called nohup.out (generally not a good idea: what happens when you nohup a couple of commands?).
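
The following sketch shows two jobs started with nohup, each with its own log file so that their output is not mixed together in nohup.out. The job commands and log file names are placeholders, and standard input is taken from /dev/null so that the jobs do not depend on the terminal.

```shell
log_a=$(mktemp)
log_b=$(mktemp)
# Each job gets its own log file, so nothing is written to nohup.out.
nohup sh -c 'echo "job A done"' >"$log_a" 2>&1 </dev/null &
pid_a=$!
nohup sh -c 'echo "job B done"' >"$log_b" 2>&1 </dev/null &
pid_b=$!
wait "$pid_a" "$pid_b"
cat "$log_a" "$log_b"
```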



Signals

A signal is a piece of data in the form of an integer value that can be sent to a process. Signals can be sent to a process using the kill command.

Signal values start at 1 and go up from there.

Each signal value has a name associated with it. A list of signal names can be obtained by using the kill command with the -l option. Here is an abbreviated list obtained from a Linux system.

    1) SIGHUP     2) SIGINT     3) SIGQUIT    4) SIGILL
    5) SIGTRAP    6) SIGABRT    7) SIGBUS     8) SIGFPE
    9) SIGKILL   10) SIGUSR1   11) SIGSEGV   12) SIGUSR2
   13) SIGPIPE   14) SIGALRM   15) SIGTERM   17) SIGCHLD
   ...

The ANSI 'C' standard defines the following signals.

   SIGABRT    SIGFPE     SIGILL
   SIGINT     SIGSEGV    SIGTERM

A process can be written to ignore or catch signals, except for SIGKILL (signal value 9) and SIGSTOP. If the process contains no signal handling logic, then the default behavior for most signals is for the process to terminate. If you want to terminate a program, then first use SIGTERM, followed by SIGKILL if the first kill doesn't work.
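
The "SIGTERM first, SIGKILL second" advice can be wrapped in a small shell function. This is a sketch, not a standard command: the name stop_process and the one-second grace period are assumptions made for the example. The test process ignores SIGTERM on purpose, so the function has to fall back to SIGKILL.

```shell
# stop_process PID: ask politely with SIGTERM, then force with SIGKILL.
stop_process() {
    kill -s TERM "$1" 2>/dev/null || return 0   # nothing to do if it is already gone
    sleep 1                                     # grace period for cleanup
    if kill -0 "$1" 2>/dev/null; then           # signal 0 only tests for existence
        kill -s KILL "$1"
    fi
}

# Usage sketch: a child that ignores SIGTERM still ends after SIGKILL.
sh -c 'trap "" TERM; exec sleep 30' &
stubborn_pid=$!
stop_process "$stubborn_pid"
final_status=0
wait "$stubborn_pid" || final_status=$?
echo "process ended with status $final_status"
```

Sending SIGTERM first gives a well-behaved program the chance to clean up (remove temporary files, flush buffers) before it exits.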

Typically, when working at the command-line, typing a <ctrl-c> causes a SIGINT (the interrupt signal value 2) to be sent to a process.

The operating system can send signals to a process when the process performs an illegal operation (e.g. attempts a divide-by-zero or accesses an invalid memory location).

Many daemon processes read configuration files upon startup. If the configuration is modified while the daemon is running, then a signal is usually sent to the daemon to instruct it to re-read its configuration files. SIGHUP is commonly used for this purpose.

   kill -s SIGHUP pid_of_the_daemon_process
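
On the receiving side, a shell script catches a signal with the trap built-in. The sketch below runs a tiny stand-in "daemon" in a child shell: it installs a SIGHUP handler, sends SIGHUP to itself, and keeps running. A real daemon would re-read its configuration files inside the handler; here the handler only prints a message, and the variable names are made up.

```shell
# The body of a tiny stand-in "daemon"; a real one would re-read its files.
daemon_body='
    trap "echo re-reading configuration" HUP
    kill -s HUP $$
    echo "still running"
'
trap_output=$(sh -c "$daemon_body")
echo "$trap_output"
```

Without the trap line, the default action for SIGHUP would end the child shell before it could print its last message.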



The crontab Command

The crontab command is used to maintain a "database" of files named crontab that in turn are used by the cron daemon process to execute jobs at specific times.

A daemon is a process that is not connected to a terminal. Daemons usually run in the background and are commonly used on Unix systems.

Each user can have a crontab file. The system administrator can prohibit you from using the cron facility by placing your name in the /usr/lib/cron/cron.deny file. If a /usr/lib/cron/cron.allow file exists, then only those users listed in that file can use cron. If both files exist, then /usr/lib/cron/cron.allow is the file that is used. If neither file exists, then only root is allowed to use cron.

You will want to use cron rather than at when you have jobs that you want to run more than once (or you don't have permission to use at).

The following is an example of a crontab file.

   17 5 * * 0 /etc/cleanup > /dev/null
   0  2 * * 0,4 /usr/lib/cron/logchecker
   1  3 * * * cat -s /dev/clock >/dev/null 2>&1 || exit 0;
             /etc/setclock `date +\%m\%d\%H\%M\%y`
   30 12 15 * * ulimit 5000; 
                /bin/su uucp -c "/usr/lib/uucp/uudemon.clean" > /dev/null
   0 1 1 * * cp /usr/adm/messages /usr/adm/messages.prev ; >/usr/adm/messages
   55 16 1,15 1,4,7,10 * >/usr/spool/mail/uucp
   0,15,30,45 0-3 * * 1,3,5 /usr/local/bin/happy

Each crontab entry has six fields and they are:

  1. minute that the command is to run (0-59)
  2. hour that the command is to run (0-23)
  3. day of the month... (1-31)
  4. month of the year... (1-12)
  5. day of the week... (0-6, 0=Sun)
  6. the command to run specified using absolute path

    Asterisks are wildcards: an asterisk in one of the first five fields matches every value of that field.
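
To see how one entry breaks down into its six fields, the entry can be split with the shell's read command. This is only an illustration of the field layout, not how cron itself parses the file; the variable names are invented for the example.

```shell
entry='0,15,30,45 0-3 * * 1,3,5 /usr/local/bin/happy'
# read splits on whitespace; the last variable receives the rest of the line,
# so a command containing spaces stays in one piece.
read -r minute hour day_of_month month day_of_week command <<EOF
$entry
EOF
echo "minute=$minute hour=$hour dom=$day_of_month month=$month dow=$day_of_week"
echo "command=$command"
```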

    Typically, when you run a command via cron, command output is re-directed into a file. If it isn't, then the output is sent to you via email.

       17 5 * * 0 /etc/cleanup > /dev/null
    
          At 5:17 on every Sunday, execute /etc/cleanup.
    
       0  2 * * 0,4 /usr/lib/cron/logchecker
          
          At 2:00 on every Sunday and Thursday, execute the
          /usr/lib/cron/logchecker program.
    
       30 12 15 * * ulimit 5000; 
                    /bin/su uucp -c "/usr/lib/uucp/uudemon.clean" > /dev/null
    
          At 12:30 on the 15th day of every month, run the commands
          ulimit and uudemon.clean.
    
       0 1 1 * * cp /usr/adm/messages /usr/adm/messages.prev; >/usr/adm/messages
    
          At 1:00 on the 1st day of every month, make a backup copy
          of the  /usr/adm/messages  file and then clear the file.
    
       55 16 1,15 1,4,7,10 * >/usr/spool/mail/uucp
    
          At 16:55 on the 1st and 15th of Jan., Apr., Jul., and Oct.,
          clear out the mail file for the  uucp  account.
    
       0,15,30,45 0-3 * * 1,3,5 /usr/local/bin/happy >/tmp/happy.out 2>&1
    
          Every Monday, Wednesday and Friday, beginning at mid-night
          and ending at 3:45, run the command  /usr/local/bin/happy
          at 0, 15, 30 and 45 minutes after the hour.
    

    Using the crontab command.

       $ crontab your_crontab_file
    
          Submits your  crontab  file to the cron program.
    
       $ crontab -r
    
          Removes your  crontab  file from the cron program.
    
       $ crontab -l
    
          Displays the content of the  crontab  file that was
          submitted to cron.
    
       $ crontab
       CTRL-D
    
          Removes your  crontab  file from  cron  and is
          probably an error on your part.
    

    It is important to note that cron executes with a limited environment. In other words, the jobs that you run using cron should not rely on a particular environment setup. If they do, then the command needs to take steps to get that environment established.
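
    One way to find out whether a job depends on your login environment is to run it with almost nothing set before handing it to cron. The function below is a sketch: the name run_like_cron is made up, and the short list of variables and the PATH value are assumptions, since each cron implementation supplies its own minimal environment.

```shell
# run_like_cron COMMAND: run COMMAND with only a handful of variables set.
# The variable list and PATH value are assumptions; real cron daemons differ.
run_like_cron() {
    env -i HOME="${HOME:-/}" LOGNAME="${LOGNAME:-unknown}" SHELL=/bin/sh \
        PATH=/usr/bin:/bin /bin/sh -c "$1"
}

MY_SETTING="from login shell"
export MY_SETTING
run_like_cron 'echo "MY_SETTING is ${MY_SETTING:-not set}"'
```

    A variable exported in the login shell is not visible to the command, which is the situation a cron job has to be prepared for.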

    [Realworld Experience] I once worked on a system (SCO) where the cron program magically died on the 56th day of running. This was a major problem because most customers never reported the problems to the support group. We could never track the problem down; therefore, we added a cron entry to cause the machine to reboot once every month. [This is the worst, but common way to "fix" a problem -- treat the symptom not the illness.]



The alias Command

Some Unix shells allow you to define alias commands. Alias commands can be used to save typing when it comes to executing frequently used command-lines.

If you are a DOS user, then you may define the following alias commands.

   alias del='rm -i'
   alias dir='ls'
   alias edit='vi'

Define aliases in the .profile that you use. This will ensure they are defined every time you log in.

Here are some example aliases.

   alias motd='cd $HOME/public_html/motd'
   alias rmdead='rm -f ~/dead.letter'
   $ cd
   $ pwd
   /home/gdt
   $ ls
   dead.letter
   $ motd
   $ pwd
   /home/gdt/public_html/motd
   $ rmdead
   $ ls $HOME
   $ rmdead
   $

Many SysAdmins like to alias the rm command to always be interactive. In addition, some Linux distributions come with various aliased versions of the ls command.



Shell Scripts

The shell -- in addition to being a command-line interpreter -- is also a script programming language.

The shell programming language is a 3rd-generation high-level programming language that provides constructs for structured programming; in other words, it supports sequence, selection and iteration (repetition).
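
The three constructs can be seen together in a few lines. This sketch, written for a POSIX shell, adds up the even numbers from 1 to 5; the variable names are arbitrary.

```shell
total=0                               # sequence: statements run in order
for n in 1 2 3 4 5; do                # iteration: repeat for each value
    if [ $((n % 2)) -eq 0 ]; then     # selection: act only on even numbers
        total=$((total + n))
    fi
done
echo "sum of the even numbers: $total"
```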

A shell script is a file that contains a series of commands for a shell program to execute. As a shell script is executed, each command in the file is interpreted by the shell. Shell script programs are not compiled.

Shell script filenames are typically lower-case, but they can contain upper-case letters, digits and other characters. A suffix on a filename can be used to imply file content, but the shell does not key off of filename suffixes.

After a shell script program has been written and saved to a file, the file is typically "marked" executable and it is copied to a "bin" directory that is contained on the user's PATH.

Within a shell script file, the octothorp character # is used to start a comment. Comments are ignored by the shell executing the shell script. The # starts a comment and it stays in effect until the end-of-line.

Shell scripts can use variables to store values (data/information). A variable is a piece of memory given a name to store a value. The variable has no value until one is assigned to it. The equal sign = is the assignment operator and it is used to assign a value to a variable. Spaces are not allowed on either side of the equal sign. Variable names begin with a letter or an underscore; you can use letters, numbers and underscores for the rest of the name. The naming of variables is important and the programmer often has to design and implement a naming convention. Variables do not have to be given a type by the programmer.

   max_number=10
   default_name="unknown"
   PROMPT="Enter your name: "
   foo=

   # the variable foo has no value
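
As a short sketch of how such variables are used, the example below builds a greeting from the variables defined above. The ${name:-default} form, a standard POSIX shell expansion, substitutes a default when a variable is unset or empty; the variable greeting is added for the example.

```shell
max_number=10
default_name="unknown"
foo=
# ${foo:-$default_name} falls back to the default when foo is unset or empty
greeting="Hello, ${foo:-$default_name}"
echo "$greeting (limit $max_number)"
# a space around = would make the shell look for a command named max_number
```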

Example Shell Script Programs

Suppose you are always curious as to how many regular files are in your current working directory. The following shell script program could be written.

   file name:  filecnt.sh
   ----------------------

   echo "`pwd` has `ls -l | grep '^-' | wc -l` file(s)"

   ----------------------

   $ sh filecnt.sh

      From the shell's perspective,  filecnt.sh  is a regular file 
      containing plain-old-text; therefore, to execute it, another 
      shell is started and the file is supplied as an argument.  
      The shell command ('sh' in this example) expects a shell 
      script filename as an argument.  If no filename is specified, 
      then  sh  issues a command-line prompt.  The  sh  command opens 
      the file and sequentially executes each line in the file.

   $ chmod 755 filecnt.sh

      Typically, shell scripts that are created for re-use are marked 
      executable.  This eliminates the need to invoke a shell command 
      in order to run the script.  

   $ ./filecnt.sh

      The  ./  must be prefixed to the command-name to tell  sh  where 
      the script file is located.  This is not necessary if the current 
      working directory (.) is part of the PATH.  Recall, if  .  is on 
      your PATH, then it should be the last component.

   $ PATH="$PATH:$HOME/bin:."
   $ filecnt.sh

      Placing our current working directory on our PATH eliminates 
      the need to prefix the command-name with dot-slash.

   $ ln filecnt.sh $HOME/bin/filecnt

      Re-usable commands are placed in a directory that 
      is included in our PATH.

   $ filecnt

      If this script is to be executed over and over, then it
      should be stored in a directory that is on your PATH.  If
      you do this, then you can execute the command from any
      directory.  Typically, you do not see the  ".sh"  suffix
      used on executable files.
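
The counting pipeline from filecnt.sh can also be packaged as a shell function that takes the directory as an argument. This is a variation for illustration, not the script's source code; the function name count_regular_files is made up.

```shell
# count_regular_files DIR: print the number of regular files in DIR.
count_regular_files() {
    # ls -l starts the line of each regular file with "-"; grep -c counts them
    ls -l "$1" | grep -c '^-'
}

echo "$(pwd) has $(count_regular_files .) file(s)"
```

Directories, symbolic links and the "total" line printed by ls -l do not start with a dash, so they are not counted.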

[filecnt.sh source code]

Example

Let us assume you want to be presented with the date and time each time you log off, and that you want an entry made in a log file.


   file name:  bye.sh
   ------------------

   now=`date`
   echo "Good bye... it is now $now"
   echo "$now" >>"$HOME/.logout_times"
   exit

   ------------------

   $ chmod 755 bye.sh
   $ ln bye.sh $HOME/bin/bye
   $ bye
   Good bye... it is now ...
   $

      Our expectation is that the  exit  statement in the
      script will result in a logout, but it doesn't.  This
      is because the shell spawns another shell to execute
      the command and the  exit  statement causes that shell
      to terminate.

   $ . bye          # or we can execute   source bye
   Good bye... it is now ...
   Connection closed.

      The  dot  and  source  commands are built-in shell commands
      that let you execute a script in the current shell; they
      prevent the shell from creating a sub-shell in order to
      execute the command-line.
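
The difference between running a script in a child shell and running it with dot can be checked with a variable. In this sketch, the one-line script file and the variable names are invented for the example, and mktemp is assumed to be available.

```shell
snippet=$(mktemp)
echo 'color=blue' >"$snippet"     # a one-line script that sets a variable
color=red
sh "$snippet"          # runs in a child shell; our variable is untouched
after_child=$color
. "$snippet"           # runs in the current shell; our variable changes
after_dot=$color
echo "after sh: $after_child, after dot: $after_dot"
```

The same reasoning explains the exit statement in bye.sh: it only ends the login shell when the script is run with dot.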

[bye.sh source code]

Example

Suppose you want to change your prompt without having to remember about the PS1 environment variable.

   
   file name:  setprompt.sh  
   ------------------------

   echo -n "Enter your prompt: "
   read prompt
   PS1="$prompt: "

   ------------------------

   $ ln setprompt.sh $HOME/bin/setprompt
   $ chmod 755 $HOME/bin/setprompt
   $ setprompt
   Enter your prompt:  yes?
   $                             # The prompt does not change.
   $ . setprompt
   Enter your prompt:  yes?
   yes?: 

      The  read  command reads one line from the input device.
      The first word is stored in the first variable, the
      second word in the second variable, and so on.

[setprompt.sh source code]

Example

The following is an example of using the read command.

   file name:  whoRU.sh
   --------------------

   echo "Enter first name, last name, address, city, state."
   echo "Example:  john doe 2541 N. Whatever, Tempe, AZ"
   echo -n " "
   read firstname lastname address
   echo "You are $lastname, $firstname and your address is $address"

   --------------------

   By default, whitespace is used to separate words on
   the input stream.  If there are more words than variables
   supplied to read, then all the remaining words are assigned
   to the last variable.
[whoRU.sh source code]



More Shell Scripting

params.sh and if.sh and for.sh and while.sh and arithops.sh and exprcmd.sh


