Copyright © 2002 Paul Sheer.

Next: 8. Streams and sed Up: rute Previous: 6. Editing Text Files   Contents


7. Shell Scripting

This chapter introduces you to the concept of computer programming. So far, you have entered commands one at a time. Computer programming is merely the idea of getting a number of commands to be executed that, in combination, perform some unique and powerful function.

7.1 Introduction

To execute a number of commands in sequence, create a file with a .sh extension (here, myfile.sh), into which you will enter your commands. The .sh extension is not strictly necessary but serves as a reminder that the file contains special text called a shell script. From now on, the word script will be used to describe any sequence of commands placed in a text file. Now run

 
chmod 0755 myfile.sh

which makes the file executable, so that it can be run as described below.

Edit the file using your favorite text editor. The first line should be as follows, with no whitespace before it. [Whitespace means tabs and spaces and, in some contexts, newline (end of line) characters.]

 
#!/bin/sh

The line dictates that the following program is a shell script, meaning that it accepts the same sort of commands that you have normally been typing at the prompt. Now enter a number of commands that you would like to be executed. You can start with

 
 
 
 
echo "Hi there"
echo "what is your name? (Type your name here and press Enter)"
read NM
echo "Hello $NM"

Now, exit from your editor and type ./myfile.sh. This will execute [Cause the computer to read and act on your list of commands, also called running the program. ] the file. Note that typing ./myfile.sh is no different from typing any other command at the shell prompt. Your file myfile.sh has in fact become a new UNIX command all of its own.

Note what the read command is doing. It creates a pigeonhole called NM, and then inserts text read from the keyboard into that pigeonhole. Thereafter, whenever the shell encounters NM, its contents are written out instead of the letters NM (provided you write a $ in front of it). We say that NM is a variable because its contents can vary.

You can use shell scripts like a calculator. Try

 
 
 
 
echo "I will work out X*Y"
echo "Enter X"
read X
echo "Enter Y"
read Y
echo "X*Y = $X*$Y = $[X*Y]"

The $[ and ] mean that everything between them must be evaluated [Substituted, worked out, or reduced to some simplified form.] as a numerical expression [Sequence of numbers with +, -, *, etc. between them.]. You can, in fact, do a calculation at any time by typing at the prompt

 
echo $[3*6+2*8+9]

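[Note that the $[ ] notation works only because the shell that you are using allows it. On some UNIX systems you will have to use the expr command to get the same effect. Shells that follow the POSIX standard also accept the equivalent $(( )) notation, as in echo $((3*6+2*8+9)).]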
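The expr alternative mentioned in the note can be sketched as follows, next to the $(( )) notation that the POSIX standard defines. This is only an illustration: the variable names A and B are made up, and expr is wrapped in backward quotes, which Section 7.12 explains.

```shell
# Two portable ways to evaluate 3*6+2*8+9.

# POSIX arithmetic expansion.
A=$((3*6+2*8+9))

# expr needs spaces between its terms; the * is quoted so that the
# shell does not expand it into file names.
B=`expr 3 '*' 6 + 2 '*' 8 + 9`

echo "A = $A, B = $B"
```

Both variables should hold the same value as the $[3*6+2*8+9] example above.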

7.2 Looping to Repeat Commands: the while and until Statements

The shell reads each line in succession from top to bottom: this is called program flow. Now suppose you would like a command to be executed more than once--you would like to alter the program flow so that the shell reads particular commands repeatedly. The while command executes a sequence of commands many times. Here is an example ( -le stands for less than or equal):

 
 
 
 
N=1
while test "$N" -le "10"
do
        echo "Number $N"
        N=$[N+1]
done

The N=1 creates a variable called N and places the number 1 into it. The while command executes all the commands between the do and the done repeatedly until the test condition is no longer true (i.e., until N is greater than 10). See test(1) (that is, run man 1 test) to learn about the other types of tests you can do on variables. Also note how N is replaced with a new value that is 1 greater on each repetition of the while loop.

You should note here that each line is a distinct command--the commands are newline-separated. You can also have more than one command on a line by separating them with a semicolon as follows:

 
N=1 ; while test "$N" -le "10"; do echo "Number $N"; N=$[N+1] ; done

(Try counting down from 10 with -ge (greater than or equal).) It is easy to see that shell scripts are extremely powerful, because any kind of command can be executed with conditions and loops.

The until statement is identical to while except that the logic is reversed: the loop repeats until the condition becomes true. The same counting loop can be written with until and -gt (greater than):

 
N=1 ; until test "$N" -gt "10"; do echo "Number $N"; N=$[N+1] ; done
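The countdown suggested earlier might look like the following sketch, written once with while and once with until. It uses the POSIX $(( )) form of arithmetic rather than $[ ] so that it also runs under shells that lack the latter; the second variable M is introduced only to keep the two loops separate.

```shell
# Count down from 10 to 1 with while and -ge.
N=10
while test "$N" -ge "1" ; do
        echo "Number $N"
        N=$((N-1))
done

# The same countdown with until: repeat until M drops below 1.
M=10
until test "$M" -lt "1" ; do
        echo "Number $M"
        M=$((M-1))
done
```

Each loop stops as soon as its variable reaches 0, because the condition is tested before every repetition.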

7.3 Looping to Repeat Commands: the for Statement

The for command also allows execution of commands multiple times. It works like this:

 
 
 
 
for i in cows sheep chickens pigs
do
        echo "$i is a farm animal"
done
echo -e "but\nGNUs are not farm animals"

The for command takes each string after the in, and executes the lines between do and done with that string substituted for i. The strings can be anything (even numbers) but are often file names.
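Since the strings can be numbers, a for loop can also drive a small calculation. The sketch below adds up a fixed list; TOTAL is a made-up variable name, and the arithmetic uses the POSIX $(( )) notation.

```shell
# Add up the numbers in the list, one per repetition of the loop.
TOTAL=0
for n in 1 2 3 4 ; do
        TOTAL=$((TOTAL + n))
done
echo "The total is $TOTAL"
```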

The if command executes a number of commands if a condition is met (-gt stands for greater than, -lt stands for less than). It executes all the lines between the then and the fi ("if" spelled backwards).

 
 
 
 
X=10
Y=5
if test "$X" -gt "$Y" ; then
        echo "$X is greater than $Y"
fi

The if command in its full form can contain as much as:

 
 
 
 
X=10
Y=5
if test "$X" -gt "$Y" ; then
        echo "$X is greater than $Y"
elif test "$X" -lt "$Y" ; then
        echo "$X is less than $Y"
else
        echo "$X is equal to $Y"
fi
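The conditions are not limited to comparing numbers. As a hedged illustration, the sketch below uses two file tests documented in test(1): -d (true for a directory) and -e (true if the file exists). It assumes that the directory / exists, which holds on any UNIX system; F and KIND are names invented for the example.

```shell
# Classify a path using the full if/elif/else form.
F=/
if test -d "$F" ; then
        KIND="a directory"
elif test -e "$F" ; then
        KIND="some other kind of file"
else
        KIND="missing"
fi
echo "$F is $KIND"
```

Only the first branch whose condition is true is executed, so the order of the tests matters.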

Now let us create a script that interprets its arguments. Create a new script called backup-lots.sh, containing:

 
 
 
 
#!/bin/sh
for i in 0 1 2 3 4 5 6 7 8 9 ; do
        cp "$1" "$1.BAK-$i"
done

Now create a file important_data with anything in it and then run ./backup-lots.sh important_data, which will copy the file 10 times with 10 different extensions. As you can see, the variable $1 has a special meaning: it is the first argument on the command-line. Now let's get a little bit more sophisticated (-e tests whether the file exists):

 
 
 
 
#!/bin/sh
if test "$1" = "" ; then
        echo "Usage: backup-lots.sh <filename>"
        exit 1
fi
for i in 0 1 2 3 4 5 6 7 8 9 ; do
        NEW_FILE="$1.BAK-$i"
        if test -e "$NEW_FILE" ; then
                echo "backup-lots.sh: **warning** $NEW_FILE"
                echo "                already exists - skipping"
        else
                cp "$1" "$NEW_FILE"
        fi
done

7.4 Breaking Out of Loops and Continuing: the break and continue Statements

A loop that requires premature termination can include the break statement within it:

 
 
 
 
#!/bin/sh
for i in 0 1 2 3 4 5 6 7 8 9 ; do
        NEW_FILE="$1.BAK-$i"
        if test -e "$NEW_FILE" ; then
                echo "backup-lots.sh: **error** $NEW_FILE"
                echo "                already exists - exiting"
                break
        else
                cp "$1" "$NEW_FILE"
        fi
done

which causes program execution to continue on the line after the done. If two loops are nested within each other, then the command break 2 causes program execution to break out of both loops; and so on for values above 2.
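A small sketch of break 2 with two nested for loops follows. It searches for the first pair of numbers whose product is 4 and then leaves both loops at once; the variable FOUND and the search itself are made up for the illustration.

```shell
# Stop both loops as soon as a pair with product 4 is found.
FOUND=""
for i in 1 2 3 ; do
        for j in 1 2 3 ; do
                if test "$((i*j))" -eq 4 ; then
                        FOUND="$i*$j"
                        break 2
                fi
        done
done
echo "first pair with product 4: $FOUND"
```

With a plain break, only the inner loop would end and the outer loop would carry on with the next value of i.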

The continue statement is also useful for terminating the current iteration of the loop. This means that if a continue statement is encountered, execution will immediately continue from the top of the loop, thus ignoring the remainder of the body of the loop:

 
 
 
 
#!/bin/sh
for i in 0 1 2 3 4 5 6 7 8 9 ; do
        NEW_FILE="$1.BAK-$i"
        if test -e "$NEW_FILE" ; then
                echo "backup-lots.sh: **warning** $NEW_FILE"
                echo "                already exists - skipping"
                continue
        fi
        cp "$1" "$NEW_FILE"
done

Note that both break and continue work inside for, while, and until loops.
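As a sketch of continue inside a while loop, the following collects the even numbers from 1 to 6. Note that the counter is increased before the continue; otherwise the loop would never advance. The names N and EVENS are invented for the example.

```shell
# Skip odd numbers with continue; remember the even ones.
N=0
EVENS=""
while test "$N" -lt 6 ; do
        N=$((N+1))
        if test "$((N % 2))" -ne 0 ; then
                continue
        fi
        EVENS="$EVENS $N"
done
echo "even numbers:$EVENS"
```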

7.5 Looping Over Glob Expressions

We know that the shell can expand file names when given wildcards. For instance, we can type ls *.txt to list all files ending with .txt. This applies equally well in any situation, for instance:

 
 
 
 
#!/bin/sh
for i in *.txt ; do
        echo "found a file: $i"
done

The *.txt is expanded to all matching files. These files are searched for in the current directory. If you include an absolute path then the shell will search in that directory:

 
 
 
 
#!/bin/sh
for i in /usr/doc/*/*.txt ; do
        echo "found a file: $i"
done

This example demonstrates the shell's ability to search for matching files and expand an absolute path.
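One thing to watch for: by default, a pattern that matches nothing is left as it is, so the loop body runs once with the literal pattern as the value of i. A common guard, sketched below, is to skip names that do not exist. The temporary directory and the file names are made up for the illustration, and mktemp -d is assumed to be available (it is on GNU and BSD systems).

```shell
# Set up a scratch directory with two .txt files in it.
DIR=`mktemp -d`
touch "$DIR/a.txt" "$DIR/b.txt"

# This pattern matches two files.
COUNT=0
for i in "$DIR"/*.txt ; do
        test -e "$i" || continue
        COUNT=$((COUNT+1))
done

# This pattern matches nothing; the guard skips the literal pattern.
NONE=0
for i in "$DIR"/*.nomatch ; do
        test -e "$i" || continue
        NONE=$((NONE+1))
done

rm -r "$DIR"
echo "matched $COUNT files, then $NONE files"
```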

7.6 The case Statement

The case statement can make a potentially complicated program very short. It is best explained with an example.

 
 
 
 
#!/bin/sh
case $1 in
        --test|-t)
                echo "you used the --test option"
                exit 0
        ;;
        --help|-h)
                echo "Usage:"
                echo "        myprog.sh [--test|--help|--version]"
                exit 0
        ;;
        --version|-v)
                echo "myprog.sh version 0.0.1"
                exit 0
        ;;
        -*)
                echo "No such option $1"
                echo "Usage:"
                echo "        myprog.sh [--test|--help|--version]"
                exit 1
        ;;
esac
 
echo "You typed \"$1\" on the command-line"

Above you can see that we are trying to process the first argument to a program. It can be one of several options, so using if statements will result in a long program. The case statement allows us to specify several possible statement blocks depending on the value of a variable. Note how each statement block is separated by ;;. The strings before the ) are glob expression matches. The first successful match causes that block to be executed. The | symbol enables us to enter several possible glob expressions.
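Because the strings before the ) are glob expressions, case is also handy for sorting names by their endings. The following sketch classifies three made-up file names; a final * pattern catches anything that the earlier patterns did not match. RESULT and KIND are names invented for the example.

```shell
# Classify each name by its extension using glob patterns.
RESULT=""
for f in notes.txt photo.jpg README ; do
        case $f in
                *.txt|*.text)
                        KIND="text"
                ;;
                *.jpg|*.png)
                        KIND="image"
                ;;
                *)
                        KIND="unknown"
                ;;
        esac
        RESULT="$RESULT $f=$KIND"
done
echo "$RESULT"
```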

7.7 Using Functions: the function Keyword

So far, our programs execute mostly from top to bottom. Often, code needs to be repeated, but it is considered bad programming practice to repeat groups of statements that have the same functionality. Function definitions provide a way to group statement blocks into one. A function groups a list of commands and assigns it a name. For example:

 
 
 
 
#!/bin/sh
 
function usage ()
{
        echo "Usage:"
        echo "        myprog.sh [--test|--help|--version]"
}
 
case $1 in
        --test|-t)
                echo "you used the --test option"
                exit 0
        ;;
        --help|-h)
                usage
                exit 0
        ;;
        --version|-v)
                echo "myprog.sh version 0.0.2"
                exit 0
        ;;
        -*)
                echo "Error: no such option $1"
                usage
                exit 1
        ;;
esac
 
echo "You typed \"$1\" on the command-line"

Wherever the usage keyword appears, the two lines inside the { and } are effectively substituted for it. There are obvious advantages to this approach: if you would like to change the program usage description, you only need to change it in one place in the code. Good programs use functions so liberally that they never have more than 50 lines of program code in a row.
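Functions can also take arguments of their own: inside the function body, $1, $2, and so on refer to the words that follow the function name where it is called. The sketch below uses a hypothetical greet function, written in the name () form without the function keyword, which is the form the POSIX standard defines and which bash accepts as well.

```shell
# A function that uses its own arguments.
greet ()
{
        echo "Hello $1, welcome to $2"
}

greet Paul UNIX
greet Mary "shell scripting"
```

The second call quotes its last argument so that the two words reach the function as a single $2.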

7.8 Properly Processing Command-Line Arguments: the shift Keyword

Most programs we have seen can take many command-line arguments, sometimes in any order. Here is how we can make our own shell scripts with this functionality. The command-line arguments can be reached with $1, $2, etc. The script,

 
 
 
#!/bin/sh
 
echo "The first argument is: $1, second argument is: $2, third argument is: $3"

can be run with

 
./myfile.sh dogs cats birds

and prints

 
The first argument is: dogs, second argument is: cats, third argument is: birds

Now we need to loop through each argument and decide what to do with it. A script like

 
 
 
for i in $1 $2 $3 $4 ; do
        <statements>
done

doesn't give us much flexibility. The shift keyword is meant to make things easier. It shifts all the arguments up by one place so that $1 gets the value of $2, $2 gets the value of $3, and so on. (!= tests that "$1" is not equal to "", that is, that it is not empty; an empty "$1" means we are past the last argument.) Try

 
 
 
 
while test "$1" != "" ; do
        echo $1
        shift
done

and run the program with lots of arguments.
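To watch the loop work without typing arguments on the command-line, the sketch below first assigns them with set --, the standard shell way to set $1, $2, and so on from inside a script. That line, and the variable SEEN, are assumptions made only for this illustration.

```shell
# Pretend the script was run with three arguments.
set -- dogs cats birds

SEEN=""
while test "$1" != "" ; do
        SEEN="$SEEN $1"
        shift
done
echo "arguments seen:$SEEN"
```

Once the loop finishes, every argument has been shifted away and "$1" is empty.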

Now we can put any sort of condition statements within the loop to process the arguments in turn:

 
 
 
 
#!/bin/sh
 
function usage ()
{
       echo "Usage:"
       echo "        myprog.sh [--test|--help|--version] [--echo <text>]"
}
 
while test "$1" != "" ; do
        case $1 in
                --echo|-e)
                        echo "$2"
                        shift
                ;;
                --test|-t)
                        echo "you used the --test option"
                ;;
                --help|-h)
                        usage
                        exit 0
                ;;
                --version|-v)
                        echo "myprog.sh version 0.0.3"
                        exit 0
                ;;
                -*)
                        echo "Error: no such option $1"
                        usage
                        exit 1
                ;;
        esac
        shift
done

myprog.sh can now run with multiple arguments on the command-line.

7.9 More on Command-Line Arguments: $@ and $0

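Whereas $1, $2, $3, etc. expand to the individual arguments passed to the program, $@ expands to all arguments. Written inside double quotes, as "$@", it keeps each argument as a separate word even if the argument contains whitespace. This behavior is useful for passing all remaining arguments on to a second command. For instance,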

 
 
 
 
if test "$1" = "--special" ; then
        shift
        myprog2.sh "$@"
fi

$0 means the name of the program itself and not any command-line argument. It is the command used to invoke the current program. In the above cases, it is ./myprog.sh. Note that $0 is immune to shift operations.
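The effect of "$@" can be sketched with a function standing in for the second program. Here show_args is a hypothetical stand-in for myprog2.sh, and set -- is used only to supply arguments without a command-line.

```shell
# Stand-in for the second program: print each argument on its own line.
show_args ()
{
        for a in "$@" ; do
                echo "arg: $a"
        done
}

# Pretend the script was run as: ./myprog.sh --special "two words" three
set -- --special "two words" three

if test "$1" = "--special" ; then
        shift
        OUT=`show_args "$@"`
fi
echo "$OUT"
```

The argument containing a space arrives as one word, and the --special option has been removed by the shift.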

7.10 Single Forward Quote Notation

Single forward quotes ' protect the enclosed text from the shell. In other words, you can place any odd characters inside forward quotes, and the shell will treat them literally and reproduce your text exactly. For instance, you may want to echo an actual $ to the screen to produce an output like costs $1000. Use echo 'costs $1000' rather than echo "costs $1000": inside double quotes the shell would replace $1 with the first command-line argument.
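The difference can be seen in the following sketch. It uses set -- to pretend that the script was run with the single argument apple; that line and the variable names are assumptions made for the illustration.

```shell
# Pretend the script was run with one argument, "apple".
set -- apple

# Single quotes: the text is kept exactly as typed.
SINGLE='costs $1000'

# Double quotes: $1 is replaced by the first argument, leaving "000".
DOUBLE="costs $1000"

echo "$SINGLE"
echo "$DOUBLE"
```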

7.11 Double-Quote Notation

Double quotes  "  are less strict than single quotes. They still allow the shell to substitute variables and backward-quoted commands inside them, although wildcards are not expanded. They are used mainly to group text containing whitespace into a single word, because the shell will usually break up text along whitespace boundaries. Try,

 
 
 
for i in "henry john mary sue" ; do
    echo "$i     is     a      person"
done

compared to

 
 
 
for i in henry john mary sue ; do
    echo $i     is     a      person
done
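Counting the repetitions makes the difference between the two loops explicit. In this sketch the counters QUOTED and UNQUOTED are made-up names, and the arithmetic uses the POSIX $(( )) notation.

```shell
# With double quotes the four names form one word: one repetition.
QUOTED=0
for i in "henry john mary sue" ; do
        QUOTED=$((QUOTED+1))
done

# Without quotes the shell splits the text into four words.
UNQUOTED=0
for i in henry john mary sue ; do
        UNQUOTED=$((UNQUOTED+1))
done

echo "quoted: $QUOTED, unquoted: $UNQUOTED"
```

The same splitting explains why the unquoted echo prints its words separated by single spaces, while the quoted echo preserves the extra spacing.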

7.12 Backward-Quote Substitution

Backward quotes  `  have a special meaning to the shell. When a command is inside backward quotes it means that the command should be run and its output substituted in place of the backquotes. Take, for example, the cat command. Create a small file, to_be_catted, with only the text daisy inside it. Create a shell script

 
 
X=`cat to_be_catted`
echo $X

The value of X is set to the output of the cat command, which in this case is the word daisy. This is a powerful tool. Consider the expr command:

 
 
X=`expr 100 + 50 '*' 3`
echo $X

Hence we can use expr and backquotes to do mathematics inside our shell script. Here is a function to calculate factorials. Note how we enclose the * in forward quotes. They prevent the shell from expanding the * into matching file names:

 
 
 
 
function factorial ()
{
    N=$1
    A=1
    while test $N -gt 0 ; do
        A=`expr $A '*' $N`
        N=`expr $N - 1`
    done
    echo $A
}

We can see that the square brackets used earlier can actually suffice in most cases where we would otherwise use expr. (However, the $[ ] notation is an extension of the GNU shells and is not a standard feature on all variants of UNIX.) We can now run factorial 20 and see the output. If we want to assign the output to a variable, we can do this with X=`factorial 20`.
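For comparison, here is a sketch of the same function with shell arithmetic in place of expr. It is written with the POSIX $(( )) notation and the name () form of function definition so that it is not tied to the GNU shells; these substitutions are choices made for this illustration, not the form used in the rest of this chapter.

```shell
# Factorial using arithmetic expansion instead of expr.
factorial ()
{
        N=$1
        A=1
        while test "$N" -gt 0 ; do
                A=$((A * N))
                N=$((N - 1))
        done
        echo "$A"
}

X=`factorial 5`
echo "5! = $X"
```

No quoting of the * is needed here, because the shell does not expand file names inside the arithmetic brackets.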

Note that another notation which gives the effect of a backward quote is $(command), which is equivalent to `command`. Here, I will always use the older backward quote style.
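One practical difference between the two styles shows up when one substitution is placed inside another: $( ) nests as it is, whereas inner backward quotes must be escaped with backslashes. The sketch below uses the standard basename and dirname commands on a made-up path; nothing needs to exist at that path, because both commands only manipulate the text.

```shell
# Name of the directory that contains a file, using nested $( ).
P=/usr/doc/rute/notes.txt
A=$(basename $(dirname $P))

# The same with backward quotes: the inner pair must be escaped.
B=`basename \`dirname $P\``

echo "$A $B"
```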

