October 1999: [aclug-L] shell programming
FYI, attached are the notes on shell programming that I had prepared for last night's ACLUG meeting. These notes are limited to bash itself, and are not complete -- although they do cover more than they omit.

It is important to note that most Linux commands are designed to work in shell programs. Moreover, what distinguishes shell programming languages from more generic languages (like C) is the relative ease with which they can orchestrate other programs.

--
/*
 * Tom Hull -- mailto:thull@xxxxxxxxxxx or thull@xxxxxxxxxx
 * http://www.ocston.org/~thull/
 */

-- Attached file included as plaintext by Listar --
-- File: bash.txt

These notes cover most of the features of bash as they may be used in shell programming. They do not cover interactive features (such as command line editing, history, aliases, job control), and do not cover setup features (e.g., .profile) or the quirks of login shells.

Bash programming is a superset of traditional Unix Bourne shell programming. Most (not all) of the extensions are compatible with the more recent Korn shell (available as pdksh).

Commands:

-- Normally each line typed into bash is a command.
-- A command can be extended over more than one line by ending all but the last line with backslash: \
-- Bash has a number of special characters:
   -- Special characters cause the shell to do something special: ( ) [ ] { } & | > < ~ * ? $ ` # ' " \
   -- The backslash \ character quotes the next character. If you actually want a backslash, type two \\.
   -- A pair of single quote ' characters quotes everything except ' itself (even \ is literal inside single quotes).
   -- A pair of double quote " characters quotes everything but $ ` " \.
-- Multiple commands can be written on the same line if separated by ; or & (a command followed by & runs in the background).
-- Command lines are split into tokens. Normally, the split occurs on space, but space may be included in a token by quoting.
-- Each command has three standard files: stdin, stdout, stderr.
   These are normally attached to the user's tty, but can be redirected:
   -- > FILE writes stdout to FILE.
   -- < FILE reads stdin from FILE.
   -- 2> FILE writes stderr to FILE.
   -- &> FILE writes stdout and stderr to FILE. This can also be written as: > FILE 2>&1
   -- >> FILE appends stdout to FILE.
-- You can provide canned stdin input (called a "here file") by using:
      COMMAND << LABEL
      ...
      LABEL
   -- LABEL is any arbitrary string of characters. The two occurrences have to match exactly (space between << and the first LABEL is ignored), and the closing LABEL must be on a line by itself.
   -- The input between the LABELs is effectively double-quoted, which means that $ and ` ` expand.
-- Multiple commands can be strung together with | indicating that stdout of the previous command is redirected to stdin of the following command. Such commands are called a "pipeline".
-- Some special characters cause tokens to be expanded; i.e., replaced by one or more tokens. These include:
   -- Pathname expansion is done with the special characters * ? [ ] ~, which specify patterns for matching file names:
      -- * matches any string of zero or more characters.
      -- ? matches any one character.
      -- [ ] matches any one character listed inside the brackets, which may include ranges like [0-9].
      -- [! ] matches any one character which is not listed inside the brackets. [^ ] does the same thing.
      -- ~ as the first character expands to the home directory of a following user name; if no name, the user's home directory; ~+ and ~- are $PWD and $OLDPWD, respectively.
   -- A command enclosed in ` ` is substituted with the stdout of the command.
   -- $ indicates a variety of expansions/substitutions, which will be detailed further below.
-- The first token in a command is either:
   -- A bash keyword or built-in command.
   -- A program.
-- The tokens in a command are passed as arguments to the command.
-- Each command has an environment, which is a list of NAME=VALUE strings. The default environment for a command is the shell's own exported environment.
   However, any NAME=VALUE tokens at the start of a command are put into the command's environment, and the actual command is the first non NAME=VALUE token.
-- If the first letter in a token is #, everything from # to the end of line is treated as a comment.
-- Empty lines are ignored; i.e., they are not treated as commands that simply do not do anything. Use : for a no-op command.
-- Anywhere you can use a single command, you can use a list of commands surrounded by { }. Such a list can, for example, write to a pipe.
-- You can also use ( ) to group a list of commands. This differs from { } in that the grouped command list is executed in a sub-shell (a child shell process), so the commands do not affect the parent shell. E.g.:
   -- tar cf - . | (cd /some/other/dir && tar xf -)

Variables:

-- Bash maintains two sets of variables:
   -- Exported variables, which are passed to commands as part of the command's environment.
   -- Non-exported variables.
-- Bash starts up with the variables defined in its own environment. Such variables are automatically exported.
-- You can define new variables: NAME=VALUE:
   -- NAME is an identifier: the first character is a letter or underscore _, and following characters are letters, underscores, or digits.
   -- VALUE is the rest of the token, evaluated after expansion and substitution.
-- Your variables are not automatically exported. Use the export command to export them. You may write either:
   -- NAME=VALUE; export NAME
   -- export NAME=VALUE
-- The declare or typeset command may be used to bind attributes to a variable. E.g.:
   -- declare -r NAME=VALUE; prevents NAME from being reset later.
   -- declare -i NAME=VALUE; causes NAME to be treated as an integer.
-- To use the value of a variable, write $NAME or ${NAME}. The latter is necessary if the character immediately after the end of $NAME is legal for variable names. The braces form is also used for:
   -- ${NAME:-WORD}; returns WORD if NAME is undefined or null, but does not change NAME; else returns $NAME.
   -- ${NAME:=WORD}; if NAME is undefined or null, assigns WORD to NAME and returns WORD; else returns $NAME.
   -- ${NAME:?MESSAGE}; if NAME is undefined or null, prints MESSAGE and terminates command or script.
   -- ${NAME:+WORD}; if NAME exists and isn't null, returns WORD; else returns null.
   -- ${NAME#PATTERN}; deletes shortest part from the beginning of $NAME that matches PATTERN, and returns the rest. PATTERN may include file name pattern characters * ? [ ].
   -- ${NAME##PATTERN}; deletes longest part from the beginning of $NAME that matches PATTERN, and returns the rest.
   -- ${NAME%PATTERN}; deletes shortest part from the end of $NAME that matches PATTERN, and returns the rest.
   -- ${NAME%%PATTERN}; deletes longest part from the end of $NAME that matches PATTERN, and returns the rest.
   -- ${#NAME}; gives the length of $NAME.
-- The built-in let command evaluates integer arithmetic expressions:
   -- All identifier arguments to let are treated as variables.
   -- Each argument must be a complete expression.
   -- It is usually best to quote each argument, since most of the operators are shell special characters.
   -- The operators, their precedence and associativity, are the same as in C.
   -- The return value of let is 0 (OK) if the last expression evaluated is non-zero; 1 (ERROR) if zero.
-- Use $((EXPR)) to evaluate and substitute integer arithmetic expressions anywhere in a command line.
-- To remove a variable definition: unset VARIABLE ...
-- Variables can be assigned interactively with the read command: read VARIABLE ...
   -- This reads a line from stdin, and splits it into as many variables as are named on the line. The last named variable holds the remainder of the line, regardless of whether it could be further split.
   -- The split is controlled by the IFS variable. IFS means "internal field separator", and is the set of characters (default: space, tab, newline) on which the shell splits words.
      You can change IFS to anything you want to control the split by read, but should remember to change it back again. E.g.:
         oldIFS="$IFS"
         IFS=:
         cat /etc/passwd | while read name passwd uid rest
         do
            ...
         done
         IFS="$oldIFS"

Conditional Execution:

-- Every command has an exit status: an integer between 0 and 255. By convention, exit status 0 is regarded as ok/success/true, and any non-0 value is regarded as error/failure/false. Values from 126 up have meanings reserved to the shell (e.g., signal-terminated commands are 128 + signal number).
-- The exit status of the last run command is available in the $? variable.
-- There are several keywords which allow you to execute commands conditionally: if, then, elif, else, fi:
   -- if starts a test command (up to the following then) and a conditional block (up to the following fi).
   -- then starts a block of commands that are executed if the preceding if test is true. The then block cannot be empty, although it can contain the no-op command : or built-in commands like variable assignments.
   -- else starts a block of commands that are executed if the preceding if test is false.
   -- fi ends the conditional block.
   -- elif starts a test command, which is executed only if all previous if and elif test commands are false.
   -- if/fi blocks can be nested. One advantage of using elif is that it appears at the same nesting level as the initial if, so does not require an extra fi as would be the case if you used else if.
   -- You can use ! between if or elif and the following test command(s). In this case the then-clause command(s) will be executed if the test command(s) are false.
-- The && and || operators can be used for simple conditional execution:
   -- CMD1 && CMD2; will execute CMD2 only if CMD1 is true.
   -- CMD1 || CMD2; will execute CMD2 only if CMD1 is false.
-- The built-in test command is commonly used for test expressions. It is more commonly written as [ TEST-EXPRESSION ].
   The test expressions include:
   -- File conditions:
      -- [ -b FILE ]; true if FILE exists and is block special.
      -- [ -c FILE ]; true if FILE exists and is character special.
      -- [ -d FILE ]; true if FILE exists and is a directory.
      -- [ -e FILE ]; true if FILE exists.
      -- [ -f FILE ]; true if FILE exists and is a regular file.
      -- [ -g FILE ]; true if FILE exists and setgid bit is set.
      -- [ -k FILE ]; true if FILE exists and sticky bit is set.
      -- [ -p FILE ]; true if FILE exists and is a named pipe.
      -- [ -r FILE ]; true if FILE exists and is readable.
      -- [ -s FILE ]; true if FILE exists and has size > 0.
      -- [ -t [N] ]; true if file descriptor N (default 1) is a terminal device.
      -- [ -u FILE ]; true if FILE exists and setuid bit is set.
      -- [ -w FILE ]; true if FILE exists and is writable.
      -- [ -x FILE ]; true if FILE exists and is executable (or searchable if directory).
      -- [ -G FILE ]; true if FILE exists and its group is same as the effective group ID.
      -- [ -L FILE ]; true if FILE exists and is a symbolic link.
      -- [ -O FILE ]; true if FILE exists and its owner is same as the effective user ID.
      -- [ -S FILE ]; true if FILE exists and is a socket.
      -- [ FILE1 -ef FILE2 ]; true if FILE1 and FILE2 are the same file (same device and inode numbers, e.g. hard links).
      -- [ FILE1 -nt FILE2 ]; true if FILE1 is newer than FILE2.
      -- [ FILE1 -ot FILE2 ]; true if FILE1 is older than FILE2.
   -- String conditions:
      -- [ -n S1 ]; true if S1 has length > 0.
      -- [ -z S1 ]; true if S1 has zero length.
      -- [ S1 = S2 ]; true if S1 and S2 are the same.
      -- [ S1 != S2 ]; true if S1 and S2 are not the same.
      -- [ S1 ]; true if S1 is not null.
   -- Integer comparisons:
      -- [ N1 -eq N2 ]; true if N1 equals N2.
      -- [ N1 -ge N2 ]; true if N1 is greater than or equal to N2.
      -- [ N1 -gt N2 ]; true if N1 is greater than N2.
      -- [ N1 -le N2 ]; true if N1 is less than or equal to N2.
      -- [ N1 -lt N2 ]; true if N1 is less than N2.
      -- [ N1 -ne N2 ]; true if N1 does not equal N2.
   -- Combined expressions:
      -- [ ! EXPR ]; true if EXPR is false.
      -- [ EXPR -a EXPR ]; true if both EXPRs are true.
      -- [ EXPR -o EXPR ]; true if either EXPR is true.
      -- Use parentheses to force evaluation order. Parentheses have to be escaped: \( ... \).
-- There is a case statement which looks like:
      case STRING in
      PATTERN )
         COMMANDS
         ;;
      ...
      PATTERN )
         COMMANDS
      esac
   -- This compares the STRING argument to one or more PATTERNs. On the first match, bash runs the corresponding COMMANDS up to ;;, then skips any remaining sections up to the esac.
   -- The PATTERNs may use file name wild card characters.
   -- *) matches all STRINGs, so can be used for a default case.

Iteration:

-- Bash supports a top-test conditional loop construct:
      while TEST-COMMANDS
      do
         COMMANDS
      done
   The while keyword executes the test command(s) and, if true, executes the commands bracketed by do ... done. This construct repeats until the test command(s) evaluate to false.
   -- Instead of while, you can use until, which breaks when the test command(s) evaluate to true.
   -- Bash does not provide a bottom-test loop.
-- Bash supports an iterator:
      for NAME in ...
      do
         COMMANDS
      done
   For each argument following in, the iterator sets a variable NAME to the argument, then executes COMMANDS.
   -- The in clause may be omitted, in which case the default argument list is "$@" (the arguments to the enclosing program or function).
-- Loops and iterators may be nested.
-- Use break to stop execution within a command list and break out of its enclosing loop. You may specify a number of loops to break out of.
-- Use continue to stop execution within a command list and start the next loop/iteration. You may specify a number of loops to jump to.

Shell Scripts, Source Files, Functions:

-- A shell script is a regular text file, which consists of shell commands.
-- The first line of a shell script should be:
      #! /bin/bash
   When Linux goes to load a program (cf. exec(2)), it looks at the first few characters of the program to find out what format the program is in (e.g., ELF, COFF).
   If the first two characters are #!, Linux looks at the next string in the line to specify an interpreter for the file. Linux then loads the interpreter, and passes the file to the interpreter as an argument.
-- The shell script should have its executable bits set. (If the executable bits are not set, you can still run the shell script as an argument to explicitly invoking bash, but you cannot run it under its own name.)
-- Command line arguments to a shell script are accessible through special variables:
   -- $0, $1, $2, ... are the positional parameters. $0 is generally the name of the program, while $1 and on are command line arguments.
   -- $# is set to the number of positional parameters, not including $0.
   -- $* is the list of all arguments from $1 on.
   -- $@ is the same as $*, except that when $@ appears in double quotes "$@" it causes each argument to be quoted as a separate string, while "$*" is only one string.
   -- The shift command causes all positional parameters to be shifted down one number, discarding the old $1 and reducing $# by one. You can specify an argument to shift to shift by an arbitrary amount.
-- Some shell scripts (including /etc/profile and the .profile in your own home directory) are meant to be processed inside an existing shell, instead of running as a sub-process of a shell. These are called "source" files. To read a source file, use either:
      . COMMAND
      source COMMAND
   -- Bash will search $PATH to find a source file; the file does not need to have its executable bit set.
   -- If arguments follow the file name, they become the positional parameters while the source file is read; otherwise the positional parameters are unchanged.
-- Use getopts to extract command line options within shell scripts. Typically, this looks like:
      while getopts OPTSTRING OPT
      do
         case $OPT in
         ...
         esac
      done
      shift $((OPTIND - 1))
   -- OPTSTRING contains option letters, optionally followed by colon :. The colon indicates that the option takes an argument.
   -- getopts extracts the next option flag from the positional parameters, and returns it in $OPT.
      getopts will fail when there are no further valid option flags, at which point the while loop breaks.
   -- The case construct should have patterns for each option specified in OPTSTRING. If an option takes an argument, pick up the argument from $OPTARG.
   -- getopts updates $OPTIND to index the next positional parameter to look at. This can be used after the end of getopts processing to shift the optional arguments out of the positional parameters.
   -- Normally getopts will complain when it receives an option which is not specified in OPTSTRING. This can be suppressed by providing : as the first character in OPTSTRING. In any case, getopts returns ? for all unmatched options, so the case code can provide error reporting.
-- Bash supports functions:
      function NAME
      {
         COMMANDS
      }
   A function works like a command or shell script, except that it is directly executed within the parent bash process, so can refer to unexported variables and can change process attributes, like the current working directory.
   -- The arguments passed to a function set the positional parameters for the scope of the function. The function body refers to its positional parameters, $#, $*, $@, and may shift them.
   -- A return within a function body halts execution and returns a specified exit status (default: the exit status of the last command executed in the function). The caller may then refer to this exit status as $?.
   -- Variables can be defined as local to a function: local NAME=VALUE ...
   -- An alternate format for defining a function is: NAME () { ... }
-- A shell script may arrange to trap signals:
      trap "COMMANDS" SIGNALS
   -- SIGNALS are signal numbers, or 0 to indicate any exit from the shell. Trapping 0 is often used for cleanup code.
   -- COMMANDS are typically quoted, since they must appear as a single non-numeric argument.
   -- The command "" causes the signal to be ignored.
   -- COMMANDS may be omitted, which causes SIGNALS to be handled using their normal defaults.
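As a rough illustration of the getopts, function, and trap notes above, here is a minimal sketch. The names parse_opts and with_scratch_file, and the options -v and -o, are made up for this example and are not part of the original notes. It uses the NAME () form of function definition.

```shell
# Sketch only: parse_opts, with_scratch_file, -v and -o are hypothetical.

parse_opts () {
    local verbose=no outfile=
    local opt
    OPTIND=1                      # restart option scanning on every call
    # Leading ':' silences getopts' own messages; 'o:' means -o takes an argument.
    while getopts ":vo:" opt
    do
        case $opt in
        v)  verbose=yes ;;
        o)  outfile=$OPTARG ;;
        \?) echo "unknown option: -$OPTARG" >&2
            return 2 ;;
        :)  echo "option -$OPTARG needs an argument" >&2
            return 2 ;;
        esac
    done
    shift $((OPTIND - 1))         # drop the options, keep the remaining arguments
    echo "verbose=$verbose outfile=$outfile operands=$#"
}

with_scratch_file () {
    (                             # sub-shell, so the trap does not affect the caller
        scratch=$1
        trap 'rm -f "$scratch"' 0 # 0: run on any exit from this sub-shell
        echo "scratch data" > "$scratch"
        read -r line < "$scratch"
        echo "read back: $line"
    )
}
```

For example, parse_opts -v -o out.txt a b prints verbose=yes outfile=out.txt operands=2, while parse_opts -x reports the unknown option on stderr and returns status 2. with_scratch_file removes the file named by its argument when its sub-shell exits.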
-- To exit from a shell script, use: exit [STATUS]
-- To load another program into the shell's process: exec COMMAND [...]

Debugging:

-- Some shell options are useful for debugging shell scripts:
   -- bash -n SCRIPT; checks for syntax errors, but does not execute commands. Within a shell script, this can be turned on by: set -n; or set -o noexec.
   -- bash -v SCRIPT; echoes commands before running them. Also: set -v; or set -o verbose.
   -- bash -x SCRIPT; echoes commands after processing the command line. Also: set -x; or set -o xtrace.
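
The debugging options above can be tried out with a small sketch like the following. check_syntax and trace_demo are hypothetical helper names, not from the notes, and the sketch assumes bash is on $PATH.

```shell
# Sketch only: check_syntax and trace_demo are hypothetical helpers.

check_syntax () {
    # bash -n parses the script without running it; the exit status
    # says whether the parse succeeded.
    if bash -n "$1" 2>/dev/null
    then
        echo "syntax ok"
    else
        echo "syntax error"
    fi
}

trace_demo () {
    (                             # sub-shell, so tracing stops when it exits
        set -x                    # same effect as: set -o xtrace
        total=$((2 + 3))
        echo "total=$total"
    )
}
```

trace_demo writes total=5 to stdout, while the trace lines produced by set -x go to stderr, so the two can be separated with 2> FILE.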