Advanced Bash Scripting
Taking your sh-fu to the next level
About the Instructor
- Nathan Isburgh
- Unix user 15+ years, teaching it 10+ years
- Unix Administration and Software Development Consultant
- RHCE on RHEL 5 & 6
- All around über-geek
- Goofy, forgetful ( remember that )
About the Course
- 2 days, lecture/lab format
Hours: 8:30 - 5:00
Lunch: 11:45 – 1:00
- Breaks about every hour
Throw something soft at me if I get too long in the tooth
- Telephone policy
Take it outside, please
- Restrooms
Across from central stairs
- Refreshments
Downstairs in break room, mini-fridge in classroom, machines by stairs
About the Students
- Name?
- Time served, I mean employed, at Rackspace?
- Department?
- General Unix skill level? What about Linux?
- And familiarity with Bash?
- How do you use Linux in your position?
- What are you hoping to take away from this class?
Expectations of Students
- Strong foundation in basic Linux use and administration
- Strong understanding of working in the shell
- Comfortable with topics from Intro to Bash Scripting
- Ask Questions!
- Complete the labs
- Email if you’re going to be late/miss class
- Have fun
- Learn something
Overview
- So you’re getting serious about scripting? You want the advanced stuff? That’s what you’re here for, right?
- Well, before we go too much further, we need to lay down some laws:
- Style guidelines
- Scripting best practices
- I know, I know – you want to play with fire NOW!
- But first, we need to learn some skills and practices that will make your scripts more readable, more maintainable and less buggy
Commenting
- Remember from the introduction class:
- Commenting falls under the larger topic of coding style
- Note that style is an individual attribute, developed over time as a software developer
- It is also often loosely or strictly specified by the organization
- To simplify this discussion, let us recall the Golden Rules of Commenting…
The Golden Rules of Commenting
- Always comment code which is not obvious to a non-author reader
- You should not comment “i=i+1”
- You should comment “rsync -vazpc $WHAT $WHERE”
- Always comment functions: their purpose, use, arguments, expectations and results
- Always comment the overall program’s purpose and behavior at the top of the file.
Include dates and authors ( maybe an abbreviated revision history? )
- Always comment when not sure if you should - They don’t cost anything!
Proper Script Structure
Scripts should generally be laid out as:
#!Shebang!
#
# Script comment block ( purpose, arguments, rev history, etc )
#
# Config variables with comments
CONFIG_VAR1="user can tweak this"
# END OF CONFIGURATION - NO EDITS BELOW THIS LINE
# Function definitions
fail() { echo boohoo >&2 ; exit 1 ; }
# Main code block
if [ $# -lt 2 ] ; then fail ; fi
...
Always Initialize Variables
- You should always initialize your variables
- It looks cleaner, and for complex scripts, a short comment can be left indicating the purpose of the variable
- Security! If variables aren’t initialized, an educated user can easily pre-initialize a variable from the command line and cause all sorts of problems, some maybe nefarious!
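As a sketch ( the variable names are purely illustrative ), the top of a script might initialize its variables like this:

```shell
#!/bin/bash
# Every variable gets an initial value and a short purpose comment.
LOGFILE="/tmp/myscript.log"   # where run output is collected
RETRIES=3                     # how many download re-attempts
VERBOSE=0                     # flipped to 1 by a -v flag
ERRORS=0                      # running count of failures

# Without the ERRORS=0 above, a user could run
#   ERRORS=100 ./myscript.sh
# and pre-load the variable from the environment, skewing any
# logic that later tests $ERRORS.
echo "retries=$RETRIES errors=$ERRORS"
```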
Indentation
- Ah yes, good old indentation
Many a bloody nerd war has erupted over disagreements on indentation styles
- To avoid this same fate, let us agree on one simple rule:
Pick an indentation style, and stick to it 100% of the time
- The possibilities are endless:
- Tabs, two spaces, four spaces? Suggest: 2 spaces
- Indent all the blocks, only the multiline blocks, or? Suggest: all
- Reserved words: same line, different lines, indented? Suggest: different lines, indent the blocks only
- Etc, etc, etc
Check Those Arguments
- Users rarely do anything right – train yourself to expect that at all times, and you’ll write better code.
- Case in point: Arguments
- Check for the expected number of arguments
- Check for the expected types of data: numbers, strings, flags
- Check argument values if appropriate, e.g. if it is supposed to be a pathname, check that it is valid and exists
- On very large or complex scripts with many arguments, it might be prudent to consider an argument parsing library like getopt ( external program, some inconsistencies ) or getopts ( shell builtin, consistent but no long arguments )
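A minimal getopts sketch ( the -t/-d flags and the simulated command line are illustrative ):

```shell
#!/bin/bash
# Parse optional -t and -d flags plus one required email argument,
# as in a synopsis like: script [-td] email
set -- -t -d user@example.com   # simulated command line for the demo

SEND_TOP=0   # -t requested?
SEND_DF=0    # -d requested?

while getopts "td" opt ; do
  case $opt in
    t) SEND_TOP=1 ;;
    d) SEND_DF=1 ;;
    ?) echo "usage: $0 [-td] email" >&2 ; exit 2 ;;
  esac
done
shift $(( OPTIND - 1 ))   # drop the parsed flags

# Exactly one argument ( the email ) must remain
if [ $# -ne 1 ] ; then
  echo "usage: $0 [-td] email" >&2
  exit 2
fi
EMAIL=$1
echo "top=$SEND_TOP df=$SEND_DF email=$EMAIL"
```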
Check Commands and Versions
- If a script uses tools that are even remotely uncommon, it should check for their existence early on and error out if anything is missing
- Along the same lines, if there are any feature expectations, or important bug fixes tied to a version of a tool, library or even the shell itself, those version details should be verified early on
- Note that this requires a judgment call – there is no need to check version information on every piece of software touched – just the ones that could be off. For example:
- If a script relies on associative arrays, it should check that the bash interpreter is at least version 4 ( EL5 ships with v3! )
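A sketch of both checks ( grep stands in for whatever uncommon tool the script actually needs ):

```shell
#!/bin/bash
# Fail fast if the interpreter or a required tool is missing/too old.
# BASH_VERSINFO[0] holds bash's major version as a plain number:
if [ "${BASH_VERSINFO[0]}" -lt 4 ] ; then
  echo "$0: bash 4 or newer required, found $BASH_VERSION" >&2
  exit 1
fi

# command -v is the portable way to test that a tool is on the PATH:
if ! command -v grep > /dev/null 2>&1 ; then
  echo "$0: required command 'grep' not found" >&2
  exit 1
fi
echo "environment ok"
```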
Assign Exit Codes
- Exit codes can be extremely useful to the users of your script
- At the very least, always exit 0 for success and non-zero for failure
- Best case scenario: assign exit codes to different conditions, e.g.
- invalid arguments
- insufficient permissions
- missing required software
- httpd not running
- unknown error
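A sketch of assigning those conditions to codes ( the specific numbers are illustrative ):

```shell
#!/bin/bash
# Name the exit codes once near the top of the script.
# Stay within 1-255 -- the shell truncates anything else.
E_BADARGS=2       # invalid arguments
E_NOPERM=3        # insufficient permissions
E_MISSING=4       # missing required software
E_NOTRUNNING=5    # httpd not running
E_UNKNOWN=99      # unknown error

set -- alice@example.com   # simulated argument for the demo

if [ $# -lt 1 ] ; then
  echo "usage: $0 email" >&2
  exit $E_BADARGS
fi
echo "report will go to $1"
```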
Write Common Functions
- Write some common, useful functions, such as:
- fail(code, msg) – Prints message to stderr and exits with the given code
- succeed() – Maybe print a happy message, then exit 0
- cleanup() – For complex scripts, clean up things like logs, locks, etc. Usually called from fail() and succeed()
- debug(msg) – Prints a debug message to stderr. Bonus: use a config variable and/or command line flag to control behavior
- usage() – Prints a detailed usage message when there is a mistake in the arguments, or -h/-? is passed
- Perhaps a good case for a library…
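A minimal sketch of these functions ( the lock file path and message formats are illustrative ):

```shell
#!/bin/bash
DEBUG=1   # config variable gating debug() output

debug() {    # debug(msg): print msg to stderr when DEBUG is on
  [ "$DEBUG" -eq 1 ] && echo "DEBUG: $*" >&2
}

cleanup() {  # remove temp files, locks, etc.
  rm -f "/tmp/myscript.$$.lock"
}

fail() {     # fail(code, msg): msg to stderr, then exit with code
  local code=$1 ; shift
  echo "$0: $*" >&2
  cleanup
  exit "$code"
}

succeed() {  # happy exit
  cleanup
  exit 0
}

debug "functions loaded"
```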
Speaking of stderr
- USE IT! Correctly!
- Recall:
- stdout – Normal command output/results
- stderr – Warnings, errors, fails of any kind
- Quick and easy ways to output to stderr:
printf blah > /dev/stderr
printf blah >&2
- This is one of the benefits of writing those common functions!
Command Substitution
- Recall the awesomely powerful backtick, `
- It runs the command in backticks, takes its stdout and substitutes it, minus any trailing newlines, onto the calling command line
- echo `whoami`
- becomes
- echo student
- Very useful in many situations, and it is backwards compatible with some older shells
- But…
Command Substitution
- Try to avoid the backtick for command substitution
- It is not POSIX compliant
- It does not nest properly
- Quotes can be a serious pain
- Instead, use the $() syntax:
- Same behavior, but:
- POSIX compliant
- Nests
- Handles quotes much more simply
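A quick demonstration of the nesting and quoting advantages:

```shell
#!/bin/bash
# $() nests with no escaping; the backtick equivalent would need
# backslashes, e.g.  HERE=`basename \`pwd\``
HERE=$(basename "$(pwd)")
echo "current directory name: $HERE"

# Quoting inside $() behaves like anywhere else:
WORDS=$(echo "three little words" | wc -w)
echo "word count: $WORDS"
```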
Lab
- Put together a properly styled skeleton for a shell script, called skel.sh
- This should include:
- All of the components discussed in lecture, and placeholders for the pieces which are not known yet ( like config variables )
- The various common functions
- Come up with at least five common script failures, and assign them default exit codes ( example: ‘invalid arguments’ assigned 2; remember exit codes must be in the range 0-255 )
- Copy skel.sh to health-report.sh, with synopsis:
./health-report.sh [-td] email
- -t will email one output iteration from top to the email address
- -d will email the output of ‘df -h’ to the email address
- email is the email address for the recipient of the report
Special Variables
- Recall that the shell has many special variables with useful information and settings
- Positional parameters ( arguments )
- Exit status of previous command
- Bash information
- Feature control variables ( IFS, OPT*, DIRSTACK, etc )
- During future labs, be sure to peruse the bash man page sections on:
- Special Parameters – @, #, ?, $, -
- Shell Variables – LINENO, SECONDS, PIPESTATUS
- Parameter Substitution – ${#PATH}, ${INPUT:5:10}
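A quick tour of several of these ( the positional parameters are simulated with set -- ):

```shell
#!/bin/bash
set -- one two three        # simulated positional parameters

echo "argument count: $#"   # 3
echo "all arguments: $@"    # one two three

false ; echo "last status: $?"   # 1

# Parameter substitution examples:
INPUT="abcdefghijklmnop"
echo "${#PATH}"        # length of $PATH in characters
echo "${INPUT:5:10}"   # 10 characters starting at offset 5: fghijklmno

# PIPESTATUS holds the exit code of every stage of the last pipeline:
false | true
echo "pipeline statuses: ${PIPESTATUS[@]}"   # 1 0
```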
Arrays
- In addition to simple variables containing just strings and numbers, bash also supports array variables
- An array is just a collection of values, all stored within one variable, logically:
- TEST -> val1,val2,val3,val4,val5
- Traditionally, the different values in the array are referenced using numbers, called indexes, starting at zero:
- TEST[0] -> val1
- TEST[1] -> val2
- …
- This is known as an Indexed Array
Indexed Array Example
# To create the array, just start assigning values:
MYDIRS[0]="/"
MYDIRS[1]="/home"
MYDIRS[2]="/usr"
echo $MYDIRS
# will just show "/" since that is the first member
echo ${MYDIRS[1]}
# will show "/home"
# Note that you must use the braced expansion syntax, due to
# overloading of the square bracket characters ( pathname wildcard )
echo ${#MYDIRS[*]}
# shows 3, since there are three values in the array
Associative Arrays
- As of bash version 4, Associative Arrays are available
- An associative array uses strings to get at values, as opposed to numbers
- Associative arrays have to be created specially, using the declare builtin
- declare -A MYDICTIONARY
MYDICTIONARY[apple]=fruit
MYDICTIONARY[carrot]=vegetable
MYDICTIONARY[linux]="Awesome operating system"
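Values are read back with the same braced syntax as indexed arrays; a short sketch continuing the example:

```shell
#!/bin/bash
# Requires bash 4+ for the associative array support
declare -A MYDICTIONARY
MYDICTIONARY[apple]=fruit
MYDICTIONARY[carrot]=vegetable
MYDICTIONARY[linux]="Awesome operating system"

echo "${MYDICTIONARY[apple]}"   # fruit
echo "${#MYDICTIONARY[*]}"      # 3 entries

# ${!NAME[@]} expands to the keys; iteration order is not guaranteed
for key in "${!MYDICTIONARY[@]}" ; do
  echo "$key -> ${MYDICTIONARY[$key]}"
done
```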
Lab
- Copy skel.sh to proc-count.sh and implement as:
proc-count.sh [-f filter]... [-c] email
- This script will count processes with command names that match one or more filters, emailing one of two possible reports, either a TSV ( the default ):
processname count
- Or a CSV ( selected with the -c flag ):
processname,count
- If no filter is given, all processes should be reported
- Use arrays to track filters and results
Overview
- An expansion occurs when the shell acts on metacharacters in a command to automatically expand their contents based on rules, generally so the user does not have to type as much ( wildcards ), can reference variables and more
- There are seven different kinds of expansions in bash:
- Brace expansion, tilde expansion, parameter/variable expansion, command substitution, arithmetic expansion, word splitting, and pathname expansion
- On operating systems that support named pipes ( like Linux! ), there is one additional form, known as process substitution
Brace Expansion
- Brace expansion allows for the automatic creation of arbitrary strings
- Consider:
- $ echo a{1..5}b
a1b a2b a3b a4b a5b
- $ echo a{f,h,g}b
afb ahb agb
- As seen in the examples, you can expand ranges of numbers or letters, as well as comma separated lists of values
Tilde Expansion
- You should already be familiar with tilde expansion, which evaluates to user home directories:
- $ echo ~
/home/student
- $ echo ~alice
/home/alice
- What you might not know is that tilde can be used to reference current directories ( ~+ ) and previous directories ( ~- ):
- $ cd /home ; cd / ; echo ~+ ; echo ~-
/
/home
- Started in /home, then moved to /. ~+ expanded to /, ~- expanded to /home
Parameter/Variable Expansion
- This topic was covered in depth during the intro bash scripting class
- Quick reminder: a variable can be referenced as $VAR or as ${VAR}
- The second form is more precise, and should generally be used anytime a variable reference is embedded within additional content, to protect it from misinterpretation
- Also note, the curly brace expansion syntax allows for extremely powerful capabilities, including arrays, searching, substrings, character counts, case manipulation and more
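A small sampler of those capabilities ( the filename is illustrative; case manipulation needs bash 4+ ):

```shell
#!/bin/bash
FILE="report-2024.csv"

echo "${#FILE}"          # character count: 15
echo "${FILE%.csv}"      # strip shortest trailing match: report-2024
echo "${FILE/2024/2025}" # search and replace: report-2025.csv
echo "${FILE^^}"         # upper-case ( bash 4+ ): REPORT-2024.CSV
echo "${FILE:0:6}"       # substring: report
```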
Command Substitution
- Command substitution is incredibly useful, as it instructs the shell to run a given command in a new shell, and capture its output in some particular manner
- Recall the backtick and $() from an earlier lecture:
echo `whoami`
echo $(whoami)
- whoami will be run from a new shell, and its standard output, minus any trailing newlines, will be substituted into the backticked/parenthesized section of the command line, which is then executed from the main shell as: echo student
Arithmetic Expansion
- Sometimes, it’s incredibly useful to have the shell perform some simple math, and it’s also incredibly easy to use:
- Bash has a slew of operations available, including add/subtract/multiply/divide, exponentiation, bitwise operations including shifts, negations and logical operations, increments, decrements and more
- See the manpage under Arithmetic Evaluation
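For example:

```shell
#!/bin/bash
# $(( )) substitutes the result of integer arithmetic:
COUNT=7
echo $(( COUNT + 3 ))     # 10
echo $(( 2 ** 10 ))       # exponentiation: 1024
echo $(( COUNT % 2 ))     # remainder: 1
echo $(( 1 << 4 ))        # bit shift: 16

# (( )) alone evaluates without substituting -- handy for counters:
(( COUNT++ ))
echo $COUNT               # 8
```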
Word Splitting
- Word splitting is an interesting feature of the shell that allows it to identify words within the results of parameter expansion, command substitution and arithmetic expansion, and then split them apart
- There is a shell variable known as IFS, which stands for Internal Field Separator
- This variable defines the characters which can separate words, and the default IFS is ‘<space><tab><newline>’
- Also note that the first character of IFS is used to separate the found words during splitting
- Try the following:
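For example, re-pointing IFS changes how an unquoted expansion is split ( the colon-separated record is illustrative ):

```shell
#!/bin/bash
# Default IFS splits on space/tab/newline, so this is one word:
LINE="alice:x:1000"
for word in $LINE ; do echo "word: $word" ; done   # word: alice:x:1000

# Change IFS to a colon and the same expansion splits into fields:
OLDIFS=$IFS
IFS=:
set -- $LINE
IFS=$OLDIFS        # always restore IFS when done

echo "user: $1"    # alice
echo "uid:  $3"    # 1000
```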
Pathname Expansion
- Pathname expansion is nerd-speak for how wildcards work in the shell
- This shouldn’t require review, but recall the three wildcards: * ( any string, including empty ), ? ( exactly one character ) and [...] ( any one of the listed characters )
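A quick sketch exercising all three against a scratch directory ( the filenames are illustrative ):

```shell
#!/bin/bash
DIR=$(mktemp -d)
touch "$DIR/ant" "$DIR/art" "$DIR/bat" "$DIR/cat1"
cd "$DIR"

A_FILES=$(echo a*)          # * matches any string: "ant art"
ONE_CHAR=$(echo ?at)        # ? matches exactly one character: "bat"
SET_MATCH=$(echo [bc]at*)   # [bc] matches one listed character: "bat cat1"

echo "$A_FILES / $ONE_CHAR / $SET_MATCH"
cd - > /dev/null
rm -rf "$DIR"
```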
Sample Code
- Next, we will spend some time breaking down and understanding a number of commonly used scripts from Rackspace
- These scripts are available at:
- http://rackspace.edgecloud.com/adv-bash-scripting
Lab
- Modify health-report.sh from the earlier lab:
- Add a new flag, -m, to create a list of process names and memory percentages, sorted descending by memory usage.
- Also, add a -c flag to indicate “collect only” mode. The user should not need to supply an email in this mode. In this mode, the script should produce the requested reports ( from the other flags ), but instead of emailing them immediately, it should collect them in a file under /tmp called health-report.YYYY-MM-DD
- You can simply append each new report to the file, but include a header in front of each new report that has the date/time
- Finally, add a -r flag which accepts a date in YYYY-MM-DD form, and emails the requested report to the supplied email address
- Make sure to produce meaningful error messages for all failures
Overview
- There are a few other topics that should be covered, but that did not fall under any of the previous sections
- Here documents
- Subshell executions
- Command separators
- Conditionals with the shell
Subshell Executions
- Sometimes, it is convenient to execute a command within a subshell, which isolates it from the current shell
- It cannot impact the environment or working directory of the current shell
- You can treat the subshell as an individual command, using redirection and pipes as needed
- Simple example:
( cd /home ; ls a* ) | wc -l
- This will list a count of the home directories starting with the letter a. The cd did not change the working directory of the main shell
Command Separators
- There are several ways to separate commands:
- Semicolon ( ; )
- This separates commands and does not provide any relation between the commands. They are simply executed one after another, left to right.
- Ampersand ( & )
- This puts the left command in the background and starts executing the next command immediately
- Double Ampersand ( && )
- This will execute the right command if the left command exited with a zero/success
- Double pipe ( || )
- This will execute the right command if the left command exited with a non-zero/fail
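A short sketch of each separator in action:

```shell
#!/bin/bash
# ; runs both commands unconditionally, left to right:
echo first ; echo second

# && only runs the right side on success, || only on failure:
true  && echo "ran because left succeeded"
false || echo "ran because left failed"

# Combined, they make a compact if/else -- with one caveat: if the
# && branch itself fails, the || branch runs as well.
echo root | grep -q root && STATUS=found || STATUS=missing
echo "grep result: $STATUS"
```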
Useful Tools in Scripting
Overview
- There are, of course, many, many tools to use while scripting, but some are more powerful, or more frequently used
- We will overview three of these tools now:
Overview
- Finally, a few topics to get fancy!
- Trapping signals
- Terminal codes to get colors and special modes
- Automagic logging with coproc
Terminal Codes with Variables