Logo
Logo

Programming by Stealth

A blog and podcast series by Bart Busschots & Allison Sheridan.

PBS 154 of X: Expansions & Brackets Redux (Bash)

02 Sep 2023

This final Bash instalment is intended to pull together everything we’ve learned so far into a sensible structure, and to introduce the correct Bash jargon so you can effetely search for the answers you need going forward. Remember, Bash really is a very dense language, so needing to check the manual and/or quick reference guides is totally expected. Unless you’re one of the authors of Bash, or unless you spend all day every day writing in Bash, you’re almost certain forget some of the details. The key is to know what Bash and can can’t do, and the right keywords to search for when you hit a proverbial bump in the road.

Having said that, we will learn a few new details in this instalment. None of these new things are vital to know, but they can be darn useful, and they’ll get you instant nerd street-cred when fellow scripters see you using them 🙂

Matching Podcast Episodes

Listen along to this instalment on episode 775 of the Chit Chat Across the Pond Podcast.

You can also Download the MP3

Read an unedited, auto-generated transcript: CCATP_2023_09_02

Episode Resources

PBS 153 Challenge Solution

The challenge from the previous instalment was quite basic — update your solution to the previous challenge to replace any duplicated code with functions.

In my previous solution there was just one obvious piece of repeated logic — checks for whether or not something is an integer, resulting and many if statements starting:

if echo "$OPTARG" | egrep -q "$intRE"

I added a simple function definition to the top of my script:

# test if the first argument is an integer
# Arguments   : NONE
# STDIN       : value to test
# STDOUT      : NOTHING
# Return Codes:
#   0 - the value is an integer
#   1 - the value is not an integer
is_int () {
    echo "$1" | egrep -q '^[-]?\d+$' && return 0
    return 1
}

Which then allowed me to replace those if statements with simpler and clearer ones of the form:

if is_int $OPTARG

You’ll find my full sample solution in the Instalment ZIP as pbs153-challengeSolution.sh.

Command Grouping

Sometimes it’s useful to run multiple commands as if they were one. Two common reasons are to have a different scope for a few commands, and to redirect streams for multiple commands at once.

When you decide you want to group commands, you have two choices:

  1. Use { } to run the grouped commands in the current shell, this means the commands share the same scope as the rest of the script.
  2. Use ( ) to run the grouped commands in a sub-shell so they run in a copy of the script’s scope, so any changes to variables don’t affect the rest of the script.

As an example, this script (pbs154a-noGrouping.sh in the instalment) writes some output to STDOUT, and some to a log file:

#!/usr/bin/env bash

dessert=pancakes
echo "Dessert is $dessert"
echo "Initial dessert: $dessert">log.txt # start new file
dessert=waffles
echo "Updated dessert: $dessert">>log.txt # append
echo 'dessert is tasty!'>>log.txt # append
echo "Dessert is now $dessert"

Notice that the first time we write the file we replace its content (>), then we append to it (>>).

Running the script and viewing the contents of the log file gives the following results:

bash-3.2$ ./pbs154a-noGrouping.sh 
Dessert is pancakes
Dessert is now waffles
bash-3.2$ cat log.txt
Initial dessert: pancakes
Updated dessert: waffles
dessert is tasty!
bash-3.2$ 

We can use command grouping to simplify this script (pbs154b-grouping.sh):

#!/usr/bin/env bash

dessert=pancakes
echo "Dessert is $dessert"

# grouped commands
{
    echo "Initial dessert: $dessert" # no need to redirect!
    dessert=waffles
    echo "Updated dessert: $dessert" # no need to redirect!
    echo 'dessert is tasty!' # no need to redirect!
}>log.txt # start new file

echo "Dessert is now $dessert"

Notice that in this example the $dessert variable inside and outside the group are in the same scope — the change made in the group affects the code outside the group:

bash-3.2$ ./pbs154b-grouping.sh 
Dessert is pancakes
Dessert is now waffles
bash-3.2$ cat log.txt
Initial dessert: pancakes
Updated dessert: waffles
dessert is tasty!
bash-3.2$ 

If we change to grouping in a sub-shell then the commands inside the group see the initial value for $dessert from the code outside the group, but the change made is only seen inside the group. We can see this in pbs154c-subshell.sh:

#!/usr/bin/env bash

dessert=pancakes
echo "Dessert is $dessert"

# grouped commands
(
    echo "Initial dessert: $dessert" # no need to redirect!
    dessert=waffles
    echo "Updated dessert: $dessert" # no need to redirect!
    echo 'dessert is tasty!' # no need to redirect!
)>log.txt # start new file

echo "Dessert is still $dessert"

Running the script now shows the grouped commands can see the outer code’s variables, but not alter them:

bash-3.2$ ./pbs154c-subshell.sh 
Dessert is pancakes
Dessert is still pancakes
$ cat log.txt
Initial dessert: pancakes
Updated dessert: waffles
desert is tasty!
bash-3.2$ 

A very important point to note is that the braces cannot be cuddled, and the closing brace must be a separate command, i.e. it must be the start of a line, or, be separated from the last command in the group by a semicolon (;).

Expansions in Bash

Throughout every instalment in this series-within-a-series, and throughout just about every instalment of Taming the Terminal we have been making use of shell expansions without explicitly calling attention to that fact.

When we pass arguments to terminal commands the values we type and the values received by the command are often not the same. The official name for these transformations are Shell Expansions.

If that sounds a but abstract, we can use the echo command to illustrate the point — echo simply prints out the value of its arguments, so if there were no shell expansions, echo would print back exactly what we type in, but of course we know it does not, as illustrated by this little sequence:

bash-3.2$ echo ~
/Users/bartbusschots
$ echo 'Hello World'
Hello World
$ echo $PWD
/Users/bartbusschots
bash-3.2$

Bash’s official documentation lists seven distinct expansions, and mentions that when all the expansions are done it performs another transformation (that for some reason is not referred to as an expansion) — quote removal. It also lists the order in which all these transformations get applied:

Expansion Description Result
Brace Expansion Basic sequence generation, e.g. doc{1..3}.txtdoc1.txt doc2.txt doc3.txt String
Tilde Expansion Home directory path lookups (and more), e.g. ~/Documents/Users/bartbusschots/Documents File path
Shell Parameter Expansion Shell variable value lookups, e.g. I am $USER or I am ${USER}I am bart. Also works with arrays, given a=(a b c); ${a[#]}3, ${a[@]}a b c & ${a[0]}a String or List of strings
Command Substitution Substitute the result (STDOUT) of executing a command — e.g. today is $(date +%A)today is Wednesday String
Arithmetic Expansion Perform a numeric calculation and use the result, e.g. doc$(( 4 + 5 )).txtdoc9.txt Number as String
Word Splitting The results of any parameter expansions, command substitutions, or arithmetic expansions that were not quoted get split on $IFS (blank space by default). Parameter list
Filename Expansion Pattern-based path lookups, only paths that exist are included Paths as parameter list
Quote Removal Quotes surrounding parameters get removed The final parameter list

We’re not going to go into detail on all of these, but I do want to draw your attention to some, and highlight some nice little power features I’ve not found an excuse to mention before.

Brace Expansion

When you need it, brace expansion is very powerful, but you very rarely see it used.

Brace expansion explodes one argument into many by repeating part of it over a range of values. You define an optional prefix, a range enclosed in curly braces, and an optional postfix. The range can be letters or numbers, and the start and end are separated by two periods.

Note — the prefix, the range, and the postfix must be cuddled to the curly brackets!

As an example, the following command will create three blank text files named doc1.txt, doc2.txt, and doc3.txt:

touch doc{1..3}.txt

Tilde Expansion

While it has some advanced features we’ll be ignoring, the primary use of ~ expansions is to find the paths to users home directories. The ~ character followed by a username gets expanded into the path to that user’s home directory, if you leave out the username it defaults to the current user (i.e. $USER). If the user doesn’t exist, or the user doesn’t have a home dir, the text is left un-changed:

bash-3.2$ echo ~
/Users/bartbusschots
$ echo ~poop
~poop
bash-3.2$

Shell Parameter Expansion and Arrays

At its most basic level shell parameter expansion just replaces a variable’s name with its value, but when you expand an array something a little more interesting happens — each time the expansion meets another array element it splits the result into a new argument (or word in Bash jargon). Consider the following array:

arr=(one 2 iii)

Now, let’s expand it as the arguments to a loop that prints each entry on a new line pre-fixed with a star by running for i in "pre${arr[@]}post"; do echo "* $i"; done:

bash-3.2$ for i in "pre${arr[@]}post"; do echo "* $i"; done
* preone
* 2
* iiipost
bash-3.2$ 

Notice that starting the array expansion did not start a new argument, but the boundary between the first and second element did, as did that boundary between the second and third, and again, the end if the array also didn’t start a new argument.

Word Splitting

When Bash finishes replacing any variables with their values (parameter expansion), any embedded commands with their outputs (command expansion), and any arithmetic expansions with their results, those outputs get split into separate arguments (or words to use the documentation’s vocabulary). The default separator is white space, i.e. spaces, tabs, and newline characters, but if you want the splitting to happen on another delimiter you can specify one by setting the value of $IFS. Word splitting is not performed within double-quoted strings!

Consider the following string variable:

s='a string with spaces'

If we loop over that string without quoting its expansion, the loop will run once for each word in the string, because word splitting will happen after the variable’s value is expanded. We can see this by using a loop to print all the arguments it sees on a separate line pre-fixed with a star with for i in $s; do echo "* $i"; done:

bash-3.2$ for i in $s; do echo "* $i"; done
* a
* string
* with
* spaces
bash-3.2$ 

If we run the same loop by quote the string with for i in "$s"; do echo "* $i"; done we see that no word splitting happens:

bash-3.2$ for i in "$s"; do echo "* $i"; done
* a string with spaces
bash-3.2$ 

The same logic explains the behaviour of arrays depending on whether or not you quote their expansion. Consider the following array:

a=('string one with space' string2)

If we loop over that array and print each word the loop sees on a separate line with a bullet before it without quoting the variable expansion, i.e. by looping over ${a[@]} we see that word splitting happens:

bash-3.2$ for i in ${a[@]}; do echo "* $i"; done
* string
* one
* with
* space
* string2
bash-3.2$

But if we do the same thing but with the array expansion quoted ("${a[@]}"), we see word splitting is skipped for the elements in the array:

bash-3.2$ for i in "${a[@]}"; do echo "* $i"; done
* string one with space
* string2
bash-3.2$ 

Notice that the splitting of the array itself is not affected by word splitting, that splitting is part of the parameter expansion itself.

Filename Expansion

While we all know the basics of filename expansion, like that *.txt means ‘all text files’, it is worth looking at the details a little more.

Firstly, you are performing a file system search, so the result will be one or more file paths separated by spaces. If there are no matching files found, the text of the expansion remains in the argument list. We can see this by looping over a filename expansion that returns no results (*.waffles), and printing each received arg on a new line pre-fixed with a dash for clarity, i.e. with the command for f in *.waffles; do echo "- $f"; done. As you can see, the expansion didn’t match anything, so it gets passed to the echo command as a single argument:

bash-3.2$ for f in *.waffles; do echo "- $f"; done
- *.waffles
bash-3.2$

Another important point to note is that file name expansion happens after word splitting, so spaces in file names do not cause the filenames to get split into multiple arguments. This surprises some, so let’s prove it!

First, create three empty files with spaces in their name:

touch 'spaced file'{1..3}.txt

Now, look over them without quoting the expansion, and print each result one a separate line pre-fixed with a dash with the command for f in spaced*; do echo "- $f"; done, as you can see, the file names did not get split:

bash-3.2$ for f in spaced*; do echo "- $f"; done
- spaced file1.txt
- spaced file2.txt
- spaced file3.txt
bash-3.2$

Now, let’s take a moment to remind ourselves that there are more possible wild-card characters than just the widely used *!

Character Description
* Zero or more occurrences of any character.
? One occurrence of any character.
[…] Exactly one occurrence of any character between the brackets, e.g. [abc] means an a, a b, or a c. Can contain alphabetic and numeric ranges, e.g. [a-c] also means an a, a b, or a c.
[:POSIX_CLASS:] A single occurrence of a character in a given named POSIX character class, e.g. [:punct:] for any one punctuation character.

Note that if you have a new enough version of Bash, and if you enable a special option, you can have more powerful regular expressions, but we won’t be covering those here, you’ll find the details in the Bash documentation on pattern matching.

Also note that you can get a full list of the named POSIX character classes on the WikiPedia page on Regular Expressions.

Quote Removal

Simply put, all those quotes you add to tell Bash how to handle the text you type get stripped off before the arguments get passed to the commands by Bash.

The only real subtlety to bear in mind is the treatment of empty strings produced by expansions.

If an un-quoted expansion produces no output it effectively evaporates into nothingness. But, a quoted expansion that produces no output results in an empty string being passed as an argument. This is important for commands with positional arguments!

Quick Reference/Redux — An Ocean Brackets Summarised

One of the most confusing things to keep straight in your head when coding in Bash is the meanings of all the different kinds of brackets, so here’s a quick refresher on what we’ve learned. Hopefully this concise list will prove a useful quick-reference going forward.

Brackets Description Special Rules Result
[] (cuddled) Character classes — when square brackets appear without a space after the opening one or before the closing one that are interpreted as delimiting a character class within a filename expansion. Must a valid Bash character class definition  
[ ] (un-cuddled) Legacy tests — included in Bash for backwards compatibility with SH, do not use    
[[ ]] Tests — evaluate a boolean expression to an exit code. Primarily used in if statements, e.g. if [[ $n -eq 4 ]]; then echo 'FOUR!'; fi Variables do not need to be quoted Exit code
${ } Variable expansion — fetch the value of a variable. The braces can be omitted when the variable is not an array and the variable name has no special characters. E.g. echo "I am ${User}" or echo "I am $User"   String
$() Sub-shell Expansion — execute commands in a sub-shell and substitute with text from STDOUT. E.g. echo "It's now $(date)"   String
$(( )) Arithmetic Expansion — perform calculation and substitute the result, e.g. echo "1 + 1 = $(( 1 + 1 ))" Automatic variable expansion — use x instead of ${x} or $x String
{} (cuddled) Rage Expansion — explode an optional prefix, a range, and an optional postfix into separate arguments, e.g. f{1..3}.zipf1.zip, f2.zip & f3.zip The parts of the expansion must be cuddled Argument List
{ ; } (not cuddled) Command Grouping — combine multiple commands into one without creating a sub-shell The content must not be cuddled to the braces, and the closing brace must be separate Exit code
( ) Sub-shell Command Grouping — combine multiple commands in a sub-shell   Exit code
=() Array Declaration — declare an array. Note that the = must be cuddled to the opening parenthesis and the array name. E.g. myArray=(waffles pancakes pizza)   Array

The Official Bash Docs

You may have noticed that for most of this series-within-a-series I’ve pushed the official docs much less frequently than I usually do. This is because I feared they may confuse more than help. In my opinion the docs are organised in such a way that they only make sense if you already know enough Bash to think in a bash-like way, so they’re a useful reference for existing Bash users, but are likely to frustrate more than help a Bash newbie.

Throughout this series I’ve been careful to use the same wording the docs do, so now, as we arrive at the end of our Bash journey, it’s time to bookmark those docs!

https://www.gnu.org/software/bash/manual/bash.html

Final Thoughts

That brings us to the end of what proved to be a much longer series-within-a-series than I’d planned or imagined. I’ve learned so much in preparing these instalments — I really feel I grok Bash now, and instead of being frustrated and confused by it’s quirks, I now understand it’s logic. The syntax is of course very dense, and I know I’ll be checking the manual regularly, I’m now confident that I’ll be able to find the relevant section quickly, and actually understand what it says!

My intention had been to dive headlong into the XKPasswd rewrite after Bash, and I expected to be doing that early in the summer, just in time for our usual summer hiatus, but Bash proved so rewarding, and the community so engaged that we ran very long, and didn’t pause!

The paperwork is not quite all signed, sealed, and delivered, but my expectation is that I’ll be taking extended leave in early 2024 and doing the XKPasswd as a full-time endeavour. So, what to do I the mean time?

Well, as it happens I’ve bumped into another programming and terminal related technology that I need to really learn, and as Allison explained so we’ll in her recent MacStock presentation, the best way to learn is to teach!

For both a personal project and something I’m developing in work I need to process JSON data on terminal. I’ve been using the wonderful jq terminal command for years, but only to do simple things. Like with Bash until now, I’ve used 10% of jq’s capabilities for years, but without a deep understanding. It’s time to really learn the command’s very powerful query language.

Join the Community

Find us in the PBS channel on the Podfeet Slack.

Podfeet Slack