Logo
Logo

Programming by Stealth

A blog and podcast series by Bart Busschots & Allison Sheridan.

PBS 151 of X: Printf and More (Bash)

27 May 2023

In the previous instalment we dug deep into how Bash interacts with the three standard POSIX data streams (/dev/stdin, /dev/stdout & /dev/stderr), and we finished with a look at the special file /dev/tty which provides access to the terminal our script is running in, regardless of where the three standard streams have been redirected to. The plan for this instalment was to revisit scope, and to look at better string outputs with printf rather than echo, but thanks to some excellent feedback from listener Jill of Kent (her choice of title), we’ll postpone scope until the next instalment, and start this instalment with an excellent tip related to the [[ builtin and /dev/tty.

Matching Podcast Episode

Listen along to this instalment on episode 768 of the Chit Chat Across the Pond Podcast.

You can also Download the MP3

Read an unedited, auto-generated transcript: CCATP_2023_05_27

Episode Resources

PBS 150 Challenge Solution

The challenge set at the end of the previous instalment was to update the menu script we’ve been working on for the previous few challenges to ensure all user interaction is via the terminal, even when standard in or standard out are redirected, and to allow the user control the source of the menu based on the following simple rules:

  1. By default, the menu will be read from menu.txt in the same folder as the script.
  2. The menu can be read from standard in with the option -m -.
  3. The menu can be read from any file with the option -m FILE.

To simplify things, there was the option to remove the snark (-s) flag.

I chose to remove the -s flag to minimise the distractions in my sample solution. You’ll find my full solution in the instalment ZIP as pbs150-challengeSolution.sh.

The first thing I want to draw your attention to is the use of redirects to and from /dev/tty to force the user interaction via the terminal. I added >/dev/tty after all echo statements related the accepting the order, and I added both a >/dev/tty and </dev/tty to the end of the select statement so it both writes to the terminal and reads from the keyboard no matter where standard in and out have been redirected:

# present the menu, with a done option
if [[ -z $limit ]]
then
    echo 'Choose your breakfast (as many items as you like)' >/dev/tty
else
    echo "Choose up to $limit breakfast items" >/dev/tty
fi
select item in done "${menu[@]}"
do
    # skip invalid selections ($item is empty)
    [[ -z $item ]] && continue

    # exit if done
    [[ $item == done ]] && break

    # store and print the item
    order+=("$item")
    echo "Added $item to your order" >/dev/tty

    # if we're limiting, check the limit
    if [[ -n $limit ]]
    then
        [[ ${#order[@]} -ge $limit ]] && break
    fi
done </dev/tty >/dev/tty

Finally, lets look at how I chose to initialise the menu.

To avoid code duplication, or, to avoid adding a lot of inline logic to the end of the while loop for processing the menu I chose to break the task into two parts:

  1. slurp the raw menu into a string
  2. process the raw menu line-by-line to built up the array of options

Looking at the first part, I chose to initialise the menu source to the default, then use a simple if statement to read from standard in or a file:

# initialise the options to their default values
# …
menuSource=$(dirname "$BASH_SOURCE")/menu.txt # menu.txt in script dir

# …

# slurp the menu into a string
menuString=''
if [[ $menuSource == '-' ]]
then
    menuString=$(cat)
else
    menuString=$(cat "$menuSource")
fi

With the menu loaded into a string named $menuString, I then loop through it line-by-line using a here string (<<<). As a reminder, this operator redirects the contents of a variable to standard in:

# process the menu
declare -a menu
while read -r menuLine
do
    # skip comment lines
    echo "$menuLine" | egrep -q '^[ ]*#' && continue

    # skip empty lines
    echo "$menuLine" | egrep -q '^[ ]*$' && continue

    # store the menu item
    menu+=("$menuLine")
done <<<"$menuString"

The logic inside the loop is un-changed from the previous version of the script.

The only other change was updating getopts to remove support for the -s flag, and add the -m optional argument.

We can verify that a menu can be loaded from an alternative file by using the -m optional argument to load the menu from the file menu-montyPython.txt which offers just one choice, spam 😉:

./pbs150-challengeSolution.sh -m menu-montyPython.txt

We can verify the correct handling of redirecting both standard out and standard in with the command:

echo -e "bacon\neggs\ntoast" | ./pbs150-challengeSolution.sh -m - > order.txt

This command redirects standard out and standard in, but the menu is still presented in the terminal, and the keyboard can still be used to enter the options. Finally, the chosen items are correctly written to order.txt, which you can verify with:

cat ./order.txt

Followup — Detecting Terminals

As mentioned in the introduction, listener Jill of Kent sent me a wonderful message after we posted the previous instalment with a related tip, and I thought it was just too good not to include in the series.

One of the things you may have noticed over the years on the terminal is that some commands seem to somehow know when they’re outputting to a terminal rather than a redirected stream, and behave differently in each case.

Many of the Git commands do this, for example, git log will show you the full Git commit history of your currently checked out branch, and when your output is going to a terminal, it will run the output through less for you so you get it one page at a time, waiting in your input to move to the next page. But, if you redirect the identical command to a file, the information gets written directly to the file without being run through less which would block the operation, e.g. git log > log.txt.

How does that work? Presumably, the POSIX spec provides an API for figuring out if a given stream is connected to a terminal, so the real question is, how does Bash expose this functionality?

Back in instalment 145 we learned to use the [[ builtin command to perform boolean tests, and we learned about some of the operators [[ supports. As I was at pains to point out, we only looked at a subset of the available operators, and one of the ones we ignored, -t, lets us test if a given file handle is a terminal or not. The syntax is a little abstract looking, because the operator works on the numeric ID for the file descriptor (0 for standard input, 1 for standard output, and 2 for standard error). If a stream is a terminal, then -t will give a successful exit code, and if not, it will give an error code, so we can test if standard out is a terminal with [[ -t 1 ]].

As a more practical example, let’s look at pbs151a-terminalTester.sh from the instalment ZIP:

#!/usr/bin/env bash

# check standard input
if [[ -t 0 ]]
then
    echo 'STDIN is a terminal' >/dev/tty
else
    echo 'STDIN is NOT a terminal' >/dev/tty
fi

# check standard output
if [[ -t 1 ]]
then
    echo 'STDOUT is a terminal' >/dev/tty
else
    echo 'STDOUT is NOT a terminal' >/dev/tty
fi

# check standard error
if [[ -t 2 ]]
then
    echo 'STDERR is a terminal' >/dev/tty
else
    echo 'STDERR is NOT a terminal' >/dev/tty
fi

We can see this little script in action with the following commands:

./pbs151a-terminalTester.sh
echo 'pancakes' | ./pbs151a-terminalTester.sh
./pbs151a-terminalTester.sh >/dev/null
./pbs151a-terminalTester.sh 2>/dev/null
echo 'pancakes' | ./pbs151a-terminalTester.sh >/dev/null 2>&1

Better String Formatting with printf

One of the most important programming languages in the history of computer programming is C. We’ve not written a single line of C code in this series, but C’s impact on what we have learned has been huge — semi colons to end lines, curly braces to wrap code blocks, if and while statements with conditions in round brackets, function calls with parameter lists wrapped in round brackets — all that is from C. Apart from its general style, C’s other biggest export is probably the printf function, which stands for ‘print format’ because it prints strings to standard out that can contain formatted versions of the values of variables. Do do this, printf uses a rich placeholder syntax, and it’s that syntax even more than the function itself that has spread far and wide.

The syntax for the actual printf command is trivial. The first argument is a format string, and subsequent arguments provide the values for the placeholders in the format string. If you have a format string with two placeholders, you’d call printf with three arguments — the format string, the value for the first placeholder, and the value for the second.

printf FORMAT_STRING [VAL …]

For example:

printf 'I like to have %s %d times a week!\n' pancakes 5

Bash’s printf command only supports one optional argument, -v, which allows you to specify a variable name to save the output to rather than printing it, so the following works:

printf -v dessert 'some %s please!\n' waffles
echo $dessert

It’s the format string syntax where both the power and the complexity lie!

printf Format Strings

The format string syntax looks simple at first glance, but it’s deceptively complex and powerful. The good news is that all the complexity is optional. The syntax for doing simple things is really simple so you can get started with printf very quickly, but you’ll probably never learn everything it can do, even in a lifetime of programming!

As is our custom, we’ll be cherry picking the most interesting aspects of the spec, so if you want all the details, you’ll need to spend some time with the documentation.

Let’s start with the basics — a format string consists of a series of zero or more of the following:

  1. escape sequences
  2. format specifications
  3. plain text

Escape sequences are pre-fixed with a backslash (\), format specifications start with a percentage sign (%), and everything that’s not part of an escape sequence or format spec is plain text, and is included in the output unchanged.

Escape Sequences

There are just three escape sequences we need to know about:

Sequence Description
\n Add a new line character
\t Add a tab character
\\ Add an actual backslash character

For the most part, Bash is smart enough to make these works as expected in both interpolated ("") and uninterpolated strings (''):

printf 'ho\tho\nho\thum\n'
printf "ho\tho\nho\thum\n"

However, the one exception to that is \\. When possible, I would advise avoiding using this escape sequence in an interpolated string, because you need to double-escape it!

printf 'ho\\hum\n'
printf "ho\\\\hum\n"

Finally note that printf does not end lines, so you need to add your own \n if you want to move to a new line!

Format Specifications

Format specifications always start with a % symbol and always contain a type, but they can optionally contain more details. For our purposes the following is the full syntax for format specifiers (there are actually even more options):

%[FLAGS][MIN_WIDTH][.PRECISION]TYPE

Since types are the only required part of the format spec, let’s start there. From our POV there are just three types and a special case:

Type Description
d Format as as whole number (think digits).
f Format as a floating-point number.
s Format as a string.
% Output an actual percentage symbol (i.e. %% is the format string for a percentage symbol)

This gives us the following simple usage:

printf '%d %s cost $%f\n' 5 pancakes 5.55
# outputs: 5 pancakes costs $5.550000

Straight away we see our first problem — we need to control the precision!

We can do that by pre-fixing the type with a period followed by the number of decimal places we’d like, so for a currency that would be %.2f:

printf '%d %s cost $%.2f\n' 5 pancakes 5.55
# outputs: 5 pancakes cost $5.55

Much better!

Now let’s look at some bigger numbers:

earthDiameter=40075.017
printf 'The earth is %fkm around the equator.\n' $earthDiameter
# outputs: The earth is 40075.017000km around the equator.

OK, let’s lose the decimals completely:

printf 'The earth is %.0fkm around the equator.\n' $earthDiameter
# outputs: The earth is 40075km around the equator.

Better, but we’re missing the thousand separators! How can we add those in? This is where we meet our first optional flag, the ' (single quote) character indicates that thousand separators should be used (think of it as a comma in the air 😉). As well as being weird, this is awkward, because a single quote ends an uninterpolated string. The simplest fix is to use interpolated strings when your format needs to specify the thousand separator:

printf "The earth is %'.0fkm around the equator.\n" $earthDiameter
# outputs: The earth is 40,075km around the equator.

Note that the flags are the first thing after the % symbol.

Aligning Data with printf

If you’re just going to insert values into lines of text what we’ve covered so far is all you’re likely to need, but if you need to align your data, say in some kind of table-like output, you need to learn a few more things.

Firstly, you’ll need to specify the required minimum width for each format spec by adding a number between the flags and the precision (both of which are optional). By default the rendered text is right-aligned and padded with spaces as needed.

For example:

rowFormat='%20s %8.2f\n'
printf "$rowFormat" Waffles 4.5; printf "$rowFormat" Pancakes 5.4

This prints:

             Waffles     4.50
            Pancakes     5.40

There’s a few things to note here. Firstly, to make the output look right when typed into a terminal, I placed multiple printf commands on one line by using the ; separator. If I was running this code in a script I could simply have those two statements as separate lines.

Secondly, note that I used a variable to store the format - this is particularly important of you’ll be outputting a lot of lines of information! Finally, also note that you need to quote variables passed as format strings.

To left-align our description we use the - flag:

rowFormat='%-20s $%8.2f\n'
printf "$rowFormat" Waffles 4.5; printf "$rowFormat" Pancakes 5.4

This prints:

Waffles              $    4.50
Pancakes             $    5.40

In some situations you might want to pad numeric fields with 0 rather than spaces, you can do that with the 0 flag:

rowFormat='%03d\n'
printf "$rowFormat" 1; printf "$rowFormat" 20; printf "$rowFormat" 300

Prints:

001
020
300

If you have a mix of positive and negative numbers you might want to either add + symbols to the positive ones which you can do with the + flag:

rowFormat='%+d\n'
printf "$rowFormat" -1; printf "$rowFormat" 0; printf "$rowFormat" 1

Which prints:

-1
+0
+1

A nicer solution might be to add a space before positive numbers, which you can do by using a space as a flag, no, seriously, a space is a flag!

rowFormat='% d\n'
printf "$rowFormat" -1; printf "$rowFormat" 0; printf "$rowFormat" 1

Prints:

-1
 0
 1

You can of course use multiple flags in the same format spec:

rowFormat="% '.2f\n"
printf "$rowFormat" -1234.5678; printf "$rowFormat" 9876.5432

Prints:

-1,234.57
 9,876.54

One final tip — you can use the precision operator to truncate strings. Remember that the width property specifies a minimum width, so it won’t stop long strings breaking a table. If you want to set a maximum width you can combine the precision operator (.) with the string type, e.g.:

rowFormat='%-3.3s %2d\n'
printf "$rowFormat" monday 5; printf "$rowFormat" tuesday 11

Prints:

mon  5
tue 11

Final Thoughts

Firstly, Jill’s message serves as a good reminder that this series is very much community driven, if you’d like to join in, check out Allison’s Slack at podfeet.com/slack!

Secondly, I hope you can appreciate just how powerful printf is for formatting strings. What’s extra useful is that the printf format syntax is very portable. Like Perl’s Regular Expression syntax, printf formatting is used very widely, so having at least the basics mastered is very empowering!

An Optional Challenge

Write a script to render multiplication tables in a nicely formatted table. Your script should:

  1. Require one argument — the number to render the table for
  2. Default to multiplying the number given by 1 to 10 inclusive
  3. Accept the following two optional arguments:
    • -m to specify a minimum value, replacing the default of 1
    • -M to specify a maximum value, replacing the default of 10

For bonus credit, pipe the output through less if and only if standard out is a terminal.

Join the Community

Find us in the PBS channel on the Podfeet Slack.

Podfeet Slack