Logo
Logo

Programming by Stealth

A blog and podcast series by Bart Busschots & Allison Sheridan.

PBS 152 of X: xargs & Easier Arithmetic (Bash)

24 Jun 2023

While developing my sample solution to the challenge set at the end of the previous instalment I found myself running into three pain-points, and it so happens that addressing them in this instalment and the next fits in perfectly into the bigger arc of our shell scripting journey.

The two pain points we’ll focus on today fell out of my simple desire to use the ASCII characters for drawing pretty tables to frame my table. To do that I needed to know the maximum length of each column, and doing that reliably required a lot of character counting and arithmetic.

To do the character counting efficiently, I needed to make use of an extremely useful terminal command that’s simultaneously extremely simple, and extremely confusing! I’ve been dreading trying to explain it, but now is the time, prepare to meet xargs!

After that heavy lifting we’ll move on to doing arithmetic in a cleaner and simpler way than piping strings through the bc (basic calculator) command.

The final pain point — copying and pasting code - that will be the main focus of the next instalment, POSIX functions, and, the very much related topic of variable scope in Bash.

Matching Podcast Episodes

Listen along to this instalment for Part A on episode 771 of the Chit Chat Across the Pond Podcast.

You can also Download the MP3

Read an unedited, auto-generated transcript: CCATP_2023_06_24


Listen along to this instalment for Part B on episode 773 of the Chit Chat Across the Pond Podcast.

Listen along to this instalment on episode 773 of the Chit Chat Across the Pond Podcast.

You can also Download the MP3

Read an unedited, auto-generated transcript: CCATP_2023_07_08

Episode Resources

PBS 151 Challenge Solution

The challenge set at the end of the previous instalment was to write a script to render multiplication tables in a nicely formatted table. The script was to take the number it should do the tables for as a required argument, default to rendering the table from 1 to 10, and allow optional named arguments -m and -M for custom minimum and maximum values. There was also bonus credit for outputting the table through less if and only if the output was going to a terminal.

On reflection, I should not have been thinking minimum and maximum, but start and end, so I changed the spec for my sample solution to:

Write a script that renders the multiplication tables for a given number. By default the table should start at 1 and run to 10. The script should:

  1. Require one argument — the number to render the table for
  2. Accept the following two optional arguments: * -s to specify a custom starting value, replacing the default of 1
    • -e to specify a custom ending value, replacing the default of 10

Because I needed it to save my own sanity while trying to debug my sample solution, I also added an extra feature not in the requirements, a -d flag to enable debug output to STDERR.

You’ll find the full code for my sample solution in the file pbs151-challengeSolution.sh in the instalment ZIP:

#!/usr/bin/env bash

# Exit codes:
# 1: missing required args or unsupported flags or optional args
# 2: invalid value for supported arg

#
# === set defaults ===
#
start=1    # default to starting at 1
end=10     # default to ending at 10
doDebug='' # default to not debugging

#
# === process args ===
#

# utility variables
usage="Usage: $(basename $0) [-s START] [-e END] [-d] N"
intRE='^-?[0-9]+$'

# process optional args & flags
while getopts ':s:e:d' opt
do
    case $opt in
        s)  
            if echo "$OPTARG" | egrep -q "$intRE"
            then
                start=$OPTARG
            else
                echo "invalid starting value '$OPTARG', must be a whole number" >&2
                echo "$usage" >&2
                exit 2
            fi
            ;;
        e)
            if echo "$OPTARG" | egrep -q "$intRE"
            then
                end=$OPTARG
            else
                echo "invalid ending value '$OPTARG', must be a whole number" >&2
                echo "$usage" >&2
                exit 2
            fi
            ;;
        d)
            doDebug=1
            ;;
        ?)
            echo "$usage" >&2
            exit 1
            ;;
    esac
done
shift $(echo "$OPTIND-1" | bc)

# process positional arg
n=$1
if [[ -z $n ]] # no first real arg
then
    echo "$usage" >&2
    exit 1
fi
if ! echo "$n" | egrep -q "$intRE" # invalid first real arg
then
    echo "invalid number '$n' first argument must be a whole number" >&2
    exit 2
fi

# if debugging, print values
if [[ -n $doDebug ]]
then
    echo "DEBUG: n       = $n" >&2
    echo "DEBUG: start   = $start" >&2
    echo "DEBUG: end     = $end" >&2
fi

#
# === Build row format string ===
#

# calculate maximum lengths for each column when nicely formatted
nLen=$(printf "%'d" $n | wc -c | xargs) # character length of the number when formatted
maxMLen=1 # maximum character length of any formatted multiplier
maxPLen=1 # maximum character length of any formatted product
for m in $(seq $start $end)
do
    # Multiplier length when nicely formatted
    mLen=$(printf "%'d" $m | wc -c | xargs)
    [[ $mLen -gt $maxMLen ]] && maxMLen=$mLen

    # Product length when nicely formatted
    pLen=$(echo "$n*$m" | bc | xargs printf "%'d" | wc -c | xargs)
    [[ $pLen -gt $maxPLen ]] && maxPLen=$pLen
done

# calculate the length of the middle piece of the table caps
capLen=$(echo "8+$nLen+$maxMLen+$maxPLen" | bc)

# if debugging, print the calculated numbers
if [[ -n $doDebug ]]
then
    echo "DEBUG: nLen    = $nLen" >&2
    echo "DEBUG: maxMLen = $maxMLen" >&2
    echo "DEBUG: maxPLen = $maxPLen" >&2
    echo "DEBUG: capLen  = $capLen" >&2
fi

# build the row format string, and print if debugging
fString="┃ %'"$nLen"d x %'"$maxMLen"d = %'"$maxPLen"d ┃\n"
[[ -n $doDebug ]] && echo "DEBUG: fString = $fString" >&2

# build the cap insert, and print if debugging
capMid=''; for c in $(seq 1 $capLen); do capMid+='━'; done
[[ -n $doDebug ]] && echo "DEBUG: capMid  = $capMid" >&2

#
# === print the table ===
#

# the variable to build the table into
table=''

# render the top cap row
printf -v row '┏%s┓\n' $capMid
table+=$row

# print the table body
for m in $(seq $start $end)
do
    # calculate the product
    p=$(echo "$n*$m" | bc)

    # render the table row
    printf -v row "$fString" $n $m $p
    table+=$row
done

# render the bottom cap row
printf -v row '┗%s┛\n' $capMid
table+=$row

# print the table
if [[ -t 1 ]]
then
    echo "$table" | less --no-init --quit-if-one-screen
else
    echo "$table"
fi

You can see the three-times tables by running the script with the argument 3:

./pbs151-challengeSolution.sh 3

This sample solution is really quite long indeed, and the majority of the code is using concepts that we’ve been using heavily in recent instalments, so I won’t go through it line-by-line, but I do want to draw your attention to a few important points.

Enabling Easier Debugging

I want to start by pointing out some features I added to make the script easier to debug. Up to this point in this series-within-a-series, our scripts have been too simplistic to warrant this kind of software engineering effort, but in the real world I do these kinds of things when writing even moderately complex code.

Firstly, if a script can fail for multiple reasons, assign different exit codes to each reason, and document them clearly. Notice that my sample solution terminates with an exit code of 1 when the user passes an incorrect set of arguments, and an exit code of 2 if they pass a valid set of arguments containing an invalid value for one or more of those arguments. And, notice the clear comments at the top of the script describing these exit codes:

# Exit codes:
# 1: missing required args or unsupported flags or optional args
# 2: invalid value for supported arg

If you run the script without the minimum needed single argument you can see it exits with 1 by running the script that way, then printing the special POSIX variable $? to show the most recent exit code:

./pbs151-challengeSolution.sh
echo $?

If you pass an invalid number, you get an exit code of 2:

./pbs151-challengeSolution.sh pancakes
echo $?

Secondly, when you’re working on code of any kind you often want more detailed feedback than you’ll want when the code is being used for real. What would be infuriating noise in real use is invaluable insight when developing! You can eat your proverbial cake and still have it by adding a flag to put your code into debug mode.

I chose to use an optional -d flag to enable debug mode:

./pbs151-challengeSolution.sh -d 3

To implement this I used a variable named $doDebug with a default value of an empty string. Debug messages only get printed when this variable is not an empty string. This approach allows me to use the -n operator to test for debug mode (see the code snippets below).

Another point of note is that it’s best to keep debug messages separate from code’s true output. This is why I chose to write all debug messages to STDERR.

This gives you the power to redirect debug and regular messages separately:

# save debug messages to a log file
./pbs151-challengeSolution.sh -d 3 2>debug.log

# see only debug messages (STDOUT to /dev/null)
./pbs151-challengeSolution.sh -d 3 >/dev/null

Putting it all together, these code snippets show how I implemented debugging support:

while getopts ':s:e:d' opt
do
    case $opt in
        # …
        d)
            doDebug=1
            ;;
		# …
    esac
done

# …

# if debugging, print values
if [[ -n $doDebug ]]
then
    echo "DEBUG: n       = $n" >&2
    echo "DEBUG: start   = $start" >&2
    echo "DEBUG: end     = $end" >&2
fi

Pretty ASCII Tables

Before GUIs became popular, text-only interfaces could draw nice tables. This is because even old text-encoding schemes like ASCII made use for a set of characters that form the parts of a table, including corners and junctions. You can see and copy these characters using your OS’s character viewer.

You’ll see them in use throughout my templating strings.

Calculating Widths

Something important to realise is that because we are now using printf to pretty-print our integers, knowing their character length has become much more complicated — it’s not simply determined by the order of magnitude and the sign, but by the number of thousand separators needed. I’m sure it’s possible to derive a formula to get the answer, but I opted for a much simpler approach — loop over all the numbers in the sequence, pretty print them, and count their length in characters with wc -c (word count with the character option). This sounds easy to do, but it’s actually quite complicated, and to do it efficiently, I needed to utilise the two most common uses of the xargs command:

# calculate maximum lengths for each column when nicely formatted
nLen=$(printf "%'d" $n | wc -c | xargs) # character length of the number when formatted
maxMLen=1 # maximum character length of any formatted multiplier
maxPLen=1 # maximum character length of any formatted product
for m in $(seq $start $end)
do
    # Multiplier length when nicely formatted
    mLen=$(printf "%'d" $m | wc -c | xargs)
    [[ $mLen -gt $maxMLen ]] && maxMLen=$mLen

    # Product length when nicely formatted
    pLen=$(echo "$n*$m" | bc | xargs printf "%'d" | wc -c | xargs)
    [[ $pLen -gt $maxPLen ]] && maxPLen=$pLen
done

First, note that the overall algorithm is not complex:

  1. assume the maximum length of the multiplier and product are both one.
  2. loop over all the multipliers in the sequence, and for each one:
    1. calculate the product
    2. calculate the length, in characters when pretty-printed, of both the multiplier and product
    3. for each, check if the newly calculated max length is bigger than the stored max length, and if it is, update the stored max length

The devil is very much in the detail, in calculating the character lengths. Understanding just this one line is key to understanding the power of xargs:

pLen=$(echo "$n*$m" | bc | xargs printf "%'d" | wc -c | xargs)

Put a mental pin in this, we’ll be breaking it down piece-by-piece in the next section.

Before we get into the weeds of xargs I want to focus on two more things.

Building the Format String

Let’s start with the new skill this assignment was intended to test — string formatting with printf. Because I wanted my table to have a border on both sides, I needed to be sure each row was exactly the same length. Once I used the loop above to calculate the maximum width of the multiplier and the product, I could combine that with the width of the number being multiplied to build my template:

fString="┃ %'"$nLen"d x %'"$maxMLen"d = %'"$maxPLen"d ┃\n"

My string starts and ends with a space, the ASCII character for a vertical table side, and another space. It then has a format specification for a whole number, a space, an x, and another space, another format specification for an integer, another space, an =, and a space, and a final format specifier.

To understand how this string is being built up, it’s important to remember that in Bash, placing strings end-to-end concatenates them, so there are actually 7 strings being concatenated:

  1. "┃ %'"
  2. $nLen
  3. "d x %'"
  4. $maxMLen
  5. "d = %'"
  6. $maxPLen
  7. "d ┃\n"

If we imagine $nLen is 1, $maxMLen is 2, and $maxPLen is 3, the format string becomes:

fString="┃ %'1d x %'2d = %'3d ┃\n"

You’ll notice each of the three format specifications now take the same form — %'nd where n is a number. As we learned in the previous instalment, %d is a digit, %'d is a digit with thousand separators, and %d'4 is a digit with thousand separators with a minimum length of 4 characters, right-aligned.

So, the format string simply breaks down to a table edge, a column with the original number with a thousand separator and a minimum length of it’s actual length, a multiplication symbol, the number the original is being multiplied by with a thousand separator and a minimum length of the longest such number in the table, an equals sign, the product of the multiplication with a thousand separator and a minimum width of the widest product, and the other edge of the table.

Multiple Commands on a Single Line with the Statement Separator (;)

Up to this point in the series I’ve gone out of my way to write my shell scripts in a very explicit style, avoiding many of the kinds of shortcuts I make use of in the real world. One example of this is that I have always split my conditionals and loops over multiple lines. Often, this results in clearer code, but it can result in distractions.

For example, I would previously have written the code that simply appends the needed number of table-top symbols into a string like this:

capMid=''
for c in $(seq 1 $capLen)
do
    capMid+='━'
done

But you’ll notice in the sample code that I collapsed that down into a single line:

capMid=''; for c in $(seq 1 $capLen); do capMid+='━'; done

The ; character tells Bash that although there is more code on this line, treat it as a new command. In effect, pretend there is a new line here.

Why did I choose to do that? Simple, this is such a trivial part of the script that I wanted it to fade into the background and not take up previous screen space, distracting me from the complicated parts of the code that needed my attention.

Conditionally Paging the Output

The final thing I want to draw your attention to before we get back to xargs is the conditional automatic paging of the output. In other words, if the output is longer than one screen, and if the output is being written to a terminal, present the output to the user one page at a time using the less command.

Rather than printing my table directly, I build it up by appending each row to a string named $table, then, at the end of the script, I check to see whether or not the output is a terminal, and then I either write directly to standard out, or, I pipe my table through less:

# print the table
if [[ -t 1 ]]
then
    echo "$table" | less --no-init --quit-if-one-screen
else
    echo "$table"
fi

As we learned in the previous instalment, the -t n operator tests if the stream number n is a terminal or not. Standard out is always numbered 1, hence the if statement above.

In my English description I said I only wanted to route the output through less if it was going to the terminal and longer than one screen, but the code does something subtly different that has the same apparent effect. The code always send the output through less, but it uses two important flags to tell less to adopt our desired behaviour.

Before looking at the flags, let’s remind ourselves what less does by default:

echo -e "hello\nworld" | less

What happens? Less scrolls the screen so our two lines appear at the very bottom, it pauses there and blocks the script until we press q to exit, and when we do exit, it deletes the content it wrote to the screen. We don’t want it to do any of that!

Firstly, we can stop it waiting for the key press when there is less than a full page of content with the --quit-if-one-screen flag:

echo -e "hello\nworld" | less --quit-if-one-screen

That now works perfectly when there is less than a full screen of text. But what if there is more than one screen of text?

seq 1 100 | less --quit-if-one-screen --no-init

Initially this seems to work — we get the numbers from 1 to 100 one screen at a time, but when we get to the end all the numbers vanish! By default, less cleans up the screen when it pages. We can stop it doing that with the --no-init flag:

seq 1 5 | less --quit-if-one-screen --no-init

In case you’re wondering how I knew these obscure flags for less, I didn’t! As soon as I saw the ways the default behaviour was not what I wanted, I read the manual pages with man less 🙂

Note the Need for the -- Option for Negative Numbers

Note: this sub-section was added in August 2024 based on listener feedback, it is not included in the audio recording.

If we want to print the 2 times tables from -10 to -1 we need to pass -10 as the value for the -s option and -1 for the -e option. Because optargs knows -s and -e require values, it will correctly interpret the following argument list:

./pbs151-challengeSolution.sh -s -10 -e -1 2

But what if we want to print the -2 times tables from -10 to -1? Now we need to pass -2 as a regular argument (not an option), so we must explicitly tell optargs that -e -1 is the last option by adding the standard -- option between it and the -2:

./pbs151-challengeSolution.sh -s -10 -e -1 -- -2

Introducing xargs

This section is not covered by the podcast episode currently published, if listening along to the podcast, skip to the section titled Arithmetic Expressions in Bash.

But First …

The xargs command is all about processing arguments, so to help us visualise what it does, we need to be able to see the exact arguments a script receives. To that end, you’ll find a small utility script named pbs152a-argPrinter.sh in the instalment ZIP:

#!/usr/bin/env bash

# convert the arguments to a true array
# note: pre-fixing with empty string to make 1-indexed
argsArray=( '' "$@" )

# loop over the indexes of the arguments
for i in $(seq 1 $#)
do
    echo "\$$i->|"${argsArray[$i]}"|<-\$$i"
done

This script prints all the arguments it receives, wrapping the value with labeled arrows so you can clearly see any spaces around the values.

You can see this in action with the simple command:

./pbs152a-argPrinter.sh 'first arg' ' second arg '

This outputs the following:

$1->|first arg|<-$1
$2->| second arg |<-$2

For the most part this is a very simple script that iterates from 1 to the number of arguments passed ($#), but there is one subtlety. When converting the arguments to an array, an empty string is added to the array before the arguments. This effectively converts the array from being zero-indexed to one-indexed, i.e. the argument $1 will be at ${argsArray[1]} instead of ${argsArray[0]}.

The Problem xargs Solves

The big problem xargs was designed to solve is that there are times when we want the data from the standard input stream (STDIN) to be passed to another command as one or more arguments. The stream redirection operators (or terminal plumbing, i.e. >, | etc.) allow us to redirect content to a command’s standard input stream (STDIN), but not into even one argument, let alone multiple arguments.

What xargs does is read STDIN, process it using the same rules the shell uses when processing command arguments, and then run another command with those arguments. In other words, xargs converts STDIN to a list of arguments, and calls another command with those arguments.

Using xargs

The syntax is simply:

xargs [CMD] [INITIAL_ARGS ...]

Where CMD is the command to run, which defaults to echo, and INITIAL_ARGS are additional arguments for CMD that will come before the args read from STDIN.

This is harder to describe than to show, so let’s make use of our argument printing script to demonstrate what xargs does:

echo "stdin_arg1 'stdin arg2'" | xargs ./pbs152a-argPrinter.sh initial_arg1 'initial arg2'

This outputs:

$1->|initial_arg1|<-$1
$2->|initial arg 2|<-$2
$3->|stdin_arg1|<-$3
$4->|stdin arg2|<-$4

As you can see, the example xargs command is entirely equivalent to:

./pbs152a-argPrinter.sh initial_arg1 'initial arg2' stdin_arg1 'stdin arg2'

Notice that xargs has no problems understanding quoted and unquoted values.

But, when I say that xargs parses STDIN ‘just like it was a list of arguments’, that’s true, but it’s not the whole truth!

Firstly, xargs also splits on newline characters and tabs as well as spaces, so that means it can handle output from commands that list their output in columns, like ls does by default.

On my Mac, ls / gives the following columnar output:

Applications	Users		cores		home		sbin		var
Library		Volumes		dev		opt		tmp
System		bin		etc		private		usr

If I pipe that into xargs and call our arg printer you’ll see that both the tabs and newline characters were treated as argument separators:

ls / | xargs ./pbs152a-argPrinter.sh

Outputs:

$1->|Applications|<-$1
$2->|Library|<-$2
$3->|System|<-$3
$4->|Users|<-$4
$5->|Volumes|<-$5
$6->|bin|<-$6
$7->|cores|<-$7
$8->|dev|<-$8
$9->|etc|<-$9
$10->|home|<-$10
$11->|opt|<-$11
$12->|private|<-$12
$13->|sbin|<-$13
$14->|tmp|<-$14
$15->|usr|<-$15
$16->|var|<-$16

But what happens when you want to send a list of files with spaces in the names to another command through xargs? By default, it breaks 🙁

To illustrate the problem I’ve added a file into the instalment zip with spaces in its name (and a little Easter Egg inside), so if we use ls . to list the current directory (from inside the extracted instalment zip) we get:

name with spaces.txt		pbs151-challengeSolution.sh	pbs152a-argPrinter.sh

If we pipe that to our argument printer through xargs we see the file named name with spaces.txt gets split into 3 separate arguments:

ls . | xargs ./pbs152a-argPrinter.sh

Outputs:

$1->|name|<-$1
$2->|with|<-$2
$3->|spaces.txt|<-$3
$4->|pbs151-challengeSolution.sh|<-$4
$5->|pbs152a-argPrinter.sh|<-$5

How can we fix this?

The obvious answer would seem to be to use a custom separator, maybe the tab and/or newline characters, and that is sometimes possible, but not in a cross-platform way 🙁 Again, the Mac is the cause of the incompatibility, with a less capable version of xargs than you generally find on Linux.

As you might guess, there are solutions though! Notice the plural – there’s at least two ways to solve this exact problem, and probably many more.

Firstly, all versions of xargs in use today do support one alternative separator: the null character or \0. To tell xargs to split the arguments on the null character, simply add the -0 (zero) flag.

The next question is, how do you get ls to split its output with nulls instead of tabs and line breaks? You don’t, you use an alternative command with a special flag explicitly designed to play nice with xargs, find.

You can learn a lot about the find command in Taming the Terminal instalment 20, but what we need to know here is that when we use find with the -print0 flag, it will use null characters to separate its results, so you can pipe it straight to xargs:

find . -print0 | xargs -0 ./pbs152a-argPrinter.sh

Outputs:

$1->|.|<-$1
$2->|./pbs151-challengeSolution.sh|<-$2
$3->|./name with spaces.txt|<-$3
$4->|./pbs152a-argPrinter.sh|<-$4

OK, so the command for finding files has a mode specifically designed to play nice with xargs, but what if you really want to use ls, or, if you really need to split on another character?

The answer is use the tr (transliterate) command to replace all instances of your separator with the null character. You need to insert tr in your pipe line just before xargs, and you need to remember to use the -0 flag with xargs.

You may not know this, but the ls command can output its results one file per line without additional details using the -1 flag. So, if we use ls -1 . to list the instalment zip we get:

name with spaces.txt
pbs151-challengeSolution.sh
pbs152a-argPrinter.sh

We can now use tr to convert those newlines to null characters and then pipe the output to xargs for sending to our argument printer with:

ls -1 . | tr "\n" "\0" | xargs -0 ./pbs152a-argPrinter.sh

This outputs the correct values:

$1->|name with spaces.txt|<-$1
$2->|pbs151-challengeSolution.sh|<-$2
$3->|pbs152a-argPrinter.sh|<-$3

So, if you want to safely send file listings to another command through xargs, either use find -print0 or ls -1 | tr "\n" "\0" before piping to xargs -0.

Worked Examples (from the Sample Solution)

This is obviously a totally contrived example to illustrate how the command works, not why it’s useful to know, so let’s pivot back to that line we put a pin in earlier:

pLen=$(echo "$n*$m" | bc | xargs printf "%'d" | wc -c | xargs)

What this code does is calculate the character length of the product of the number in $n with the number in $m and save that number in $pLen.

Note: wc -c doesn’t actually count the character length – it counts the number of bytes. It returns the correct answer in this example, but this will break if you use emoji or if you use accented characters.

`echo '🤣' | wc -c`    # returns 5
`echo '🤣' | wc -m`    # returns 2 because the emoji is a combination ofs 2 characters
`echo 'voilà' | wc -c` # returns 7
`echo 'voilà' | wc -m` # returns 6 because the accent counts as a separate character

So that we can see this line in action isolation, let’s set values for $n and $m:

n=3; m=14

Let’s break the command down piece-by-piece. We start by echoing a simple arithmetic expression to the bc command to calculate the product of the two numbers:

echo "$n*$m" | bc
# outputs '42' to STDOUT

We now need to send this number to printf as the second argument, after the format string. Without xargs we would need to break this command apart and save the product to a variable, then call printf with that variable as the second arg:

prod=$(echo "$n*$m" | bc)
printf "%'d" $prod
# outputs '42' to STDOUT

But with xargs we can convert STDIN to an argument list, so we can get what we need in a single pipeline:

echo "$n*$m" | bc | xargs printf "%'d"
# outputs '42' to STDOUT

Next, we calculate the character length with wc -c:

echo "$n*$m" | bc | xargs printf "%'d" | wc -c
# outputs '       2' to STDOUT

Note that the wc command writes its answer with a bunch of leading space. This really messes things up when you try use the value to do math, so we need to trim it down to just the real characters.

This is when the second most common use of xargs comes in. You may or may not know that in shell scripting the separator between arguments is not ‘a space’. It is actually one or more spaces, so if you put extra spaces around your arguments, they get ignored, and the argument values received by the command/script contain no leading or trailing spaces. We can prove this with our argument printing script:

./pbs152a-argPrinter.sh 1 2  3   4    5

Note that there is one space between 1 and 2, two spaces between 2 and 3, three spaces between 3 and 4, and four spaces between 4 and 5. However, all those spaces get ignored, and the output is simply:

$1->|1|<-$1
$2->|2|<-$2
$3->|3|<-$3
$4->|4|<-$4
$5->|5|<-$5

Remember that xargs parses its STDIN (i.e. the STDOUT of what ever was on the other side of the |) as if it were an argument list, and, that xargs defaults to the echo command. That means it collapses spaces in its inputs:

echo '1 2  3   4    5' | xargs
# outputs '1 2 3 4 5'

echo '    4    ' | xargs
# outputs '4'

So, by adding a final pipe to xargs we can trim our character count:

echo "$n*$m" | bc | xargs printf "%'d" | wc -c | xargs
# outputs '2' to STDOUT

Finally, we can use the $() operator to save the output of our pipeline to the variable $pLen, giving us our original command:

pLen=$(echo "$n*$m" | bc | xargs printf "%'d" | wc -c | xargs)
echo $pLen
# outputs '2'

So, in this one pipeline we use xargs to pass the contents of STDOUT from one command as an argument to another, and to trim the spaces around our final output.

Arithmetic Expressions in Bash

Up to this point in the series we haven’t done much arithmetic in our code, and for the little we have done we’ve used the bc (basic calculator) terminal command. This works, but it’s a little involved, requiring the use of the echo command, a pipe, and $() subshell to capture the output.

Is this really the most elegant and clear code?

p=$(echo "$n*$m" | bc)

Bash provides a better solution, but I’ve held off on introducing it because we’ve already seen so many different bracketed expressions that I didn’t want to confuse things early on. Now seems an appropriate time to introduce what will be our penultimate type of bracket (we’ll meet one more type in a future instalment).

Just as a quick reminder, so far we’ve met:

Now, let’s meet two more:

  1. (( … )) for executing arithmetic expressions and capturing the resulting exit code
  2. $(( … )) for executing arithmetic expressions and capturing the result of the expression itself

Within arithmetic expressions we can:

  1. Use variables without needing to pre-fix them with the $ operator (officially documented as ‘automatic variable expansion’)
  2. Use the assignment operator (=) without needing to cuddle it
  3. Use common arithmetic operators like those listed below, again, without needing to cuddle them:
    1. Basic math: +, -, /, *
    2. More advanced math: ++ & -- for increment and decrement, % for modulus, and ** for raising to a power
    3. Comparisons: ==, !=, <, >, <= & >=
    4. Assignments: =, +=, -=, *=, /=, and even %=
    5. The ternary operator, but only for arithmetic, e.g. $(( p == 42 ? 1 : 0 ))

For more details you can read the Bash documentation on arithmetic expressions.

This all means we can re-write a statement like:

p=$(echo "$n*$m" | bc)

As simply:

(( p = n * m ))

We can also use arithmetic expressions in conditional statements, e.g.:

if (( p == 42 ))
then
	echo 'Life, the universe, and everything'
fi

We can even use it with the boolean operators to create very short conditionals:

(( p == 42 )) && echo 'Life, the Universe & Everything'

By adding the $ we can use it to calculate arguments on the fly:

seq 1 $(( n * m ))

Final Thoughts

While many people find xargs confusing, it’s well worth spending the time it takes to get comfortable with this command. Having it in your scripting toolkit will often allow you to write much simpler and cleaner code than you could without it.

Similarly, arithmetic expressions can be a little daunting at first, adding yet another type of bracket, and magically using variables without the $, but they make all code that relies on arithmetic dramatically easier to write, read, and hence maintain. Again, it’s well worth taking the time to get comfortable using them.

With these two new skills we’ve addressed two of the three pain points I encountered when coding my sample solution, but one remains — I found myself duplicating code, which is always a bad software engineering smell! Because the one required argument and two of the optional arguments need to be whole numbers, I had to duplicate my test for valid numbers three times. I made that a little more efficient by saving the regular expression into a variable, but in any other language, I’d have used a function. It’s about time we learned how to achieve similar functionality when shell scripting! That will be our starting point for the next instalment.

An Optional Challenge

Update your solution to the PBS 151 challenge to make use of arithmetic expressions for all your calculations. Also re-evaluate your code to see if there are any places where using xargs could simplify your code.

Join the Community

Find us in the PBS channel on the Podfeet Slack.

Podfeet Slack