PBS 133 of X: Firming up our Foundations (1 of 2)
We’re getting closer to having all our proverbial ducks in a row for starting work on the rewrite of the Crypt::HSXKPasswd
Perl module in JavaScript. Before we could start we needed to introduce some new technologies and make some technology decisions. Here’s a quick status update:
- Introduce ES6 modules — done ✅
- Introduce Node & NPM — done ✅
- Choose a Linter and learn how to use it — ESLint (with thanks to guest teacher Helma van der Linden) ✅
- Choose a documentation generator and learn how to use it — JSDoc with the DocDash theme ✅
- Choose a Test Driven Development (TDD) platform and learn how to use it — work in progress (see below)
- Choose a bundler and learn how to use it — to do
I spent my Christmas break working on the fifth point — figuring out the most appropriate TDD platform for the project. The contenders were QUnit, Mocha, Jasmine, and Jest.
I’ve picked Jest, but before we can learn to use it, there are some fundamental JavaScript concepts we need to refresh in our minds, and explore a little more deeply. Simultaneously, the community over on Slack have highlighted a few additional concepts that some are finding a little difficult to digest. Given those two facts, it seems sensible to pause briefly to refresh our understandings.
This instalment and the next will focus on the following:
- Clearing up some confusion around the difference between npm install and npm ci.
- Some guidance on which JSDoc tags to use when, especially when documenting plain objects.
- A refresher on the different ways of defining functions, specifically function statements, function expressions, and arrow functions.
- A reminder on how function chaining works (heavily used by Jest)
- An introduction to the concept of functions that return functions (used by Jest)
- An explanation of how getters can be used to construct short but powerful syntaxes that seem quite counterintuitive at first glance (heavily used by Jest)
Matching Podcast Episode
Listen along to this instalment on episode 711 of the Chit Chat Across the Pond Podcast.
A Refresher on Installing Modules with NPM
I’ve noticed some confusion as to what the npm install
and npm ci
commands actually do, and when you should use each, so let’s try to clarify that a little.
Firstly, let’s set some context here — NPM does many things, and package.json
can store many different types of information, but for this discussion we’re only interested in installing modules into a Node project folder.
Let’s start with the basics. A Node project folder is simply a folder that contains a valid package.json
file. When you use either the node
or npm
commands within a Node project folder they will interact with three key files/folders:
- package.json specifies the 3rd-party modules the project depends on via the dependencies and devDependencies keys, and specifies acceptable version ranges for each.
- node_modules is where 3rd-party modules get installed.
- package-lock.json specifies the exact versions of each 3rd-party module that have been installed.
There are two distinct scenarios in which you can use a Node project folder, and it’s the scenario that determines whether you need npm ci
or npm install
.
Scenario 1 — Initialising a Previously Created Project for Use
In this scenario you are given a Node project folder that already contains both a package.json
and a package-lock.json
file, but no node_modules
folder. Since 3rd-party modules are intended to be downloaded and installed by NPM, the node_modules
folder should not be committed to source control, nor included in ZIP files for distribution.
Any Node project folders included in the Instalment ZIPs in this series fall into this category, as do other people’s Git repositories you clone.
The problem to be solved here is that you want to initialise the node_modules
folder exactly as it was on the author’s machine. Given the fact that package.json
defines version ranges for the dependencies, the same package.json
file can produce many possible mixes of module versions, potentially leading to subtly different behaviour as different versions will have different bug fixes applied. We don’t want ‘a set of modules that meets the spec’, we want the exact set the author had, i.e. we want to replicate the packages recorded in package-lock.json
.
The npm ci
command deletes anything that exists within the node_modules
folder, then reads package-lock.json
and installs the exact versions listed into node_modules
. I.e. it does a clean install, hence the command name ci
.
In other words, when initialising a copy of someone else’s Node project, always use npm ci
.
Scenario 2 — Managing Dependencies in a Node Project You Contribute To
There are four sub-scenarios here:
- You’ve checked your own code out of source control on a new machine and want to pick up exactly where you left off
- You’re starting a new project
- You want to add a new dependency to your project
- You want to update your existing dependencies to the latest versions that match the contents of
package.json
The first sub-scenario is just scenario 1 really — initialising a copy of your own code to exactly where you left off is no different to initialising a copy of someone else’s code exactly where they left off! So, to pick up where you left off on your own code, also use npm ci
.
When you start a new project there are no dependencies, so really, the second sub-scenario is the same as the third sub-scenario where you’re adding a new dependency to an existing project. To add a new dependency use npm install --save
(or npm install --save-dev
for a dev dependency). This will add a new entry to package.json
and specify the acceptable version range as anything newer than the current version with the same major version. (Remember, all NPM modules use Semantic Versioning). The exact version installed will also be written to package-lock.json
, as well as the exact version of every dependency that was installed along with the module you expressly installed.
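To make the idea of ‘anything newer with the same major version’ a little more concrete, here is a minimal sketch of that rule. By default NPM records this kind of range with a leading caret, e.g. ^1.2.3. The function names parseVersion and satisfiesCaret are hypothetical, and this is a hand-rolled simplification rather than NPM’s real range matching. It assumes plain major.minor.patch versions with a major version of 1 or higher, and ignores pre-release tags.

```javascript
// Split a plain 'major.minor.patch' string into numbers.
// (Hypothetical helper for illustration only.)
function parseVersion(v){
  const [major, minor, patch] = v.split('.').map(Number);
  return { major, minor, patch };
}

// Simplified caret-style check: same major version,
// and not older than the base version.
function satisfiesCaret(candidate, base){
  const c = parseVersion(candidate);
  const b = parseVersion(base);
  if(c.major !== b.major) return false;              // a new major version is never acceptable
  if(c.minor !== b.minor) return c.minor > b.minor;  // a newer minor version is fine
  return c.patch >= b.patch;                         // same minor, so compare patch versions
}

console.log(satisfiesCaret('1.4.2', '1.2.3')); // true
console.log(satisfiesCaret('2.0.0', '1.2.3')); // false
console.log(satisfiesCaret('1.2.2', '1.2.3')); // false
```

This is why the same package.json can legitimately resolve to different installed versions over time, and why package-lock.json is needed to pin the exact ones.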
Finally, to roll all your dependencies forward to the latest versions allowed by the current package.json
use npm update
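. This will update package-lock.json, and in NPM 6 and older it will also update package.json to set new minimum versions. From NPM 7 onwards the version ranges in package.json are only rewritten when you add the --save flag, i.e. npm update --save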
. If you want to upgrade a dependency to a new major version you’ll need to manually edit package.json
and then run npm update
.
A Quick Summary of the Most Important JSDoc Tags
When you look at the list of tags JSDoc supports, it’s easy to get overwhelmed. The thing is, all tags are not equally important. The clichéd 80/20 rule springs to mind — with many technologies 80% of the time you only need 20% of the features, but actually, for JSDoc I think it’s more like 95% of the time you need just 5% of the tags!
Add to that the cliché that any documentation is better than none, and you can understand why my advice is to build up the complexity of your JSDoc comments incrementally. Beyond a few basic tags, I would suggest ignoring the others until you have a specific problem to solve, and then learning about the relevant tags at that point.
Firstly, complex doc comments are not necessarily better than simple ones — like so much in life, your doc comments should be as complex as they need to be, but no more. If you’re documenting a simple variable or function, it should have a short, simple doc comment!
Secondly, remember why you’re documenting your code — what is it future you will need to know?
For the big-picture items like modules and classes, future you will want to know what concepts the modules and classes represent, and what functionality they provide. What abstractions is the code built around? What assumptions are built into the design? What kinds of problems does this code solve? When might I want to use it? And just as importantly, what are its shortcomings? When would I not want to use it? What problems does it not solve?
You often don’t need any tags at all for these big-picture items — it’s all about the descriptions, with perhaps some links to other items or outside resources, i.e. the {@link}
and @see
tags might be useful.
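As a rough illustration of a description-led doc comment for a big-picture item, consider the sketch below. The LocationBook class and its methods are hypothetical, invented purely for this example. The point is that the class comment is almost all prose, answering the ‘what is it for?’ and ‘when would I not use it?’ questions, with a single inline link and one @see tag.

```javascript
/**
 * A simple in-memory address book of named map locations.
 *
 * This class assumes all coordinates are in decimal degrees. Nothing is
 * ever saved to disk, so it is not suitable for data that must survive a
 * restart. Locations are added with {@link LocationBook#add}.
 *
 * @see LocationBook#count
 */
class LocationBook{
  constructor(){
    this._locations = {};
  }

  /**
   * Add a location to the book.
   *
   * @param {string} name
   * @param {number} latitude
   * @param {number} longitude
   */
  add(name, latitude, longitude){
    this._locations[name] = { latitude, longitude };
  }

  /**
   * Count the locations in the book.
   *
   * @returns {number}
   */
  count(){
    return Object.keys(this._locations).length;
  }
}
```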
For smaller items (like variables, functions, and arguments), the details are much more important, so you’ll be relying less on the description and more on specific tags. The better you name your variables, functions, and arguments, the less need you’ll have for descriptions!
When documenting a variable or a class property, the two most important things to capture are what it’s for and exactly what types of values will be stored.
A good variable name may be enough to make it clear what a variable is for, but a short description often helps. The @type
tag is vital for capturing what the variable will store.
/**
* The average duration of a gamma ray burst in seconds.
*
* @type {number}
*/
let avgGRBDuration = 42;
When documenting functions, the three most important things to capture are what the function does, what inputs it expects, and what the outputs will be.
Again, good names may be enough to capture what a function does, but @param
tags are vital for capturing the inputs. Again, within each @param
tag a good name may remove the need for an English description, but you should always start the tag’s contents with a type expression (in curly braces).
When it comes to capturing the outputs the most important tag is @returns
. Again, if the function is well-named there may be no need for an English description within the tag, but you should always start the tag with a type expression (in curly braces).
What often gets overlooked is the fact that functions can output more than just return values, they can also throw errors! If your function can throw an error, be sure to add @throws
tags for each type of error. You should specify both the type of the error that could be thrown, and describe the circumstances under which it would happen.
/**
* Cube a given number.
*
* @param {number} n - the number to cube
* @returns {number}
* @throws {TypeError} A type error is thrown when invalid arguments are passed.
*/
function cube(n){
if(isNaN(n)){
throw new TypeError('must pass a number');
}
return n * n * n;
}
Describing Object Key Expectations
In JavaScript, objects are extremely generic, so they can be used to solve many problems. When it comes to documenting a variable, function argument, or return value that’s an object, simply saying ‘this is an object’ is not actually very helpful at all!
You need to dig a little deeper. In general, objects wear one of three possible hats:
- They are instances of some class.
- They are plain objects where specific keys are required, so-called records.
- They are plain objects where both the keys and values are variable, but each have a meaning that you need to describe. This concept is often referred to as a map.
To document an instance simply use the name of its class as the type, e.g.:
/**
 * Calculate the start of a given day.
 *
 * @param {Date} d - A time during the day whose start is to be calculated.
 * @returns {Date} The given date & time with the hours, minutes, seconds, and milliseconds set to zero.
 * @throws {TypeError} A Type Error is thrown if invalid args are passed.
 */
function startOfDay(d){
  if(!(d instanceof Date)){
    throw new TypeError('must pass a JavaScript Date object');
  }
  const ans = new Date(d); // clone the date
  ans.setHours(0, 0, 0, 0); // un-intuitively, this sets hours, minutes, seconds & ms to zero!
  return ans;
}
For plain objects you have to capture the expectations for both the keys and the values.
When you think about it, the keys in plain objects fall into two philosophical groups — those where you know the exact keys, or those where both the keys and the values are variable.
Let’s start with the more generic of the two — you don’t know exactly what the keys will be, but you know that they’ll represent something. What you have here is a meaningful mapping between one set of values, the keys, and another set of values, the values.
What you need to capture in your documentation is that mapping from keys to values.
The basic syntax for describing plain objects that map variable keys to values is object.<keyType, valueType>
.
Let’s start with a simple example:
/**
* A mapping of number names to their values.
*
* @type {object.<string, number>}
*/
const numsByName = {
zero: 0,
one: 1,
two: 2,
three: 3,
four: 4,
five: 5,
six: 6,
seven: 7,
eight: 8,
nine: 9
};
In JavaScript the keys in plain objects are always strings, so you have to either describe the keys’ meaning in the doc comment’s paragraph like we did above, or use a custom type definition to add more meaning, e.g.:
/**
 * Abbreviated week days in all lower case.
 *
 * @typedef {string} weekDayAbbrev
 */
/**
 * Average sales per day of the week.
 *
 * @type {object.<weekDayAbbrev, number>}
 */
const avgSalesByDay = {
  mon: 42,
  tue: 34.7,
  wed: 55.2,
  thur: 33,
  fri: 42.42,
  sat: 47,
  sun: 0
};
Finally, we have plain objects where we know exactly what the keys will/should be. These are referred to as records. For example, you might represent locations on a map as a collection of plain objects which all specify the keys latitude
and longitude
. Or you might use plain objects to represent people’s full names where you support the keys firstName
, lastName
, middleInitial
, and nickName
.
For simple records, JSDoc provides the {key1: type1, key2: type2}
syntax:
/**
* The location of the best coffee in Maynooth.
*
* @type {{latitude: number, longitude: number}}
*/
const bestCoffee = {
latitude: 53.380999,
longitude: -6.593129
};
This works fine for plain objects with just a few keys, but simply does not work when you need to describe many keys. When your records will have many keys you need to use separate JSDoc tags to describe each key. Rather annoyingly, the actual tags to use are different for variables and function arguments.
To document record-type plain object variables, use the @type
and @property
tags, e.g.:
/**
* The location of the best coffee in Maynooth.
*
* @type {object}
* @property {number} latitude - The latitude in degrees.
* @property {number} longitude - The longitude in degrees.
* @property {string} name
*/
const bestCoffee = {
latitude: 53.380999,
longitude: -6.593129,
name: 'Puppa Coffee'
};
Notice that the @type
tag specifies that the variable is an object, and then a separate @property
tag is used to give the type, name, and an optional description for each key-value pair.
To document record-type plain object arguments, use the @param
tag multiple times, e.g.:
/**
* Print a location.
*
* @param {object} loc - A record representing the location.
* @param {number} loc.latitude - The latitude in degrees.
* @param {number} loc.longitude - The longitude in degrees.
* @param {string} loc.name
*/
function printLoc(loc){
console.log(`${loc.name}: ${loc.latitude}, ${loc.longitude}`);
}
Notice that the first @param
tag says that the argument named loc
is an object, then we add additional @param
tags for each key, using JavaScript dot notation to specify the names of the keys within loc
.
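The examples so far cover variables and arguments, but return values can be records too. As a small sketch, the inline record syntax described earlier can also be used inside a @returns tag. The midPoint function below is hypothetical, and it simply averages the two sets of coordinates, which is only a reasonable approximation for locations that are close together.

```javascript
/**
 * Estimate the point half way between two locations by averaging
 * their coordinates (a naive approximation, not a true geodesic mid-point).
 *
 * @param {{latitude: number, longitude: number}} a - The first location.
 * @param {{latitude: number, longitude: number}} b - The second location.
 * @returns {{latitude: number, longitude: number}} A new record, the inputs are not altered.
 */
function midPoint(a, b){
  return {
    latitude: (a.latitude + b.latitude) / 2,
    longitude: (a.longitude + b.longitude) / 2
  };
}

console.log(midPoint(
  { latitude: 0, longitude: 0 },
  { latitude: 10, longitude: 20 }
)); // { latitude: 5, longitude: 10 }
```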
A Refresher on Functions
As we know, functions are pieces of code that can be executed. They take zero or more inputs, can produce zero or one outputs, and can throw errors. They have all sorts of uses — if we name them, they allow us to reuse code, and we can pass them around as data so they can be executed by some other code at some other time in response to some event.
Functions that are intended to be reused need names, but functions that get passed around as data to be executed later are often one-offs, so they often don’t. We refer to functions without names as being anonymous.
Like just about everything in JavaScript, functions are objects, so they can be stored in variables, passed as arguments, or returned from functions just like any other object.
Functions define their own scope, and they provide two special variables: this, which is a reference to an appropriate object the function somehow belongs to (it varies by context), and arguments, an object representing the arguments passed to the function.
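As a quick reminder of those two special variables in action, here is a minimal sketch. The names countArgs and coffeeShop are made up for this illustration.

```javascript
// `arguments` holds whatever was passed, even though no parameters are declared
function countArgs(){
  return arguments.length;
}
console.log(countArgs('a', 'b', 'c')); // 3

// when a function is called as a method, `this` is the object it was called on
const coffeeShop = {
  name: 'Puppa Coffee',
  describe: function(){
    return `Welcome to ${this.name}`;
  }
};
console.log(coffeeShop.describe()); // Welcome to Puppa Coffee
```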
In JavaScript, we can create traditional functions in a few different ways:
- function statements
- method definitions within class definitions
- traditional function expressions
We’ve seen all these uses many times before, but here are some examples to refresh your memory (you can copy-and-paste these into the JavaScript console in your browser, or into the NodeJS console which you get by running node
with no arguments):
// a function statement
function cube(n){
return n * n * n;
}
console.log(cube(1));
// method definitions in a class
class MyMath{
// a static method
static cube(n){
return n * n * n;
}
// an instance method
cube(n){
return n * n * n;
}
}
console.log(MyMath.cube(2));
const mm = new MyMath();
console.log(mm.cube(3));
// a function expression saved to a variable
const anonyCube = function(n){ return n * n * n };
console.log(anonyCube(4));
// an anonymous function expression passed to a timeout
setTimeout(function(){ console.log(cube(5)); }, 2000);
Arrow Functions
ES6 introduced a whole new kind of function named arrow functions. (Sometimes referred to as fat arrow functions, but that’s not really appropriate in 2022, so we’ll be sticking with arrow functions).
You should think of arrow functions as being function lite, they have an extremely compressed syntax (more on this in a moment), and they only implement a sub-set of the features provided by traditional functions:
Feature | Traditional Functions | Arrow Functions |
---|---|---|
own scope? | yes | yes |
own this? | yes | no |
own arguments? | yes | no |
supports yield? | yes | no |
Not only is the arrow function syntax terse, lots of even terser shortcuts are supported!
You’ll almost always use arrow functions as arguments in function calls, so let’s create a function that takes a function as its first argument, and an array of arguments as its second argument so we can use it in our arrow function examples:
/**
* A function to execute a given function with given arguments and
* output the returned value to the console.
*
* @param {function} fn - The function to execute.
* @param {Array.<*>} [args] - The args to pass to the function as an array.
*/
function runFn(fn, args=[]){
console.log(fn(...args));
}
Again, to play along with the examples below, copy-and-paste the function above into the JavaScript console in your browser, or a NodeJS console.
OK, let’s start with the complete arrow function syntax:
// an arrow function that raises the first
// argument to the power of the second
runFn(
(n, x) => { // the arrow function expression (1st arg to runFn)
return Math.pow(n, x);
},
[2, 3] // the args to call the arrow function with (2nd arg to runFn)
);
In the above example we’re calling our runFn function with two arguments, an arrow function as the first argument, and the array [2, 3]
as the second. So, the arrow function is just:
(n, x) => { // the arrow function expression (1st arg to runFn)
return Math.pow(n, x);
}
The complete arrow function syntax starts with an argument list in parentheses, then the symbols =>
to form the so-called arrow, then a code block containing the function’s body. So argument list, then arrow, then function body.
Note that arrow functions don’t need to take arguments, the parameter list can be empty:
// an arrow function that prints some tasty food
runFn(
() => { return '🥞 + 🍁'; } // the arrow function
);
If the function body does nothing but return the value of a single expression, the curly braces and the return keyword can be omitted, leaving just the expression. The following is functionally identical to the example above:
runFn( () => '🥞 + 🍁' );
That’s pretty concise, but wait, there’s more!
If an arrow function has exactly one argument, the parentheses around the argument list are optional too:
// an arrow function that takes 1 arg and cubes it
runFn(
n => n * n * n, // the arrow function
[6] // the args to call the arrow function with
);
In the above example, the following is the entire arrow function:
n => n * n * n
It really is the function’s logic distilled to its essence — take n
and multiply it by itself three times!
The ‘long’ arrow function syntax would make it:
(n) => {
return n * n * n;
}
And to do it as a traditional function expression it would be:
function(n){
return n * n * n;
}
In programming jargon -> is often referred to as an arrow, especially in C, and => as a fat arrow. So, technically speaking fat arrow is a more accurate description for the JavaScript arrow function syntax, but it’s just not appropriate in 2022.
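The syntax is only half the story. The feature table earlier says arrow functions have no this or arguments of their own, and the sketch below tries to show what that means in practice. The Multiplier class and the outer function are hypothetical names used only for this illustration.

```javascript
class Multiplier{
  constructor(factor){
    this.factor = factor;
  }

  // the arrow function has no `this` of its own, so it sees
  // the `this` of the method it was created in
  multiplyAll(nums){
    return nums.map(n => n * this.factor);
  }

  // the traditional function gets its own `this`, which is undefined
  // here because class bodies always run in strict mode
  brokenMultiplyAll(nums){
    return nums.map(function(n){ return n * this.factor; });
  }
}

const triple = new Multiplier(3);
console.log(triple.multiplyAll([1, 2, 3])); // [ 3, 6, 9 ]
try{
  triple.brokenMultiplyAll([1, 2, 3]);
}catch(err){
  console.log(err instanceof TypeError); // true
}

// similarly, an arrow function sees the `arguments` of the enclosing function
function outer(){
  const inner = () => arguments.length;
  return inner('ignored', 'also ignored');
}
console.log(outer('just one')); // 1
```

This is why arrow functions work so well as short callbacks inside methods, there is no need to juggle this.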
Function chaining
Another concept that Jest uses heavily is function chaining. This is when a function returns an object and you immediately call an instance function on that object, optionally repeating the process for many more calls. The key with function chaining is to start on the left and keep track of the return types at each step.
Take the following rather contrived example:
(new Date()).toString().toLowerCase().replace(/[ ]/g, '-')
We start on the left with new Date()
which returns a JavaScript Date
object representing the current time. To keep the order of evaluation unambiguous we wrap the call in parentheses, and then we chain another call to it, .toString()
. This second call executes the Date
class’s .toString()
instance function on the Date
object returned by the Date
constructor. As its name suggests, it returns a string, so we now have a string. The next call in the chain is .toLowerCase()
, this will call the .toLowerCase()
instance function from the String
class on the string which returns a new string in all lower case. At this point in the chain we still have a string. The final call in the chain is to .replace()
which will be the .replace()
instance function from the String
class which will return a new string with all spaces replaced with dashes.
This is what I see in my Node console:
> (new Date()).toString().toLowerCase().replace(/[ ]/g, '-')
'mon-jan-10-2022-15:16:10-gmt+0000-(greenwich-mean-time)'
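If keeping track of the return types in your head feels awkward, it can help to unroll the chain into one step per line. The sketch below does just that, using a fixed date rather than the current time so the result is repeatable. The variable names are arbitrary.

```javascript
// a fixed date so the example is repeatable (month 0 is January)
const start = new Date(2022, 0, 10, 15, 16, 10);

// the chain, one link at a time
const asString = start.toString();               // Date   -> string
const lowerCased = asString.toLowerCase();       // string -> string
const dashed = lowerCased.replace(/[ ]/g, '-');  // string -> string

// the same thing written as a single chain
const chained = start.toString().toLowerCase().replace(/[ ]/g, '-');

console.log(dashed === chained); // true
```

Each link in the chain is simply called on whatever the previous link returned, so the unrolled and chained versions are equivalent.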
Final Thoughts
Hopefully the important but perhaps subtle differences between the different npm
commands now make more sense to you, and hopefully you’ll find it a little easier to know where to begin in terms of JSDoc tags.
In terms of preparing for Jest, we’ve covered two of the four preparatory steps needed — we’ve refreshed our knowledge of the different syntaxes for creating functions, and we’ve reminded ourselves of how to read chained function calls. That leaves two newish concepts for next time. We won’t be learning any new fundamental principles, but we will be exploring new ways of using the building blocks we’ve already met.
We know that functions are objects in JavaScript, so they can be passed around like any other piece of data — we can store them in variables, we can add them to arrays, we can add them as values in plain objects, we can pass them as arguments to functions, and so on. We know that functions can return any value, and that functions can be used as values, so in theory we’ve known that you can have functions that return functions, but we’ve never done anything like that. It’s quite brain-bending, but also very powerful, and Jest uses the concept in its APIs, so we need to understand the concept before we start trying to understand Jest.
Similarly, we know we can use getters to add functions that don’t look like functions to our classes, but doing so can open up some unusual API syntaxes. Jest makes very heavy use of this design philosophy, so I think it’s important we at least understand that it’s not Jest-magic, but regular ES6 JavaScript, even if we don’t start using the same design logic in our own code (although I have been doing so for a few years already).
We’ll dedicate the next instalment to those two extreme uses for functions.