Logo
Logo

Programming by Stealth

A blog and podcast series by Bart Busschots & Allison Sheridan.

PBS 120 of X: Ignoring Files (Git)

10 Jul 2021

Before we finish up with Git there’s one important feature we accidentally skipped over — telling Git what to ignore!

At first glance this might seem like an odd thing to want to do, but actually, it proves to be very important in all sorts of ways. Some OSes like to litter directories with special files that contain different information on different computers, many build tools generate vast amounts of temporary files, many editors create temp files, and many IDEs create config files that you may not want to sync between computers or team members.

Matching Podcast Episode

Listen along to this instalment on episode 692 of the Chit Chat Across the Pond Podcast.

You can also Download the MP3

Instalment Resources

Playing Along

If you’d like to play along with the examples, you’ll need to download this instalment’s ZIP file and unzip it. Open a terminal and change to the folder into which you extracted the ZIP . You’ll find two bash scripts named pbs120-init.sh and pbs120-genIgnoreFiles.sh along with various other files we’ll be using in this instalment.

What does it Mean for Git to Ignore a File?

In your working tree, Git assigns every file one of three statuses:

  1. tracked — the file has been staged or committed (i.e. git add was used on it at some time in the past).
  2. untracked — the file is not being tracked, and, Git has not been instructed to ignore it.
  3. ignored — the file matches one of the active ignore patterns, and is not tracked.

Note that if a file is both tracked and marked to be ignored, being tracked wins — in other words, Git ignores instructions to ignore tracked files!

What this means that ignoring files from the start is easy, but ignoring previously tracked files later is fiddly at best. In my opinion ignoring tracked files is one of Git’s most annoying weaknesses — it can be done, but there is no single simple command to do it for you 🙁

For this reason, it’s considered best practice to make a pro-active decision to track or ignore each new file as it gets added to your repo.

How does Git Ignore Files?

Git uses so-called ignore patterns to decide whether or not each un-tracked file should be ignored. These patterns are very similar to wild-card file paths in DOS or a Linux/Unix shell, but Git does have its own specific syntax.

Just like Git settings, Git ignore patterns can be defined at three levels — OS-wide (system), for the current user (global), and for the current repository (local).

Local ignore patterns are defined in a file named .gitignore at the repository’s root, global ignore patterns are defined in a file specified with the core.excludesfile global setting, and OS-wide ignore patterns in a file specified with the core.excludesfile system setting. In this instalment we’re going to keep things nice and simple and only look at local and global ignore patterns.

Git Ignore Patterns

Regardless of the level at which you’re defining ignore patterns, the syntax is the same. The patterns will be added to a file, one pattern per line, and blank lines and lines starting with a # symbol will be ignored.

Git ignore patterns don’t treat file paths as just strings — Git understands the difference between file names and a file paths, and treats them very differently. If you forget this subtle but important detail you’ll be in for no end of confusion and frustration!

Git will ignore each pattern as a filename, an entire folder, or a path. If a pattern is a filename then it applies in all folders and sub-folders, if it’s a folder it applies to the entire contents of the folder, and if it’s a path it only applies if the pattern matches the full path. How does Git decide whether a pattern is a file, a folder, or a path? Simple — if a pattern doesn’t contain a path separator it’s a file, if it ends with a trailing path separator it’s a folder, otherwise it’s a path.

Regardless of your OS, Git ignore patterns use / as the path separator.

Ignoring File Names

Remember, if an ignore pattern does not contain a path separator, Git will treat it as a filename and match it at every level of your repo.

So, to ignore all files named .DS_Store, regardless of which folder they’re in, you’d use the pattern .DS_Store!

Specific file names have their uses, but usually, you want to specify some kind of pattern. The two most common patterns Git ignore definitions support are * for zero or more of anything except a path separator, and ? for exactly one of anything except a path separator. The most common thing to want to do is to ignore all files with a given extension, for example, to ignore all .tmp files you would use the ignore pattern *.tmp.

Ignoring Folders

The rules for ignoring the entire contents of a folder are basically the same as those for ignoring a file, you just have to end the pattern with a trailing /. So, to ignore all files in any folder named logs anywhere in the repo simply use the pattern logs/. To ignore any folder ending in log anywhere in the repo use *log/.

Ignoring File Paths

Remember, if an ignore pattern contains a path separator, Git will match the pattern against the full file path, not just the file name.

Git treats file paths quite similar to file names, but there is one important subtlety to note — since * does not match path separators, its range is confined within a single item within a file path, be that a folder name or a file name. For example*/*.txt will match someFolder/someFile.txt, but it will not match someFolder/someSubFolder/someFile.txt! For this reason, Git has one more wild-card for use within file paths — **, which means zero or more folders. For example, someFolder/**/*.txt will match someFolder/someFile.txt and someFolder/someSubFolder/someFile.txt.

Making Exceptions

Git makes it possible to write broader rules than you might expect by providing a mechanism for carving out exceptions. One small potential gotcha is that the exception has to come after the broader rule it overrides in the file.

Exceptions work the same as regular ignore patterns except you prefix them with an exclamation point (!).

As an example, the following pair of patterns blocks all log files except build.log in the repository’s root:

*.log
!/build.log

Worked Example 1 — Ignoring All .DS_Store Files on a Mac

Before starting this worked example, you should try to force your Mac to create a few .DS_Store files for you inside our example repo by opening the folder in the finder and changing some Finder view options to something other than the default. I find that using list view and expanding a few folders does the trick every time.

I triggered the Finder to create two .DS_Store files for me in the pbs120a repo:

bart-imac2018:pbs120a bart% git status
On branch main

Untracked files:
  (use "git add <file>..." to include in what will be committed)

	.DS_Store
	contrib/.DS_Store

nothing added to commit but untracked files present (use "git add" to track)
bart-imac2018:pbs120a bart%

As mentioned above, to ignore files in all repositories you need to create an ignore file somewhere, then set the global setting core.excludesfile to point to that file.

You could create this global ignore file anywhere and call it anything, but the convention is to use ~/.gitignore_global (~ being your home directory in POSIX-compliant operating systems like Unix, Linux & MacOS).

We want to ignore all files named .DS_Store in all folders, so the pattern we need is simply .DS_Store.

Note that if you’ve already been using Git, it’s possible you already have a global git ignore file, so, to be safe, the example command uses the POSIX append operator >> rather than the replace operator >. What we want to do is write our pattern into ~/.gitignore_global. You could use your favourite plain text editor, or, you can use the following terminal command:

echo '.DS_Store' >> ~/.gitignore_global

The echo command simply writes whatever was passed to it to standard out (STDOUT), and the >> operator appends STDOUT to the end of the given file.

You can view the contents of your ~/.gitignore_global file with the command:

cat ~/.gitignore_global

The file now exists and contains our desired pattern, so, we’re now ready to configure Git to use it:

git config --global core.excludesfile ~/.gitignore_global

With that done, .DS_Store files are now hidden from our view in all repos:

bart-imac2018:pbs120a bart% git status    
On branch main

nothing to commit, working tree clean
bart-imac2018:pbs120a bart% 

Listing Ignored Files

How do we know the files are really there but being ignored? We can pass the --ignored file to git status and it will add a section to list all ignored files to the end of the output:

bart-imac2018:pbs120a bart% git status --ignored
On branch main

Ignored files:
  (use "git add -f <file>..." to include in what will be committed)

	.DS_Store
	contrib/.DS_Store

nothing to commit, working tree clean
bart-imac2018:pbs120a bart%

Worked Example 2 — Ignoring Untracked Files

In order to ignore some files, let’s create some 🙂

The instalment resources contain a script named pbs120-genIgnoreFiles.sh, this script uses the uuidgen command (learn more at its online man page) to generate some random glop (a universally unique identifier) which it redirects into files, and the mkdir command to create a directory. In total the script creates the following:

ignoreFile1.tmp
contrib/ignoreFile2.tmp
ignoreDir/ignoreFile3.txt
ignoreDir/ignoreFile4.txt

Assuming you are still in your repo, you can execute the script with the following command:

../pbs120-genIgnoreFiles.sh

Once you’ve run the script you should see the new files your git status:

bart-imac2018:pbs120a bart% git status
On branch main

Untracked files:
  (use "git add <file>..." to include in what will be committed)

	contrib/ignoreFile2.tmp
	ignoreDir/
	ignoreFile1.tmp

nothing added to commit but untracked files present (use "git add" to track)
bart-imac2018:pbs120a bart% 

Notice that the folder ignoreDir is listed, but its contents are not. Until at least one file within a folder is tracked, Git’s status command won’t descend into it.

We want to ignore all these files, so we’ll need to create a .gitignore file at the base of our repo. You could use your favourite plain text editor, but I’m going to pipe the patterns into the file using terminal commands.

The first thing we want to do is ignore the entire ignoreDir folder. If we wanted to ignore all folders named ignoreDir anywhere in the repo we would use the pattern ignoreDir/, remember the trailing / means ignore all my contents. However, I advise against using overly broad patterns — they can come back to bite you! Instead, my advice is to get into the habit of ignoring specific folders unless you actually have many folders with the same name to ignore. We do this by starting the pattern with a /, so, to ignore the entire folder ignoreDir in the repository’s root folder we use the pattern /ignoreDir/:

echo '/ignoreDir/' > .gitignore

Our Git status now ignores the entire directory:

bart-imac2018:pbs120a bart% git status                     
On branch main

Untracked files:
  (use "git add <file>..." to include in what will be committed)

	.gitignore
	contrib/ignoreFile2.tmp
	ignoreFile1.tmp

nothing added to commit but untracked files present (use "git add" to track)
bart-imac2018:pbs120a bart%

Notice that the .gitignore file is now listed as an untracked file. When we’re done we should add it to the repo and commit it.

Before we get that far, let’s deal with the remaining two files to be ignored, ignoreFile1.tmp & contrib/ignoreFile2.tmp. These files in are different folders, but they both end in .tmp, and that’s not an extension we want to track in general, so let’s ignore all .tmp files. The pattern for that is simply *.tmp:

echo '*.tmp' >> .gitignore

Git status now ignores both of those files:

bart-imac2018:pbs120a bart% git status
On branch main

Untracked files:
  (use "git add <file>..." to include in what will be committed)

	.gitignore

nothing added to commit but untracked files present (use "git add" to track)
bart-imac2018:pbs120a bart% 

Imagine at some later time we end up creating a new file with .tmp extension that we do care about, how do we avoid it being hidden from our view? We can make an exception!

Let’s create such a file:

uuidgen > important.tmp

This file is hidden because it matches the ignore rule *.tmp. We can verify this by showing the status including ignored files:

bart-imac2018:pbs120a bart% git status --ignored
On branch main

Untracked files:
  (use "git add <file>..." to include in what will be committed)

	.gitignore

Ignored files:
  (use "git add -f <file>..." to include in what will be committed)

	.DS_Store
	contrib/.DS_Store
	contrib/ignoreFile2.tmp
	ignoreDir/
	ignoreFile1.tmp
	important.tmp

nothing added to commit but untracked files present (use "git add" to track)
bart-imac2018:pbs120a bart%

To make an exception for our important temp file, we prefix a normal rule for that file with an exclamation mark, i.e. !/important.tmp. Notice I don’t want to ignore all files named important.tmp everywhere, just the one file at the repo’s root, hence the leading /:

echo '!/important.tmp' >> .gitignore

We can now see our important temp file again:

bart-imac2018:pbs120a bart% git status                          
On branch main

Untracked files:
  (use "git add <file>..." to include in what will be committed)

	.gitignore
	important.tmp

nothing added to commit but untracked files present (use "git add" to track)
bart-imac2018:pbs120a bart%

Let’s commit both files now (as separate commits):

bart-imac2018:pbs120a bart% git add .gitignore 
bart-imac2018:pbs120a bart% git commit -m 'Chore: created git ignore file'
[main 78434ec] Chore: created git ignore file
 1 file changed, 3 insertions(+)
 create mode 100644 .gitignore
bart-imac2018:pbs120a bart% git add important.tmp
bart-imac2018:pbs120a bart% git commit -m 'Chore: added important temporary file'
[main f3330aa] Chore: added important temporary file
 1 file changed, 1 insertion(+)
 create mode 100644 important.tmp
bart-imac2018:pbs120a bart%

We now have a clean status:

bart-imac2018:pbs120a bart% git status
On branch main
nothing to commit, working tree clean
bart-imac2018:pbs120a bart% 

Finally, this is now what our .gitignore file looks like:

bart-imac2018:pbs120a bart% cat .gitignore 
/ignoreDir/
*.tmp
!/important.tmp
bart-imac2018:pbs120a bart% 

Worked Example 3 — Ignoring Tracked Files

Let’s say we now change our mind and want to ignore all temp files, even important.tmp. The first step will be to remove the pattern carving out an exception for it from our .gitignore (the last line, i.e. !/important.tmp).

You could use your favourite plain-text editor to remove this line, or, we can trim the last line from our .gitignore file with the following terminal command:

temp=$(head -2 .gitignore) && echo $temp > .gitignore

If you’re curious, this command saves the first two lines of our .gitignore into a shell variable named $temp and then echoes that variable’s value into a fresh .gitignore file.

(added to show notes after recording)

There are at least two ways to do just about everything in Linux, if not more. Unsatisfied with my solution, a listener suggested that the following would accomplish the same thing with possibly less confusing syntax. The version of head in MacOS doesn’t have this option, which is why I didn’t include it originally, but if you’re using a modern Linux distribution, try:

head -n -1 .gitignore > .gitignore

This prints all lines except the last (1) line of our .gitignore file to STDOUT and then redirects it into a fresh .gitignore file. The - before the 1 means “print all except the last N lines of the file”. If we had typed head -n -2, it would have cut off the last 2 lines of the file instead.

If this file were new, it would now be ignored, but it’s not new, it’s tracked, so the file still exists in the repo!

We can see all the files tracked by Git in the currently checked out branch with the git ls-tree command. We’ve not looked at this command in detail yet, but you can learn all about it with man git-ls-tree.

To see all tracked files, we want to recurse into folders so we’ll need the -r flag. We want to see the contents of the currently checked out branch, so we can use the special place-holder branch HEAD as the source for the listing, and we really only want to see the file names, so we can condense the output with the --name-only option:

bart-imac2018:pbs120a bart% git ls-tree -r HEAD --name-only
.gitignore
EasterEgg.png
README.md
contrib/Bootstrap4.5/LICENSE
contrib/Bootstrap4.5/bootstrap.bundle.min.js
contrib/Bootstrap4.5/bootstrap.bundle.min.js.map
contrib/Bootstrap4.5/bootstrap.min.css
contrib/Bootstrap4.5/bootstrap.min.css.map
contrib/MomentJS2.29/moment-with-locales.min.js
contrib/jQuery3.5/LICENSE.txt
contrib/jQuery3.5/jquery.min.js
contrib/jQuery3.5/jquery.min.map
important.tmp
index.html
bart-imac2018:pbs120a bart%

Since this is a relatively small repo we can easily pick important.tmp out of the list, but, in the real world you’ll probably want to filter your results with something like the egrep command (see TTT #19):

bart-imac2018:pbs120a bart% git ls-tree -r HEAD --name-only | egrep important
important.tmp
bart-imac2018:pbs120a bart%

While it’s useful to be able to check whether or not a specific file is tracked, what we really need when we made a change to an ignore file in a big pre-existing project is a way of finding all files that are tracked but should now be ignored. We can use the git ls-files command for this. This command shows information about tracked files, and it supports flags for filtering tracked files by whether or not they are ignored, based on which source of ignore patterns. The flag to filter by ignored status is --ignored, but you can’t just use that flag on its own, you have to tell it what ignore patterns to use.

The most common thing to do is to check for all files ignored under Git’s default behaviour, you do this with the --exclude-standard flag:

bart-imac2018:pbs120a bart% git ls-files --ignored --exclude-standard                
important.tmp
bart-imac2018:pbs120a bart% 

This shows us our one and only ignored file.

What might also be useful is to only see the files that are tracked, but should be ignored, based on a specific ignore file — usually either your system-wide one, or the one in your current repo, e.g.:

bart-imac2018:pbs120a bart% git ls-files --ignored --exclude-from .gitignore         
important.tmp
bart-imac2018:pbs120a bart% git ls-files --ignored --exclude-from ~/.gitignore_global 
bart-imac2018:pbs120a bart%

If you have a more recent version of Git, you may need to add the -c or --cached flag to the command:

 bart-imac2018:pbs120a bart% git ls-files --ignored -c --exclude-standard

In this case, our one and only tracked file that should be ignored is being marked as ignored by the repo’s .gitignore file, not our global one.

Stop Tracking a File

Once we’ve identified the files that we’d like to stop tracking, how do we tell Git to actually stop tracking them? We use the rather scary-looking git rm command.

Beware — by default, git rm will delete the file from your working copy as well as remove it form the Git index!

This is quite often desired, but, if you only want to stop tracking a file, and not actually remove it from your computer, use the --cached flag.

So, in our example above, to stop tracking important.tmp which is now marked to be ignored, we would use the command:

git rm --cached important.tmp

Our Git status now shows that our ignore file has changed (we deleted the last line), and that important.tmp has been deleted from the Git repo:

bart-imac2018:pbs120a bart% git status
On branch main
Changes to be committed:
  (use "git reset HEAD <file>..." to unstage)

	deleted:    important.tmp

Changes not staged for commit:
  (use "git add <file>..." to update what will be committed)
  (use "git checkout -- <file>..." to discard changes in working directory)

	modified:   .gitignore

bart-imac2018:pbs120a bart%

However, note that the file remains in our working tree, it’s just being ignored:

bart-imac2018:pbs120a bart% git status --ignored
On branch main
Changes to be committed:
  (use "git reset HEAD <file>..." to unstage)

	deleted:    important.tmp

Changes not staged for commit:
  (use "git add <file>..." to update what will be committed)
  (use "git checkout -- <file>..." to discard changes in working directory)

	modified:   .gitignore

Ignored files:
  (use "git add -f <file>..." to include in what will be committed)

	.DS_Store
	contrib/.DS_Store
	contrib/ignoreFile2.tmp
	ignoreDir/
	ignoreFile1.tmp
	important.tmp

bart-imac2018:pbs120a bart% 

And just to prove it really is still there:

bart-imac2018:pbs120a bart% cat important.tmp 
4225E088-C10C-43C8-AEC8-5115014A5277
bart-imac2018:pbs120a bart%

Ignoring Files in GUI Clients

Usually, when you ignore a file, you want to stop tracking it, but you want to keep it in your working tree. That means that what you almost always want to do at a low-down Git level is edit the repo’s .gitignore file and then untrack the file with git rm --cached. Good clients will allow you just ignore, or ignore and stop tracking. See the screenshot below from GitKraken:

A screenshot showing GitKraken's question when you choose to ignore a file, it provides two buttons, one to just ignore, and one to ignore and stop tracking

Final Thoughts

And with that, we bring our introduction to Git and GitHub to a close. We now know how to use Git to version our code, to move it between computers in an organised way, and to facilitate collaboration. We’ve also learned how to use free GitHub accounts to get ourselves free Git-as-a-Service in the cloud, and to engage with the Open Source community.

From this point on we’re never going to leave Git behind us really — source control is an essential part of every developer’s toolkit, once you start using it, you just can’t go back! From this instalment forward, we’ll be assuming basic Git competency.

We’re also going to double-down on our new Git knowledge in this series’ next series-within-a-series when we look at using the Git-powered open source tool Chez Moi for managing our Linux/Unix/MacOS config files across multiple machines. This will be a cross-over series with Taming the Terminal.

Join the Community

Find us in the PBS channel on the Podfeet Slack.

Podfeet Slack