Scott's Big Blue Box



Note: This was originally a tutorial posted on repl.it, adapted slightly for the interblogs.

What is GNU find?

GNU find is a tool that, by default, “selects” all the files in a directory and everything beneath it. So if we take an example directory (my wallpapers) and run find, we see:

.
./dw
./dw/Xj1Lby.jpg
./dw/doctor-who-wallpaper-3.jpg
./dw/dalek.jpg
./dw/1
./dw/dalek-bluer.jpg
./.ani1.png
./.ani2.png
./.ani3.png
./.ani4.png
./.ani.tar.xz
./.ani5
./.ani6

Find takes options, of which there are four types:

- Modifiers, which change how the whole search runs (Example: -maxdepth)
- Tests, which return a boolean (Example: -name)
- Actions, which also return a boolean but do something as a side effect (Example: -exec)
- Operators, which connect other options (Example: -and)

You use find by entering find [paths] [options]. Find defaults to your pwd if you provide no path.
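To make the find [paths] [options] shape concrete, here is a minimal sketch. It assumes GNU find, and it builds a throwaway directory with mktemp so nothing real is touched; the file names are invented for the demo.

```shell
#!/bin/sh
# Scratch directory with made-up file names.
demo=$(mktemp -d)
cd "$demo" || exit 1
mkdir dw
touch dw/dalek.jpg dw/notes.txt .ani1.png

# Path: .   Test: -type f   Operator: -and   Test: -name   Action: -print
jpgs=$(find . -type f -and -name '*.jpg' -print)
printf '%s\n' "$jpgs"

# With no path and no action, GNU find falls back to "." and -print.
# The sort is only there to make the order predictable.
everything=$(find | LC_ALL=C sort)
printf '%s\n' "$everything"

cd / && rm -rf "$demo"
```

The second command is the bare find from the start of this post, just run against the scratch directory.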

Now let’s start. Allons-y!

Basic operations

There are a couple of tests which are the most common.

By name

To filter files by name, use -name.

You can use wildcards like * (which matches 0 or more chars) and ? (which matches exactly one char).

For example, to get all hidden files:

find -name ".*"

.
./.ani1.png
./.ani2.png
./.ani3.png
./.ani4.png
./.ani.tar.xz
./.ani5
./.ani6

Or to find all hidden PNG wallpapers starting with ani:

find -name '.ani*.png'

./.ani1.png
./.ani2.png
./.ani3.png
./.ani4.png

To ignore case, use -iname. find -name '.ani*.png' and find -iname '.Ani*.PNg' return the same thing (if you have the listing above).
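If the difference between ?, * and -iname feels abstract, this small sketch may help. It assumes GNU find and uses invented file names in a scratch directory.

```shell
#!/bin/sh
# Scratch directory with made-up file names.
demo=$(mktemp -d)
cd "$demo" || exit 1
touch .ani1.png .ani12.png .ANI3.PNG .ani.tar.xz

# ? must match exactly one character, so .ani12.png is skipped.
one_char=$(find . -name '.ani?.png' | LC_ALL=C sort)
printf 'one char:\n%s\n' "$one_char"

# * matches any run of characters, including an empty one.
any_run=$(find . -name '.ani*.png' | LC_ALL=C sort)
printf 'any run:\n%s\n' "$any_run"

# -iname does the same match but ignores case, so .ANI3.PNG shows up too.
no_case=$(find . -iname '.ani*.png' | LC_ALL=C sort)
printf 'ignoring case:\n%s\n' "$no_case"

cd / && rm -rf "$demo"
```

Quoting the pattern matters: without the quotes the shell may expand the wildcard before find ever sees it.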

By path

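To filter by path, use -path.

The difference between -path and -name is that -name checks only the basename (the last component of the path), while -path checks the whole path.

So if I do find -name /home/swl/wallpapers/.ani?.png, I get nothing, because a file cannot have a / in its name.

But with -path it does work, as long as the pattern matches the path the way find prints it: starting with /home/swl/wallpapers if you pass that as the path, or with ./ if you let find default to the current directory.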

Again, -ipath ignores case.
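Here is a rough side-by-side of -name and -path, assuming GNU find and a scratch directory with invented names. Note that the -path pattern starts with ./ because that is how find prints paths when started from the current directory.

```shell
#!/bin/sh
# Scratch directory with made-up file names.
demo=$(mktemp -d)
cd "$demo" || exit 1
mkdir dw
touch dalek.jpg dw/dalek.jpg

# -name only looks at the last path component, so both files match.
by_name=$(find . -name 'dalek.jpg' | LC_ALL=C sort)
printf 'by name:\n%s\n' "$by_name"

# -path looks at the whole path, so we can ask for the copy inside dw.
by_path=$(find . -path './dw/*.jpg')
printf 'by path:\n%s\n' "$by_path"

# A slash in a -name pattern can never match; GNU find warns on stderr.
slash_in_name=$(find . -name 'dw/dalek.jpg' 2>/dev/null)
printf 'slash in -name: [%s]\n' "$slash_in_name"

cd / && rm -rf "$demo"
```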

By regex

To search a path (not a filename) by regex, use -regex. The regex has to match the whole path, not just a piece of it.

For an explanation of the syntax, go to the GNU emacs manual.

Once more: use -iregex to ignore case.

You can set the regex type by using -regextype:

- ‘emacs’: Regular expressions compatible with GNU Emacs; this is also the default behaviour if this option is not used.
- ‘posix-awk’: Regular expressions compatible with the POSIX awk command (not GNU awk)
- ‘posix-basic’: POSIX Basic Regular Expressions.
- ‘posix-egrep’: Regular expressions compatible with the POSIX egrep command
- ‘posix-extended’: POSIX Extended Regular Expressions

This only affects -regex tests placed after it, so put it at the start.

I can’t think of an example here
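For what it is worth, one possible example is picking out the numbered .ani wallpapers. This is only a sketch: it assumes GNU find, the posix-extended regex type, and invented file names in a scratch directory.

```shell
#!/bin/sh
# Scratch directory with made-up file names.
demo=$(mktemp -d)
cd "$demo" || exit 1
touch .ani1.png .ani22.png dalek.jpg

# The regex is matched against the whole path, leading "./" included,
# so the dots and the slash are spelled out (and the dots escaped).
numbered=$(find . -regextype posix-extended -regex '\./\.ani[0-9]+\.png' | LC_ALL=C sort)
printf 'numbered:\n%s\n' "$numbered"

# A regex that only describes part of the path matches nothing.
partial=$(find . -regextype posix-extended -regex 'ani[0-9]+')
printf 'partial: [%s]\n' "$partial"

cd / && rm -rf "$demo"
```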

By access/modify/chmod time

So there are three types of times find can filter by:

- Access time: the time the file was last read
- Change time: the time the file’s metadata (permissions, owner and so on) last changed
- Modified time: the time the file was last written to

You can filter in two ways: by day or by minute. To search by 24-hour block, use one of:

- -atime
- -ctime
- -mtime

For -?time, 0 means less than 24 hours ago, 1 means at least 24 but less than 48 hours ago, etc.

To filter by minute:

- -amin
- -cmin
- -mmin

For example, to find all files accessed less than 5 minutes ago:

find -amin -5

You can also give -daystart before the time tests to make -?time count from the start of today instead of from 24 hours ago.

For example, to see all files you used today:

find / -daystart -atime 0

When giving an integer argument to a test, you can use +n to mean more than n, -n to mean less than n, and n to mean n.
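The +n, -n and n forms are easier to see with files of known age. The sketch below assumes GNU find and GNU touch (for the -d '3 days ago' syntax); the file names are invented.

```shell
#!/bin/sh
# Scratch directory with made-up file names.
demo=$(mktemp -d)
cd "$demo" || exit 1
touch new.txt
touch -d '3 days ago' old.txt   # GNU touch: backdate the modification time

# -n: modified less than 5 minutes ago.
recent=$(find . -type f -mmin -5)
printf 'recent: %s\n' "$recent"

# +n: more than 1 whole 24-hour block ago, i.e. at least 2 days old.
stale=$(find . -type f -mtime +1)
printf 'stale: %s\n' "$stale"

# n: exactly 0 whole 24-hour blocks ago, i.e. within the last 24 hours.
today=$(find . -type f -mtime 0)
printf 'last 24h: %s\n' "$today"

cd / && rm -rf "$demo"
```

The -type f is there because the scratch directory itself was also modified just now and would otherwise show up in the recent results.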

By size

To filter by size, use -size n. You’ll probably want to add a suffix, since the default unit is the 512-byte block.

The suffixes are:

- b: 512-byte blocks
- c: bytes
- w: two-byte words
- k: kibibytes (1024 bytes)
- M: mebibytes (1024 kibibytes)
- G: gibibytes (1024 mebibytes)

Note that one gigabyte does not equal one gibibyte. If you’re searching for a 2 gigabyte file you’ll want about 1907M, not 2G.

As an example, I want to find all files greater than one MiB in my wallpapers:

> find -size +1M
./.ani2.png
./.ani5
./.ani6

You can also use -empty, which returns true if the file (or directory) has no contents.
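A quick sketch of the size tests, assuming GNU find and files of invented names and sizes. It also shows one surprise: sizes are rounded up to whole units before comparing, so -size -1M ends up matching only empty files.

```shell
#!/bin/sh
# Scratch directory with made-up file names.
demo=$(mktemp -d)
cd "$demo" || exit 1
: > empty.txt                          # 0 bytes
head -c 1024 /dev/zero > small.bin     # 1 KiB
head -c 2097152 /dev/zero > big.bin    # 2 MiB

# More than 1 MiB.
over_1m=$(find . -type f -size +1M)
printf 'over 1M: %s\n' "$over_1m"

# No contents at all.
empties=$(find . -type f -empty)
printf 'empty: %s\n' "$empties"

# small.bin rounds up to 1 MiB, which is not "less than 1",
# so only the empty file is left.
under_1m=$(find . -type f -size -1M)
printf 'under 1M: %s\n' "$under_1m"

cd / && rm -rf "$demo"
```

If you want "smaller than one MiB" in the everyday sense, a finer unit such as -size -1024k is one way to ask for it.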

By user

To filter by user or group, use -user or -group. You can also use -uid or -gid, which take numeric IDs and so support the +n and -n ranges.

So to find all wallpapers owned by me:

> find -uid 1000
.
./dw
./dw/Xj1Lby.jpg
./dw/doctor-who-wallpaper-3.jpg
./dw/dalek.jpg
./dw/dalek-bluer.jpg
./.ani1.png
./.ani2.png
./.ani3.png
./.ani4.png
./.ani5
./.ani6
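Your UID may not be 1000, so a slightly more portable way to ask for "my files" is to let id fill in the number. A minimal sketch, assuming GNU find and invented file names:

```shell
#!/bin/sh
# Scratch directory with made-up file names; everything in it is ours.
demo=$(mktemp -d)
cd "$demo" || exit 1
touch a.jpg b.jpg

# id -u prints the numeric ID of the current user.
mine=$(find . -type f -uid "$(id -u)" | LC_ALL=C sort)
printf 'mine:\n%s\n' "$mine"

# Negating the test lists files owned by anybody else (none here).
others=$(find . -type f ! -uid "$(id -u)")
printf 'others: [%s]\n' "$others"

cd / && rm -rf "$demo"
```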

By permissions

To filter by permissions, use -readable, -writable or -executable. You can also use -perm (which is actually the only standard one). You pass the usual chmod octal triplet into -perm, like 644, 755 or 777. A bare mode such as 644 has to match the file’s permissions exactly.

To find all wallpapers I can write to:

> find -writable
.
./dw
./dw/Xj1Lby.jpg
./dw/doctor-who-wallpaper-3.jpg
./dw/dalek.jpg
./dw/dalek-bluer.jpg
./.ani1.png
./.ani2.png
./.ani3.png
./.ani4.png
./.ani5
./.ani6
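Since -perm has a few forms, here is a small sketch of three of them. It assumes GNU find (the /mode form is a GNU extension) and two invented files with known modes.

```shell
#!/bin/sh
# Scratch directory with made-up file names and fixed modes.
demo=$(mktemp -d)
cd "$demo" || exit 1
touch notes.txt run.sh
chmod 644 notes.txt
chmod 755 run.sh

# Bare mode: the permissions must be exactly 644.
exact=$(find . -type f -perm 644)
printf 'exactly 644: %s\n' "$exact"

# -mode: all of these bits must be set (755 includes every bit of 644).
all_bits=$(find . -type f -perm -644 | LC_ALL=C sort)
printf 'at least 644:\n%s\n' "$all_bits"

# /mode: any of these bits may be set, here any execute bit.
any_exec=$(find . -type f -perm /111)
printf 'any execute bit: %s\n' "$any_exec"

cd / && rm -rf "$demo"
```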

Directory stuff

You can control directory usage with a couple of options, the most common being -maxdepth and -prune.

-prune means that if the file is a directory, don’t descend into it.

-maxdepth x means to stop after going down x directories.
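Both options are easier to follow on a small tree. The sketch below assumes GNU find; the directory and file names (including cache, the directory being skipped) are invented.

```shell
#!/bin/sh
# Scratch tree with made-up names:
#   top.jpg, dw/dalek.jpg, dw/deep/far.jpg, cache/tmp.jpg
demo=$(mktemp -d)
cd "$demo" || exit 1
mkdir -p dw/deep cache
touch top.jpg dw/dalek.jpg dw/deep/far.jpg cache/tmp.jpg

# "." is depth 0, so -maxdepth 2 stops before dw/deep/far.jpg.
shallow=$(find . -maxdepth 2 -name '*.jpg' | LC_ALL=C sort)
printf 'maxdepth 2:\n%s\n' "$shallow"

# Usual -prune idiom: if the name is "cache", prune it; otherwise (-o)
# carry on with the real test and print explicitly.
pruned=$(find . -name cache -prune -o -name '*.jpg' -print | LC_ALL=C sort)
printf 'without cache:\n%s\n' "$pruned"

cd / && rm -rf "$demo"
```

The explicit -print on the right-hand side is what keeps the pruned directory itself out of the output.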

Boolean stuff

Find is kind of like a DSL (a domain-specific language, though not a Turing-complete one). You can use boolean operators, like NOT, AND, and OR.

For example, to find all files that end with .iso or are greater than 2 gibibytes: find -name '*.iso' -or -size +2G (note that -o is the same as -or).

You can also use -not or ! (you may need to quote or escape the !). By default, each pair of options has an implicit and between them: find -name '*.iso' -mmin -60 is the same as find -name '*.iso' -a -mmin -60 (or -and).
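One thing that tends to bite with these operators is precedence: the implicit and binds tighter than -o. The sketch below shows the operators and that pitfall, assuming GNU find and invented file names.

```shell
#!/bin/sh
# Scratch directory with made-up file names.
demo=$(mktemp -d)
cd "$demo" || exit 1
touch disc.iso notes.txt photo.jpg

# OR, grouped with escaped parentheses so -print applies to both sides.
either=$(find . \( -name '*.iso' -o -name '*.txt' \) -print | LC_ALL=C sort)
printf 'iso or txt:\n%s\n' "$either"

# NOT.
not_iso=$(find . -type f ! -name '*.iso' | LC_ALL=C sort)
printf 'not iso:\n%s\n' "$not_iso"

# Without the parentheses this reads as
#   -name '*.iso'  -o  ( -name '*.txt' -a -print )
# so only the .txt file is printed.
ungrouped=$(find . -name '*.iso' -o -name '*.txt' -print)
printf 'ungrouped: %s\n' "$ungrouped"

cd / && rm -rf "$demo"
```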

Actions on the found files

After finding the files, having a list is cool ‘n all, but we may want to do something with it.

Showing found files

There are four ways to show files: -print (the default), -print0 (same as -print but with a \0 instead of \n), -fprint [file] (prints to file), and -fprint0 [file] (prints to file with \0).

You can also use -ls to get more info, and -fls to ls to a file.

You can use -printf, but that’s an “advanced topic” and beyond the scope of the (right now) 1151 words I have here. See more at the man page.
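The main reason to reach for -print0 is file names with spaces in them. Here is a minimal sketch, assuming GNU find and an xargs that supports -0; the file names are invented. It also sneaks in a tiny -printf, where %f stands for the file name without its directory.

```shell
#!/bin/sh
# Scratch directory with made-up file names, one containing a space.
demo=$(mktemp -d)
cd "$demo" || exit 1
touch 'two words.jpg' plain.jpg

# Newline-separated output: xargs splits on whitespace and sees 3 "files".
split=$(find . -type f | xargs sh -c 'echo "$#"' sh)
printf 'plain pipe: %s arguments\n' "$split"

# NUL-separated output: xargs -0 sees the 2 real files.
intact=$(find . -type f -print0 | xargs -0 sh -c 'echo "$#"' sh)
printf 'print0 pipe: %s arguments\n' "$intact"

# -printf with %f prints just the basename, one per line.
names=$(find . -type f -printf '%f\n' | LC_ALL=C sort)
printf 'names:\n%s\n' "$names"

cd / && rm -rf "$demo"
```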

Running commands on the file

There are two ways to run a command on a file: -execdir, and -exec.

-execdir first cd’s to the directory containing the found file and then runs the command there.

-exec just runs the command from the directory you started in, using the relative path.

You can invoke this two ways. Let’s say I want to view all of my wallpapers. I can do this: find \( -name '.ani?*' -o -name '*.jpg' \) -exec feh {} \; ({} is replaced by the filename), but then one window of feh (my image viewer) opens for each file. The escaped parentheses matter: -o binds more loosely than the implicit and, so without them the -exec would only apply to the *.jpg files.

What if, instead of the \; (which is escaped to save it from the shell, so find can get it), we use a +? You see, a ; runs the command once for each file, while the + expands to a list of all the files and runs the command once (or as few times as will fit on a command line). This is also better for performance. Running find -type f -exec sha256sum {} \; gives us:

> time find -type f -exec sha256sum {} \;
905724e4ce5ca0cd5ce1ba18b50480c9dac3025919a3c782ca47c3b32ff9c710  ./dw/Xj1Lby.jpg
24ab8803e3838ce63a8fb941870680dfe970f562c48704235a73a0d12e68dee2  ./dw/doctor-who-wallpaper-3.jpg
68ce1f66c7b8569c0eae3bfa83bddd5806ca6adda83dd3ae306242aa32ae03fa  ./dw/dalek.jpg
d154e21de2bb6364292a8a46105b455f6305a3a586d73879a4d6544670821f25  ./dw/dalek-bluer.jpg
f489d21d8c9151bce73cce6b14760d25e87457d1397c9b08c232cc8b1f518228  ./.ani1.png
bac9bc7819f6ff2222fe96397df2db45a0c6cc79f23814f7a45cbea6d6dd7f35  ./.ani2.png
7939f651ab6270ac1468aecce78c7a7b33d7f8af8953483f197875a9044fee89  ./.ani3.png
d0a637c82d0309e763f8c807ab09a023f2de4ef3ba814970f96e3abd84d01e7a  ./.ani4.png
8d7e75aca7748d5513ed46f3e34fb2bfbe48336f934a4c7d481ede19a3dbe262  ./.ani5
bc253d8afdbb41b0cf8dc2de98faf577bfdd7b1672791d76f3c04e2418a1fe12  ./.ani6

________________________________________________________
Executed in   73.63 millis    fish           external 
   usr time   52.74 millis  995.00 micros   51.75 millis 
      sys time   16.32 millis  144.00 micros   16.18 millis 

versus

> time find -type f -exec sha256sum {} +
905724e4ce5ca0cd5ce1ba18b50480c9dac3025919a3c782ca47c3b32ff9c710  ./dw/Xj1Lby.jpg
24ab8803e3838ce63a8fb941870680dfe970f562c48704235a73a0d12e68dee2  ./dw/doctor-who-wallpaper-3.jpg
68ce1f66c7b8569c0eae3bfa83bddd5806ca6adda83dd3ae306242aa32ae03fa  ./dw/dalek.jpg
d154e21de2bb6364292a8a46105b455f6305a3a586d73879a4d6544670821f25  ./dw/dalek-bluer.jpg
f489d21d8c9151bce73cce6b14760d25e87457d1397c9b08c232cc8b1f518228  ./.ani1.png
bac9bc7819f6ff2222fe96397df2db45a0c6cc79f23814f7a45cbea6d6dd7f35  ./.ani2.png
7939f651ab6270ac1468aecce78c7a7b33d7f8af8953483f197875a9044fee89  ./.ani3.png
d0a637c82d0309e763f8c807ab09a023f2de4ef3ba814970f96e3abd84d01e7a  ./.ani4.png
8d7e75aca7748d5513ed46f3e34fb2bfbe48336f934a4c7d481ede19a3dbe262  ./.ani5
bc253d8afdbb41b0cf8dc2de98faf577bfdd7b1672791d76f3c04e2418a1fe12  ./.ani6

________________________________________________________
Executed in   39.97 millis    fish           external 
   usr time   37.03 millis  967.00 micros   36.06 millis 
      sys time    2.90 millis  139.00 micros    2.76 millis 
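If you want to see the difference without a stopwatch, you can count how many arguments each run of the command receives. A small sketch, assuming GNU find and three invented files:

```shell
#!/bin/sh
# Scratch directory with made-up file names.
demo=$(mktemp -d)
cd "$demo" || exit 1
touch a.jpg b.jpg c.jpg

# \; starts the command once per file: three runs, one argument each.
per_file=$(find . -type f -exec sh -c 'echo "$#"' sh {} \;)
printf 'with ;\n%s\n' "$per_file"

# + hands over the whole list: one run with three arguments.
batched=$(find . -type f -exec sh -c 'echo "$#"' sh {} +)
printf 'with +\n%s\n' "$batched"

cd / && rm -rf "$demo"
```

Fewer process launches is where the saving in the timings comes from.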

Removing files

To remove a found file, use -delete. For example: find -empty -delete
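Because -delete is unforgiving, a cautious habit is to run the same expression with -print first and only then swap in -delete. Keep -delete at the end: find works through the expression left to right, so a -delete placed before the tests would try to remove everything. A sketch, assuming GNU find and invented file names:

```shell
#!/bin/sh
# Scratch directory with made-up file names.
demo=$(mktemp -d)
cd "$demo" || exit 1
: > empty.log
echo data > keep.log

# Dry run: same tests, but only print what would be removed.
would_go=$(find . -type f -empty -print)
printf 'would delete: %s\n' "$would_go"

# Real run: tests first, -delete last.
find . -type f -empty -delete

left=$(find . -type f)
printf 'left over: %s\n' "$left"

cd / && rm -rf "$demo"
```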

And that’s about it! To learn more, read the fine manual.

This tutorial was made with the help of Magical Mystery Tour, Sgt. Pepper’s Lonely Hearts Club Band and Vim.

One thing I can tell you is you’ve got to be free