Command Line Searches Using The find Command
This is another command that I think is very powerful, but most people don't use it to its full potential. I decided to make this a written post rather than a video because the command takes so many different parameters, and I was worried that a video would not let the viewer capture the actual commands.
Why use 'find'?
While most people are used to using a GUI for conducting searches on a file system, you may sometimes find yourself in a situation where all you have is the command line (like when you have connected remotely via SSH to a server on the other side of the world). Enter the 'find' command. Below are some common searches you would execute in a GUI but here we will get the same results using the 'find' command.
Finding files by name
One of the most common searches you perform on a directory is to look for files whose names match a pattern. For example, if you wanted to find all the files that have the ".txt" extension, you would type this:
% find . -name "*.txt"
./end_date.txt
./log_08022021.txt
./log_08012021.txt
Let's break that command down.
% find
- the actual command at the prompt.
.
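- the directory that the find command will search. The find that ships with macOS and other BSD systems requires a path; GNU find on Linux falls back to the current directory if you leave it out, but it is a good habit to always provide one. On Unix-like systems the "dot" represents the current working directory.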
-name
- this argument tells the find command that you are searching by name.
"*.txt"
- this is the pattern to search for. In this case, we are looking for files with the extension "txt". The quotes matter: they stop the shell from expanding the "*" itself before find ever sees it.
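To try the name search without touching real files, here is a minimal sketch that builds a throwaway directory first. The file names (notes.txt, todo.TXT, photo.jpg) are made up for illustration, and the -iname variant is an addition not covered above: it matches case-insensitively in GNU and BSD find but is not required by POSIX.

```shell
#!/bin/sh
# Scratch directory with known contents (all names here are illustrative).
sandbox=$(mktemp -d)
touch "$sandbox/notes.txt" "$sandbox/todo.TXT" "$sandbox/photo.jpg"

# Quoted pattern: find, not the shell, expands the wildcard.
exact=$(cd "$sandbox" && find . -name "*.txt" | sort)

# -iname ignores case (GNU and BSD find; not part of POSIX).
any_case=$(cd "$sandbox" && find . -iname "*.txt" | sort)

echo "case-sensitive:"
echo "$exact"
echo "case-insensitive:"
echo "$any_case"

# Remove the scratch directory again.
rm -rf "$sandbox"
```

The case-sensitive search should report only ./notes.txt, while the -iname search should also pick up ./todo.TXT.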
Finding files by type
Let's say you run the previous command looking for files with a certain name but instead of just files you also get directories with a similar name.
% find . -name "shake*"
./shakespeare
./shakespeare_bio.txt
If you really only want files with a particular name, you have to add the -type parameter (f stands for regular files) like this:
% find . -type f -name "shake*"
./shakespeare_bio.txt
Likewise, if you only wanted to find what directories were under the current path, you would type this command:
% find . -type d
.
./shakespeare
./shakespeare/tragedy
./shakespeare/comedy
Notice that the first result is the "dot" which, again, represents the current working directory.
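The sketch below recreates a similar directory layout in a temporary folder so the effect of -type can be seen side by side. The layout mirrors the example above but is otherwise hypothetical, and the last search uses ! -path . as one way to drop the "dot" entry from the results, which is an addition not covered in the text.

```shell
#!/bin/sh
# Scratch layout: one directory tree and one file sharing the "shake" prefix.
sandbox=$(mktemp -d)
mkdir -p "$sandbox/shakespeare/tragedy" "$sandbox/shakespeare/comedy"
touch "$sandbox/shakespeare_bio.txt"

# Without -type, the pattern matches the directory and the file alike.
both=$(cd "$sandbox" && find . -name "shake*")

# -type f keeps regular files only; -type d keeps directories only.
files=$(cd "$sandbox" && find . -type f -name "shake*")
dirs=$(cd "$sandbox" && find . -type d -name "shake*")

# All directories below the starting point, excluding "." itself.
dirs_below=$(cd "$sandbox" && find . -type d ! -path . | sort)

echo "files: $files"
echo "dirs:  $dirs"
echo "$dirs_below"

rm -rf "$sandbox"
```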
Finding files greater than a certain size
Now let's say you want to find all the files that are above a certain size. You would run the command like this:
% find . -size +1M
./10-MB-Test.docx
The -size parameter tells find to match on file size, and +1M tells it to look for any file that is larger than 1 MiB (1048576 bytes). Note that the "+" sign is very important. If you leave it off, find looks for files that are 1M in size rather than larger. With the find that ships with macOS and other BSD systems, that means exactly 1048576 bytes; GNU find on Linux rounds sizes up to the next whole unit, so there -size 1M also matches smaller non-empty files.
% find . -size 1M
./1MB_file.txt
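Here is a small sketch for experimenting with size tests on files of known size. The file names are hypothetical, and the c suffix (size in bytes) is used for the exact and smaller-than searches because it avoids any rounding differences between find implementations.

```shell
#!/bin/sh
# Create files of known sizes with dd (file names are illustrative).
sandbox=$(mktemp -d)
dd if=/dev/zero of="$sandbox/exactly_1MiB.bin" bs=1024 count=1024 2>/dev/null
dd if=/dev/zero of="$sandbox/over_1MiB.bin" bs=1024 count=1025 2>/dev/null
dd if=/dev/zero of="$sandbox/small.bin" bs=1024 count=1 2>/dev/null

# +1M: strictly larger than 1 MiB, so the exactly-1-MiB file is excluded.
larger=$(cd "$sandbox" && find . -type f -size +1M)

# The c suffix counts bytes, which makes an exact match unambiguous.
exact=$(cd "$sandbox" && find . -type f -size 1048576c)

# A leading minus sign means strictly smaller than the given size.
smaller=$(cd "$sandbox" && find . -type f -size -1048576c)

echo "larger than 1 MiB:  $larger"
echo "exactly 1 MiB:      $exact"
echo "smaller than 1 MiB: $smaller"

rm -rf "$sandbox"
```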
Finding files by timestamp
What if you want to find files that are older than a certain date? The find command has a -newer parameter that you can use; you just have to provide it with a reference file and use negative logic. What do I mean by a reference file? Well, -newer compares files against another file, so in order to determine if a file is older or newer than a date, find needs a file whose timestamp is set to that date and/or time. Fortunately, Linux allows you to create a file and specify its timestamp by using the touch command. Say, for example, that you want to find all the files that were last modified before September 2021 (that is, in August 2021 or earlier). You first would create a reference file like this:
% touch -d "2021-09-01T00:00:00" end_date.txt
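This command creates the file if it does not already exist and sets its modification time (not a creation date) to September 1, 2021. The -newer parameter compares modification times, so you can now use the find command like this to find any files that are not newer than the end_date.txt file: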
% find . -not -newer end_date.txt
./end_date.txt
./log_08022021.txt
./log_08012021.txt
Note two things about this command:
- The use of the -not parameter to negate the -newer parameter. So, if you think in terms of negative logic, older == not newer (strictly speaking, "not newer" also matches files with exactly the same timestamp as the reference file).
- The reference file will always show up in the results precisely because it is not newer than itself. Just remember to ignore it when you read the results.
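The reference-file technique can be sketched end to end as follows. The file names and dates are invented for the demonstration. Two deliberate substitutions are made here: touch -t (format [[CC]YY]MMDDhhmm) is used as a widely portable way to set the timestamp, and ! is the POSIX spelling of -not. Excluding the reference file by name is one possible way to keep it out of the results.

```shell
#!/bin/sh
# Scratch files with fixed modification times (names and dates are illustrative).
sandbox=$(mktemp -d)
touch -t 202108010900 "$sandbox/log_august.txt"
touch -t 202109150900 "$sandbox/log_september.txt"
touch -t 202109010000 "$sandbox/cutoff.ref"

# "Not newer than the reference" = modified on or before September 1, 2021.
# The second test drops the reference file itself from the results.
older=$(cd "$sandbox" && find . -type f ! -newer cutoff.ref ! -name cutoff.ref)

# Without the negation, find returns files modified after the reference.
newer=$(cd "$sandbox" && find . -type f -newer cutoff.ref)

echo "older: $older"
echo "newer: $newer"

rm -rf "$sandbox"
```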
Sorting files by size
One of the most common operations you will perform is to list all files by their size. In Linux, you can use the find command to do this, but it requires a detour through the -exec parameter.
The -exec parameter is powerful because it lets you combine two commands in one. Think of it like this:
- For every file that matches the criteria for the find command,
- Perform (execute) some action on it
For example, say you want to get all the file information for every "txt" file. You would use the find command like this:
% find . -name "*.txt" -exec ls -alh {} \;
-rw-r--r-- 1 jaimevillela staff 1.0K Aug 18 10:13 ./1KB_file.txt
-rw-r--r-- 1 jaimevillela staff 1.0M Aug 18 10:23 ./1MB_file.txt
-rw-r--r-- 1 jaimevillela staff 0B Sep 1 2021 ./end_date.txt
-rw-r--r-- 1 jaimevillela staff 256B Aug 18 11:30 ./log_08022021.txt
-rw-r--r-- 1 jaimevillela staff 2.9K Aug 18 11:32 ./shakespeare_bio.txt
-rw-r--r-- 1 jaimevillela staff 512B Aug 18 11:29 ./log_08012021.txt
Let's break down everything that comes after the -exec
parameter:
ls -alh
- lists all the information for a file. The h option prints sizes in human-readable form (for example 1.0K or 1.0M), which is easier to read and shortens the size column.
{}
- this represents every file that matches the find criteria (in this case, any file that has the "txt" extension).
\;
- the command must be terminated by a semicolon and, because we are in a shell, the semicolon must be "escaped" with a backslash ("\") or the shell will treat it as a control operator and never pass it to find.
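One detail worth seeing in action is how often the command after -exec actually runs. The sketch below counts output lines from echo to show it: with \; the command runs once per matching file. It also shows the + terminator, which is not covered above; it is part of POSIX find and passes many matches to a single invocation. The file names are hypothetical.

```shell
#!/bin/sh
# Three scratch files to match (names are illustrative).
sandbox=$(mktemp -d)
touch "$sandbox/a.txt" "$sandbox/b.txt" "$sandbox/c.txt"

# With \; the command runs once per match, so echo prints three lines.
per_file=$(cd "$sandbox" && find . -name "*.txt" -exec echo {} \; | wc -l | tr -d ' ')

# With + the matches are appended to one invocation, so echo prints one line.
batched=$(cd "$sandbox" && find . -name "*.txt" -exec echo {} + | wc -l | tr -d ' ')

echo "lines with \\; : $per_file"
echo "lines with +  : $batched"

rm -rf "$sandbox"
```

For a command like ls, which accepts many file arguments, the + form usually starts far fewer processes and is noticeably faster on large directory trees.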
We now have the size for all the files we are interested in, but we still have to sort them. Unfortunately, the find command cannot sort its own results; we need to pipe the output of this command to another one: the sort command.
To sort all the files by their size, the full command would look like this:
% find . -name "*.txt" -exec ls -alh {} \; | sort -k 5 -r -h
-rw-r--r-- 1 jaimevillela staff 1.0M Aug 18 10:23 ./1MB_file.txt
-rw-r--r-- 1 jaimevillela staff 2.9K Aug 18 11:32 ./shakespeare_bio.txt
-rw-r--r-- 1 jaimevillela staff 1.0K Aug 18 10:13 ./1KB_file.txt
-rw-r--r-- 1 jaimevillela staff 512B Aug 18 11:29 ./log_08012021.txt
-rw-r--r-- 1 jaimevillela staff 256B Aug 18 11:30 ./log_08022021.txt
-rw-r--r-- 1 jaimevillela staff 0B Sep 1 2021 ./end_date.txt
Let's break down everything after the find command:
|
- this is the pipe character; it takes the results of the find command and passes them to the sort command
sort
- hopefully, this is self-explanatory
-k 5
- this will perform the sort according to the 5th column of the input. In this case, it is the 5th column that contains the file sizes.
-r
- this will sort in reverse order (i.e. descending; by default, a sort is performed in ascending mode)
-h
- sort according to human-readable numeric mode. Because the results of the find command displayed the size in human-readable format, the sort should perform its action accordingly.
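If the sort on your system lacks the -h option, one possible alternative is to skip the human-readable sizes and sort on raw byte counts instead. The sketch below is a variation on the command above, not the original method: it uses ls -l (sizes in bytes), restricts the sort key to the fifth field with -k 5,5, and sorts numerically with -n. It assumes user names, group names, and file names contain no spaces; the file names are made up.

```shell
#!/bin/sh
# Scratch files of different sizes (names are illustrative).
sandbox=$(mktemp -d)
dd if=/dev/zero of="$sandbox/big.txt" bs=1024 count=4 2>/dev/null
dd if=/dev/zero of="$sandbox/mid.txt" bs=1024 count=2 2>/dev/null
dd if=/dev/zero of="$sandbox/tiny.txt" bs=1 count=10 2>/dev/null

# ls -l prints sizes in bytes, so a plain numeric sort (-n) is enough.
# -k 5,5 limits the sort key to the fifth field; -r puts the largest first.
sorted=$(cd "$sandbox" && find . -type f -name "*.txt" -exec ls -l {} + | sort -k 5,5 -n -r)
echo "$sorted"

# The last field of each line is the path; pick out the two extremes.
largest=$(echo "$sorted" | head -n 1 | awk '{print $NF}')
smallest=$(echo "$sorted" | tail -n 1 | awk '{print $NF}')
echo "largest: $largest, smallest: $smallest"

rm -rf "$sandbox"
```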
Conclusion
Whew! That last example was tricky but if you review it a few times it will make sense. In particular, the -exec parameter can be powerful if used to delete files, but you have to be very careful: on the command line a deleted file does not go to a trash folder, so once you delete a file it's gone. So do run through some examples of your own and I hope this helps you the next time you have to perform some searches from the command line.
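As a starting point for your own experiments, here is a cautious sketch of deleting with -exec. It works only inside a temporary directory, the file names are hypothetical, and the two-step habit it shows (preview first, delete second) is a suggestion rather than anything find enforces.

```shell
#!/bin/sh
# Scratch directory so nothing real can be deleted (names are illustrative).
sandbox=$(mktemp -d)
touch "$sandbox/keep.txt" "$sandbox/old.tmp" "$sandbox/older.tmp"

# Step 1: preview. Run the search on its own and read the list carefully.
preview=$(cd "$sandbox" && find . -type f -name "*.tmp" | sort)
echo "would delete:"
echo "$preview"

# Step 2: only then reuse exactly the same criteria with rm.
(cd "$sandbox" && find . -type f -name "*.tmp" -exec rm {} \;)

# Check what is left.
remaining=$(cd "$sandbox" && find . -type f)
echo "remaining: $remaining"

rm -rf "$sandbox"
```

For interactive use, replacing -exec with -ok makes find ask for confirmation before running the command on each file.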