Useful Commands For Filtering Text for Effective File Operations in Linux
In this article, let us learn about different commands used in Linux for filtering text to implement effective file operations on Linux Machines.
What is a filter in Linux?
In Linux, a filter is a program that reads input, usually from standard input or a file, transforms it, and writes the result to standard output. Because filters read and write plain text streams, they can be chained with pipes so that the output of one command becomes the input of the next.
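As a minimal sketch of this idea, the snippet below chains three of the filters covered in this article with pipes. The fruit names and the variable name result are made-up sample values, not part of the original article.

```shell
# Each command reads the previous command's output from standard input.
# sort orders the lines, uniq drops adjacent duplicates,
# and tr converts the remaining text to uppercase.
result=$(printf 'pear\napple\npear\nfig\n' | sort | uniq | tr '[:lower:]' '[:upper:]')

printf '%s\n' "$result"
```

Each stage could be swapped for a different filter without changing the others, which is the main appeal of composing commands this way.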
Syntax:
[filter method] [filter option] [data | path | location of data]
Let us discuss different filter commands which follow the above syntax and provide desired results for end users, namely:
pr command
The pr command formats text for printing: it paginates the input, adds a header to each page, and can arrange the text in multiple columns.
Syntax of pr command:
pr [options] [filename]
Sample code:
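Let’s say we created a file file.txt that lists the numbers from 1 to 10. After running the command below, type the numbers one per line and press Ctrl+D to save the file.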
cat > file.txt
Output:
Now, let us apply a filter on the above file using the pr command:
pr -3 file.txt
Output:
Explanation:
The -3 option tells pr to arrange the contents of file.txt in 3 columns, filled from top to bottom.
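As a minimal sketch of the column layout, the snippet below recreates the numbered file in a temporary directory and runs pr on it. The temporary directory, the variable names, and the use of seq to generate the numbers are assumptions made for illustration. The -t option is added only to suppress the page header, which contains the current date.

```shell
# Work in a throwaway directory so no real files are touched.
workdir=$(mktemp -d)

# Generate the numbers 1 to 10, one per line (a stand-in for typing them into cat).
seq 1 10 > "$workdir/file.txt"

# -3 lays the lines out in three columns, filled down each column;
# -t omits the page header and trailer.
columns=$(pr -3 -t "$workdir/file.txt")
printf '%s\n' "$columns"

# -a fills the columns across each row instead of down each column.
across=$(pr -3 -a -t "$workdir/file.txt")
printf '%s\n' "$across"

rm -rf "$workdir"
```

Without -t, pr also prints a header with the date, the file name, and the page number at the top of every page.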
tr command
The tr command translates or deletes characters read from standard input. A common use is changing the case of letters, from lower to upper or vice versa.
Syntax of tr command:
tr [OPTION] SET1 [SET2]
Sample code:
Uppercasing the input string using the tr command:
echo "www.w3wiki.com" | tr '[:lower:]' '[:upper:]'
Output:
WWW.W3WIKI.COM
Explanation:
Here, tr replaces every lowercase letter in the input string with its uppercase equivalent. The character classes are quoted so that the shell passes them to tr unchanged.
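Besides changing case, tr can also delete or squeeze characters. The following is a small sketch of those uses; the sample strings and variable names are made up for illustration.

```shell
# Translate: map every lowercase letter to its uppercase counterpart.
upper=$(echo "www.w3wiki.com" | tr '[:lower:]' '[:upper:]')

# Delete (-d): remove every digit from the input.
no_digits=$(echo "abc123def" | tr -d '[:digit:]')

# Squeeze (-s): collapse each run of repeated spaces into a single space.
squeezed=$(echo "too    many   spaces" | tr -s ' ')

printf '%s\n' "$upper" "$no_digits" "$squeezed"
```

Note that tr only reads standard input, so a file has to be supplied with a pipe or a redirect rather than as an argument.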
find command
The find command searches a directory tree for files that match given conditions. The most common condition is the file name, but find can also match on file type, size, modification time, and other attributes.
Syntax of find command:
find [where to start searching from] [expression determines what to find] [-options] [what to find]
Sample code:
Finding every file named file.txt, starting the search from the / (root) directory, using the find command:
find / -name "file.txt"
Output :
Explanation:
Here, find walks the whole filesystem starting at / and prints the path of every file named file.txt. When run as a regular user, it may also print "Permission denied" messages for directories it cannot read.
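Searching from / can be slow, so the sketch below demonstrates the same search inside a small temporary directory tree. The directory layout, the file notes.md, and the variable names are hypothetical and exist only for this example.

```shell
# Build a small throwaway directory tree to search in.
workdir=$(mktemp -d)
mkdir -p "$workdir/docs/old"
touch "$workdir/docs/file.txt" "$workdir/docs/old/file.txt" "$workdir/notes.md"

# -name matches the file name exactly; -type f restricts results to regular files.
matches=$(find "$workdir" -type f -name "file.txt" | sort)
printf '%s\n' "$matches"

# Quote wildcard patterns so the shell passes them to find unexpanded.
markdown=$(find "$workdir" -type f -name "*.md")
printf '%s\n' "$markdown"

# When searching from / as a regular user, error messages can be hidden like this:
#   find / -name "file.txt" 2>/dev/null

rm -rf "$workdir"
```

Starting the search from the narrowest directory that could contain the file is usually much faster than starting from /.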
uniq command
The uniq command filters out repeated lines that are adjacent to each other. Duplicates that are not next to each other are not detected, so the input is usually sorted first.
Syntax of uniq command:
uniq [OPTION]... [INPUT [OUTPUT]]
Sample code:
Let us create a text file names.txt that contains several duplicate names.
cat > names.txt
Now, let’s see how the uniq command handles the duplicate data above:
uniq -c names.txt
Output:
Explanation:
The uniq command prints each run of identical adjacent lines from names.txt only once, and the -c option prefixes every line with the number of times it occurred.
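Because uniq only compares neighbouring lines, the order of the input matters. The sketch below shows the difference between unsorted and sorted input; the names and variable names are made-up sample data.

```shell
# Sample data: "ravi" appears three times, but not all occurrences are adjacent.
names='ravi
ravi
anil
ravi'

# On unsorted input, only the adjacent pair is collapsed.
adjacent_only=$(printf '%s\n' "$names" | uniq)
printf '%s\n' "$adjacent_only"

# Sorting first brings all duplicates together, so each name is printed once.
all_unique=$(printf '%s\n' "$names" | sort | uniq)
printf '%s\n' "$all_unique"

# -c prefixes each line with its number of occurrences.
counted=$(printf '%s\n' "$names" | sort | uniq -c)
printf '%s\n' "$counted"
```

The combination sort | uniq -c is a common way to build a quick frequency table from a list of values.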
sort command
The sort command sorts the lines of a text file, in ascending order by default.
Syntax of sort command:
sort filename
Sample code:
Let us create a text file chr.txt which has some letters in random order:
cat > chr.txt
sort chr.txt
Output:
Explanation:
We have sorted the lines of the chr.txt file in ascending order by passing the filename to the sort command.
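The following sketch shows a few commonly used sort options on inline sample data. The letters, numbers, and variable names are assumptions chosen for illustration.

```shell
# Default: ascending order.
letters=$(printf 'd\nb\na\nc\n' | sort)
printf '%s\n' "$letters"

# -r reverses the order.
reversed=$(printf 'd\nb\na\nc\n' | sort -r)
printf '%s\n' "$reversed"

# Without -n, numbers are compared as text, so 100 sorts before 9.
as_text=$(printf '10\n9\n100\n' | sort)
printf '%s\n' "$as_text"

# -n compares by numeric value instead.
numeric=$(printf '10\n9\n100\n' | sort -n)
printf '%s\n' "$numeric"
```

The text-versus-numeric difference is a frequent source of surprises, so -n is worth adding whenever the lines start with numbers.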
grep command
The grep command prints the lines of the target file that match a given pattern.
Note: egrep, fgrep, and rgrep are variants of grep, equivalent to grep -E (extended regular expressions), grep -F (fixed strings), and grep -r (recursive search) respectively. Since they all extract matching lines from the input, we discuss only the grep command here.
Syntax of grep command:
grep [options] pattern [files]
Sample code:
Let us create a text file that contains some sentences.
cat > files11.txt
Now, we will apply the grep filter to the above file to retrieve the lines we want.
grep -i "ravi" files11.txt
Output:
Explanation:
Here, grep prints every line of files11.txt that contains the string "ravi". The -i option makes the match case-insensitive.
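The sketch below illustrates -i along with a few other common grep options. The original contents of files11.txt are not shown in this article, so the three sentences used here are invented sample data, as are the variable names.

```shell
# Create a sample file in a throwaway directory (contents are made up).
workdir=$(mktemp -d)
printf 'Ravi joined the team\nanil wrote the report\nthe report was read by RAVI\n' \
    > "$workdir/files11.txt"

# -i ignores case, so "Ravi" and "RAVI" both match.
insensitive=$(grep -i "ravi" "$workdir/files11.txt")
printf '%s\n' "$insensitive"

# -n prefixes each matching line with its line number.
numbered=$(grep -n -i "ravi" "$workdir/files11.txt")
printf '%s\n' "$numbered"

# -v inverts the match: print the lines that do NOT contain the pattern.
inverted=$(grep -v -i "ravi" "$workdir/files11.txt")
printf '%s\n' "$inverted"

# -E enables extended regular expressions; -c counts matching lines.
either=$(grep -E -c "Ravi|anil" "$workdir/files11.txt")
printf '%s\n' "$either"

rm -rf "$workdir"
```

Keep in mind that grep exits with a non-zero status when nothing matches, which matters in scripts that stop on errors.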
sed command
The sed command is a stream editor: it reads input line by line, applies an editing script, and writes the result to standard output. It is widely used in shell scripts for filtering and transforming text. For example, sed can print only selected lines of the input when given their line numbers.
Syntax of sed command:
sed OPTIONS… [SCRIPT] [INPUTFILE…]
Sample code:
Let us create a sample text file that contains some user names and their mail ids.
cat > data.txt
Now, let’s use the sed command to print only lines 2 through 5 of the file.
sed -n '2,5p' data.txt
Output:
Explanation:
The -n option suppresses sed's default behavior of printing every line, and the script 2,5p prints only lines 2 through 5 of data.txt.
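As a minimal sketch, the snippet below builds a six-line data.txt and applies three simple sed scripts to it. The user names, the example.com addresses, and the variable names are placeholders invented for this example.

```shell
# Build a six-line sample file in a throwaway directory (contents are made up).
workdir=$(mktemp -d)
for i in 1 2 3 4 5 6; do
    echo "user$i user$i@example.com"
done > "$workdir/data.txt"

# -n turns off automatic printing; 2,5p prints only lines 2 through 5.
selected=$(sed -n '2,5p' "$workdir/data.txt")
printf '%s\n' "$selected"

# 1d deletes the first line and prints everything else.
without_first=$(sed '1d' "$workdir/data.txt")
printf '%s\n' "$without_first"

# s/old/new/ replaces the first match on each line.
rewritten=$(sed 's/@example\.com/@example.org/' "$workdir/data.txt")
printf '%s\n' "$rewritten"

rm -rf "$workdir"
```

None of these commands modify data.txt itself, because sed writes its result to standard output unless told otherwise.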
fmt command
The fmt command is a simple text formatter: it reflows the input so that lines fit within a given width and prints the result to standard output.
Syntax of fmt command:
fmt [-WIDTH] [OPTION]… [FILE]…
Sample code:
Let us assume we have a text file data.txt that contains some names, which we can reformat using the fmt command.
cat data.txt
Now, let’s reformat the data in the above file using fmt:
fmt -w 1 data.txt
Output :
Explanation:
The -w option sets the maximum line width. With a width of 1, no two words fit on the same line, so fmt prints each word of data.txt on its own line.
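The sketch below shows fmt working in both directions: splitting a line at a given width and joining short lines into one. The sample text and variable names are made up for illustration.

```shell
text="one two three four five six"

# Width 1: no two words fit on one line, so each word gets its own line.
one_per_line=$(echo "$text" | fmt -w 1)
printf '%s\n' "$one_per_line"

# Width 15: words are packed into lines of at most 15 characters.
wrapped=$(echo "$text" | fmt -w 15)
printf '%s\n' "$wrapped"

# With the default width, short consecutive lines are joined into one.
joined=$(printf 'short\nlines\nhere\n' | fmt)
printf '%s\n' "$joined"
```

Blank lines separate paragraphs for fmt, so lines from different paragraphs are never joined together.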
more command
The more command displays a large text file one screen at a time. Press Enter to advance by one line or Space to advance by one screen. Traditional versions of more only move forward, and the Page Up and Page Down keys may not work. To see how more works, we need a file that contains a large amount of text. For this reason, I have chosen the /var/log/messages file.
Syntax of more command:
more [-options] [-num] [+/pattern] [+linenum] [file_name]
What is /var/log/messages?
/var/log/messages is a system log file that records general system messages, starting from boot. Depending on the syslog configuration, it may include messages from the mail, kernel, auth, and daemon facilities. Some distributions use a different file, such as /var/log/syslog.
Sample code:
cat /var/log/messages | more
Output:
Explanation:
Here the output stops after the first screen of text. In contrast, the plain cat /var/log/messages command prints the whole file at once, so the text scrolls past too quickly to read. At the bottom of the screen, more shows a "--More--" prompt; pressing Enter displays the next line of the output.
less command
The less command works like the more command, with two main differences. First, less does not need to read the whole input before displaying it, so it starts faster on large files. Second, less supports backward navigation, so the Page Up and Page Down keys work.
Syntax of less command:
less filename
Sample code:
To test the less command, let us take the same /var/log/messages file that we used with the more command.
cat /var/log/messages | less
Output:
Explanation:
We get the same output as with the more command, but we can now scroll in both directions through the log using the Page Up and Page Down keys or the arrow keys. Press q to quit.
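Pagers such as more and less are only useful when the output goes to a terminal. The sketch below wraps that check in a small shell function. The function name page, the generated sample file, and the variable names are hypothetical and not part of any standard tool.

```shell
# page is a hypothetical helper: use a pager only when stdout is a terminal.
page() {
    if [ -t 1 ] && command -v less >/dev/null 2>&1; then
        less "$1"    # interactive: scroll in both directions, q to quit
    elif [ -t 1 ] && command -v more >/dev/null 2>&1; then
        more "$1"    # interactive: Enter for a line, Space for a screen
    else
        cat "$1"     # not a terminal, or no pager installed: just print
    fi
}

# Generate a 500-line sample file instead of reading a real system log.
sample=$(mktemp)
seq 1 500 > "$sample"

# Inside a pipeline stdout is not a terminal, so page falls back to cat.
line_count=$(page "$sample" | wc -l)
echo "lines: $line_count"

# In an interactive terminal, calling page on a file opens it in the pager.
rm -f "$sample"
```

Reading the file directly, as in less filename, also avoids the extra cat process used in the examples above.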