Processing text is an essential part of many bash scripts. Bash provides many handy commands and utilities for manipulating plain text that should be in every scripter's toolkit.
This guide will demonstrate how to use text processing utilities like sed, awk, grep, cut, tr, and more to search, edit, and analyze text data in your bash shell scripts.
Finding Text with grep
The grep command is used to find matching lines of text in files or input streams. It is based on regular expressions, which allow flexible pattern matching.
Some examples of using grep to find text:
Basic Matching
# Find lines containing text
grep "some_text" myfile.txt
This prints any lines in myfile.txt that contain "some_text".
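grep also accepts flags that tweak how matching works; for example, -i makes the search case-insensitive. A minimal sketch against the same hypothetical myfile.txt:
# Ignore case when matching
grep -i "some_text" myfile.txt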
Inverted Match
# Print lines that do NOT match
grep -v "some_text" myfile.txt
The -v flag inverts the match, printing only non-matching lines.
Recursive Directory Search
# Search all files in directory recursively
grep -R "some_text" /path/to/dir
-R enables recursive directory traversal to search files in subdirectories.
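grep isn't limited to files; it also reads standard input, so it can filter the output of another command through a pipe, and -c reports a match count instead of the matching lines. A rough sketch (the process name sshd is just an illustration):
# Filter the output of another command through a pipe
ps aux | grep "sshd"
# Count matching lines instead of printing them
grep -c "some_text" myfile.txt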
As you can see, grep is very useful for finding text patterns in files quickly and flexibly. But it doesn't allow editing or modifying text, only searching. For that, we need more powerful utilities like sed and awk.
Advanced Text Processing with sed and awk
sed and awk are programming languages built for processing text. They allow more advanced search and replace operations, numeric calculations, data formatting, and more.
Replacing Text with sed
The sed utility is ideal for simple search-and-replace operations on text. For example:
# Replace text
sed 's/day/night/' myfile.txt
This replaces the first occurrence of "day" with "night" on each line of myfile.txt. The s signifies substitution.
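To replace every occurrence rather than just the first on each line, add the g flag; -i edits the file in place. A quick sketch (note that the in-place syntax differs: GNU sed accepts -i, while BSD/macOS sed expects -i ''):
# Replace every occurrence on each line, not just the first
sed 's/day/night/g' myfile.txt
# Edit the file in place (GNU sed shown; BSD/macOS sed uses -i '')
sed -i 's/day/night/g' myfile.txt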
You can also use regular expressions:
# Replace with regex
sed 's/[0-9][0-9]/XX/' myfile.txt
This replaces the first pair of consecutive digits on each line with XX.
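For richer patterns, most sed implementations also support extended regular expressions via -E. As a small illustrative sketch, this collapses each run of digits into a single N:
# Use extended regex to replace each run of digits with N
sed -E 's/[0-9]+/N/g' myfile.txt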
Parsing Text with awk
awk is more full-featured for structured text processing. For example, to print the 5th column of a CSV:
# Print 5th CSV column
awk -F, '{print $5}' myfile.csv
The -F flag sets the field delimiter, and $5 refers to the 5th field (column) of each line.
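awk also provides built-in variables such as NF (the number of fields on the current line), so $NF refers to the last field no matter how many columns a row has. A small sketch against the same hypothetical myfile.csv:
# Print the first and last field of each CSV row
awk -F, '{print $1, $NF}' myfile.csv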
awk processes text line by line, so it can also filter lines with conditions:
# Print lines based on condition
awk '{if ($3 > 10) print $0}' myfile.txt
This prints lines where the 3rd column is greater than 10.
As you can see, awk enables more advanced text processing and analysis.
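Because awk handles numbers natively, it can also aggregate values across lines. This sketch, assuming a whitespace-delimited numeric 3rd column, totals that column and prints the result after the last line:
# Sum the 3rd column and print the total at the end
awk '{total += $3} END {print total}' myfile.txt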
Cutting, Sorting, Changing Case
A few more useful Bash text processing utilities:
cut - Cut columns from a file:
# Get usernames from /etc/passwd
cut -d: -f1 /etc/passwd
sort - Sort lines alphabetically:
# Sort contents of a file
sort myfile.txt
tr - Translate or delete characters:
# Change lowercase to uppercase
tr '[:lower:]' '[:upper:]' < myfile.txt
fmt - Format text by wrapping lines at a given width:
# Wrap lines at 80 characters
fmt -w 80 myfile.txt
Mastering these text processing tools will enable you to easily search, parse, transform, and analyze text in your bash scripts. They can be combined and piped together for incredibly flexible data processing capabilities.
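For instance, the utilities above can be chained into a single pipeline. This sketch pulls the login shell field from /etc/passwd, removes duplicates, and uppercases the result:
# List the unique login shells from /etc/passwd in uppercase
cut -d: -f7 /etc/passwd | sort -u | tr '[:lower:]' '[:upper:]'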
Refer to this guide for syntax examples of grep, sed, awk, cut, tr, fmt, and more. Text processing is essential for generating reports, parsing logs/CSVs/JSON, and handling string manipulation.