AWK Hints


Text processing hints with AWK

awk is a compact, pattern-driven language for processing text line by line and field by field, which makes many file-manipulation tasks a one-liner. In this blog post, we’ll explore 10 common awk examples, complete with inputs and expected outputs.

1. Extracting Columns from a CSV File

Imagine you have a CSV file, and you want to extract specific columns. Here’s how you can do it:

Input (file.csv):

Name, Age, City
John, 25, New York
Alice, 30, Los Angeles

Command:

awk -F', *' '{print $1, $3}' file.csv

Output:

Name City
John New York
Alice Los Angeles
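
A common variation is to skip the header row while also absorbing the space after each comma. The following is a minimal sketch, assuming the same sample rows as above and that no field contains a quoted comma (plain awk field splitting does not understand CSV quoting):

```shell
# Sample rows are piped in directly so the snippet is self-contained.
# -F', *' treats a comma plus any following spaces as the separator.
out=$(printf '%s\n' 'Name, Age, City' 'John, 25, New York' 'Alice, 30, Los Angeles' |
  awk -F', *' 'NR > 1 {print $1, $3}')   # NR > 1 skips the header line
echo "$out"
```

This prints the two data rows without the header.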

2. Summing Values in a Column

Summing up values in a column is a common task. Let’s say you have a file with numbers, and you want to find the total:

Input (data.txt):

10
20
30

Command:

awk '{sum+=$1} END {print sum}' data.txt

Output:

60
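
The same pattern extends to any column of a multi-column file. A small sketch, assuming a hypothetical two-column input (item name, quantity); adding 0 to the total makes awk print 0 rather than an empty line when the input has no rows:

```shell
# Hypothetical input: item name in column 1, quantity in column 2.
out=$(printf '%s\n' 'pens 10' 'books 20' 'bags 30' |
  awk '{sum += $2} END {print sum + 0}')
echo "$out"

# With empty input, sum is unset; sum + 0 forces a numeric 0.
empty=$(awk '{sum += $2} END {print sum + 0}' < /dev/null)
echo "$empty"
```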

3. Searching for Specific Lines

Searching for lines that match a specific pattern can be handy. Suppose you have a file with various words, and you want to find lines containing the letter ‘a’:

Input (file.txt):

apple
banana
cherry
date

Command:

awk '/a/' file.txt

Output:

apple
banana
date
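
Patterns can also be anchored or negated. As an illustrative sketch on the same word list, the first command keeps only lines that start with 'a', and the second keeps lines that do not contain 'a' at all:

```shell
words=$(printf '%s\n' apple banana cherry date)

# ^ anchors the match to the start of the line.
starts_with_a=$(printf '%s\n' "$words" | awk '/^a/')

# ! inverts the pattern, printing only non-matching lines.
without_a=$(printf '%s\n' "$words" | awk '!/a/')

echo "$starts_with_a"
echo "$without_a"
```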

4. Calculating Column Averages

You can calculate the average of values in a column using awk. Consider you have a file with grades, and you want to find the average:

Input (grades.txt):

90
85
78
92

Command:

awk '{sum+=$1} END {print "Average:", sum/NR}' grades.txt

Output:

Average: 86.25
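
Dividing by NR fails on an empty file, where NR is 0. One possible way to guard against that, while also fixing the number of decimals with printf, is sketched below; show_average is a hypothetical helper name used only for this example:

```shell
# show_average is a hypothetical helper; it reads files given as
# arguments, or standard input when none are given.
show_average() {
  awk '{sum += $1}
       END {
         if (NR > 0) {
           printf "Average: %.2f\n", sum / NR
         } else {
           print "Average: n/a"
         }
       }' "$@"
}

grades=$(printf '%s\n' 90 85 78 92 | show_average)
empty=$(show_average < /dev/null)
echo "$grades"
echo "$empty"
```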

5. Filtering Lines by Length

Sometimes, you need to filter lines based on their length. Let’s say you have a file with text, and you want lines longer than 20 characters:

Input (text.txt):

A short line.
This is a really long line that exceeds 20 characters.

Command:

awk 'length($0) > 20' text.txt

Output:

This is a really long line that exceeds 20 characters.
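
When tuning a length threshold, it can help to print each line's length next to it. A small sketch with hypothetical input lines (note that length counts characters or bytes depending on the awk implementation and locale, which only matters for non-ASCII text):

```shell
# Hypothetical input: one short line and one longer than 20 characters.
out=$(printf '%s\n' 'short' 'a considerably longer line of text' |
  awk 'length($0) > 20 {print length($0) ": " $0}')
echo "$out"
```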

6. Formatting and Aligning Columns

Formatting columns for better readability is crucial. If you have a file with names and ages, you can align them like this:

Input (data.txt):

Alice 25
Bob 30
Carol 22

Command:

awk '{printf "%-10s %s\n", $1, $2}' data.txt

Output:

Alice      25
Bob        30
Carol      22
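
Numeric columns usually read better right-aligned. As a sketch on the same names and ages, a width without the minus sign (here %5d, an arbitrary choice) pads the number on the left:

```shell
# %-10s left-aligns the name in 10 columns; %5d right-aligns the age in 5.
out=$(printf '%s\n' 'Alice 25' 'Bob 30' 'Carol 22' |
  awk '{printf "%-10s %5d\n", $1, $2}')
echo "$out"
```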

7. Counting Lines in a File

Counting lines in a file is a straightforward task. Let’s say you have a file with multiple lines:

Input (file.txt):

Line 1
Line 2
Line 3

Command:

awk 'END {print NR}' file.txt

Output:

3
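
To count only lines that contain something, NF can be used as the condition, since it is 0 for empty and whitespace-only lines. A minimal sketch with hypothetical input that includes two blank lines:

```shell
# NF is the number of fields; blank lines have NF == 0 and are skipped.
# n + 0 prints 0 instead of an empty string if nothing matched.
out=$(printf '%s\n' 'Line 1' '' 'Line 2' '   ' 'Line 3' |
  awk 'NF {n++} END {print n + 0}')
echo "$out"
```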

8. Find and Replace Text

Searching and replacing text is a common operation. If you have a file with the text “Hello, World!” and you want to change “World” to “Universe,” you can do it like this:

Input (text.txt):

Hello, World!

Command:

awk '{gsub("World", "Universe")}1' text.txt

Output:

Hello, Universe!
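
gsub replaces every match on the line, while sub replaces only the first one. The difference shows up on a hypothetical line that contains the word twice:

```shell
line='World, meet World!'

# sub() stops after the first match; gsub() replaces all of them.
first_only=$(printf '%s\n' "$line" | awk '{sub(/World/, "Universe")} 1')
every=$(printf '%s\n' "$line" | awk '{gsub(/World/, "Universe")} 1')

echo "$first_only"
echo "$every"
```

The trailing 1 is a pattern that is always true, so awk prints each (possibly modified) line.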

9. Removing Duplicate Lines

Removing duplicate lines from a file can help clean up your data. Suppose you have a file with duplicate words:

Input (data.txt):

apple
banana
apple
cherry

Command:

awk '!seen[$0]++' data.txt

Output:

apple
banana
cherry
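
The same idiom can deduplicate on a key instead of the whole line. A sketch assuming hypothetical two-column input, keeping the first line seen for each value in column 1:

```shell
# seen[$1]++ is 0 (false) the first time a key appears, so !seen[$1]++
# is true only for the first occurrence of each key.
out=$(printf '%s\n' 'apple 1' 'banana 2' 'apple 3' |
  awk '!seen[$1]++')
echo "$out"
```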

10. Calculating Column-Wise Sum and Average

Lastly, you can calculate the sum and average of each column in a multi-column file. Suppose you have a file with two columns:

Input (numbers.txt):

10 20
5 15
8 25

Command:

awk '{for(i=1; i<=NF; i++) {sum[i]+=$i}} END {for(i=1; i<=NF; i++) {print "Column", i, "Sum:", sum[i], "Average:", sum[i]/NR}}' numbers.txt

Output:

Column 1 Sum: 23 Average: 7.66667
Column 2 Sum: 60 Average: 20
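
If rows can have different numbers of fields, relying on NF in the END block only reflects the last line. One possible approach, sketched here with hypothetical ragged input, is to track the widest row and a per-column count (maxnf and count are names chosen for this example):

```shell
out=$(printf '%s\n' '10 20' '5 15 7' '8 25' | awk '
{
    # Accumulate a sum and a value count for every column present.
    for (i = 1; i <= NF; i++) { sum[i] += $i; count[i]++ }
    # Remember the largest number of fields seen on any line.
    if (NF > maxnf) maxnf = NF
}
END {
    for (i = 1; i <= maxnf; i++)
        printf "Column %d Sum: %d Average: %.2f\n", i, sum[i], sum[i] / count[i]
}')
echo "$out"
```

Each column's average is taken over the rows that actually have that column, not over all rows.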

These examples show how flexible awk is for text processing and quick data analysis. Whether you need to reshape data, extract specific fields, or run simple calculations, awk is a valuable tool in your command-line toolkit.



denis256 at denis256.dev