Hurry! You've gotten your hands on a large file of leaked records and want to determine if any of your friends are affected:

first_name last_name phone email zip random
Lars Hurley (784)764-9965 [email protected] 56662 1
Solomon Guerra (431)799-5443 [email protected] 89120 4
Stone Grant (322)526-5155 [email protected] 53121 6
Isabelle Massey (731)671-8236 [email protected] 77743 1
Geraldine Cooke (865)364-9487 [email protected] 58123-1293 6
Neve Nicholson (430)324-7527 [email protected] 53124 1
Fredericka Myers (247)982-0158 [email protected] 68376-7256 9
Leo Castaneda (771)652-6444 [email protected] 20158 8
Kane Guerrero (299)545-4314 [email protected] 63028 2
Quinn Forbes (463)614-4569 [email protected] 73599 4

Have no fear, for awk is here! This text-processing utility is great for filtering and extracting data from tabular formats, so here's a cheat sheet:

Extract fields

One field:

awk '{ print $2 }' table.txt
last_name
Hurley
Guerra
Grant
Massey
Cooke
Nicholson
Myers
Castaneda
Guerrero
Forbes

Multiple fields:

awk '{ print $2, $3 }' table.txt
last_name phone
Hurley (784)764-9965
Guerra (431)799-5443
Grant (322)526-5155
Massey (731)671-8236
Cooke (865)364-9487
Nicholson (430)324-7527
Myers (247)982-0158
Castaneda (771)652-6444
Guerrero (299)545-4314
Forbes (463)614-4569

Pretty-printed fields (by piping the awk output to column):

awk '{ print $2, $3 }' table.txt | column -t
last_name  phone
Hurley     (784)764-9965
Guerra     (431)799-5443
Grant      (322)526-5155
Massey     (731)671-8236
Cooke      (865)364-9487
Nicholson  (430)324-7527
Myers      (247)982-0158
Castaneda  (771)652-6444
Guerrero   (299)545-4314
Forbes     (463)614-4569
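The header row tags along in every output above. A minimal sketch for dropping it, assuming the header is always the first line of the file: awk's built-in NR variable holds the current record number, so the pattern NR > 1 skips line one. The sample_table.txt file and its example.com addresses are placeholders made up for illustration, not part of the leaked file.

```shell
# Build a small sample table; the example.com addresses are placeholders.
cat > sample_table.txt <<'EOF'
first_name last_name phone email zip random
Lars Hurley (784)764-9965 lars@example.com 56662 1
Solomon Guerra (431)799-5443 solomon@example.com 89120 4
Stone Grant (322)526-5155 stone@example.com 53121 6
EOF

# NR is the current record (line) number, so NR > 1 skips the header.
no_header=$(awk 'NR > 1 { print $2, $3 }' sample_table.txt)
printf '%s\n' "$no_header"
```

This should print the three last names with their phone numbers and no last_name phone line on top.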

Filter records

The content between the forward slashes is a regex, which awk applies as a filter against the entire record:

awk '/89120/ { print $2, $5 }' table.txt
Guerra 89120

Note that the regex is matched against every field in the record, even the fields that aren't printed:

awk '/89120/ { print $2 }' table.txt
Guerra
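Put differently, a bare /regex/ pattern is shorthand for $0 ~ /regex/, where $0 is the whole record. Here is a small sketch of that equivalence, matching on a first name that is never printed. The sample_table.txt file and its example.com addresses are placeholders made up for illustration.

```shell
# Build a small sample table; the example.com addresses are placeholders.
cat > sample_table.txt <<'EOF'
first_name last_name phone email zip random
Lars Hurley (784)764-9965 lars@example.com 56662 1
Solomon Guerra (431)799-5443 solomon@example.com 89120 4
Stone Grant (322)526-5155 stone@example.com 53121 6
EOF

# A bare /regex/ is shorthand for $0 ~ /regex/ ($0 is the whole record).
short_form=$(awk '/Solomon/ { print $2 }' sample_table.txt)
long_form=$(awk '$0 ~ /Solomon/ { print $2 }' sample_table.txt)

# Both forms select the same record, matched on a field that isn't printed.
printf '%s\n%s\n' "$short_form" "$long_form"
```

Both commands should print Guerra, even though the match happened in the first_name field.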

Filter records by one field

As demonstrated in the previous section, a bare regex matches every record that has a 4 anywhere in it:

awk '/4/ { print $3, $5, $6 }' table.txt | column -t
(784)764-9965  56662       1
(431)799-5443  89120       4
(731)671-8236  77743       1
(865)364-9487  58123-1293  6
(430)324-7527  53124       1
(247)982-0158  68376-7256  9
(771)652-6444  20158       8
(299)545-4314  63028       2
(463)614-4569  73599       4

On the other hand, this filter finds only the records whose 6th field is exactly 4:

awk '$6==4 { print $3, $5, $6 }' table.txt | column -t
(431)799-5443  89120  4
(463)614-4569  73599  4
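The == comparison tests the whole field. To apply a regex to a single field instead, awk has the ~ operator, and conditions can be combined with &&. A rough sketch of both, including the "is my friend in here?" lookup from the intro. The sample_table.txt file and its example.com addresses are placeholders made up for illustration.

```shell
# Build a small sample table; the example.com addresses are placeholders.
cat > sample_table.txt <<'EOF'
first_name last_name phone email zip random
Solomon Guerra (431)799-5443 solomon@example.com 89120 4
Stone Grant (322)526-5155 stone@example.com 53121 6
Neve Nicholson (430)324-7527 neve@example.com 53124 1
Quinn Forbes (463)614-4569 quinn@example.com 73599 4
EOF

# field ~ /regex/ applies the regex to that one field only:
# here, zip codes (5th field) that start with 53.
zip_matches=$(awk '$5 ~ /^53/ { print $2, $5 }' sample_table.txt)
printf '%s\n' "$zip_matches"

# Exact string comparison on two fields: look up one specific person.
friend=$(awk '$1 == "Quinn" && $2 == "Forbes" { print $1, $2, $3 }' sample_table.txt)
printf '%s\n' "$friend"
```

The first command should list Grant and Nicholson with their zip codes, and the second should print Quinn Forbes's row if (and only if) it is in the file.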

Such a simple tool, and yet so useful. I'm feeling inspired and might put together similar cheat sheets for ls, sed, grep, and other common utilities!