
I have a file like this:

ID  A56
DS  /A56
DS  AGE 56

And I'd like to print the whole line only if the second column starts with a capital letter.

Expected output:

ID  A56
DS  AGE 56

What I've tried so far:
awk '$2 ~ /[A-Z]/ {print $0}' file
This prints every line, because the pattern matches a capital letter anywhere in the second column.

awk '$2 /[A-Z]/' file
This gives a syntax error.

dovah

2 Answers

11

Use the regex anchor ^ so that the match must begin at the start of the field:

$ awk '$2 ~ /^[[:upper:]]/' file
ID  A56
DS  AGE 56
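
To see why the anchor matters, here is a minimal sketch that runs both patterns against the sample input from the question. The temporary file and the variable names tmp, unanchored and anchored are only for illustration, and it assumes an awk that understands POSIX character classes such as [[:upper:]].

```shell
# Recreate the sample input from the question in a temporary file.
tmp=$(mktemp)
printf '%s\n' 'ID  A56' 'DS  /A56' 'DS  AGE 56' > "$tmp"

# Unanchored: true if a capital letter occurs anywhere in field 2,
# so "/A56" matches as well and all three lines are printed.
unanchored=$(awk '$2 ~ /[A-Z]/' "$tmp")

# Anchored: true only if field 2 begins with an uppercase letter.
anchored=$(awk '$2 ~ /^[[:upper:]]/' "$tmp")

printf 'unanchored:\n%s\n\nanchored:\n%s\n' "$unanchored" "$anchored"
rm -f "$tmp"
```

The only difference between the two commands is the ^ (and the character class), which is what turns "contains" into "starts with".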
Stéphane Chazelas
cuonglm
5

You could use awk as @cuonglm suggested, or

  1. GNU grep

    grep -P '^[^\s]+\s+[A-Z]' file 
    
  2. Perl

    perl -lane 'print if $F[1]=~/^[A-Z]/' file
    
  3. GNU sed

    sed -rn '/^[^[:space:]]+[[:space:]]+[A-Z]/p' file
    
  4. shell (assumes a recent version of ksh93, zsh or bash)

    while read -r a b; do 
        [[ $b =~ ^[A-Z] ]] && printf "%s %s\n" "$a" "$b"; 
    done < file 
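
Note that the shell loop rebuilds each line from its two parts with a single space, so the original spacing is not preserved. A possible variant that prints the line unchanged is sketched below. It swaps the [[ =~ ]] test for a plain POSIX case pattern so it should also run in sh; the function name print_if_second_upper is made up for this example.

```shell
# Hypothetical helper: reads lines on stdin and prints those whose
# second blank-separated field starts with an uppercase letter.
print_if_second_upper() {
    # "|| [ -n ... ]" also handles a last line without a trailing newline.
    while IFS= read -r line || [ -n "$line" ]; do
        # Split a copy of the line; "$line" itself stays untouched.
        read -r first rest <<EOF
$line
EOF
        case $rest in
            [[:upper:]]*) printf '%s\n' "$line" ;;
        esac
    done
}

printf '%s\n' 'ID  A56' 'DS  /A56' 'DS  AGE 56' | print_if_second_upper
```

With a real file you would call it as print_if_second_upper < file.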
    
cuonglm
terdon
  • That assumes GNU `grep`, GNU `sed` and, for the last one, recent versions of ksh93, zsh or bash, and that the file doesn't contain backslash characters. Except for the `perl` one, what `[A-Z]` matches depends on the locale and doesn't make much sense except in the C locale. – Stéphane Chazelas Jul 25 '14 at 11:49
  • @StéphaneChazelas so `-P` is a GNU extension? OK. Why does the `[A-Z]` not make sense? Presumably, the OP would want whatever is defined as a capital letter in their locale right? I added `-r` for the backslashes. – terdon Jul 25 '14 at 11:56
  • Backslashes are still an issue with Unix-conformant `echo`s. Yes, `-P` is a GNU extension, though it's also found in some BSDs that have forked or rewritten their GNU grep. See [there](http://unix.stackexchange.com/a/87763/22565) for `[A-Z]`. Also note that `\s` is `[[:space:]]`, not `[[:blank:]]`. – Stéphane Chazelas Jul 25 '14 at 12:03
  • @StéphaneChazelas I see, thanks. I switched to `printf` then, just in case. – terdon Jul 25 '14 at 12:09
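
Putting the portability points from these comments together, a sketch that avoids the GNU-only pieces might look like the following: grep -E instead of -P, [[:blank:]] instead of \s, and [[:upper:]] instead of [A-Z]. The optional leading [[:blank:]]* is an assumption added here to mimic awk, which ignores blanks before the first field; the variable names sample and portable are just for illustration.

```shell
# Sample input from the question.
sample=$(printf '%s\n' 'ID  A56' 'DS  /A56' 'DS  AGE 56')

# First field, one or more blanks, then an uppercase letter as defined
# by the current locale. Only POSIX ERE features are used.
portable=$(printf '%s\n' "$sample" |
    grep -E '^[[:blank:]]*[^[:blank:]]+[[:blank:]]+[[:upper:]]')

printf '%s\n' "$portable"
```

On a real file this would be grep -E with the same pattern followed by the file name.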