The following behavior occurs in OS X (10.9.5) with the BSD utils, but not Linux (Ubuntu 15.10) with the GNU utils.

OS X:

 $ man -v
 man, version 1.6c

Linux:

 $ man --version
 man 2.7.4

I'm trying to write a shell script that automatically parses the supported options for a given program, for use in bash/tcsh autocompletion. For instance, here are the NAME and SYNOPSIS sections output by man perl when viewed in the terminal.

$ man perl | cat

NAME
       perl - The Perl 5 language interpreter

SYNOPSIS
       perl [ -sTtuUWX ]      [ -hv ] [ -V[:configvar] ]
            [ -cw ] [ -d[t][:debugger] ] [ -D[number/list] ]
            [ -pna ] [ -Fpattern ] [ -l[octal] ] [ -0[octal/hexadecimal] ]
            [ -Idir ] [ -m[-]module ] [ -M[-]'module...' ] [ -f ]
            [ -C [number/list] ]      [ -S ]      [ -x[dir] ]
            [ -i[extension] ]
            [ [-e|-E] 'command' ] [ -- ] [ programfile ] [ argument ]...

$ man perl > /tmp/perl.out

I first got suspicious when

$ </tmp/perl.out awk '/SYNOPSIS/ {print}'

didn't print any lines. Then I opened the file in vi and saw this. Why the heck are there so many unnecessary duplicated characters and ^H characters?

N^HNA^HAM^HME^HE
       perl - The Perl 5 language interpreter

S^HSY^HYN^HNO^HOP^HPS^HSI^HIS^HS
       p^Hpe^Her^Hrl^Hl [ -^H-s^HsT^HTt^Htu^HuU^HUW^HWX^HX ]      [ -^H-h^Hhv^Hv ] [ -^H-V^HV[:_^Hc_^Ho_^Hn_^Hf_^Hi_^Hg_^Hv_^Ha_^Hr] ]
            [ -^H-c^Hcw^Hw ] [ -^H-d^Hd[t^Ht][:_^Hd_^He_^Hb_^Hu_^Hg_^Hg_^He_^Hr] ] [ -^H-D^HD[_^Hn_^Hu_^Hm_^Hb_^He_^Hr_^H/_^Hl_^Hi_^Hs_^Ht] ]
DisplayName
Greg Nisbet
  • I started fiddling with this for my own purposes a while ago, but, while the overstriking looks daunting, it's very feasible to parse it and extract the formatting information as well as literal text. – LiberalArtist Apr 25 '16 at 05:37

1 Answer

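The output of man is not plain text. Bold is rendered by overstriking: each character is written, followed by a backspace (^H), then written again. Underlining is an underscore, a backspace, then the character. That is why SYNOPSIS shows up as S^HSY^HY... and the awk pattern never matches. Pipe the output through col -b to strip the backspaces: use man perl | col -b instead.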
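
To see what that filtering does without depending on a particular man page, here is a minimal sketch that builds an overstruck sample line with printf and strips it with sed. The sed command is a stand-in for col -b (an assumption for systems where col is missing), and the names bs, sample, strip_overstrike and cleaned are made up for illustration.

```shell
# Literal backspace character, obtained portably via printf.
bs=$(printf '\b')

# A sample line in nroff's terminal encoding:
# bold is "char, backspace, char"; underline is "_, backspace, char".
sample=$(printf 'S\bSY\bYN\bNO\bOP\bPS\bSI\bIS\bS [:_\bc_\bo_\bn_\bf]')

# Delete every "any character + backspace" pair, keeping the glyph
# that was written last in each column.
strip_overstrike() {
    sed "s/.${bs}//g"
}

cleaned=$(printf '%s\n' "$sample" | strip_overstrike)
printf '%s\n' "$cleaned"
```

On real output the same idea would be man perl | col -b | awk '/SYNOPSIS/ {print}', which should now find the heading.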

DisplayName