
I have a comma-separated string that may contain quoted elements with embedded commas. For example:

issuer=C = US, O = "DigiCert, Inc.", CN = DigiCert High Assurance TLS Hybrid ECC SHA256 2020 CA1

I would like to extract the different elements ignoring the quoted comma (DigiCert, Inc.).

The script should be POSIX-compliant and run on non-GNU systems.

Kusalananda
Matteo
  • Could you post the expected output? – schrodingerscatcuriosity Nov 09 '21 at 15:03
  • @Quasímodo It is not about reinventing the wheel. The script is used on several Unix variants (including BSD) where GNU tools are not available, as well as on small appliances running Alpine Linux, where the list of available tools is limited. Forcing users to install a lot of dependencies is not the best option. – Matteo Nov 09 '21 at 15:12
  • @Quasímodo I don't need a general solution. I just need to parse the issuers from TLS certificates as in the example. Quotes cannot be nested and are always balanced. – Matteo Nov 09 '21 at 15:20
  • Does this answer your question? [Is there a robust command line tool for processing csv files?](https://unix.stackexchange.com/questions/7425/is-there-a-robust-command-line-tool-for-processing-csv-files) See in particular the answers on `csvtool` and `miller`. – AdminBee Nov 09 '21 at 15:28
  • If that line is the output of `openssl x509 -issuer -noout`, see also the `-nameopt` option there. Like: `openssl x509 -noout -nameopt sep_multiline,utf8,esc_ctrl -issuer` to make it easier to parse. – Stéphane Chazelas Nov 09 '21 at 22:34
  • BTW, that's not CSV, that looks more like some representation of a Distinguished Name – Stéphane Chazelas Nov 09 '21 at 22:45
  • @StéphaneChazelas Thanks! I was focussed on parsing the output without thinking about changing the input :-) – Matteo Nov 11 '21 at 11:05

1 Answer

Given that you don't want a general solution, i.e. you're after a hack rather than something robust, the following is pretty hack-ish and yet produces the right output for the organization (`O`) field, at least if the sample input you give is the most complex case you can reasonably encounter:

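#!/usr/bin/env bash

set -o posix

# Keep only the "Issuer:" lines, then print the value of the O field,
# which is either a double-quoted string or a run of non-comma characters.
grep '^[[:blank:]]*Issuer:' |
sed -Ee 's/^.* O[[:blank:]]*=[[:blank:]]*("[^"]*"|[^",]*),.*/\1/'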

Even as a hack, I'm certain this could be improved, if one had the need.

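The above is nearly POSIX-compliant (the exceptions being the shebang line and the `-E` option to `sed`, which POSIX did not specify at the time of writing), and runs on my non-GNU system.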

$ grep -w Issuer: /usr/local/etc/ssl/cert.pem | head -5; \
    echo '...'; grep -w Issuer: /usr/local/etc/ssl/cert.pem | tail -5
        Issuer: C = ES, O = FNMT-RCM, OU = AC RAIZ FNMT-RCM
        Issuer: C = ES, O = FNMT-RCM, OU = Ceres, organizationIdentifier = VATES-Q2826004J, CN = AC RAIZ FNMT-RCM SERVIDORES SEGUROS
        Issuer: CN = ACCVRAIZ1, OU = PKIACCV, O = ACCV, C = ES
        Issuer: C = IT, L = Milan, O = Actalis S.p.A./03358520967, CN = Actalis Authentication Root CA
        Issuer: C = US, O = AffirmTrust, CN = AffirmTrust Commercial
...
        Issuer: C = US, ST = New Jersey, L = Jersey City, O = The USERTRUST Network, CN = USERTrust ECC Certification Authority
        Issuer: C = US, ST = New Jersey, L = Jersey City, O = The USERTRUST Network, CN = USERTrust RSA Certification Authority
        Issuer: C = US, O = "VeriSign, Inc.", OU = VeriSign Trust Network, OU = "(c) 1999 VeriSign, Inc. - For authorized use only", CN = VeriSign Class 1 Public Primary Certification Authority - G3
        Issuer: C = US, O = "VeriSign, Inc.", OU = VeriSign Trust Network, OU = "(c) 1999 VeriSign, Inc. - For authorized use only", CN = VeriSign Class 2 Public Primary Certification Authority - G3
        Issuer: C = US, OU = www.xrampsecurity.com, O = XRamp Security Services Inc, CN = XRamp Global Certification Authority
$ ./test.sh < /usr/local/etc/ssl/cert.pem | head -5; \
    echo '...'; ./test.sh < /usr/local/etc/ssl/cert.pem | tail -5
FNMT-RCM
FNMT-RCM
ACCV
Actalis S.p.A./03358520967
AffirmTrust
...
The USERTRUST Network
The USERTRUST Network
"VeriSign, Inc."
"VeriSign, Inc."
XRamp Security Services Inc
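
The `sed` expression above pulls out only the `O` field. For splitting the whole line into its elements while ignoring commas inside quotes, a minimal sketch using plain `awk` might look like the following. The function name `split_dn` is made up for illustration, and the sketch assumes what the question's comments state: quotes are always balanced and never nested. It does not handle backslash-escaped quotes.

```sh
#!/bin/sh
# split_dn (hypothetical helper): read "key = value, key = value" lines on
# stdin and print one element per line. Commas inside double quotes are kept
# as literal text; the quote characters themselves are dropped.
split_dn() {
    awk '
    {
        # Drop a leading "issuer=" or "Issuer:" label, if present.
        sub(/^[ \t]*[Ii]ssuer(=|:)[ \t]*/, "")
        field = ""
        inquote = 0
        n = length($0)
        for (i = 1; i <= n; i++) {
            c = substr($0, i, 1)
            if (c == "\"") {
                inquote = !inquote          # toggle quoted state
            } else if (c == "," && !inquote) {
                emit(field)                 # unquoted comma ends an element
                field = ""
            } else {
                field = field c
            }
        }
        emit(field)                         # last element has no trailing comma
    }
    function emit(s) {
        # Trim surrounding blanks and skip empty elements.
        sub(/^[ \t]+/, "", s)
        sub(/[ \t]+$/, "", s)
        if (s != "") print s
    }'
}
```

Fed the sample line from the question, e.g. with `printf '%s\n' "$line" | split_dn`, this prints `C = US`, `O = DigiCert, Inc.` and the `CN = ...` element on three separate lines. Walking the string character by character is slower than a regex, but it avoids relying on `sed -E` or any GNU extension.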
Jim L.
  • Note that POSIX doesn't specify a `bash` utility nor the path of the `env` utility nor the shebang mechanism nor the `-E` option to `sed` (yet), nor `head -5` nor `tail -5`. – Stéphane Chazelas Nov 09 '21 at 18:39
  • @StéphaneChazelas Point taken. The shebang was added only to assuage `shellcheck.net`. `head` and `tail` do not appear in the solution. They're used only to make explicit the edits to the posted input and output. – Jim L. Nov 09 '21 at 18:53
  • I changed the `-nameopt` parameters as suggested by @StéphaneChazelas, but thanks anyway for the answer! – Matteo Nov 11 '21 at 11:06
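
For readers who follow the `-nameopt sep_multiline` suggestion from the comments, here is a rough sketch of how the multi-line output could be parsed. It assumes that `openssl` prints a label line (`issuer=`) followed by one indented `NAME=value` line per attribute; check this against the output of your own `openssl` version. The function name `dn_field` is hypothetical.

```sh
#!/bin/sh
# dn_field NAME (hypothetical helper): read multi-line DN output on stdin,
# assumed to look like
#     issuer=
#         C=US
#         O=DigiCert, Inc.
# and print the value of every attribute called NAME.
dn_field() {
    awk -v name="$1" '
    /^[ \t]/ {                          # attribute lines are indented
        line = $0
        sub(/^[ \t]+/, "", line)
        eq = index(line, "=")           # split at the first "=" only
        if (eq == 0) next
        key = substr(line, 1, eq - 1)
        val = substr(line, eq + 1)
        sub(/[ \t]+$/, "", key)         # tolerate "C = US" as well as "C=US"
        sub(/^[ \t]+/, "", val)
        if (key == name) print val
    }'
}
```

A call would then look like `openssl x509 -noout -nameopt sep_multiline,utf8,esc_ctrl -issuer -in cert.pem | dn_field O`, where `cert.pem` stands in for your certificate file. Because each attribute sits on its own line, commas in the value need no special treatment.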