zgrep (the one shipped with gzip) is a shell script which in the end does something like zcat | grep. The one from zutils does the same except it's written in C++ and supports more compression formats. It still calls gzip and grep in separate processes, connected with a pipe.
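To make that concrete, here is a minimal sketch of the pipeline zgrep boils down to for a single file. The function name my_zgrep is made up for illustration; the real script also handles options, several files and other details that are left out here.

```shell
# Rough single-file equivalent of zgrep: one process decompresses,
# another searches, and a pipe connects the two.
# my_zgrep is a hypothetical name, not part of gzip or zutils.
# Usage: my_zgrep PATTERN FILE.gz
my_zgrep() {
  gzip -dc -- "$2" | grep -e "$1"
}
```

Called as my_zgrep '^af' a.gz, it starts one gzip and one grep process, which is why the cost of decompression dominates for a simple pattern.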
With such a simple search, grep has a much easier job than zcat, so if you keep the same approach to organising your data, I would suggest focusing on the compression side of things, that is, on making decompression cheaper.
Here, working on a file generated with xxd -p -c35 < /dev/urandom | head -n 760000 | sort and compressed with gzip, I find that using pigz -dc instead of zcat (aka gzip -dc) speeds things up by a factor of almost 2 (about 1.6 in the timings below).
Compressing it with lz4 --best instead, I get a file about 30% bigger than the gzip one, but decompression time is reduced roughly 7-fold compared with gzip -dc:
$ zstat +size a*(m-1)| sort -k2n | column -t
a.xz    26954744
a.lrz   26971363
a.bz2   27412562
a.gz    30353089
a.gz3   30727911
a.lzop  38000050
a.lz4   40261510
a       53960000
$ time lz4cat a.lz4 > /dev/null
lz4cat a.lz4 > /dev/null 0.06s user 0.01s system 98% cpu 0.064 total
$ time pigz -dc a.gz > /dev/null
pigz -dc a.gz > /dev/null 0.36s user 0.02s system 126% cpu 0.298 total
$ time gzip -dc a.gz > /dev/null
gzip -dc a.gz > /dev/null 0.47s user 0.00s system 99% cpu 0.476 total
$ time lz4cat a.lz4 | LC_ALL=C grep '^af' > /dev/null
lz4cat a.lz4 0.07s user 0.02s system 60% cpu 0.142 total
LC_ALL=C grep '^af' > /dev/null 0.07s user 0.00s system 53% cpu 0.141 total
$ time pigz -dc a.gz | LC_ALL=C grep '^af' > /dev/null
pigz -dc a.gz 0.36s user 0.04s system 130% cpu 0.303 total
LC_ALL=C grep '^af' > /dev/null 0.06s user 0.01s system 23% cpu 0.302 total
$ time gzip -dc a.gz | LC_ALL=C grep '^af' > /dev/null
gzip -dc a.gz 0.51s user 0.00s system 99% cpu 0.513 total
LC_ALL=C grep '^af' > /dev/null 0.08s user 0.01s system 16% cpu 0.512 total
lzop --best is not far behind lz4, and compresses slightly better on my sample.
$ time lzop -dc a.lzop | LC_ALL=C grep '^af' > /dev/null
lzop -dc a.lzop 0.24s user 0.01s system 85% cpu 0.293 total
LC_ALL=C grep '^af' > /dev/null 0.07s user 0.01s system 27% cpu 0.292 total
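If the files end up in a mix of formats, the pipelines timed above can be wrapped in a small helper that picks the decompressor from the file extension. This is only a sketch: zdecompress and zsearch are hypothetical names, the extension list covers just the formats compared here, and pigz is used for .gz only when it happens to be installed.

```shell
# Hypothetical helper: write the decompressed contents of file $1 to stdout,
# choosing the tool from the file extension.
zdecompress() {
  case $1 in
    *.lz4) lz4 -dc < "$1" ;;
    *.lzo | *.lzop) lzop -dc < "$1" ;;
    *.gz)
      # Prefer pigz when available, fall back to plain gzip otherwise.
      if command -v pigz > /dev/null 2>&1; then
        pigz -dc < "$1"
      else
        gzip -dc < "$1"
      fi ;;
    *) echo "zdecompress: unsupported extension: $1" >&2; return 2 ;;
  esac
}

# Hypothetical wrapper. Usage: zsearch PATTERN FILE
# LC_ALL=C keeps grep in the plain byte locale, as in the timings above.
zsearch() {
  zdecompress "$2" | LC_ALL=C grep -e "$1"
}
```

For example, zsearch '^af' a.lz4 runs the same lz4 | grep pipeline as timed above. For an unsupported extension, zdecompress prints an error and grep sees empty input, so zsearch returns grep's "no match" status.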