I have a large utf-8 text file which I frequently search with grep. Recently grep began reporting that it was a binary file. I can continue to search it with grep -a, but I was wondering what change made it decide that the file was now binary.
I have a copy from last month which is not detected as binary, but it's not practical to diff the two since they differ on more than 20,000 lines.
file identifies my file as:
UTF-8 Unicode English text, with very long lines
How can I find the characters/lines/etc. in my file which are triggering this change?
The similar, non-duplicate question 19907 covers the possibility of NUL, but grep -Pc '[\x00-\x1F]' says that I don't have NUL or any other ASCII control characters.
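
One way to narrow this down, as a rough sketch: GNU grep treats a file as binary not only when it contains NUL bytes but also when it contains bytes that are improperly encoded for the current locale, so a stray byte sequence that is not valid UTF-8 could be the trigger. The script below assumes that this is the cause and that the file fits in memory. The function name find_suspect_lines is hypothetical, chosen only for illustration. It reports each line that contains a NUL byte or fails to decode as UTF-8.

```python
import sys


def find_suspect_lines(data: bytes):
    """Return (line_number, reason) pairs for lines that contain a NUL
    byte or are not valid UTF-8. Line numbers are 1-based."""
    suspects = []
    # Splitting on b"\n" is safe for UTF-8: the byte 0x0A never occurs
    # inside a multi-byte sequence.
    for lineno, line in enumerate(data.split(b"\n"), start=1):
        if b"\x00" in line:
            suspects.append((lineno, "NUL byte"))
            continue
        try:
            line.decode("utf-8")
        except UnicodeDecodeError as exc:
            bad = line[exc.start:exc.end]
            suspects.append(
                (lineno, f"invalid UTF-8 at byte offset {exc.start}: {bad!r}")
            )
    return suspects


if __name__ == "__main__":
    # Assumed usage: python3 find_suspects.py FILE
    if len(sys.argv) == 2:
        with open(sys.argv[1], "rb") as fh:
            for lineno, reason in find_suspect_lines(fh.read()):
                print(f"{lineno}: {reason}")
```

If this prints nothing, the file is valid UTF-8 with no NUL bytes, and the cause is more likely the locale grep runs under (for example, a C locale in which every non-ASCII byte counts as an encoding error) than the file contents.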