Note that pcregrep (from the PCRE library) and GNU grep -P (when built with PCRE support) both accept perl-like regular expressions, and grep -P handles UTF-8 data correctly in UTF-8 locales.
If you want to use perl instead, you can define a script or a function to do so. Aliases won't do, as an alias only replaces one string with another.
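As a rough illustration of what -P adds, here is a small sketch using a lookahead, a perl-style construct that POSIX extended regexps lack. The sample data and the temporary file are made up for the example, and the snippet falls back to a message when the local grep has no -P support:

```shell
# Hypothetical sample input in a temporary file
tmp=$(mktemp)
printf '%s\n' 'foo123' 'foobar' 'bar456' > "$tmp"

# (?=\d) is a lookahead: match "foo" only when a digit follows it
if printf 'x\n' | grep -P 'x' >/dev/null 2>&1; then
  matches=$(grep -P '^foo(?=\d)' "$tmp")
else
  matches='grep -P unavailable'
fi
printf '%s\n' "$matches"
rm -f -- "$tmp"
```

Where pcregrep is installed, the same pattern should work there without the -P flag.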
You could do:
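perlgrep() (
  # The first argument is the regexp (the call aborts if it is missing);
  # it is handed to perl through the environment.
  export RE="${1?}"; shift
  exec perl -Mopen=locale -Twnle '
    BEGIN {$ret = 1; $several_files = @ARGV > 1}
    if (/$ENV{RE}/) {
      $ret = 0;
      print $several_files ? "$ARGV: $_" : $_
    }
    END {exit $ret}' -- "$@"
)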
But beware of the implications of running perl -n on arbitrary file names: each name is opened with perl's two-argument open, so names with a trailing | or a leading >, for instance, are treated specially. The -T option above only partly mitigates that.
Also, with -Mopen=locale, we're decoding the input and encoding the output as per the locale's charset, but the file names themselves get encoded on output without having been decoded first, which means that file names with byte values above 127 won't be output properly unless the locale's charset is ISO-8859-1.
In the end, you only need to decode the lines of input, and only for matching. You don't need to re-encode them, nor to decode or encode the file names.
So, instead, with recent versions of perl (5.22 or later, for the <<>> operator), you could do:
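#! /usr/bin/perl --
use warnings;
use strict;
use Encode::Locale;
use Encode;

die "Usage: $0 regexp [file...]\n" unless @ARGV;

# Decode the pattern as well, so that non-ASCII characters in it
# are compared against the decoded input as characters, not bytes.
my $re = decode(locale => shift @ARGV);
my $several_files = @ARGV > 1;
my $ret = 1;

# <<>> treats every remaining argument strictly as a file name.
while (<<>>) {
  # Match against the decoded line, but print the original bytes.
  if (decode(locale => $_) =~ $re) {
    $ret = 0;
    print $several_files ? "$ARGV: $_" : $_
  }
}
exit $ret;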
To prevent arbitrary code injection from the arguments, perl disables regexp operators such as (?{code}) and (??{code}) in patterns that are interpolated at run time. If you want them back, you can add use re 'eval'; towards the top of that script.