15

In bash, $0 contains the name of the script. In awk, however, if I create a script named myscript.awk with the following content:

#!/usr/bin/awk -f
BEGIN{ print ARGV[0] }

and run it, it prints only "awk". Besides, ARGV[i] with i>0 holds only the command-line arguments passed to the script. So how can I make it print the name of the script, in this case "myscript.awk"?
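To make the behaviour concrete, here is a minimal sketch that dumps the whole ARGV array. The file name argv-demo.awk and the operands "one" and "two" are made up for illustration. The script path given to -f does not show up anywhere in the array, and the exact value of ARGV[0] may differ between systems.

```shell
# Hypothetical demo: dump ARGV for an awk program loaded with -f.
dir=$(mktemp -d)
cat > "$dir/argv-demo.awk" <<'EOF'
BEGIN { for (i = 0; i < ARGC; i++) print i ": " ARGV[i] }
EOF

# ARGV[0] is the interpreter name (its exact value varies by system);
# ARGV[1] and up hold only the operands, never the -f script path.
argv_dump=$(awk -f "$dir/argv-demo.awk" one two)
printf '%s\n' "$argv_dump"
rm -rf "$dir"
```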

cuonglm
cipper
  • I've changed the title from awk to mawk because all the solutions require gawk and don't work with general awk, and in particular with mawk which is widely used (e.g. default on Ubuntu) – cipper Sep 08 '15 at 07:01
  • What makes you think `mawk` is default on Ubuntu? On my 15.04 VM, the default `awk` is `gawk`. While mawk is installed it is not the default. – terdon Sep 08 '15 at 11:58
  • What you have posted is a shell script, not an awk script, so it should be named "myscript.sh" or similar, not "myscript.awk". The fact that you are calling awk inside that shell script is completely irrelevant - you could replace awk with perl or a bunch of shell commands and the script would have the same functionality and it'd still be a shell script. – Ed Morton Sep 08 '15 at 14:24
  • 1
    It's an awk script if you call it by `awk -f myscript.awk`. However, this is unrelated to the problem in question. – cipper Sep 08 '15 at 15:55
  • 1
    @EdMorton It's an `awk` script because it begins with `#!/usr/bin/awk -f`. Shell scripts begin with `#!/bin/sh` (or something similar). – Barmar Sep 09 '15 at 19:43
  • 1
    I've been talking to various shell experts and trying to get a definitive answer on whether that's a shell or awk script and surprisingly according to POSIX the interpretation of files that begin with #! is undefined and has no specific type name. While some people refer to it as a "hash bang interpreter script" rather than a shell or awk script, the consensus seems to be that it should be considered an awk script even though the kernel (not shell) interprets the first line because awk still has to be able to parse that first line too (as a comment) and you can execute it using `awk -f file`. – Ed Morton Sep 10 '15 at 22:18

5 Answers

7

I don't know of any direct way to get the command name from within awk. You can, however, find it through a sub-shell.

gawk

With GNU awk and the ps command you can use the process ID from PROCINFO["pid"] to retrieve the command name as a workaround. For example:

cmdname.awk

#!/usr/bin/gawk -f

BEGIN {
  ("ps -p " PROCINFO["pid"] " -o comm=") | getline CMDNAME
  print CMDNAME
}

mawk and nawk

You can use the same approach, but derive awk's PID from the $PPID special shell variable (the PID of the sub-shell's parent, which is awk itself):

cmdname.awk

#!/usr/bin/mawk -f

BEGIN { 
  ("ps -p $PPID -o comm=") | getline CMDNAME
  print CMDNAME
}

Testing

Run the script like this:

./cmdname.awk

Output in both cases:

cmdname.awk
Thor
  • I got an error: /bin/sh: 1: -o: not found – cipper Sep 07 '15 at 16:28
  • @cipper: This only works with GNU awk, I added the missing shebang line. – Thor Sep 07 '15 at 16:50
  • From [gawk manual](https://www.gnu.org/software/gawk/manual/html_node/Getline_002fPipe.html#Getline_002fPipe): _According to POSIX, ‘expression | getline’ is ambiguous if expression contains unparenthesized operators other than ‘$’—for example, ‘"echo " "date" | getline’ is ambiguous because the concatenation operator is not parenthesized. You should write it as ‘("echo " "date") | getline’ if you want your program to be portable to all awk implementations._ – cipper Sep 07 '15 at 16:50
  • but we cannot use parentheses in your method because PROCINFO would be interpreted as bash command as well... – cipper Sep 07 '15 at 16:52
  • @cipper: The `PROCINFO` hash is GNU awk specific, so this only works with GNU awk. – Thor Sep 07 '15 at 16:55
  • On default ubuntu it does not work. It would be nice to find a portable solution. – cipper Sep 07 '15 at 17:00
  • @cipper: then use the answer by _cuonglm_. – Thor Sep 07 '15 at 17:02
  • 1
    If it needs `gawk` it is a `gawk` solution instead of an `awk` solution. I think @cipper should add his wish "a portable solution" to the question. –  Sep 07 '15 at 17:30
  • 1
    @Thor: the answer by cuonglm is not a solution since it requires to feed manually the script with its name. It's like calling `awk -vNAME="myscript.awk" ./myscript.awk` and then print NAME inside the script. Not a solution. – cipper Sep 08 '15 at 06:52
  • @cipper: You can get the process id through a sub-shell in mawk. See the edit. – Thor Sep 08 '15 at 13:18
  • I just found out you were already using the PROCINFO approach, which I didn't see when I posted my answer. I will keep mine since I use `system()`, but of course +1 yours! – fedorqui Sep 08 '15 at 13:53
  • @fedorqui: no worries, you also quoted the relevant manual bits and showed an alternative use of `ps`. – Thor Sep 08 '15 at 13:57
  • Why not simply `"ps -p $PPID -o comm=" | getline` – Stéphane Chazelas Jan 11 '17 at 12:32
  • @StéphaneChazelas: Not sure why I used two sub-shells. I have updated the answer. Thanks. – Thor Jan 11 '17 at 21:21
  • Note that there's nothing Linux specific in there. The only non-POSIX feature is gawk's `PROCINFO["pid"]` here. – Stéphane Chazelas Jan 11 '17 at 21:34
  • @StéphaneChazelas: The wording is better now thanks. You are right as well about it not being Linux specific, the solution only requires a way to get at the process id and associated command name. – Thor Jan 12 '17 at 01:57
7

With GNU awk 4.1.3 in bash on cygwin:

$ cat tst.sh
#!/bin/awk -f
BEGIN { print "Executing:", ENVIRON["_"] }

$ ./tst.sh
Executing: ./tst.sh

I don't know how portable that is. As always, though, I wouldn't run an awk script through a shebang, as it just robs you of possible functionality. Keep it simple and call awk from a shell script instead:

$ cat tst2.sh
awk -v cmd="$0" '
BEGIN { print "Executing:", cmd }
' "$@"

$ ./tst2.sh
Executing: ./tst2.sh

The latter will work with any modern awk in any shell on any platform.

Jeff Schaller
Ed Morton
  • Note that the first one only works in bash, zsh or ksh. The latter is about a shell script, not an awk script. – cuonglm Sep 08 '15 at 14:27
  • The OP simply wants to print the name of a shell script which he has very unfortunately named `myscript.awk`. – Ed Morton Sep 08 '15 at 15:13
  • Let @cipper clarify. – cuonglm Sep 08 '15 at 15:35
  • 2
    Thank you! `ENVIRON["_"]` works perfectly, and it doesn't call any external program. The second option `awk -v ...` depends on how one runs the script; I don't want this. – cipper Sep 08 '15 at 16:04
  • @cipper: Don't rely on it. As I said in my comment above, it only works with bash, zsh, or ksh variants. – cuonglm Sep 08 '15 at 16:16
  • 2
    Calling your script `tst.sh` is misleading. It's an `awk` script, not a shell script. `BEGIN` is not a valid shell command. – Barmar Sep 09 '15 at 19:45
  • `ENVIRON` is very portable. It is a [special variable](http://pubs.opengroup.org/onlinepubs/9699919799/utilities/awk.html#tag_20_06_13_03) defined in the POSIX standard itself. – Adam Katz Oct 10 '17 at 17:30
  • 1
    Right, but the portability question isn't "is ENVIRON[] portable", it's "does `ENVIRON["_"]` produce the calling shell script path when printed from every awk called via a shebang from every shell"? I would never call an awk script from a shebang so I personally don't care about the answer, but I just thought I'd mention it. Oh, I see in the comments above that @cuonglm answered that it's only supported in some shells. – Ed Morton Oct 10 '17 at 18:08
  • 1
    Good point, @Ed. Verified as failing in dash (which returns the _previous_ command (or else the shell itself) rather than the current one). ksh93 interestingly prefixes the PID in asterisks, e.g. `*12345*/tmp/test.awk`. `ARGV[0]` is reliably always `awk` in dash, bash, zsh, and ksh93. – Adam Katz Oct 11 '17 at 15:43
  • Instead of `-v cmd="$0"`, you could use a more complex version that 1) takes non-GNU environments into account (though I still assume you use bash, ymmv) and 2) finds, if possible, the name of a sourced script, whether the script is invoked with a full path, from $PATH, etc.: `truename=${BASH_SOURCE[0]:-$0}; name="$( basename "$truename" )"; dir="$( cd -P "$( dirname "$truename" )" && pwd)"; awk -v cmd="${dir}/${name}" '....' "$@"` – Olivier Dulac Jun 15 '23 at 09:09
5

I don't think this is possible as per gawk documentation:

Finally, the value of ARGV[0] (see section 7.5 Built-in Variables) varies depending upon your operating system. Some systems put awk there, some put the full pathname of awk (such as /bin/awk), and some put the name of your script ('advice'). Don't rely on the value of ARGV[0] to provide your script name.

On Linux you can try a somewhat dirty hack. As Stéphane Chazelas pointed out in the comments, it works only if the awk implementation supports NUL bytes:

#!/usr/bin/awk -f

BEGIN { getline t < "/proc/self/cmdline"; split(t, a, "\0"); print a[3]; }
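For awk implementations that cannot split on NUL bytes, one possible workaround is to let a sub-shell translate the NULs to newlines first. This is only a sketch: it is Linux-only, it assumes the script is started as awk -f script (or through a "#!... -f" shebang) so that the script path is the third command-line word, and it breaks on paths containing newlines. $PPID in the sub-shell is awk's own PID, as in Thor's answer. The file name self.awk is hypothetical.

```shell
# Linux-only sketch: read awk's own command line from /proc,
# converting the NUL separators to newlines in a sub-shell.
dir=$(mktemp -d)
cat > "$dir/self.awk" <<'EOF'
BEGIN {
    # The sub-shell's parent ($PPID) is this awk process.
    cmd = "tr '\\0' '\\n' < /proc/$PPID/cmdline"
    # Words: 1 = interpreter, 2 = -f, 3 = script path.
    for (n = 1; n <= 3 && (cmd | getline word) > 0; n++)
        name = word
    close(cmd)
    # n reaches 4 only if all three words were read.
    if (n == 4) print name
}
EOF
self_name=$(awk -f "$dir/self.awk")
printf '%s\n' "$self_name"
rm -rf "$dir"
```

Reading line by line with getline avoids split() on "\0" entirely, at the cost of spawning tr.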
taliezin
  • your script as is seems not working. It just prints "k" if called with "awk -f script.awk", and it prints "s" if called by "./script.awk" – cipper Sep 07 '15 at 16:35
  • @cipper: Here it works with `gawk` and fails (like your description) with `mawk`. Interesting! –  Sep 07 '15 at 16:51
  • It works for me on Linux with `awk` 4.0.2. On FreeBSD, with `/proc/curproc/cmdline`, the result with `awk` is like yours, but it works with `gawk`. – taliezin Sep 07 '15 at 16:52
  • On default ubuntu it does not work. It would be nice to find a portable solution. – cipper Sep 07 '15 at 16:58
  • @cipper, consider cuonglm answer. – taliezin Sep 07 '15 at 17:01
  • 1
    @taliezin: the answer by cuonglm is not a solution since it requires to feed manually the script with its name. It's like calling `awk -vNAME="myscript.awk" ./myscript.awk` and then print NAME inside the script. Not a solution. – cipper Sep 08 '15 at 06:54
  • @cipper, I see, but I think there is no portable solution. – taliezin Sep 08 '15 at 06:56
  • You need an implementation of `awk` that supports NUL bytes. So only gawk and recent versions of mawk. – Stéphane Chazelas Jan 11 '17 at 12:33
  • @StéphaneChazelas, I added this to the answer. – taliezin Jan 11 '17 at 12:49
  • Use this instead: no need to parse null bytes: `BEGIN { getline < "/proc/self/comm"; print }`. You can also assign the value to a variable from `$0`. – Thomas Guyot-Sionnest Mar 18 '22 at 06:41
5

With POSIX awk:

#!/usr/bin/awk -f

BEGIN {
    print ENVIRON["AWKSCRIPT"]
}

Then:

AWKSCRIPT=test.awk ./test.awk
test.awk
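One reason to prefer the environment over -v when passing a name in: POSIX awk processes backslash escape sequences in -v assignments, but not in ENVIRON values. A small sketch, using a made-up file name that contains a backslash:

```shell
# Hypothetical name containing a backslash sequence.
name='my\tscript.awk'

# -v assignment: awk interprets \t as a TAB character.
via_v=$(awk -v n="$name" 'BEGIN { print n }')

# Environment: the value arrives byte for byte.
via_env=$(AWKSCRIPT="$name" awk 'BEGIN { print ENVIRON["AWKSCRIPT"] }')

printf '%s\n' "$via_v" "$via_env"
```

The first line printed contains a real TAB, while the second keeps the literal backslash and "t".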
cuonglm
  • 6
    You manually feed the name of the script in it, this is not a self-printing way – cipper Sep 07 '15 at 16:33
  • @cipper: Well, that's the easiest and portable way I can imagine. – cuonglm Sep 07 '15 at 16:34
  • 5
    It's like calling `awk -vNAME="myscript.awk" ./myscript.awk` and then print the variable `NAME` inside the script. Not a solution. – cipper Sep 08 '15 at 06:59
  • @cipper: That's the only way, if you mention `mawk`. Also, using `ENVIRON` isn't the same as using `-vNAME="myscript.awk"`, since `mawk` will expand escape sequences in `NAME`. – cuonglm Sep 08 '15 at 07:08
4

Using GNU awk

Checking the GNU awk user's guide - 7.5.2 Built-in Variables That Convey Information I stumbled upon:

PROCINFO

The elements of this array provide access to information about the running awk program. The following elements (listed alphabetically) are guaranteed to be available:

PROCINFO["pid"]

The process ID of the current process.

This means that you can know the PID of the program during runtime. Then, it is a matter of using system() to look for the process with this given PID:

#!/usr/bin/gawk -f
BEGIN {
    pid = PROCINFO["pid"]
    # Print the last field of the ps line whose PID column matches ours.
    system("ps -ef | awk '$2==" pid " {print $NF}'")
}

I use ps -ef, which displays the PID in the second column. Assuming the execution is done through awk -f <script> with no other parameters, we can assume the last field of the line contains the information we want.

If we had some parameters, we would have to parse the line differently or, better, use some of the options of ps to print just the columns we are interested in.
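One way to do that (a sketch, not a tested general solution) is to ask ps for just the argument column with -o args= and pick the word that follows -f. To keep it runnable on non-GNU awk as well, this version takes awk's PID from $PPID in the sub-shell instead of PROCINFO["pid"]. It assumes the script is started as awk -f <script> and that the path contains no whitespace. The file name args.awk is hypothetical.

```shell
dir=$(mktemp -d)
cat > "$dir/args.awk" <<'EOF'
BEGIN {
    # -o args= prints only the command line, with no header.
    cmd = "ps -p $PPID -o args="
    if ((cmd | getline line) > 0) {
        n = split(line, w, " ")
        # The script path is the word right after the first -f.
        for (i = 1; i < n; i++)
            if (w[i] == "-f") { print w[i + 1]; break }
    }
    close(cmd)
}
EOF
# Extra operands after the script do not disturb the lookup.
args_name=$(awk -f "$dir/args.awk" extra operands)
printf '%s\n' "$args_name"
rm -rf "$dir"
```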

Test

$ awk -f a.awk 
a.awk
$ cp a.awk hello.awk
$ awk -f hello.awk 
hello.awk

Note also that another chapter of the GNU awk user's guide tells us that ARGV is not the way to go:

1.1.4 Executable awk Programs

Finally, the value of ARGV[0] (see Built-in Variables) varies depending upon your operating system. Some systems put ‘awk’ there, some put the full pathname of awk (such as /bin/awk), and some put the name of your script (‘advice’). (d.c.) Don’t rely on the value of ARGV[0] to provide your script name.

fedorqui
  • unfortunately PROCINFO is only a gawk feature, not general awk. For example it is not available in mawk (which is installed by default in ubuntu) – cipper Sep 08 '15 at 06:49
  • I know... Why did you tag the question with [gawk] then? – fedorqui Sep 08 '15 at 07:03
  • You're right. When I posted the question I wasn't aware about all these differences between mawk and gawk. The tag has changed to mawk now. – cipper Sep 08 '15 at 08:18
  • @cipper good : ) I was in fact testing with `mawk` and couldn't make it work, so that I installed `gawk` in my Ubuntu and it worked. So a workaround can be to use `gawk` : D – fedorqui Sep 08 '15 at 08:30
  • I know, but I prefer a portable solution since mawk is installed by default in all debian-based distros (basically half of the linux platforms, at least) – cipper Sep 08 '15 at 09:50
  • 1
    @terdon, `gawk` is not installed by default on Ubuntu (or at least some Ubuntu versions, where `mawk` is the default `awk` implementation). IIRC, I had to install it as well on Debian. – Stéphane Chazelas Jan 11 '17 at 14:46
  • @StéphaneChazelas well I'll be! You're quite right, I just tested this on an Ubuntu VM. I stand corrected, thanks. – terdon Jan 11 '17 at 14:51
  • @terdon, yes confirmed in a dpkg.log. mawk is also the default awk implementation on Debian (at least stretch) and gawk is not installed by default. – Stéphane Chazelas Jan 11 '17 at 14:55