Gawk has "isarray":

if (isarray(x))
  print "is array"
else
  print "is scalar"

However, Mawk and "gawk --posix" do not:

fatal: function 'isarray' not defined

This can cause problems:

x
x[1]
fatal: attempt to use scalar 'x' as an array

Or:

x[1]
x
fatal: attempt to use array 'x' in a scalar context

Can Awk detect an array without using the "isarray" function?

Zombo
  • Out of curiosity, what would be the use case? In which situation would you not know in advance if a variable is an array or not? – Stéphane Chazelas Apr 18 '17 at 13:42
  • @StéphaneChazelas 1) Checking whether an element in a true multi-dimensional array is itself an array or not so you can write general purpose functions parsing array contents, see https://www.gnu.org/software/gawk/manual/gawk.html#Walking-Arrays. 2) Within a user-defined function to test if an argument passed in is an array or not which is useful for testing if an optional array argument is present or not - if a function can take an optional array arg then isarray(arg) will return true when that arg is present, false when absent. – Ed Morton Apr 18 '17 at 14:26

2 Answers

No. If it could then there wouldn't have been a need for gawk to introduce isarray().

Stephen Rauch
Ed Morton

I also don't think it's possible.

But I'll add that with busybox awk, variables can be both arrays and scalar. There it's OK to do:

a = "foo"; a["foo"] = "bar"

Once a variable has been used as an array, though, length() returns the number of elements in the array, even if the variable has also been assigned as a scalar (you can still use length(var "") to get the length of the scalar value). The exception is when the variable has been passed as an argument to a function and assigned as a scalar there, which could be considered a bug:

$ busybox awk 'BEGIN{a[1] = 1; a = "foo"; print length(a), length(a"")}'
1 3
$ busybox awk 'function f(x) {x = "xxx"; print x[1], length(x)}
               BEGIN{a[1]=1; x = "yyy"; print a[1], length(a); f(a)}'
1 1
1 3

Too bad, as otherwise it would have been easy to define an isarray() function there. We can still tell whether a variable is an array with at least one element with

function isnonemptyarray(x) {
  return length(x) > 0 && length(x "") == 0
}

(assuming the variable hasn't been defined both as an array and scalar)

In any case, that's busybox awk specific. length() can't be used on arrays portably. One can define a portable array_length() function with:

function array_length(a, tmp1, tmp2) {
  tmp1 = 0
  for (tmp2 in a) tmp1++
  return tmp1
}

But that can't be used portably on non-array variables.
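As a rough usage sketch of the array_length() function above, assuming only POSIX awk features (split, delete on a single element, for-in), the following counts the elements of an array before and after removing one. The shell wrapper name array_length_demo and the array name arr are made up for illustration.

```shell
array_length_demo() {
  awk '
    # array_length() as defined above: count elements with a for-in loop.
    function array_length(a, tmp1, tmp2) {
      tmp1 = 0
      for (tmp2 in a) tmp1++
      return tmp1
    }
    BEGIN {
      split("a b c", arr, " ")   # arr now holds 3 elements
      print array_length(arr)
      delete arr[2]              # remove one element
      print array_length(arr)
    }'
}
array_length_demo
```

This should print 3 and then 2. The argument must already be an array (or a never-used variable); passing a variable that has been used as a scalar triggers the same fatal errors shown in the question.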

Stéphane Chazelas
  • @Steven, busybox is the exception there. nawk behaves like gawk (the original awk allowed `a=1;a[1]=1` though that was promoting `a` from scalar to array, but not `a[1]=1; a=1`). So if someone's to blame, that's a, w and k, not gawk's maintainer. – Stéphane Chazelas Apr 19 '17 at 12:02
  • The busybox implementation violates [the POSIX standard](http://pubs.opengroup.org/onlinepubs/9699919799/utilities/awk.html): `The same name shall not be used within the same scope both as a scalar variable and as an array`. I thought `length(array)` had made its way into POSIX by now but I don't see it in the standard yet. You can portably write a function to tell if an array is empty with `for (i in array) return 0; return 1`. – Ed Morton Apr 19 '17 at 16:24
  • @EdMorton, a *script* using a variable both as an array and scalar would be violating the standard. But AFAICT, there's nothing in the spec that prevents an awk implementation to allow it as an extension. There's nothing that says that awk should abort with an error or any other behaviour when someone tries to do so. – Stéphane Chazelas Apr 19 '17 at 16:30
  • I agree there's nothing in that statement that says what type of error should be produced. Not sure what you're suggesting with the rest of that comment. – Ed Morton Apr 19 '17 at 16:47
  • @Ed, I'm saying that busybox awk is not any more or less compliant than gawk. POSIX tells us you can't use a variable both as an array and scalar, not what awk implementations should do if you do. So all different behaviours of oawk, nawk, gawk, busybox awk are compliant in that instance. – Stéphane Chazelas Apr 19 '17 at 20:16
  • So if a standard says users cannot do X but doesn't say what to do if some user does do X then a tool provider can allow users to do X and still be standard-compliant because it's undefined behavior per that standard? OK, I see where you're coming from BUT I think the standard saying you cannot do it precludes allowing your users to do it and claiming that's OK because the standard didn't say what to do if the user violated that rule because IMHO the one clear requirement that IS clearly stated is that they cannot do it! I don't care enough to discuss it any more though. All the best! – Ed Morton Apr 19 '17 at 21:10
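
The portable emptiness test Ed Morton mentions in the comments above (`for (i in array) return 0; return 1`) could be wrapped in a function along these lines. This is only a sketch: the names is_empty and is_empty_demo are hypothetical, and it assumes the argument is an array or a never-used variable, not a scalar.

```shell
is_empty_demo() {
  awk '
    # Returns 1 if arr has no elements, 0 otherwise.
    # i is a local variable, declared as an extra parameter.
    function is_empty(arr, i) {
      for (i in arr) return 0   # at least one element exists
      return 1
    }
    BEGIN {
      split("", a)              # portable way to create an empty array
      print is_empty(a)
      a["k"] = 1
      print is_empty(a)
    }'
}
is_empty_demo
```

This should print 1 and then 0. Like array_length(), it answers a question about something already known to be an array; it does not tell an array apart from a scalar.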