After a discussion on https://www.reddit.com/r/ansible/comments/e9ve5q/ansible_slow_as_a_hell_with_gcp_iap_any_way_to/ I changed the solution to use SSH connection sharing via a control socket.
It is twice as fast as @mat's solution, and I have deployed it to our PROD. Here is an implementation that doesn't depend on host name patterns!
The proper solution is to use a bastion/jump host, because the gcloud command still spawns a Python interpreter that in turn spawns ssh, which is still inefficient!
ansible.cfg:
[ssh_connection]
pipelining = True
ssh_executable = misc/gssh.sh
ssh_args =
transfer_method = piped
[privilege_escalation]
become = True
become_method = sudo
[defaults]
interpreter_python = /usr/bin/python
gathering = False
# Somehow important to enable parallel execution...
strategy = free
gssh.sh:
#!/bin/bash
# ansible/ansible/lib/ansible/plugins/connection/ssh.py
# exec_command(self, cmd, in_data=None, sudoable=True) calls _build_command(self, binary, *other_args) as:
# args = (ssh_executable, self.host, cmd)
# cmd = self._build_command(*args)
# So "host" is the next-to-last argument and cmd is the last argument of the ssh command.
host="${@: -2: 1}"
cmd="${@: -1: 1}"
# ControlMaster=auto & ControlPath=... make Ansible execution twice as fast.
socket="/tmp/ansible-ssh-${host}-22-iap"
# Everything after "--" is passed through to the underlying ssh command.
gcloud_args=(
  --tunnel-through-iap
  --zone=europe-west1-b
  --quiet
  --no-user-output-enabled
  --
  -C
  -o ControlMaster=auto
  -o ControlPersist=20
  -o PreferredAuthentications=publickey
  -o KbdInteractiveAuthentication=no
  -o PasswordAuthentication=no
  -o ConnectTimeout=20
)
exec gcloud compute ssh "$host" "${gcloud_args[@]}" -o ControlPath="$socket" "$cmd"
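To see how the wrapper picks the host and the command out of whatever Ansible passes, the two parameter expansions can be tried in isolation. This is only a sketch: the function name `last_two` and the sample arguments are made up for illustration and are not the exact arguments Ansible generates.

```shell
#!/bin/bash
# Hypothetical helper mirroring the argument handling in gssh.sh.
last_two() {
  # Next-to-last positional argument: the host.
  local host="${@: -2: 1}"
  # Last positional argument: the remote command.
  local cmd="${@: -1: 1}"
  printf 'host=%s\ncmd=%s\n' "$host" "$cmd"
}

# Made-up argument list; any leading ssh options are simply ignored.
last_two -o SomeOption=yes 10.0.0.5 "/bin/sh -c 'echo hello'"
```

The space before the minus sign in `${@: -2: 1}` matters: without it bash would parse `:-` as the "use default value" expansion.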
UPDATE: A Google engineer responded that gcloud is not supposed to be called in parallel! See "gcloud compute ssh" can't be used in parallel.
My experiments showed that with Ansible forks=5 I almost always hit an error. With forks=2 I never experienced one.
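If you hit the same locking errors, one way to cap parallelism without editing ansible.cfg is the ANSIBLE_FORKS environment variable. A minimal sketch, assuming a playbook named site.yml (a placeholder, not from my setup):

```shell
#!/bin/bash
# Limit Ansible to 2 parallel forks for commands run from this shell.
export ANSIBLE_FORKS=2

# Then run the playbook as usual; site.yml is a placeholder name.
# The same limit can be given per run with the --forks option:
#   ansible-playbook --forks 2 site.yml
```

The equivalent permanent setting is `forks = 2` under `[defaults]` in ansible.cfg.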
UPDATE 2: Time has passed, and as of the end of 2020 I can run gcloud compute ssh in parallel (in WSL I used forks = 10) without locking errors.