I have a machine with 2 Ethernet cards: one with a single interface, one with 4.
I intend to use the single-interface card (which should be connected to my LAN) as the means of accessing (ssh etc.) this machine (the host).
I then intend to have 4 VMs running (using libvirt), one on each of the 4 ports on the second card, using macvtap. Some will be connecting to my LAN, some to my DMZ (managed by an old sonicwall).
Example:
<network>
  <name>my-ubuntu-network</name>
  <uuid>aaaaaaaa-aaaa-aaaa-aaaa-aaaaaaaaaaaa</uuid>
  <forward dev='e1' mode='bridge'>
    <interface dev='e1'/>
  </forward>
</network>
I have set up 2 VMs and got them working nicely. All my interfaces get the right IP from the sonicwall DHCP by MAC. The MACs defined in both of my domains get the right IPs from the sonicwall too (both are on the DMZ).
I then created the third VM (again for my DMZ). It got the correct IP from the sonicwall, but I cannot ping it from the host.
When I connect to its console, it can ping other devices on the DMZ. If I connect to one of the first 2 guests, I can ping it. If I try from a different physical machine on my LAN (which can see the DMZ), I can ping it. It seems the only place I can't ping the guest from is the host.
ip link show
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN mode DEFAULT group default qlen 1000
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
2: e4: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP mode DEFAULT group default qlen 1000
link/ether 00:00:00:00:00:05 brd ff:ff:ff:ff:ff:ff
3: e0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP mode DEFAULT group default qlen 1000
link/ether 00:00:00:00:00:50 brd ff:ff:ff:ff:ff:ff
4: e3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP mode DEFAULT group default qlen 1000
link/ether 00:00:00:00:00:04 brd ff:ff:ff:ff:ff:ff
5: e2: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP mode DEFAULT group default qlen 1000
link/ether 00:00:00:00:00:07 brd ff:ff:ff:ff:ff:ff
6: e1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP mode DEFAULT group default qlen 1000
link/ether 00:00:00:00:00:06 brd ff:ff:ff:ff:ff:ff
7: macvtap0@e2: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP mode DEFAULT group default qlen 500
link/ether 00:00:00:00:00:73 brd ff:ff:ff:ff:ff:ff
8: macvtap1@e3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP mode DEFAULT group default qlen 500
link/ether 00:00:00:00:00:03 brd ff:ff:ff:ff:ff:ff
9: macvtap2@e4: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP mode DEFAULT group default qlen 500
link/ether 00:00:00:00:00:33 brd ff:ff:ff:ff:ff:ff
The interfaces look right (to me).
The first VM is set up to use e2. The second VM is set up to use e3. The third VM is set up to use e4.
If I run arp (on the host) before I ping any of the VMs, I get this:
arp -a
? (10.0.20.10) at 00:00:00:00:00:a6 [ether] on e4
? (10.0.10.100) at 00:00:00:00:00:cb [ether] on e0
10.0.20.10 is the gateway of the DMZ.
10.0.10.100 is the IP of the laptop from which I am SSH'd into the host.
route
Kernel IP routing table
Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
default         10.0.20.10      0.0.0.0         UG    202    0        0 e4
default         10.0.10.10      0.0.0.0         UG    203    0        0 e0
default         10.0.20.10      0.0.0.0         UG    204    0        0 e3
default         10.0.20.10      0.0.0.0         UG    205    0        0 e2
default         10.0.10.10      0.0.0.0         UG    206    0        0 e1
10.0.10.0       0.0.0.0         255.255.255.0   U     203    0        0 e0
10.0.10.0       0.0.0.0         255.255.255.0   U     206    0        0 e1
10.0.20.0       0.0.0.0         255.255.255.0   U     202    0        0 e4
10.0.20.0       0.0.0.0         255.255.255.0   U     204    0        0 e3
10.0.20.0       0.0.0.0         255.255.255.0   U     205    0        0 e2
I read this as showing that the route with the lowest metric uses e4, and not e0 (which is the interface I would like the host to use).
If I ping the first VM (10.0.20.71) from the host, I get a response.
Then I run arp:
arp -a
? (10.0.20.71) at 00:00:00:00:00:73 [ether] on e4
? (10.0.20.10) at 00:00:00:00:00:a6 [ether] on e4
? (10.0.10.100) at 00:00:00:00:00:cb [ether] on e0
This is as expected.
If I then ping VM 3 (10.0.20.72) (my latest, problematic one, with a network defined as e4), it fails. When I run arp I get:
arp -a
? (10.0.20.71) at 00:00:00:00:00:73 [ether] on e4
? (10.0.20.10) at 00:00:00:00:00:a6 [ether] on e4
? (10.0.10.100) at 00:00:00:00:00:cb [ether] on e0
? (10.0.20.72) at <incomplete> on e4
As you can see, it is showing incomplete.
I am running Arch Linux on the host.
Can anyone please suggest what I need to do to resolve this? Can anyone confirm that I have identified the problem correctly (route "priority")?
I think I need to change the metric of e0 so that it's at the top of the list.
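To sanity-check the route "priority" idea, here is a rough Python sketch that models the routing table from the `route` output above. It assumes the usual rule that the longest matching prefix wins and the metric only breaks ties between routes with the same prefix. It ignores policy routing and anything macvtap-specific, so it says nothing about why the ARP entry stays incomplete. The names `pick_route` and `with_metric` are made up for this illustration, and the metric value 100 is an arbitrary example.

```python
import ipaddress

# Routing table transcribed from the `route` output above:
# (destination network, gateway or None if directly connected, metric, interface)
ROUTES = [
    ("0.0.0.0/0", "10.0.20.10", 202, "e4"),
    ("0.0.0.0/0", "10.0.10.10", 203, "e0"),
    ("0.0.0.0/0", "10.0.20.10", 204, "e3"),
    ("0.0.0.0/0", "10.0.20.10", 205, "e2"),
    ("0.0.0.0/0", "10.0.10.10", 206, "e1"),
    ("10.0.10.0/24", None, 203, "e0"),
    ("10.0.10.0/24", None, 206, "e1"),
    ("10.0.20.0/24", None, 202, "e4"),
    ("10.0.20.0/24", None, 204, "e3"),
    ("10.0.20.0/24", None, 205, "e2"),
]


def pick_route(dest, routes=ROUTES):
    """Return the (network, gateway, metric, iface) tuple chosen for dest."""
    addr = ipaddress.ip_address(dest)
    matches = [r for r in routes if addr in ipaddress.ip_network(r[0])]
    # Longest prefix first; the metric only decides between equal prefixes.
    return min(
        matches,
        key=lambda r: (-ipaddress.ip_network(r[0]).prefixlen, r[2]),
    )


def with_metric(routes, iface, metric):
    """Hypothetical helper: copy of routes with every route on iface set to metric."""
    return [(n, g, metric if i == iface else m, i) for (n, g, m, i) in routes]


# Current table: which interface would carry traffic to VM 3 and to the laptop?
print(pick_route("10.0.20.72")[3])
print(pick_route("10.0.10.100")[3])

# Same lookup for VM 3 after (hypothetically) lowering the e0 metric to 100.
print(pick_route("10.0.20.72", with_metric(ROUTES, "e0", 100))[3])
```

If this model is right, lowering the metric of e0 would change which default route is preferred, but traffic to 10.0.20.72 would still leave via e4, because e0 has no route for 10.0.20.0/24 more specific than the default.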
Basically, can anyone give me some input on what's going on? What should I read up on or do to fix this?
Thanks!