I'm running BGP using FRR on Debian Linux on several machines. My question may turn out to involve the FRR/BGP configuration, but I'm trying to understand at a more basic level why the Linux kernel is making a particular IPv6 route selection.
I have a machine "a3" which is peered with "a1" and "a2". "a1" and "a2" are route reflectors and are both providing a default gateway to a3. Here you can see a3's IPv6 routing table:
root@a3:~# ip -6 route
::1 dev lo proto kernel metric 256 pref medium
2602:fbbc:0:2::/64 dev vxbr2 proto kernel metric 256 pref medium
2602:fbbc:0:65::/64 dev vxbr101 proto kernel metric 256 pref medium
2602:fbbc:1:1::/64 dev 000_bridge proto kernel metric 256 pref medium
fe80::/64 dev 000_bridge proto kernel metric 256 pref medium
fe80::/64 dev vnet7 proto kernel metric 256 pref medium
fe80::/64 dev vxbr101 proto kernel metric 256 pref medium
fe80::/64 dev vxbr2 proto kernel metric 256 pref medium
fe80::/64 dev vnet40 proto kernel metric 256 pref medium
fe80::/64 dev vnet43 proto kernel metric 256 pref medium
fe80::/64 dev vnet46 proto kernel metric 256 pref medium
fe80::/64 dev vnet47 proto kernel metric 256 pref medium
fe80::/64 dev vnet54 proto kernel metric 256 pref medium
fe80::/64 dev vnet57 proto kernel metric 256 pref medium
fe80::/64 dev vnet58 proto kernel metric 256 pref medium
fe80::/64 dev vnet63 proto kernel metric 256 pref medium
fe80::/64 dev 001_bridge proto kernel metric 256 pref medium
default nhid 36 proto bgp metric 20 pref medium
nexthop via 2602:fbbc:1:1::1 dev 000_bridge weight 1
nexthop via 2602:fbbc:1:1::2 dev 000_bridge weight 1
As I understand it, the line near the bottom reading "default nhid 36 proto bgp metric 20 pref medium" indicates that nexthop object 36 is used for the default route, and that it is a group containing two separate entries, one for 2602:fbbc:1:1::1 and one for 2602:fbbc:1:1::2.
Here's the nexthop table:
root@a3:~# ip nexthop
id 15 dev 001_bridge scope host proto zebra
id 16 dev 000_bridge scope link proto zebra
id 26 dev vxbr2 scope link proto zebra
id 27 dev vxbr101 scope link proto zebra
id 31 via 2602:fbbc:1:1::1 dev 000_bridge scope link proto zebra
id 32 via 10.1.0.1 dev 001_bridge scope link proto zebra
id 36 group 31/37 proto zebra
id 37 via 2602:fbbc:1:1::2 dev 000_bridge scope link proto zebra
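To make my reading of the group explicit, here is a minimal sketch that expands a nexthop group into its member gateways. It works on a copy of the relevant lines from the output above rather than on the live table; the function name expand_group and the variable NEXTHOPS are just names I made up for illustration, and the handling of an "id,weight" member format is an assumption on my part.

```shell
#!/bin/sh
# Sample data copied from the `ip nexthop` output in this post.
NEXTHOPS='id 31 via 2602:fbbc:1:1::1 dev 000_bridge scope link proto zebra
id 36 group 31/37 proto zebra
id 37 via 2602:fbbc:1:1::2 dev 000_bridge scope link proto zebra'

# expand_group ID (hypothetical helper): print "member-id gateway" for each
# member of nexthop group ID, in the order the group lists them.
expand_group() {
    printf '%s\n' "$NEXTHOPS" | awk -v gid="$1" '
        { line[$2] = $0 }                            # index every entry by its id
        $2 == gid && $3 == "group" { members = $4 }  # e.g. "31/37"
        END {
            n = split(members, ids, "/")
            for (i = 1; i <= n; i++) {
                sub(/,.*/, "", ids[i])   # assumption: strip an optional ",weight" suffix
                split(line[ids[i]], f, " ")
                print ids[i], f[4]       # field 4 is the address after "via"
            }
        }'
}

expand_group 36
```

On the real machine the same awk program could presumably be fed from "ip nexthop" directly instead of the pasted sample.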
So I would think, given the ordering here (it appears earlier in the nexthop list, has the lower id, and comes first in "id 36 group 31/37 proto zebra"), that 2602:fbbc:1:1::1 would be selected as the default gateway, but this is not the case. Looking up any random public IPv6 address gives:
root@a3:~# ip -6 route get 2001:4860:4860::8888
2001:4860:4860::8888 from :: via 2602:fbbc:1:1::2 dev 000_bridge proto bgp src 2602:fbbc:1:1::a3 metric 20 pref medium
I can confirm with traceroute6 and every other tool I've tried that 2602:fbbc:1:1::2 is definitely being selected as the gateway, not 2602:fbbc:1:1::1, and I have no idea why.
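In case it helps anyone reproduce this, here is a rough sketch of how the lookup could be repeated for several destinations to see whether the chosen gateway ever changes. The destination list is arbitrary and the helper name via_of is made up for illustration; it only extracts the address after "via" from the "ip -6 route get" output.

```shell
#!/bin/sh
# via_of (hypothetical helper): read `ip -6 route get` output on stdin
# and print the gateway address that follows the word "via".
via_of() {
    awk '{ for (i = 1; i < NF; i++) if ($i == "via") { print $(i + 1); exit } }'
}

# Arbitrary public destinations, chosen only for illustration.
DESTS="2001:4860:4860::8888 2001:4860:4860::8844 2606:4700:4700::1111 2620:fe::fe"

# Only query the kernel where iproute2 is actually installed.
if command -v ip >/dev/null 2>&1; then
    for d in $DESTS; do
        printf '%s -> %s\n' "$d" "$(ip -6 route get "$d" 2>/dev/null | via_of)"
    done
fi
```

If every destination maps to the same gateway, that would at least show the behaviour is not specific to the one address I tested.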
Also, "ip -6 route show cache" gives no output and "ip -6 route flush cache" has no effect, so this doesn't seem to be related to the route cache. There do not appear to be any custom rules configured either:
root@a3:~# ip -6 rule show
0: from all lookup local
32766: from all lookup main
I'm sure I will have more to tweak in the BGP configuration to resolve this, but purely from the perspective of how Linux does route selection, does anyone have an idea what could be causing this? (And any ideas on what parameter could be tuned to fix it?)