Large number of routes installed into the kernel causes high CPU usage
I've been using Fedora and a routing suite called FRR to install routes into the Linux kernel. When I install a large number of routes, the CPU usage of several GNOME programs hits 100% and stays there:
1172 root 20 0 6199948 4.9g 5964 R 99.7 20.8 24:41.98 NetworkManager
2544 sharpd 20 0 6396920 5.5g 2320 R 99.7 23.6 30:02.87 gsd-sharing
21160 sharpd 20 0 4689524 1.9g 99900 R 99.3 8.1 22:27.56 gnome-shell
Additionally, memory usage spikes and keeps growing. perf top reports:
45.28% libgio-2.0.so.0.6000.4 [.] g_inet_address_mask_equal
20.55% libgio-2.0.so.0.6000.4 [.] g_inet_address_equal
7.44% libgio-2.0.so.0.6000.4 [.] g_inet_address_get_family
7.40% libgio-2.0.so.0.6000.4 [.] g_network_monitor_base_add_network
6.47% libgio-2.0.so.0.6000.4 [.] g_inet_address_get_type
3.84% libgio-2.0.so.0.6000.4 [.] g_inet_address_to_bytes
3.72% libgio-2.0.so.0.6000.4 [.] g_inet_address_mask_get_type
2.98% libc-2.29.so [.] __memcmp_avx2_movbe
1.59% libgio-2.0.so.0.6000.4 [.] g_inet_address_get_native_size
0.32% libgio-2.0.so.0.6000.4 [.] 0x000000000003d5c0
0.22% libgio-2.0.so.0.6000.4 [.] 0x000000000003d5c4
0.01% libgio-2.0.so.0.6000.4 [.] g_network_monitor_base_set_networks
0.01% [kernel] [k] native_write_msr
0.01% [kernel] [k] timerqueue_add
0.01% [kernel] [k] clockevents_program_event
0.01% libgio-2.0.so.0.6000.4 [.] 0x0000000000103f80
And when I look at the g_network_monitor_base_add_network code:
void
g_network_monitor_base_add_network (GNetworkMonitorBase *monitor,
                                    GInetAddressMask    *network)
{
  int i;

  for (i = 0; i < monitor->priv->networks->len; i++)
    {
      if (g_inet_address_mask_equal
          (monitor->priv->networks->pdata[i], network))
        return;
    }

  g_ptr_array_add (monitor->priv->networks, g_object_ref (network));
  if (g_inet_address_mask_get_length (network) == 0)
    {
      switch (g_inet_address_mask_get_family (network))
        {
        case G_SOCKET_FAMILY_IPV4:
          monitor->priv->have_ipv4_default_route = TRUE;
          break;
        case G_SOCKET_FAMILY_IPV6:
          monitor->priv->have_ipv6_default_route = TRUE;
          break;
        default:
          break;
        }
    }

  /* Don't emit network-changed when multicast-link-local routing
   * changes. This rather arbitrary decision is mostly because it
   * seems to change quite often...
   */
  if (g_inet_address_get_is_mc_link_local
      (g_inet_address_mask_get_address (network)))
    return;

  queue_network_changed (monitor);
}
It sure looks to me like the code is storing the routes in an unsorted array. Worse yet, for each route that is added, the code compares it against every route already in the array before appending it, so adding N routes costs on the order of N^2 comparisons. That matches the perf output, where g_inet_address_mask_equal and the functions it calls dominate. I don't know GIO, so I don't know what data structures are available for use here. Can I get someone to look at this and hopefully fix it quickly?
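To show what I mean, here is a minimal standalone sketch in plain C (no GLib). It is not a patch: route_key, linear_add, hashed_init and hashed_add are names I made up, the route is simplified to an IPv4 address plus prefix length, and the hash table has a fixed capacity. The first half models the scan-then-append pattern above and counts comparisons; the second half does the same duplicate check through a hash lookup. GLib does have GHashTable, which might be a candidate, but I have not checked whether a suitable hash function for GInetAddressMask exists.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for a route: IPv4 address plus prefix length. */
typedef struct {
    uint32_t addr;
    uint8_t  prefix_len;
} route_key;

static int route_equal(route_key a, route_key b)
{
    return a.addr == b.addr && a.prefix_len == b.prefix_len;
}

/* Models the current pattern: unsorted array, full scan on every add. */
typedef struct {
    route_key *items;
    size_t len, cap;
    unsigned long long compares;   /* equality checks performed so far */
} linear_set;

/* Returns 1 if the route was newly added, 0 if it was already present. */
static int linear_add(linear_set *s, route_key k)
{
    for (size_t i = 0; i < s->len; i++) {
        s->compares++;
        if (route_equal(s->items[i], k))
            return 0;
    }
    if (s->len == s->cap) {
        size_t cap = s->cap ? s->cap * 2 : 16;
        route_key *p = realloc(s->items, cap * sizeof *p);
        assert(p != NULL);
        s->items = p;
        s->cap = cap;
    }
    s->items[s->len++] = k;
    return 1;
}

/* Alternative: open-addressing hash set (fixed capacity in this sketch). */
typedef struct {
    route_key *slots;
    uint8_t *used;
    size_t cap;                    /* must be a power of two */
    size_t len;
    unsigned long long compares;
} hashed_set;

static void hashed_init(hashed_set *s, size_t cap_pow2)
{
    assert(cap_pow2 != 0 && (cap_pow2 & (cap_pow2 - 1)) == 0);
    s->slots = calloc(cap_pow2, sizeof *s->slots);
    s->used = calloc(cap_pow2, 1);
    assert(s->slots != NULL && s->used != NULL);
    s->cap = cap_pow2;
    s->len = 0;
    s->compares = 0;
}

/* Simple multiplicative hash; an assumption, not tuned for real tables. */
static size_t hash_route(route_key k)
{
    uint32_t h = k.addr * 2654435761u;
    h ^= (uint32_t)k.prefix_len * 0x9E3779B1u;
    return (size_t)h;
}

/* Returns 1 if the route was newly added, 0 if it was already present. */
static int hashed_add(hashed_set *s, route_key k)
{
    assert(s->len * 2 < s->cap);   /* sketch only: keep load under 50% */
    size_t i = hash_route(k) & (s->cap - 1);
    while (s->used[i]) {           /* linear probing over occupied slots */
        s->compares++;
        if (route_equal(s->slots[i], k))
            return 0;
        i = (i + 1) & (s->cap - 1);
    }
    s->used[i] = 1;
    s->slots[i] = k;
    s->len++;
    return 1;
}
```

With the linear version, adding N distinct routes performs N*(N-1)/2 comparisons (499500 for N = 1000), and it only gets worse with a full routing table. The hashed version only compares against whatever shares a probe sequence, so the per-add cost stays roughly constant as long as the table is not overloaded.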
I am willing to help debug/test any patches.