neigh: enable garbage collection #1068

Open
wants to merge 1 commit into
base: master

Conversation

dirkmueller
Contributor

VIPs and floating IPs that move between different interfaces might stay
cached incorrectly in the neighbor table for a very long time, until
garbage collection kicks in. By default, a STALE entry (one that used
to have an active connection but no longer does) gets garbage
collected after gc_stale_time, but only if there are more than
gc_thresh1 STALE entries in total. With the default of 128, one has
to accumulate 128 stale entries (or trigger a forced cache flush) before
this happens, which for small or low-traffic clouds can take an
eternity.
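
For anyone who wants to check a node against this description, here is a minimal Python sketch. It assumes a Linux host where the defaults live under `/proc/sys/net/ipv4/neigh/default` and where `ip -4 neigh show` prints the entry state as the last field of each line. The helper names and the sample addresses are hypothetical, and `stale_entries_are_collected` only models the rule as worded above, not the kernel's actual garbage collector.

```python
from pathlib import Path

# Assumed location of the default IPv4 neighbor-table settings on Linux.
NEIGH_DEFAULT = Path("/proc/sys/net/ipv4/neigh/default")


def read_neigh_setting(name, base=NEIGH_DEFAULT):
    """Read one integer setting such as gc_thresh1 from the sysctl tree."""
    return int((base / name).read_text().split()[0])


def count_stale(ip_neigh_output):
    """Count entries whose state (the last field of each line) is STALE."""
    count = 0
    for line in ip_neigh_output.splitlines():
        fields = line.split()
        if fields and fields[-1] == "STALE":
            count += 1
    return count


def stale_entries_are_collected(stale_count, gc_thresh1):
    """Model of the rule as described above: collect only above gc_thresh1."""
    return stale_count > gc_thresh1


# Hypothetical `ip -4 neigh show` output, for illustration only.
SAMPLE = """\
192.0.2.10 dev eth0 lladdr 52:54:00:aa:bb:01 STALE
192.0.2.11 dev eth0 lladdr 52:54:00:aa:bb:02 REACHABLE
192.0.2.12 dev eth1 lladdr 52:54:00:aa:bb:03 STALE
"""

if __name__ == "__main__":
    stale = count_stale(SAMPLE)
    # With the default threshold of 128, two stale entries are left alone.
    print(stale, stale_entries_are_collected(stale, 128))
```

On a real node, `read_neigh_setting("gc_thresh1")` would supply the threshold and the output of `ip -4 neigh show` would replace `SAMPLE`. A small cloud with a handful of stale entries stays below the default, which matches the behavior this change is meant to address.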

dirkmueller added this to the Cloud 7 Update1 milestone on Feb 4, 2017
Itxaka previously approved these changes on Feb 6, 2017
@@ -6,6 +6,14 @@
net.ipv4.ip_local_reserved_ports = 35357
# Increase system IP port range to allow for more concurrent connections
net.ipv4.ip_local_port_range = 27018 64999
# ensure STALE arp neighbor entries expire from the cache, otherwise
Contributor

ARP?

# VIPs of an OpenStack service or the floating IP of a VM
# might not become reachable
# gc_thresh1 is the lower threshold that needs to be reached before
# stale entries are getting garbage collected. the default of 128 means
Contributor

The?

Member

vuntz left a comment

openstack-ansible is using a different approach: https://git.openstack.org/cgit/openstack/openstack-ansible-openstack_hosts/tree/defaults/main.yml#n46

Does that make sense? Or is your approach better?

Labels
None yet
5 participants