mbdns 1.0.1
Sunday, Jan 6, 2019 · 700 words · approx 4 mins to read

I've just released mbdns v1.0.1, a bug fix release that stops mbdns devouring all open file descriptors on the system. While it's just a 2-line change, how I noticed, investigated and fixed the bug, and then verified the fix was correct, might be useful to others.
mbdns runs an infinite loop to process record updates, and every record update opens a file descriptor (fd) for the socket it needs to make the HTTP request to Mythic Beasts' API endpoint. I'd mistakenly used an (idiomatic) golang feature called defer to read the request's response: defer releases the object it holds when the outer function body exits. But because of the infinite loop, which never exits by design, each trip round the loop would consume an extra fd per request and eventually exhaust them. Oops!
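To make the shape of the bug concrete, here's a minimal sketch. It isn't the actual mbdns source; the endpoint, interval and structure are simplified placeholders, but it shows how an update loop that defers its cleanup leaks one descriptor per request:

package main

import (
	"net/http"
	"time"
)

// process is a sketch of the buggy shape only, not the real mbdns code:
// the endpoint and interval are illustrative placeholders.
func process(client *http.Client) {
	for {
		resp, err := client.Get("https://dnsapi.example/update") // illustrative URL
		if err == nil {
			// Bug: defer only runs when process() returns, and this
			// loop never returns, so every successful request leaves
			// its response body (and the socket's fd) open forever.
			defer resp.Body.Close()
		}
		time.Sleep(300 * time.Second)
	}
}

func main() {
	process(&http.Client{Timeout: 10 * time.Second})
}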
How did I notice? Running it for a while on the Ubiquiti EdgeRouter X (ER-X) that powers my home network exhausted its fds quite quickly, because the limit on that platform is low, and that stopped the router from doing its other jobs. Internet access suddenly went away!
How do you figure out how many file descriptors can be open on a Unix-like platform? You run ulimit -Hn to see the hard limit (4096 on the ER-X), and ulimit -Sn to see the soft limit (1024 on the ER-X). I'd set up mbdns to update 2 records every 300 seconds (24 fds consumed per hour), so it only took a little over 42 hours to exhaust the 1024 fd soft limit and bring the system to a halt.
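As an aside, you can ask for the same limits from inside a Go program. This little sketch (my illustration using the standard syscall package, Unix-only, not part of mbdns) prints what ulimit -Sn and ulimit -Hn report:

package main

import (
	"fmt"
	"syscall"
)

// Reports the soft and hard open-file limits for the current process; these
// are the same numbers ulimit -Sn and ulimit -Hn show for your shell.
func main() {
	var rl syscall.Rlimit
	if err := syscall.Getrlimit(syscall.RLIMIT_NOFILE, &rl); err != nil {
		panic(err)
	}
	fmt.Printf("soft fd limit: %d, hard fd limit: %d\n", rl.Cur, rl.Max)
}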
Knowing that mbdns was exhausting descriptors, I took a look at its log to see what was happening after the limit was reached. I should have kept a copy of the exact message golang returns when an HTTP request fails for lack of fds, but it was quite descriptive and very clear about what was happening: no fds were available, so sockets couldn't be used.
Knowing that the HTTP request was the cause of the exhaustion (on Unix platforms a socket operation opens a descriptor), all I needed to do was look at the mbdns source code and go through the process() loop to see where it was creating and holding on to http.Response objects.
mbdns is incredibly simple, so the fix was easy: always read the response body and close it, regardless of whether it'll be used or not (it only gets printed into the log on error during normal operation).
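The corrected shape looks roughly like the sketch below. Again, this isn't the actual mbdns source (the function name and endpoint are placeholders), but it shows the always-read-then-close pattern the fix uses:

package main

import (
	"fmt"
	"io/ioutil"
	"log"
	"net/http"
	"time"
)

// updateRecord is a sketch of the corrected shape, not the real mbdns code:
// the name and URL are placeholders. The point is that the response body is
// always fully read and closed on every call, so the socket's fd goes back
// to the kernel immediately instead of waiting on a defer that never runs.
func updateRecord(client *http.Client, url string) error {
	resp, err := client.Get(url)
	if err != nil {
		return err
	}
	body, err := ioutil.ReadAll(resp.Body)
	resp.Body.Close() // close explicitly, every trip round the loop
	if err != nil {
		return err
	}
	if resp.StatusCode != http.StatusOK {
		// the body only makes it into the log on error
		return fmt.Errorf("update failed: %s: %s", resp.Status, body)
	}
	return nil
}

func main() {
	client := &http.Client{Timeout: 10 * time.Second}
	if err := updateRecord(client, "https://dnsapi.example/update"); err != nil {
		log.Println(err)
	}
}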
How did I verify the fix was correct? I modified a test version of mbdns to shorten the loop iteration wait time to 10 seconds, ran it, and then asked the ER-X how many open file descriptors mbdns was holding with the following:
lsof -p $(pidof mbdns)
That command lists the open files (lsof) of the process ID belonging to mbdns (pidof mbdns). Running it before, during and after a few trips round the loop showed that there was an extra fd per request per loop iteration, but that it was released when the request completed. Success!
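If lsof isn't to hand, the same information is available on Linux under /proc. Here's a rough Go stand-in (my illustration, not something mbdns ships) that just counts a process's open descriptors:

package main

import (
	"fmt"
	"io/ioutil"
	"os"
)

// Counts the entries under /proc/<pid>/fd, which on Linux is the set of file
// descriptors a process currently has open: a rough, script-friendly
// stand-in for lsof -p. Pass the pid as the only argument.
func main() {
	if len(os.Args) != 2 {
		fmt.Fprintln(os.Stderr, "usage: fdcount <pid>")
		os.Exit(1)
	}
	entries, err := ioutil.ReadDir("/proc/" + os.Args[1] + "/fd")
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	fmt.Printf("pid %s has %d open fds\n", os.Args[1], len(entries))
}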
There was some other interesting output from lsof, showing me the other descriptors the process always keeps open. Those included what the shell was holding to run it (one fd to the binary image for mbdns itself on disk), a descriptor for reading /dev/urandom, which is presumably there to give the golang runtime a source of randomness, and also the two file handles for stdin and stdout that I redirect to the log to see what it's doing!
lsof is a pretty handy utility to get to know, to help you get a handle (NOT SORRY) on what a process is doing with files on a Unix or Unix-like system.
If you’re an mbdns
user, please upgrade to v1.0.1 as soon as possible, lest you run out of file descriptors on the host system eventually (or pretty soon if you run it on a resource constrained platform like the ER-X!).