[SAC] [OSGeo] #1693: Slow DNS lookups on tracsvn

#1693: Slow DNS lookups on tracsvn
---------------------------+--------------------------
Reporter: strk | Owner: sac@…
     Type: task | Status: new
Priority: normal | Milestone:
Component: Systems Admin | Keywords: tracsvn, dns
---------------------------+--------------------------
I've just updated the Gogs implementation to cache DNS lookups because
otherwise it would take up to 14 seconds to render the user-explore page,
due to having to look up the domain of each user's email in the page (~20
per page).

Now, with the cached lookups, the time to render the page goes down from 14
seconds to 5 _milli_seconds.
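
For illustration only, here is a minimal sketch of that kind of in-process
lookup cache. It is not the actual Gogs change (Gogs is written in Go); the
class name, the 300-second lifetime and the injectable clock are all
assumptions made for the example.

```python
import socket
import time


class CachedResolver:
    """Cache hostname lookups in memory for a fixed number of seconds."""

    def __init__(self, resolve=socket.gethostbyname, ttl=300.0, clock=time.monotonic):
        self._resolve = resolve   # function doing the real (slow) lookup
        self._ttl = ttl           # assumed lifetime of a cached answer, in seconds
        self._clock = clock       # injectable so the expiry logic can be tested
        self._cache = {}          # hostname -> (expiry time, address)

    def lookup(self, hostname):
        now = self._clock()
        hit = self._cache.get(hostname)
        if hit is not None and hit[0] > now:
            return hit[1]                      # still fresh: no network traffic
        address = self._resolve(hostname)      # cache miss or expired entry
        self._cache[hostname] = (now + self._ttl, address)
        return address
```

With ~20 email domains per page, only the first rendering pays for the real
lookups; later renderings within the lifetime are served from the dictionary.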

This tells me there's an issue with the DNS resolution.
I don't know which other services use DNS lookups, but if any
production service does too (besides Gogs, which is only experimental)
we'd better install a local caching server, like dnsmasq (or, as has been
suggested to me, "unbound").

--
Ticket URL: <https://trac.osgeo.org/osgeo/ticket/1693>
OSGeo <http://www.osgeo.org/>
OSGeo committee and general foundation issue tracker.

Ok, after getting errors twice while responding to the ticket, I chose
to respond here:

> #1693: Slow DNS lookups on tracsvn
> ---------------------------+--------------------------
> Reporter: strk | Owner: sac@…
>      Type: task | Status: new
> Priority: normal | Milestone:
> Component: Systems Admin | Keywords: tracsvn, dns
> ---------------------------+--------------------------
>
> [...]
>
> This tells me there's an issue with the DNS resolution.
> I don't know which other services use DNS lookups, but if any
> production service does too (besides Gogs, which is only experimental)
> we'd better install a local caching server, like dnsmasq (or, as has been
> suggested to me, "unbound").

Hi Sandro, did you already switch Gogs over to using libc-based name resolution, as suggested?

BTW, Trac error upon submission is:

Trac detected an internal error:

IndexError: list index out of range

Cheers,
  Martin.
--
Unix _IS_ user friendly - it's just selective about who its friends are !
--------------------------------------------------------------------------

On Fri, May 27, 2016 at 10:36:28AM +0000, Martin Spott wrote:

> Ok, after getting errors twice while responding to the ticket, I chose
> to respond here:
>
> > #1693: Slow DNS lookups on tracsvn
> > ---------------------------+--------------------------
> > Reporter: strk | Owner: sac@…
> > Type: task | Status: new
> > Priority: normal | Milestone:
> > Component: Systems Admin | Keywords: tracsvn, dns
> > ---------------------------+--------------------------
> [...]
> > This tells me there's an issue with the DNS resolution.
> > I don't know which other services use DNS lookups, but if any
> > production service does too (besides Gogs, which is only experimental)
> > we'd better install a local caching server, like dnsmasq (or, as has been
> > suggested to me, "unbound").
>
> Hi Sandro, did you already switch Gogs over to using libc-based name resolution, as suggested?

I tried that locally and did not see a big speedup.
It went from something like 400ms to 330ms (locally).

> BTW, Trac error upon submission is:
>
> Trac detected an internal error:
>
> IndexError: list index out of range

I've filed https://trac.osgeo.org/osgeo/ticket/1694 for this.
You were considered a spammer.
I have now granted you SPAM_TRAIN, so you should be able to see the
monitoring page: https://trac.osgeo.org/osgeo/admin/spamfilter/monitor
(now clean, as I cleared it).

Not sure whether, as a spam-trainer, you would ever be considered a spammer again.

--strk;

#1693: Slow DNS lookups on tracsvn
--------------------------+--------------------
Reporter: strk | Owner: sac@…
     Type: task | Status: new
Priority: normal | Milestone:
Component: DNS | Resolution:
Keywords: tracsvn, dns |
--------------------------+--------------------
Changes (by TemptorSent):

* component: Systems Admin => DNS

--
Ticket URL: <https://trac.osgeo.org/osgeo/ticket/1693#comment:1>

Comment (by TemptorSent):

A local caching DNS server with DNSSEC capability is mandatory for proper
operation. As mentioned in the ticket, either dnsmasq
(http://www.thekelleys.org.uk/dnsmasq/doc.html) or Unbound
(https://www.unbound.net/index.html) would be a good option, and both
should be available to install using the package manager.

--
Ticket URL: <https://trac.osgeo.org/osgeo/ticket/1693#comment:2>

Comment (by TemptorSent):

To explain further --
A recursive caching name server looks up DNS information from the
appropriate authoritative name server and caches it locally. It discovers
that server more or less in the following manner:
- starting at a root name server, it sends a query for the target
record; the root name server replies with a referral to the name servers
for the TLD (top-level domain; "org" in the case of osgeo.org)
- it then follows each referral in turn (from the TLD servers to the
osgeo.org servers, and so on) until it finds an authoritative record for
the target
- each record has an associated TTL or "Time To Live", which determines
how long it may be cached
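
As a rough illustration of the referral walk described above, here is a toy
sketch that follows referrals through an in-memory table instead of the
network. Every server name, address and TTL in it is made up for the
example; a real resolver speaks the DNS protocol and handles many more
cases.

```python
# Toy data: each "server" maps a zone to either a referral or an answer.
# All names, servers, addresses and TTLs below are made up for illustration.
SERVERS = {
    "root": {"org.": ("referral", "org-tld")},
    "org-tld": {"osgeo.org.": ("referral", "osgeo-auth")},
    "osgeo-auth": {"trac.osgeo.org.": ("answer", ("192.0.2.10", 3600))},
}


def resolve(name, servers=SERVERS, start="root", max_hops=10):
    """Follow referrals from the root until a server returns an answer.

    Returns (address, ttl, path), where path lists the servers queried in order.
    """
    server, path = start, []
    for _ in range(max_hops):
        path.append(server)
        # Zones this server knows that contain the name (match on label boundaries).
        matches = [z for z in servers[server] if name == z or name.endswith("." + z)]
        if not matches:
            raise LookupError(f"{server} knows nothing about {name}")
        # The longest matching zone is the most specific one.
        kind, data = servers[server][max(matches, key=len)]
        if kind == "answer":
            address, ttl = data
            return address, ttl, path
        server = data  # referral: ask the next server down the tree
    raise LookupError("too many referrals")
```

Calling `resolve("trac.osgeo.org.")` visits "root", "org-tld" and
"osgeo-auth" in that order and returns the address together with its TTL,
which is the value a cache would use to decide when to ask again.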

DNSSEC uses cryptographic signatures to ensure that only records actually
originating from an authorized authoritative name server are accepted --
without this, a technique called DNS cache poisoning can be used to insert
records linking valid names to malicious addresses. Ensuring that only
DNSSEC-authenticated records are cached, when signatures are available,
will prevent a large class of DNS-related exploits.

--
Ticket URL: <https://trac.osgeo.org/osgeo/ticket/1693#comment:3>

Comment (by wildintellect):

I'm in favor of trying unbound; I have seen it in use around my work, and
it is easy to remove if it doesn't work. Question is, can we get it from
the repos on such an old box?

--
Ticket URL: <https://trac.osgeo.org/osgeo/ticket/1693#comment:4>

Comment (by strk):

It would also be helpful to understand which services are doing
DNS lookups, in case those lookups can be disabled. For example,
I know for sure that Gitea is doing DNS lookups for finding
user avatars, and I also know Gitea must be compiled in a certain
way for it to use the system resolver.

--
Ticket URL: <https://trac.osgeo.org/osgeo/ticket/1693#comment:5>

Comment (by TemptorSent):

Pretty much every service that logs incoming connections will be hitting
DNS (unless you explicitly turn it off, leaving just IPs), as does the mail
server (mandatory), and anything else that wants to know the name, IP
address, mail server, DKIM keys, or anything else that DNS provides.
Disabling lookups makes sense in very specific cases where many cache
misses occur on a regular basis, but otherwise it just makes sense to use
a local fast cache. We don't necessarily need to use the system resolver
from Gitea, just point it at the IP of the local caching server to use for
lookups.
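
To make "point it at the IP of the local caching server" concrete, here is
a minimal sketch that bypasses the system resolver and sends a DNS query
straight to one chosen server over UDP. It assumes a cache listening on
127.0.0.1 port 53; the function names and the fixed query id are made up
for the example, and how Gitea itself is told which server to use is not
covered here.

```python
import socket
import struct


def build_query(hostname, query_id=0x1234):
    """Build a minimal DNS query packet asking for the A record of hostname."""
    # Header: id, flags (0x0100 = recursion desired), 1 question, 0 other records.
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    # Question name: each label prefixed by its length, ended by a zero byte.
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in hostname.rstrip(".").split(".")
    ) + b"\x00"
    # QTYPE 1 = A record, QCLASS 1 = IN (Internet).
    return header + qname + struct.pack("!HH", 1, 1)


def query_server(hostname, server="127.0.0.1", port=53, timeout=2.0):
    """Send the query over UDP to one specific server and return the raw reply."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        sock.sendto(build_query(hostname), (server, port))
        reply, _ = sock.recvfrom(512)
    return reply
```

`query_server("osgeo.org")` returns the undecoded reply bytes, or raises a
timeout error if nothing answers on that address; decoding the answer
section is left out to keep the sketch short.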

--
Ticket URL: <https://trac.osgeo.org/osgeo/ticket/1693#comment:6>

Comment (by strk):

Both are now available as packages on the current TracSVN machine:
{{{
unbound 1.4.22-1~bpo70+1 (Installed-Size: 1585)
dnsmasq 2.62-3+deb7u4 (Installed-Size: 120)
}}}

I'd go the smaller route unless Unbound does something that dnsmasq
does not.

--
Ticket URL: <https://trac.osgeo.org/osgeo/ticket/1693#comment:7>

Comment (by strk):

Here are instructions to setup dnsmasq on Debian:
https://wiki.debian.org/HowTo/dnsmasq

--
Ticket URL: <https://trac.osgeo.org/osgeo/ticket/1693#comment:8>

Comment (by martin):

In my experience, setting up dnsmasq as a DNS cache on current
Debian systems requires nothing more than installing the package.

Please let me know if it doesn't.

--
Ticket URL: <https://trac.osgeo.org/osgeo/ticket/1693#comment:9>

Comment (by strk):

I've installed dnsmasq but /etc/resolv.conf is not automatically
updated to point to localhost, which makes me think it is NOT
enough to install the package.

I've manually added 127.0.0.1 as the first entry in /etc/resolv.conf.

Not sure how to test/compare performance now though
(nor how to convince Gitea to use 127.0.0.1 to look up
names).
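
One possible way to compare, sketched under the assumption that timing the
system resolver from Python is representative enough: time the same lookups
several times in a row. The function name and the default of three repeats
are made up for the example.

```python
import socket
import time


def time_lookups(hostnames, resolve=socket.gethostbyname, repeats=3):
    """Return {hostname: [seconds per attempt]} using the given resolver."""
    timings = {}
    for name in hostnames:
        samples = []
        for _ in range(repeats):
            start = time.perf_counter()
            try:
                resolve(name)
            except OSError:
                pass  # failed lookups still cost time, so keep the sample
            samples.append(time.perf_counter() - start)
        timings[name] = samples
    return timings
```

Running `time_lookups(["osgeo.org", "debian.org"])` before and after adding
127.0.0.1 to /etc/resolv.conf should show whether repeated lookups of the
same name get faster once the cache answers them. It says nothing about
Gitea if Gitea does not go through the system resolver.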

--
Ticket URL: <https://trac.osgeo.org/osgeo/ticket/1693#comment:10>

Comment (by martin):

Ah OK, thanks for responding; maybe I'm using a more recent version of the
package.

The installer of the dnsmasq package I know puts the local host as the one
and only nameserver in /etc/resolv.conf and writes the forwarder into a
file somewhere in /var/run/dnsmasq/ or the like. If the dnsmasq process
proves to be robust, let's try that.

--
Ticket URL: <https://trac.osgeo.org/osgeo/ticket/1693#comment:11>

Comment (by strk):

I did not follow the Debian guide completely, so for example
I don't know if dnsmasq is serving every host rather than just
localhost. Can you check that part, Martin?
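
One way to check this on Linux, as a sketch: read /proc/net/udp and list
the local addresses that have a UDP socket bound to port 53. The function
name is made up for the example, only IPv4 over UDP is covered, and a
little-endian machine (such as x86) is assumed when decoding the addresses.

```python
import socket
import struct


def udp_listeners(proc_net_udp_text, port=53):
    """Return the IPv4 addresses with a UDP socket bound to the given port.

    Expects the text of /proc/net/udp (Linux). Addresses there are hex-encoded
    32-bit words in host byte order; little-endian (e.g. x86) is assumed here.
    """
    addresses = []
    for line in proc_net_udp_text.splitlines()[1:]:   # first line is a header
        fields = line.split()
        if len(fields) < 2:
            continue
        hex_ip, hex_port = fields[1].split(":")
        if int(hex_port, 16) == port:
            addresses.append(socket.inet_ntoa(struct.pack("<I", int(hex_ip, 16))))
    return addresses
```

For example, `udp_listeners(open("/proc/net/udp").read())` returning only
"127.0.0.1" would suggest the cache is reachable from localhost only, while
"0.0.0.0" would mean it listens on every interface. The file does not say
which process owns the socket.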

--
Ticket URL: <https://trac.osgeo.org/osgeo/ticket/1693#comment:12>

Comment (by martin):

Is every service using the system resolver, or do some local services rely
on a custom DNS configuration?

--
Ticket URL: <https://trac.osgeo.org/osgeo/ticket/1693#comment:13>

Comment (by strk):

I'm not sure what Gitea (the focus of this ticket) is using, and I
don't know how to tell. I have no idea about any other service.

--
Ticket URL: <https://trac.osgeo.org/osgeo/ticket/1693#comment:14>

Comment (by martin):

I'd recommend making the DNS cache listen on localhost only and configuring
resolv.conf to query localhost first.\\
If we're unsure which services rely on connecting to the cache, we might
simply enable the configuration that way and see if it breaks anything. OK?

--
Ticket URL: <https://trac.osgeo.org/osgeo/ticket/1693#comment:15>

Comment (by strk):

Fine by me

--
Ticket URL: <https://trac.osgeo.org/osgeo/ticket/1693#comment:16>

Comment (by strk):

Just make sure to create a dnsmasq configuration file (and please put it
under revision control), because the trick of having non-localhost
addresses in /etc/resolv.conf is supported by dnsmasq as a poor man's
configuration (dnsmasq will use the non-localhost addresses as
upstreams).
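
To see what that "poor man's configuration" currently amounts to, here is
a small sketch that splits the nameserver entries of a resolv.conf into
loopback addresses and everything else (the ones dnsmasq would treat as
upstreams, per the above). The function name is made up for the example,
and entries that are not plain IP addresses are simply skipped.

```python
import ipaddress


def split_nameservers(resolv_conf_text):
    """Split the nameserver entries of a resolv.conf into (local, upstream)."""
    local, upstream = [], []
    for line in resolv_conf_text.splitlines():
        line = line.strip()
        if line.startswith(("#", ";")):
            continue  # comment line
        fields = line.split()
        if len(fields) >= 2 and fields[0] == "nameserver":
            address = fields[1]
            try:
                is_loopback = ipaddress.ip_address(address).is_loopback
            except ValueError:
                continue  # not a plain IP address; skip it in this sketch
            (local if is_loopback else upstream).append(address)
    return local, upstream
```

`split_nameservers(open("/etc/resolv.conf").read())` gives the two lists in
file order, so it also shows whether 127.0.0.1 really comes first.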

--
Ticket URL: <https://trac.osgeo.org/osgeo/ticket/1693#comment:17>