updated November 2024

Taking control of your DNS

How to set up bind9 as an authoritative name server on a Debian Bookworm cloud VM and make sure it behaves compliantly, satisfying the requirements of DNS Flag Day 2020.

In addition there are some notes on configuring fail2ban to protect our nameserver with a dynamic firewall.

I'm available for hire, but once you read this you probably don't need me!

The scenario

The setup described is suitable for a small web / email server with maybe a couple of fairly low traffic domains. Without getting heavily into the arguments about DNS redundancy requirements, which can be found elsewhere: do you have database replication? Do you have backup web and email servers? If not, why is a master slave arrangement for your DNS necessary, and why must DNS be served by a third party over which you don't have full control? Would you not like to have as much control over your DNS as is possible for a small operation?

Software

Debian Packages bind9, bind9-dnsutils, bind9-host, bind9-libs:amd64, bind9-utils, dnsutils, fail2ban and haveged should be installed.

Firewalling

DNS queries are made to port 53, which must be open for both UDP and TCP. If you have ufw installed:-

ufw allow 53/udp
ufw allow 53/tcp

Authoritative-only name server

We'll make a start by adding to the options section, which is found in /etc/bind/named.conf.options. In particular, the essential step to limit memory usage is to configure bind to be an authoritative-only name server. We do not want to act as a recursive name server, resolving and caching details for domains that we do not have authority for. That is the role of the nameservers you specify in the /etc/resolv.conf file of your local machine, that is to say, converting the web site name you enter into your browser's address bar into the correct IP address.

In addition to the 3 lines of configuration that enforce this behaviour, we'll add the rate-limit options suggested in the article linked in the comment; their purpose is to reduce the effectiveness of amplification attacks. We're not going to endure the pain of adding DNSSEC, so we can also comment out the default dnssec-validation setting and make our choice explicit. With some of the original comments removed for clarity, our edited file now looks like this:

options {
  directory "/var/cache/bind";        
        
  //dnssec-validation auto;
  dnssec-validation no;
  listen-on-v6 { any; };

  //added as suggested https://www.logcg.com/en/archives/1681.html
  rate-limit {
    ipv4-prefix-length 32;
    window 10;
    responses-per-second 5;
    errors-per-second 5;
    nxdomains-per-second 5;
    slip 2;
  };
  
  version  "unknown";
  //added 3 lines - we're authoritative only
  auth-nxdomain no;    // conform to RFC1035
  allow-query     { any; };
  recursion no;
};
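
Once bind has been restarted with these options, one way to confirm that recursion really is off is to ask it for a domain we hold no zone for and look at the status in dig's reply header, which should be REFUSED. A minimal sketch follows; the helper name dig_status is made up for this illustration, and example.com merely stands in for any domain the server is not authoritative for.

```shell
# dig_status: read dig's full output on stdin and print the status field
# (NOERROR, REFUSED, NXDOMAIN, ...) taken from its ->>HEADER<<- line.
# The function name is hypothetical; it is not part of bind9 or dnsutils.
dig_status() {
    sed -n 's/.*status: \([A-Z]*\),.*/\1/p' | head -n 1
}

# Intended use on the name server itself (do not add +short, which
# suppresses the header this helper reads):
#   dig @localhost example.com A | dig_status
```

If this prints NOERROR for a foreign domain, the server is still answering recursively and the options file is worth a second look.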

Logging

The most appropriate file for other configuration changes is /etc/bind/named.conf.local, which is already referenced in /etc/bind/named.conf. Bind logging is infinitely adjustable, but we just want to dump all its output into a separate file and from there rely on other software to parse the information. logwatch is eminently suited for this.
logging {
  channel bind_log {
    file "/var/log/bind/bind.log" versions 3 size 5m;
    severity info;
    print-category yes;
    print-severity yes;
    print-time yes;
  };
  category default { bind_log; };
  category update { bind_log; };
  category update-security { bind_log; };
  category security { bind_log; };
  category queries { bind_log; };
  category lame-servers { null; };
};
Unfortunately this configuration will prevent bind from starting; the problem is that apparmor needs to be tweaked to allow the named daemon access to the new file. Add the following line to /etc/apparmor.d/usr.sbin.named and restart the apparmor service. The /var/log/bind directory must also exist and be writable by the bind user, as named will not create it.
/var/log/bind/bind.log rw,
We also need to ensure our new logs are rotated, by creating /etc/logrotate.d/bind
/var/log/bind/bind.log {
  daily
  missingok
  rotate 7
  compress
  delaycompress
  notifempty
  create 644 bind bind
  postrotate
    /usr/sbin/invoke-rc.d bind9 reload > /dev/null
  endscript
}
The same file, /etc/bind/named.conf.local, can also be used to inform the bind server about the zone files we are going to create. We're going to create zone files to resolve IP addresses for two domains. For now, add the following information, substituting your own domains for the ones we're using here.
zone "example.co.uk"{
    type master;
    file "/etc/bind/zones/db.example.co.uk";
};

zone "example.org"{
    type master;
    file "/etc/bind/zones/db.example.org";
};
We can now check our configuration for correct syntax; no output from

root@badwolf:/etc/bind# named-checkconf

is the good news we want to see.

The zone files

These are the files that were referenced in the /etc/bind/named.conf.local file. Create the /etc/bind/zones directory if it doesn't exist, and in it the db.example.co.uk file with contents similar to this. When doing this for real, substitute your own domain name, replacing x.x.x.x with your ipv4 address and x:x::x:x with your ipv6 address. For the DKIM record you'd also replace <selector> and <dkim public key> with the values you use (see the notes below for help with this):
$TTL	3H
@	IN	SOA	@ postmaster.example.co.uk. (
			  0		; Serial
			  3H		; Refresh
			  1H		; Retry
			  1W		; Expire
			  3H )		; Negative Cache TTL
@  IN   NS   ns.example.co.uk.
ns IN   A    x.x.x.x  
ns IN   AAAA x:x::x:x
 
@   IN   A    x.x.x.x
@   IN   MX   10 mx.example.co.uk.
@   IN	 AAAA x:x::x:x
@   IN   TXT  "v=spf1 ip4:x.x.x.x ip6:x:x::x:x -all"

mx  IN   A    x.x.x.x
mx  IN	 AAAA x:x::x:x

www IN   A    x.x.x.x
www IN	 AAAA x:x::x:x 

<selector>._domainkey.example.co.uk. IN TXT "v=DKIM1;k=rsa;p=<dkim public key>"

_dmarc.example.co.uk. IN TXT "v=DMARC1; p=reject; adkim=s; aspf=s"

Some notes about this. Firstly, and as we will see, this is by no means the only way to configure a zone file. Time to live, $TTL 3H: resolving DNS servers should cache records from this file for no more than 3 hours, after which they should request the information again.

@ is shorthand for the domain itself. Names written out in full, as for some of the subdomains being created, must be terminated by a period/full stop (.); names without one are taken as relative to the domain. IN is short for internet. postmaster.example.co.uk. in the start of authority (SOA) line is actually an email address, with the first . replacing the more usual @.
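
The email-to-SOA conversion is easy to get wrong by hand, particularly when the mailbox name itself contains a dot, which has to be escaped with a backslash in the zone file. A small sketch, assuming a plain user@domain address; soa_rname is a hypothetical helper name, not a bind9 tool.

```shell
# soa_rname EMAIL
# Print EMAIL in the form used in the SOA record's responsible-person
# field: the @ becomes a dot, any dots in the local part are escaped
# with a backslash, and the name ends with a trailing dot.
# Hypothetical helper for illustration only.
soa_rname() {
    local_part=${1%@*}      # everything before the last @
    domain_part=${1##*@}    # everything after the last @
    escaped=$(printf '%s' "$local_part" | sed 's/\./\\./g')
    printf '%s.%s.\n' "$escaped" "$domain_part"
}

# Example calls:
#   soa_rname postmaster@example.co.uk
#   soa_rname john.doe@example.org
```

The first call prints the postmaster.example.co.uk. value used in the zone file above.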

In this SOA record, the serial is an arbitrary number that must be incremented every time any changes are made and the name server is restarted.
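
This article simply counts up from 0. A widely used alternative convention, not the one used here, is a date-based serial of the form YYYYMMDDnn, which also records when the zone was last touched. A sketch of how the next serial could be computed under that convention; next_serial is a hypothetical helper name.

```shell
# next_serial CURRENT [TODAY]
# Print the next zone serial using the YYYYMMDDnn convention.
# CURRENT is the serial currently in the zone file; TODAY (YYYYMMDD)
# defaults to today's date. If CURRENT is older than today's first
# serial we jump to YYYYMMDD00, otherwise we just add one, so the
# result is always greater than CURRENT.
# Hypothetical helper for illustration only.
next_serial() {
    current=$1
    today=${2:-$(date +%Y%m%d)}
    base="${today}00"
    if [ "$current" -lt "$base" ]; then
        echo "$base"
    else
        echo $((current + 1))
    fi
}

# Example calls:
#   next_serial 0 20241105
#   next_serial 2024110500 20241105
```

Moving from a serial of 0 to a date-based one is safe because the new value is larger; going the other way is not, as secondaries would ignore the apparently older zone.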

The other values within the SOA record parentheses are instructions to secondary name servers (more on that below). They should query our primary name server after the refresh time to check if the serial has changed. If they fail to get a response, they should wait the retry time before trying again. If they still have no reply after the expire time, they should cease to offer resolution to any queries. A failure to answer a query, if the requested record is not held for instance, should be cached for the Negative Cache TTL time. The comments after the ; semi-colon are for readability and are not required for the correct formatting of the zone file.
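
Tools such as dig report these timers in plain seconds rather than the 3H / 1W shorthand used in the zone file, so it can help to convert between the two when comparing. A rough sketch that handles a number with a single optional unit suffix only (not compound values such as 1h30m); ttl_seconds is a hypothetical helper name.

```shell
# ttl_seconds VALUE
# Convert a zone-file time such as 3H, 1W or 300 into seconds.
# Supports one optional unit suffix: S, M, H, D or W (either case).
# Hypothetical helper for illustration only.
ttl_seconds() {
    value=$1
    number=${value%[SsMmHhDdWw]}   # strip a trailing unit letter, if any
    unit=${value#"$number"}        # whatever was stripped
    case $unit in
        ''|[Ss]) mult=1 ;;
        [Mm])    mult=60 ;;
        [Hh])    mult=3600 ;;
        [Dd])    mult=86400 ;;
        [Ww])    mult=604800 ;;
    esac
    echo $((number * mult))
}

# Example calls:
#   ttl_seconds 3H
#   ttl_seconds 1W
```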

In addition to creating the name server (NS) record, we're creating a mail exchanger (MX) record, ipv4 or A records and ipv6 or AAAA records. The TXT record for the domain is to configure Sender Policy Framework (SPF). This is a protocol to counter email spammers and phishers who may forge email from addresses using our domain. In this case we are indicating that unless mail from example.co.uk originates at the ipv4 address x.x.x.x or ipv6 address x:x::x:x it should be refused.

The other two TXT records are actually for subdomains, as mandated by the DKIM and DMARC protocols. The DNS TXT record is only one part of setting up DKIM; your mail exchange software must also be configured to sign outgoing email with the private DKIM key. The selector can be anything you like - I use a date string such as 202204 to remind me when this key was set up. Angle brackets enclosing selector and key are only to indicate a substitution is to be made. The DMARC TXT record informs receiving mail exchangers that email that genuinely originates from example.co.uk should have both SPF and DKIM values. Note the period (.) at the end of these subdomains.
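
One practical wrinkle with the DKIM record: a single quoted string in a TXT record may hold at most 255 bytes, and a 2048-bit RSA public key is longer than that, so the value has to be written as several quoted strings, which resolvers join back together. A sketch of one way to do the splitting; dkim_txt_chunks is a hypothetical helper name and the chunk size of 200 is an arbitrary choice below the limit.

```shell
# dkim_txt_chunks VALUE
# Print VALUE as a sequence of double-quoted strings of at most 200
# characters each, separated by spaces, ready to paste after "IN TXT".
# Assumes VALUE contains no double quotes of its own.
# Hypothetical helper for illustration only.
dkim_txt_chunks() {
    printf '%s\n' "$1" | fold -w 200 | sed 's/.*/"&"/' | paste -sd ' ' -
}

# Example call, with the real key substituted for the placeholder:
#   dkim_txt_chunks "v=DKIM1;k=rsa;p=<dkim public key>"
```

A short value comes back as a single quoted string, so the helper is harmless for 1024-bit keys too.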

We can now run some checks:-

root@example.co.uk:# named-checkconf 
root@example.co.uk:#
root@example.co.uk:# named-checkzone example.co.uk /etc/bind/zones/db.example.co.uk 
zone example.co.uk/IN: loaded serial 0 
OK
root@example.co.uk:#
and provided they pass as above, we can restart bind9, after which a further check to confirm we have created the name server can be made:-
root@example.co.uk:# dig @localhost NS example.co.uk +short
ns.example.co.uk.
root@example.co.uk:#

The glue record

So far we have a name server that will answer any queries it receives from the local machine; we will want to ask our domain registrar to configure example.co.uk to use ns.example.co.uk as its name server, and we should be able to do that from the account log in provided. There is a problem though; if example.co.uk is to be reached at its x.x.x.x ipv4 address, how are queries to ns.example.co.uk, which is in that domain, to reach its name server? The solution is a glue record, which the domain registrar should allow you to create.

With a glue record in place, the ns.example.co.uk should be accepted as the name server and within a few hours to days, once DNS has propagated, the dig command can be repeated from an outside machine and should confirm we are done:

david@bulawayo:~ $ dig @ns.example.co.uk NS example.co.uk +short
ns.example.co.uk.
david@bulawayo:~ $

Testing resources

The test at dnsflagday is important. Poorly behaving name servers are slowing down the internet and in response a group of the major providers is going to refuse to struggle with them, which will result in domains that are poorly served by their name servers going off line.

SPF, DKIM and DMARC records can be tested for at the mxtoolbox site. Remember that with DKIM the DNS record is only part of the setup; if you're running exim4 the write up here can help with DKIM signing.

Adding a second domain to our name server

We'll demonstrate this with example.org. Since we already have a name server running on our VM in our example.co.uk domain there is no need to create a second name server. Because of this, the zone file needs to be a little different.
$TTL 3H
@   IN SOA  ns.example.co.uk. postmaster.example.co.uk. (
                0   ; serial
                3H  ; refresh
                1H  ; retry
                1W  ; expire
                3H )    ; minimum				
 
@   IN   NS  ns.example.co.uk.
@   IN   A   x.x.x.x
@   IN   MX  10 mx.example.org.
@   IN	 AAAA x:x::x:x

mx IN  A x.x.x.x
mx IN  AAAA x:x::x:x

www IN   A x.x.x.x
www IN	 AAAA x:x::x:x

<selector>._domainkey.example.org. IN TXT "v=DKIM1;k=rsa;p=<dkim public key>"

_dmarc.example.org. IN TXT "v=DMARC1; p=reject; adkim=s; aspf=s"

;

The major difference is that we specify the NS record, but do not add an A record for it, as that is done within its own zone file. Also note that we are creating a second DNS personality for the single mail exchanger we have and as explained here, there may be reason to do this differently.

A work around

Some TLDs, such as .org, insist that the domain configuration must include more than a single name server. Some providers (Linode is an example) easily allow the creation of a slave zone that accepts a zone transfer (or AXFR) with pretty much zero configuration. So as per Linode's instructions you can cater for this issue when you add the zone to the /etc/bind/named.conf.local file, which we need to revisit. Now only the changes from the initial version above are shown:-

 
//added for a second domain
zone "example.org" {
    type master;
    file "/etc/bind/zones/db.example.org";	  
    //AXFR transfer to the linode name servers
    allow-transfer {     
	96.126.114.97;
	96.126.114.98;
	2600:3c00::5e;
	2600:3c00::5f;   
   };
   also-notify {
	96.126.114.97;
	96.126.114.98;
	2600:3c00::5e;
	2600:3c00::5f;
  };
};
You must also configure the domain at the registrar with the Linode name servers, and remember to increment the Serial of the zone file when any changes to records are made. The relevant snip, which you would change from 0 to 1 to start a transfer:-
$TTL 3H
@   IN SOA  ns.example.co.uk. postmaster.example.co.uk. (
                1   ; serial
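
Since the allow-transfer and also-notify lists repeat the same addresses, and every extra domain needs a stanza of the same shape, the stanza could be generated rather than typed. A sketch in shell, assuming the /etc/bind/zones/db.<domain> layout used in this article; zone_stanza is a hypothetical helper name.

```shell
# zone_stanza DOMAIN [SECONDARY_IP ...]
# Print a named.conf.local zone stanza for DOMAIN. Any further
# arguments are treated as secondary name server addresses and are
# listed in both allow-transfer and also-notify.
# Hypothetical helper for illustration only.
zone_stanza() {
    domain=$1
    shift
    printf 'zone "%s" {\n' "$domain"
    printf '    type master;\n'
    printf '    file "/etc/bind/zones/db.%s";\n' "$domain"
    if [ "$#" -gt 0 ]; then
        for clause in allow-transfer also-notify; do
            printf '    %s {\n' "$clause"
            for addr in "$@"; do
                printf '        %s;\n' "$addr"
            done
            printf '    };\n'
        done
    fi
    printf '};\n'
}

# Example call, printing to the terminal for review:
#   zone_stanza example.org 96.126.114.97 96.126.114.98 2600:3c00::5e 2600:3c00::5f
```

Printing to the terminal first and pasting the result in by hand keeps named-checkconf as the final arbiter of what goes into the live configuration.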

Reverse name resolution

It's very unlikely that our VM provider hands over authority for the reverse DNS of the ipv4 addresses it allocates; in all likelihood you would ask them to configure this. So we'll illustrate this with our domains running in a local area network using the 192.168.0.0/24 range. We'll have example.co.uk being served at 192.168.0.5 and example.org at 192.168.0.6. The aim is to have a lookup of those ipv4 addresses resolve to the respective domain names. This is done with PTR records.

We add a new zone, 0.168.192.in-addr.arpa to our named.conf.local with its file db.192.168.0 having these contents:

$TTL 3H
@    IN      SOA     ns.example.co.uk. postmaster.example.co.uk. (
                          0         ; Serial
                          3H       ; Refresh after 3 hours
                          1H       ; Retry after 1 hour
                          1W       ; Expire after 1 week
                          3H )     ; Negative caching TTL
0.168.192.in-addr.arpa.       IN      NS      ns.example.co.uk.
5   IN      PTR     example.co.uk.
6   IN      PTR     example.org.
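
The reverse zone name is just the network part of the address with its octets reversed, followed by in-addr.arpa, and each PTR owner name is the host octet within that zone. A sketch of the mapping for a /24 network such as the one above; both function names are hypothetical.

```shell
# reverse_zone_24 IPV4
# Print the reverse zone name for the /24 network containing IPV4.
# Hypothetical helper for illustration only.
reverse_zone_24() {
    old_ifs=$IFS
    IFS=.
    set -- $1            # split the dotted quad into four octets
    IFS=$old_ifs
    echo "$3.$2.$1.in-addr.arpa"
}

# ptr_name IPV4
# Print the fully qualified name that a reverse lookup of IPV4 queries.
# Hypothetical helper for illustration only.
ptr_name() {
    old_ifs=$IFS
    IFS=.
    set -- $1
    IFS=$old_ifs
    echo "$4.$3.$2.$1.in-addr.arpa."
}

# Example calls:
#   reverse_zone_24 192.168.0.5
#   ptr_name 192.168.0.6
```

Once the zone is loaded, dig -x 192.168.0.5 @localhost +short is the usual way to check that the PTR record answers with example.co.uk.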

Fending off the bad guys

As with any other service offered by a server, bad guys will attempt to abuse our name server. Here's a couple of lines from /var/log/bind/bind.log after logging was set up as described above. I've seen lines similar to the first of these being repeated hundreds of times in an eye blink and this is what we want to stop.
07-May-2022 00:47:14.161 query-errors: info: client @0x7f5990623690 167.94.138.62#48147 (ip.parrotdns.com): query failed (REFUSED) for ip.parrotdns.com/IN/A at query.c:5498

07-May-2022 14:51:16.395 rate-limit: info: client @0x7f599061b810 85.94.75.139#51396 (ns.example.co.uk): rate limit slip response to 85.94.75.139/32 for ns.example.co.uk IN AAAA  (c1f69fe2)
Fail2ban offers two jails to protect bind's named daemon, but received wisdom is that it's best not to enable the named-refused-udp jail as this can lead to your server becoming involved in an amplification attack on another nameserver.

So create /etc/fail2ban/jail.local. Note that fail2ban enables the ssh jail by default, so this config assumes you have made arrangements that make this unnecessary:-

[DEFAULT]
bantime = 99h
usedns = no
findtime = 300
maxretry = 2

[sshd]
port    = ssh
logpath = %(sshd_log)s
backend = %(sshd_backend)s
enabled = false

[named-refused-tcp]
enabled  =  true
port     = domain,953
protocol = tcp
filter   = named-refused
logpath  = /var/log/bind/bind.log

[named-refused-udp]
enabled  = false
port     = domain,953
protocol = udp
filter   = named-refused
logpath  = /var/log/bind/bind.log
A final step is to adjust the standard filter for the named-refused-tcp jail to ensure that log lines with both query-errors: and rate-limit: are matched. Create the file /etc/fail2ban/filter.d/named-refused.local:-
[Definition]

# Daemon name
_daemon=named

# Shortcuts for easier comprehension of the failregex

__pid_re=(?:\[\d+\])
__daemon_re=\(?%(_daemon)s(?:\(\S+\))?\)?:?
__daemon_combs_re=(?:%(__pid_re)s?:\s+%(__daemon_re)s|%(__daemon_re)s%(__pid_re)s?:)

#       hostname       daemon_id         spaces
# this can be optional (for instance if we match named native log files)
__line_prefix=(?:\s\S+ %(__daemon_combs_re)s\s+)?

failregex = .*(query-errors|rate-limit): info: client @0x[a-z0-9]{12} <HOST>#.*


ignoreregex =
Note that some of that is copied from the standard /etc/fail2ban/filter.d/named-refused.conf file which we are overriding and may not be essential. It's the failregex line that's important and all that's needed now is to restart fail2ban and have bad guys jailed for several days.
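
Before relying on the jail, it can be reassuring to see which clients the log actually contains and how often each appears. A rough sketch that tallies client addresses from query-errors and rate-limit lines in the format shown above, each log entry being a single line in the real file; count_offenders is a hypothetical helper name and the pattern is a simplified cousin of the failregex, not the filter itself.

```shell
# count_offenders: read bind log lines on stdin and print a count per
# client address for query-errors and rate-limit entries, busiest first.
# Hypothetical helper for illustration only.
count_offenders() {
    sed -n -E 's/.*(query-errors|rate-limit): info: client @0x[0-9a-f]+ ([0-9A-Fa-f.:]+)#[0-9]+ .*/\2/p' |
        sort | uniq -c | sort -rn
}

# Intended use on the name server (not run here):
#   count_offenders < /var/log/bind/bind.log | head
```

Addresses that show up here with large counts are the ones fail2ban should be banning once the jail is active; fail2ban-client status named-refused-tcp lists what it has actually caught.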