Version 9 (modified by ahu, 10 months ago)
Large Scale DNSSEC Best Current Practices (or 'Best Current Problems')
This is a list of things we have noticed during several large scale PowerDNS DNSSEC migrations.
100% PowerDNS related
- Do NOT use PowerDNS 3.0 or 3.0.1 for large scale DNSSEC: these versions have too many bugs. The documentation already mentions this, and DNSSEC in 3.0 has been officially deprecated (http://mailman.powerdns.com/pipermail/pdns-users/2012-July/009099.html).
- Even 3.1 has some bugs, notably when signing broken records (see 'pdnssec check-all-zones' below; you can remove such records) and with wildcard records from pre-signed zones, for which a fix is available.
- Various DNSSEC caches also cost memory. We encountered a signing master that was swapping all the time. This kills performance dead. Check if your (virtual) server is short on memory before signing. Also note that due to the way PowerDNS calculates RRSIG expiry, you may see sudden memory jumps at 0:00 on Thursday. Be prepared for those.
- The same goes for CPU usage: even though we worked hard to keep things efficient, DNSSEC costs CPU. Check if your (virtual) server has enough of it before signing.
- The static PowerDNS Auth 3.1 packages for 64-bit Debian crash easily on Ubuntu 12.04 LTS. This can be fixed by compiling PowerDNS yourself, or by contacting us for an improved binary. The crash is due to a conflict between our static binaries and dynamic gethostbyname NSS calls.
- Either stay close to default PowerDNS DNSSEC parameters or consult an expert. Test your parameters against http://dnsviz.net/ and http://dnssec-debugger.verisignlabs.com/.
- Updating from PowerDNS 2.9.x to 3.1 requires a database schema upgrade. Not performing the upgrade can lead to silent failures. Please consult http://doc.powerdns.com/upgrades.html#from2.9to3.0 and http://doc.powerdns.com/from3.0to3.1.html as relevant.
- At all times, keep the 'auth' and 'ordername' fields in the database correct, or run 'pdnssec rectify-zone' on changes. Read http://doc.powerdns.com/dnssec-modes.html#dnssec-direct-database for more details.
- You seriously need to run 'pdnssec check-all-zones'. If an RRSET has one broken record in it, the entire RRSET cannot be signed. This leads to pain. In addition, PowerDNS sometimes reacts badly to trying to sign broken records.
- If you use MySQL or PostgreSQL on Linux and you connect over TCP/IP to 127.0.0.1, and you run a big slaving operation, you will need to set /proc/sys/net/ipv4/tcp_tw_recycle to 1 to prevent errors about the kernel being unable to assign an address connecting to 127.0.0.1. Alternatively, connect via UNIX domain sockets.
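The Thursday memory jumps mentioned above come from signature validity being aligned to week boundaries. A minimal sketch of that arithmetic follows. It relies on the fact that the Unix epoch (1970-01-01) was a Thursday, so whole weeks counted from the epoch start on Thursday 00:00 UTC. The function names, the choice of UTC, and the exact inception and expiry offsets (one week back, two weeks ahead) are assumptions for illustration; check them against your PowerDNS version before relying on them.

```python
from datetime import datetime, timezone

WEEK = 7 * 86400  # seconds in one week


def start_of_week(now):
    """Most recent Thursday 00:00 UTC at or before `now` (Unix seconds).

    The Unix epoch fell on a Thursday, so rounding down to a whole number
    of weeks since the epoch always lands on Thursday 00:00 UTC.
    """
    return now - (now % WEEK)


def signature_window(now, before=WEEK, after=2 * WEEK):
    """Return (inception, expiry) for signatures generated at `now`.

    ASSUMPTION: inception is `before` seconds ahead of the week start and
    expiry is `after` seconds past it; the defaults are illustrative.
    """
    start = start_of_week(now)
    return start - before, start + after


def next_rollover(now):
    """Next Thursday 00:00 UTC, when all signatures are expected to change."""
    return start_of_week(now) + WEEK


if __name__ == "__main__":
    now = int(datetime.now(timezone.utc).timestamp())
    inception, expiry = signature_window(now)
    for label, ts in (("inception", inception), ("expiry", expiry),
                      ("next rollover", next_rollover(now))):
        print(label, datetime.fromtimestamp(ts, timezone.utc).isoformat())
```

Running this on the signing master shows when the next jump is due, so you can watch memory and CPU around that moment.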
Somewhat PowerDNS related
- Stay away from mixed authoritative and recursive operation on one IP address. Not only does this make life complicated, but PowerDNS with DNSSEC also appears to have some bugs in this area.
- When using slaves that AXFR your signed zones, be sure that your slaves actually support serving DNSSEC. Some servers will gladly AXFR a signed zone, but not perform DNSSEC processing on it. This is the case for PowerDNS 2.9.x.
- Key rollovers. PowerDNS automatically renews the RRSIGs (the signatures for your DNS data), so you don't need to do anything (but look at SOA-EDIT if you have non-PowerDNS AXFR slaves). There are documents which tell you to roll your DNS keys frequently, although it is now believed such automatic rolling is not required. In any case, if you are doing a large scale migration, it is advised to initially not roll keys until the dust has settled.
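To verify that a slave really serves DNSSEC, as advised above, you can send it a query with the EDNS0 'DNSSEC OK' (DO) bit set and see whether RRSIG records come back. The sketch below does this with only the Python standard library. The function names are made up for this example, it speaks UDP only, and it does not handle truncated replies, so treat it as a quick smoke test rather than a full validator.

```python
import socket
import struct

SOA = 6     # RR type code for SOA
OPT = 41    # RR type code for the EDNS0 OPT pseudo-record (RFC 6891)
RRSIG = 46  # RR type code for RRSIG (RFC 4034)


def build_query(name, qtype=SOA, qid=0x1234, bufsize=4096):
    """Build a DNS query for `name` with the EDNS0 DO bit set."""
    # Header: id, flags (plain query, no recursion desired),
    # 1 question, 0 answers, 0 authority, 1 additional (the OPT record).
    header = struct.pack("!HHHHHH", qid, 0x0000, 1, 0, 0, 1)
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.rstrip(".").split(".")
    ) + b"\x00"
    question = qname + struct.pack("!HH", qtype, 1)  # class IN
    # OPT: root name, type 41, class = UDP payload size,
    # TTL field carries the flags (DO = 0x8000), empty rdata.
    opt = b"\x00" + struct.pack("!HHIH", OPT, bufsize, 0x00008000, 0)
    return header + question + opt


def _skip_name(msg, pos):
    """Return the offset just past the (possibly compressed) name at `pos`."""
    while True:
        length = msg[pos]
        if length == 0:
            return pos + 1
        if length & 0xC0 == 0xC0:  # compression pointer: 2 bytes, ends the name
            return pos + 2
        pos += 1 + length


def answer_types(msg):
    """Return the RR types found in the answer section of a DNS reply."""
    qdcount, ancount = struct.unpack("!HH", msg[4:8])
    pos = 12
    for _ in range(qdcount):
        pos = _skip_name(msg, pos) + 4  # skip qtype and qclass
    types = []
    for _ in range(ancount):
        pos = _skip_name(msg, pos)
        rtype, _rclass, _ttl, rdlen = struct.unpack("!HHIH", msg[pos:pos + 10])
        types.append(rtype)
        pos += 10 + rdlen
    return types


def serves_dnssec(server, zone, timeout=3.0):
    """Ask `server` for the SOA of `zone` with DO set; True if RRSIGs come back."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        sock.sendto(build_query(zone), (server, 53))
        reply, _ = sock.recvfrom(4096)
    return RRSIG in answer_types(reply)
```

For example, `serves_dnssec("192.0.2.53", "example.com")` (a placeholder address and zone) should return True for a slave that serves the signed zone properly, and False for one that merely transferred it.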
Generic DNSSEC related
- Do not secure zones which you don't run yourself! A common scenario is where you have company.hu still on your servers, and you are still the registrar, but these days company.hu hosts the domain itself on ns1.company.hu and ns2.company.hu.
If you decide to secure all zones in your database, you WILL create a DS for company.hu and give it to the HU.nic. This will kill the domain, as the folks on ns[12].company.hu will not have signed their zone with your key!
This last thing is responsible for the slight dip in the Dutch DNSSEC graph on http://xs.powerdns.com/dnssec-nl-graph/
- Make sure your network is stable. It turns out that various versions of BIND respond to timeouts when querying your server by declaring it as not supporting EDNS, and thus not DNSSEC. This in turn will disable your signed domains!
ISC is pondering improving the logic of BIND in this respect for DNSSEC signed domains, and we are in productive discussions with them on this subject.
- DNSSEC answers are both larger than and different from regular DNS answers. Make very sure that no firewall is in front of your server that blocks TCP or large DNS packets. Many default Cisco configurations do just that.
- Clocks need to be set correctly. Also make sure you can reach an NTP server without relying on DNSSEC: a badly set clock will break DNSSEC, and with it any NTP setup that depends on DNSSEC to find its servers!
- In any large migration, some things will suddenly stop working because you relied on undefined behaviour or even bugs for proper operation. Be hypervigilant for domains which 'suddenly' malfunction after signing!
- If slaving signed zones via zone transfer (AXFR), your slaves have a copy of all current DNSSEC signatures (which will expire within two weeks), but not of the actual keys (because AXFR slaves do not need them). So, where slaves used to be "free backups", this is no longer true for DNSSEC unless you replicate the entire database at SQL level (NATIVE replication). We recommend you back up your keys!
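The company.hu scenario above can be guarded against by comparing, before securing anything, the nameservers the parent zone delegates to with the nameservers you actually operate. A minimal sketch of that comparison follows. The function names and the registrar.example nameservers are hypothetical, and the delegation data is assumed to have been looked up at the parent (registry) beforehand, not read from your own database.

```python
def normalize(name):
    """Lower-case a DNS name and drop the trailing dot for comparison."""
    return name.rstrip(".").lower()


def zones_safe_to_secure(delegations, our_nameservers):
    """Split zones into (safe, unsafe) lists.

    delegations: dict mapping zone name -> NS names published by the parent.
    our_nameservers: names of the servers we run and sign on.
    A zone is only 'safe' if every delegated nameserver is one of ours;
    zones with no delegation data are treated as unsafe.
    """
    ours = {normalize(ns) for ns in our_nameservers}
    safe, unsafe = [], []
    for zone, ns_names in sorted(delegations.items()):
        delegated = {normalize(ns) for ns in ns_names}
        if delegated and delegated <= ours:
            safe.append(zone)
        else:
            unsafe.append(zone)
    return safe, unsafe


if __name__ == "__main__":
    delegations = {
        # Still in our database, but hosted elsewhere these days.
        "company.hu": ["ns1.company.hu.", "ns2.company.hu."],
        # Hypothetical zone that really is served by us.
        "hosted.example": ["ns1.registrar.example.", "ns2.registrar.example."],
    }
    ours = ["ns1.registrar.example", "ns2.registrar.example"]
    safe, unsafe = zones_safe_to_secure(delegations, ours)
    print("secure these:", safe)
    print("do NOT secure:", unsafe)
```

Only zones in the first list should get keys and a DS at the registry; anything in the second list deserves a manual look first.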
Please stay tuned for further 'best current practices'.