Tutorial: dns-failover

It is easy to run your website on different servers.

It is easy to run databases across different servers.

That is because virtual servers have become quite cheap.

But it is not that easy to ensure that your visitors are visiting a server that is still running.

You can set up more than one A record, but a lot of browsers do not support DNS load balancing.

You can set up a load balancer, but then you have just moved the problem to another place: your load balancer is now the single point of failure.

For me DNS looks like a good solution to point visitors to the right server.

During the next weeks I will add more complex scenarios on how to handle DNS failover.

But I want to start with a low effort and simple solution.

Afterwards more tools and servers will be added to the setup.

So everyone can decide how much effort they want to put into their own DNS failover system.

So let's start with the first step into DNS failover.

1. Create a DNS server account that supports dynamic DNS updates
For me HE.NET offers a cheap ($0.00) and reliable DNS service.

If you add an A record you can select that this record can be dynamically updated through a script:

dynamicdns1.JPG

The TTL (time to live) for this record can be set as low as 5 minutes.
That is quite a short amount of time for a free service.

dynamicdns2.JPG

After the creation you have to click on the arrows on the right side to add your access key.

dynamicdns3.JPG

This will be your password to update the A record. [The values are not real - so don't try them.]

Best of all, it works for AAAA (IPv6!) records too:

dynamicdns4.JPG

The update of the ip is simple:

curl "https://dyn.dns.he.net/nic/update?hostname=dyn.example.com&password=password&myip=192.168.0.1"
curl "https://dyn.dns.he.net/nic/update?hostname=dyn.example.com&password=password&myip=2001:db8:beef:cafe::1"

Just use curl to call a url.
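
If the update is called from more than one place, a small wrapper keeps the URL in one spot. This is only a sketch: the function names he_update_url and he_update are made up for this example, and the endpoint and parameters are simply the ones from the two curl calls above. An access key with special characters would need URL encoding, which is not handled here.

```shell
#!/bin/bash
# Hypothetical helpers; the names are illustrative only.

# Build the update URL from hostname, access key and new ip.
he_update_url() {
    local host="$1" key="$2" ip="$3"
    printf 'https://dyn.dns.he.net/nic/update?hostname=%s&password=%s&myip=%s\n' \
        "$host" "$key" "$ip"
}

# Send the update; -s hides the progress bar, -m 10 limits the call to 10 seconds.
he_update() {
    curl -s -m 10 "$(he_update_url "$1" "$2" "$3")"
}

# Example (prints the URL only, does not contact the server):
he_update_url dyn.example.com password 192.168.0.1
```

The same function works for the AAAA record, you only pass an IPv6 address as the third argument.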

2. Write a short bash script that manages everything

So what do we need?

  • A textfile containing ip addresses of the web servers
  • A way to check which servers are online
  • A call to HE.NET to update the DNS A record

I am using just bash, curl and dig.

dig is part of the dnsutils package and can be installed with the following command:

sudo apt-get install dnsutils

After that we can create the file containing the ips:

nano ~/ips

Content:

127.0.0.1
186.0.0.1
10.1.1.1

So one ip per line. I use the order to prioritize the servers, because the script takes the first usable ip to update the DNS record.
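
A hand-edited list tends to collect blank lines and notes over time. The following sketch shows one possible way to load the file while keeping the priority order. The helper name load_ips and the '#' comment convention are assumptions for this example and not part of the setup described here.

```shell
#!/bin/bash
# Hypothetical helper: print the usable lines of an ip list file in order.
# Blank lines and lines starting with '#' are skipped (the '#' comment
# convention is an assumption, the original file has no comments).
load_ips() {
    local file="$1" line
    while IFS= read -r line || [ -n "$line" ]; do
        line="${line//[[:space:]]/}"          # strip stray whitespace
        [ -z "$line" ] && continue
        [ "${line:0:1}" = "#" ] && continue
        printf '%s\n' "$line"
    done < "$file"
}

# Example with a throwaway file:
tmp=$(mktemp)
printf '127.0.0.1\n\n# backup server\n186.0.0.1\n10.1.1.1' > "$tmp"
mapfile -t ips < <(load_ips "$tmp")
echo "highest priority: ${ips[0]}, total: ${#ips[@]}"
rm -f "$tmp"
```

mapfile needs bash 4 or newer; on older systems a plain while loop that appends to the array does the same job.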

Now we can create the bash file:

nano ~/dnsupdate && chmod +x ~/dnsupdate

Content:

#!/bin/bash
IFS=$'\n' read -d '' -r -a ips < ~/ips
statusweb=()
index=0
echo "=================================="
echo "check following ips"
echo "${ips[@]}"

for i in ${ips[@]}
do
    echo "=================================="
    echo "checking $i" 
    let index=index+1
    if curl -m 5 -s -k --head --request GET $i | grep "200 OK" > /dev/null
    then
            statusweb[index]=true
          echo "================="
          echo "web ip is up"
          echo "================="      
    else
        statusweb[index]=false
          echo "================="
          echo "web ip is down"
          echo "================="  
    fi
done

echo "=================================="
echo " "
echo "update dns"
echo " "
index=0
for statuswebval in ${statusweb[@]}
do
    if $statuswebval
    then
        echo "=================================="
        echo "Changing web DNS..."
        oldip=$(dig +short test.domain.com)
        echo "current ip: ${oldip}"
        echo "new     ip: ${ips[$index]}"
        echo "=================================="
        if [ "${ips[$index]}" == "$oldip" ]
        then
                  echo "================="
                  echo "update not needed"
                  echo "================="
        else
                   curl "https://dyn.dns.he.net/nic/update?hostname=test.domain.com&password=astromgpassword&myip=${ips[$index]}"
                   echo "================="
                   echo "update done"
                   echo "================="
        fi
        break
    else
        echo "================="
        echo "Skipping ${ips[$index]}"
        echo "================="
    fi
    let index=index+1
done

echo "=================================="
echo "end of script"
echo "=================================="

So what is this script doing?

  • read the list of ips into an array (ips)
  • create an empty array (statusweb) and a counter (index)
  • echo the list of ips
  • loop through the ips
    • for each ip do
      • add 1 to index (let is cool)
      • curl the http header of the webservice running on the ip and check if it is 200
        Timeout is set to 5 seconds to ensure the script is not locked.
      • the curl | grep pipeline exits with success or failure, so it can be used directly in an if statement
      • we store the status of the ip with a true or false value in the array statusweb
  • loop through the status values
    • for each status do
      • dig the DNS record you want to update
        +short ensures that only the ip is returned
      • compare the ip of the DNS record with the first ip which is working
      • update the DNS record or skip it
  • done
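
The two loops above can also be folded into one function that stops at the first working ip. This is a sketch under a few assumptions: the names http_is_up and first_healthy are invented here, and the check compares curl's reported status code with 200 instead of searching the header text for "200 OK".

```shell
#!/bin/bash
# Sketch: pick the first ip (in priority order) that passes a health check.

# Health check: succeed only if the server answers with HTTP status 200.
# -o /dev/null discards the body, -w '%{http_code}' prints just the status code.
http_is_up() {
    local code
    code=$(curl -m 5 -s -k -o /dev/null -w '%{http_code}' "http://$1/")
    [ "$code" = "200" ]
}

# first_healthy CHECK IP...: print the first ip for which CHECK succeeds.
# Returns 1 if none of the ips passes.
first_healthy() {
    local check="$1" ip
    shift
    for ip in "$@"; do
        if "$check" "$ip"; then
            printf '%s\n' "$ip"
            return 0
        fi
    done
    return 1
}

# Usage: first_healthy http_is_up "${ips[@]}"
```

Passing the check as an argument makes it easy to swap in another test later, for example a ping or a check of a specific URL.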

Last step is creating a cron job calling this script every 5 minutes:

crontab -e

Add line:

*/5 * * * * /bin/bash ~/dnsupdate

I think this is the bare minimum setup to check webservers and update DNS records.

So let's talk about some disadvantages:

  • Webservers are only checked from one internet connection
    So you cannot be sure whether the server is offline for all visitors or only for you
  • There is no history record for the reliability of one ip
    So you might select an ip that is currently up but only has an uptime of 80%
    You can try to manage that by sorting the list of ips but you have to keep the records for that too.
  • If you use more than one vps to run this script it might happen that the different scripts will overwrite the results of other scripts.
    So if you have a network split or more than one routing issue the DNS record is flipping around.
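
One possible way to dampen flipping caused by short network hiccups is to treat an ip as down only after several failed checks in a row. The sketch below keeps a small counter file per ip. The state directory, the threshold of 3 and the function names are all assumptions for this example, and it does not solve the problem of several scripts overwriting each other.

```shell
#!/bin/bash
# Sketch: only treat an ip as down after several failed checks in a row.
# STATE_DIR and the threshold of 3 are assumptions made for this example.
STATE_DIR="${STATE_DIR:-/tmp/dnsfailover-state}"
THRESHOLD=3

# record_check IP RESULT   (RESULT is "up" or "down")
# Keeps a per-ip counter of consecutive failures and prints the current count.
record_check() {
    local ip="$1" result="$2" file count=0
    mkdir -p "$STATE_DIR"
    file="$STATE_DIR/$ip"
    [ -f "$file" ] && count=$(cat "$file")
    if [ "$result" = "up" ]; then
        count=0
    else
        count=$((count + 1))
    fi
    printf '%s\n' "$count" > "$file"
    printf '%s\n' "$count"
}

# confirmed_down IP: succeeds once the ip failed THRESHOLD times in a row.
confirmed_down() {
    local file="$STATE_DIR/$1" count=0
    [ -f "$file" ] && count=$(cat "$file")
    [ "$count" -ge "$THRESHOLD" ]
}
```

With a cron interval of 5 minutes a threshold of 3 means a switch happens after roughly 15 minutes of failures, so the value is a trade-off between stability and reaction time.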

As always I am looking for feedback, improvements and other solutions.

Next step is to add CloudFlare support.

The first thing you need is the API key, which can be found at the bottom of your account information.

cloudflare1.JPG

The API itself is easy, but is using JSON.

So we need some Ruby magic to get this done.

nano dnsupdate.rb

Content:

require 'json'
domain = ARGV[0]
ip = ARGV[1]
id = ""
listResponse = `curl [parameters 1]`
puts listResponse

domains = JSON.parse(listResponse)
domains['response']['recs']['objs'].each do | domainrecord |
    puts domainrecord
    if (domain == domainrecord['name'])
        id = domainrecord['rec_id']
        break
    end
end

updateResponse = `curl [parameters 2]`
status = JSON.parse(updateResponse)
puts status

if status['result'] == 'success'
    puts "update done"
else
    puts "error during update of #{domain}"
end

So what are we doing?

  • save the two parameters to the vars domain and ip
  • Call curl to get the list of domains and DNS records
  • Parse the JSON response to find the correct record for the given domain
    In this example I want to update the A record for the domain itself
  • If the record is found save the id of the record (needed for update)
  • Send the update request via curl
  • Check the status to ensure that the update is done

We now take a look at the two curl calls:

  • list domains and records

    curl https://www.cloudflare.com/api_json.html \
      -d 'a=rec_load_all' \
      -d 'tkn=[Your API_TOKEN]' \
      -d 'email=[Your CloudFlare login]' \
      -d 'z=[domain to update]'
    
  • update domain

    curl https://www.cloudflare.com/api_json.html \
      -d 'a=rec_edit' \
      -d 'tkn=[Your API_TOKEN]' \
      -d 'id=[DB ID of the record you want to update]' \
      -d 'email=[Your CloudFlare login]' \
      -d 'z=[Domain of record]' \
      -d 'type=A' \
      -d 'name=[Name of record to update]' \
      -d 'content=[new ip address]' \
      -d 'service_mode=1' \
      -d 'ttl=1'
    

    TTL is the time to live of the record in seconds. 1 is the value for the "Automatic" setting.
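
Between the two calls the record id has to be pulled out of the first response. The following is a minimal sketch of that lookup in the shell. It assumes that jq is installed and that the response has the layout the Ruby script above relies on (response.recs.objs[] with name and rec_id). The function name find_record_id and the sample ids are made up.

```shell
#!/bin/bash
# Sketch: extract the rec_id of a record by name from a rec_load_all response.
# Assumes the response layout used by the Ruby script above
# (response.recs.objs[] with "name" and "rec_id") and that jq is installed.
find_record_id() {
    local name="$1"
    jq -r --arg name "$name" \
        '.response.recs.objs[] | select(.name == $name) | .rec_id' | head -n 1
}

# Example with a hand-written response (made-up ids):
sample='{"response":{"recs":{"objs":[
  {"name":"www.example.com","rec_id":"111"},
  {"name":"example.com","rec_id":"222"}]}}}'
printf '%s' "$sample" | find_record_id example.com
```

In a real run the output of the "list domains and records" call would be piped into find_record_id instead of the sample text.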

If you have got a nodeping account you can use the provided results as the source of ping numbers.

You have to set the results to "public access" to enable the script to download the ping results without an API key.

This is my Ruby script that catches the ping results from nodeping, checks the number of network failures, and sets the ip which is a) currently online and b) does have the lowest number of failures as the new value of the A record of the given domain:

require 'json'
require 'date'

class NodePingResult
  attr_accessor :ip, :isup, :numberOfBadResults

  def to_s
    "#{@ip} #{@isup} #{@numberOfBadResults}"
  end
end

nodepingReports = []
nodepingIPs = []
nodePingResults = []

recordId = ""
ip = ""

#######################################################
#Please change your cloudflare and nodeping information
#######################################################
domain = 'mydomain'
cloudflaretoken='QWERTZUIOP'
cloudflarelogin='[Your CloudFlare login]'
##################################
nodepingReports << 'https://nodeping.com/reports/results/[reportid]/50?format=json'
nodepingIPs << '127.0.0.1' 
nodepingReports << 'https://nodeping.com/reports/results/[reportid]/50?format=json'
nodepingIPs << '127.0.0.1' 
#######################################################

nodepingIPs  = nodepingIPs.reverse

counter = 0
nodepingReports.reverse_each do | report |

    res = NodePingResult.new
    res.ip = nodepingIPs[counter]
    res.numberOfBadResults = 0

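    reportResult = `curl -s '#{report}'`
    results = JSON.parse(reportResult)
    # the newest entry comes first, so walk the list backwards: the last
    # assignment to isup then reflects the most recent check
    results.reverse_each do | result |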
        if ('Success' == result['m'])
            res.isup = true
        else
            res.isup = false
            res.numberOfBadResults += 1
        end
    end
    counter += 1
    nodePingResults << res
end

nodePingResults.sort! { |a,b| a.numberOfBadResults <=> b.numberOfBadResults }
nodePingResults.each do | newip |
    if (newip.isup == true)
        ip = newip.ip
        break
    end
end
puts "selected ip: #{ip}"

parameterDomainList = "-d 'tkn=#{cloudflaretoken}' -d 'email=#{cloudflarelogin}' -d 'z=#{domain}'"
listResponse = `curl https://www.cloudflare.com/api_json.html -d 'a=rec_load_all' #{parameterDomainList}`
#puts listResponse

domains = JSON.parse(listResponse)
domains['response']['recs']['objs'].each do | domainrecord |
    puts domainrecord
    if (domain == domainrecord['name'])
        recordId = domainrecord['rec_id']
        break
    end
end
puts recordId

parameterDomainUpdate = "-d 'tkn=#{cloudflaretoken}' -d 'id=#{recordId}' -d 'email=#{cloudflarelogin}' -d 'z=#{domain}' -d 'type=A' -d 'name=#{domain}' -d 'content=#{ip}' -d 'service_mode=1' -d 'ttl=1'"
updateResponse = `curl https://www.cloudflare.com/api_json.html -d 'a=rec_edit' #{parameterDomainUpdate}`
status = JSON.parse(updateResponse)
#puts status

if status['result'] == 'success'
        puts "update done: #{domain} now pointing to #{ip}"
else
        puts "error - check last response: #{status['msg']}"
end

The script has to loop through the ping results in reverse order because the list starts with the newest entry first.

Due to the lack of API access you have to enter the ip address of each nodeping test.
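
The selection rule of the Ruby script (sort by number of failures, then take the first ip that is up) can be sketched in a few lines of shell as well. The input format of one "<ip> <up|down> <failures>" line per server is invented for this example, as is the function name select_ip.

```shell
#!/bin/bash
# Sketch: choose the ip that is currently up and has the fewest failed checks.
# Input on stdin: one line per ip in the form "<ip> <up|down> <failures>".
# This line format is invented for the example.
select_ip() {
    awk '$2 == "up"' | sort -s -k3,3n | head -n 1 | cut -d ' ' -f 1
}

printf '%s\n' \
    '10.0.0.1 up 4' \
    '10.0.0.2 down 0' \
    '10.0.0.3 up 1' | select_ip
```

The -s flag keeps the sort stable, so two servers with the same number of failures stay in the order of the input list.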

And for the people who don't want to use Ruby - the bash only version of the script:

1. Create list of ips to check

nano ~/ips

Content:

127.0.0.1
186.0.0.1
10.1.1.1

2. Create list of node ping tests for the given ips (same order)

nano ~/results

Content:

https://nodeping.com/reports/results/[id of test]/100?format=json
https://nodeping.com/reports/results/[id of test]/100?format=json
https://nodeping.com/reports/results/[id of test]/100?format=json

3. Install libs:

sudo apt-get install dnsutils curl jq

If jq is not in the repos you can install it yourself:

#32bit version
wget http://stedolan.github.io/jq/download/linux32/jq && chmod +x jq && sudo cp jq /usr/bin
#64bit version
wget http://stedolan.github.io/jq/download/linux64/jq && chmod +x jq && sudo cp jq /usr/bin

4. Create bash script:

nano dnsupdate && chmod +x dnsupdate

Content:

#!/bin/bash
###########################################
#Configuration
###########################################
domain="domain.com"
recordId="#"
cloudflarelogin="cloudflare-login"
cloudflaretoken="cloudflare-token"
###########################################
#Files
###########################################
IFS=$'\n' read -d '' -r -a iplist < ~/ips
IFS=$'\n' read -d '' -r -a results < ~/results
###########################################
statusweb=()
statuspoints=()
index=0
echo "=================================="
echo "check following ip list with nodeping"
echo "${iplist[@]}"

for i in ${iplist[@]}
do
    echo "=================================="
    echo "checking $i" 
    # download the result list, then count the good checks and read the newest one
    curl -s -m 5 "${results[index]}" -o "./res${i}"
    statuspoints[index]=$(grep -Po '"m":.*?[^\\]",' "./res${i}" | grep -c Success)
    resultstring=$(grep -Po '"m":.*?[^\\]",' "./res${i}" | head -n 1)
    if [ "${resultstring}" == "\"m\":\"Success\"," ]
    then
        statusweb[index]=1
          echo "================="
          echo "web ip is up"
          echo "================="      
    else
        statusweb[index]=0
          echo "================="
          echo "web ip is down"
          echo "================="  
    fi
    echo " "
    echo "status: ${statusweb[index]}"
    echo "status: ${statuspoints[index]}"
    echo " "
    let index=index+1
done

max=0
counter=0
indexselectedip=0

for point in ${statuspoints[@]}; do
    if (( point > max && statusweb[counter] == 1 )); then 
        max=$point
        indexselectedip=$counter
    fi
    let counter=counter+1
done

parameterDomainList="-d a=rec_load_all -d tkn=${cloudflaretoken} -d email=${cloudflarelogin} -d z=${domain}"
domaindata=$(curl https://www.cloudflare.com/api_json.html ${parameterDomainList} -o domainlist)
key=$(cat domainlist | jq '.response.recs.objs[] | {name, rec_id} ' | cut -d ':' -f 2 | grep \" | sed 's/"//g' | sed 's/ //g' | sed 's/,//g' | tr '\n' ' ' )
domainlist=( $key )
echo "checking domainlist: ${domainlist[@]} with cloudflare"
for ((i=1; i < ${#domainlist[@]}; i++))
do
    echo "#${domainlist[$i]} ${i}"
    if [[ "${domainlist[$i]}" == "${domain}" ]]
    then
        let k=i-1
        recordId="${domainlist[$k]}"
        echo "break for: ${recordId}"
        break
    fi
    let i=i+1
done

echo " "
echo "=================================="
echo " "
echo "update dns"
echo " "
        echo "=================================="
        echo "Changing web DNS..."
        oldip=$(dig +short ${domain} | head -n 1)
        sleep 2
        echo "current ip: ${oldip}"
        echo "new     ip: ${iplist[$indexselectedip]} (points: ${statuspoints[$indexselectedip]})"
        tester=${iplist[indexselectedip]}
        echo "=================================="
        if [[ "${tester}" == "${oldip}" ]]
        then
          echo "================="
          echo "update not needed"
          echo "================="
        else
          parameterDomainUpdate="-d act=rec_edit -d a=rec_edit -d tkn=${cloudflaretoken} -d id=${recordId} -d email=${cloudflarelogin} -d z=${domain} -d type=A -d name=${domain} -d content=${iplist[indexselectedip]} -d service_mode=1 -d ttl=1"
          cloudflareresponse=$(curl https://www.cloudflare.com/api_json.html ${parameterDomainUpdate})
          echo "response: ${cloudflareresponse}"
          echo "================="
          echo "update done"
          echo "================="        
        fi
echo "=================================="
echo "end of script"
echo "=================================="

So what are we doing here?

  • Load list of ips and nodeping tests
  • Check results for each ip
    • count number of good results to check quality of host
    • check current status of ip
  • Pick the ip that is currently up and has the most good results
  • Load list of DNS records for domain from cloudflare
  • Search for domain record we want to update
  • dig domain to get current ip
  • compare current ip with the one with the best uptime and score
    • update dns record
    • or do nothing if the record is already pointing to the best ip
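
Since jq is installed anyway, the "check results for each ip" step could also read the JSON directly instead of matching it with grep. This is a sketch that assumes the result list is an array of objects, newest first, whose "m" field is "Success" for a good check. The function names and the "Timeout" value in the sample are made up.

```shell
#!/bin/bash
# Sketch: summarise a NodePing result list with jq instead of grep.
# Assumes the JSON is an array of objects, newest first, whose "m" field is
# "Success" for a good check (the same assumption the scripts above make).

# Number of good checks in the list.
count_success() {
    jq '[.[] | select(.m == "Success")] | length'
}

# 1 if the newest check was good, otherwise 0.
latest_is_up() {
    jq -r 'if .[0].m == "Success" then 1 else 0 end'
}

sample='[{"m":"Timeout"},{"m":"Success"},{"m":"Success"}]'
echo "points: $(printf '%s' "$sample" | count_success)"
echo "status: $(printf '%s' "$sample" | latest_is_up)"
```

The two values correspond to statuspoints and statusweb in the script, so they could be dropped into the loop in place of the grep pipelines.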