Tutorial: varnish

This is not a planned tutorial but the start of a discussion about how low-end VPSes can serve as web frontends.

Many people like event-driven web servers such as lighttpd or nginx; others prefer process-based web servers such as Apache. Both have their advantages, but only the former have a reputation for low resource consumption.

First of all: What is varnish?

Answer: There is a good video about it.

On a blog, for example, the content delivered to the visitor rarely changes. Yet on every request Ruby or PHP queries the database and glues the pieces together again. Varnish caches the HTML response to reduce the load on the server.

So the basic scenario is that we buy a second VPS to run Varnish on. I know that Varnish can run on the same server if you have enough RAM, but I want to make a point with this scenario.

These days we have two advantages, if you are using the right provider:

  • Offloaded MySQL servers
  • local unmetered network

So a second VPS can run Varnish, which calls the first VPS over the local network:

visitor -> varnish vps -> application vps -> offloaded MySQL server

The second and third hops go over the local network.

It looks like you do not need a 512 MB box for a well-known blog.

Back to varnish:

Installation is easy:

curl http://repo.varnish-cache.org/debian/GPG-key.txt | apt-key add -
echo "deb http://repo.varnish-cache.org/ubuntu/ lucid varnish-3.0" >> /etc/apt/sources.list
apt-get update
apt-get install varnish

The configuration is split between two files:

  • /etc/default/varnish
  • /etc/varnish/default.vcl

Basic configuration is handled in /etc/default/varnish:

DAEMON_OPTS="-a :80 \
             -T localhost:6082 \
             -f /etc/varnish/default.vcl \
             -S /etc/varnish/secret \
             -s file,/var/lib/varnish/$INSTANCE/varnish_storage.bin,1G"

# -a ${VARNISH_LISTEN_ADDRESS}:${VARNISH_LISTEN_PORT} \
# -T ${VARNISH_ADMIN_LISTEN_ADDRESS}:${VARNISH_ADMIN_LISTEN_PORT} \
# -f ${VARNISH_VCL_CONF} \
# -S ${VARNISH_SECRET_FILE} \
# -s ${VARNISH_STORAGE}"

If you want to use RAM for the cache storage, change the last line to:

-s malloc,100M"

The size accepts the following suffixes:

K, k  The size is expressed in kilobytes
M, m  The size is expressed in megabytes
G, g  The size is expressed in gigabytes
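
To make the suffixes concrete, here is a small sketch that converts such a size to bytes. The function name size_to_bytes is made up for illustration and is not part of Varnish. It assumes binary multiples (1K = 1024 bytes) and that a bare number means bytes.

```shell
# Hypothetical helper, not part of Varnish: convert a varnishd-style
# size such as "100M" or "1G" to bytes.
# Assumption: suffixes are binary multiples (K = 1024, M = 1024^2, G = 1024^3).
size_to_bytes() {
    num=${1%[KkMmGg]}    # numeric part with any single suffix stripped
    case "$1" in
        *[Kk]) echo $((num * 1024)) ;;
        *[Mm]) echo $((num * 1024 * 1024)) ;;
        *[Gg]) echo $((num * 1024 * 1024 * 1024)) ;;
        *)     echo "$num" ;;    # no suffix: treated as plain bytes
    esac
}
```

For example, size_to_bytes 100M gives the number of bytes that "-s malloc,100M" asks for.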

The next file is /etc/varnish/default.vcl, which defines how the cache should behave.

This is a configuration usable for WordPress:

backend default {
    .host = "192.168.10.10";
    .port = "80";
    .max_connections = 30;  
    .connect_timeout = 4.0s;  
    .first_byte_timeout = 600s;  
    .between_bytes_timeout = 600s;
}

sub vcl_recv {
    #Add forwarded header
    if (req.restarts == 0) {
        if (req.http.x-forwarded-for) {
            set req.http.X-Forwarded-For =
            req.http.X-Forwarded-For + ", " + client.ip;
        } else {
            set req.http.X-Forwarded-For = client.ip;
        }
    }
    #Strip request cookies everywhere except on login, admin and preview pages
    if (!(req.url ~ "wp-(login|admin)") &&
        !(req.url ~ "&preview=true" ) ) {
        unset req.http.cookie;
    }
    if (req.http.Authorization || req.http.Cookie) {
        return (pass);
    }   
    #Fix encodings
    if ( req.http.Accept-Encoding ) {
        if ( req.http.Accept-Encoding ~ "gzip" ) {
            # If the browser supports it, we'll use gzip.
            set req.http.Accept-Encoding = "gzip";
        }
        else if ( req.http.Accept-Encoding ~ "deflate" ) {
            # Next, try deflate if it is supported.
            set req.http.Accept-Encoding = "deflate";
        }
        else {
            # Unknown algorithm. Remove it and send unencoded.
            unset req.http.Accept-Encoding;
        }
    }   
}

sub vcl_fetch {
    if (req.url ~ "wp-(login|admin)" || req.url ~ "preview=true" || req.url ~ "xmlrpc.php") {
        return (hit_for_pass);
    }
    #Strip Set-Cookie from the backend response and cache it for one hour
    if ( (!(req.url ~ "(wp-(login|admin)|login)")) || (req.request == "GET") ) {
        unset beresp.http.set-cookie;
        set beresp.ttl = 1h;
    }
    if (req.url ~ "\.(gif|jpg|jpeg|swf|css|js|flv|mp3|mp4|pdf|ico|png)(\?.*|)$") {
        set beresp.ttl = 7d;
    }
}

There are three main sections:

  • backend default
    The backend server whose responses are cached
  • sub vcl_recv
    How incoming requests are changed before the cache lookup
  • sub vcl_fetch
    How backend responses are changed before they are cached
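
If the vcl_fetch rules are hard to read, the following sketch replays them in shell for a plain GET request. The name cache_decision is made up for illustration, and it only mirrors the URL patterns from the listing above; it does not talk to Varnish. The static-file pattern is written as (\?.*)? instead of (\?.*|), which matches the same URLs but is accepted by more grep versions.

```shell
# Hypothetical helper: mirrors the vcl_fetch rules for a GET request.
# Prints "pass" (never cached), "7d" or "1h" (the TTL that would be set).
cache_decision() {
    url=$1
    # Login, admin, preview and XML-RPC URLs are handed straight to the backend.
    if printf '%s\n' "$url" | grep -Eq 'wp-(login|admin)|preview=true|xmlrpc.php'; then
        echo "pass"
    # Static assets, optionally followed by a query string, are kept for a week.
    elif printf '%s\n' "$url" | grep -Eq '\.(gif|jpg|jpeg|swf|css|js|flv|mp3|mp4|pdf|ico|png)(\?.*)?$'; then
        echo "7d"
    # Everything else (posts, pages, feeds) is kept for one hour.
    else
        echo "1h"
    fi
}
```

Running cache_decision against a few URLs from your own access log is a quick way to see which rule a request would hit.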

A good resource for Varnish VCL files is the collection by mattiasgeniar, described as:

A set of configuration samples used for Varnish 3.0. This includes templates for:
    Wordpress
    Drupal (works decently for Drupal 7, depends on your addons obviously)
    Joomla (WIP)
    Fork CMS
    OpenPhoto

And various configuration for:
    Server-side URL rewriting
    Clean error pages for debugging
    Virtual Host implementations
    Various header normalizations
    Cookie manipulations
    301/302 redirects from within Varnish

Set .host to the IP of the VPS you want to cache. After changing all settings, restart Varnish.

The last thing to do is to point the domain to the IP address of the second VPS.

If you want to run Varnish on the same VPS, you have to change the port of your web server, because only one service can listen on port 80.

For run-of-the-mill blogs that add one post a day and use Disqus for comments, Varnish might be a great idea.

Ask yourself how often your front page changes. It is all about the hit ratio of the cache.
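
To put a number on it, here is a small sketch that turns a hit count and a miss count into a percentage. The name hit_ratio is made up for illustration; the function only does the arithmetic and rounds down to a whole percent.

```shell
# Hypothetical helper: hit ratio in percent from two counters.
# $1 = cache hits, $2 = cache misses
hit_ratio() {
    awk -v h="$1" -v m="$2" 'BEGIN {
        total = h + m
        if (total == 0) { print "n/a"; exit }   # no traffic yet
        printf "%d%%\n", 100 * h / total        # truncated to a whole percent
    }'
}
```

On Varnish 3.0 the two numbers should be available as the cache_hit and cache_miss counters in the output of varnishstat -1; I am assuming those names here, so check what your version prints. With 900 hits and 100 misses, hit_ratio 900 100 reports 90%.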