Varnish
Varnish is free, open source software that speeds up a website by caching webpage content in memory. Varnish stores cached content in a hash table, a key-value store in which the URL is usually the key.
Scenario
Set up Varnish to serve only specific pages of your website from cache. Varnish should serve these pages from cache only when the end user is not logged in to the website. When a logged-in user browses these pages, they should be served by the web server running behind Varnish.
We will start by installing Varnish 4.0 on an Ubuntu server where nginx is already running. nginx will act as the backend server, and Varnish will be configured as a reverse proxy in front of it.
Varnish Installation: Run the following commands in a terminal to install Varnish.
[js]sudo apt-get install apt-transport-https
# sudo must apply to apt-key and tee, the commands that need root
curl https://repo.varnish-cache.org/GPG-key.txt | sudo apt-key add -
echo "deb https://repo.varnish-cache.org/ubuntu/ precise varnish-4.0" | sudo tee -a /etc/apt/sources.list.d/varnish-cache.list
sudo apt-get update
sudo apt-get install varnish -y[/js]
Varnish Configuration: Follow these steps to configure Varnish as a reverse proxy.
1. Stop Varnish and the web server:
[js]sudo service varnish stop
sudo service nginx stop[/js]
2. Change listening port of web server from 80 to 8080 in /etc/nginx/sites-enabled/default file.
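The edit can be made by hand, or sketched as a small filter. This is a minimal sketch, assuming the file contains plain `listen 80 ...;` or `listen [::]:80 ...;` directives; the function name `rewrite_listen_port` is hypothetical, and address forms such as `listen 127.0.0.1:80;` are not handled.

```shell
# Hypothetical helper: read an nginx site file on stdin and print it with
# "listen 80" directives rewritten to another port.
# Usage: rewrite_listen_port NEW_PORT < site-file
rewrite_listen_port() {
    new_port=$1
    # Only a port of exactly 80 is rewritten, so a value such as 8000 is left alone.
    sed -E "s/^([[:space:]]*listen[[:space:]]+(\[::\]:)?)80([^0-9]|\$)/\1${new_port}\3/"
}

# Example (review the output before copying it over the original file):
#   rewrite_listen_port 8080 < /etc/nginx/sites-enabled/default > /tmp/default.new
```

Writing to a separate file first lets us compare it with the original before replacing anything.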
3. Open /etc/default/varnish and change its listening port from 6081 to 80 as shown below:
Change
[js]DAEMON_OPTS="-a :6081 \
-T localhost:6082 \
-f /etc/varnish/default.vcl \
-S /etc/varnish/secret \
-s malloc,256m"[/js]
to
[js]DAEMON_OPTS="-a :80 \
-T localhost:6082 \
-f /etc/varnish/default.vcl \
-S /etc/varnish/secret \
-s malloc,256m"[/js]
This tells Varnish to listen on port 80.
4. Start varnish and web server:
[js]sudo service varnish start
sudo service nginx start[/js]
We have now configured Varnish as a reverse proxy in front of the nginx server. We can verify that Varnish is listening on port 80 and nginx on port 8080 with the command below (root privileges are needed to see the process names).
[js]sudo netstat -ntlp[/js]
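Besides checking the ports, we can confirm that a request to port 80 really travels through Varnish. The sketch below relies on the X-Varnish header that Varnish adds to the responses it handles; the function name `served_by_varnish` is hypothetical, and the usage lines assume curl is installed and both services run on localhost.

```shell
# Succeeds when the response headers on stdin contain an X-Varnish header.
served_by_varnish() {
    grep -qi '^X-Varnish:'
}

# Usage:
#   curl -sI http://localhost/      | served_by_varnish && echo "port 80 goes through Varnish"
#   curl -sI http://localhost:8080/ | served_by_varnish || echo "port 8080 is nginx directly"
```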
Next, we will configure /etc/varnish/default.vcl, where we define the custom rules that apply to incoming client requests.
Varnish uses a language called Varnish Configuration Language (VCL) to define these rules. The syntax of VCL is similar to C or Perl.
Overview of default.vcl: Before defining our custom rules, let's first understand the structure of the default.vcl file.
This file consists of subroutines, which Varnish calls in a predefined order as a request is processed. A subroutine can be built-in or custom: built-in subroutines start with "vcl_", and custom subroutines cannot start with "vcl_". A subroutine usually ends with a return statement whose argument, such as pass, pipe, hash, lookup, fetch, or deliver, defines the next action. Each action has a different meaning, and not every action is available in every subroutine.
At the top of the file, we first tell the VCL compiler which version of the VCL syntax the file uses. Then we import the Varnish modules used in the file, and after that we define the backend server.
[js]cat /etc/varnish/default.vcl
vcl 4.0;
import std;
import directors;
backend default {
    .host = "127.0.0.1";
    .port = "8080";
}[/js]
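While editing this file, it helps to check that it still compiles before restarting Varnish. The sketch below wraps `varnishd -C -f`, which compiles a VCL file and prints the generated C code instead of starting the daemon. The function name `check_vcl` is hypothetical, and depending on the system the command may need to run as root.

```shell
# Hypothetical helper: compile a VCL file without loading it.
# The generated C code is discarded; only the exit status matters here.
check_vcl() {
    if varnishd -C -f "$1" > /dev/null; then
        echo "VCL OK: $1"
    else
        echo "VCL has errors: $1" >&2
        return 1
    fi
}

# Usage:
#   check_vcl /etc/varnish/default.vcl
```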
There are 14 built-in subroutines in total. We will go through only the six that our scenario requires.
1. vcl_recv: This subroutine is triggered at the beginning of the request. Here we decide whether Varnish should handle the request, and how, or whether to pass it to the backend server. Every statement in the subroutine is explained in the comment just above it.
[js]sub vcl_recv {
    # Select the backend server through the "req" object,
    # which is created every time Varnish receives a request.
    set req.backend_hint = default;

    # Pass all authentication requests to the backend server.
    # return (pass) hands the request to the vcl_pass subroutine,
    # which sends it on to the backend server.
    if (req.http.Authorization || req.http.Authenticate) {
        return (pass);
    }

    # The X-Requested-With header identifies AJAX requests;
    # pass them, and any URL containing "nocache", to the backend.
    if (req.http.X-Requested-With == "XMLHttpRequest" || req.url ~ "nocache") {
        return (pass);
    }

    # Pass the request to the backend if the URL contains any of
    # the strings below.
    if (req.url ~ "/(checkout|customer|catalog/product_compare|wishlist)/") {
        return (pass);
    }

    # Pass the request to the backend if the method is neither
    # GET nor HEAD.
    if (req.method != "GET" && req.method != "HEAD") {
        return (pass);
    }

    # req.http.Cookie holds the request cookies. If the
    # CUSTOMER_AUTH cookie is present, someone is logged in and
    # Varnish must not serve anything from cache, so pass.
    if (req.http.Cookie ~ "CUSTOMER_AUTH") {
        return (pass);
    }

    # return (hash) calls the vcl_hash subroutine so that the
    # URLs below are looked up in, and served from, the cache.
    # (A regex is not a glob: a trailing "/*" would only mean
    # "zero or more slashes", so it is dropped here.)
    if (req.url ~ "^/$" || req.url ~ "/footwear/" || req.url ~ "^/accessories/") {
        return (hash);
    }

    # Requests that match none of the rules above fall through to
    # Varnish's built-in vcl_recv.
}[/js]
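Because the first matching rule wins, the order of these checks matters. The following is a simplified model of that order, not VCL. It assumes the trailing `*` in the URL patterns was meant as "anything under that path", it leaves out the Authorization and X-Requested-With checks, and the function names `url_matches` and `recv_action` are hypothetical.

```shell
# Succeeds when string $1 matches extended regex $2.
url_matches() {
    printf '%s\n' "$1" | grep -Eq "$2"
}

# Usage: recv_action METHOD URL [COOKIE]
# Prints "pass", "hash", or "builtin" (no rule matched, so the
# built-in vcl_recv would decide).
recv_action() {
    method=$1
    url=$2
    cookie=${3:-}

    if url_matches "$url" 'nocache'; then echo pass; return; fi
    if url_matches "$url" '/(checkout|customer|catalog/product_compare|wishlist)/'; then echo pass; return; fi
    if [ "$method" != "GET" ] && [ "$method" != "HEAD" ]; then echo pass; return; fi
    if url_matches "$cookie" 'CUSTOMER_AUTH'; then echo pass; return; fi
    if url_matches "$url" '^/$|/footwear/|^/accessories/'; then echo hash; return; fi
    echo builtin
}
```

For example, `recv_action GET /footwear/boots` prints `hash`, while the same URL with a `CUSTOMER_AUTH=abc` cookie as the third argument prints `pass`, which is the behaviour the scenario asks for.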
2. vcl_hash: This is called after vcl_recv returns hash. It builds the cache key and returns lookup, which searches the cache for the object and eventually calls vcl_hit or vcl_miss. If the object is present in the cache, vcl_hit is called; otherwise vcl_miss is called.
[js]sub vcl_hash {
    # Add the URL to the key under which the object is stored.
    hash_data(req.url);
    # If the Host header is set, add it to the key;
    # otherwise add the server IP.
    if (req.http.host) {
        hash_data(req.http.host);
    } else {
        hash_data(server.ip);
    }
    # Look up the cache, then call vcl_hit or vcl_miss.
    return (lookup);
}[/js]
3. vcl_hit: This is called when the object is present in the cache.
[js]sub vcl_hit {
    # Called when a cache lookup is successful.
    # Here we use the obj object, which represents the cached
    # object found by the lookup, instead of the req object.
    # obj.ttl is the object's remaining time to live; if it is
    # zero or more, call the vcl_deliver subroutine.
    if (obj.ttl >= 0s) {
        return (deliver);
    }
}[/js]
4. vcl_miss: This is called when the object is not present in the cache. It fetches the object from the backend, stores it in the cache, and then serves the request.
[js]sub vcl_miss {
    return (fetch);
}[/js]
5. vcl_backend_response: This subroutine is called after the response has been fetched from the backend, that is, after vcl_miss and before the fetched object is delivered. We set the TTL of the object to 1 hour. A TTL value can be given in seconds (3600s), minutes (60m) or hours (1h).
[js]sub vcl_backend_response {
    # This subroutine works on the beresp object.
    # Set the TTL of the object to 1 hour.
    set beresp.ttl = 60m;
    # Allow stale content in case the backend goes down:
    # keep all objects for 6 hours beyond their TTL.
    set beresp.grace = 6h;
    # return (deliver) calls the vcl_deliver subroutine.
    return (deliver);
}[/js]
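The TTL and grace values combine into three phases in an object's life. The sketch below works through the arithmetic for the values set above (60m and 6h). It assumes that a lookup with an expired TTL falls through to the Varnish 4.0 built-in vcl_hit, which, as far as we know, still delivers the object while obj.ttl + obj.grace is greater than zero; the function name `object_state` is hypothetical.

```shell
# Classify a cached object by its age in seconds.
# Prints "fresh", "grace" (stale but still deliverable), or "expired".
object_state() {
    age=$1
    ttl=3600      # beresp.ttl = 60m
    grace=21600   # beresp.grace = 6h
    remaining=$((ttl - age))
    if [ "$remaining" -ge 0 ]; then
        echo fresh
    elif [ $((remaining + grace)) -gt 0 ]; then
        echo grace
    else
        echo expired
    fi
}
```

Under these assumptions an object is fresh for its first hour and can be served stale for up to six more, so it leaves the deliverable window 25200 seconds (7 hours) after it was fetched.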
6. vcl_deliver: This is called just before the response is sent to the client. Here we can modify the headers that are passed to the client.
[js]sub vcl_deliver {
    # Add a debug header to show whether the current request
    # was a hit or a miss.
    if (obj.hits > 0) {
        set resp.http.X-Cache = "HIT";
    } else {
        set resp.http.X-Cache = "MISS";
    }
    # Set the Cache-Control header sent to the client here.
    set resp.http.Cache-Control = "no-store, must-revalidate, post-check=0, pre-check=0";
    # Report the number of cache hits for this object.
    set resp.http.X-Cache-Hits = obj.hits;
    # Some headers can be removed if they are not required.
    # A few examples are listed below, commented out.
    # Remove the PHP version:
    # unset resp.http.X-Powered-By;
    # Remove the web server version and OS:
    # unset resp.http.Server;
    # unset resp.http.X-Drupal-Cache;
    # unset resp.http.X-Varnish;
    # unset resp.http.Via;
    # unset resp.http.Link;
    # unset resp.http.X-Generator;
    # Finally, deliver the response back to the client.
    return (deliver);
}[/js]
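With the X-Cache debug header in place, the whole scenario can be checked from the command line. This is a minimal sketch, assuming curl is installed and Varnish is reachable at http://localhost/; the function name `cache_status` and the cookie value are hypothetical.

```shell
# Print the value of the X-Cache header from response headers on stdin.
# X-Cache-Hits is ignored because the header name must match exactly.
cache_status() {
    tr -d '\r' | awk -F': *' 'tolower($1) == "x-cache" { print $2 }'
}

# Usage:
#   curl -sI http://localhost/ | cache_status
#   curl -sI http://localhost/ | cache_status
#   curl -sI -H 'Cookie: CUSTOMER_AUTH=1' http://localhost/ | cache_status
```

With the rules from this post we would expect the first anonymous request to print MISS and the second to print HIT, while the request carrying the CUSTOMER_AUTH cookie should print MISS every time, because it is passed to the backend.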
—
Thanks,
Navjot Singh
Team AWS, Intelligrape
Email: navjots[at]intelligrape[dot]com