Saving a Request¶
This chapter is for the system administration course only
| | Rescue | Economization | Protection |
|---|---|---|---|
| Directors | x | x | |
| Health Checks | x | | |
| Grace Mode | x | x | |
| Retry a Request | x | | |
| Saint Mode | x | | |
| Tune Backend Properties | x | | |
| Access Control Lists (ACL) | | | x |
| Compression | | x | |
Varnish offers many mechanisms to save a request. By saving a request we mean:
- Rescue: mechanisms to handle requests when backends are in problematic situations.
- Economization: mechanisms to spend fewer resources, i.e., send fewer requests to the backend.
- Protection: mechanisms to restrict access to cache invalidation from unauthorized entities.
Table 18 shows how different mechanisms are mapped to their saving meaning. This chapter explains how to make your Varnish setup more robust by using these mechanisms.
Directors¶
- Loadable VMOD
- Contains 1 or more backends
- All backends must be known
- Selection methods:
- round-robin
- fallback
- random
- seeded with a random number
- seeded with a hash key
Round-robin director example:
vcl 4.0;
import directors; // load the directors VMOD
backend one {
.host = "localhost";
.port = "80";
}
backend two {
.host = "127.0.0.1";
.port = "81";
}
sub vcl_init {
new round_robin_director = directors.round_robin();
round_robin_director.add_backend(one);
round_robin_director.add_backend(two);
new random_director = directors.random();
random_director.add_backend(one, 10); # 2/3 to backend one
random_director.add_backend(two, 5); # 1/3 to backend two
}
sub vcl_recv {
set req.backend_hint = round_robin_director.backend();
}
Varnish can have several backends defined, and it can group them together into clusters for load balancing purposes. Backend directors, usually just called directors, provide logical groupings of similar web servers by re-using previously defined backends. A director must have a name.
There are several director selection methods available: random, round-robin, fallback, and hash.
The next backend to be selected depends on the selection method.
You can specify the timeout before unused backend connections are closed by setting the backend_idle_timeout parameter. How to tune this and other parameters is further explained in the Tuning section.
A round-robin director takes only a backend list as argument. This director type picks the first backend for the first request, then the second backend for the second request, and so on. Once the last backend has been selected, backends are selected again from the top. If a health probe has marked a backend as sick, a round-robin director skips it.
A fallback director will always pick the first backend unless it is sick, in which case it would pick the next backend and so on. A director is also considered a backend so you can actually stack directors. You could for instance have directors for active and passive clusters, and put those directors behind a fallback director.
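Stacked directors might look like the following sketch; the backend names b1 through b4 and the two-cluster layout are illustrative assumptions:

```vcl
sub vcl_init {
    # group the active cluster
    new active = directors.round_robin();
    active.add_backend(b1);
    active.add_backend(b2);

    # group the passive cluster
    new passive = directors.round_robin();
    passive.add_backend(b3);
    passive.add_backend(b4);

    # a director is also a backend: prefer the active cluster,
    # fall back to the passive cluster when all its backends are sick
    new cluster = directors.fallback();
    cluster.add_backend(active.backend());
    cluster.add_backend(passive.backend());
}

sub vcl_recv {
    set req.backend_hint = cluster.backend();
}
```

Note that a director is added to another director via its .backend() method, since add_backend() expects a backend, not a director object.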
Random directors are seeded with either a random number or a hash key. The next section explains their commonalities and differences.
Note
Health probes are explained in the Health Checks section.
Note
Directors are defined as loadable VMODs in Varnish 4. See the vmod_directors man page for more information and examples.
Warning
If you declare backend servers, but do not use them, varnishd returns an error by default. You can avoid this situation by turning off the runtime parameter vcc_err_unref. However, this practice is strongly discouraged. Instead, we advise you to declare only what you use.
Random Directors¶
- Random director: seeded with a random number
- Hash director: seeded with hash key typically from a URL or a client identity string
Hash director that uses client identity for backend selection
sub vcl_init {
new h = directors.hash();
h.add_backend(one, 1); // backend 'one' with weight '1'
h.add_backend(two, 1); // backend 'two' with weight '1'
}
sub vcl_recv {
// pick a backend based on the cookie header of the client
set req.backend_hint = h.backend(req.http.cookie);
}
The random director picks a backend randomly. It has one per-backend parameter called weight, which provides a mechanism for balancing the selection of the backends. The selection mechanism of the random director may be regarded as even traffic distribution, provided that the amount of traffic is the same per request and per backend. The random director also has a director-wide counter called retries, which increases every time the director selects a sick backend.
Both the random and the hash director select a backend randomly. The difference between the two is the seed they use. The random director is seeded with a random number, whereas the hash director is seeded with a hash key.
Hash directors typically use the requested URL or the client identity (e.g. session cookie) to compute the hash key. Since the hash key is always the same for a given input, the output of the hash director is always the same for a given hash key. Therefore, hash directors always select the same backend for a given input. This is also known as sticky session load balancing. You can learn more about sticky sessions in https://info.varnish-software.com/blog/proper-sticky-session-load-balancing-varnish.
Hash directors are useful to load balance in front of other Varnish caches or other web accelerators. In this way, cached objects are not duplicated across different cache servers.
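For instance, a first-tier Varnish server could distribute requests over two other caches by hashing the requested URL; the backend names cache1 and cache2 are assumptions for illustration:

```vcl
sub vcl_init {
    new shard = directors.hash();
    shard.add_backend(cache1, 1);
    shard.add_backend(cache2, 1);
}

sub vcl_recv {
    // a given URL always hashes to the same cache server,
    // so each object is cached on only one of them
    set req.backend_hint = shard.backend(req.url);
}
```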
Note
In Varnish 3 there is a client director type, which is removed in Varnish 4.
This client director type is a special case of the hash director.
Therefore, the semantics of a client director type are achieved using hash.backend(client.identity)
.
Health Checks¶
- Poke your web server every N seconds
- Affects backend selection: std.healthy(req.backend_hint)
- Set using .probe
- At least .threshold of the last .window probes must succeed for a backend to be considered healthy
- varnishlog tag: Backend_health
backend server1 {
.host = "server1.example.com";
.probe = {
.url = "/healthtest";
.timeout = 1s;
.interval = 4s;
.window = 5;
.threshold = 3;
}
}
You can define a health check for each backend. A health check defines a probe to verify whether a backend replies on a given URL every given interval.
The above example causes Varnish to send a request to http://server1.example.com/healthtest every 4 seconds. This probe requires that at least 3 requests succeed within a sliding window of 5 requests.
Varnish initializes backends as sick. .initial is another variable of .probe; it defines how many of the probes in .window are considered good when Varnish starts. Its default value is .threshold - 1, which means that a single successful probe is enough to mark the backend as healthy.
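For example, with .initial set to 0, a backend must collect .threshold good probes after startup before it is marked healthy; the probe values below are illustrative:

```vcl
backend server1 {
    .host = "server1.example.com";
    .probe = {
        .url = "/healthtest";
        .window = 5;
        .threshold = 3;
        # no probes are assumed good at startup: the backend
        # stays sick until 3 real probes have succeeded
        .initial = 0;
    }
}
```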
When Varnish has no healthy backend available, it attempts to use a graced copy of the cached object that a request is looking for. The next section Grace Mode explains this concept in detail.
You can also declare standalone probes and reuse them for several backends. It is particularly useful when you use directors with identical behaviors, or when you use the same health check procedure across different web applications.
import directors;
probe www_probe {
.url = "/health";
}
backend www1 {
.host = "localhost";
.port = "8081";
.probe = www_probe;
}
backend www2 {
.host = "localhost";
.port = "8082";
.probe = www_probe;
}
sub vcl_init {
new www = directors.round_robin();
www.add_backend(www1);
www.add_backend(www2);
}
Note
Varnish does not send a Host header with health checks. If you need that, you can define an entire request using .request instead of .url.
backend one {
.host = "example.com";
.probe = {
.request =
"GET / HTTP/1.1"
"Host: www.foo.bar"
"Connection: close";
}
}
Note
The healthy function is implemented as a VMOD in Varnish 4. req.backend.healthy from Varnish 3 is replaced by std.healthy(req.backend_hint). Do not forget to include the import line: import std;
Analyzing health probes¶
- Backend_health tag in varnishlog:

# varnishlog -g raw -i Backend_health
0 Backend_health - default Still healthy 4--X-RH 5 3 5 0.012166 0.013693 HTTP/1.0 200 OK

- varnishadm debug.health in Varnish 4.0 or varnishadm backend.list -p in Varnish 4.1:

Backend default is Healthy
Current states good: 5 threshold: 3 window: 5
Average responsetime of good probes: 0.016226
Oldest Newest
================================================================
44444444444444444444444444444444444444444444--44----444444444444 Good IPv4
XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX--XX----XXXXXXXXXXXX Good Xmit
RRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRR--RR----RRRRRRRRRRRR Good Recv
HHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHH--HH----HHHHHHHHHHHH Happy

- varnishadm backend.list:

Backend name                   Refs   Admin    Probe
default(127.0.0.1,,8081)       1      probe    Healthy 4/5
Every health test is recorded in the shared memory log with VXID 0 (see Transactions). If you want to see Backend_health records in varnishlog, you have to change the default grouping by VXID to raw:
varnishlog -g raw -i Backend_health
Backend_health records are led by 0, which is the VXID number.
The rest of the probe record is in the following format:
Backend_health - %s %s %s %u %u %u %f %f %s
| | | | | | | | |
| | | | | | | | +- Probe HTTP response
| | | | | | | +---- Average response time
| | | | | | +------- Response time
| | | | | +---------- Probe window size
| | | | +------------- Probe threshold level
| | | +---------------- Number of good probes in window
| | +------------------- Probe window bits
| +---------------------- Status message
+------------------------- Backend name
Most of the fields are self-descriptive, but the Probe window bits and Status message fields deserve clarification.
The Probe window bits field details the last probe with the following format:
%c %c %c %c %c %c %c
| | | | | | |
| | | | | | +- H -- Happy
| | | | | +---- R -- Good Received (response from the backend received)
| | | | +------- r -- Error Received (no response from the backend)
| | | +---------- X -- Good Xmit (Request to test backend sent)
| | +------------- x -- Error Xmit (Request to test backend could not be sent)
| +---------------- 6 -- Good IPv6
+------------------- 4 -- Good IPv4
Status message is a two-word state indicator, which can be:
- Still healthy
- Back healthy
- Still sick
- Went sick
Note that Still indicates an unchanged state, whereas Back and Went indicate a change of state. The second word, healthy or sick, indicates the present state.
Another method to analyze health probes is by calling varnishadm debug.health in Varnish 4.0 or varnishadm backend.list -p in Varnish 4.1. This command first presents data from the last Backend_health record:
Backend default is Healthy
Current states good: 5 threshold: 3 window: 5
Average responsetime of good probes: 0.016226
and the last 64 window bits of probes:
Oldest Newest
================================================================
44444444444444444444444444444444444444444444--44----444444444444 Good IPv4
XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX--XX----XXXXXXXXXXXX Good Xmit
RRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRR--RR----RRRRRRRRRRRR Good Recv
HHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHH--HH----HHHHHHHHHHHH Happy
Demo: Health Probes¶
See the power of health probes!
Suggested steps for the demo:
- Configure a probe as shown in Health Checks.
- For Varnish 4.0, run watch -n.5 varnishadm debug.health in one terminal.
- For Varnish 4.1, run watch -n.5 varnishadm backend.list -p in one terminal.
- Start and stop your backend. For this, you might want to quickly simulate a backend with the command python -m SimpleHTTPServer [port].
- The watch command creates the effect of an animated health probe display!
Grace Mode¶
- A graced object is an object that has expired, but is kept in cache for a given grace time
- Grace mode is when Varnish uses a graced object
- Grace mode is a feature to mitigate the accumulation of requests for expired objects
- Grace mode allows Varnish to build responses from expired objects
- beresp.grace defines the time that Varnish keeps an object after beresp.ttl has elapsed
The main goal of grace mode is to prevent requests from piling up whenever a popular object has expired in cache.
To better understand grace mode, recall Fig. 2, which shows the lifetime of cached objects.
When possible, Varnish delivers a fresh object, otherwise Varnish builds a response from a stale object and triggers an asynchronous refresh request.
This procedure is also known as stale-while-revalidate.
The typical way to use grace is to store an object for several hours after its TTL has elapsed. In this way, Varnish always has a copy to deliver immediately, while fetching a new object asynchronously.
This asynchronous fetch ensures that graced objects do not get older than a few seconds, unless there are no available backends.
The following VCL code illustrates a typical use of grace:
sub vcl_hit {
if (obj.ttl >= 0s) {
# Normal hit
return (deliver);
} elsif (std.healthy(req.backend_hint)) {
# The backend is healthy
# Fetch the object from the backend
return (fetch);
} else {
# No fresh object and the backend is not healthy
if (obj.ttl + obj.grace > 0s) {
# Deliver graced object
# Automatically triggers a background fetch
return (deliver);
} else {
# No valid object to deliver
# No healthy backend to handle request
# Return error
return (synth(503, "Backend is down"));
}
}
}
Graced objects are those with a grace time that has not yet expired. The grace time is stored in obj.grace, whose default is 10 seconds. You can change this value in three ways:
- by parsing the stale-while-revalidate directive of the HTTP Cache-Control field that comes from the backend,
- by setting the variable beresp.grace in VCL, or
- by changing the grace default value with varnishadm param.set default_grace <value>.
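A minimal sketch of the second option, keeping objects for one hour past their TTL (the grace value is illustrative):

```vcl
sub vcl_backend_response {
    # keep a graced copy for one hour after the TTL elapses
    set beresp.grace = 1h;
}
```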
Varnish 4.1 parses stale-while-revalidate automatically from the Cache-Control header field. For example, when receiving "Cache-Control: max-age=5, stale-while-revalidate=30", Varnish 4.1 sets obj.ttl=5 and obj.grace=30 automatically.
To see a working example of how Varnish handles Cache-Control, see the VTC in Understanding Grace using varnishtest.
Note
obj.ttl and obj.grace are countdown timers. Objects are valid in cache as long as they have a positive remaining time equal to obj.ttl + obj.grace.
Timeline Example¶
Backend response HTTP Cache-Control header field:
"Cache-control: max-age=60, stale-while-revalidate=30"
or set in VCL:
set beresp.ttl = 60s;
set beresp.grace = 30s;
- 50s: Normal delivery
- 62s: Normal cache miss, but grace mode possible
- 80s: Normal cache miss, but grace mode possible
- 92s: Normal cache miss, object is removed from cache
If you want to make sure that no objects exceeding their TTL are delivered, set beresp.grace = 0s. The downside of this is that all grace functionality is disabled, regardless of the reason.
Exercise: Grace¶
Copy the following CGI script in /usr/lib/cgi-bin/test.cgi:

#!/bin/sh
sleep 10
echo "Content-type: text/plain"
echo "Cache-control: max-age=10, stale-while-revalidate=20"
echo
echo "Hello world"
date
Make the script executable.
Issue varnishlog -i VCL_call,VCL_return in one terminal.
Test that the script works outside Varnish by typing http http://localhost:8080/cgi-bin/test.cgi in another terminal.
in another terminal.Send a single request, this time via Varnish, to cache the response from the CGI script. This should take 10 seconds.
Send three requests: one before the TTL (10 seconds) elapses, another after 10 seconds and before 30 seconds, and a last one after 30 seconds.
Repeat until you understand the output of varnishlog.
Play with the values of max-age and stale-while-revalidate in the CGI script, and the beresp.grace value in the VCL code.
retry Return Action¶
- Available in vcl_backend_response and vcl_backend_error
- Re-enters vcl_backend_fetch
- Any changes made are kept
- Parameter max_retries safeguards against infinite loops
- Counter bereq.retries registers how many retries are done
sub vcl_backend_response {
if (beresp.status == 503) {
return (retry);
}
}
The retry return action is available in vcl_backend_response and vcl_backend_error. This action re-enters the vcl_backend_fetch subroutine. It only influences the backend thread; the client-side handling is not affected.
You may want to use this action when the backend fails to respond. In this way, Varnish can retry the request to a different backend. For this, you must define multiple backends.
You can use directors to let Varnish select the next backend to try. Alternatively, you may use bereq.backend to specifically select another backend.
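A sketch of the latter; the backend name two and the 503 check are illustrative assumptions:

```vcl
sub vcl_backend_response {
    if (beresp.status == 503 && bereq.retries == 0) {
        # first failure: point the request at another backend;
        # the change is kept when vcl_backend_fetch re-runs
        set bereq.backend = two;
        return (retry);
    }
}
```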
return (retry) increments the bereq.retries counter. If the number of retries is higher than max_retries, control is passed to vcl_backend_error.
Note
In Varnish 3.0 it was possible to do return (restart) after the backend response failed. This is now called return (retry), and jumps back up to vcl_backend_fetch.
Saint Mode¶
- Saint mode is implemented as a backend director with the following capabilities:
- Fine-grained health checks; maintains a blacklist of relations between objects and backends
- Objects have a blacklist TTL
- Backends in the blacklist have a threshold of related objects
- Backends with objects below the threshold can be selected to serve other objects
- Backends with objects above the threshold are marked as sick for all objects
- Available in Varnish Cache 4.1 or later
Saint mode complements regular Health Checks by marking backends as sick for specific objects. Saint mode is a VMOD that maintains a blacklist of objects and related backends. Each blacklisted object has a TTL, which denotes the time it stays in the blacklist.
If the number of blacklisted objects for a backend is below a threshold, the backend is considered partially sick. Requests for blacklisted objects might be sent to another backend. When the number of blacklisted objects for a backend exceeds the threshold, the backend is marked as sick for all requests.
vcl/saintmode.vcl below shows typical usage of saint mode. In this example, a request with a 500 response status would be retried against another backend.
vcl 4.0;
import saintmode;
import directors;
backend server1 { .host = "192.0.2.11"; .port = "80"; }
backend server2 { .host = "192.0.2.12"; .port = "80"; }
sub vcl_init {
# create two saint mode backends with threshold of 5 blacklisted objects
new sm1 = saintmode.saintmode(server1, 5);
new sm2 = saintmode.saintmode(server2, 5);
# group the backends in the same cluster
new fb = directors.fallback();
fb.add_backend(sm1.backend());
fb.add_backend(sm2.backend());
}
sub vcl_backend_fetch {
# get healthy backend from director
set bereq.backend = fb.backend();
}
sub vcl_backend_response {
if (beresp.status >= 500) {
# the failing backend is blacklisted 5 seconds
saintmode.blacklist(5s);
# retry request in a different backend
return (retry);
}
}
An alternative is to build the response with a stale object. For that, you would return(abandon), restart the request in vcl_synth, and check for req.restarts in vcl_recv. To get a better idea of how to do it, please take a look at the stale-if-error snippet in https://github.com/fgsch/vcl-snippets/blob/master/v4/stale-if-error.vcl.
The fine-grained checks of saint mode help to spot problems in malfunctioning backends. For example, if the request for the object foo returns a 200 OK HTTP response without content (Content-Length = 0), you can blacklist that specific object for that specific backend. You can also print the object with std.log and filter it in varnishlog.
Note
For more information, please refer to its own documentation in https://github.com/varnish/varnish-modules/blob/master/docs/vmod_saintmode.rst.
Tune Backend Properties¶
backend default {
.host = "localhost";
.port = "80";
.connect_timeout = 0.5s;
.first_byte_timeout = 20s;
.between_bytes_timeout = 5s;
.max_connections = 50;
}
If a backend has insufficient resources, it might be advantageous to set max_connections so that only a limited number of simultaneous connections are handled by a specific backend. All backend-specific timers are available as parameters and can be overridden in VCL on a per-backend level.
Tip
Varnish only accepts hostnames for backend servers that resolve to at most one IPv4 address and one IPv6 address. The parameter prefer_ipv6 defines which IP address Varnish prefers.
Access Control Lists (ACLs)¶
- An ACL is a list of IP addresses
- VCL programs can use ACLs to define and control the IP addresses that are allowed to purge, ban, or do any other regulated task
- Compare with client.ip or server.ip
# Who is allowed to purge....
acl local {
"localhost"; /* myself */
"192.168.1.0"/24; /* and everyone on the local network */
!"192.168.1.23"; /* except for the dialin router */
}
sub vcl_recv {
if (req.method == "PURGE") {
if (client.ip ~ local) {
return (purge);
} else {
return (synth(405));
}
}
}
An Access Control List (ACL) declaration creates and initializes a named list of IP addresses and ranges, which can later be used to match client or server IP addresses.
ACLs can be used for many purposes. They are typically used to control which IP addresses are allowed to send PURGE or ban requests, or even to bypass the cache entirely.
You may also set up ACLs to differentiate how your Varnish servers behave. You can, for example, have a single VCL program for different Varnish servers. In this case, the VCL program evaluates server.ip and acts accordingly.
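Such a shared VCL program might be sketched as follows; the address of the administrative Varnish server is an assumption for illustration:

```vcl
acl purge_servers {
    "192.168.0.10";  /* the only Varnish server allowed to purge */
}

sub vcl_recv {
    if (req.method == "PURGE") {
        # only the Varnish server listening on the listed
        # address accepts PURGE requests
        if (server.ip ~ purge_servers) {
            return (purge);
        }
        return (synth(405));
    }
}
```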
ACLs are fairly simple to create. A single IP address or hostname should be in quotation marks, as in "localhost". ACLs use CIDR notation to specify IP addresses and their associated routing prefixes. In Varnish's ACLs, the slash "/" character is appended outside the quoted IP address, for example "192.168.1.0"/24. To exclude an IP address or range from an ACL, an exclamation mark "!" should precede the quoted IP address, for example !"192.168.1.23". This is useful when, for example, you want to include all the IP addresses in a range except the gateway.
Warning
If you declare ACLs, but do not use them, varnishd returns an error by default. You can avoid this situation by turning off the runtime parameter vcc_err_unref. However, this practice is strongly discouraged. Instead, we advise you to declare only what you use.
Compression¶
- Where to compress? backend or Varnish?
- Parameter to toggle: http_gzip_support
- VCL variables: beresp.do_gzip to compress and beresp.do_gunzip to decompress
sub vcl_backend_response {
if (beresp.http.content-type ~ "text") {
set beresp.do_gzip = true;
}
}
- Avoid compressing already compressed files
- Works with ESI
It is sensible to compress objects before storing them in cache. Objects can be compressed either at the backend or your Varnish server, so you have to make the decision on where to do it. Factors that you should take into consideration are:
- where to store the logic of what should be compressed and what not
- available CPU resources
Also, keep in mind that file formats such as JPEG, PNG, GIF, and MP3 are already compressed, so you should avoid compressing them again in Varnish.
By default, http_gzip_support is on, which means that Varnish follows the behavior described in https://www.varnish-cache.org/docs/trunk/phk/gzip.html and https://www.varnish-cache.org/docs/trunk/users-guide/compression.html. If you want full control over what is compressed and when, set the http_gzip_support parameter to off, and activate compression based on specific rules in your VCL code. Implement these rules in vcl_backend_response and then set beresp.do_gzip or beresp.do_gunzip as in the example above.
If you compose your content using Edge Side Includes (ESI), you should know that ESI and gzip work together. Next chapter explains how to compose your content using Varnish and Edge Side Includes (ESI).
Note
Compression in Varnish uses and manipulates the Accept-Encoding and Content-Encoding HTTP header fields. ETag validation might also be weakened.
Refer to https://www.varnish-cache.org/docs/trunk/phk/gzip.html and https://www.varnish-cache.org/docs/trunk/users-guide/compression.html for all details about compression.