Use Varnish and Nginx to follow, hide and cache 301 / 302 redirects

Hello!

Varnish is a robust, stable, open source caching solution that has been employed in many different high-traffic environments with a significant amount of success.

One of the things that we have come across, specifically with environments such as Amazon Web Services, is that websites tend to spread their web stack across multiple services. For example, static media such as JS, CSS and image files may be hosted on Amazon S3 storage. This would require either implementing additional CNAMEs for your domain (i.e. static.yourdomain.com) that point to the S3 URL, or having your CMS redirect requests for static media to the S3 backend.

Remember that with S3 you have to generate the static files and copy them over to S3, so these URLs may need to be generated and maintained by the CMS, oftentimes with a redirect (301 or 302) that rewrites the URL to the S3 backend destination.

When Varnish is caching a website and it comes across a request that the backend answers with a 301/302 redirect (the backend response, or “beresp”), Varnish will typically just cache the redirect itself as the response, saving only the minuscule amount of processing power the backend needed to process the request and send the rewrite. Some may argue that that saving is negligible!

Wouldn’t it be nice to actually cache the content after the redirect happens? There are two ways one could go about doing this.

Cache 301 / 302 Redirects with Varnish 4

In simpler setups, Varnish can process the redirect itself: it rewrites the request URL from the Location header and issues a return(restart) to restart the request with the new URL. This means Varnish follows the rewrite internally so that the new URL can be re-requested from the backend and ultimately cached. So in vcl_deliver you can add the following:

        # Cache 301/302 redirects: strip the scheme and host from the
        # Location header and restart the request with the new URL
        if ((resp.status == 301) || (resp.status == 302)) {
                set req.url = regsub(resp.http.Location, "^https?://[^/]+(.*)", "\1");
                return(restart);
        }
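
If the backend answers the restarted request with yet another redirect, this logic will keep restarting, so it can be worth capping restarts explicitly. Here is a minimal sketch of such a guard in vcl_recv (the limit of 2 restarts is an arbitrary example; Varnish also enforces its own max_restarts parameter):

        # Guard against redirect loops: give up after a couple of restarts
        sub vcl_recv {
                if (req.restarts > 2) {
                        return (synth(503, "Redirect loop detected"));
                }
        }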

The above should work for you if, let's say, you are using Varnish in front of a simple Apache/Nginx server that is processing the request itself. If Varnish is sending traffic to another proxy (i.e. Nginx + proxy_pass), then the directive above may not work for you. Why would anyone proxy traffic from Varnish to another proxy like Nginx? Usually because you need some fancier traffic redirection and DNS resolution. Still confused?

Let's say Varnish is at the edge of a network, caching a complicated website, and requests from Varnish need to go to another load balancer (i.e. an Amazon ELB). ELB endpoints never have a static IP address, and Varnish (as of v4) cannot resolve backend hostnames on a per-request basis, so you would need to proxy the request to Nginx, which would handle the reverse proxy over to the ELB, which would then load balance the backend fetch to the CMS.
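
As a rough sketch of that middle hop, Nginx can re-resolve the ELB hostname on every request by combining a resolver with a variable in proxy_pass (the resolver address and ELB hostname below are placeholders; substitute your own VPC resolver and ELB endpoint):

        location / {
                # Using a variable in proxy_pass forces Nginx to resolve the
                # hostname at request time through the configured resolver,
                # instead of once at startup
                resolver 10.0.0.2 valid=30s;
                set $elb_backend "my-app-123456789.us-east-1.elb.amazonaws.com";
                proxy_set_header Host $http_host;
                proxy_pass http://$elb_backend;
        }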

If your scenario sounds more like the aforementioned one, then you could try following the 301/302 redirect with Nginx instead of Varnish.

Cache 301 / 302 Redirects with Nginx

Nginx and Varnish seem to go hand in hand. They’re great together! In this scenario you are using Varnish as your edge cache and sending all backend requests to an Nginx proxy_pass directive. To make Nginx follow a redirect before any response reaches Varnish (and ultimately the end user), you can have Nginx intercept the redirect, save the Location header and proxy the request to that location itself, so that the final response is what gets returned to Varnish and cached!

        location / {
                proxy_pass  http://backend-server.com;
                proxy_set_header Host $http_host;
                # Treat 3xx responses from the backend as "errors" so they
                # can be handed off to the named location below
                proxy_intercept_errors on;
                error_page 301 302 307 = @handle_redirects;
        }
        location @handle_redirects {
                # A resolver is required because proxy_pass uses a variable,
                # so Nginx resolves the redirect target at request time
                # (the address below is an example; use your own resolver)
                resolver 8.8.8.8;
                # Follow the redirect: fetch the Location target and return
                # its response instead of the redirect itself
                set $saved_redirect_location '$upstream_http_location';
                proxy_pass $saved_redirect_location;
        }

You can see that the proxy_pass directive is configured normally. In the event of a 301, 302 or 307, the response is handed to the @handle_redirects named location, which simply proxy passes the $saved_redirect_location as if it were the backend server! This means that even if the redirect target is not in your Varnish configuration as a valid backend hostname (i.e. a random S3 URL), Varnish will still cache the response, thinking it came from backend-server.com.
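
For completeness, the Varnish side of this setup is just a plain backend pointing at the local Nginx proxy. A minimal sketch, assuming Nginx listens on 127.0.0.1:8080 (both values are placeholders for your own setup):

        # Varnish sends all backend fetches to the local Nginx proxy,
        # which follows any redirects before the response gets cached
        backend default {
                .host = "127.0.0.1";
                .port = "8080";
        }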

Hopefully this will help someone out there!
