What is Browser Caching?
Browser caching means storing some (or all) of a website's resources in the user's browser so that they are not downloaded again on every visit. The browser keeps these resources in its local cache for a defined period. When that period ends, the browser must, according to the caching policy, send a new request to the server.
All caching policies, such as how long a resource may be stored, how it is stored, which caches (the browser or CDN edge servers) may store it, and whether it must be downloaded again after expiration, are exchanged between the server and the client via HTTP headers.
In this article, we explain the most important HTTP headers for browser caching as well as ArvanCloud's general policy for resource caching.
How to Validate Cached Responses with the ETag Header
The server uses the ETag HTTP header to send a validation token. The validation token makes it possible to re-download a cached resource only if it has changed.
To see why this matters, imagine that the maximum time a resource may stay in the cache is 60 seconds. When that time expires, the browser can no longer use the resource as-is and must send a request to the server to obtain it again. However, if the resource has not changed in the meantime, the browser receives exactly the data it already holds in its cache, so re-downloading it is wasteful.
The validation token, carried in the ETag header, was introduced to solve this problem. When the server returns a resource to the browser for the first time, it includes the ETag header, containing the validation token, among the response headers.
The validation token is usually a hash (fingerprint) of the file content. When the resource's storage time in the cache expires, the browser sends this token to the server in the If-None-Match HTTP request header. The server compares it with the current version of the resource and, if nothing has changed, returns the HTTP status code 304 (Not Modified). This response tells the browser that it can keep using the cached resource for another 60 seconds, so there is no need to download it again. Avoiding the download saves time and bandwidth and reduces the delay before the user sees the data.
The browser does all of this automatically. The web developer's only job is to make sure that the server supports the ETag header. The ArvanCloud servers fully support this header.
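The exchange described above can be sketched from the server's side in a few lines of Python. This is a minimal illustration, not how any particular server implements it: the function names `make_etag` and `respond`, the choice of SHA-256, and the 60-second max-age are assumptions made for the example, and weak validators (`W/"..."`) are ignored for brevity.

```python
import hashlib
from typing import Dict, Optional, Tuple


def make_etag(body: bytes) -> str:
    """Return a quoted ETag derived from a hash of the body (illustrative choice)."""
    return '"' + hashlib.sha256(body).hexdigest()[:16] + '"'


def respond(body: bytes, if_none_match: Optional[str] = None) -> Tuple[int, Dict[str, str], bytes]:
    """Return (status, headers, payload) for a GET on a resource.

    if_none_match is the value of the If-None-Match request header, if the
    browser sent one.
    """
    etag = make_etag(body)
    headers = {"ETag": etag, "Cache-Control": "max-age=60"}
    if if_none_match is not None:
        # The header may carry several comma-separated tokens, or "*".
        candidates = [tag.strip() for tag in if_none_match.split(",")]
        if "*" in candidates or etag in candidates:
            return 304, headers, b""  # unchanged: no body is sent
    return 200, headers, body


# First visit: full download plus a validation token.
status, headers, payload = respond(b"hello")
# After expiry: the browser sends the token back and gets 304 with no body.
status_again, _, payload_again = respond(b"hello", headers["ETag"])
# If the content changed, the token no longer matches and the new body is sent.
status_changed, _, payload_changed = respond(b"hello, world", headers["ETag"])
```

The 304 branch is what saves bandwidth: only headers travel, and the browser keeps using its stored copy.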
The Expires Header
Use this header to set the exact date and time at which a resource expires. It is an older way to set a response's expiry; when a Cache-Control header with max-age is also present, max-age takes precedence.
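As a small illustration, an Expires value is an HTTP date in GMT. The sketch below builds one with Python's standard library; the helper name `expires_header` is an assumption made for the example.

```python
from datetime import datetime, timedelta, timezone
from email.utils import format_datetime


def expires_header(now: datetime, lifetime_seconds: int) -> str:
    """Format an Expires value lifetime_seconds after now.

    now must be a timezone-aware datetime in UTC, because usegmt=True
    rejects anything else.
    """
    return format_datetime(now + timedelta(seconds=lifetime_seconds), usegmt=True)


example = expires_header(datetime(2015, 10, 21, 7, 28, 0, tzinfo=timezone.utc), 60)
print(example)  # Wed, 21 Oct 2015 07:29:00 GMT
```

Because the value is an absolute date, it depends on the clocks of the server and the client agreeing, which is one reason a relative lifetime such as max-age is generally preferred.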
How to Set Caching Policies with the Cache-Control Header
The HTTP Cache-Control header controls who may cache a resource (cacheability), under what conditions (revalidation), and for how long (expiration). The header can appear in both requests and responses.
The Cache-Control header helps website administrators control how content received from the central server that hosts the website is cached. These policies are defined by Cache-Control directives. Note that a single Cache-Control header may contain several directives, separated by commas. The most important directives are described below:
- no-cache: The cache (browser or CDN edge server) must revalidate the resource with the server before using it. If the ETag header is also used, only a small request and response are exchanged to confirm that the resource has not changed, and an unchanged resource is not downloaded again. This saves bandwidth and time.
- no-store: This directive means that the browser and all the devices located between it and the central server (such as the CDN edge servers) are not allowed to cache the resource at all. They have to request it from the central server every time it is needed. Banking information, for example, should not be cached; whenever it is required, it must be requested from the central server again.
- public: This directive means that any cache (browser or CDN edge server) may store the resource.
- private: This directive means that only the browser may store the resource; the devices between it and the central server may not. For example, the browser is allowed to cache an HTML page containing the user's private information, but the CDN edge servers are not.
- max-age: This directive sets the maximum time, in seconds, that a resource may stay in the cache. When the time ends, the resource expires, and the cache (browser or CDN edge server) must request it from the server again. For example, if the max-age for a resource is 60 seconds, the browser can cache and use it for that long.
If you set only max-age and do not explicitly add the private directive, all caches may store the resource; adding the public directive is not required.
Common max-age values are:
- One minute: max-age=60
- One hour: max-age=3600
- One day: max-age=86400
- One week: max-age=604800
- One month: max-age=2628000
- One year: max-age=31536000
- s-maxage: The "s" stands for "shared cache." It is similar to the max-age directive, but it is an instruction for shared caches such as CDNs, not for browsers (the browser ignores it). If you apply this directive to a resource, the CDN uses the value you defined and ignores max-age (and the Expires header).
- must-revalidate: This directive requires a cache (browser or CDN edge server) to revalidate a stale resource with the central server before using it. A stale resource is one whose max-age has expired and that may have changed, or may no longer exist, on the central server. The cache may not use the stale resource until validation completes.
- proxy-revalidate: This is similar to the must-revalidate directive. The only difference is that it applies only to shared caches such as proxy servers.
- no-transform: This directive means that the devices between the central server and the browser, such as the CDN edge servers, are not allowed to modify the resource (for example, by recompressing an image).
- stale-while-revalidate: This directive sets a time window, in seconds, during which the cache (browser or CDN edge server) may serve the stale resource it holds while revalidating it with the central server in the background.
- stale-if-error: This directive is similar to stale-while-revalidate. The difference is that the cache (browser or CDN edge server) may use the stale resource only if the central server returns an error, namely a 500, 502, 503, or 504 status code, when the cache tries to validate it.
- immutable: This directive indicates that the response body will not change while the resource is fresh. Therefore, the cache does not need to revalidate the resource until it expires.
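To make the interaction between these directives concrete, here is a simplified Python sketch of how a cache might read a Cache-Control value and decide how long it may reuse a response. The function names are assumptions made for the example, and the parser is deliberately naive: it splits on commas and does not handle quoted arguments that themselves contain commas.

```python
from typing import Dict, Optional


def parse_cache_control(value: str) -> Dict[str, Optional[str]]:
    """Split a Cache-Control value into {directive: argument-or-None}."""
    directives: Dict[str, Optional[str]] = {}
    for part in value.split(","):
        part = part.strip()
        if not part:
            continue
        name, sep, arg = part.partition("=")
        directives[name.strip().lower()] = arg.strip().strip('"') if sep else None
    return directives


def freshness_lifetime(value: str, shared: bool) -> int:
    """Seconds a cache may reuse the response without contacting the server.

    shared=True models a CDN edge server, shared=False a browser.
    Returns 0 when the response may not be stored or must be validated first.
    """
    d = parse_cache_control(value)
    if "no-store" in d or "no-cache" in d:
        return 0
    if shared and "private" in d:
        return 0  # only the browser may store a private response
    if shared and "s-maxage" in d:
        return int(d["s-maxage"] or 0)  # overrides max-age for shared caches
    return int(d.get("max-age") or 0)


header = "public, max-age=60, s-maxage=3600"
print(freshness_lifetime(header, shared=False))  # 60
print(freshness_lifetime(header, shared=True))   # 3600
```

The same header therefore yields different lifetimes for a browser and for a CDN edge server, which is exactly what s-maxage is for.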
How to Determine Correct Caching Policies
The following flowchart, prepared by Ilya Grigorik, a developer at Google, gives a useful overview of how to choose an optimal caching policy. You can use it to decide which directive best suits each resource:
Some Samples of Cache-Control Settings
- Caching a static resource,
- Ensuring that a sensitive resource is never stored,
- Storing a resource in the browser cache but not on the CDN edge servers,
- Storing a resource in the browser and on the CDN edge servers on the condition that it is validated every time it is used,
- Storing a resource on the CDN edge servers and validating it every time it is used,
- Storing a resource in every cache (browser or CDN edge server) and validating it every time it is used,
- Storing a resource with different expiration times in the browser and on the CDN edge servers.
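The header values that go with these scenarios are not shown above, so the Python mapping below offers one plausible reading of each, built only from the directives described earlier. The specific values, in particular the lifetimes, are assumptions chosen for illustration and should be adapted to your own site.

```python
# Scenario -> one plausible Cache-Control value (illustrative assumptions).
SAMPLE_POLICIES = {
    "static resource": "public, max-age=31536000, immutable",
    "sensitive resource, never stored": "no-store",
    "browser only, not the CDN": "private, max-age=3600",
    "browser and CDN, validated on every use": "no-cache",
    "CDN stores it and validates on every use": "public, s-maxage=0, proxy-revalidate",
    "every cache stores it and validates on every use": "public, max-age=0, must-revalidate",
    "different lifetimes for browser and CDN": "public, max-age=600, s-maxage=86400",
}

for scenario, value in SAMPLE_POLICIES.items():
    print(f"{scenario}: Cache-Control: {value}")
```

In the last entry, the browser would keep the resource for ten minutes while the CDN edge servers keep it for a day.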
How to Configure the Cache-Control
There are two ways to set the HTTP Cache-Control header: in the web server configuration or in application code. The following shows how to set it on the Apache and Nginx servers and in PHP code.
Add the following directives to the .htaccess file. They tell the server to set the Cache-Control header to max-age (equal to 84600) and public for the files they match:
Add the following directives to the Nginx configuration file. They tell the server to set the Cache-Control header to no-transform and public for the files they match:
You can also add the Cache-Control header directly in your website's code. For example, the following sets the Cache-Control header with max-age equal to one day:
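As a minimal sketch of the same idea in Python rather than PHP, the handler below uses the standard library's http.server module to attach Cache-Control: max-age=86400 to every response. The class name `CachedHandler`, the helper `make_server`, the port, and the response body are assumptions made for the example.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

ONE_DAY = 86400  # seconds


class CachedHandler(BaseHTTPRequestHandler):
    """Serve a fixed body and allow caches to reuse it for one day."""

    def do_GET(self):
        body = b"hello"
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.send_header("Cache-Control", f"max-age={ONE_DAY}")
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the example quiet


def make_server(port: int = 8000) -> HTTPServer:
    """Create (but do not start) a local server using CachedHandler."""
    return HTTPServer(("127.0.0.1", port), CachedHandler)
```

Calling `make_server().serve_forever()` starts the server; requesting any path then returns the body together with the one-day caching policy.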
The ArvanCloud Caching Policies
- The Request Phase
In this phase, the request received from the client is first compared with the list of files cached on the ArvanCloud edge servers. If the request matches a cached resource that is still fresh, the edge servers return it to the user directly. However, if the requested resource has expired, the ArvanCloud edge servers first ask the central server that hosts the website to validate it and then return a response to the user.
Which files can be cached on the ArvanCloud edge servers depends on your caching settings in the ArvanCloud panel. For example, if you choose to cache all files, the ArvanCloud edge servers store all website resources. In that case, the edge servers skip the file-format check when a user request arrives.
- The Response Phase
In this phase, when the ArvanCloud edge servers receive a request for a cacheable resource, they first search their cache for it. If the resource is not found, they send a request to the central server that hosts the website to fetch it, and then return the response to the user.
Depending on its headers, the response sent from the central server to the edge servers falls into one of two cases:
- It is cacheable. In that case, the edge servers use the stored resource to answer the next request for it.
- Or it is not cacheable. In that case, the steps above are repeated every time a user requests the resource.
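The two phases can be pictured with a small in-memory cache in Python. This is only an illustrative model, not ArvanCloud's implementation: the class name `EdgeCache`, the `fetch_origin` callable that returns a body and a max-age, and the HIT, EXPIRED, and MISS labels are all assumptions. For simplicity, an expired entry is fetched again in full rather than validated with a conditional request.

```python
import time
from typing import Callable, Dict, Tuple


class EdgeCache:
    """Toy edge cache: serve fresh entries, otherwise go to the origin."""

    def __init__(self, fetch_origin: Callable[[str], Tuple[bytes, int]],
                 clock: Callable[[], float] = time.monotonic):
        self._fetch = fetch_origin  # path -> (body, max_age_seconds)
        self._clock = clock
        self._store: Dict[str, Tuple[bytes, float]] = {}  # path -> (body, expires_at)

    def get(self, path: str) -> Tuple[bytes, str]:
        """Return (body, status), where status is 'HIT', 'EXPIRED', or 'MISS'."""
        entry = self._store.get(path)
        if entry is not None and self._clock() < entry[1]:
            return entry[0], "HIT"  # request phase: still fresh, answer from cache
        status = "EXPIRED" if entry is not None else "MISS"
        body, max_age = self._fetch(path)  # ask the central server
        if max_age > 0:
            # Response phase: the response is cacheable, keep it for next time.
            self._store[path] = (body, self._clock() + max_age)
        else:
            # Not cacheable: every request repeats the trip to the origin.
            self._store.pop(path, None)
        return body, status
```

With an origin that returns a max-age of 60, the first `get` is a MISS, later calls within 60 seconds are HITs, and the first call after that is EXPIRED and refreshes the entry.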
Browser Cache Settings on the ArvanCloud User Panel
You can define how long data may be stored in your users' browser caches. To do so, complete the following steps:
- Open the Caching Settings section on the ArvanCloud panel,
- Go to the Advanced Settings section,
- Activate the Cache Data on the Browser option.
Note that browser caching stores your website's resources in your users' browsers, and you have no access to those browsers to clear their caches.