Caching strategies for authentication

Ted Spence
tedspence.com
Mar 11, 2022 · 7 min read

You should expect high-performance authentication from your APIs

It takes time to verify a password or any other kind of authentication token. Yet customers demand high performance from our APIs. So how can we bridge this gap?

The stumbling block is that verifying an authentication token is slow on purpose. If we could verify a password in a short period of time, attackers could guess passwords and break into our software by brute-force.

Software that checks authentication on each and every API call is slow, and we can do better. We can design a robust, secure caching solution that preserves correctness while delivering high performance. Here’s how to build this system.

The Uinta-Wasatch-Cache national forest

What type of authentication system do you have?

The solutions I present here are authentication-system-independent. They work equally well whether you are validating authentication results via a database, via a remote server, or via an authentication provider like Okta or Azure B2C.

If you are able to, I strongly encourage you to avoid storing passwords in your own system at all. Regardless of which service you use to manage authentication, you can still apply these caching rules to achieve safe, sensible performance for your software.

Are we permitted to cache authentication results?

The primary risks with authentication caches are:

  • Accepting an API call made with an old password or expired bearer token;
  • Allowing a user to make API calls after their credentials are revoked; and
  • Cache key collisions that serve one person’s cached authentication results to another.

How can we address this? Let’s begin by imagining a very simple system that does no caching at all — where every API call is tested for authentication. Here’s how it works:

  • When an API call arrives, we contact the authentication server and request verification of the authentication token provided with the API call.
  • Once that token is verified, we do the work of the API call and give the result.

In this scenario, the instant a user’s credentials change or are revoked, the system will immediately notice and reject all further API calls. However, any API calls that were in flight when the change occurred may or may not succeed. They might have tested authentication before the change, or they might have done so afterwards.

Now let’s imagine that our authentication server is not a single server but is instead a cluster of machines. Without exact details of the clustering strategy, we don’t know how long it takes for a change to be visible to all machines in the cluster.

With this realization, we can conclude as follows:

Changes to a user’s authentication status take time to propagate to all machines in our system. We should notify users that there is a short window after an authentication change before they can be certain that all further API calls will respect the new authentication privileges.

This realization frees us up to implement caching in our authentication system. If we say that all authentication changes will be visible across our clustered software within a certain number of minutes, we can communicate behavioral expectations to users clearly, and this frees us up to improve performance — as long as we can satisfy the correctness guarantee.

Caching authentication results

Now that we have documented this requirement and explained it to end users, how should we implement caching to meet this guarantee securely?

  • My team chose a guarantee time of five minutes.
  • We implemented a three-tiered caching system: local memory, a remote key-value cache db, and finally verification via the original service.
  • Our software checks the cache tiers in order of performance. Checking the in-memory cache takes about 10–20 microseconds; checking Redis takes about 1–4 milliseconds; and verifying the raw password or token takes 20–70 milliseconds.

If the cache item is not present and we have to verify the raw password or token, we then push the results into the in-memory and key-value caches so that future API calls can reuse the cached result for a short time.
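The tiered lookup described above can be sketched as follows. This is a minimal illustration in Python, not the original production code: the function names and the 240-second TTL are assumptions, and plain dictionaries stand in for the real in-memory cache and the remote Redis tier.

```python
import hashlib
import time

# Plain dicts stand in for the real cache tiers in this sketch.
local_cache = {}   # tier 1: in-process memory (~10-20 microseconds)
remote_cache = {}  # tier 2: remote key-value store such as Redis (~1-4 ms)

def verify_against_auth_server(token):
    """Placeholder for the slow, full verification (20-70 ms in practice)."""
    return token == "valid-token"  # assumption: a single hard-coded credential

def check_auth(token, ttl_seconds=240):
    key = hashlib.sha256(token.encode()).hexdigest()
    now = time.monotonic()

    # Tier 1: local memory.
    entry = local_cache.get(key)
    if entry and now - entry["at"] < ttl_seconds:
        return entry["ok"]

    # Tier 2: remote key-value cache.
    entry = remote_cache.get(key)
    if entry and now - entry["at"] < ttl_seconds:
        local_cache[key] = entry  # promote the entry to the faster tier
        return entry["ok"]

    # Tier 3: full verification, then populate both caches for future calls.
    ok = verify_against_auth_server(token)
    entry = {"ok": ok, "at": now}
    local_cache[key] = entry
    remote_cache[key] = entry
    return ok
```

The ordering matters: the fastest tier is consulted first, and a hit in a slower tier is promoted upward so subsequent calls stay on the fast path.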

But how can you safely cache this value? It’s important not to store raw passwords or keys in the cache. An attacker who happens to capture a crash dump might otherwise be able to read key names or values directly from memory. We chose to use a one-way hash system that works like this:
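A minimal sketch of such a one-way hashed cache key, assuming SHA-256 and an invented key layout:

```python
import hashlib

def auth_cache_key(token: str, remote_ip: str) -> str:
    """Build a cache key from a one-way hash of the credential.

    The raw token never enters the cache: someone reading a crash dump
    of cache memory would see only the digest. Including the caller's
    remote IP address scopes the entry to one caller, which reduces the
    chance that a hash collision lets one person reuse another's result.
    """
    digest = hashlib.sha256(f"{remote_ip}|{token}".encode()).hexdigest()
    return f"auth:{digest}"
```

For example, `auth_cache_key("my-secret", "203.0.113.7")` and the same token from a different IP address produce entirely different cache keys.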

You’ll notice a few interesting items in this code snippet:

  • We use the remote IP address as part of our cache key. This means that a developer writing programs that run on different IP addresses may need to authenticate multiple times, but it also reduces the chances that a one-way hash collision could allow one person to make use of another person’s cached authentication results.
  • We use two different cache expiration timers: an absolute expiration timer, and a “staleness” timer.
  • The absolute expiration timer should be shorter than the guaranteed correctness window we discussed earlier. For example, if a user successfully authenticates at time T and their authentication is revoked at time T+1ms, our system must re-check authentication well before the five-minute guarantee elapses. I suggest setting the absolute expiration timer one minute shorter than the guarantee period.

With this system, we have an interesting problem. A developer will see one slow API call first (the call that performs full authentication), then fast API calls for roughly four minutes, then another slow call when the cache expires. The system is usually fast, with occasional slowdowns. This is where the staleness timer comes in.

Using a cache staleness timer and background refresh

Most caching systems support absolute expiration or sliding window expiration — but we want the ability to refresh a stale item even while we allow its results to be considered valid.

Here’s how the staleness timer works:

  • Authentication results whose age is less than the staleness timer are valid.
  • Authentication results whose age is older than the absolute expiration timer are invalid, and we must do a full verification.
  • Authentication results older than the staleness timer but newer than the absolute expiration timer are valid, but the system immediately triggers the verification task in the background. When the background task finishes, the fresh authentication result is saved in the cache without slowing down any API calls in flight.

Using this system, we should set our staleness timer to roughly half the duration of our absolute expiration timer. This means that if someone continues to make API calls regularly, every two minutes we will kick off a background task to re-verify their authentication credentials. A developer making use of a lengthy series of API calls will see consistently high performance, and we’ll still check authentication regularly.
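The staleness rules above can be sketched like this. The timer values follow the article (four-minute absolute expiration, staleness at half that); the threading approach, names, and the in-flight set are illustrative assumptions, not the original implementation.

```python
import threading
import time

EXPIRE_AFTER = 240.0  # absolute expiration: full synchronous re-check
STALE_AFTER = 120.0   # staleness: serve the entry, refresh in background

cache = {}
refresh_in_flight = set()  # guard: at most one background check per key
lock = threading.Lock()

def full_verify(token):
    """Placeholder for the slow, full verification."""
    return token == "valid-token"

def background_refresh(key, token):
    ok = full_verify(token)
    with lock:
        cache[key] = {"ok": ok, "at": time.monotonic()}
        refresh_in_flight.discard(key)

def check_auth(key, token):
    now = time.monotonic()
    entry = cache.get(key)
    if entry is None or now - entry["at"] >= EXPIRE_AFTER:
        # Missing or expired: slow path, the caller waits.
        ok = full_verify(token)
        cache[key] = {"ok": ok, "at": now}
        return ok
    if now - entry["at"] >= STALE_AFTER:
        # Stale but still valid: answer immediately, refresh in background.
        with lock:
            if key not in refresh_in_flight:
                refresh_in_flight.add(key)
                threading.Thread(target=background_refresh,
                                 args=(key, token), daemon=True).start()
    return entry["ok"]
```

The `refresh_in_flight` set implements the single-check guard: without it, a burst of API calls against one stale entry would spawn a background verification per call.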

If your caching system is robust, you can also ensure that only one background check is in flight for any one API key at a time — if you don’t, you might accidentally verify the same value dozens of times in the background without realizing it!

Recheck in case of failure

Let’s consider another type of failure case. A user has their account temporarily disabled for one reason or another; maybe they forgot to pay a bill. They make an API call, and they receive an error message that their account is disabled.

What’s the user going to do? They will call customer support, who will re-enable the account, but our caching system will keep serving the cached failure until the cache entry is recomputed. We can imagine how this might lead to an extremely unhappy customer support experience:

  • Customer: “I just got an API error because my account was locked out. Please re-enable my account.”
  • Representative: “I’ve re-enabled your account.”
  • Customer: “I tried again, and it still says I’m locked out.”

To fix this problem, and to spare the poor customer service representative a miserable time, we have to force a full re-check on every failure. You can also implement this as “don’t-cache-failures,” but if you do so, you need to ensure that a failed check also evicts any previously cached success.
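A sketch of the don’t-cache-failures approach, with invented names and a set standing in for the account database (expiration timers are omitted here for brevity; real code would combine this with the timers above):

```python
import time

cache = {}
ACTIVE_TOKENS = set()  # stand-in for the real account/credential store

def full_verify(token):
    """Placeholder for the slow, full verification."""
    return token in ACTIVE_TOKENS

def check_auth(key, token):
    entry = cache.get(key)
    if entry is not None and entry["ok"]:
        return True
    # No entry, or a previous failure: always re-verify. The fresh result
    # must overwrite whatever was cached before, so a re-enabled account
    # works on the very next API call.
    ok = full_verify(token)
    if ok:
        cache[key] = {"ok": True, "at": time.monotonic()}
    else:
        cache.pop(key, None)  # don't cache failures; drop any old entry
    return ok
```

Because failures are never cached, a customer whose account is re-enabled succeeds on their next call instead of waiting out a cache window, and anyone brute-forcing tokens pays the full verification cost on every guess.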

This has the side benefit of slowing down anyone who is randomly guessing passwords or tokens, while it also allows customer service team members to fix API problems instantly.

Cache invalidation

Ideally, your software will also be able to invalidate caches the moment a change occurs. This can be tricky, because you need to maintain a list of the cache keys affected by any specific change. One approach is to track the cache keys tied to each user ID, and then revoke all of those keys whenever that user’s record changes.
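A sketch of that user-scoped invalidation, with invented names and dictionaries standing in for the real cache:

```python
from collections import defaultdict

cache = {}                        # cache key -> cached auth result
keys_by_user = defaultdict(set)   # user ID -> every cache key we created

def cache_result(user_id, cache_key, ok):
    """Store a result and remember which user it belongs to."""
    cache[cache_key] = ok
    keys_by_user[user_id].add(cache_key)

def invalidate_user(user_id):
    """Revoke every cached entry tied to one user's record."""
    for key in keys_by_user.pop(user_id, set()):
        cache.pop(key, None)
```

The reverse index is the whole trick: without `keys_by_user`, a change to one user’s record gives you no way to find the hashed cache keys that must be dropped.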

If you can’t do cache invalidation carefully, please make sure to clearly document your authentication guarantee timer.

Check your software carefully

As with any critical piece of code, what’s important in this software is not only the intention of the code but the actual execution. Please ask your top engineers to review your authentication cache code slowly, and flag anything unnecessarily complicated as a bug.

The greatest risk in authentication is complacency — the sense that “Oh, I’ve done this a million times, this isn’t special.” Any time you work on authentication you need to be extremely cautious. I encourage you to check the OWASP top ten vulnerability list before doing a code review of something so sensitive as authentication.

Ted Spence teaches at Bellevue College and leads engineering at Lockstep. If you’re interested in software engineering and business analysis, I’d love to hear from you on LinkedIn.
