Make Your PWA Work Offline Part 2 - Dynamic Data

As you probably remember from my previous post on making your Progressive Web App work offline, storing static files in the cache can be of great help when there’s no Internet connection available.

Yet, achieving full offline support for your app involves more than just caching, and now it's time to show you something more advanced, namely handling dynamic data offline.

In this article, I will show you how to store, update, and synchronize data between your application and a server, drawing on some specific use cases and real-life solutions.

First, let’s go one more time over what dynamic data is—imagine an app like Twitter as a Progressive Web App example.

Aside from static data that you can store in the cache, what else can you find there? Posts, users, comments, likes, and a lot more information that changes over time–dynamic data, in other words.

There are several ways of handling this type of data in offline mode, and below I’ll briefly outline a few, each of which addresses a different behavior of the app. You can use them separately or mix them up, it's up to you.

Cache API

As I already mentioned in my previous post about static data, you can save files as a request-response pair in the cache by using the Cache API.

It's intended not just for static files from the app, but also for external resources, like responses from the backend API.

Does it mean that you can use the Cache API alongside a service worker (SW) to store dynamic data?

Well, the SW gives you the install event, fired only once in the SW life cycle, so you can use it if the data is very unlikely to change in the future, or if it changes only along with the SW version.

The SW also offers the fetch event, for changes that might happen more often and less predictably.

This option allows you to build middleware for every request done in the app. But before you start the implementation, consider a strategy of serving data that will best fit your needs.

The Google team also included some strategies in their "Offline Cookbook." Let's dive straight into the contents and analyze them using the example of the Twitter app and fetching tweets.

1. Cache only

In this case, the tweets will be fetched only from the cache and should be saved there in the install event, so they’ll be updated only once per version of the SW. This strategy is best for handling static data.

The “cache only” strategy (source: ”Offline Cookbook”).

2. Network only

A standard strategy which doesn’t require the SW. In this case, the app will just fetch tweets straight from the API, so if the user doesn’t have an Internet connection or it’s too weak, they won't get any tweets at all.

The “network only” strategy (source: ”Offline Cookbook”).

3. Cache, falling back to network

In this instance, the tweets will be loaded from the cache and, if they aren’t there, the SW will fetch them from the API. This, in turn, means that the user will see tweets even while offline, but they won't be the most recent ones.

The “cache, falling back to network” strategy (source: ”Offline Cookbook”).
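
A minimal sketch of this strategy, not a definitive implementation. The function name is my own, and the cache storage and fetch function are passed in as parameters (an assumption I made so the logic can be tried outside a service worker).

```javascript
// Cache first; ask the network only on a cache miss.
// `cacheStorage` and `fetchFn` are injected (an assumption made so the function
// can be tested); in a service worker they are the globals `caches` and `fetch`.
async function cacheFallingBackToNetwork(cacheStorage, fetchFn, request) {
  const cached = await cacheStorage.match(request);
  return cached || fetchFn(request);
}

// Register the handler only when this file runs as a service worker.
if (typeof ServiceWorkerGlobalScope !== 'undefined') {
  self.addEventListener('fetch', event => {
    event.respondWith(cacheFallingBackToNetwork(caches, fetch, event.request));
  });
}
```

Note that nothing here writes to the cache, which is why the tweets served this way stay as old as whatever was stored earlier.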

4. Cache & network race

In this scenario, tweets will be fetched from the faster source and won't be updated in the cache if they’re already there.

This creates a problem similar to the one we encountered with the previous strategy. In this case, however, it can be partly fixed by updating the cache whenever the API fetching is successful.

All you need to do is chain the following onto fetch(event.request), where cache is the opened cache: .then(response => { cache.put(event.request, response.clone()); return response; }).

Like I said above, this particular fix is only partial, because in all likelihood the user won't see the most recent tweets in either online or offline mode.

Why?

Getting data from the cache is usually faster, and so the user will be seeing tweets from the previous response in the cache.

The “cache & network race” strategy (source: ”Offline Cookbook”).
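
Here is how the race with the cache update could look. Treat it as a sketch under assumptions: the function name is hypothetical, the cache and fetch function are passed in as parameters, and I use Promise.any, which needs a reasonably modern browser.

```javascript
// Resolve with whichever source answers first; a cache miss must not "win" the race.
async function cacheAndNetworkRace(cache, fetchFn, request) {
  const fromCache = cache.match(request).then(response => {
    if (!response) throw new Error('cache miss');
    return response;
  });
  const fromNetwork = fetchFn(request).then(response => {
    if (response.ok) {
      // Refresh the stored copy; clone because a body can be read only once.
      cache.put(request, response.clone()).catch(() => { /* ignore storage errors */ });
    }
    return response;
  });
  // Promise.any ignores rejections until every promise has rejected.
  return Promise.any([fromCache, fromNetwork]);
}

// Register the handler only when this file runs as a service worker.
if (typeof ServiceWorkerGlobalScope !== 'undefined') {
  self.addEventListener('fetch', event => {
    event.respondWith(
      caches.open('mysite-dynamic')
        .then(cache => cacheAndNetworkRace(cache, fetch, event.request))
    );
  });
}
```

When the cache answers first, the user still gets the previously stored tweets, and the fresh ones only show up on the next request.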

5. Network falling back to cache

The most promising option. The tweets will be loaded from the API first and if that method fails, they will be fetched from the cache.

The app will be able to serve the most recent data and work offline.

What is missing from the "Offline Cookbook," however, is the fact that if you want the most recent data to be stored in the cache, you need to put it there every time you fetch it from the API, just like I described in the previous case.

Still, there’s one more disadvantage to this particular solution, in my opinion at least: you need to wait until the data is fetched (or not fetched) from the network, so the whole process takes a little longer.

The “network falling back to cache” strategy (source: ”Offline Cookbook”).
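
The variant that also refreshes the cache on every successful fetch can be sketched like this. The function name is hypothetical, and the cache and fetch function are parameters only so that the logic can be tried in isolation.

```javascript
// Network first; on a network failure, serve the last stored copy.
async function networkFallingBackToCache(cache, fetchFn, request) {
  let response;
  try {
    response = await fetchFn(request);
  } catch (networkError) {
    // fetch rejects when the network is unreachable: use the last stored copy.
    const cached = await cache.match(request);
    if (cached) return cached;
    throw networkError; // nothing stored either
  }
  if (response.ok) {
    // Keep the offline copy fresh; clone because a body can be read only once.
    await cache.put(request, response.clone());
  }
  return response;
}

// Register the handler only when this file runs as a service worker.
if (typeof ServiceWorkerGlobalScope !== 'undefined') {
  self.addEventListener('fetch', event => {
    event.respondWith(
      caches.open('mysite-dynamic')
        .then(cache => networkFallingBackToCache(cache, fetch, event.request))
    );
  });
}
```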

6. Cache then network

This strategy involves loading tweets from the cache (if it’s faster) and then updating them with the value from the API.

This arrangement gives the user older tweets first, while loading the most recent ones only after some time.

It’s the perfect solution to the problem of handling dynamic data offline, and this is how Twitter’s native app actually works.

Alas, nothing comes for free and this particular strategy is way harder to implement than the others.

It uses the SW fetch event in a manner similar to the "Network falling back to cache" strategy (with sending the response to the cache), but the most important part exists outside the SW file.

It also requires additional code in places where you’re calling for a resource, plus, you need to fetch data from the API and from the cache at the same time, and update your view each time you get a response.

The strategy is outlined in the "Offline Cookbook," using the trained-to-thrill app as an example.

The “cache then network” strategy (source: ”Offline Cookbook”).
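
The page-side part of this strategy might look roughly like the sketch below. It assumes the SW already stores the API responses in the cache. The function name, the render callback, and the injected cacheStorage and fetchFn are all hypothetical names of mine.

```javascript
// Page-side helper: render what the cache has, then render again with fresh data.
// `render(tweets, source)` is a hypothetical callback that updates the view.
async function cacheThenNetwork(url, { cacheStorage, fetchFn, render }) {
  let networkDone = false;

  const networkUpdate = fetchFn(url)
    .then(response => {
      if (!response.ok) throw new Error('Request failed');
      return response.json();
    })
    .then(tweets => {
      networkDone = true;
      render(tweets, 'network');
    });

  const cacheUpdate = cacheStorage.match(url)
    .then(response => {
      if (!response) throw new Error('No cached data');
      return response.json();
    })
    .then(tweets => {
      // Never overwrite fresh network data with older cached data.
      if (!networkDone) render(tweets, 'cache');
    })
    .catch(() => { /* nothing cached yet; wait for the network */ });

  // A failed network request is ignored here: the cached view stays on screen.
  await Promise.all([cacheUpdate, networkUpdate.catch(() => {})]);
}
```

In a page, a call could look like cacheThenNetwork('/tweets', { cacheStorage: caches, fetchFn: fetch, render: showTweets }), where showTweets is whatever function redraws your list.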

In summary: If you prefer an easy solution which will certainly present the most recent tweets in online mode, and the most likely up-to-date tweets in offline mode, use the "Network falling back to cache" strategy. However, if your project schedule gives you enough time to work on a better solution, use the "Cache then network" strategy, as it will result in a much better user experience.

Remember, you should add filtering to the code offered in the "Offline Cookbook," to save in the cache only the responses you need rather than all of them. Here’s an example of such a filter:


self.addEventListener('fetch', event => {
  if (event.request.method === 'GET' && event.request.url.indexOf('https://API-host/tweets') !== -1) {
    event.respondWith(
      caches.open('mysite-dynamic').then(cache => {
        return fetch(event.request).then(response => {
          cache.put(event.request, response.clone());
          return response;
        });
      })
    );
  }
});

Basically, you just need to add an if statement and check whether a request is asking for tweets or some other resource that you’d like to save in the cache.

In other cases, you don't need to call event.respondWith at all; the browser will handle the request without your input.

However, if you have more than one resource that should be stored for offline mode, you need to provide a more advanced filter.
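
One possible shape for such a filter is a small predicate with a list of path patterns. The API origin and the resource paths below are made up for illustration, so adjust them to your own API.

```javascript
// Hypothetical API origin and resource paths, chosen only for illustration.
const API_ORIGIN = 'https://api.example.com';
const CACHEABLE_PATHS = [/^\/tweets(\/|$)/, /^\/users(\/|$)/, /^\/comments(\/|$)/];

// Decide whether a request should be stored for offline use.
function shouldCache(method, url) {
  if (method !== 'GET') return false; // the Cache API stores only GET requests
  const { origin, pathname } = new URL(url);
  return origin === API_ORIGIN && CACHEABLE_PATHS.some(pattern => pattern.test(pathname));
}

// Register the handler only when this file runs as a service worker.
if (typeof ServiceWorkerGlobalScope !== 'undefined') {
  self.addEventListener('fetch', event => {
    const { method, url } = event.request;
    if (!shouldCache(method, url)) return; // let the browser handle the request
    event.respondWith(
      caches.open('mysite-dynamic').then(cache =>
        fetch(event.request).then(response => {
          cache.put(event.request, response.clone());
          return response;
        })
      )
    );
  });
}
```

Keeping the decision in a separate function means you can extend the list of patterns without touching the fetch handler.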

Sometimes, a user may request to save only selected resources (e.g. a single tweet), for example by clicking on a chosen tweet.

If you want to grant them the capability, you can use the Cache API in the same way. Call cache.put(request, response) to store a response you already have, or cache.add(request), which fetches the response and puts it in the cache for you (the add method returns a Promise that resolves with undefined, so you won't have access to the response).
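
Both variants can be sketched as follows. The function names and the tweet URL are hypothetical, and the cache storage and fetch function are passed in as parameters; in a page you would pass the globals caches and fetch.

```javascript
// Variant 1: cache.add fetches the URL and stores the response in one step.
// It resolves with undefined, so the response itself is not available here.
async function saveTweetOnRequest(cacheStorage, tweetUrl) {
  const cache = await cacheStorage.open('mysite-dynamic');
  await cache.add(tweetUrl);
}

// Variant 2: fetch first, then cache.put, when you also need the response.
async function saveFetchedTweet(cacheStorage, fetchFn, tweetUrl) {
  const response = await fetchFn(tweetUrl);
  if (!response.ok) throw new Error(`Unexpected status ${response.status}`);
  const cache = await cacheStorage.open('mysite-dynamic');
  // Store a clone so the caller can still read the original body.
  await cache.put(tweetUrl, response.clone());
  return response;
}
```

A click handler on a tweet could then call saveTweetOnRequest(caches, '/tweets/' + tweet.id), assuming your API exposes single tweets under such a URL.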

Query Parameters

The Cache API is a store with pairs of requests and responses, and in some cases this might be an issue. Let me explain using a simple case of tweet pagination/filtering.

Usually, to fetch a specific page, we put a query param to the request. If we don’t tell the app to ignore query params while fetching the data from the cache or saving it there, we will get each page of tweets in a separate request-response pair. This can be a problem.

Let's say that you’ve saved a few pages of tweets in the cache. After time passes and more tweets arrive, the app will update the first page of tweets in the cache by overwriting previously saved tweets. In order not to lose those old tweets, the app needs to save them in the cache as a response to another request (probably as a next page).

However, this approach offers no certainty that old tweets won’t appear on the new first page.

To avoid redundancy, the app needs to compare these two responses with the tweets and then maybe modify them in the cache.

If some tweets are moved from the first to the second page, then other tweets should be moved from the second to the third page to fill the gap, and so on.

Imagine you have ten pages stored and you need to update them every time a new tweet appears. Sounds too complicated and problematic to me.

A simplified flow of saving tweets in cache storage while respecting query params.

What if the app ignored query params?

Well, then there would always be only a single page of tweets stored in the cache—unless you stored all of them (rather than only the ones on one page) in one response.

But this approach would only produce even more mess, because the API and the cache would produce different responses to the same request, i.e. the app would return hundreds of unfiltered tweets from the cache when you’d ask for the first ten tweets as filtered by the user.

A simplified flow of saving tweets in storage while ignoring query params and handling them manually.

Of course, you can add a flag to a response while saving tweets in the cache— this will help you recognize the tweets already in the cache.

All you would have to do while fetching them, is choose the proper tweets based on page number (or filters).

You would need to add logic to match tweets with an existing response and compare them every time you’d want to save them in the cache.

You can use any type of browser storage (session storage, local storage, IndexedDB) to implement this kind of solution and thus avoid the disadvantages associated with cache storage.
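
To make the idea more concrete, here is a small sketch of storing tweets by id and cutting pages out of them manually. A plain Map stands in for whichever browser storage you pick, and the id and createdAt fields are my assumptions about the shape of a tweet.

```javascript
// Merge freshly fetched tweets into the local store, keyed by id.
// A Map stands in for any browser storage (an assumption for illustration).
function mergeTweets(store, incomingTweets) {
  for (const tweet of incomingTweets) {
    store.set(tweet.id, tweet); // the same id overwrites, so there are no duplicates
  }
  return store;
}

// Build a page on demand instead of storing one response per page.
function getPage(store, page, pageSize) {
  const newestFirst = [...store.values()].sort((a, b) => b.createdAt - a.createdAt);
  const start = (page - 1) * pageSize;
  return newestFirst.slice(start, start + pageSize);
}

const store = new Map();
mergeTweets(store, [{ id: 1, createdAt: 100 }, { id: 2, createdAt: 200 }]);
mergeTweets(store, [{ id: 2, createdAt: 200 }, { id: 3, createdAt: 300 }]);
console.log(getPage(store, 1, 2).map(tweet => tweet.id)); // [3, 2]
```

Because pages are computed at read time, a new tweet never forces you to reshuffle stored pages.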

All in One

Continuing on the subject of storage, why limit yourself to just one?

There are libraries out there which wrap storages in the browser with one API and choose from them to store data based on what the current browser is supporting or what your needs are.

Thanks to such an approach, you wouldn’t have to code different solutions for different browsers and could easily change your default storage in the future should the need arise.

Examples of such libraries include localForage, Minimongo or PouchDB.

The solution in which data is stored without taking query params into consideration can be implemented in every browser store.

Such libraries usually offer a better API than Cache API.

localForage has an API which works similarly to the localStorage API, but it supports async operations through Promises or callback functions. Additionally, you can store native JS objects there (e.g. Blob, Array, Object) rather than just strings; if it has to fall back to localStorage, it serializes the objects automatically.
Minimongo is client-side MongoDB, so it feels like working with a document-oriented database. You can create collections and use basic queries, and you get an advanced API to manage data in the browser. If you are using MongoDB on the backend, it will offer features to merge or combine remote and local data, so you don't have to worry about synchronization.
PouchDB is a JS NoSQL database inspired by Apache’s CouchDB and supporting synchronization with this particular database using the CouchDB sync protocol on the backend.

Working with such libraries is usually similar to working with document-oriented databases (even if you choose IndexedDB as a default store) because of their ability to adapt to other storages.

Global State

The global state has become a very popular concept as of late.

One of its main advantages is the single source of truth.

It only requires the definition of some data structures and schemas in order to simplify subsequent data maintenance and handling.

You get a data store that is ready to be saved in any browser storage of your choice, so you will be able to use it in offline mode.

Actually, many libraries allow you to do it automatically just by setting a param and choosing which data should be stored.

The data will be updated in the global state along with data in the storage; plus, the global state will be filled with data from the storage on init.
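
Stripped of any particular library, the mechanism could be sketched like this. It is not the API of Redux, Vuex, or any other library; createPersistedStore and its parameters are hypothetical, and storage is any object with getItem and setItem, such as localStorage.

```javascript
// A tiny global store that mirrors its state to a storage object.
// `storage` needs getItem/setItem (localStorage-like); the name is an assumption.
function createPersistedStore(storage, key, initialState) {
  // On init, fill the state with data from the storage, if there is any.
  const saved = storage.getItem(key);
  let state = saved ? JSON.parse(saved) : initialState;
  const listeners = new Set();

  return {
    getState: () => state,
    subscribe(listener) {
      listeners.add(listener);
      return () => listeners.delete(listener);
    },
    setState(patch) {
      state = { ...state, ...patch };
      // Every update goes to the storage as well, so it survives a reload.
      storage.setItem(key, JSON.stringify(state));
      listeners.forEach(listener => listener(state));
    }
  };
}
```

In a browser you could create it with createPersistedStore(localStorage, 'app-state', { tweets: [] }); the key name is just an example.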

Implementing a global state can help a lot in providing offline support.

There’s only one drawback—most of the time, you won’t be able to choose the type of data storage as it will be done by the library by default.

A sample architecture of data flow in Vuex (source: Vuex).

If you’re not familiar with the concept, you can read the official guides of the state management libraries. Most of them have pretty good intros, with basic concepts and data flows.

And there are a lot of libraries out there that allow an easy implementation and insertion of global state into your apps, like Redux, MobX, Vuex, NgRx, and NGXS.

Transactions

Sometimes, connections between data are so important that changing one item in data storage requires us to introduce changes in other parts of the data, each of which has to be performed as a separate operation on the storage.

In such a case, it's important to finish all the operations successfully (ideally as a single operation), or you will have to roll back each change in the data storage if even one operation fails.

This behavior is usually called a transaction, but some call it batched or bulk operations.

In the context of the Twitter app, imagine a situation where you fetch and store tweets in offline storage. Along with the tweets themselves, you should also store data about comments and authors of these tweets.

To make sure that all the required data is successfully saved for offline use, you can perform these operations inside a transaction.

Unfortunately, there aren’t all that many data storage tools or state management libraries out there that would include transaction support.

Currently, IndexedDB is the only browser storage to have such a feature, but some libraries (not too many, unfortunately) that serve as an adapter for browser storages and state management libraries carry their own implementation of this feature.

You have to be careful while using them, because some implementations of transactions do not exactly agree with the textbook definition thereof.

For instance, in MobX, transactions wrap and queue actions in such a manner that no observer gets notified of the completion of an action until all of them are finished.

If any one of those actions fails, MobX will not roll back the changes made by the previous actions.
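
For contrast, all-or-nothing behavior in the textbook sense can be sketched over a plain in-memory Map. This is not how MobX or IndexedDB implement it; runInTransaction is a hypothetical helper that only illustrates the rollback semantics.

```javascript
// Run every operation against a working copy and commit only if all succeed.
function runInTransaction(store, operations) {
  const draft = new Map(store); // shallow copy: the original stays untouched until commit
  for (const operation of operations) {
    operation(draft); // a throw here aborts before anything is committed
  }
  // Commit: replace the store's content with the draft.
  store.clear();
  for (const [key, value] of draft) {
    store.set(key, value);
  }
}

const offlineStore = new Map();
try {
  runInTransaction(offlineStore, [
    draft => draft.set('tweet:1', { content: 'Hello' }),
    () => { throw new Error('Saving the author failed'); }
  ]);
} catch (error) {
  console.log(offlineStore.size); // 0, the tweet was not saved either
}
```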

An Interactive Offline App

If you want to add offline support to your app, you should probably consider allowing users to perform offline interactions, rather than just read data stored in the offline cache.

In the context of Twitter, this means the users should be able to post tweets, write comments, like other people’s tweets, etc., even without an Internet connection.

Below, you’ll find a handful of solutions that make this possible.

Background Sync

One such option involves the Background Sync API along with service workers. What does it do?

Simply put, it's a low-level feature for running code inside registered events in the background when the user is connected to the Internet, even when the page itself is closed.

In the context of offline support, you can use it to hold requests until the user has their Internet connection back, to make sure that they will be sent.

This way, you will make sure that your app is synced with a remote database and all changes a user made are there or will be sent there. Here’s how you can use it:

Background Sync flow (source of inspiration: ”Offline Cookbook”).

1. Register a sync event

First, you need to register a sync event during which you’ll be syncing the data you want, e.g. tweets:


if ('serviceWorker' in navigator) {
  navigator.serviceWorker.register('./serviceworker.js')
    .then(() => navigator.serviceWorker.ready)
    .then(registration => {
      if ('SyncManager' in window) {
        registration.sync.register('sync-tweets');
      }
    })
}

However, you can't register a sync event before the relevant service worker is ready. To check its status, use navigator.serviceWorker.ready. It is a Promise that resolves when the worker is actually ready.

Then, you just need to check whether the browser supports background sync and register your event using registration.sync.register('sync-tweets'), where sync-tweets is a unique tag for this event. You can register more than one sync event, but each one has to have a different tag.

2. Create a listener for the sync event

Next, create a listener for the sync event in the service worker. It will capture all sync events, so you can add a snippet to its code which will make it run only when a specific event is fired.


self.addEventListener('sync', event => {
  if (event.tag === 'sync-tweets') {
    event.waitUntil(syncTweets());
  }
});

The listener is added in the first line. The next line checks whether the fired event is the proper one by comparing the tag property. If so, then you can run a function inside event.waitUntil (described in the previous post) which will send all new or updated tweets to the API.

3. Send tweets to the API

So, somewhere in your app, you can implement a function that sends the tweets to the API. In this function, if a request to the API fails on account of no Internet connection, you can catch this error and save your request in IndexedDB.

Let's create a twitter.service.js file for that:



class TwitterService {
  async createTweet(tweet) {
    return fetch('https://api/tweets', {
      method: 'POST',
      body: JSON.stringify({
        content: tweet.content,
        author_id: tweet.author.id
      }),
      headers: {
        'Content-Type': 'application/json'
      }
    }).catch(error => {
      this.saveTweetInOffline(tweet, 'POST');
      throw error;
    });
  }

  async updateTweet(tweet) {
    return fetch('https://api/tweets', {
      method: 'PUT',
      body: JSON.stringify({
        id: tweet.id,
        content: tweet.content,
        author_id: tweet.author.id
      }),
      headers: {
        'Content-Type': 'application/json'
      }
    }).catch(error => {
      this.saveTweetInOffline(tweet, 'PUT');
      throw error;
    });
  }
}

Inside the service, you can see two methods for creating and updating tweets in the API (createTweet, updateTweet). They both use the fetch API to send the request.

Let's assume that both endpoints in the API have the same URL ('https://api/tweets') but a different method (POST, PUT). You need to send a stringified object with content and author_id in the body and, in case of an update, also the id.

4. Store tweets in IndexedDB

Then, you can move on to handling errors inside the catch callback. fetch rejects its Promise only when there are connection problems (an HTTP error status such as 500 still resolves), so it fits our case perfectly. If that happens, you can store your tweets to be synced later by Background Sync.

To do that, let's add two additional methods inside this class.



  // openDB as used here matches the idb library, so this assumes
  // import { openDB } from 'idb'; at the top of twitter.service.js
  openTweetsDB() {
    return openDB('twitter', 1, {
      // Runs only when this version of the database is opened for the first time
      upgrade(db) {
        db.createObjectStore('tweetsToSync', { keyPath: 'id', autoIncrement: true });
      }
    });
  }

  async saveTweetInOffline(tweet, method) {
    const db = await this.openTweetsDB();
    const tx = db.transaction('tweetsToSync', 'readwrite');
    // Keep the whole tweet plus the HTTP method needed to replay the request
    tx.store.put({ ...tweet, method });
    await tx.done;
  }

We use openTweetsDB to open the database in IndexedDB and return that database as an object. Here, you can see the following params in openDB: the database name (twitter), the version (1), and an options object with the upgrade function, which is called only when this version of the selected database hasn’t been opened before.

In this function, you can create the store for the tweets you want to sync by using createObjectStore('tweetsToSync', { keyPath: 'id', autoIncrement: true }), where:

'tweetsToSync' is the name of your store,
{ keyPath: 'id', autoIncrement: true } are the options of this store, which set id as the key path and auto-increment the id if it's not defined in the object (as is the case for new tweets created offline).

In the context of offline support, saveTweetInOffline is the most important function here. It collects tweets in the created IndexedDB store, so the service worker will have access to them while performing the Background Sync.

First, you need to open the database by using the openTweetsDB method. Then, you need to create a transaction on the tweetsToSync store in readwrite mode, in order to be able to put your tweets in the store. You can do that by using tx.store.put({ ...tweet, method }). Now, let’s check what’s inside the object you saved:

...tweet: includes all fields from the tweet (spread syntax), unconverted, so you still have all the data needed to display the tweet on the page (like the author's name), in case you’d like to display it before the data is synced with the API,
method: new and updated tweets are stored in the same store, so this field is used to differentiate between them, taking the POST or PUT string as a value.

Finally, you need to wait for the transaction to complete by awaiting tx.done. Now, you have your tweets ready for syncing.

5. Sync tweets

The last thing we need to take care of is the syncTweets function inside the listener for the sync event in the service worker file. You need to get all the tweets that should be synced from the IndexedDB store, send them to the API, and then remove them from the store if they were successfully synced. Here is the code:



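// Assumes openDB from the idb library is available in the service worker
// (for example, loaded with importScripts).
async function syncTweets() {
  const db = await openDB('twitter', 1);
  const tweets = await db.getAll('tweetsToSync');
  const tweetsIdsToRemove = [];
  // Send all stored tweets in parallel and remember which ones succeeded
  await Promise.all(tweets.map(async ({ method, ...tweet }) => {
    try {
      const response = await fetch('https://api/tweets', {
        method: method,
        body: JSON.stringify(tweet),
        headers: {
          'Content-Type': 'application/json'
        }
      });
      if (response.ok) {
        tweetsIdsToRemove.push(tweet.id);
      }
    } catch (error) { /* Request failed: the tweet stays in the store */ }
  }));
  // Remove the synced tweets from the offline store in a single transaction
  const tx = db.transaction('tweetsToSync', 'readwrite');
  for (const id of tweetsIdsToRemove) {
    tx.store.delete(id);
  }
  await tx.done;
}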

First, you have to define variables.

You need to open the database using the openDB('twitter', 1) function (same as before, but without creating an object store), and assign the database object to the db constant.
You can use the getAll('tweetsToSync') method to get all the tweets and save them in the tweets constant. You have to pass the store's name as a param.
The last variable, tweetsIdsToRemove, is an empty array. The id of each tweet that is successfully synced will be pushed there, so you will know which tweets should be deleted from the IndexedDB store.

After that, there is a loop through the tweets, tweets.map(async ({ method, ...tweet }) => {...}), wrapped in Promise.all, which determines when the tweet syncing is done. Inside the loop, the method field and the tweet object are separated by destructuring with rest syntax. Each tweet is sent to the API by the fetch function with the proper method and body.

This part is similar to the code in twitter.service.js for creating and updating tweets.

To check which tweets were synced with the API, you have to check whether response.ok equals true. If so, a given tweet can be removed from the offline store because it’s not needed anymore. To mark it as such, the id of this tweet is pushed to tweetsIdsToRemove.

Finally, in order to remove the synced tweets, you need to open a transaction (in the same manner as in the twitter.service.js file), loop through the tweetsIdsToRemove array, and delete the tweets from the store by using tx.store.delete(id).

Congratulations, you have background tweet syncing in your app! 🍾

But hold your horses there, partner... unfortunately, as of this point, the Background Sync API still doesn’t have full support in all the major browsers 😫.

Background Sync support in major browsers.

Another drawback of Background Sync is that a user can't decide when all the data stored offline will be synced.

Imagine that a user has a weak Internet connection and prefers to use it for something more important, but can't actually choose to do so, because this closed app is hoarding the whole bandwidth to sync some data.

To prevent such situations, you will need to implement some additional logic based on user requirements.

An Alternative for Background Sync

Unfortunately, I didn't find any feasible alternative that would allow you to sync data when your app is closed. But maybe you don't actually need data to sync in the background. Maybe it should be synced only upon user request.

Background Sync won’t ask you whether a request can or should be sent.

It just sends the requests when there’s an active Internet connection. But what if the user has some pretty big requests in their storage, and the sync takes up the whole bandwidth just to send them to the server?

This can block other requests for which the user is asking in real-time and can adversely affect the smoothness of the app.

In such a case, you can call the syncTweets function inside a click event handler, or wherever it should be called. Everything else stays the same as for Background Sync; the only difference is that you don't have to define a listener for the sync event.

If you prefer to sync data in the background, and it doesn’t bother you that the app has to be open during the process, you can add a listener for the online event on window, document, or document.body, and call the syncTweets function there.
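
Both options, sync on demand and sync on the online event, can share one small helper. This is only a sketch: setUpForegroundSync, the isOnline callback, and the #sync-button selector are names I made up, and syncTweets is passed in as a parameter.

```javascript
// Wire a sync function to the `online` event and return a manual trigger.
// `target` is an event target such as window; `isOnline` reports connectivity.
function setUpForegroundSync(target, syncTweets, isOnline) {
  let running = false;

  async function run() {
    if (running || !isOnline()) return false; // avoid overlapping or pointless syncs
    running = true;
    try {
      await syncTweets();
      return true;
    } finally {
      running = false;
    }
  }

  target.addEventListener('online', run);
  return run; // call this from a click handler to sync on demand
}

// Possible usage in a page (the selector is hypothetical):
// const syncNow = setUpForegroundSync(window, syncTweets, () => navigator.onLine);
// document.querySelector('#sync-button').addEventListener('click', syncNow);
```

The running flag keeps a click and an online event from starting two syncs over the same stored tweets at once.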

Summary

There is no single good recipe for implementing offline support in Web applications. Everything depends on the business logic, the data flow, and the particular user environment. I do hope, however, that this article will help you a bit with finding your way. Just be careful, think twice, and do your research before starting the implementation.

Plus, remember to ask yourself a couple of important questions related to working with offline data, like: “How do you want to sync the changes made in the offline mode with the remote data on the server?”, “How will you resolve possible conflicts? Which changes will have bigger priority?”, etc.

Handling dynamic data offline is a very difficult and complex topic, and it’s impossible to cover all of it in just two articles. If you’re looking for more information on the subject, start by googling phrases like "cache synchronization strategies."

Mateusz Adamczyk