Backbone.Conduit Demo

Introduction

This demo aims to clarify how Backbone-based applications perform when loading large amounts of data into a Backbone.Collection. Backbone is great at many things, but data sets larger than several thousand items can result in poor application performance.

The demo runs through the same cycle using three different techniques: one uses basic Backbone techniques, while the other two use the Backbone.Conduit plugin, which provides Backbone.Collection implementations optimized for larger data sets.

What The Demo Does

The goal of this demo is to explore how long it takes to prepare large data sets for use by an application. This demo shows loading data into a Backbone collection broken down into four steps typically taken by an application:

Fetch & Parse the Raw Data
Create the Backbone.Model instances
Filter the data in a non-trivial way (here, only the most recent entry for each restaurant)
Sort the data in a non-trivial way (here, by date, then by name)

After that, the application shows the first three entries in the collection in a simple bit of HTML.
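
The four steps above, followed by showing the first three entries, might be timed along these lines. This is only a minimal sketch: the `timeStep` helper, the two sample rows, and the simplified filter and sort are hypothetical stand-ins, not the demo's actual code, and plain wrapper objects take the place of Backbone.Model instances.

```javascript
// Hypothetical helper: run one step synchronously and record how long it took.
function timeStep(name, fn, timings) {
    const start = Date.now();
    const result = fn();
    timings.push({ step: name, ms: Date.now() - start });
    return result;
}

// Stand-in raw data; the real demo loads a static JSON file from the server.
const rawJson = JSON.stringify([
    { name: "Cafe B", zip: 11225, grade: "A", date: "12/30/2014" },
    { name: "Cafe A", zip: 10462, grade: "A", date: "02/09/2015" }
]);

const timings = [];
const parsed = timeStep("Fetch & Parse", function () {
    return JSON.parse(rawJson);
}, timings);
const models = timeStep("Create Models", function () {
    // Plain wrapper objects stand in for Backbone.Model instances.
    return parsed.map(function (attrs) { return { attributes: attrs }; });
}, timings);
const filtered = timeStep("Filter", function () {
    // Simplified filter, used here only so the step has something to do.
    return models.filter(function (m) { return m.attributes.grade === "A"; });
}, timings);
const sorted = timeStep("Sort", function () {
    // Simplified sort by name only.
    return filtered.slice().sort(function (a, b) {
        return a.attributes.name < b.attributes.name ? -1 : 1;
    });
}, timings);

// Show the first entries, as the demo does after the four steps.
const firstEntries = sorted.slice(0, 3);
```

Each entry in `timings` holds a step name and its elapsed milliseconds, which is enough to compare the relative cost of the steps at different data sizes.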

About The Data

The data set used here is all the NYC Restaurant Health Grades over the past several years (~ 221K entries); the smaller data sets are just truncated versions. The demo loads static JSON files from the server, where each entry looks like:

        [
            {"name":"Morris Park Bake Sho","zip":10462,"grade":"A","date":"02/09/2015"},
            {"name":"Wendy's","zip":11225,"grade":"A","date":"12/30/2014"},
            {"name":"Dj Reynolds Pub And","zip":10019,"grade":"A","date":"09/06/2014"},
            ...
        ]

Like most real-world data, this stuff is far from perfect. Many restaurants have duplicate entries; some don't have dates on them at all. This forces us to write sorting & filtering methods that handle these typical real-world data inconsistencies.
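
A sketch of what filtering and sorting that tolerate these inconsistencies could look like. Several things here are assumptions rather than the demo's actual code: the function names are hypothetical, a restaurant is identified by its name alone, sorting puts the newest date first, and entries without a usable date are treated as older than everything else. The sample rows are made up.

```javascript
// Turn "MM/DD/YYYY" into a sortable number (YYYYMMDD). Entries with a missing
// or malformed date get -Infinity so they rank as older than any dated entry.
function dateKey(entry) {
    const match = /^(\d{2})\/(\d{2})\/(\d{4})$/.exec(entry.date || "");
    if (!match) { return -Infinity; }
    return Number(match[3] + match[1] + match[2]);
}

// Keep only the most recent entry for each restaurant name.
function mostRecentPerRestaurant(entries) {
    const latest = new Map();
    entries.forEach(function (entry) {
        const current = latest.get(entry.name);
        if (!current || dateKey(entry) > dateKey(current)) {
            latest.set(entry.name, entry);
        }
    });
    return Array.from(latest.values());
}

// Comparator: newest date first, then name alphabetically; undated entries last.
function byDateThenName(a, b) {
    const ka = dateKey(a);
    const kb = dateKey(b);
    if (ka !== kb) { return ka > kb ? -1 : 1; }
    return a.name < b.name ? -1 : (a.name > b.name ? 1 : 0);
}

// Made-up sample with a duplicate restaurant and a missing date.
const sample = [
    { name: "Cafe B", zip: 11225, grade: "B", date: "06/01/2013" },
    { name: "Cafe B", zip: 11225, grade: "A", date: "12/30/2014" },
    { name: "Cafe A", zip: 10462, grade: "A", date: "02/09/2015" },
    { name: "Cafe C", zip: 10019, grade: "A" },
    { name: "Cafe D", zip: 10019, grade: "A", date: "12/30/2014" }
];
const cleaned = mostRecentPerRestaurant(sample).sort(byDateThenName);
```

With this sample, `cleaned` holds one entry per restaurant in the order Cafe A, Cafe B, Cafe D, Cafe C, with the undated Cafe C last.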

Three Techniques

The three implementations differ in two ways: how much work they do, and whether each step in the process is synchronous or asynchronous. The less work a step does, the less time it takes. And since the UI in browsers is single-threaded, asynchronous steps provide a much better user experience: the UI doesn't "hang" while other work is being done.
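
The effect of synchronous work on a single thread can be seen in a small sketch. Here a zero-delay timer stands in for a pending UI update (an assumption made for illustration; the demo itself does not use this code), and a plain loop stands in for an expensive step.

```javascript
const events = [];

// Stand-in for a UI update: a timer callback that can only run once the
// single JavaScript thread is idle.
setTimeout(function () { events.push("ui update"); }, 0);

// Synchronous work: nothing else can run until this loop finishes.
let total = 0;
for (let i = 0; i < 1e6; i++) { total += i; }
events.push("sync work finished");

// At this point events contains only "sync work finished". The "ui update"
// callback runs later, after the current synchronous code has returned.
```

The longer the loop runs, the longer the pending update waits, which is what a user perceives as a hang.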

Example #1: Benchmarking Backbone.Collection

The first example uses a basic Backbone.Collection. The Create Models, Filter Collection and Sort Collection steps are all synchronous. Some interesting things to note about this implementation:

The smallest size, 5K items, performs moderately well on fast hardware
Larger sizes start to hang the browser pretty significantly because of the synchronous steps
At 100K or larger, the "Create Models" step dominates the time taken

Example #2: Optimized Model Creation via Conduit.QuickCollection

As Example #1 shows, the Model Creation step is very expensive. Example #2 shows the performance difference when we treat this as a first-class problem. The Collection used here mixes in the behavior of Conduit.QuickCollection, which is optimized to create models ~45% faster than the benchmark. As a result:

Performance of the "Create Models" step is consistently better
The "Filter Collection" and "Sort Collection" stages are virtually identical to #1
The UI still hangs significantly with larger data sizes due to the synchronous nature of the final steps.

Example #3: Web Worker Data Management via Conduit.SparseCollection

As Example #2 shows, even optimizing the least-scalable step (Model Creation) has its limits. To really scale, we need to make a fundamental change. Example #3 demonstrates this change, adding the behavior available in a Conduit.SparseCollection. This shifts as much of the work as possible to another thread via a Web Worker, including the raw data storage (thus the "Sparse" part of the name). It also does not create any Backbone.Model instances unless they are explicitly needed.

This makes all of the steps asynchronous, so the UI thread works smoothly even at large data sizes. More specifically:

The raw data is fetched and stored on the web worker
Sorting and Filtering are much faster at large data sizes, because we deal with the raw data instead of Backbone.Models
The "Create Models" step is the very last one, and does vastly less work: it creates Backbone.Model instances only when completely necessary; in this case only three models are created, because that is all we need
The technique shown is only limited by how much memory you have available on your client, and how quickly you can get the raw data to the client
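
The idea of deferring model creation can be sketched in plain JavaScript. This is not Conduit.SparseCollection's API: `LazyStore` and `Model` are hypothetical names, the data is made up, and everything runs on one thread, whereas the real implementation keeps the raw data in a Web Worker and exposes its results via Promises.

```javascript
// Hypothetical stand-in for a model class; counts how many instances are built.
let modelsCreated = 0;
function Model(attributes) {
    this.attributes = attributes;
    modelsCreated += 1;
}

// Hypothetical store: sorts and filters raw objects, wraps them only on request.
function LazyStore(rawEntries) {
    this.raw = rawEntries.slice();
}
LazyStore.prototype.filter = function (predicate) {
    this.raw = this.raw.filter(predicate);
    return this;
};
LazyStore.prototype.sortBy = function (key) {
    this.raw.sort(function (a, b) {
        return a[key] < b[key] ? -1 : (a[key] > b[key] ? 1 : 0);
    });
    return this;
};
// Create model instances only for the requested slice of the data.
LazyStore.prototype.first = function (count) {
    return this.raw.slice(0, count).map(function (attrs) {
        return new Model(attrs);
    });
};

// Build 10,000 made-up raw entries, then ask for just three models.
const rawEntries = [];
for (let i = 0; i < 10000; i++) {
    rawEntries.push({
        name: "Restaurant " + i,
        zip: 10000 + (i % 100),
        grade: i % 2 ? "A" : "B"
    });
}
const topThree = new LazyStore(rawEntries)
    .filter(function (entry) { return entry.grade === "A"; })
    .sortBy("zip")
    .first(3);
```

Filtering and sorting touch all 10,000 raw objects, but `modelsCreated` ends up at 3, because only the entries that are actually requested get wrapped.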

Where Do We Go From Here

The Backbone.Conduit plugin is in active development. The Conduit.QuickCollection is a drop-in replacement for a Backbone.Collection for most use cases. It should be useful anywhere you believe you may have more than a few hundred models.

The Conduit.SparseCollection is still somewhat experimental, but is very promising. The data access pattern is fundamentally different, but all functionality is exposed via Promises using methods that are functional equivalents to a regular Backbone.Collection. It is not a drop-in replacement, but if you are considering data larger than several thousand items, it is worth a look.
