How to Make a Swarm Chart

JavaScript and D3.js Tutorial

This tutorial introduces the concept of a D3 force simulation to lay out elements on a chart. A video lesson walking through a more advanced project, including fluid animation of time series data, is now available on Gumroad: https://chartfleau.gumroad.com/l/masterclass. Use code LAUNCH20 for a $20 discount.

Why a Swarm?

A swarm, or "beeswarm", chart visualizes data at a very granular level. Every element in your dataset is represented by an element in the chart. In addition to conventional x and y dimensions, it also enables us to visualize extra dimensions through size and color. Optionally, animation can add information from yet another dimension (time), making swarm plots a very informationally dense visualization option. Here's an animated example done entirely in D3:

Getting Started

If you're not familiar with D3, it helps to get a basic understanding of how it works first. D3 is widely used on the web and well documented. Good starting points are the D3 homepage and the introductory guide on Observable.

Let's begin with an empty web page and append an <svg> element that will contain our chart. Note that you will need a local server to host this page in order to see it in your browser, because browsers generally block a page opened directly from the file system from loading a data file like our CSV. I use Visual Studio Code with a plugin called Live Server for my development environment, but you can use whatever dev environment you prefer. In this example we'll bring in the D3 library via a <script> tag referencing the d3js.org link, as in the code below.

The code and data for this project are available on GitHub.

<!DOCTYPE html>
<html lang="en">
  <head>
    <meta charset="UTF-8" />
    <meta name="viewport" content="width=device-width, initial-scale=1.0" />
    <script src="https://d3js.org/d3.v5.min.js"></script>
    <title>Swarm Plot</title>
  </head>
  <body>
    <script>
      const width = 1920;
      const height = 1080;
      // margins in the order [top, right, bottom, left]
      const margin = [50, 60, 50, 100];
      let svg = d3
        .select("body")
        .append("svg")
        .attr("height", height)
        .attr("width", width);

      d3.csv("data.csv").then((data) => {
        // the rest of our d3 code will go here
      });
    </script>
  </body>
</html>

Our Data Structure

In this example, I'm using a list of about 500 companies, including their stock tickers, market capitalizations (i.e. size), sector assignments, and their year-to-date (YTD) returns as of May 2020. In this tutorial we'll build a static swarm plot, but if you have time series data you could use JavaScript's setInterval() function to animate your chart.

Ticker   Sector       Market Cap   Return
MSFT     Technology   1381.23      0.1637
AMZN     Technology   998.09       0.3188
AAPL     Technology   1326.70      0.086
etc.     etc.         etc.         etc.

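Note that we're pulling in this data from a CSV file using the asynchronous d3.csv() function, which returns a promise, so the rest of our code will live in the .then() callback that runs after the data is loaded. d3.csv() reads every field as a string, which is why the numeric columns are coerced with a unary plus (+) when we build our scales.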
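
As a minimal sketch of that coercion, d3.csv() also accepts a row conversion function as its final argument, which lets you convert the numeric columns once, up front. The parseRow helper below is a hypothetical name, not part of the original project, and it assumes the column headers shown in the sample rows above.

```javascript
// Hypothetical row converter: turns the string fields that d3.csv()
// produces into numbers for the numeric columns.
function parseRow(row) {
  return {
    Ticker: row.Ticker,
    Sector: row.Sector,
    "Market Cap": +row["Market Cap"], // unary plus converts "1381.23" to 1381.23
    Return: +row.Return,
  };
}

// One raw row, as d3.csv() would deliver it (every value is a string).
const rawRow = {
  Ticker: "MSFT",
  Sector: "Technology",
  "Market Cap": "1381.23",
  Return: "0.1637",
};
const parsed = parseRow(rawRow);
console.log(typeof parsed["Market Cap"], typeof parsed.Return); // number number
```

In the page itself, this could be wired in as d3.csv("data.csv", parseRow).then(...). The rest of the tutorial keeps the explicit + coercion, which is harmless on values that are already numbers.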

Creating X, Y, Color, and Size Scales

First, we'll use d3.scaleBand() to map the 11 sectors in our dataset to 11 x-coordinates on the screen as follows:

// a Set is a convenient way to remove duplicates
let sectors = Array.from(new Set(data.map((d) => d.Sector)));
let xScale = d3
      .scaleBand()
      .domain(sectors)
      .range([margin[3], width - margin[1]]);
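
One detail worth knowing: a band scale returns the left edge of each band, not its center. The sketch below reimplements the arithmetic of an unpadded band scale in plain JavaScript to make that visible. bandPositions is a hypothetical helper for illustration only, not a D3 function, and the sector names are placeholders.

```javascript
// Hand-rolled illustration of an unpadded band scale: the range is split
// into equal-width bands, one per category.
function bandPositions(categories, [rangeStart, rangeEnd]) {
  const bandwidth = (rangeEnd - rangeStart) / categories.length;
  return categories.map((name, i) => {
    const start = rangeStart + i * bandwidth; // the value a band scale returns
    return { name, start, center: start + bandwidth / 2 };
  });
}

// Eleven placeholder sectors over the same range as xScale:
// [margin[3], width - margin[1]] = [100, 1860].
const placeholderSectors = Array.from({ length: 11 }, (_, i) => `Sector ${i + 1}`);
const bands = bandPositions(placeholderSectors, [100, 1860]);
console.log(bands[0]); // { name: 'Sector 1', start: 100, center: 180 }
```

With D3 itself, adding xScale.bandwidth() / 2 to xScale(d.Sector) would center each swarm within its band. d3.scalePoint() is another option when you only need a single position per category.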

For our y-scale we'll map each company's return to a vertical position on the chart. Because screen coordinates begin with (0,0) at the top left corner, pixel y-values increase downward. To put the highest returns at the top of the chart, we reverse the range: the minimum return maps to the bottom (height - margin[2]) and the maximum return maps to the top (margin[0]).

let yScale = d3
      .scaleLinear()
      .domain(d3.extent(data.map((d) => +d["Return"])))
      .range([height - margin[2], margin[0]]);
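
To see the flipped range in action, here is a small sketch of the same linear mapping written by hand. linearScale is a hypothetical stand-in for d3.scaleLinear(), and the return extent of -0.5 to 0.5 is made up for illustration; the real extent comes from the data.

```javascript
// Hand-rolled linear mapping from a data domain to a pixel range.
function linearScale([d0, d1], [r0, r1]) {
  return (value) => r0 + ((value - d0) / (d1 - d0)) * (r1 - r0);
}

// Hypothetical return extent; the range matches yScale:
// [height - margin[2], margin[0]] = [1030, 50].
const toPixelY = linearScale([-0.5, 0.5], [1030, 50]);
console.log(toPixelY(-0.5)); // 1030 (lowest return sits at the bottom)
console.log(toPixelY(0.5));  // 50   (highest return sits at the top)
console.log(toPixelY(0));    // 540  (halfway between the two)
```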

Next we'll make a color scale so each of our 11 sectors gets its own color. This is another ordinal scale, similar to our x-axis, but instead of mapping to a range of 11 x-coordinates we'll map to a range of colors. D3 has some color schemes built in, but you could specify your own array of colors here. For example, instead of d3.schemePaired you could use your own array such as ["red", "yellow", "orange", ...]. Specifying colors as hex codes or rgba values is also fine.

let color = d3.scaleOrdinal().domain(sectors).range(d3.schemePaired);
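
If you supply your own colors, make sure you have at least as many colors as categories. The sketch below is a simplified, hand-written ordinal mapping (ordinalScale is a hypothetical name, and unlike d3.scaleOrdinal() it does not add unknown keys to its domain) that shows what happens when the palette is too short: colors start repeating.

```javascript
// Hand-rolled ordinal mapping: the i-th category gets the i-th range value,
// wrapping around when there are more categories than range values.
function ordinalScale(domain, range) {
  return (key) => {
    const i = domain.indexOf(key);
    return i === -1 ? undefined : range[i % range.length];
  };
}

const palette = ["red", "yellow", "orange"];
const colorFor = ordinalScale(["A", "B", "C", "D"], palette);
console.log(colorFor("A"), colorFor("D")); // red red (the fourth category reuses the first color)
```

d3.schemePaired contains 12 colors, so our 11 sectors each get a distinct one.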

Finally we'll map market capitalization to a radius for our circles. It's up to you how to size your elements, but you should consider your data's domain and the range of circle sizes you want to see on the chart. It's usually best to use a square root scale (d3.scaleSqrt) when sizing circles because a circle's area is a function of the radius squared.

let marketcapDomain = d3.extent(data.map((d) => +d["Market Cap"]));
let size = d3.scaleSqrt().domain(marketcapDomain).range([3, 40]);
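
To make the square-root reasoning concrete, the sketch below computes radii by hand and compares the resulting areas. sqrtScale is a hypothetical stand-in for d3.scaleSqrt(), and the domain and range are round numbers chosen for illustration rather than the values used above.

```javascript
// Hand-rolled square-root mapping: interpolate linearly between the
// square roots of the domain endpoints.
function sqrtScale([d0, d1], [r0, r1]) {
  const s0 = Math.sqrt(d0);
  const s1 = Math.sqrt(d1);
  return (value) => r0 + ((Math.sqrt(value) - s0) / (s1 - s0)) * (r1 - r0);
}

const radius = sqrtScale([0, 100], [0, 10]);
const small = radius(25);  // a value one quarter of the maximum
const large = radius(100);
const areaRatio = (Math.PI * small ** 2) / (Math.PI * large ** 2);
console.log(small, large, areaRatio); // 5 10 0.25
```

A value that is a quarter of the maximum gets half the radius and therefore a quarter of the area, which is what the eye expects. Our range starts at 3 rather than 0, so in the tutorial's chart area is only approximately proportional to market cap; the trade-off is that the smallest companies stay visible.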

Joining Data With Circle Elements

Now let's join our dataset to SVG circle elements. This part is essentially no different from any other D3 visualization. The selectAll().data().enter() pattern creates an element in the DOM (in this case a <circle>) for each item in our dataset. If you're unfamiliar with D3, this is explained in more detail on the D3 homepage.

svg.selectAll(".circ")
    .data(data)
    .enter()
    .append("circle")
    .attr("class", "circ")
    .attr("stroke", "black")
    .attr("fill", (d) => color(d.Sector))
    .attr("r", (d) => size(+d["Market Cap"]))
    .attr("cx", (d) => xScale(d.Sector))
    .attr("cy", (d) => yScale(+d.Return));

At this point we already have something nice to look at:

The Interesting Part - Force Simulation

The key to building a swarm plot is to understand the three forces acting upon each element. Controlling these forces opens up all sorts of creative possibilities.

D3 will do most of the work here; we just need to tell it what to do.

  1. In this example we're orienting our clusters vertically, so every element within a sector should be attracted to the same x-coordinate (the same coordinates defined by our xScale above). Think of the attractive force as a vertical line with the same x-coordinate for all points. Any elements to the left of the line will be pulled towards it to the right, and vice versa from the opposite side.
  2. The y-force is specific to each element within a category (unlike the x-force, which applies the same force to every element in its sector). Each element's y-force pulls its associated circle to the vertical location (y-coordinate) that's mapped from that element's value (in this example the company's YTD return) via the scaling function defined previously.
  3. Finally, to prevent the circles from completely overlapping each other, D3 provides a helpful collision force that makes our elements push against each other when they touch. D3 does all the hard work here. We just specify the radius within which the force is exerted. Since our circles have unique radii (representing each company's size), we set the radius of the collision force equal to the radius of the circle it's linked to, in the same way that we defined the radius above.

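let simulation = d3.forceSimulation(data)
    // 1. pull each circle toward its sector's x-coordinate
    .force("x", d3.forceX((d) => xScale(d.Sector)).strength(0.2))
    // 2. pull each circle toward the y-coordinate mapped from its return
    .force("y", d3.forceY((d) => yScale(d.Return)).strength(1))
    // 3. push touching circles apart, using each circle's own radius
    .force("collide", d3.forceCollide((d) => size(d["Market Cap"])))
    .alphaDecay(0) // keep the forces running for now
    .alpha(0.3)
    .on("tick", tick);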

There are a few parameters here that are best understood by tinkering with them to get a feel for their effects. Strength is typically a value between 0 and 1 that defines how forcefully our elements are pulled toward their target positions. Alpha acts like the simulation's "temperature": it scales how far the positioning forces move the elements on each tick, and the simulation stops once alpha falls below a minimum threshold. With alphaDecay set to 0, alpha never decreases, so the forces will continue exerting themselves indefinitely (we'll address that below).
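
If tinkering in the browser feels opaque, a stripped-down model can help. The sketch below simulates a single element in one dimension under a positioning force. It is a simplified illustration, not D3's source code: simulateX is a hypothetical function, and the friction factor of 0.6 is an assumption chosen to mirror D3's default velocity decay.

```javascript
// Simplified, one-dimensional illustration of a positioning force.
// Each tick nudges the velocity toward the target, damps it, then moves the node.
function simulateX({ start, target, strength, alpha, ticks, friction = 0.6 }) {
  let x = start;
  let vx = 0;
  for (let i = 0; i < ticks; i++) {
    vx += (target - x) * strength * alpha; // the pull grows with distance
    vx *= friction;                        // damp the velocity
    x += vx;                               // move the node
  }
  return x;
}

const settings = { start: 0, target: 500, alpha: 0.3, ticks: 10 };
const weak = simulateX({ ...settings, strength: 0.2 });
const strong = simulateX({ ...settings, strength: 1 });

// After the same number of ticks, the stronger force has brought
// the element closer to its target.
console.log(Math.abs(500 - weak) > Math.abs(500 - strong)); // true
```

In this model the stronger force also overshoots the target slightly before settling, which is the springy motion you may notice on screen when a strength is set close to 1.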

Running the Simulation

In the code above, we referenced a function called "tick". A force simulation proceeds iteratively in ticks, which you can think of as frames of animation. There are functions for starting, stopping, and controlling how many ticks are processed, but by default the simulation will begin rapidly processing ticks to smoothly adjust the coordinates of our dataset frame-by-frame in accordance with the forces we've specified. The force simulation assigns additional properties to our data, including positions and velocities for all elements on each tick. All we need to do is update the location of our circle elements on each tick by referencing the simulated x and y positions as follows:

function tick() {
    d3.selectAll(".circ")
        .attr("cx", (d) => d.x)
        .attr("cy", (d) => d.y);
}

Applying Force Decay

Our chart is basically working now, but as a final step you may want to apply an alpha decay that kicks in a few seconds after your elements are positioned. Depending on the force parameters you're using, there may be some tension on your circles as the forces continue to interact, which may result in some overlapping elements. Applying a decay smoothly switches off the forces once our elements have settled.

let init_decay = setTimeout(function () {
    console.log("start alpha decay");
    simulation.alphaDecay(0.1);
}, 3000); // start decay after 3 seconds

Now the simulation will converge on a nice stable layout.
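
To get a feel for how quickly a given decay rate winds the simulation down, you can count the ticks by hand. The sketch below assumes the update rule alpha += (alphaTarget - alpha) * alphaDecay and a stopping threshold (alphaMin) of 0.001, which match D3's documented defaults as far as I know. ticksUntilStop is a hypothetical helper, not a D3 function.

```javascript
// Counts the ticks needed for alpha to drop below alphaMin.
function ticksUntilStop(alpha, alphaDecay, alphaMin = 0.001, alphaTarget = 0, maxTicks = 100000) {
  for (let ticks = 1; ticks <= maxTicks; ticks++) {
    alpha += (alphaTarget - alpha) * alphaDecay;
    if (alpha < alphaMin) return ticks;
  }
  return Infinity; // alpha never fell below alphaMin within maxTicks
}

console.log(ticksUntilStop(0.3, 0.1)); // 55
console.log(ticksUntilStop(0.3, 0));   // Infinity
```

With the values used in this tutorial (alpha of 0.3, alphaDecay of 0.1), the forces fade out over about 55 ticks, which is typically around a second if one tick runs per animation frame. A smaller decay rate gives a longer, gentler wind-down.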

What's Next?

Building charts with force simulations has a steep learning curve, but the results will really make your data visualizations stand out. I've published a video lesson on Gumroad (https://chartfleau.gumroad.com/l/masterclass) that will take you through a complete project where we animate time series data. Use code LAUNCH20 for a $20 discount.