
Interactive Mapping

Tutorial 3. Data Visualization with D3

Ningchuan Xiao

D3 stands for Data-Driven Documents. It is a very popular JavaScript library for visualizing data on the web. There are many sources where we can learn about this technique and here are a few examples:

This workshop draws on materials from the above sources, especially Murray's tutorials.

Getting started with method chaining

The basic skeleton of an HTML file is like this:

<!DOCTYPE html>
<html lang="en">
<head>
    <meta charset="utf-8">
    <title>D3 Test</title>
    <script src="https://d3js.org/d3.v7.min.js"></script>
</head>
<body>
    <script type="text/javascript">
        // This is where we put the good stuff using D3
    </script>
</body>
</html>

Now let's add one line of code to the above structure:

d3.select("body")
    .append("p")
    .text("New paragraph!");

On Safari and Chrome, we can see the effect of this immediately.

In our example, we first get the reference to the body of the entire page using the select method of D3. This reference is then handed off to the next operation, which is to append a p element to the end of body. The append function then hands down the reference to the element to the next one, which in turn is to replace the text inside the element with the string specified.

The way methods are used in the above example is called method chaining. This is an extremely handy way to get things done in JavaScript. But we need to make sure that each chained method can be applied to whatever the previous method returns. The D3 reference is a good place to look up all the methods and elements.
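To see why chaining works, it can help to build a toy version in plain JavaScript. The sketch below is only an illustration and not how D3 is implemented: MiniSelection and its plain-object "elements" are hypothetical names invented here. The point is that append returns a wrapper around the new element and text returns the wrapper it was called on, so each call has something to hand to the next.

```javascript
// Toy illustration of method chaining; MiniSelection is a hypothetical
// stand-in for a D3 selection and works on plain objects, not the real DOM.
class MiniSelection {
  constructor(node) {
    this.node = node;
  }

  // Add a child element and return a wrapper around the NEW element,
  // so the next call in the chain operates on the child.
  append(tag) {
    const child = { tag: tag, text: "", children: [] };
    this.node.children.push(child);
    return new MiniSelection(child);
  }

  // Set the text and return the same wrapper so the chain can continue.
  text(value) {
    this.node.text = value;
    return this;
  }
}

const body = { tag: "body", text: "", children: [] };
const paragraph = new MiniSelection(body)
  .append("p")
  .text("New paragraph!");
```

If text did not return anything, a further call chained after it would fail, which is the situation the warning above is about.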

The key concept behind these chained methods is the DOM, or Document Object Model. The simplest way to understand the DOM is to examine the tree structure of the above HTML. Each part of the tree can be considered an object, and D3 (or JavaScript in general) can retrieve that part so that we can make changes to it. We can, for example, change the structure, as we did by adding a new element to the body object. We can also change the content, as we did when setting the text. Other things we can do include changing the style and adding other elements such as graphs.

Binding data

A big part of D3 is to allow us to incorporate data into our web document and show the data. To illustrate how this works in D3, let's remove all the code from our previous example and add the following in the script part:

var data = [];
for (var i = 0; i < 10; i++)
    data[i] = Math.floor((Math.random() * 10) + 1);

The code generates a random set of 10 integers between 1 and 10. Then we write the following JS code. The second-to-last line is added so we have a chance to see the numbers in the data array, and the last line shows how we can dynamically change the style of page elements.

d3.select("body").selectAll("p")
    .data(data)
    .enter()
    .append("p")
    .text("New paragraph!");
document.write(data);
d3.selectAll("p").style("color", "red");

Load the file into the browser and we should see 10 lines of the same text "New paragraph!" on the page. We need to understand how each line of code works to fully understand why the page looks this way.

We can first examine what is happening behind the scenes. This requires the developer tools of the browser. We can use the Develop menu in Safari or the Inspect option in Chrome. Firefox and Edge have similar built-in developer tools.

We can simply type the following line in the console to see the actions:

console.log(d3.selectAll("p"))

We can take a look at how this turns out in the Chrome console, where we should see an item called __data__ for each p element (click on the p tag in the Elements tab and then click on the Properties tab in the window below to see __data__). The value of this item is the same as the corresponding value in the array. This is what we call data binding.

So let's explain the code:
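A plain-JavaScript model of the chain may help here. It is a sketch under the assumption that no key function is used, so data and elements are matched by index; joinByIndex is a hypothetical helper written for this tutorial, not a D3 function. In the chain above, selectAll("p") finds no paragraphs yet, data(data) pairs the array with that empty selection, enter() keeps the data values that have no element, and append("p") creates one paragraph for each of them.

```javascript
// Plain-JavaScript model of an index-based data join.
// joinByIndex is a hypothetical helper, not part of D3.
function joinByIndex(elements, data) {
  const shared = Math.min(elements.length, data.length);
  return {
    update: elements.slice(0, shared), // elements that already have a datum
    enter: data.slice(shared),         // data values with no element yet
    exit: elements.slice(shared)       // elements with no datum left
  };
}

// Made-up sample values standing in for the random data array.
const sample = [3, 7, 1];

// The page has no <p> yet, so every datum lands in "enter".
const join = joinByIndex([], sample);

// append("p") then creates one element per entering datum and stores the
// datum on it, which is what shows up as __data__ in the console.
const paragraphs = join.enter.map(function (d) {
  return { tag: "p", __data__: d };
});
```

This also explains why the page ends up with exactly as many paragraphs as there are values in the array.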

Printing out the data

Now let's try to actually print out the data on the page. We can replace the previous code part with the following (keep the data initialization part though):

d3.select("body")
    .selectAll("p")
    .data(data)
    .enter()
    .append("p")
    .text(function(d) { return d; });

If we replace the last line in the above with the following, we get a more expressive page:

.text(function(d, i) { return 'data #' + i + ' is ' + d; });

Basically, each new p element gets a piece of the data, which is referred to as d in the anonymous function in the last line of code (note the alternative version allows us to get both the data and its index in the array).
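The (d, i) signature will look familiar from Array.prototype.map, which also passes each value and its index to the callback. The plain-JavaScript sketch below (the sample array and the name label are made up for illustration) produces the same strings that D3 would place in the paragraphs.

```javascript
// Made-up sample values standing in for the random data array.
const sample = [4, 9, 2];

// Same callback shape as in .text(function(d, i) { ... }).
function label(d, i) {
  return "data #" + i + " is " + d;
}

// map calls label once per value, passing the value and its index.
const labels = sample.map(label);
```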

Writing anonymous functions can save space. But when the function gets too long, it is better practice to move the function out of the call. Here is another way to do it:

d3.select("body")
    .selectAll("p")
    .data(data)
    .enter()
    .append("p")
    .text(text_func);

// D3 passes the datum d and its index i to the function
function text_func(d, i) {
    return 'Data #' + i + ': ' + d;
}

Now we want to do more with the same data. How about changing the way the data is presented? We can try to style the printout based on the values in the data.

d3.select("body")
    .selectAll("p")
    .data(data)
    .enter()
    .append("p")
    .text(function(d) { return d; })
    .style("color", function(d) {
        if (d <= 5) return "green";
        else return "red";
    });
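Following the earlier advice about moving longer functions out of the call, the color rule can also live in a named function, which makes it easy to check on its own. This is a sketch; the name colorFor is hypothetical.

```javascript
// Values of 5 or less are shown in green, larger values in red.
function colorFor(d) {
  return d <= 5 ? "green" : "red";
}

// With D3 loaded in a page, it would be passed as: .style("color", colorFor)
// Here the rule is simply applied to a few hand-picked values.
const colors = [1, 5, 6, 10].map(colorFor);
```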

Basic graphing

Before we try any graphing/plotting, let's take a look at the div element. We can do a lot of things with this kind of element (since it is simply a blank element to begin with). We first define a class (either internally in a style tag or externally in a CSS file):

div.bar {
    display: inline-block;
    width: 20px;
    height: 75px;
    background-color: steelblue;
}

and then add the following into a blank HTML.

<div class="bar"></div>

This should yield a static bar in steelblue. Now it is time to try our newly acquired coding power in D3. This time, instead of handling p elements, we will work with div elements. Let's change the code in our previous example so that the D3 part of the code looks like this:

d3.select("body").selectAll("div")
    .data(data) // <-- The answer is here!
    .enter()
    .append("div")
    .attr("class", "bar")
    .style("height", function(d) {
        return d * 15 + "px";
    });

Two new items here. The first is the attr() method. This is what we use to set the attribute of an element. We can call this method multiple times to set multiple attributes. Here we only set one attribute: the class of the div is set to bar. While the bar class gets a default style, we can certainly change any style we want to, and here comes the style() method. In our case, we set the style of height, which is returned from an anonymous function. We want the height of the bar to be proportional to the data value.

SVG: basic shapes

Before we move on to more sophisticated drawing, we need to know more fundamentals about drawing for the web. This relates to a particular standard called Scalable Vector Graphics, or SVG. To understand how SVG works, let's put the following into another HTML file (e.g., test-svg.html):

<svg width="550" height="100">
    <circle cx="250" cy="30" r="25" fill="yellow" stroke="red" stroke-width="5"/>
</svg>

We can use SVG to draw many things, including rect, circle, ellipse, line, text, and path. Each of these shapes can be controlled by various parameters and by different styles (such as stroke color, fill color, stroke-width, and opacity). Multiple shapes can be put together to form more complicated, layered drawings. Here are some examples of the simple shapes (with parameters):

<rect x="0" y="0" width="500" height="50"/>
<circle cx="250" cy="25" r="25"/>
<ellipse cx="250" cy="25" rx="100" ry="25"/>
<line x1="0" y1="0" x2="500" y2="50" stroke="black"/>
<text x="250" y="25">Easy-peasy</text>

When we think about the coordinates, it is important to remember that the origin is at the upper-left corner of the SVG drawing area, with x increasing to the right and y increasing downward.

Drawing SVG with D3

While we can definitely use div's to draw simple plots, SVG provides a more powerful option. Let's see how this can be done. Let's make sure the following is the only script in an HTML file:

var svg = d3.select("body").append("svg")
    .attr("width", 500)
    .attr("height", 100);

svg.append("circle")
    .attr("cx", 250)
    .attr("cy", 30)
    .attr("r", 25);

svg.selectAll("circle")
    .attr("fill", "yellow")
    .attr("stroke", "orange")
    .attr("stroke-width", 5);

Now, we bind the drawing with our random data:

var data = [];
for (var i = 0; i < 5; i++)
    data[i] = Math.floor((Math.random() * 10) + 1);

var svg = d3.select("body")
    .append("svg")
    .attr("width", 500)
    .attr("height", 200)
    .selectAll("circle")
    .data(data)
    .enter()
    .append("circle")
    .attr("cx", function(d, i) { return i * 30 + 50; })
    .attr("cy", function(d, i) { return 100 - d * 5; })
    .attr("r", function(d) { return d * 2; })
    .attr("fill", "yellow")
    .attr("stroke", "orange")
    .attr("stroke-width", 5);

The above code will work, but it looks awful! Let's clean it up a little so we can see more of its structure (just the D3 part). The code listed below may look clumsier, but it gives us a clear idea of the things created and how to control them.

var svg = d3.select("body")
    .append("svg")
    .attr("width", 500)
    .attr("height", 200);

// make new shapes
var circles = svg.selectAll("circle")
    .data(data)
    .enter()
    .append("circle");

// control geometry
circles.attr("cx", function(d, i) { return i * 30 + 50; })
    .attr("cy", function(d, i) { return 100 - d * 5; })
    .attr("r", function(d) { return d * 2; });

// control style
circles.attr("fill", "yellow")
    .attr("stroke", "orange")
    .attr("stroke-width", 5);

SVG bar charts

Now we look at how to make the same bar chart as we did with div's using the same data. But this time we use SVG.

var data = [];
for (var i = 0; i < 25; i++)
    data[i] = Math.floor((Math.random() * 50) + 1);

var w = 500;
var h = 100;

var svg = d3.select("body")
    .append("svg")
    .attr("width", w)
    .attr("height", h);

var rects = svg.selectAll("rect")
    .data(data)
    .enter()
    .append("rect");

// control geometry
rects.attr("x", function(d, i) { return i * (w / data.length); })
    .attr("y", 0)
    .attr("height", function(d) { return d * 2; })
    .attr("width", w / data.length - 1);

rects.attr("fill", "steelblue");

The above bar chart is "upside down" because the Y axis increases from top to bottom on the screen. We can easily flip it by changing the y attribute to the following:

.attr("y", function(d) {return h - d*2;} )
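The arithmetic behind the flip is easy to check without drawing anything. In this sketch, barRect is a hypothetical helper that returns the four attributes of one bar, using the same formulas as the code above (a scale factor of 2 and a 1-pixel gap). Because y marks the top edge of a rectangle, an upright bar starts at h minus its height.

```javascript
// Compute the SVG attributes of bar i out of n, for a chart w wide and h tall.
function barRect(d, i, n, w, h) {
  const height = d * 2;   // bar length in pixels
  return {
    x: i * (w / n),       // left edge of the bar's slot
    y: h - height,        // top edge, measured down from the top of the SVG
    width: w / n - 1,     // leave a 1px gap between bars
    height: height
  };
}

// Third bar (index 2) of 25, with a data value of 30, in a 500 x 100 chart.
const bar = barRect(30, 2, 25, 500, 100);
```

Whatever the data value, y plus height equals h, so every bar sits on the same baseline at the bottom of the chart.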

In the same fashion, we can draw a barchart with horizontal bars.

var data = [];
for (var i = 0; i < 25; i++)
    data[i] = Math.floor((Math.random() * 50) + 1);

var w = 100;
var h = 500;

var svg = d3.select("body")
    .append("svg")
    .attr("width", w)
    .attr("height", h);

var rects = svg.selectAll("rect")
    .data(data)
    .enter()
    .append("rect");

// control geometry
rects.attr("y", function(d, i) { return i * (h / data.length); })
    .attr("x", 0)
    .attr("width", function(d) { return d * 2; })
    .attr("height", h / data.length - 1);

rects.attr("fill", "steelblue");

Scales

Scales are a huge lifesaver in D3. Mike Bostock, the creator of D3, defined scales as "functions that map from an input domain to an output range". More information about D3 scales can be found in the D3 documentation.

In the above examples, we have seen how the bars in the barchart are mapped to the X axis in a hard-coded fashion. Scales in D3 can make such mapping process automatic. To make it work, we need a domain that is basically our data and a range that includes the target numbers. In our particular case, we have a domain of 25 (or 5 depending on which case we are looking at) integers, each representing a bar to be drawn, and we want to map it to the X axis that ranges from 0 to 500 as defined by the width of the SVG. Just remember this: from domain to range.

In our case, we are looking at a linear scale and we can create and test one like this:

var scale = d3.scaleLinear();
alert(scale(5));

The above scale simply returns whatever is in the input. This is because we haven't defined any domain (input) or range (output) yet, and both default to [0, 1]. Things get more interesting with the following code:

var scale = d3.scaleLinear();
scale.domain([0, 10]).range([0, 100]);
alert("scale(5) = " + scale(5));
alert("scale(200) = " + scale(200));

The last line shows that the scale is quite flexible in handling values outside the domain: it extrapolates, so scale(200) returns 2000.
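What the scale computes is ordinary linear interpolation. The sketch below re-implements it in plain JavaScript to make the arithmetic visible; linearScale is a hypothetical stand-in written for this tutorial, not D3's code, and its clamp argument only loosely mirrors the clamp(true) option that D3's continuous scales offer.

```javascript
// Build a function that maps a value from domain [d0, d1] to range [r0, r1].
function linearScale(domain, range, clamp) {
  return function (x) {
    // Position of x within the domain: 0 at d0, 1 at d1.
    let t = (x - domain[0]) / (domain[1] - domain[0]);
    // Optionally pin out-of-domain inputs to the ends of the range.
    if (clamp) t = Math.max(0, Math.min(1, t));
    return range[0] + t * (range[1] - range[0]);
  };
}

const scale = linearScale([0, 10], [0, 100], false);
const clamped = linearScale([0, 10], [0, 100], true);

const inside = scale(5);      // halfway through the domain
const outside = scale(200);   // extrapolated beyond the range
const pinned = clamped(200);  // held at the upper end of the range
```

Extrapolation is convenient for quick experiments, while clamping is the safer choice when out-of-range values must not be drawn outside the chart.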

Now we can use this new tool to make the barchart more flexible. We use the functions called d3.min and d3.max to get the minimal and maximal values in an array.

var data = [];
for (var i = 0; i < 25; i++)
    data[i] = Math.floor((Math.random() * 50) + 1);

var w = 500;
var h = 100;

var svg = d3.select("body")
    .append("svg")
    .attr("width", w)
    .attr("height", h);

var xScale = d3.scaleLinear()
    .domain([0, data.length])
    .range([0, w]);
var yScale = d3.scaleLinear()
    .domain([0, d3.max(data, function(d) { return d; })])
    .range([0, h]);

var rects = svg.selectAll("rect")
    .data(data)
    .enter()
    .append("rect")
    .attr("x", function(d, i) { return xScale(i); })
    .attr("y", function(d) { return h - yScale(d); })
    .attr("height", function(d) { return yScale(d); })
    .attr("width", w / data.length - 1);

rects.attr("fill", "steelblue");

for (var i = 0; i < data.length; i++)
    document.write("<br/>" + i + ": " + data[i]);

There are many different kinds of scales, and even for the linear scale there is still more to learn.

Drawing axes

We will have to put something more useful on the plot. This would include the axes (cf. Tufte). This requires the JS code below and some CSS after that.

var data = [];
for (var i = 0; i < 25; i++)
    data[i] = Math.floor((Math.random() * 50) + 1);

var total_w = 500;
var total_h = 100;
var margin = {top: 15, right: 20, bottom: 20, left: 20};
var w = total_w - margin.left - margin.right;
var h = total_h - margin.top - margin.bottom;

var svg = d3.select("body")
    .append("svg")
    .attr("width", w + margin.left + margin.right)
    .attr("height", h + margin.top + margin.bottom);

var xScale = d3.scaleLinear()
    .domain([0, data.length])
    .range([margin.left, w + margin.left]);
var yScale = d3.scaleLinear()
    .domain([0, d3.max(data, function(d) { return d; })])
    .range([h, margin.top]);

var rects = svg.selectAll("rect")
    .data(data)
    .enter()
    .append("rect")
    .attr("x", function(d, i) { return xScale(i); })
    .attr("y", function(d) { return yScale(d); })
    .attr("height", function(d) { return h - yScale(d); })
    .attr("width", w / data.length - 1);

rects.attr("fill", "steelblue");

var xAxis = d3.axisBottom()
    .scale(xScale)
    .ticks(5);

svg.append("g")
    .attr("class", "axis") // Assign "axis" class
    .attr("transform", "translate(0," + h + ")")
    .call(xAxis);

for (var i = 0; i < data.length; i++)
    document.write("<br/>" + i + ": " + data[i]);

Here is the style.

.axis path,
.axis line {
    fill: none;
    stroke: black;
    shape-rendering: crispEdges;
}

.axis text {
    font-family: sans-serif;
    font-size: 11px;
}
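The margin bookkeeping and the inverted Y range are the parts of the script above most likely to cause confusion, so here is the same arithmetic in plain JavaScript. It is a sketch: yPosition is a hypothetical stand-in for yScale, barHeight is a hypothetical helper, and the maximum data value of 50 is assumed for illustration.

```javascript
const total_w = 500;
const total_h = 100;
const margin = { top: 15, right: 20, bottom: 20, left: 20 };

// Inner drawing size once the margins are taken off.
const w = total_w - margin.left - margin.right;
const h = total_h - margin.top - margin.bottom;

// Assumed maximum of the data, standing in for d3.max(data).
const maxValue = 50;

// Inverted mapping: 0 goes to the baseline (y = h), the maximum to margin.top.
function yPosition(d) {
  return h + (d / maxValue) * (margin.top - h);
}

// A bar starts at yPosition(d) and extends down to the baseline at h.
function barHeight(d) {
  return h - yPosition(d);
}
```

Because the range runs from h down to margin.top, larger values get smaller y coordinates, which is why the rectangles no longer need the explicit flip used in the earlier bar chart.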

Adding the X axis will be relatively straightforward. Now we add the Y axis:

var yAxis = d3.axisLeft()
    .scale(yScale)
    .ticks(3);

svg.append("g")
    .attr("class", "axis")
    .attr("transform", "translate(" + margin.left + "," + 0 + ")")
    .call(yAxis);

More information about axes can be found in the D3 documentation.

More on data

We have been using a simple random data set to illustrate the use of D3. How about more complicated data? We are going to extend our data in different ways. We will explore a 2-D table that is organized as a 2D array. We have the second column in numerical values, but the first column will be categorical and then date/time.

I like random things so we will keep everything random, semantically nonsense but technically nontrivial. The following will be in a file called random_stuff.js that is in the same folder provided for this tutorial and we will load it into an HTML file to use it.

Here is a way to generate a random 2D array (assuming we have loaded random_stuff.js):

var data = [];
for (var i = 0; i < 15; i++) {
    data[i] = [xword(), randint(1, 50)];
}
// alert(data);

And we can use this data to draw another barchart with some simple changes (note the use of d[1]).

var total_w = 500;
var total_h = 100;
var margin = {top: 15, right: 20, bottom: 20, left: 20};
var w = total_w - margin.left - margin.right;
var h = total_h - margin.top - margin.bottom;

var svg = d3.select("body")
    .append("svg")
    .attr("width", w + margin.left + margin.right)
    .attr("height", h + margin.top + margin.bottom);

var xScale = d3.scaleLinear()
    .domain([0, data.length])
    .range([margin.left, w + margin.left]);
var yScale = d3.scaleLinear()
    .domain([0, d3.max(data, function(d) { return d[1]; })])
    .range([h, margin.top]);

var rects = svg.selectAll("rect")
    .data(data)
    .enter()
    .append("rect")
    .attr("x", function(d, i) { return xScale(i); })
    .attr("y", function(d) { return yScale(d[1]); })
    .attr("height", function(d) { return h - yScale(d[1]); })
    .attr("width", w / data.length - 1)
    .attr("fill", "steelblue");

var xAxis = d3.axisBottom()
    .scale(xScale)
    .ticks(5);

svg.append("g")
    .attr("class", "axis") // Assign "axis" class
    .attr("transform", "translate(0," + h + ")")
    .call(xAxis);

var yAxis = d3.axisLeft()
    .scale(yScale)
    .ticks(3);

svg.append("g")
    .attr("class", "axis")
    .attr("transform", "translate(" + margin.left + "," + 0 + ")")
    .call(yAxis);

for (var i = 0; i < data.length; i++)
    document.write("<br/>" + i + ": " + data[i]);

Categorical axis

Up to this point, everything seems to work fine, except the X axis doesn't make sense since our data on the X axis is categorical (fake names). Now we are going to use an ordinal scale for the X axis. Here, each name in the data is mapped to a band in a range (D3 term). This is called a band scale. To make such a scale, we will need to put all the individual values in the names into an array, and we do this using the following example:

alert(data.map(function(d) { return d[0]; }));

With that, we can write the following code to make sure the X axis is rendered as categories.

var data = [];
for (var i = 0; i < 15; i++) {
    data[i] = [xword(), randint(1, 50)];
}

var total_w = 500;
var total_h = 200;
var margin = {top: 15, right: 20, bottom: 50, left: 20};
var w = total_w - margin.left - margin.right;
var h = total_h - margin.top - margin.bottom;

var svg = d3.select("body")
    .append("svg")
    .attr("width", w + margin.left + margin.right)
    .attr("height", h + margin.top + margin.bottom);

var yScale = d3.scaleLinear()
    .domain([0, d3.max(data, function(d) { return d[1]; })])
    .range([h, margin.top]);

var yAxis = d3.axisLeft()
    .scale(yScale)
    .ticks(3);

// Band scale: one band per name, with inner padding of 0.1.
// rangeRound takes a single [start, stop] array.
var xScale = d3.scaleBand()
    .domain(data.map(function(d) { return d[0]; }))
    .paddingInner(0.1)
    .rangeRound([margin.left, w + margin.left]);

var xAxis = d3.axisBottom()
    .scale(xScale);

var rects = svg.selectAll("rect")
    .data(data)
    .enter()
    .append("rect")
    .attr("x", function(d, i) { return xScale(d[0]); })
    .attr("y", function(d) { return yScale(d[1]); })
    .attr("height", function(d) { return h - yScale(d[1]); })
    .attr("width", xScale.bandwidth()) // NEW way of getting width
    .attr("fill", "steelblue");

svg.append("g")
    .attr("class", "axis")
    .attr("transform", "translate(" + margin.left + "," + 0 + ")")
    .call(yAxis);

svg.append("g")
    .attr("class", "x axis")
    .attr("transform", "translate(0," + h + ")")
    .call(xAxis)
    .selectAll("text")
    .attr("y", 5)
    .attr("x", 5)
    .attr("transform", "rotate(45)")
    .style("text-anchor", "start");

for (var i = 0; i < data.length; i++)
    document.write("<br/>" + i + ": " + data[i]);

And we need a new css selector to specify that the X axis does not draw the horizontal line.

.x.axis path { display: none; }

More information about band scales can be found in the D3 documentation.
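To see where the band positions and bandwidth() come from, here is a simplified model in plain JavaScript. It is a sketch that assumes no rounding and no outer padding, so its numbers can differ slightly from what rangeRound produces; bandLayout is a hypothetical name, not a D3 function, and the three names are made up.

```javascript
// Lay out one band per name across [range[0], range[1]].
function bandLayout(names, range, paddingInner) {
  const n = names.length;
  // n bands plus (n - 1) gaps fill the range; each gap is paddingInner * step,
  // which works out to a total width of step * (n - paddingInner).
  const step = (range[1] - range[0]) / Math.max(1, n - paddingInner);
  const positions = {};
  names.forEach(function (name, i) {
    positions[name] = range[0] + i * step; // left edge of each band
  });
  return { positions: positions, bandwidth: step * (1 - paddingInner) };
}

const layout = bandLayout(["ann", "bob", "cy"], [0, 290], 0.1);
```

With three names over 290 pixels and a padding of 0.1, each step is about 100 pixels and each band about 90 pixels wide, leaving gaps of about 10 pixels between bars.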

TODO

Make a barchart where the bars are filled with two alternating colors. For example, the bars can be colored as blue, green, blue, green, etc. Hint: replace "steelblue" with a function in the fill style for the rectangles.


(Optional)

Time scales and temporal axis

time formatting

A time scale is a special kind of continuous scale in D3.
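Conceptually, a time scale behaves like a linear scale applied to timestamps. The plain-JavaScript sketch below illustrates that idea only; timeToPixel is a hypothetical helper, and unlike d3.scaleTime it knows nothing about calendar-aware ticks. The dates are made up and built with Date.UTC so the result does not depend on the local time zone.

```javascript
// Map dates in [domain[0], domain[1]] linearly onto [range[0], range[1]].
function timeToPixel(domain, range) {
  const t0 = domain[0].getTime(); // milliseconds since the epoch
  const t1 = domain[1].getTime();
  return function (date) {
    const t = (date.getTime() - t0) / (t1 - t0);
    return range[0] + t * (range[1] - range[0]);
  };
}

// Ten days mapped onto 100 pixels.
const start = new Date(Date.UTC(2020, 0, 1));
const end = new Date(Date.UTC(2020, 0, 11));
const x = timeToPixel([start, end], [0, 100]);

// Five days in lands halfway along the axis.
const mid = x(new Date(Date.UTC(2020, 0, 6)));
```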

var n = 15;
var dates = [];
var data = [];
for (var i = 0; i < n; i++)
    dates[i] = randomDate(new Date(2012, 0, 1), new Date());
dates.sort(function(a, b) { return a - b; });
for (var i = 0; i < n; i++) {
    data[i] = [String(dates[i]), xword(), randint(1, 50)];
}

// Sat Oct 27 2012 20:54:47 GMT-0400 (EDT)
var parseDateTime = d3.timeParse("%a %b %d %Y %H:%M:%S");
data.forEach(function(d) {
    // keep the first 24 characters, i.e. everything before " GMT..."
    d[0] = parseDateTime(d[0].slice(0, 24));
});

var total_w = 500;
var total_h = 200;
var margin = {top: 15, right: 20, bottom: 50, left: 20};
var w = total_w - margin.left - margin.right;
var h = total_h - margin.top - margin.bottom;

var svg = d3.select("body")
    .append("svg")
    .attr("width", w + margin.left + margin.right)
    .attr("height", h + margin.top + margin.bottom);

var yScale = d3.scaleLinear()
    .domain([0, d3.max(data, function(d) { return d[2]; })])
    .range([h + margin.top, margin.top]);

var yAxis = d3.axisLeft()
    .scale(yScale)
    .ticks(3);

var xScale = d3.scaleTime()
    .domain(d3.extent(data, function(d) { return d[0]; }))
    .range([margin.left, w + margin.left]);

var xAxis = d3.axisBottom()
    .scale(xScale)
    .ticks(5);

// d3.line() connects points with straight segments by default
var my_line = d3.line()
    .x(function(d) { return xScale(d[0]); })
    .y(function(d) { return yScale(d[2]); });

var path = svg
    .append("path")
    .attr("class", "line")
    .attr("d", my_line(data));

svg.append("g")
    .attr("class", "axis")
    .attr("transform", "translate(" + margin.left + "," + 0 + ")")
    .call(yAxis);

svg.append("g")
    .attr("class", "x axis")
    .attr("transform", "translate(0," + (h + margin.top) + ")")
    .call(xAxis);

console.log(data);

Here is the CSS part:

.axis path,
.axis line {
    fill: none;
    stroke: black;
    shape-rendering: crispEdges;
}

.axis text {
    font-family: sans-serif;
    font-size: 11px;
}

.x.axis path {
    display: none;
}

.line {
    fill: none;
    stroke: steelblue;
    stroke-width: 1.5px;
}

TODO

Add a horizontal axis to the above graph.

Loading data from files

Using d3.json, d3.csv, etc.
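In recent D3 versions (v5 and later), d3.csv and d3.json return promises, and d3.csv yields an array of objects keyed by the header row in which every value is a string. The sketch below imitates that result in plain JavaScript so the conversion step can be seen without a server; parseSimpleCsv is a hypothetical and deliberately naive helper (no quoted fields, no escaping), not a replacement for D3's parser, and the two-row table is made up.

```javascript
// Naive CSV reader for illustration only: no quoted fields, no escaping.
function parseSimpleCsv(text) {
  const lines = text.trim().split("\n");
  const header = lines[0].split(",");
  return lines.slice(1).map(function (line) {
    const cells = line.split(",");
    const row = {};
    header.forEach(function (name, i) {
      row[name] = cells[i]; // every value starts out as a string
    });
    return row;
  });
}

const rows = parseSimpleCsv("name,value\nalpha,12\nbeta,7\n");

// Convert the numeric column before handing it to a scale.
rows.forEach(function (row) {
  row.value = +row.value;
});
```

Skipping the conversion is a common source of odd charts, because strings compare alphabetically, so "7" would count as larger than "12".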