Creating an SVG Data Visualization

Learn how to create an SVG data visualization in this article by Rob Larsen, an experienced frontend engineer, team lead, and manager. He is an active writer and speaker on web technology, with a special focus on the continuing evolution of HTML, CSS, and JavaScript.

This article walks through a basic data visualization built with SVG and JavaScript: a chart of the positive/negative variance from an average. In this case, it illustrates the number of home runs hit per season by the baseball player David Ortiz during his career with the Boston Red Sox, compared with his average number of home runs per season over that career.

From 2003 to 2016, David Ortiz hit a minimum of 23 and a maximum of 54 home runs in a season while playing for the Red Sox, an average of 34.5 per season. This visualization shows the positive or negative variance of each year’s home run total against that 34.5 average. Years when he hit more than the average are shown in green. Years when he hit fewer are shown in red.

The steps you’ll need to go through are as follows:

  1. Take the data and get the total number of years, the total number of home runs, and then calculate the average.
  2. Loop through this data and calculate the positive/negative offset for each year.
  3. Calculate some metrics based on the available screen real estate.
  4. Draw a baseline, centered vertically on the screen.
  5. Draw a series of rectangles in the appropriate places, each with the height needed to indicate the positive/negative variance, along with simple labels indicating the year and number of home runs.
  6. Add a legend indicating the average number of home runs and the number of years.

The final visualization will look as follows:

Now that you have the basics planned out, here’s how this works in detail. 

Start with the markup, which is very simple. First, include Bootstrap and the Raleway font as part of your standard template. Following this, set the background of the SVG element and set the font family, size, and color for two different classes of text element. Then, include the target SVG element and the JavaScript file that runs the visualization:

The included JavaScript file is where the real work is done. It is written using several ES6 features. scripts.js itself is essentially one large function named viz.

At the top of viz, you have the data variable. This variable is an array of JavaScript objects. Each object has two properties, year and hrs, indicating the year in question and the number of home runs Ortiz hit that year:  
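The original listing is not reproduced here. As a stand-in, the array below uses Ortiz’s commonly published Red Sox season totals. They agree with the minimum (23), maximum (54), and average (34.5) quoted earlier, but treat the individual figures as an assumption and check them against a statistics source before relying on them.

```javascript
// Season home run totals for David Ortiz with the Red Sox, 2003-2016.
// Assumption: taken from commonly published records, not from the article's listing.
const data = [
  { year: 2003, hrs: 31 },
  { year: 2004, hrs: 41 },
  { year: 2005, hrs: 47 },
  { year: 2006, hrs: 54 },
  { year: 2007, hrs: 35 },
  { year: 2008, hrs: 23 },
  { year: 2009, hrs: 28 },
  { year: 2010, hrs: 32 },
  { year: 2011, hrs: 29 },
  { year: 2012, hrs: 23 },
  { year: 2013, hrs: 30 },
  { year: 2014, hrs: 35 },
  { year: 2015, hrs: 37 },
  { year: 2016, hrs: 38 },
];

// Sanity check against the numbers quoted in the article.
const hrs = data.map((item) => item.hrs);
const total = hrs.reduce((sum, n) => sum + n, 0);
console.log(Math.min(...hrs), Math.max(...hrs), total / data.length); // 23 54 34.5
```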

If you were running this visualization interactively, you would just need data with the right structure (an array of objects) and format (hrs and year), and everything else would work itself out. An interactive version might accept input from a user, or insert the result of a web service call to a statistical database into the visualization. Keep this in mind as you look at the variables and methods that populate the rest of the file.

There are several different variables that you’ll use, in addition to data, throughout the visualization:

  • doc: A reference to the document
  • canvas: A reference to the SVG element with an id of canvas
  • NS: A reference to the namespace derived from the SVG element
  • elem: A placeholder variable for the elements you’ll create

Next, there are several utility methods used to populate the visualization with values and elements. The first, addText, lets you add text labels to the visualization. It takes a coordinate object, coords, the text to display, and an optional CSS class, cssClass. The first two arguments are required and should be straightforward. You’ll see the use case for the cssClass argument in one of the later examples.
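A minimal sketch of how addText might look follows. The article’s version lives inside viz and uses the doc, canvas, NS, and elem variables directly. The makeAddText wrapper here is hypothetical, added only so the sketch can run outside a browser, and reading the namespace from canvas.namespaceURI is an assumption about how NS is derived.

```javascript
// Hypothetical factory: receives the document and the SVG element instead of
// looking them up, then returns an addText function that closes over them.
function makeAddText(doc, canvas) {
  // Assumption: the namespace is read from the SVG element itself.
  const NS = canvas.namespaceURI;
  let elem; // placeholder for the element currently being built

  return function addText(coords, text, cssClass) {
    elem = doc.createElementNS(NS, "text");
    elem.setAttribute("x", coords.x);
    elem.setAttribute("y", coords.y);
    elem.textContent = text;
    if (cssClass) {
      elem.classList.add(cssClass); // optional styling hook
    }
    canvas.appendChild(elem);
    return elem;
  };
}
```

In a browser, this could be wired up with makeAddText(document, document.getElementById("canvas")) and then called as addText({ x: 10, y: 20 }, "54 in 2006").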

After addText, there is an addLine function that allows you to draw lines on the screen. It takes a coordinate object, coords (which in this case contains four coordinates), and an optional stroke color. You’ll notice that the stroke is given a default value in the function signature. If no stroke color is provided, the stroke will be #ff8000.

Next, the addRect function allows you to add rectangles to the screen. It accepts a coordinate object, coords, which contains height and width properties, as well as optional stroke and fill colors. 
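The two shape helpers could be sketched as below. As before, the makeShapeHelpers wrapper and the shared addElement helper are hypothetical. The #ff8000 stroke default for addLine comes from the text, while the argument order and default colors for addRect are assumptions.

```javascript
// Hypothetical factory returning addLine and addRect bound to one SVG element.
function makeShapeHelpers(doc, canvas) {
  const NS = canvas.namespaceURI; // assumption: namespace read from the element
  let elem;

  // Creates an SVG element, copies every key of attrs onto it, and appends it.
  function addElement(tag, attrs) {
    elem = doc.createElementNS(NS, tag);
    for (const [name, value] of Object.entries(attrs)) {
      elem.setAttribute(name, value);
    }
    canvas.appendChild(elem);
    return elem;
  }

  // coords: { x1, y1, x2, y2 }. The stroke falls back to #ff8000 when omitted.
  function addLine(coords, stroke = "#ff8000") {
    return addElement("line", { ...coords, stroke });
  }

  // coords: { x, y, width, height }. Both default colors are assumptions.
  function addRect(coords, fill = "#ff8000", stroke = "#ffffff") {
    return addElement("rect", { ...coords, fill, stroke });
  }

  return { addLine, addRect };
}
```

Default parameter values keep the call sites short: most calls pass only coordinates, and a color is supplied only where it carries meaning.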

Finally, there’s a function, maxDiffer, which finds the maximum difference within a set of positive and negative numbers. Scaling by this range ensures that, no matter how the numbers are spread, the maximum height needed above or below the baseline will fit on the screen.
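The original implementation of maxDiffer is not shown here. One plausible reading of “maximum difference” is the spread between the largest and the smallest value, and the sketch below assumes that reading.

```javascript
// Returns the range (largest value minus smallest value) of an array of numbers.
// Assumption: this is what "maximum difference" means in the text.
function maxDiffer(arr) {
  if (arr.length === 0) {
    return 0; // nothing to compare
  }
  return Math.max(...arr) - Math.min(...arr);
}

// Offsets of 54, 23, and 35 home runs from the 34.5 average.
console.log(maxDiffer([19.5, -11.5, 0.5])); // 31
```

Because the offsets are measured from their own average, they straddle zero, so this range is at least as large as any single offset. Dividing half the available height by it therefore keeps even the tallest bar inside the drawing area.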

Next, you have the code that defines the heart of the visualization. It happens in a function that runs on the DOMContentLoaded event. As the function runs, it creates several variables holding the properties needed to generate the visualization. Here’s what they do:

  • viewBox is a local reference to the SVG element’s viewBox. Storing this and the following values locally reduces the number of DOM lookups.
  • width is a local reference to the width from the SVG element’s viewBox.
  • height is a local reference to the height from the viewBox.
  • x is a local reference to the x point from the viewBox.  
  • y is a local reference to the y point from the viewBox.  
  • padding is an arbitrary constant used in several padding calculations.
  • vizWidth defines the visible width of the SVG canvas. This defines the area in which you can safely draw elements into the SVG element.
  • years is a reference to the number of years in the data set.
  • total is a calculated value that represents the total number of home runs hit over the full data set.
  • avg is the average number of home runs hit per year, calculated by dividing the total by the number of years.
  • verticalMidPoint represents the vertical mid-point of the SVG element. This is the line on which positive or negative variances are drawn.  
  • diffs is an array holding the positive and negative difference between the average number of home runs and the number of home runs hit in every year.
  • maxDiff is the maximum difference between the average number of home runs and the number of home runs hit in a given year.
  • yInterval is the number of pixels per home run. This ensures that the boxes scale properly, vertically, based on the number of home runs hit in any given year.
  • xInterval is the number of pixels per year. This value allows you to evenly space boxes no matter how many years are in the data set.
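The variables in this list can be sketched as a single function. The computeMetrics name is hypothetical, and the padding and vizWidth formulas and the exact yInterval calculation are assumptions, since the text describes what these values are for but not how they are computed.

```javascript
// viewBox is any object with x, y, width, and height, such as
// canvas.viewBox.baseVal inside a DOMContentLoaded listener in a browser.
function computeMetrics(data, viewBox) {
  const { x, y, width, height } = viewBox;
  const padding = width / 200;      // assumption: the text only says "arbitrary"
  const vizWidth = width - padding; // assumption: keep one padding unit free
  const years = data.length;
  const total = data.reduce((sum, item) => sum + item.hrs, 0);
  const avg = total / years;
  const verticalMidPoint = y + height / 2;
  const diffs = data.map((item) => item.hrs - avg);
  // Same range idea as maxDiffer, inlined to keep this sketch self-contained.
  const maxDiff = Math.max(...diffs) - Math.min(...diffs);
  const yInterval = height / 2 / maxDiff; // pixels per home run (assumption)
  const xInterval = vizWidth / years;     // pixels per year
  return {
    x, y, width, height, padding, vizWidth, years, total, avg,
    verticalMidPoint, diffs, maxDiff, yInterval, xInterval,
  };
}

const sample = [
  { year: 1, hrs: 10 },
  { year: 2, hrs: 20 },
  { year: 3, hrs: 30 },
];
const metrics = computeMetrics(sample, { x: 0, y: 0, width: 1000, height: 450 });
console.log(metrics.avg, metrics.verticalMidPoint, metrics.yInterval); // 20 225 11.25
```

Note that yInterval divides by maxDiff, so a data set in which every year has the same total would need a guard against division by zero.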

Now, draw the boxes and add the labels. Use a for…in loop to iterate over the diffs array, computing two new variables on each pass: newX and newY. The newX value is a regular interval: the value of i multiplied by the xInterval variable created previously. The newY value is the current diff, diffs[i], multiplied by yInterval. This gives the distance used for the height of the rectangle that represents the number of home runs in each year.
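The two calculations can be sketched in isolation. The scaleDiffs name is hypothetical, and the function collects the values into an array instead of drawing anything.

```javascript
// Returns the horizontal offset and the scaled (signed) height for every diff.
function scaleDiffs(diffs, xInterval, yInterval) {
  const points = [];
  for (const i in diffs) {
    // for...in yields string keys ("0", "1", ...); multiplying coerces them to numbers.
    const newX = xInterval * i;
    const newY = diffs[i] * yInterval;
    points.push({ newX, newY });
  }
  return points;
}

console.log(scaleDiffs([-10, 0, 10], 50, 2));
// [ { newX: 0, newY: -20 }, { newX: 50, newY: 0 }, { newX: 100, newY: 20 } ]
```

A for…in loop over an array also visits any other enumerable properties added to it, so diffs.forEach((diff, i) => …) is a common alternative when the index is needed as a number.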

Next, test whether or not the current diff is greater or less than zero. If it’s greater than zero, you can draw a box that goes up from the verticalMidPoint. If the current diff is less than zero, then draw a box that goes down from the verticalMidPoint. Since the direction of the rectangle and the associated anchor points for the box are different in each case, you need to handle them differently. You can also use different colors for the two variations in order to highlight the differences with a secondary indication. 

Both branches of the if statement call addRect and addText, but with different arguments. Here are the similarities and the differences between the two branches.

For starters, each call to addRect follows the same pattern for the x and width properties. x is always the newX value added to the padding and the width is the xInterval value plus the padding.

The y and height values are handled differently by the two branches.

If the current difference is less than zero, then the new y coordinate is verticalMidPoint. This anchors the top of the box to the line that represents zero on the visualization and indicates that the box will hang below that line. If the current difference is greater than zero, then the y coordinate is set to verticalMidPoint minus newY. This places the top of the new rectangle newY above the line that indicates zero.

If the current difference is less than zero, the height is the newY value passed through Math.abs(). A rect element cannot take a negative height, so the negative value needs to be converted to a positive one. If the current difference is greater than zero, the height is just the newY value, since it’s already a positive number.

The calls to addText in each branch of the if diverge on the placement of the y point. If the newY value is negative, then, once again, Math.abs has to convert the newY value to a positive number. Otherwise, it’s passed through unchanged. 
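The vertical placement described in the last few paragraphs can be sketched as a pure function. The verticalGeometry name and the labelGap parameter (the space between a box and its label) are hypothetical, and the label offsets are assumptions.

```javascript
// Computes the rect's y and height, plus the label's y, for one scaled diff.
function verticalGeometry(newY, verticalMidPoint, labelGap) {
  if (newY < 0) {
    // Below average: the box hangs from the zero line, with the label under it.
    const height = Math.abs(newY);
    return {
      rectY: verticalMidPoint,
      height,
      labelY: verticalMidPoint + height + labelGap,
    };
  }
  // Above average: the top edge sits newY above the zero line, label above it.
  // A diff of exactly zero lands here and yields a box of height 0.
  return {
    rectY: verticalMidPoint - newY,
    height: newY,
    labelY: verticalMidPoint - newY - labelGap,
  };
}

console.log(verticalGeometry(-20, 225, 10)); // { rectY: 225, height: 20, labelY: 255 }
console.log(verticalGeometry(30, 225, 10));  // { rectY: 195, height: 30, labelY: 185 }
```

The asymmetry comes from SVG’s coordinate system: y grows downward and a rect is anchored at its top edge, so boxes above the line must move their anchor up while boxes below keep it on the line.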

Following this, add the zero line at the vertical mid-point with a call to addLine. The arguments passed are the unchanged x and width from the viewBox for the leftmost and rightmost points, and verticalMidPoint for the y value of both points.

Finally, add a little bit of text that explains the basics of the visualization. Here is where you use the optional cssClass argument to addText, passing in large so that the text is rendered slightly larger. The x and y arguments use the x and height variables along with the padding variable to place the text slightly in from the bottom-left corner of the SVG element.
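As a rough sketch, the arguments for these last two calls could be assembled as follows. The baselineAndLegend name, the legend wording, and the exact padding offsets are all hypothetical.

```javascript
// Builds the arguments for the final addLine and addText calls.
function baselineAndLegend(viewBox, verticalMidPoint, padding, avg, years) {
  const { x, width, height } = viewBox;
  return {
    // Horizontal zero line. Using width for x2 follows the text and reaches
    // the right edge only when the viewBox starts at x = 0.
    line: { x1: x, y1: verticalMidPoint, x2: width, y2: verticalMidPoint },
    // Legend near the bottom-left corner, styled with the "large" class.
    legend: {
      coords: { x: x + padding, y: height - padding },
      text: `Average: ${avg} home runs over ${years} years`, // illustrative wording
      cssClass: "large",
    },
  };
}

const parts = baselineAndLegend({ x: 0, y: 0, width: 1000, height: 450 }, 225, 5, 34.5, 14);
console.log(parts.legend.text); // Average: 34.5 home runs over 14 years
```

With the helpers described earlier, the results would be passed along as addLine(parts.line) and addText(parts.legend.coords, parts.legend.text, parts.legend.cssClass).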

The final line of code simply calls the viz() function to kick off the visualization:

If you found this article interesting, you can explore Mastering SVG to take the plunge and develop cross-browser-compatible and responsive web designs with SVG. Mastering SVG will help you master creating animations with SVG.
