To prompt behavior change, we must be able to effectively communicate data. Not convinced? Read this post on why data visualization matters. The goal of this article is to dig in deeper and present some foundational concepts for creating good data visuals.
A recent eXtension webinar[i] described numerous tools and programs at our disposal for creating more engaging data visualizations, so I won’t address those sorts of resources in this post. What I hope to impart are fundamental concepts of data visualizations that are cross-cutting and applicable, regardless of which tool you choose to use to create them. I have distilled these into my top seven characteristics of good data visualization.
1. What’s your point?
Our goal is to present scientific data in a clear and simple way. But do not misunderstand me; I am not advocating for oversimplified, watered-down presentations of science. For example, Nate Silver’s FiveThirtyEight[ii] website, featuring data visualizations and journalism on topics of politics, economics, science, and sports, presents lots of complicated data, often using chart types that are unfamiliar and atypical. And even though we have probably all been warned against using visuals that depart from the norm, the site is ranked 618 in the U.S. according to Alexa[iii], and, according to Quantcast[iv], over 371,000 people visit the site each month in the U.S. Despite my lack of interest in sports in general, I have found myself browsing through numerous sports-related stories on Silver’s site. What makes these fairly complex and unfamiliar graphs engaging and worth spending time looking at? I propose that one key factor is that the authors know their point.
When you are presenting a graph or chart, do you think through what you want people to understand and walk away with? I know that I am guilty of approaching data visualization with the goal of displaying all of the data as neatly and completely as possible. While not a bad idea, more must be considered than whether I was able to fit all the information into the display. Before finalizing a graph to share with others, take a step back and ask yourself, “What is my point?” Then, determine if the graph actually conveys that, or if there is a way to make your point clearer. It could be that a different chart type or color scheme would help elucidate your point.
2. Choose the right chart
I suspect that, by and large, bar charts are the most used chart type, and possibly for good reason: they are simple to read and people are familiar with them. However, they are not a one-size-fits-all solution, and numerous other options should be considered. The following charts are ones I have experimented with in the last year.
Consider trying one of these chart or graphic types out this year. All of these examples were made in Excel – not with any special software – using some creative “tricks” to make them possible. At the end of this post I provide some resources to help you learn these tricks.
When you have one or two numbers that tell the story, highlighting a single number is a great option. A caution, however: do not overuse this simple tool or it will lose its impact.
Tables are rarely a good tool for showing data in a live presentation, but they do give you the ability to present a lot of detail and can be useful in printed materials. Combining a table with the technique used in a heatmap – that is, adding colors that vary in intensity to show relative performance – can help readers more quickly process and see patterns. In the example below, the darker colors represent higher yields, allowing the reader to see at a glance which combination of nitrogen application rate and seeding rate results in the best yield.
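For readers who script their tables rather than building them in Excel, here is a minimal sketch of the same heatmap idea: each cell is shaded between white and a base color according to where its value falls between the smallest and largest value in the table. The function names, the base color, and the yield numbers are all assumptions made up for illustration; they are not the data behind the example described above.

```python
from html import escape


def shade(value, low, high, base=(178, 34, 34)):
    """Blend from white toward `base`; higher values get a darker color."""
    # If every value is identical, use the full base color for all cells.
    t = 1.0 if high == low else (value - low) / (high - low)
    r, g, b = (round(255 + (c - 255) * t) for c in base)
    return f"#{r:02x}{g:02x}{b:02x}"


def heatmap_table(row_labels, col_labels, values):
    """Return an HTML table whose cell backgrounds encode relative size."""
    flat = [v for row in values for v in row]
    low, high = min(flat), max(flat)
    header = "".join(f"<th>{escape(str(c))}</th>" for c in col_labels)
    lines = ["<table>", f"<tr><th></th>{header}</tr>"]
    for label, row in zip(row_labels, values):
        cells = "".join(
            f'<td style="background:{shade(v, low, high)}">{v}</td>' for v in row
        )
        lines.append(f"<tr><th>{escape(str(label))}</th>{cells}</tr>")
    lines.append("</table>")
    return "\n".join(lines)


# Hypothetical yields (bu/acre), invented purely for this sketch.
nitrogen_rates = ["0 lb N", "50 lb N", "100 lb N"]
seeding_rates = ["Low", "Medium", "High"]
yields = [[40, 45, 47], [52, 58, 60], [55, 63, 61]]

html = heatmap_table(nitrogen_rates, seeding_rates, yields)
print(html)
```

Using a single hue that varies only in intensity keeps the table readable at a glance. On the darkest cells you may want to switch the text to white for contrast.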
Layered bar graph
A layered bar graph essentially combines the two bars of a side-by-side double bar graph. Both the grey bars and red bars are assumed to start at the 0 point on the x-axis. This allows an easy comparison showing us how much more there is of the grey than the red. Combining the bars is a good technique for saving space and clearly illustrates the difference between the two things you are comparing, especially when you want to emphasize the relative difference and not necessarily the quantitative difference. Instead of a legend, color in the title and color of the bar segments are used to communicate information about the different elements being compared.
Small multiples is a great tool for breaking complex information into an array of manageable and comparable information. The technique uses multiple views to show different partitions of a dataset, using a series of similar charts or graphs with the same scale and axes that can be easily compared. There are numerous uses for small multiples, and many chart types can be broken down into small multiples; this example uses horizontal bar charts.
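The essential ingredient of small multiples, one shared scale across every panel, can be sketched in a few lines. The example below draws plain-text horizontal bars so it runs without any charting software; the panel names and counts are hypothetical, and the function name is my own. It assumes at least one value is greater than zero.

```python
def small_multiples(panels, width=30):
    """Render one text bar chart per panel, all drawn on the same scale."""
    # One maximum for ALL panels, so bar lengths are comparable across them.
    shared_max = max(v for bars in panels.values() for v in bars.values())
    label_width = max(len(k) for bars in panels.values() for k in bars)
    out = []
    for title, bars in panels.items():
        out.append(title)
        for label, value in bars.items():
            length = round(width * value / shared_max)
            # Each bar is labeled directly, so no legend is needed.
            out.append(f"  {label:<{label_width}} {'#' * length} {value}")
        out.append("")
    return "\n".join(out)


# Hypothetical counts, invented for illustration only.
panels = {
    "Region A": {"Corn": 30, "Soybeans": 20, "Wheat": 10},
    "Region B": {"Corn": 15, "Soybeans": 25, "Wheat": 5},
}

chart = small_multiples(panels)
print(chart)
```

If each panel were scaled to its own maximum instead, the longest bar in every panel would look the same length and the panels could no longer be compared directly.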
3. Less is more
Eliminating unnecessary legends, gridlines, tick marks, and colors will clean up the graph and allow you to focus your learner’s attention on your point.
Eliminating the legend is a good strategy to clean up the graphic, and, if done well, makes interpretation of the graph quicker. Labeling bars directly, such as in the small multiples example, makes it easier for the viewer to process information because they do not have to look between the legend and the main part of the chart to determine what each color in the bar chart represents. Color also can be used in a similar way, such as in the layered bar graph example. Colors in the title and on the average lines indicate what the grey and red categories are, making a separate label unnecessary.
Consider whether eliminating axis labels and instead labeling the points directly might be advantageous. When the actual numeric value is important, label the points directly; when the overall trend is important, leave the axis labels in place. To reduce redundancy, however, do not use both axis labels and individual point labels. One exception to this guideline would be to use the axis labels, but label a few key data points to draw attention to them.
4. Use color intentionally
I have seen many graphs like the following. In this case, each individual site was given a different color. The graph is bright and eye-catching, yet the color is not used in a meaningful way. Separating the various sites with different colors is not important and only detracts from the overall point.
Color is a powerful tool and should always be used to convey a message. When I am developing a graphic, I like to first make as much of the graph as possible grey. Then I go back and begin using color to make the key point stand out. In the following graph I have used a lighter shade of red for the late planting date, and a darker shade of red for the early planting date. Color in the subtitle is used to designate what the different colors of bars represent and allows the legend to be eliminated.
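The grey-first habit translates directly into code if you build charts with a script. This minimal sketch starts every series in grey and overrides only the ones that carry the point; the resulting list could be handed to whatever plotting tool you use. The series names, the third "County average" series, and the hex color values are assumptions chosen for illustration.

```python
GREY = "#bfbfbf"


def bar_colors(series_names, highlights):
    """Default every series to grey, then color only the highlighted ones."""
    return [highlights.get(name, GREY) for name in series_names]


# Hypothetical series; only the two planting dates carry the message.
series_names = ["Early planting", "Late planting", "County average"]
highlights = {
    "Early planting": "#8b0000",  # darker red
    "Late planting": "#e57373",   # lighter red
}

colors = bar_colors(series_names, highlights)
print(colors)
```

Because grey is the default, adding color becomes a deliberate decision for each series rather than something the software decides for you.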
5. Create pointed titles and call out key points with text
The previous graph could be given a title along the lines of “Soybean Yield by Planting Date, 2008 to 2010.” However, a much more useful title could be leveraged to communicate the key point – in this case, “Planting Soybeans Early Resulted in an Average 2.7 bu/acre Yield Increase.”
Text also can be used in other strategic locations, such as the use of the phrase “12 On-Farm Research Sites” to designate all the sites along the x-axis rather than labeling them each “site 1, site 2, etc.” A subtitle is used to designate what the different colors of bars represent and provides additional useful information about planting dates.
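If your charts are generated from data that gets updated, a pointed title can be computed rather than typed by hand, so the headline number always matches the bars. The sketch below assumes paired early and late yields per site; the function name and all the yield values are hypothetical and are not the data behind the 2.7 bu/acre figure mentioned above.

```python
from statistics import mean


def pointed_title(early, late, unit="bu/acre"):
    """Build a title that states the key finding instead of naming the axes."""
    # Average of the per-site differences (early minus late).
    diff = mean(e - l for e, l in zip(early, late))
    direction = "Increase" if diff >= 0 else "Decrease"
    return (
        f"Planting Soybeans Early Resulted in an Average "
        f"{abs(diff):.1f} {unit} Yield {direction}"
    )


# Hypothetical per-site yields, invented for this sketch.
early_yields = [61.0, 59.0, 63.0]
late_yields = [58.0, 58.0, 60.0]

title = pointed_title(early_yields, late_yields)
print(title)
```

It is still worth reading the generated title before sharing the graph, since a computed sentence can be accurate and yet miss the point you actually want to make.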
6. Get feedback and iterate
This process is dynamic and, at least for me, requires lots of trial and error. Utilize the back button. Or create a separate copy before trying a bold remake, which also allows you to compare the first and second versions. On a number of occasions, once I had gotten a graph cleaned up and presentable, I realized my point would be better displayed with a completely different graph type, and I ended up starting the process over again.
Starting with a quick sketch on a sheet of scratch paper can also be helpful. Sometimes you can save time by quickly drawing out some ideas of how to display your variables before beginning your computer work. This also forces you to think through the concept rather than just defaulting to one of Excel’s recommended charts.
Getting feedback can be very valuable. Ask other people to take a look at your graph. Ask them what they think the main point is, and what they notice first. Audiences also often provide great feedback. Take note of what questions your audience has and then determine if there is a way to make your graph more clearly communicate the information they need.
7. Read up and copy other visualizations
Many of the graphics I have experimented with came from examples that intrigued me by the effectiveness with which they communicated information. I encourage you to browse websites and follow Twitter accounts that routinely produce good data visualizations. If you see something that really communicates information well, take a few minutes to look at it and think about why it is effective, then try to incorporate that into your future designs.
Here are some suggestions to get you started:
- Storytelling with Data[v] (by author Cole Knaflic)
- Stephanie Evergreen[vi]
- USDA ERS Chart Gallery[viii]
- USDA ERS Data Visualizations[ix]
This is admittedly a very brief introduction to the concept of data visualization. There are lots of great resources that discuss how to pick the right chart for your data – and even walk you through how to create them. Two of my favorites that are fairly comprehensive are “Storytelling with Data” by Cole Knaflic and “Effective Data Visualization” by Stephanie Evergreen.
Be patient with yourself – as with most things, learning to create good data visualizations takes time. Scott Berinato, author of “Good Charts: The HBR Guide to Making Smarter, More Persuasive Data Visualizations,” sums it up well: “Simplicity takes some discipline and courage to achieve. The impulse is to include everything you know. Busy charts communicate the idea that you’ve been just that – busy[x].”
These seven suggestions are meant to serve as a starting point and to encourage you to begin experimenting with the way you communicate data. In the next post, I will take you through a data visualization makeover using the elements I outlined in this post.