How to Design Terrible Graphs: A Developer's Guide

Graphs and charts are essential tools in a developer's toolkit for communicating data insights to users, stakeholders, and fellow developers. When designed well, they have the power to enlighten and drive better decision-making. But when designed poorly, graphs can confuse, mislead, and undermine a data product's credibility and usability.

As a full-stack developer specializing in data visualization, I've witnessed my fair share of terrible graphs in the wild. From analytics dashboards to financial reports to scientific publications, no domain is immune to the scourge of poorly designed charts.

In this article, we'll take a lighthearted look at some of the most common anti-patterns in data visualization and how they manifest in different development contexts. Of course, the real goal is to learn how to avoid these mistakes and create graphs that are honest, clear, and effective. We'll dive into the code behind terrible graphs, examine real-world examples, and discuss best practices and resources to help you level up your data visualization skills.

The Sins of Terrible Graphs

While terrible graphs come in many varieties, they tend to commit some common sins:

  1. Choosing the wrong graph type for the data: Using a pie chart to show time series data, or a 3D chart to plot two-dimensional data.

  2. Obscuring the data with poor design choices: Garish colors, lack of labeling, illegible fonts, and chart clutter that make the data hard to read.

  3. Distorting the data to mislead the viewer: Truncated axis scales, cherry-picked data, and lack of necessary context that misrepresent the true insights.

  4. Failing to consider the audience's needs: Including extraneous or overly complex data that doesn't address the key questions users need answered.
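
Sin 3 is the easiest to put a number on. The sketch below, in plain Python, compares the ratio a viewer sees between two bars with the ratio actually present in the data when the value axis starts above zero. The function name `apparent_ratio` and the sample sales figures are illustrative assumptions, not part of any charting library.

```python
def apparent_ratio(a, b, baseline=0.0):
    """Ratio of drawn bar heights for values a and b when the axis starts at `baseline`."""
    if baseline >= min(a, b):
        raise ValueError("baseline must be below both values")
    return (b - baseline) / (a - baseline)


# Hypothetical sales figures for two quarters.
q1, q2 = 95.0, 100.0

true_ratio = apparent_ratio(q1, q2)             # axis starts at zero
truncated_ratio = apparent_ratio(q1, q2, 90.0)  # axis starts at 90

print(f"True ratio:      {true_ratio:.2f}x")
print(f"Truncated ratio: {truncated_ratio:.2f}x")
```

With the axis starting at 90, an increase of roughly 5% is drawn as a bar twice as tall as its neighbor.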

To illustrate, let's generate some terrible graphs using popular charting libraries. Note: the code samples herein are for educational purposes only. Please do not unleash these monstrosities on your users!

Terrible Graphs in Python with Matplotlib

import matplotlib.pyplot as plt
import numpy as np

# Generate some random time series data 
x = range(1, 31)
y = np.random.randint(0, 100, 30)

# Create a terrible pie chart
plt.figure(figsize=(6,6))
plt.pie(y)
plt.title("Daily Sales: A Useless Pie Chart", fontsize=14)
plt.savefig('terrible_pie_chart.png', dpi=300, bbox_inches='tight')
plt.close()

Time series data plotted as a terrible pie chart in Matplotlib

Ah yes, the classic "time series data in a pie chart" abomination. The slices represent sequential days, but plotting them in a circle completely obscures any temporal trends or patterns. To add insult to injury, the chart lacks any labels to provide context for the data. Viewers are left to guess what the heck this cyclopean nightmare is trying to convey.

A line or bar graph would be a far more appropriate choice for showing how a value changes over time:

# Create a better line graph of the same data
plt.figure(figsize=(8,4))
plt.plot(x, y)
plt.xlabel("Day")
plt.ylabel("Sales")
plt.title("Daily Sales Over Time")
plt.savefig('better_line_chart.png', dpi=300, bbox_inches='tight')
plt.close()

The same data plotted as a clear line chart

Terrible Graphs in R with ggplot2

Let's say you want to visualize the relationship between marketing spend and revenue. You could make a nice, clean scatter plot in ggplot2 to look for correlation:

spend <- c(100, 250, 400, 200, 350, 50, 450, 300, 150, 500)
revenue <- c(1000, 1200, 1800, 1100, 1500, 800, 2000, 1300, 1050, 2200)

df <- data.frame(spend, revenue)

library(ggplot2) 

ggplot(df, aes(x=spend, y=revenue)) +
  geom_point() +
  geom_smooth(method=lm, se=FALSE, color="indianred2") + 
  expand_limits(x = 0, y = 0) +
  labs(title="Marketing Spend vs. Revenue",
       x="Marketing Spend",
       y="Revenue")

ggsave("good_scatter_plot.png", width=6, height=4, dpi=300)

Clear scatter plot showing positive correlation between marketing spend and revenue

But what fun is that? Let's make a terrible 3D pie chart instead!

library(plotrix)

pie3D(df$revenue, labels=df$spend, labelcex=0.7, explode=0.1, 
      main="Marketing Spend vs. Revenue: A Terrible 3D Pie Chart", labelrad = 1.3)

The same marketing spend and revenue data plotted as a terrible 3D exploded pie chart

The cardinal sin of data visualization: gratuitous chart junk that adds nothing but distraction. Not only is a pie chart a poor way to show a non-proportional relationship, but the 3D effect distorts the data and makes it harder to compare the slices. The gaudy colors and exploding sections assault the eyes without illuminating the data. Congratulations, you now have a chart that would make Edward Tufte weep.
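
To see what the pie chart throws away, it helps to quantify the relationship that the scatter plot made visible. The following is a minimal sketch in plain Python (an assumption on my part; the example above uses R) that computes the Pearson correlation and the least-squares slope for the same spend and revenue values. The helper name `pearson_and_slope` is hypothetical.

```python
import math

# Same values as the R data frame above.
spend = [100, 250, 400, 200, 350, 50, 450, 300, 150, 500]
revenue = [1000, 1200, 1800, 1100, 1500, 800, 2000, 1300, 1050, 2200]


def pearson_and_slope(xs, ys):
    """Return (Pearson r, least-squares slope of ys on xs)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of cross-products and squared deviations from the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy), sxy / sxx


r, slope = pearson_and_slope(spend, revenue)
print(f"Pearson r = {r:.2f}, slope = {slope:.2f} revenue per unit of spend")
```

For this data the correlation comes out at roughly 0.97. A relationship that strong is obvious in the scatter plot and invisible in the pie, where each slice encodes only one revenue value's share of the total.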

The Prevalence of Bad Graphs

Terrible graphs are not just the stuff of funny examples – they're a pernicious problem in the real world. In a study of graphs from scientific journals, Weissgerber et al. (2015) found that "at least half of the articles with data graphs had some problem with the quality of the graphs" and "about 20% of articles had problems so severe that they either partially or completely obscured the data."

Another study by Ferreira et al. (2014) examined graphs in business reports and found "frequent errors" in scaling, labeling, and design choices that resulted in confusing or misleading graphics.

These findings suggest that bad graphs are disturbingly common, even in publications by expert researchers and analysts. As data visualization expert Alberto Cairo writes in The Truthful Art (2016), "We should be outraged by this, and work hard to change it."

The Ethical Implications of Bad Graphs

For developers who create data products and tools, the stakes around data visualization integrity are high. A misleading graph is not just an innocuous mistake – it can have real consequences for how people understand and act on information.

Consider the damage that could be done by a terrible graph in a medical research paper that leads doctors to make ill-informed treatment decisions. Or an inaccurate chart in a company's financial report that causes investors to overvalue the stock. Or a deceptive visualization on a news site that promotes harmful political narratives.

As Cairo argues, "An ethical infographic artist, journalist, or designer cannot just be a decorator who makes numbers and facts visually appealing…If you are an information designer, you are a guide and an enabler, and you must be aware of the power you wield."

Developers who work with data visualization have a responsibility to create honest, clear, and truthful graphs that accurately represent the underlying data. Anything less is a violation of the trust our users place in us and the tools we build.

Best Practices for Designing Better Graphs

So what can developers do to avoid falling into the trap of terrible graphs? Follow these key principles of data visualization design:

  1. Choose the right graph for the data and message: Let the shape and structure of your data guide your choice of graph. Use lines for continuous data, bars for categorical data, scatter plots for correlation, etc.

  2. Keep it simple and focused: Remove any chart junk that doesn't directly contribute to communicating the key insights. Use clear labels and legends, but avoid unnecessary decoration.

  3. Use meaningful and accurate scales: Start your axis at zero, especially for bar charts, where bar length encodes the value, and use a scale that accurately represents the data. Don't truncate or skew the scale to exaggerate or downplay differences.

  4. Provide context and explanations: Help your audience understand what the graph is showing and why it matters. Provide comparisons to benchmarks or past data as needed.

  5. Consider accessibility: Can colorblind users distinguish your categories? Is the text legible at different screen sizes? Design with accessibility in mind.

  6. Get feedback and iterate: Test your graphs with real users and stakeholders. See if they correctly interpret the message. Incorporate their feedback to improve your designs.
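
Some of these checks can be automated. As one example for principle 5, the sketch below computes the contrast ratio between a label color and its background using the relative-luminance formula from the WCAG 2.x accessibility guidelines. WCAG is not mentioned elsewhere in this article; bringing it in is an assumption, as are the function names and the sample colors. WCAG's commonly cited minimum for normal-size text is a ratio of 4.5:1.

```python
def _linear(channel):
    """Convert an 8-bit sRGB channel to linear light (WCAG 2.x definition)."""
    c = channel / 255
    return c / 12.92 if c <= 0.03928 else ((c + 0.055) / 1.055) ** 2.4


def relative_luminance(hex_color):
    """Relative luminance of a '#rrggbb' color."""
    h = hex_color.lstrip("#")
    r, g, b = (int(h[i:i + 2], 16) for i in (0, 2, 4))
    return 0.2126 * _linear(r) + 0.7152 * _linear(g) + 0.0722 * _linear(b)


def contrast_ratio(color_a, color_b):
    """WCAG contrast ratio, from 1.0 (identical colors) up to 21.0 (black on white)."""
    la, lb = relative_luminance(color_a), relative_luminance(color_b)
    lighter, darker = max(la, lb), min(la, lb)
    return (lighter + 0.05) / (darker + 0.05)


# Hypothetical axis-label colors on a white chart background.
for label_color in ("#000000", "#777777", "#cccccc"):
    ratio = contrast_ratio(label_color, "#ffffff")
    verdict = "ok" if ratio >= 4.5 else "too faint"
    print(f"{label_color}: {ratio:.2f}:1 ({verdict})")
```

A check like this covers only text and background contrast. Whether colorblind users can tell categories apart still needs a suitable palette and testing with real viewers.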

There are also many great resources available to help developers learn more about data visualization best practices, such as Alberto Cairo's The Truthful Art (2016), quoted above.

Conclusion

Graphs and charts are incredibly powerful tools for communicating data insights. But with great power comes great responsibility. As developers working with data, we have an obligation to create visualizations that are accurate, clear, and truthful.

By learning to recognize and avoid the traits of terrible graphs, we can ensure that our data products and tools enlighten rather than mislead. We can design graphs that empower our users to make better-informed decisions and take meaningful action.

The next time you go to create a graph, remember the key principles of good data visualization design. Choose the right graph for the data, keep it simple and accurate, provide meaningful context, and always strive to tell the truth.

Together, we can fight back against the scourge of terrible graphs and create a world where data visualization is a force for insight, understanding, and positive change.
