Lecture 2: Clutter and Focus
Nick Huntington-Klein
03 June, 2022
Clutter and Focus
- We want to tell a story with our data
- That is, we want to demonstrate some interesting finding
from our data
- Some stories are simple
- Others are complex
- They all need to be clear
Clutter and Focus
- A common mistake is to show you the data rather than
show you the story
- The story often exists within part of the data
- We want to understand some sort of continuity or some sort
of distinction
- Everything that takes away from that makes the story more
opaque
Examples
- Let’s look at some data visualizations that are cluttered
- Some are still beautiful and lovely and works of art
- Some are impressive with the amount of information they’re able to
get on one graph
- But they can obscure the point they’re trying to make
- With each, let’s think about why
Bible Cross-References
- The following graph shows how different chapters of the bible refer
to each other
Coffee
- How much coffee do South Koreans drink?
- Simple question, right?
Health care
- This graph just wants to get across three statistics about health
care
- And how those statistics vary by state
Network Graphs
- You see a lot of network graphs floating around these
days
- They look super cool
- But they can be hard to actually learn anything from
- Unless the story is just “things sure are connected, aren’t they?”
which isn’t all that interesting
Napoleon’s March
- This is considered one of the most beautiful and impressive pieces
of data viz ever, about Napoleon’s army during its march and retreat
from Russia
- Distance from Paris, size of army, elevation, places where the army
met itself…
- It absolutely tells an intricate and detailed story - but what can
you take away from it if you don’t already know what it’s trying to
say?
- What parts of the story does it guide you towards focusing on? Which
get lost?
Clutter and Focus
- These graphs vary in quality, but they share some common issues:
- They add things to the graph that do not need to be there or are
difficult to understand
- They do not focus your attention on a single takeaway
Clutter and Focus
- Clutter adds cognitive load - it makes your viz more difficult to
understand
- There are many very low-level visual shortcuts in the human brain we
can take advantage of to avoid this
- And to make sure that we can anticipate how the reader will
interpret the visualization
Gestalt Principles of Visual Perception
- Proximity
- Similarity
- Enclosure
- Closure
- Continuity
- Connection
Proximity
- If you put things close to each other, they tend to get thought of
as being associated
- And at the very least, they are visually comparable
- Putting two similar things close together says “these things are
very similar!”
- Putting two distinct things close together says “compare these
please!”
Proximity
Sales     | $10000
Marketing | $15000
          |
R&D       | $5000
HR        | $6000
Similarity
- Simply put, things that are made similar in some way (shape, color,
shade, etc.) are perceived as part of the same whole
Note on using color for this: Remember colorblindness!
- The most common form is red/green/(orange); blue/yellow is also
relatively common
- When picking colors, avoid having your contrasts be red/green or
blue/yellow. Red/blue is a popular choice here.
- For complete accessibility, focus on highly contrasting
shades
- Colorblind-friendly palettes are available!
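One way to check whether two shades contrast strongly is to compute a contrast ratio from their relative luminance. The sketch below uses the WCAG 2.x formula, which is an assumption brought in for illustration; the lecture does not name a specific measure. The function names are hypothetical.

```python
def srgb_to_linear(channel):
    """Linearize one sRGB channel given in the range 0..1 (WCAG 2.x formula)."""
    if channel <= 0.03928:
        return channel / 12.92
    return ((channel + 0.055) / 1.055) ** 2.4


def relative_luminance(hex_color):
    """Relative luminance of a '#rrggbb' color: 0.0 is black, 1.0 is white."""
    hex_color = hex_color.lstrip("#")
    r, g, b = (int(hex_color[i:i + 2], 16) / 255 for i in (0, 2, 4))
    return (0.2126 * srgb_to_linear(r)
            + 0.7152 * srgb_to_linear(g)
            + 0.0722 * srgb_to_linear(b))


def contrast_ratio(color_a, color_b):
    """WCAG contrast ratio: 1 for identical shades, 21 for black vs. white."""
    lighter, darker = sorted(
        (relative_luminance(color_a), relative_luminance(color_b)),
        reverse=True,
    )
    return (lighter + 0.05) / (darker + 0.05)


# Black vs. white is the maximum possible contrast.
print(round(contrast_ratio("#000000", "#ffffff"), 1))

# Pure red vs. pure green: very different hues, but a much smaller
# difference in shade.
print(round(contrast_ratio("#ff0000", "#00ff00"), 2))
```

A pair with a high ratio stays distinguishable even when hue is lost, which is the point of focusing on shade rather than hue alone.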
Similarity
Enclosure
- If you put a physical enclosure around some things, they will be
perceived as being part of a group
- This can be especially handy if you’re already using some other
forms of similarity, or need to indicate areas that WOULD be part of
that group if you had data there.
- Enclosures say “these things are in a group!” - perhaps you want to
compare that group to other things, or focus attention within
the group?
Enclosure
Continuity
- When there are gaps, we tend to “fill in” in the most intuitive
way
Continuity
- This year-on-year change graph has a gap at Feb. 29. But what does
our brain do?
Connection
- If you put a literal connection between two points, people will
interpret them as being connected!
- Connecting two things says “one of these things is adjacent or
linked to the other”
- No big surprise
- See any line graph for this.
- This is also a good reason not to use a line graph when
your x-axis doesn’t have an order to it!
How?
- With these concepts in mind, how can we use them to emphasize
information?
- And improve clarity
- Let’s begin with a bad first-pass graph and improve it as we
can
- Let’s tell a story about how Washington’s 4th graders fare in math
vs. other coastal states
State NAEP Test Scores
Proximity: Example
Using similarity
- We may be interested in comparing Washington vs. other coastal
states and saying something about the difference.
- We can do this easily by giving all the coastal states a similar
something
- With bar graphs, an obvious pick is color
Similarity: Example
Clarity
- We have our comparison clarified
- Let’s see what else we can do with this graph
Slices of Data
- What data is important to our story, and what data is not?
- We are interested in comparing Washington to other coastal states.
Why do we need all these other states?
- Having stuff on the graph says “this is worth your attention
somehow.” If it’s not important to the story, it will confuse or
distract from the story
- Chuck ’em!
- This will also give us room to rotate those axis labels
Clarity
Clarity
- And dare we?
- It’s a debate as to whether it ever makes sense for your \(y\)-axis not to start at zero. But here
it’s really making things hard to see. Clarity could be improved by
starting the axis at, say, 100!
- We will leave this as-is, but it is something to think about for
future improvement
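The arithmetic behind the debate can be sketched as follows. The two scores are hypothetical, chosen only to illustrate the calculation, and `bar_height_ratio` is an illustrative helper, not part of any plotting library.

```python
def bar_height_ratio(value_a, value_b, baseline=0.0):
    """Ratio of the drawn bar heights when the y-axis starts at `baseline`."""
    return (value_a - baseline) / (value_b - baseline)


# Hypothetical scores (not taken from the NAEP data).
score_a, score_b = 250, 240

# Axis starting at zero: bar heights are proportional to the values.
ratio_from_zero = bar_height_ratio(score_a, score_b, baseline=0)

# Axis starting at 100: the same gap takes up more of each bar.
ratio_from_100 = bar_height_ratio(score_a, score_b, baseline=100)

print(ratio_from_zero, ratio_from_100)
```

Raising the baseline makes the same difference easier to see, but the bars no longer show the true ratio between the values. That trade-off is why the choice is debated.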
Contrast
- We can use contrast to make Washington stand out again
- Use a light color for the others so they fade into the background
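A minimal sketch of this idea: build one color per bar, with a strong color for the state in focus and a light gray for the rest. The state list and hex colors here are assumptions for illustration; the states on the actual graph may differ.

```python
def contrast_colors(labels, focus, focus_color="#1f77b4", muted_color="#d3d3d3"):
    """Return one color per label: strong for the focus, light for the rest."""
    return [focus_color if label == focus else muted_color for label in labels]


# Hypothetical set of coastal states, used only as labels.
states = ["California", "Oregon", "Washington"]
colors = contrast_colors(states, focus="Washington")
print(colors)
```

A list like this can be handed to a plotting library that accepts per-bar colors, for example the `color` argument of matplotlib's `bar`.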
Contrast
Simplify
- Enclosure allows us to get rid of a lot of the borders
- And we can remove the background ink too
- Don’t underestimate the benefits of removing background
ink. It can really be distracting! Cleaner graphs are easier to read
and often look nicer.
Simplify
Label Data Directly
- Think: have we provided affordances? Have
we put the right answer in the place where you’d think to look for
it?
- Why make the reader work? Put the label right where it’s needed
- Also, remove the cognitive steps of translating the markers
Label Data Directly
Now - Your Turn!
- You’ll be creating a graph by hand
- Think carefully about how to make data comparable
- And how to contrast as needed
- And how to tell the story
Your Turn
- Story: Sales and Marketing may be more expensive, but they haven’t
grown as much as R&D and HR.
Costs in thousands of dollars by department by year
Year | Sales | Marketing | R&D  | HR
2018 | 10000 | 15000     | 5000 | 6000
2019 | 10500 | 15000     | 7200 | 6500
2020 | 10600 | 16000     | 9300 | 8000
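Before drawing, it can help to check the story against the numbers. The sketch below computes each department's percent growth from 2018 to 2020 using the table's values; the variable and function names are illustrative.

```python
# Costs by department and year, copied from the table.
costs = {
    "Sales": {2018: 10000, 2019: 10500, 2020: 10600},
    "Marketing": {2018: 15000, 2019: 15000, 2020: 16000},
    "R&D": {2018: 5000, 2019: 7200, 2020: 9300},
    "HR": {2018: 6000, 2019: 6500, 2020: 8000},
}


def percent_growth(series, start=2018, end=2020):
    """Percent change in cost from the start year to the end year."""
    return 100 * (series[end] - series[start]) / series[start]


growth = {dept: percent_growth(series) for dept, series in costs.items()}

# Print departments from fastest to slowest growth.
for dept, pct in sorted(growth.items(), key=lambda item: item[1], reverse=True):
    print(f"{dept}: {pct:.1f}%")
```

Sales and Marketing have the highest costs in every year, while R&D and HR show the largest percent growth. The graph needs to make both halves of that comparison visible.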