Lecture 12: Story and Geometry

Nick Huntington-Klein

24 January, 2024

Making it All Make Sense

  • Now that we’ve spent some time on technical detail and how to get things working…
  • Let’s bring it back to the point of all this, which is not to show off our R skills but to communicate effectively
  • Don’t be so proud of your ability to code something up that you forget to make the product good!

Story and Data

  • First off, remember our goals:
  • Get across an idea
  • Communicate clearly

Story and Data

  • It’s surprisingly common for people to forget the goal of getting across an idea
  • Granted, this isn’t always necessary - some visualizations are just fun to look at even if we don’t care what they say. But that’s more like art than communication
  • But in most settings we want to transmit an idea

Story and Data

  • What kind of ideas work well?
  • You can think of a story as something you learn from the data. It reveals something about the world.
  • There’s an important distinction between showing the data and demonstrating an idea
  • Ask “what would I learn from viewing this visualization” and, crucially, “if I learned that thing, would I care?”

Good Stories

Good stories:

  • Use data to broaden our understanding of something, rather than using data to… show us data
  • Provide an understanding that is interesting or actionable
  • Don’t prompt responses of “why should I care about this?”

Visualization Implies the Story

  • Your decision of what to include or exclude, and what to compare or contrast, implies a story
  • Make sure it’s the one you want to get across!
  • Sometimes, you can just straight up say the idea in a title or annotation and just demonstrate with the visualization
  • But a more satisfying (and convincing) result comes if the viewer comes to the conclusion on their own by viewing the viz
  • Think “if I want people to come to this conclusion, what about my visualization points them there?”

Visualization Implies the Story

  • What conclusion would this (fictional) graph make us come to?

Visualization Implies the Story

  • How about this?

Visualization Implies the Story

  • And this one?

Visualization Implies the Story

  • If our goal was to show that Aja’s career has really taken off, which visualization works best?
  • Is that a good story to try to tell?
  • What might someone do with that information?

Landing on the Story

  • It’s very easy for us to make a graph that demonstrates the thing we already know but doesn’t inspire that same idea in others
  • If getting that idea across is our goal, every part of the graph needs to be working toward it!
  • We talked about removing clutter - this is one reason why. Anything else on the graph is something other than your idea that viewers can get distracted by

Tips

  • Don’t underestimate how difficult it is to get an idea across, especially indirectly by hoping someone sees your intention on a graph
  • The best way to build intuition for how to do this is to show other people your work and see if they come to the same conclusion you intended
  • Ask them what they think the graph means! Not just what it shows but what it means. The answer will often shock you

Picking Geometries

  • In the same vein, the choice of geometry has a lot to do with what stories can be effectively communicated and how they come across
  • There are also some common errors made with certain geometries
  • So let’s talk about them.
  • We must ask: what does this geometry let us compare and contrast, and what do we want to compare and contrast?
  • We can only evaluate a few but this will help us learn to evaluate them overall (and pick up some specific tips)

In General

  • If a geometry shows lots of separate distinct objects (like a bar graph), it wants you to compare those objects
  • If a geometry draws a continuous path (like a line graph, or a cluster of points with a shape to it), it wants you to follow that path and see how it changes smoothly along some axis
  • If a geometry shows an area divided into many parts (like a pie chart, treemap, or a stacked bar graph) it wants you to see how a whole object is divided up

In General

  • If a geometry represents something real (like a map) then it expects us to interpret it as that real thing
  • If a geometry shows differing colors, sizes, or shades (like a lot of stuff), it’s expecting us to notice that something is more and compare it to something that is less
  • If a geometry shows crowded areas and sparse areas (like a density plot, or clusters of points), it’s expecting us to see which areas are well-populated and which aren’t

Bar Graphs

Bar Graphs

Good for:

  • Categorical x-axis variables
  • For which you want to show a single count or summary statistic for each group
  • Or possibly do a grouped bar chart (position = 'dodge') to compare that summary statistic for that group within another grouping

(Tech tips: geom_col() lets you set the bar height directly from a value you’ve already calculated; reorder() in aes() lets you order the bars by size)
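To make those two tech tips concrete, here is a minimal sketch using the built-in mtcars data. The choice of average MPG as the summary statistic and the object names mpg_by_cyl and p_bar are just for illustration.

```r
library(dplyr)
library(ggplot2)

# Pre-compute one summary statistic per group (average MPG by cylinder count)
mpg_by_cyl <- mtcars %>%
  group_by(cyl) %>%
  summarize(mean_mpg = mean(mpg))

# geom_col() uses the y value as the bar height;
# reorder() sorts the bars by that value (ascending by default)
p_bar <- ggplot(mpg_by_cyl, aes(x = reorder(factor(cyl), mean_mpg), y = mean_mpg)) +
  geom_col() +
  theme_classic() +
  labs(x = 'Number of Cylinders', y = 'Average MPG')
p_bar
```

Wrapping the second argument in a minus sign, reorder(factor(cyl), -mean_mpg), would sort from tallest to shortest instead.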

Bar Graphs

Common mistakes:

  • Overuse when other graphs (line graph, treemap) might be more appropriate
  • Needing to compare values separated across long distances (next slide)
  • Long labels on x-axis (try flipping the coordinates! +coord_flip())
  • Not using labeling to call attention to important parts
  • Stacked bar charts outside a few good applications
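As a rough sketch of the long-label fix, the car names in mtcars make a decent stand-in for long category labels. Keeping only ten cars and the object names top_mpg and p_flip are choices made for this example.

```r
library(dplyr)
library(ggplot2)

# Car names are long labels; keep the ten most fuel-efficient cars
top_mpg <- mtcars %>%
  mutate(CarName = row.names(mtcars)) %>%
  slice_max(mpg, n = 10, with_ties = FALSE)

p_flip <- ggplot(top_mpg, aes(x = reorder(CarName, mpg), y = mpg)) +
  geom_col() +
  coord_flip() +  # bars run sideways, so the names can be read horizontally
  theme_classic() +
  labs(x = NULL, y = 'MPG')
p_flip
```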

Bar Graph Mistakes Examples

  • This requires us to do some visual gymnastics to compare the right stuff. A line graph would do this better.

Bar Graph Mistakes Examples

  • This requires us to do some visual gymnastics to compare the right stuff. A line graph would do this better. Although this is still quite busy! Perhaps a revision could simplify.

Bar Graph Mistakes Examples

Bar Chart Mistakes Examples

Stacked Bar Charts

  • Stacked bar charts make comparisons difficult
  • You can track the size of the bottom category but not the rest
  • Requires you to track a whole lot of information
  • ONLY good if you are ONLY interested in the bottom category AND the total, but NOT any other categories
  • Also, please please only apply this to outcome variables where it makes sense to stack (sum them) together. “Average price by category” doesn’t work.

Stacked Bar Charts

  • More broadly, a common mistake with stacked bar charts that I also see with pie charts, tree maps, and other geometries that divide an area up into sections:
  • Dividing an area up into sections implies you are dividing a whole into pieces and also that you can add the pieces back together to get the whole
  • This doesn’t work with stuff like averages! If you stack “average male income” on top of “average female income” you don’t get “average overall income”.
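A quick arithmetic check of that point, using a handful of incomes made up purely for illustration (these are not real data):

```r
# Hypothetical incomes, invented for this example only
income <- data.frame(
  sex    = c('Male', 'Male', 'Female', 'Female', 'Female'),
  income = c(40000, 60000, 30000, 40000, 50000)
)

# Counts are pieces of a whole: the group counts add back up to the total
n_by_sex <- table(income$sex)
counts_add_up <- sum(n_by_sex) == nrow(income)

# Averages are not: stacking the group means does not give the overall mean
mean_by_sex  <- tapply(income$income, income$sex, mean)
stacked_mean <- sum(mean_by_sex)      # 50000 + 40000 = 90000
overall_mean <- mean(income$income)   # 220000 / 5 = 44000
```

A stacked bar of the two averages would reach 90,000, a number that doesn’t describe anyone, while the counts stack to the correct total of 5.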

Some Bar Graph Alternatives

# Good for taking up a bit less space, emphasizing the top
library(tidyverse)  # for %>%, group_by(), summarize(), and ggplot()
library(ggalt)      # for geom_lollipop()
mtcars %>% group_by(cyl) %>% summarize(N = n()) %>%
  ggplot(aes(x = factor(cyl), y = N)) + 
  geom_lollipop() + theme_classic() + 
  labs(x = 'Number of Cylinders', y = 'Number of Cars')

Some Bar Graph Alternatives

# For just plain count-of-how-many, tree maps are way better at lots of categories
# Speaking of which, tree maps are better than pie charts almost 100% of the time
library(tidyverse)   # for %>%, group_by(), summarize(), and ggplot()
library(treemapify)  # for geom_treemap() and geom_treemap_text()
mtcars %>% group_by(cyl) %>% summarize(N = n()) %>%
  ggplot(aes(area = N, label = paste0(cyl, ' Cylinders: ', N), fill = factor(cyl))) + 
  geom_treemap() + geom_treemap_text() + 
  guides(fill = 'none') + paletteer::scale_fill_paletteer_d('colorblindr::OkabeIto')

Line Graphs

  • Line graphs are good for one-to-one relationships between the x and y axis
  • One x in, one y out → a bunch of x in, one line out
  • Almost always x is in consistent discrete jumps (e.g. 1 2 3, not 1 3 100)
  • Often, x is time
  • Change over time
  • When change occurred
  • How different groups change (multiple line graphs on the same axes)

Common Mistakes with Line Graphs

  • Too many lines
  • Lines that cross back and forth with each other and obscure each other a lot
  • Legends (label the lines directly instead! It’s easy with geom_text()). Fall back on a legend only if there are too many lines to label.
  • Huge range in the y-axis (or too little range)
  • Multiple y-axes
  • An unordered x-axis (don’t use a line graph for bar graph jobs!) - remember, the line implies continuity!
  • Curved lines can give illusion of continuity if it’s not there
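Here is one way the label-the-lines advice might look in practice, sketched with the built-in Orange data (trunk circumference of five trees over time). Placing each label at the line’s last observation is a choice made for this example, as are the names orange, line_labels, and p_line.

```r
library(dplyr)
library(ggplot2)

# Orange is a built-in dataset; convert it to a plain data frame first
orange <- as.data.frame(Orange)

# Keep only each tree's last observation to use as the label position
line_labels <- orange %>%
  group_by(Tree) %>%
  filter(age == max(age)) %>%
  ungroup()

p_line <- ggplot(orange, aes(x = age, y = circumference, color = Tree)) +
  geom_line() +
  geom_text(data = line_labels, aes(label = paste('Tree', Tree)),
            hjust = 0, nudge_x = 20) +
  # Leave room on the right for the labels, and drop the now-redundant legend
  scale_x_continuous(expand = expansion(mult = c(0.05, 0.15))) +
  guides(color = 'none') +
  theme_classic() +
  labs(x = 'Age (days)', y = 'Circumference (mm)')
p_line
```

With five lines this is still readable. With many more, the labels would start to collide, which is when a legend becomes the fallback.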

A Line Graph

Scatterplots

  • Scatterplots show the relationship between two continuous variables, plotting every observation
  • Use when that’s your goal, explaining how two variables move together AND you can see that relationship on the graph
  • The shape of the relationship between them (positive, negative, null, U-shaped, etc.) should be clear
  • Can also be handy for pointing out outliers (with annotation)
  • Adding best-fit lines: when? If you want to refer to its slope, or if the shape is still clear but could maybe use a little help, or to contrast a couple of groups
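A minimal sketch of a scatterplot with a best-fit line, again on mtcars. Using a straight OLS line (method = 'lm') is an assumption here; other shapes may fit other data better.

```r
library(ggplot2)

p_scatter <- ggplot(mtcars, aes(x = wt, y = mpg)) +
  geom_point() +
  # method = 'lm' draws a straight OLS line; se = FALSE drops the confidence band
  geom_smooth(method = 'lm', se = FALSE) +
  theme_classic() +
  labs(x = 'Weight (1000 lbs)', y = 'MPG')
p_scatter

# The slope you would refer to when talking about the line
wt_slope <- coef(lm(mpg ~ wt, data = mtcars))['wt']
```

The slope is negative here (heavier cars get fewer miles per gallon), and the downward shape is already visible in the points, so the line is helping the story rather than carrying it.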

Scatterplots

Common Scatterplot Mistakes

  • Expecting too much of the audience (scatterplots are easy to understand but it usually takes experience to walk away with the story from a scatterplot)
  • Doing it with too many points - likely either a confusing mess, or at least some data will be obscured
  • Non-continuous X or Y variables! geom_jitter() only helps a bit

Too many points!

Alternative for too many points

  • Density coloring with geom_bin2d(), although this requires some sophistication to read
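For a sense of what that looks like, here is a sketch using the diamonds data that ships with ggplot2, which has far too many rows for a readable scatterplot. The bin count of 50 and the viridis fill scale are arbitrary choices for this example.

```r
library(ggplot2)

# diamonds has tens of thousands of rows, so individual points would pile up
p_bin <- ggplot(diamonds, aes(x = carat, y = price)) +
  geom_bin2d(bins = 50) +    # count observations in a 50 x 50 grid of rectangles
  scale_fill_viridis_c() +   # shade each rectangle by its count
  theme_classic() +
  labs(x = 'Carat', y = 'Price', fill = 'Number of Diamonds')
p_bin
```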

Alternatives for Factors

  • Factor vs. continuous: ridge or violin density plots (next section)
  • Factor vs. factor: heatmaps as above, balloon plots
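Both alternatives can be sketched with mtcars. Treating cyl and gear as the factors is just for illustration, and a balloon plot would need an add-on package, so only the heatmap version is shown.

```r
library(dplyr)
library(ggplot2)

# Factor vs. continuous: one violin (a mirrored density) per category
p_violin <- ggplot(mtcars, aes(x = factor(cyl), y = mpg)) +
  geom_violin() +
  theme_classic() +
  labs(x = 'Number of Cylinders', y = 'MPG')

# Factor vs. factor: count each combination, then shade a tile by that count
cyl_gear <- mtcars %>% count(cyl, gear)
p_heat <- ggplot(cyl_gear, aes(x = factor(cyl), y = factor(gear), fill = n)) +
  geom_tile() +
  theme_classic() +
  labs(x = 'Number of Cylinders', y = 'Number of Gears', fill = 'Number of Cars')
```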

Tables

  • When individual values are important
  • We want things to be easy to look up
  • and CONCISE. Tables get messy fast, so trim out everything that isn’t essential

Bad Table

region group fertility ppgdp lifeExpF pctUrban infantMortality
Afghanistan Asia other 5.968000 499.0 49.49000 23 124.535000
Albania Europe other 1.525000 3677.2 80.40000 53 16.561000
Algeria Africa africa 2.142000 4473.0 75.00000 67 21.458000
American Samoa NA NA NA NA NA NA 11.293887
Angola Africa africa 5.135000 4321.9 53.17000 59 96.191000
Anguilla Caribbean other 2.000000 13750.1 81.10000 100 NA
Argentina Latin Amer other 2.172000 9162.1 79.89000 93 12.337000
Armenia Asia other 1.735000 3030.7 77.33000 64 24.272000
Aruba Caribbean other 1.671000 22851.5 77.75000 47 14.687000
Australia Oceania oecd 1.949000 57118.9 84.27000 89 4.455000
Austria Europe oecd 1.346000 45158.8 83.55000 68 3.713000
Azerbaijan Asia other 2.148000 5637.6 73.66000 52 37.566000
Bahamas Caribbean other 1.877000 22461.6 78.85000 84 14.135000
Bahrain Asia other 2.430000 18184.1 76.06000 89 6.663000
Bangladesh Asia other 2.157000 670.4 70.23000 29 41.786000
Barbados Caribbean other 1.575000 14497.3 80.26000 45 12.284000
Belarus Europe other 1.479000 5702.0 76.37000 75 6.494000
Belgium Europe oecd 1.835000 43814.8 82.81000 97 3.739000
Belize Latin Amer other 2.679000 4495.8 77.81000 53 16.200000
Benin Africa africa 5.078000 741.1 58.66000 42 76.674000
Bermuda Caribbean other 1.760000 92624.7 82.30000 100 NA
Bhutan Asia other 2.258000 2047.2 69.84000 35 37.995000
Bolivia Latin Amer other 3.229000 1977.9 69.40000 67 40.684000
Bosnia and Herzegovina Europe other 1.134000 4477.7 78.40000 49 12.695000
Botswana Africa africa 2.617000 7402.9 51.34000 62 35.117000
Brazil Latin Amer other 1.800000 10715.6 77.41000 87 19.016000
Brunei Darussalam Asia other 1.984000 32647.6 80.64000 76 4.529000
Bulgaria Europe other 1.546000 6365.1 77.12000 72 9.149000
Burkina Faso Africa africa 5.750000 519.7 57.02000 27 70.958000
Burundi Africa africa 4.051000 176.6 52.58000 11 94.083000
Cambodia Asia other 2.422000 797.2 65.10000 20 52.835000
Cameroon Africa africa 4.287000 1206.6 53.56000 59 84.915000
Canada North America oecd 1.691000 46360.9 83.49000 81 4.926000
Cape Verde Africa africa 2.279000 3244.0 77.70000 62 18.458000
Cayman Islands Caribbean other 1.600000 57047.9 83.80000 100 NA
Central African Republic Africa africa 4.423000 450.8 51.30000 39 95.781000
Chad Africa africa 5.737000 727.4 51.61000 28 123.940000
Channel Islands NA NA NA NA NA NA 8.169000
Chile Latin Amer oecd 1.832000 11887.7 82.35000 89 6.792000
China Asia other 1.559000 4354.0 75.61000 48 19.637000
Colombia Latin Amer other 2.293000 6222.8 77.69000 75 16.671000
Comoros Africa africa 4.742000 736.6 63.18000 28 62.830000
Congo Africa africa 4.442000 2665.1 59.33000 63 66.738000
Cook Islands Oceania other 2.530806 12212.1 76.24547 76 11.551788
Costa Rica Latin Amer other 1.812000 7703.8 81.99000 65 9.172000
Cote dIvoire Africa africa 4.224000 1154.1 57.71000 51 68.845000
Croatia Europe other 1.501000 13819.5 80.37000 58 5.571000
Cuba Caribbean other 1.451000 5704.4 81.33000 75 4.959000
Cyprus Asia other 1.458000 28364.3 82.14000 71 4.434000
Czech Republic Europe oecd 1.501000 18838.8 81.00000 74 2.997000
Democratic Republic of the Congo Africa africa 5.485000 200.6 50.56000 36 109.477000
Denmark Europe oecd 1.885000 55830.2 81.37000 87 3.914000
Djibouti Africa africa 3.589000 1282.6 60.04000 76 74.950000
Dominica Caribbean other 3.000000 7020.8 78.20000 67 NA
Dominican Republic Caribbean other 2.490000 5195.4 76.57000 70 21.589000
Timor Leste Asia other 5.918000 706.1 64.20000 29 56.499000
Ecuador Latin Amer other 2.393000 4072.6 78.91000 68 19.070000
Egypt Africa africa 2.636000 2653.7 75.52000 44 22.029000
El Salvador Latin Amer other 2.171000 3425.6 77.09000 65 19.007000
Equatorial Guinea Africa africa 4.980000 16852.4 52.91000 40 93.315000
Eritrea Africa africa 4.243000 429.1 64.41000 22 47.508000
Estonia Europe oecd 1.702000 14135.4 79.95000 70 4.382000
Ethiopia Africa africa 3.848000 324.6 61.59000 17 62.902000
Fiji Oceania other 2.602000 3545.7 72.27000 52 17.216000
Finland Europe oecd 1.875000 44501.7 83.28000 85 2.783000
France Europe oecd 1.987000 39545.9 84.90000 86 3.345000
French Guiana NA NA NA NA NA NA 12.714000
French Polynesia Oceania other 2.033000 24669.0 78.07000 51 7.159000
Gabon Africa africa 3.195000 12468.8 64.32000 86 43.770000
Gambia Africa africa 4.689000 579.1 60.30000 59 66.374000
Georgia Asia other 1.528000 2680.3 77.31000 53 25.585000
Germany Europe oecd 1.457000 39857.1 82.99000 74 3.487000
Ghana Africa africa 3.988000 1333.2 65.80000 52 43.867000
Greece Europe oecd 1.540000 26503.8 82.58000 62 4.488000
Greenland NorthAtlantic other 2.217000 35292.7 71.60000 84 NA
Grenada Caribbean other 2.171000 7429.0 77.72000 40 13.042000
Guadeloupe NA NA NA NA NA NA 6.725000
Guam NA NA NA NA NA NA 8.070000
Guatemala Latin Amer other 3.840000 2882.3 75.10000 50 26.269000
Guinea Africa africa 5.032000 427.5 56.39000 36 84.176000
Guinea-Bissau Africa africa 4.877000 539.4 50.40000 30 109.818000
Guyana Latin Amer other 2.190000 2996.0 73.45000 29 36.830000
Haiti Caribbean other 3.159000 612.7 63.87000 54 58.260000
Honduras Latin Amer other 2.996000 2026.2 75.92000 52 23.515000
Hong Kong Asia other 1.137000 31823.7 86.35000 100 2.026000
Hungary Europe oecd 1.430000 12884.0 78.47000 68 5.304000
Iceland Europe other 2.098000 39278.0 83.77000 94 2.057000
India Asia other 2.538000 1406.4 67.62000 30 47.894000
Indonesia Asia other 2.055000 2949.3 71.80000 45 24.929000
Iran Asia other 1.587000 5227.1 75.28000 71 23.385000
Iraq Asia other 4.535000 888.5 72.60000 66 33.293000
Ireland Europe oecd 2.097000 46220.3 83.17000 62 3.859000
Israel Asia oecd 2.909000 29311.6 84.19000 92 3.347000
Italy Europe oecd 1.476000 33877.1 84.62000 69 3.417000
Jamaica Caribbean other 2.262000 4899.0 75.98000 52 22.023000
Japan Asia oecd 1.418000 43140.9 87.12000 67 2.549000
Jordan Asia other 2.889000 4445.3 75.17000 79 19.140000
Kazakhstan Asia other 2.481000 9166.7 72.84000 59 23.716000
Kenya Africa africa 4.623000 801.8 59.16000 23 58.142000
Kiribati Oceania other 3.500000 1468.2 63.10000 44 52.000000
Kuwait Asia other 2.251000 45430.4 75.89000 98 7.563000
Kyrgyzstan Asia other 2.621000 865.4 72.36000 35 32.765000
Laos Asia other 2.543000 1047.6 69.42000 34 36.809000
Latvia Europe other 1.506000 10663.0 78.51000 68 6.700000
Lebanon Asia other 1.764000 9283.7 75.07000 87 20.223000
Lesotho Africa africa 3.051000 980.7 48.11000 28 62.103000
Liberia Africa africa 5.038000 218.6 58.59000 48 76.853000
Libya Africa africa 2.410000 11320.8 77.86000 78 13.248000
Lithuania Europe other 1.495000 10975.5 78.28000 67 5.941000
Luxembourg Europe oecd 1.683000 105095.4 82.67000 85 2.289000
Macao Asia other 1.163000 49990.2 83.80000 100 4.130000
Madagascar Africa africa 4.493000 421.9 68.61000 31 41.030000
Malawi Africa africa 5.968000 357.4 55.17000 20 86.060000
Malaysia Asia other 2.572000 8372.8 76.86000 73 6.880000
Maldives Asia other 1.668000 4684.5 78.70000 41 8.070000
Mali Africa africa 6.117000 598.8 53.14000 37 92.206000
Malta Europe other 1.284000 19599.2 82.29000 95 5.405000
Marshall Islands Oceania other 4.384466 3069.4 70.60000 72 21.000000
Martinique NA NA NA NA NA NA 7.158000
Mauritania Africa africa 4.361000 1131.1 60.95000 42 69.930000
Mauritius Africa africa 1.590000 7488.3 76.89000 42 12.112000
Mayotte NA NA NA NA NA NA 5.884000
Mexico Latin Amer oecd 2.227000 9100.7 79.64000 78 14.146000
Micronesia Oceania other 3.307000 2678.2 70.17000 23 31.447000
Moldova Europe other 1.450000 1625.8 73.48000 48 14.344000
Mongolia Asia other 2.446000 2246.7 72.83000 63 30.705000
Montenegro Europe other 1.630000 6509.8 77.37000 61 7.733000
Morocco Africa africa 2.183000 2865.0 74.86000 59 28.502000
Mozambique Africa africa 4.713000 407.5 51.81000 39 77.858000
Myanmar Asia other 1.939000 876.2 67.87000 34 44.802000
Namibia Africa africa 3.055000 5124.7 63.04000 39 29.761000
Nauru Oceania other 3.300000 6190.1 57.10000 100 45.800000
Nepal Asia other 2.587000 534.7 70.05000 19 32.013000
Neth Antilles Caribbean other 1.900000 20321.1 79.86000 93 12.281000
Netherlands Europe oecd 1.794000 46909.7 82.79000 83 4.168000
New Caledonia Oceania other 2.091000 35319.5 80.49000 57 4.680000
New Zealand Oceania oecd 2.135000 32372.1 82.77000 86 4.757000
Nicaragua Latin Amer other 2.500000 1131.9 77.45000 58 18.315000
Niger Africa africa 6.925000 357.7 55.77000 17 85.820000
Nigeria Africa africa 5.431000 1239.8 53.38000 51 87.561000
Niue NA NA NA NA NA NA 7.800000
North Korea Asia other 1.988000 504.0 72.12000 60 25.053000
Northern Mariana Islands NA NA NA NA NA NA 4.859087
Norway Europe oecd 1.948000 84588.7 83.47000 80 2.940000
Oman Asia other 2.146000 20791.0 76.44000 73 8.414000
Pakistan Asia other 3.201000 1003.2 66.88000 36 65.724000
Palau Oceania other 2.000000 10821.8 72.10000 84 20.075282
Palestinian Territory Asia other 4.270000 1819.5 74.81000 74 19.503000
Panama Latin Amer other 2.409000 7614.0 79.07000 75 16.168000
Papua New Guinea Oceania other 3.799000 1428.4 65.52000 13 44.474000
Paraguay Latin Amer other 2.858000 2771.1 74.91000 62 27.375000
Peru Latin Amer other 2.410000 5410.7 76.90000 77 18.273000
Philippines Asia other 3.050000 2140.1 72.57000 49 20.886000
Poland Europe oecd 1.415000 12263.2 80.56000 61 5.546000
Portugal Europe oecd 1.312000 21437.6 82.76000 61 4.175000
Puerto Rico Caribbean other 1.757000 26461.0 83.20000 99 7.243000
Qatar Asia other 2.204000 72397.9 78.24000 96 8.195000
Republic of Korea Asia other 1.389000 21052.2 83.95000 83 3.647000
Reunion NA NA NA NA NA NA 5.884000
Romania Europe other 1.428000 7522.4 77.95000 58 12.216000
Russian Federation Europe other 1.529000 10351.4 75.01000 73 10.534000
Rwanda Africa africa 5.282000 532.3 57.13000 19 92.870000
Saint Lucia Caribbean other 1.907000 6677.1 77.54000 28 12.260000
Samoa Oceania other 3.763000 3343.3 76.02000 20 19.848000
Sao Tome and Principe Africa africa 3.488000 1283.3 66.48000 63 47.486000
Saudi Arabia Asia other 2.639000 15835.9 75.57000 82 16.202000
Senegal Africa africa 4.605000 1032.7 60.92000 43 49.802000
Serbia Europe other 1.562000 5123.2 77.05000 56 10.630000
Seychelles Africa africa 2.340000 11450.6 78.00000 56 NA
Sierra Leone Africa africa 4.728000 351.7 48.87000 39 103.459000
Singapore Asia other 1.367000 43783.1 83.71000 100 1.916000
Slovakia Europe oecd 1.372000 15976.0 79.53000 55 5.676000
Slovenia Europe oecd 1.477000 23109.8 82.84000 49 3.279000
Solomon Islands Oceania other 4.041000 1193.5 70.00000 19 34.569000
Somalia Africa africa 6.283000 114.8 53.38000 38 100.017000
South Africa Africa africa 2.383000 7254.8 54.09000 62 45.892000
Spain Europe other 1.504000 30542.8 84.76000 78 3.573000
Sri Lanka Asia other 2.235000 2375.3 78.40000 14 11.213000
St Vincent and Grenadines Caribbean other 1.995000 6171.7 74.73000 50 20.974000
Sudan Africa africa 4.225000 1824.9 63.82000 41 57.328000
Suriname Latin Amer other 2.266000 7018.0 74.18000 70 19.775000
Swaziland Africa africa 3.174000 3311.2 48.54000 21 64.622000
Sweden Europe oecd 1.925000 48906.2 83.65000 85 2.544000
Switzerland Europe oecd 1.536000 68880.2 84.71000 74 3.513000
Syria Asia other 2.772000 2931.5 77.72000 56 13.764000
Tajikistan Asia other 3.162000 816.0 71.23000 26 50.947000
Tanzania Africa africa 5.499000 516.0 60.31000 27 53.658000
TFYR Macedonia Europe other 1.397000 4434.5 77.14000 59 13.063000
Thailand Asia other 1.528000 4612.8 77.76000 34 11.398000
Togo Africa africa 3.864000 524.6 59.40000 44 67.297000
Tokelau NA NA NA NA NA NA 31.250000
Tonga Oceania other 3.783000 3543.1 75.38000 24 20.591000
Trinidad and Tobago Caribbean other 1.632000 15205.1 73.82000 14 24.458000
Tunisia Africa africa 1.909000 4222.1 77.05000 68 18.384000
Turkey Asia oecd 2.022000 10095.1 76.61000 70 19.901000
Turkmenistan Asia other 2.316000 4587.5 69.40000 50 48.797000
Tuvalu Oceania other 3.700000 3187.2 65.10000 51 17.322835
Uganda Africa africa 5.901000 509.0 55.44000 13 72.265000
Ukraine Europe other 1.483000 3035.0 74.58000 69 11.822000
United Arab Emirates Asia other 1.707000 39624.7 78.02000 84 6.608000
United Kingdom Europe oecd 1.867000 36326.8 82.42000 80 4.702000
United States North America oecd 2.077000 46545.9 81.31000 83 6.460000
United States Virgin Islands NA NA NA NA NA NA 9.990000
Uruguay Latin Amer other 2.043000 11952.4 80.66000 93 11.754000
Uzbekistan Asia other 2.264000 1427.3 71.90000 36 44.481000
Vanuatu Oceania other 3.750000 2963.5 73.58000 26 24.135000
Venezuela Latin Amer other 2.391000 13502.7 77.73000 94 15.278000
Viet Nam Asia other 1.750000 1182.7 77.44000 31 18.263000
Wallis and Futuna Islands NA NA NA NA NA NA 5.200000
Western Sahara NA NA NA NA NA NA 36.350000
Yemen Asia other 4.938000 1437.2 67.66000 32 44.412000
Zambia Africa africa 6.300000 1237.8 50.04000 36 80.956000
Zimbabwe Africa africa 3.109000 573.1 52.72000 39 47.284000

Table Fixes

  • Tables for a non-specialist audience should only have a few variables. Usually 1-2 “identifying” columns (state, year, rank, etc.) and 1-2 “data” columns
  • Make sure the numbers look nice, no eight digits after the decimal; add things like % signs and $ signs.
  • In R you can often do this with the scales package
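As a small sketch of the scales approach, here are two rows’ worth of values from the bad table (Bermuda and the Cayman Islands). The accuracy settings are choices made for this example.

```r
library(scales)

ppgdp    <- c(92624.7, 57047.9)  # per-capita GDP for Bermuda and the Cayman Islands
pctUrban <- c(100, 100)          # already on a 0-100 scale, hence scale = 1 below

gdp_nice   <- dollar(ppgdp, accuracy = 0.01)              # "$92,624.70" "$57,047.90"
urban_nice <- percent(pctUrban, scale = 1, accuracy = 1)  # "100%" "100%"
```

These functions return character strings, so apply them as the last step, after any sorting or filtering on the underlying numbers.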

Easy to Read

  • Short: Nobody likes to scroll through an enormous table
  • Consider not showing ALL the data here, maybe just a “top ten” if that’s relevant
  • If all the data really is necessary, try to make its presentation shorter and encourage scrolling while everything else stays in place
  • Consider “striping” to make different rows easy to distinguish with kableExtra::kable_styling(bootstrap_options = 'striped')
  • More coloring and customization with ggtexttable in ggpubr (note this makes the table as an image)
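A rough sketch of a short, striped table. The five rows are typed in by hand from values in the full table, already formatted as text, and format = 'html' is an assumption about where the table will be shown.

```r
library(knitr)
library(kableExtra)

# Five rows taken from the full table, already formatted for display
top_five <- data.frame(
  Country  = c('Bermuda', 'Cayman Islands', 'Macao', 'Singapore', 'Hong Kong'),
  ppgdp    = c('$92,624.70', '$57,047.90', '$49,990.20', '$43,783.10', '$31,823.70'),
  pctUrban = c('100%', '100%', '100%', '100%', '100%')
)

# kable() builds the table with readable column names;
# kable_styling() adds the alternating row shading
tab_plain <- kable(top_five, format = 'html',
                   col.names = c('Country', 'Per-Capita GDP', 'Percent Urban'),
                   caption = 'Top Five Opportunities for Expansion')
tab <- kable_styling(tab_plain, bootstrap_options = 'striped')
tab
```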

Easy to Read

Top Five Opportunities for Expansion
Country         Per-Capita GDP  Percent Urban
Bermuda         $92,624.70      100%
Cayman Islands  $57,047.90      100%
Macao           $49,990.20      100%
Singapore       $43,783.10      100%
Hong Kong       $31,823.70      100%
  • This could be even better - generally you want as few borders as possible (but RMarkdown is not cooperating)

Hall of Shame

  • Pie charts: not the worst but rarely better than a bar chart or treemap. We just don’t distinguish angles that well. Works best with ONLY one category you care about, and with percentages labeled. At least people understand them
  • Word clouds: I have no idea what anyone would learn from one of these. Looks pretty?
  • Bubble charts: carry little info in a lot of space, and in very hard-to-see ways (compare circle areas?). Tableau recommends them for some reason.

Let’s evaluate

How about density plots?

  • geom_density(), geom_histogram(), and geom_dotplot(), plus extras like geom_violin() and the ggridges package
  • What would these work well for and what would they not?
  • What problems might we run into?

Let’s evaluate

  • How about heatmaps on actual maps? Same question
Map-based state heatmap

Let’s evaluate

  • And what would a version like this solve for us?
Hex-based state map

Practice

  • Download this file of robocall complaints to the FCC: https://www.kaggle.com/fcc/robocall-complaints
  • Think of a basic story (it’s ok if it’s not stellar)
  • Think about the best geometry to tell that story
  • Sketch a graph to tell that story as clearly as possible
  • If you have time, make it
  • You can load it with read_csv() and, if necessary, use ymd() from lubridate to turn the date column into a date
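The loading step might look something like this. Since the real file’s name and column headers aren’t shown here, the sketch writes two made-up rows to a temporary CSV, and the column names complaint_id and date_received are assumptions to swap for the real ones.

```r
library(readr)
library(lubridate)

# Stand-in for the downloaded file: two made-up rows in a temporary CSV.
# The column names are assumptions, not the real file's headers.
tmp <- tempfile(fileext = '.csv')
writeLines(c('complaint_id,date_received', '1,20200115', '2,20200203'), tmp)

# Read the date column as text so ymd() does the conversion explicitly
complaints <- read_csv(tmp,
                       col_types = cols(complaint_id  = col_double(),
                                        date_received = col_character()))
complaints$date_received <- ymd(complaints$date_received)
class(complaints$date_received)
```

If the dates in the real file turn out to be in a different order (month first, say), lubridate has matching functions such as mdy() and dmy().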

HEY HEY HEY

Downloadable Geometries

  • There are so, so many ggplot2 geometries available
  • Way too many to cover - see the ggplot2 extension gallery in addition to the base-ggplot2 cheat sheet from last time
  • I will go over some popular ones. Google package names for more detailed explanations
  • We won’t go deeply into how to use these, this is more to show you they exist and an example so you can look further yourself if you’re interested

geom_text to geom_text_repel

  • We covered geom_text last time
  • But the text gets in the way! How did I handle this in that mtcars graph I just did?

geom_text_repel

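library(tidyverse)  # for %>%, mutate(), and ggplot()
library(ggrepel)    # for geom_text_repel()
data(mtcars)
mtcars <- mtcars %>%
  mutate(Transmission = factor(am, labels = c('Automatic','Manual')),
         CarName = row.names(mtcars))
# geom_text_repel() pushes overlapping labels away from each other
ggplot(mtcars, aes(x = mpg, y = hp, color = Transmission, label = CarName)) + 
  geom_text_repel() + labs(x = 'MPG', y = 'HP')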

ggmap

  • ggplot2 has extensive mapping capabilities
  • Including straightforward-ish map animations with the tmap package
  • Also see the packages choroplethr, tigris, tmap, sf, fiftystater

gganimate

  • The gganimate package makes it fairly straightforward to animate your graphs
  • Take a graph and add a transition_ function (like transition_time()) to map one variable to time in the animation
  • Make sure this actually adds something! This limits where the graph can be shown

gganimate

library(tidyverse)  # for ggplot()
library(gapminder)  # provides the gapminder data
library(gganimate)
data(gapminder)
options(gganimate.dev_args = list(width = 650, height = 400))
p <- ggplot(gapminder,
       aes(x = gdpPercap, y = lifeExp, color = continent)) + 
  geom_point() + 
  scale_x_log10() + 
  labs(x = "GDP per Capita (log scale)", y = "Life Expectancy",
       title = "GDP and Life Expectancy by Country, 1952-2007",
       color = 'Continent') + 
  transition_time(year)  # step the animation through the years
animate(p, nframes = 200, end_pause = 30)

gganimate