3

Today was the final day for an On Demand event I adminned. We got some data back from the provider today. Vendors bought in at different tiers, and only one T1 was allowed because they're a sponsor. The higher the Tier number, the fewer graphics options and later one's exhibit appears.

I applied a facet_grid to the data to separate by Tier. I'd like to add a trend line or at least three hlines (mean, mean ± stdev) to the graphic that sorts by position from the start of the exhibition hall to illustrate that the later the position, the less likely a vendor will be visited.

This code

Interactions %>% 
    ggplot(mapping = aes(fill=Tier, color=Tier, x=reorder(`Booth Name`, -BoothOrder, max), y=`TotalInteractions`)) + 
        scale_fill_manual(values=GraphColors) +
        geom_col() + 
        geom_text(aes(label=TotalInteractions), color='black', nudge_y = 10) + 
        facet_grid(Tier ~ ., scales = 'free_y', space = 'free_y', drop = T) +
        xlab(label = 'Booth Name') + 
        ylab(label = 'Total Interactions, grouped by Tier, ordered by Booth Order in Exhibit Hall') +
        coord_flip()

begets this graphic (vendor names anonymized).

graphic , grouped by tier, ordered by static booth position

I'd like in each facet panel to show the mean and stdev and/or a trend line illustrating the likelihood of an attendee visiting a vendor which is at the end of the virtual exhibition hall. This would be comparable to a live event having someone off in a corner, furthest from the entrance or a main attraction.

Do I need to add more geom_somethings or facet differently?

Here's the redacted dput if anyone wants it.

structure(list(`Booth Name` = structure(c("6066eecb44", "da7e90874c", 
"76f9149b67", "ce285d23a7", "6e38489fe3", "eef7ae4fb6", "c171400d47", 
"29cadfb808", "16d463a501", "06aed259dd", "5c3ed6d72e", "6196941184", 
"8ad3ea5fa4", "98a8388f89", "b2f06f4240", "7034dda2fa", "da004a8aed", 
"da317748e2", "ffd775a22b", "461ac5053c", "45a2dc3ba8", "9e28ff5dd5", 
"23c6d72b14", "83a776083d", "3c13b35d6b", "83152ac13a", "9a1a86885c", 
"c1599dec43", "2bb225f0ba", "b6f9b29b5e", "7cfe83e072", "717bfc4838", 
"e213328e22", "c9af37768a", "122d80d313", "701a01a7d6", "cb2e52e25a", 
"0214e13085", "47f08bcef3", "7ace29dd27", "e8ecf5ceff", "d8eb53a6b0"
), class = c("hash", "md5")), Tier = structure(c(1L, 2L, 2L, 
2L, 2L, 2L, 2L, 3L, 3L, 3L, 3L, 3L, 3L, 4L, 4L, 4L, 4L, 4L, 4L, 
4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 
5L, 5L, 5L, 5L, 5L, 5L, 5L), .Label = c("1", "2", "3", "4", "5"
), class = "factor"), `Total Booth Visits` = c(187, 137, 101, 
198, 107, 109, 119, 119, 95, 90, 191, 157, 51, 146, 97, 131, 
62, 62, 161, 98, 54, 68, 67, 202, 274, 47, 100, 97, 135, 73, 
74, 109, 68, 69, 79, 154, 45, 55, 38, 15, 73, 98), `Unique Booth Visits` = c(133, 
112, 87, 137, 84, 99, 101, 102, 79, 75, 155, 133, 46, 111, 78, 
110, 58, 54, 133, 80, 51, 57, 61, 156, 205, 40, 83, 82, 108, 
65, 65, 95, 63, 60, 73, 125, 41, 43, 36, 13, 64, 88), `Documents Clicked` = c(7, 
0, 3, 4, 0, 9, 20, 8, 5, 0, 6, 4, 2, 0, 1, 7, 9, 2, 0, 0, 0, 
0, 0, 12, 12, 0, 0, 0, 1, 11, 6, 0, 0, 0, 0, 14, 0, 3, 0, 0, 
13, 0), `Videos Viewed` = c(2, 0, 24, 9, 20, 13, 0, 0, 2, 0, 
10, 0, 5, 0, 6, 0, 6, 13, 0, 0, 0, 18, 0, 11, 20, 2, 0, 0, 0, 
0, 0, 0, 6, 5, 6, 28, 0, 0, 0, 0, 0, 0), `Tabs Clicked` = c(53, 
6, 8, 13, 14, 12, 11, 10, 17, 4, 30, 37, 7, 34, 13, 4, 8, 3, 
18, 36, 5, 19, 6, 25, 50, 6, 28, 16, 5, 1, 2, 27, 9, 14, 11, 
43, 4, 2, 1, 0, 7, 26), BoothOrder = c(1, 11, 4, 2, 14, 5, 8, 
6, 15, 17, 10, 9, 38, 22, 23, 13, 29, 25, 7, 30, 33, 34, 36, 
12, 3, 39, 40, 19, 16, 20, 21, 24, 26, 27, 28, 18, 32, 31, 35, 
42, 37, 41), DupeVisits = c(54, 25, 14, 61, 23, 10, 18, 17, 16, 
15, 36, 24, 5, 35, 19, 21, 4, 8, 28, 18, 3, 11, 6, 46, 69, 7, 
17, 15, 27, 8, 9, 14, 5, 9, 6, 29, 4, 12, 2, 2, 9, 10), TotalInteractions = c(249, 
143, 136, 224, 141, 143, 150, 137, 119, 94, 237, 198, 65, 180, 
117, 142, 85, 80, 179, 134, 59, 105, 73, 250, 356, 55, 128, 113, 
141, 85, 82, 136, 83, 88, 96, 239, 49, 60, 39, 15, 93, 124)), row.names = c(NA, 
-42L), class = c("tbl_df", "tbl", "data.frame"))
desertnaut
  • 1,908
  • 2
  • 13
  • 23
user208145
  • 133
  • 3
  • 3
    There's no issue with your question but I think you're more likely to get an answer if you ask it on https://stackoverflow.com/ – Erwan Oct 11 '20 at 14:51
  • 2
    @BenjiAlbert Thanks for picking up on the edit. I don't mind the rejection, but please consider reviewing the edits more closely. I have removed RStudio from tags and title because this has nothing to do with the IDE. – M-- Oct 11 '20 at 20:38
  • @M-- my apologies, you are right. Thank you for alerting me to this and for providing a good answer to this question. – Ben Oct 12 '20 at 17:14

1 Answers1

3

What you can do is creating another dataset with the mean and mean±stdev grouped by Tier. Then you can use that in geom_hline and ggplot will pick up facets from Tier. See below;

library(dplyr)
library(tidyr)
library(ggplot2)
Interactions %>% 
  group_by(Tier) %>% 
  summarise(mean = mean(TotalInteractions),
            `mean-stdev` = sum(mean(TotalInteractions), -sd(TotalInteractions), na.rm = T),
            `mean+stdev` = sum(mean(TotalInteractions), +sd(TotalInteractions), na.rm = T)) %>% 
  pivot_longer(-Tier) -> Int_hline
Interactions %>% 
  ggplot(aes(fill=Tier, color=Tier, 
             x=reorder(`Booth Name`, -BoothOrder, max), y=`TotalInteractions`)) + 
#  scale_fill_manual(values=GraphColors) + ## commented-out since we don't have GraphColors
  geom_col() + 
  geom_hline(data = Int_hline, aes(yintercept = value)) +
  geom_text(aes(label=TotalInteractions), color='black', nudge_y = 10) + 
  facet_grid(Tier ~ ., scales = 'free_y', space = 'free_y', drop = T) +
  xlab(label = 'Booth Name') + 
  ylab(label = 'Total Interactions, grouped by Tier, ordered by Booth Order in Exhibit Hall') +
  coord_flip()

Created on 2020-10-11 by the reprex package (v0.3.0)

M--
  • 300
  • 2
  • 12