Hands-on Exercise 3a

Author

Li Jiayi

Published

January 21, 2024

Modified

February 24, 2024


1. Getting started

Installing and loading the required libraries

This exercise uses the ggiraph, plotly, DT, patchwork and tidyverse packages. The newly introduced packages are summarised in the next section.

pacman::p_load(ggiraph, plotly, 
               patchwork, DT, tidyverse) 

Importing data

The code chunk below imports the data.

exam_data <- read_csv("data/Exam_data.csv")

2. A summary

In this exercise, besides tidyverse, three new R packages will be used. They are:

  • ggiraph for making ‘ggplot’ graphics interactive.

  • plotly, an R library for plotting interactive statistical graphs.

  • DT, which provides an R interface to the JavaScript library DataTables for creating interactive tables on HTML pages.

3. Dive into packages

ggiraph

ggiraph is an htmlwidget and a ggplot2 extension that makes ggplot graphics interactive. Interactivity is added through ggplot geometries that understand three arguments:

  • Tooltip: a column of the dataset that contains tooltips to be displayed when the mouse is over elements.

  • Onclick: a column of the dataset that contains a JavaScript function to be executed when elements are clicked.

  • Data_id: a column of the dataset that contains an id to be associated with elements.

If it is used within a shiny application, elements associated with an id (data_id) can be selected and manipulated on the client and server sides. Refer to this article for a more detailed explanation.
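As a minimal sketch of how the three arguments fit together, the chunk below maps all of them in a single interactive geom. The data frame `toy`, its columns, and the `alert()` call are made up for illustration and are not part of the exam dataset used in this exercise.

```r
library(ggplot2)
library(ggiraph)

# Made-up data for illustration only (not the exam dataset).
toy <- data.frame(
  id    = c("A", "B", "C"),
  x     = c(1, 2, 3),
  y     = c(2, 4, 3),
  group = c("g1", "g1", "g2")
)

# JavaScript to run on click; alert() is only a harmless placeholder action.
toy$js <- sprintf("alert(\"%s\")", toy$id)

p <- ggplot(toy, aes(x = x, y = y)) +
  geom_point_interactive(
    aes(tooltip = id,      # text shown when the mouse is over a point
        data_id = group,   # points sharing an id are highlighted together
        onclick = js),     # JavaScript executed when a point is clicked
    size = 4
  )

# girafe() turns the ggplot object into an interactive htmlwidget.
widget <- girafe(ggobj = p)
```

Printing `widget` in a Quarto or R Markdown chunk should display the interactive plot.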

Tooltip

The code chunk below shows a typical way to plot an interactive statistical graph by using the ggiraph package. Notice that the code chunk consists of two parts. First, a ggplot object is created. Next, girafe() of ggiraph is used to create an interactive svg object.

p <- ggplot(data=exam_data, 
       aes(x = MATHS)) +
  geom_dotplot_interactive(
    aes(tooltip = ID),
    stackgroups = TRUE, 
    binwidth = 1, 
    method = "histodot") +
  scale_y_continuous(NULL, 
                     breaks = NULL)
girafe(
  ggobj = p,
  width_svg = 6,
  height_svg = 6*0.618
)

Notice that two steps are involved.

  • an interactive version of ggplot2 geom (i.e. geom_dotplot_interactive()) will be used to create the basic graph.

  • girafe() will be used to generate an svg object to be displayed on an html page.

By hovering the mouse pointer over a data point of interest, the student’s ID will be displayed.

The content of the tooltip can be customized by including a list object as shown in the code chunk below.

exam_data$tooltip <- c(paste0(     
  "Name = ", exam_data$ID,         
  "\n Class = ", exam_data$CLASS)) 

p <- ggplot(data=exam_data, 
       aes(x = MATHS)) +
  geom_dotplot_interactive(
    aes(tooltip = tooltip), 
    stackgroups = TRUE,
    binwidth = 1,
    method = "histodot") +
  scale_y_continuous(NULL,               
                     breaks = NULL)
girafe(
  ggobj = p,
  width_svg = 8,
  height_svg = 8*0.618
)

The first three lines of code in the chunk create a new field called tooltip and populate it with text built from the ID and CLASS fields. This new field is then mapped to the tooltip aesthetic inside geom_dotplot_interactive().
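To see what the tooltip field actually holds, the string-building step can be tried on its own. This is a small sketch using base R only; the two student records are hypothetical stand-ins for rows of exam_data.

```r
# Hypothetical records standing in for rows of exam_data.
toy <- data.frame(
  ID    = c("Student001", "Student002"),
  CLASS = c("3A", "3B")
)

# paste0() is vectorised, so one call builds a tooltip string per row.
toy$tooltip <- paste0(
  "Name = ", toy$ID,
  "\n Class = ", toy$CLASS
)

# cat() renders the "\n" as a real line break, as the tooltip would.
cat(toy$tooltip[1])
```

Each row gets its own two-line label, which is why the column can be mapped directly to the tooltip aesthetic.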

The code chunk below uses opts_tooltip() of ggiraph to customize tooltip rendering by adding CSS declarations.

# CSS applied to the tooltip box (bold text is set with font-weight)
tooltip_css <- "background-color:white; font-weight:bold; color:black;"

p <- ggplot(data=exam_data, 
       aes(x = MATHS)) +
  geom_dotplot_interactive(
    aes(tooltip = ID),
    stackgroups = TRUE,
    binwidth = 1,
    method = "histodot") +
  scale_y_continuous(NULL,
                     breaks = NULL)
girafe(
  ggobj = p,
  width_svg = 6,
  height_svg = 6*0.618,
  options = list(
    opts_tooltip(    # pass the custom CSS to the tooltip
      css = tooltip_css))
)

The background colour of the tooltip is white and the font is black and bold.

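A tooltip can also display computed statistics. In the code chunk below, a helper function formats the mean and standard error of the maths scores for each race, and the result is shown in the tooltip.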
tooltip <- function(y, ymax, accuracy = .01) {
  mean <- scales::number(y, accuracy = accuracy)
  sem <- scales::number(ymax - y, accuracy = accuracy)
  paste("Mean maths scores:", mean, "+/-", sem)
}

gg_point <- ggplot(data=exam_data, 
                   aes(x = RACE)) +
  stat_summary(aes(y = MATHS, 
                   tooltip = after_stat(  
                     tooltip(y, ymax))),  
    fun.data = "mean_se", 
    geom = GeomInteractiveCol,  
    fill = "light blue"
  ) +
  stat_summary(aes(y = MATHS),
    fun.data = mean_se,
    geom = "errorbar", width = 0.2, linewidth = 0.2
  )

girafe(ggobj = gg_point,
       width_svg = 8,
       height_svg = 8*0.618)                                   
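The tooltip() helper receives y (the mean) and ymax (the mean plus one standard error) from mean_se, so ymax - y recovers the standard error. The sketch below reproduces that arithmetic by hand in base R. The four scores are made up, and sprintf() is used here as a stand-in for scales::number().

```r
# Made-up scores for illustration only.
scores <- c(50, 60, 70, 80)

# What mean_se computes: the mean, and the mean plus one standard error.
y    <- mean(scores)
sem  <- sd(scores) / sqrt(length(scores))
ymax <- y + sem

# Same layout as the tooltip() helper, with sprintf() instead of scales::number().
label <- paste(
  "Mean maths scores:", sprintf("%.2f", y),
  "+/-", sprintf("%.2f", ymax - y)
)
label
```

For these four values the label reads "Mean maths scores: 65.00 +/- 6.45".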

Hover

The code chunk below shows the second interactive feature of ggiraph, namely data_id.

Elements associated with a data_id (i.e. CLASS) will be highlighted upon mouse over (hovering effect).

Note that the default value of the hover CSS is hover_css = “fill:orange;”.

p <- ggplot(data=exam_data, 
       aes(x = MATHS)) +
  geom_dotplot_interactive(           
    aes(data_id = CLASS),             
    stackgroups = TRUE,               
    binwidth = 1,                        
    method = "histodot") +               
  scale_y_continuous(NULL,               
                     breaks = NULL)
girafe(                                  
  ggobj = p,                             
  width_svg = 6,                         
  height_svg = 6*0.618                      
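)

The variant below also maps tooltip to ID, so hovering over a dot highlights its class and shows the student’s ID.

p <- ggplot(data=exam_data, 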
       aes(x = MATHS)) +
  geom_dotplot_interactive(           
    aes(data_id = CLASS, tooltip = ID),             
    stackgroups = TRUE,               
    binwidth = 1,                        
    method = "histodot") +               
  scale_y_continuous(NULL,               
                     breaks = NULL)
girafe(                                  
  ggobj = p,                             
  width_svg = 6,                         
  height_svg = 6*0.618                      
)        

In the code chunk below, CSS code is used to change the highlighting effect.

Note: Unlike the previous example, in this example the CSS customisation is encoded directly in the option calls.

p <- ggplot(data=exam_data, 
       aes(x = MATHS)) +
  geom_dotplot_interactive(              
    aes(data_id = CLASS),              
    stackgroups = TRUE,                  
    binwidth = 1,                        
    method = "histodot") +               
  scale_y_continuous(NULL,               
                     breaks = NULL)
girafe(                                  
  ggobj = p,                             
  width_svg = 6,                         
  height_svg = 6*0.618,
  options = list(                        
    opts_hover(css = "fill: #202020;"),  
    opts_hover_inv(css = "opacity:0.2;") 
  )                                        
)                                                                                   

Combining tooltip and hover effect

There are times when we want to combine tooltip and hover effects on an interactive statistical graph, as shown in the code chunk below.

Interactivity: Elements associated with a data_id (i.e. CLASS) will be highlighted upon mouse over. At the same time, the tooltip will show the CLASS.

p <- ggplot(data=exam_data, 
       aes(x = MATHS)) +
  geom_dotplot_interactive(              
    aes(tooltip = CLASS, 
        data_id = CLASS),              
    stackgroups = TRUE,                  
    binwidth = 1,                        
    method = "histodot") +               
  scale_y_continuous(NULL,               
                     breaks = NULL)
girafe(                                  
  ggobj = p,                             
  width_svg = 6,                         
  height_svg = 6*0.618,
  options = list(                        
    opts_hover(css = "fill: #202020;"),  
    opts_hover_inv(css = "opacity:0.2;") 
  )                                        
)                                                                                                                

Click

The onclick argument of ggiraph provides hotlink interactivity on the web.

Interactivity: A web document linked to a data object will be displayed in the web browser upon mouse click.

Note that click actions must be stored in a string column of the dataset containing valid JavaScript instructions.

exam_data$onclick <- sprintf("window.open(\"%s%s\")",
"https://www.moe.gov.sg/schoolfinder?journey=Primary%20school",
as.character(exam_data$ID))

p <- ggplot(data=exam_data, 
       aes(x = MATHS)) +
  geom_dotplot_interactive(              
    aes(onclick = onclick),              
    stackgroups = TRUE,                  
    binwidth = 1,                        
    method = "histodot") +               
  scale_y_continuous(NULL,               
                     breaks = NULL)
girafe(                                  
  ggobj = p,                             
  width_svg = 6,                         
  height_svg = 6*0.618)                                                                                               
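It can help to inspect the JavaScript strings before mapping them to onclick. The sketch below rebuilds the same sprintf() call in base R with the URL from the chunk above. The two IDs are hypothetical, and whether the resulting address resolves to a meaningful page is not something this exercise establishes.

```r
base_url <- "https://www.moe.gov.sg/schoolfinder?journey=Primary%20school"

# Hypothetical IDs standing in for exam_data$ID.
ids <- c("Student001", "Student002")

# sprintf() is vectorised: base_url is recycled and one string is built per ID.
# The two %s placeholders are filled with the URL and the ID, back to back.
onclick <- sprintf("window.open(\"%s%s\")", base_url, ids)

# Show the JavaScript that would run when the first element is clicked.
cat(onclick[1])
```

Note that the ID is appended directly to the end of the URL with no separator, so the link opened is the base address followed immediately by the student ID.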

Coordinated multiple views methods have been implemented in the data visualisation below.

Notice that when a data point in one of the dotplots is selected, the data point with the corresponding ID in the second data visualisation will be highlighted too.

In order to build coordinated multiple views like this, the following programming strategy will be used:

  1. Appropriate interactive functions of ggiraph will be used to create the multiple views.

  2. patchwork function of patchwork package will be used inside girafe function to create the interactive coordinated multiple views.

The data_id aesthetic is critical to link observations between plots. The tooltip aesthetic is optional but nice to have when the mouse is over a point.

p1 <- ggplot(data=exam_data, 
       aes(x = MATHS)) +
  geom_dotplot_interactive(              
    aes(data_id = ID),              
    stackgroups = TRUE,                  
    binwidth = 1,                        
    method = "histodot") +  
  coord_cartesian(xlim=c(0,100)) + 
  scale_y_continuous(NULL,               
                     breaks = NULL)

p2 <- ggplot(data=exam_data, 
       aes(x = ENGLISH)) +
  geom_dotplot_interactive(              
    aes(data_id = ID),              
    stackgroups = TRUE,                  
    binwidth = 1,                        
    method = "histodot") + 
  coord_cartesian(xlim=c(0,100)) + 
  scale_y_continuous(NULL,               
                     breaks = NULL)

girafe(code = print(p1 + p2), 
       width_svg = 6,
       height_svg = 3,
       options = list(
         opts_hover(css = "fill: #202020;"),
         opts_hover_inv(css = "opacity:0.2;")
         )
       )                                                                          

Plotly

Plotly’s R graphing library creates interactive web graphics from ggplot2 graphs and/or through a custom interface to the (MIT-licensed) JavaScript library plotly.js, inspired by the grammar of graphics. Unlike other plotly platforms, plotly for R is free and open source.

There are two ways to create an interactive graph by using plotly:

  • by using plot_ly(), and

  • by using ggplotly()

The code chunk below plots a basic interactive scatter plot by using plot_ly().

plot_ly(data = exam_data, 
        x = ~MATHS, 
        y = ~ENGLISH)

In the code chunk below, the color argument is mapped to a qualitative visual variable (i.e. RACE).

plot_ly(data = exam_data, 
        x = ~ENGLISH, 
        y = ~MATHS, 
        color = ~RACE)                                                                                                               

The code chunk below plots an interactive scatter plot by using ggplotly().

p <- ggplot(data=exam_data, 
            aes(x = MATHS,
                y = ENGLISH)) +
  geom_point(size=1) +
  coord_cartesian(xlim=c(0,100),
                  ylim=c(0,100))
ggplotly(p)                                                                                                              

The creation of a coordinated linked plot by using plotly involves three steps:

  • highlight_key() of plotly package is used as shared data.

  • two scatterplots will be created by using ggplot2 functions.

  • lastly, subplot() of plotly package is used to place them next to each other side-by-side.

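Thing to learn from the code chunk: highlight_key() simply creates an object of class crosstalk::SharedData, which both scatterplots then use as their data.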

d <- highlight_key(exam_data)
p1 <- ggplot(data=d, 
            aes(x = MATHS,
                y = ENGLISH)) +
  geom_point(size=1) +
  coord_cartesian(xlim=c(0,100),
                  ylim=c(0,100))

p2 <- ggplot(data=d, 
            aes(x = MATHS,
                y = SCIENCE)) +
  geom_point(size=1) +
  coord_cartesian(xlim=c(0,100),
                  ylim=c(0,100))
subplot(ggplotly(p1),
        ggplotly(p2))                                                                                                              
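The role of highlight_key() can be checked on a small example. This is a sketch under stated assumptions: the data frame `toy` and its values are made up, and plot_ly() is used instead of ggplotly() only to keep the chunk short.

```r
library(plotly)

# Made-up scores for illustration only (not the exam dataset).
toy <- data.frame(
  ID      = c("S1", "S2", "S3"),
  MATHS   = c(40, 65, 90),
  ENGLISH = c(55, 60, 85),
  SCIENCE = c(35, 70, 80)
)

# Wrap the data once; both plots must use this same shared object to be linked.
shared <- highlight_key(toy)
class(shared)

p1 <- plot_ly(shared, x = ~MATHS, y = ~ENGLISH,
              type = "scatter", mode = "markers")
p2 <- plot_ly(shared, x = ~MATHS, y = ~SCIENCE,
              type = "scatter", mode = "markers")

# Place the two linked views side by side.
linked <- subplot(p1, p2)
```

If each plot were built from `toy` directly rather than from `shared`, the two views would render but selections would not be expected to carry across them.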

DT

  • A wrapper of the JavaScript library DataTables.

  • Data objects in R can be rendered as HTML tables using the JavaScript library ‘DataTables’ (typically via R Markdown or Shiny).

DT::datatable(exam_data, class= "compact")

Crosstalk is an add-on to the htmlwidgets package. It extends htmlwidgets with a set of classes, functions, and conventions for implementing cross-widget interactions (currently, linked brushing and filtering).

The code chunk below implements coordinated brushing between a scatter plot and a data table.

Things to learn from the code chunk:

  • highlight() is a function of plotly package. It sets a variety of options for brushing (i.e., highlighting) multiple plots. These options are primarily designed for linking multiple plotly graphs, and may not behave as expected when linking plotly to another htmlwidget package via crosstalk. In some cases, other htmlwidgets will respect these options, such as persistent selection in leaflet.

  • bscols() is a helper function of crosstalk package. It makes it easy to put HTML elements side by side. It can be called directly from the console but is especially designed to work in an R Markdown document. Warning: This will bring in all of Bootstrap!

d <- highlight_key(exam_data) 
p <- ggplot(d, 
            aes(ENGLISH, 
                MATHS)) + 
  geom_point(size=1) +
  coord_cartesian(xlim=c(0,100),
                  ylim=c(0,100))

gg <- highlight(ggplotly(p),        
                "plotly_selected")  

crosstalk::bscols(gg,               
                  DT::datatable(d), 
                  widths = 5)                                                                                                                 

4. Some Plotting Exercise

This section improves some of the plots from Hands-on Exercise 1 using the new interactive libraries.

Stacked Bar Chart of Race Distribution by Gender

p <- ggplot(data = exam_data, 
       aes(x = reorder(RACE, -table(RACE)[RACE]), fill = GENDER)) +
  geom_bar(position = "stack",
           alpha = 0.9) +
  geom_text(
    aes(label = after_stat(count)),
    stat = "count",
    position = position_stack(vjust = 0.5),
    size = 3,
    color = "white"
  ) +
  labs(title = "Race Distribution by Gender", x = "Race", y = "Number of Students") +
  theme_minimal() 

ggplotly(p) 

Boxplot of English Scores by Class

p <- ggplot(data = exam_data, 
       aes(x = CLASS, y = ENGLISH)) +
  geom_boxplot(fill = "#D1EEEE", color = "#7A8B8B") +
  geom_hline(yintercept = mean(exam_data$ENGLISH), linetype = "dashed", color = "#CD2626") +
  stat_summary(
    fun = mean, 
    geom = "point", 
    color = "#CD2626"
  ) +
  annotate(
    "text", 
    x = 1,  y = mean(exam_data$ENGLISH) + 2,
    label = paste("Avg:", round(mean(exam_data$ENGLISH), 2)),
    color = "#CD2626"
  ) +
  coord_cartesian(ylim = c(0, 100)) +
  labs(
    title = "English Scores by Class",
    x = "Class",
    y = "English Score"
  ) +
  theme_minimal()

ggplotly(p) 

Scatterplot of Math and Science Scores

p <- ggplot(data = exam_data,
       aes(x = MATHS, y = SCIENCE)) +
  geom_point(aes(color = GENDER), size = 1.5, alpha = 0.7) +
  geom_hline(yintercept = 50, linetype = "dashed", color = "gray") +  
  geom_vline(xintercept = 50, linetype = "dashed", color = "gray") +  
  geom_smooth(method = "lm", linewidth = 0.5) +      
  labs(
    title = "Correlation between Math and Science Scores",
    x = "Math Score",
    y = "Science Score"
  ) +
  coord_cartesian(xlim = c(0, 100), ylim = c(0, 100)) +
  theme_minimal()
ggplotly(p) 

Density Plot of English Scores by Class

# Density plot of ENGLISH scores by gender, faceted by class
p <- ggplot(data = exam_data, 
       aes(x = ENGLISH, fill = GENDER)) +
  geom_density(alpha = 0.5, color = "black", linewidth = 0.3) + 
  labs(title = "Distribution of English Scores by Class", x = "English Score") +
  theme_minimal() +
  facet_grid(CLASS ~ .) +
  theme(axis.text.y = element_blank(),
        axis.ticks.y = element_blank(),
        legend.text = element_text(size = 8),  
        legend.title = element_text(size = 8))

ggplotly(p) 

Reference

ggiraph

This link provides online version of the reference guide and several useful articles. Use this link to download the pdf version of the reference guide.

plotly for R