We can use this space to write out any exercise or prework that you would like to have done prior to coming.

Session 1

This plot is generated using ggplot2 (Wickham 2016).

library(ggplot2)
ggplot(iris, aes(Species, Sepal.Length))+
  geom_col()
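
One caveat about this plot: geom_col() draws one bar segment per row and stacks the segments, so with 50 flowers per species each bar's height is the sum of the sepal lengths rather than a typical value. The sketch below checks this with base R and, as one possible alternative that was not part of the original exercise, plots the per-species means instead. The names species_sums, species_means, and mean_plot are illustrative.

```r
# Bar heights in the plot above: sum of Sepal.Length within each species
species_sums <- aggregate(Sepal.Length ~ Species, data = iris, FUN = sum)
species_sums

# Possible alternative: summarise first, then plot one row per species
species_means <- aggregate(Sepal.Length ~ Species, data = iris, FUN = mean)
species_means

# Only build the plot if ggplot2 is installed
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  mean_plot <- ggplot(species_means, aes(Species, Sepal.Length)) +
    geom_col()
  # print(mean_plot) would draw it
}
```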

Session 2

Session 3

Follow-up work

Here I describe the follow-up work I did for Session 3, in which I imported a data set using the rio package, practiced creating visualisations of that data set with ggplot(), documented those efforts using R Markdown, and published that documentation to my GitHub website.

First, I loaded the tidyverse and rio packages for managing and importing data into R.

library(tidyverse)
library(rio)

Then I imported my data set. In this data set, I explore belief in harmless wrongs (i.e. the belief that certain behaviors are immoral even if they do not cause harm) in a sample of users of the website Amazon Mechanical Turk. I tested what proportion of people believe in harmless wrongs or, alternatively, believe that behaviors are only wrong insofar as they cause harm. I also asked people who believe in harmless wrongs to provide an example of a harmless wrong, and I measured various perceptions of those harmless wrongs (how immoral participants thought those behaviors were, how indicative those behaviors were of poor moral character, etc.).

HarmlessWrongs <- import("Study175bDatasav.sav")

After importing the data set, I checked its first ten rows using the head() function.

head(HarmlessWrongs, 10)

I am interested in why people consider harmless wrongs to be immoral. If these behaviors cause no harm, then what is unethical about them?

In my data set, I asked people who believe in harmless wrongs to supply an example of a harmless wrong. I also asked these people to indicate precisely how immoral they considered that behavior to be (1 = not at all; 7 = very), and I asked people to rate the behavior on several other dimensions. One of those dimensions was the degree to which participants view the behavior as indicative of poor moral character (1 = not at all; 7 = very much so).

I am interested in whether harmless wrongs are perceived as wrong because they are indicative of poor moral character. Therefore, I created a scatterplot of the relationship between perceptions of harmless wrongs as indicating poor moral character (on the X axis) and perceptions of harmless wrongs as immoral (on the Y axis). I used ggplot2’s geom_point() geometry.

ggplot(HarmlessWrongs, aes(x = HWpoorcharacter, y = HWHowImmoral)) +
  geom_point()

One limitation of the above visualization is that it does not show how many participants gave identical responses, because their points are drawn on top of one another. Therefore, I created a second visualization using geom_count() as my geometric object.

ggplot(HarmlessWrongs, aes(x = HWpoorcharacter, y = HWHowImmoral)) +
  geom_count()
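
To make concrete what geom_count() adds, here is a minimal base R sketch that tabulates how many respondents share each pair of ratings; geom_count() maps a count of this kind to point size. The data are simulated 1 to 7 ratings standing in for the real variables, and toy and pair_counts are illustrative names.

```r
# Simulated ratings standing in for HWpoorcharacter and HWHowImmoral
set.seed(1)
toy <- data.frame(
  HWpoorcharacter = sample(1:7, 200, replace = TRUE),
  HWHowImmoral    = sample(1:7, 200, replace = TRUE)
)

# Count the respondents at each (x, y) combination
pair_counts <- as.data.frame(table(toy), responseName = "n")

# Keep only combinations that actually occur, most frequent first
pair_counts <- pair_counts[pair_counts$n > 0, ]
pair_counts <- pair_counts[order(-pair_counts$n), ]
head(pair_counts)
```

With 200 respondents and at most 49 possible combinations, many points in a plain scatterplot necessarily sit on top of one another.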

I also created a box plot visualization.

ggplot(HarmlessWrongs, aes(x = factor(HWpoorcharacter), y = HWHowImmoral)) +
  geom_boxplot()

And I created a violin plot visualization.

ggplot(HarmlessWrongs, aes(x = factor(HWpoorcharacter), y = HWHowImmoral)) +
  geom_violin()

Session 4

For my follow-up work to Session 4, I worked on the code for importing data directly from Qualtrics using the qualtRics package. I will be collecting data in Qualtrics over multiple time points and thus would like to automate the process of downloading and analyzing those data as they come in.

First, I installed and loaded the qualtRics package.

install.packages("remotes")
remotes::install_github("ropensci/qualtRics")
library(qualtRics)

Then I registered my Qualtrics credentials. (I assigned my Qualtrics API token as the value of QualtricsAPIToken within .Rprofile to keep it hidden here.) I also told R not to use the labels but rather to use values for survey responses.

qualtrics_api_credentials(api_key = QualtricsAPIToken, 
                          base_url = "wakeforest.ca1.qualtrics.com",
                          install = TRUE)

Running the following line then rereads the .Renviron file, which apparently makes the credentials available without restarting R.

readRenviron("~/.Renviron")
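
A minimal sketch of what readRenviron() does, using a throwaway file and made-up variable names. DEMO_API_KEY and DEMO_BASE_URL are hypothetical, chosen so that running the example does not overwrite real credentials. As far as I understand the qualtRics documentation, install = TRUE stores the credentials in ~/.Renviron under QUALTRICS_API_KEY and QUALTRICS_BASE_URL, which would explain why rereading that file is enough.

```r
# Write a temporary environment file (a stand-in for ~/.Renviron)
env_file <- tempfile(fileext = ".Renviron")
writeLines(
  c("DEMO_API_KEY=not-a-real-key",
    "DEMO_BASE_URL=example.qualtrics.com"),
  env_file
)

# Before the file is read, the variables are not set
Sys.getenv("DEMO_BASE_URL", unset = NA)

# Read the file into the current session, without restarting R
readRenviron(env_file)

# The values are now available as environment variables
Sys.getenv("DEMO_BASE_URL")
nzchar(Sys.getenv("DEMO_API_KEY"))
```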

I then imported the Harmless Wrongs data set directly from Qualtrics using the getSurvey() function.

HarmlessWrongsfromQualtrics <- getSurvey(surveyID = "SV_8IbGxs0unkqqERv", includeQuestionIds = c("HWHowImmoral", "HWpoorcharacter"))

I then checked the data set using the head() function.

head(HarmlessWrongsfromQualtrics, 10)

I am now able to retrieve data directly from Qualtrics, which will be useful for my SOTL project this semester. I will be asking my students to complete Qualtrics surveys every other week for the entire semester. I hope to write some R code that will download, analyze, and create a report from those data and that I can (re)run at any point in the semester.

Session 5

Following Session 5, I met with a statistics consultant to learn how to structure my data properly for a multi-level model. I am feeling good about having a proper, tidy format to work toward. I have also collected more data from my class. Now that I am ready to start combining spreadsheets into a single master data frame to analyze, I have realized one issue: I need to anonymize my data set. At each time point in my study, students complete a Qualtrics survey in which they enter their Wake Forest user names. I want to replace these user names with pseudo-randomly generated numbers that I can then use to link their responses across surveys. I am not sure yet of the best way to do this!
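
One possible approach, sketched under assumptions rather than tested on the real data: build a single lookup table that assigns each distinct user name a random number, then use that table to recode every survey. The user names, the ParticipantID column, and the anonymize() helper below are all hypothetical.

```r
# Made-up responses from two surveys
survey1 <- data.frame(WFUUsername = c("smithj18", "doea17", "leek19"),
                      DoWell = c(6, 5, 7), stringsAsFactors = FALSE)
survey2 <- data.frame(WFUUsername = c("doea17", "smithj18"),
                      DoWell = c(4, 6), stringsAsFactors = FALSE)

# One key for all surveys, so the same person always gets the same ID
all_usernames <- unique(c(survey1$WFUUsername, survey2$WFUUsername))
set.seed(2019)  # makes the pseudo-random assignment reproducible
id_key <- data.frame(
  WFUUsername   = all_usernames,
  # sample() draws without replacement, so the IDs are unique
  ParticipantID = sample(1000:9999, length(all_usernames)),
  stringsAsFactors = FALSE
)

# Hypothetical helper: look up each ID, then drop the user name
anonymize <- function(df, key) {
  df$ParticipantID <- key$ParticipantID[match(df$WFUUsername, key$WFUUsername)]
  df$WFUUsername <- NULL
  df
}

survey1_anon <- anonymize(survey1, id_key)
survey2_anon <- anonymize(survey2, id_key)
survey2_anon
```

The key itself (id_key) still links names to numbers, so it would need to be stored separately from the analysis data or discarded once every time point has been recoded.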

Session 6

Following Session 6, I completed data collection for my first three time points. Again, every two weeks, I am surveying my students to gauge their motivations for doing well in the class. I am also measuring various classroom perceptions and experiences as potential predictors of changes in motivation. With three time points collected, I have enough data to start writing code to wrangle the data into a tidy data set for fitting a multi-level model to see what predicts changes in motivation over time.

First, I imported the three data sets from Qualtrics using the qualtRics package.

library(qualtRics)
TimePoint1 <- getSurvey(surveyID = "SV_bjyUQzDIYvIC9sF", startDate = "2019-01-17", endDate = "2019-01-20", useLabels = FALSE)
TimePoint2 <- getSurvey(surveyID = "SV_784VMn6ZTQHld3f", startDate = "2019-01-31", endDate = "2019-02-03", useLabels = FALSE)
TimePoint3 <- getSurvey(surveyID = "SV_0xIjxfcFeBWERxP", startDate = "2019-02-12", endDate = "2019-02-17", useLabels = FALSE)

Next, I added a variable to each data frame labeling it with its appropriate time point.

TimePoint1$TimePoint <- 1
TimePoint2$TimePoint <- 2
TimePoint3$TimePoint <- 3

Then I combined the data frames from each time point into a master data frame, keeping only the variables of interest.

listOfTimePointDFs <- list(TimePoint1, TimePoint2, TimePoint3)
CompiledClassroomMotivationDataset <- do.call(rbind, lapply(listOfTimePointDFs, subset, select=c("WFUUsername", "HowChallenging","DoWell", "ActiveParticipant", "BigPicture", "GoodRapport", "Motivation_1", "Motivation_2", "Motivation_3", "Motivation_4", "TimePoint")))
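
To show the shape this step produces, here is the same pattern applied to two tiny made-up data frames (tp1, tp2, and the Extra column are hypothetical). Selecting the columns before stacking also matters for a practical reason: rbind() on data frames expects the same column names in each piece.

```r
# Two toy time points with an extra column that should be dropped
tp1 <- data.frame(WFUUsername = c("a", "b"), DoWell = c(6, 5),
                  Extra = c("x", "y"), stringsAsFactors = FALSE)
tp2 <- data.frame(WFUUsername = c("a", "b"), DoWell = c(7, 4),
                  Extra = c("z", "w"), stringsAsFactors = FALSE)
tp1$TimePoint <- 1
tp2$TimePoint <- 2

# Keep the same columns from each data frame, then stack the rows
toy_compiled <- do.call(
  rbind,
  lapply(list(tp1, tp2), subset,
         select = c("WFUUsername", "DoWell", "TimePoint"))
)
toy_compiled
```

The result is in long format: one row per student per time point, with TimePoint recording which survey the row came from.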

Some students included “@wfu.edu” or “@wfu” in their WFUUsernames even though they were instructed not to. So I deleted such text from the data frame.

CompiledClassroomMotivationDataset$WFUUsername <- gsub("@wfu\\.edu", "", CompiledClassroomMotivationDataset$WFUUsername, ignore.case=TRUE)
CompiledClassroomMotivationDataset$WFUUsername <- gsub("@wfu", "", CompiledClassroomMotivationDataset$WFUUsername, ignore.case=TRUE)
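
The same cleaning can be sketched on a few made-up user names with a single pattern that covers both endings. The trimws() and tolower() steps are an extra assumption of this sketch (that stray spaces and capitalisation should not distinguish students); they are not part of the code above.

```r
# Made-up user names showing the variants students typed
usernames <- c("smithj18@wfu.edu", "DOEA17@WFU.EDU", "leek19@wfu", "parkm20 ")

# Strip surrounding whitespace first so the end-of-string anchor can match
trimmed <- trimws(usernames)

# Remove a trailing "@wfu" or "@wfu.edu"; the dot is escaped so it is literal
stripped <- sub("@wfu(\\.edu)?$", "", trimmed, ignore.case = TRUE)

# Assumption: user names are case-insensitive, so lower-case them
cleaned <- tolower(stripped)
cleaned
```

Normalising the names this way should make it more likely that the same student is matched across time points.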

I then calculated an autonomous motivation score from participants’ responses. This is the main dependent variable in the study. It is a composite of four ratings that captures the degree to which people are pursuing a goal (in this case, doing well in the course) for themselves (i.e. because they want to do well and value doing well) rather than for external causes.

CompiledClassroomMotivationDataset$AutonomousMotivation <- ((CompiledClassroomMotivationDataset$Motivation_3 + CompiledClassroomMotivationDataset$Motivation_4) - (CompiledClassroomMotivationDataset$Motivation_1 + CompiledClassroomMotivationDataset$Motivation_2))
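
A small worked example of the composite, using invented ratings. Judging from the formula, Motivation_3 and Motivation_4 capture the autonomous reasons and Motivation_1 and Motivation_2 the external ones, so positive scores indicate relatively more autonomous motivation. The with() call is just a more compact way to write the same arithmetic.

```r
# Two hypothetical students
toy_motivation <- data.frame(
  Motivation_1 = c(2, 5),
  Motivation_2 = c(1, 6),
  Motivation_3 = c(6, 3),
  Motivation_4 = c(7, 2)
)

# (items 3 + 4) minus (items 1 + 2), evaluated inside the data frame
toy_motivation$AutonomousMotivation <- with(
  toy_motivation,
  (Motivation_3 + Motivation_4) - (Motivation_1 + Motivation_2)
)
toy_motivation
```

The first student scores (6 + 7) - (2 + 1) = 10 and the second (3 + 2) - (5 + 6) = -6.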

To do: I still need to anonymize the data set by replacing students’ WFU usernames with numerical codes. I will then be ready to analyze my data using the lme4 package.

Session 7

I realized that I want to do some time-lagged analyses in my project, meaning I want to test whether students’ motivations at one time point are predicted by their perceptions and experiences of the course at the most recent time point (or, vice versa, whether students’ experiences and perceptions at one time point are predicted by their motivations at the most recent time point). To do this, I need to add data to my data frame. Currently, each row in the data frame contains participants’ responses during a single time point. I need to include in each row participants’ responses at the most recent time point as well.
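
One way this could be done in base R, sketched on made-up data: copy the data frame, shift its TimePoint forward by one, rename the lagged variables, and merge the copy back onto the original. ParticipantID and the _prev suffix are hypothetical names, and the values are invented.

```r
# Toy long-format data: two participants, three time points each
long <- data.frame(
  ParticipantID        = rep(c(101, 102), each = 3),
  TimePoint            = rep(1:3, times = 2),
  AutonomousMotivation = c(4, 6, 5, -2, 0, 1),
  GoodRapport          = c(5, 6, 6, 3, 4, 4)
)

# Copy the data and relabel each row as belonging to the NEXT time point
previous <- long
previous$TimePoint <- previous$TimePoint + 1

# Rename the variables that should appear as lagged predictors
lag_vars <- c("AutonomousMotivation", "GoodRapport")
names(previous)[match(lag_vars, names(previous))] <- paste0(lag_vars, "_prev")

# Attach each participant's previous responses to their current row
lagged <- merge(long, previous,
                by = c("ParticipantID", "TimePoint"), all.x = TRUE)
lagged <- lagged[order(lagged$ParticipantID, lagged$TimePoint), ]
lagged
```

Matching on participant and time point, rather than shifting rows by position, means a student who skipped a survey gets NA for the lagged values instead of someone else's or an older response. Rows at the first time point have NA lagged values because there is no earlier survey.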

Session 8


References

Wickham, Hadley. 2016. ggplot2: Elegant Graphics for Data Analysis. Springer-Verlag New York. http://ggplot2.org.

Copyright © 2018 E.J. Masicampo. All rights reserved.