You can detail what you learned in the session within these pages. You can also use the Rmd markdown language and insert code chunks with CMD/CTRL + SHIFT + I.

Session 4

What will my data look like?

I’m interested in how internally motivated students are to do well in their courses. If they are internally motivated, then they are motivated by the view that the course is fun and/or that it is personally important. If they are externally motivated, then their efforts in the course are motivated by the fact that they have to do well in the course (i.e. they are being presured or coerced into doing well in the course).

I will be measuring people’s internal motivation at various time points throughout the semester. I will also be measuring how well they think they are doing in the course at each time point. I hypothesize that people will tend to become less internally motivated throughout the course of the semester and that this change will be mediated by how well people think they are doing in the course, such that as students begin to struggle in the course, their motivated shifts away from being internal to being more external.

My data will be a series of Qualtrics surveys that students will complete every 2 to 3 weeks throughout the semester. Each Qualtrics survey will contain three or four variables: student ID, student motivation (measured via 1 to 2 questions), and perceptions of course performance.

Tidying my data

I will need to combine the surveys I receive at each time point. I’m not sure how to do this yet! And once the data are combined, I will need to organize in the following way to make tidy: Variables: Participant ID, time point (1 through n), motivation rating, performance rating

It would be nice to write my R code so that I do not have to amend it at all each time I add a new time point. Apparently this is possible. Mike mentioned that there is a QualtRics R package that can be used to import data (using an API token) automatically from Qualtrics into R so that I will not have change my Rmarkdown code after each additional time point.

To get the necessary codes for the above: After logging into Qualtrics, go to my icon in the upper right, go to account settings, then go to Qualtrics IDs. That will be where my API token is and where my survey codes are.

Mike showed me this website that descibes some of the functions of the qualtRics package: https://github.com/ropensci/qualtRics

As some follow-up work, I may try importing my harmless wrongs data set directly from Qualtrics using the qualtRics package.

NOTE from Mike: If I want to post this process on my website, I will need to hide my Qualtircs API token within my r profile.

Other misc notes

If I have missing data represented by the number 99, I can use dplyr’s na_if() function to specify which values to consider missing data. For example: na_if(99)

Session 5

ropensci.org/packages/ - Useful website for browsing for packages

CRAN task view: https://cran.r-project.org/web/views/ This website gives overviews of various packages that could be useful, organized by topic

Consider starting to do my professional website via R hosted on github; see Jerid’s website as an example

As we move toward working on our individual projects, here is a list of things I need to work on figuring out:

-Interfacing with Qualtrics (I have mostly figured this out)

-Combining various spreadsheets into a single, tidy dataframe - will need to work on the R code for this

-Conducting a multi-level model analysis - have never done this before in R or elsewhere

Session 7

Jerid walked us through the analyzr package: https://wfu-tlc.github.io/analyzr

I installed analyzr

library(devtools)
devtools::install_github("WFU-TLC/analyzr")
library(analyzr)

I then looked at the sbc (Santa Barbara Corpus of Spoken American English) data set, using the glimpse function.

glimpse(sbc)

## Observations: 15,475
## Variables: 13
## $ id              <dbl> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1...
## $ name            <chr> "lenore", "lenore", "lenore", "lenore", "lenor...
## $ gender          <chr> "female", "female", "female", "female", "femal...
## $ age             <dbl> 30, 30, 30, 30, 30, 30, 30, 30, 30, 30, 30, 30...
## $ dialect         <chr> "Los Angeles", "Los Angeles", "Los Angeles", "...
## $ dialect_state   <chr> "CA", "CA", "CA", "CA", "CA", "CA", "CA", "CA"...
## $ current_state   <chr> "CA", "CA", "CA", "CA", "CA", "CA", "CA", "CA"...
## $ highest_edu     <chr> "BA", "BA", "BA", "BA", "BA", "BA", "BA", "BA"...
## $ years_edu       <dbl> 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16...
## $ occupation      <chr> "student", "student", "student", "student", "s...
## $ ethnicity       <chr> "white", "white", "white", "white", "white", "...
## $ utterance       <chr> "... So you don't need to go ... borrow equipm...
## $ utterance_clean <chr> "so you don't need to go borrow equipment from...

Jerid walked us through selecting specific variables and only those rows with complete data

sbc <- sbc %>% select(id, name, gender, age, years_edu, utterance_clean)
sbc <- sbc[complete.cases(sbc), ]

Related to the above, Mike mentioned the tidy way of doing this (i.e. selecting only complete cases) via filter complete cases.

Isolating DV:

sbc <- 
  sbc %>% 
  mutate(um = str_count(utterance_clean, "\\b(um|u=m)\\b")) %>% 
  mutate(uh = str_count(utterance_clean, "\\b(uh|u=h)\\b"))

The \b represents a word boundary (could be a comma, space, etc.) Allows for elongated uhs and ums.

Jerid then walkes us through visualizing and then testing the significance of various relationships in the sbc data set.

Course Notes