Tetrachoric Correlation

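The tetrachoric correlation is a correlation measure for binary variables and is commonly used in psychometrics, particularly in Item Response Theory (IRT). It estimates the correlation between two continuous, normally distributed latent variables that are assumed to underlie the observed binary variables. As with other correlation measures, a tetrachoric correlation ranges from -1 to 1 and describes the direction and strength of the association between the two variables.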
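
To get a feel for how a 2x2 table of counts maps to a value between -1 and 1, here is a minimal sketch of the classic "cos-pi" approximation, which needs only the four cell counts. The helper name `cos_pi_approx` is hypothetical and is not part of the psych package. As far as I know, `tetrachoric()` fits the latent bivariate normal model rather than using this shortcut, so treat the result as a rough check, not as the same estimate.

```r
# Hypothetical helper (not from psych): cos-pi approximation to the
# tetrachoric correlation for a 2x2 table of counts.
cos_pi_approx <- function(tab) {
  stopifnot(all(dim(tab) == c(2, 2)))
  n11 <- tab[1, 1]; n12 <- tab[1, 2]
  n21 <- tab[2, 1]; n22 <- tab[2, 2]
  # ratio of the diagonal product to the off-diagonal product
  odds_ratio <- (n11 * n22) / (n12 * n21)
  cos(pi / (1 + sqrt(odds_ratio)))
}

z <- matrix(c(45, 9, 27, 16), nrow = 2)
cos_pi_approx(z)  # roughly 0.40
```

When all counts sit on the main diagonal the approximation returns 1, and when they all sit off the diagonal it returns -1, which matches the intuition of a perfect positive or perfect negative association.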

In the following video I discuss the basics of tetrachoric correlation and then work through some examples in R. The R code used in the video is available at the bottom of this page.





library(psych)      # for the tetrachoric() function
library(tidyverse)  # for select(), mutate(), and %>%
library(openintro)  # for the resume data

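# dummy examples: tetrachoric() accepts a 2x2 table of counts
# all counts on the main diagonal: strong positive association
x <- matrix(c(30, 0, 0, 30), nrow = 2)
x
tetrachoric(x)

# all counts off the diagonal: strong negative association
y <- matrix(c(0, 30, 30, 0), nrow = 2)
y
tetrachoric(y)

# counts in every cell: weaker positive association
z <- matrix(c(45, 9, 27, 16), nrow = 2)
z
tetrachoric(z)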

# resume data: two binary (0/1) columns
tetrachoric(select(resume, received_callback, military))
# pairwise correlations among several binary columns
tetrachoric(select(resume, received_callback, computer_skills:military))

# doesn't work with a string variable; wrapped in try() so the script keeps running if it errors
try(tetrachoric(select(resume, received_callback, resume_quality)))
# instead, make a dummy variable or create a table
resume <- resume %>% mutate(resume_quality_high = ifelse(resume_quality == "high", 1, 0))
tetrachoric(select(resume, received_callback, resume_quality_high))
# with table(), the sign of the result depends on the order of the category levels
tetrachoric(table(resume$received_callback, resume$resume_quality))

# store the result and inspect its components
tetra_output <- tetrachoric(table(resume$resume_quality, resume$received_callback))
names(tetra_output)
tetra_output$rho  # the estimated correlation matrix