melt() – tidy data

Just a short note to help me remember the melt() function.

Let's create some messy data:

ID <- 1:10
T1 <- runif(10,10,20)
T2 <- runif(10,20,30)
T3 <- runif(10,30,40)
df <- data.frame(ID=ID, T1=T1, T2=T2, T3=T3)

This is heavily inspired by a practical problem a student came to us with. There are 10 different patients. At time T1 a certain value is measured on each patient. At time T2 the same value is measured again, and once more at time T3, where T1 < T2 < T3.

We would now like to plot the development of those values as a function of time (or just across T1, T2 and T3).

How to do that?

Using reshape2, and making sure we have the tidyverse packages loaded:

library(tidyverse)
library(reshape2)
# measure.vars lists the columns to stack; melt() has no value.names argument
clean <- melt(df, id.vars=c("ID"), measure.vars=c("T1", "T2", "T3"))
head(clean)
##   ID variable    value
## 1  1       T1 13.93662
## 2  2       T1 10.30468
## 3  3       T1 18.14351
## 4  4       T1 17.58294
## 5  5       T1 18.13877
## 6  6       T1 10.21993

Nice, we now have tidy data.
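
Since the tidyverse is loaded anyway, the same reshaping can probably also be done with pivot_longer() from tidyr. This is only a sketch, assuming a data frame shaped like df above; the name clean2 is made up for this example.

```r
library(tidyr)

# Stand-in for the df above (values are random, so yours will differ)
df <- data.frame(ID = 1:10,
                 T1 = runif(10, 10, 20),
                 T2 = runif(10, 20, 30),
                 T3 = runif(10, 30, 40))

# Stack T1, T2 and T3 into one column of names and one column of values
clean2 <- pivot_longer(df, cols = c(T1, T2, T3),
                       names_to = "variable",
                       values_to = "value")
head(clean2)
```

Two differences from melt() to be aware of: the rows come out grouped by ID rather than by T1/T2/T3, and the variable column is a character vector, not a factor.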

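Note that melt() also takes the arguments “variable.name” and “value.name”, which set the names of the two new columns (the defaults are “variable” and “value”), in case we want something more descriptive.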
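
A minimal sketch of naming the new columns, assuming the same layout as df above. The column names "time" and "measurement" are my own choice for illustration, not something from the original problem.

```r
library(reshape2)

# Stand-in for the df above (values are random, so yours will differ)
df <- data.frame(ID = 1:10,
                 T1 = runif(10, 10, 20),
                 T2 = runif(10, 20, 30),
                 T3 = runif(10, 30, 40))

# variable.name names the column holding the old column names (T1, T2, T3),
# value.name names the column holding the measurements
named <- melt(df, id.vars = "ID",
              variable.name = "time",
              value.name = "measurement")
names(named)
## [1] "ID"          "time"        "measurement"
```

With no measure.vars given, melt() treats every column that is not an id variable as a measured variable, which is what we want here.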

Now it is easy to plot:

plot <- ggplot(clean, aes(x=variable, y=value, group=ID)) +
  geom_line()
plot

[Figure: line plot of value against variable (T1, T2, T3), one line per patient ID]

Neat.