Replacing values in a dataframe – to what a previous value was

Given a set of data where some values indicate that they are the same as a previous value, how do you replace them with the correct value?

For example, this data frame:

(m <- data.frame(i = c(1:10, NA), t = c("lorem", "do", "do", "Do", "ipsum", "do", "Do", "(do)", "dolor", NA, "test"), stringsAsFactors = FALSE))
##     i     t
## 1   1 lorem
## 2   2    do
## 3   3    do
## 4   4    Do
## 5   5 ipsum
## 6   6    do
## 7   7    Do
## 8   8  (do)
## 9   9 dolor
## 10 10  <NA>
## 11 NA  test

How do we replace the first three “do”s with “lorem” and the next set of “do”s with “ipsum”?

Using fill() from the tidyr package is straightforward. It takes a data frame and a column, locates every NA in that column, and replaces each one with the last non-NA value above it.
Simple enough: change all the variations of “do” to NA, run fill(), done.
One problem: there might be genuine NAs in the dataset that we do not want to affect.
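
To make the problem concrete, here is a minimal sketch of the naive approach on the same example data. The name m_naive is only used for this illustration, and the data frame is rebuilt so the snippet runs on its own.

```r
library(tidyr)

# Same example data as above, under a separate (illustrative) name
m_naive <- data.frame(i = c(1:10, NA),
                      t = c("lorem", "do", "do", "Do", "ipsum", "do", "Do",
                            "(do)", "dolor", NA, "test"),
                      stringsAsFactors = FALSE)

# Naive approach: turn every variation of "do" into NA, then fill downwards
m_naive$t[m_naive$t %in% c("do", "Do", "(do)")] <- NA
m_naive <- fill(m_naive, t)

# Row 10 held a genuine NA, but fill() has overwritten it with the value from row 9
m_naive$t[10]
## [1] "dolor"
```

The “do”s are replaced as intended, but the real NA in row 10 is lost along the way.
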
Solution (there might be a more elegant one, but this works):

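  1. Change the NAs to something that does not occur in the data
  2. Change the variations of “do” to NA
  3. Use the fill() function
  4. Change the placeholders from step 1 back to NA

library(tidyr)

# Step 1: protect the real NAs with a placeholder that must not occur in the data
rpl <- "replacement"
m$t[is.na(m$t)] <- rpl

# Step 2: turn every variation of "do" into NA
doset <- c("do", "Do", "(do)")
m$t[m$t %in% doset] <- NA

# Step 3: fill each NA with the last non-NA value above it
m <- m %>% fill(t)

# Step 4: turn the placeholder back into NA (%in% never yields NA, unlike ==)
m$t[m$t %in% rpl] <- NA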
m
##     i     t
## 1   1 lorem
## 2   2 lorem
## 3   3 lorem
## 4   4 lorem
## 5   5 ipsum
## 6   6 ipsum
## 7   7 ipsum
## 8   8 ipsum
## 9   9 dolor
## 10 10  <NA>
## 11 NA  test

Done!
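
For what it is worth, the placeholder can be avoided altogether. The following is only a sketch of one possible alternative in base R, not something tidyr provides: fill_ditto is a hypothetical helper name made up for this post. It assumes that a “do” should copy the nearest earlier entry that is not itself a “do”, so a “do” directly after a genuine NA stays NA, just as in the placeholder approach above.

```r
# Hypothetical helper (not part of tidyr): replace "ditto" markers with the
# nearest earlier value that is not itself a ditto marker
fill_ditto <- function(x, ditto) {
  is_ditto <- x %in% ditto                      # NA %in% ditto is FALSE, so real NAs are kept
  # For every position, the index of the last non-ditto entry seen so far (0 if none)
  last_real <- cummax(ifelse(is_ditto, 0L, seq_along(x)))
  out <- x
  ok <- is_ditto & last_real > 0L
  out[ok] <- x[last_real[ok]]                   # copy the earlier value
  out[is_ditto & last_real == 0L] <- NA         # leading dittos have nothing to copy
  out
}

# Rebuild the original example data so the snippet runs on its own
m2 <- data.frame(i = c(1:10, NA),
                 t = c("lorem", "do", "do", "Do", "ipsum", "do", "Do",
                       "(do)", "dolor", NA, "test"),
                 stringsAsFactors = FALSE)

m2$t <- fill_ditto(m2$t, c("do", "Do", "(do)"))
```

On the example data this gives the same t column as the result above, without having to pick a placeholder that is guaranteed not to occur in the data.
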

Oh, and by the way, this is my first post generated directly from RStudio!