Explain gather() and spread() in R along with an example.

Asked by Dadhijaraj in Data Science, asked on Nov 8, 2019
Answered by Dadhija raj

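The gather() function collapses multiple columns into key-value pairs, turning a wide data frame into a long one. The data frame created below is considered wide because the time variable (represented as quarters) is structured so that each quarter is its own column. The spread() function does the reverse: it takes a key column and a value column and spreads them across multiple columns.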

Let us create fake data to clean with tidyr functions.

library(tidyr)  # provides gather() and spread(), and re-exports the %>% pipe

comp <- c(1,1,1,2,2,2,3,3,3)

yr <- c(1998,1999,2000,1998,1999,2000,1998,1999,2000)

q1 <- runif(9, min=0, max=100)

q2 <- runif(9, min=0, max=100)

q3 <- runif(9, min=0, max=100)

q4 <- runif(9, min=0, max=100)

df <- data.frame(comp=comp,year=yr,Qtr1 = q1,Qtr2 = q2,Qtr3 = q3,Qtr4 = q4)

df


Now let us gather the data using the pipe operator:

# Using Pipe Operator

head(df %>% gather(Quarter,Revenue,Qtr1:Qtr4))


Without the pipe, calling the function directly gives the same result:

# With just the function

head(gather(df,Quarter,Revenue,Qtr1:Qtr4))
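Because the data above is random, the reshaping can be easier to follow on a tiny data frame with fixed values. The following is a minimal sketch; the `wide` data frame and its values are made up for illustration and are not part of the original answer. It also checks that the piped and direct calls return the same thing.

```r
library(tidyr)

# Hypothetical wide data: one row per company, one column per quarter
wide <- data.frame(
  comp = c(1, 2),
  Qtr1 = c(10, 30),
  Qtr2 = c(20, 40)
)

# Collapse Qtr1 and Qtr2 into a key column (Quarter) and a value column (Revenue)
long <- gather(wide, Quarter, Revenue, Qtr1:Qtr2)

# The piped form passes `wide` as the first argument, so both calls agree
long_piped <- wide %>% gather(Quarter, Revenue, Qtr1:Qtr2)
same_result <- identical(long, long_piped)

long
```

The two rows and two quarter columns of `wide` become four rows in `long`, with the columns comp, Quarter and Revenue.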


Now we will use the spread() function to turn long data back into wide form.

Let us create a different data set

stocks <- data.frame(

  time = as.Date('2009-01-01') + 0:9,

  X = rnorm(10, 0, 1),

  Y = rnorm(10, 0, 2),

  Z = rnorm(10, 0, 4)

)

stocks

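# Gather X, Y and Z into long form first; spread() needs the stock and price columns

stocksm <- stocks %>% gather(stock, price, -time)

# One column per stock

stocksm %>% spread(stock, price)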


stocksm %>% spread(time, price)
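To see that spread() undoes gather(), here is a small round trip on fixed values. This is only a sketch; the `long` data frame is a made-up example, not data from the original answer.

```r
library(tidyr)

# Hypothetical long data: one row per company and quarter
long <- data.frame(
  comp = c(1, 1, 2, 2),
  Quarter = c("Qtr1", "Qtr2", "Qtr1", "Qtr2"),
  Revenue = c(10, 20, 30, 40)
)

# Spread the Quarter/Revenue pair into one column per quarter
wide <- spread(long, Quarter, Revenue)

# Gather the quarter columns again to get back to long form
back <- gather(wide, Quarter, Revenue, Qtr1:Qtr2)

# gather() stacks column by column, so sort before comparing with `long`
back <- back[order(back$comp, back$Quarter), ]

wide
back
```

Here `wide` has one row per company with the columns comp, Qtr1 and Qtr2, and `back` holds the same values as `long` once sorted. In tidyr 1.0.0 and later, gather() and spread() are superseded by pivot_longer() and pivot_wider(), although both remain available.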


