Hints for working with R for statistics
Here I collect a few answers I have given to requests for hints on the assignments:
First, when you create a new table that you want to use later, you need to assign the result of your operations to a new data object; otherwise the result is only printed and not stored:
newtable <- oldtable %>% some_operation() %>% another_operation()
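As a minimal sketch of this point, assuming the dplyr package is installed: the data and the column names id and score below are invented for illustration, and filter() and arrange() merely stand in for some_operation() and another_operation().

```r
library(dplyr)

# Hypothetical example data; id and score are made-up column names.
oldtable <- data.frame(id = 1:4, score = c(10, 25, 40, 5))

# Without an assignment the result is printed once and then lost.
oldtable %>% filter(score > 8) %>% arrange(desc(score))

# With an assignment the result is stored and can be reused later.
newtable <- oldtable %>% filter(score > 8) %>% arrange(desc(score))

nrow(newtable)  # 3
nrow(oldtable)  # 4, the original table is unchanged
```

Note that oldtable itself is never modified by the pipe; only the assignment with <- creates something you can refer to again.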
Second, to create new variables in a data table, use
mutate(newvariable1 = some_function(oldvar1), newvariable2 = other_function(oldvar2))
to add them to the existing table, or use
transmute(newvariable1 = some_function(oldvar1), newvariable2 = other_function(oldvar2))
to keep only the new variables you define within the transmute command and drop all other variables of the existing table.
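The difference between the two commands can be sketched as follows, again assuming dplyr is installed. The example values are invented, and sqrt() and a division by 10 are only placeholders for some_function() and other_function().

```r
library(dplyr)

# Hypothetical example data using the variable names from the text.
oldtable <- data.frame(oldvar1 = c(1, 4, 9), oldvar2 = c(10, 20, 30))

# mutate() keeps the old variables and appends the new ones.
added <- oldtable %>%
  mutate(newvariable1 = sqrt(oldvar1), newvariable2 = oldvar2 / 10)

# transmute() keeps only the variables defined inside the call.
replaced <- oldtable %>%
  transmute(newvariable1 = sqrt(oldvar1), newvariable2 = oldvar2 / 10)

names(added)     # "oldvar1" "oldvar2" "newvariable1" "newvariable2"
names(replaced)  # "newvariable1" "newvariable2"
```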
Third, suppose x is a numeric vector defined with x <- c(4, 1, 2). Then factor(x) returns a data object of class factor with three levels, one for each unique entry in the vector x. The levels are sorted, and the name of each level is the original number stored as a character string:
x <- c(4,1,2)
x
## [1] 4 1 2
y <- factor(x)
y
## [1] 4 1 2
## Levels: 1 2 4
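To see what the factor stores, it may help to inspect it with a few base R functions. This is only an illustrative sketch; it uses nothing beyond base R.

```r
x <- c(4, 1, 2)
y <- factor(x)

class(y)       # "factor"
levels(y)      # "1" "2" "4", character strings in sorted order
as.integer(y)  # 3 1 2, the position of each entry among the levels

# To recover the original numbers, convert to character first.
as.numeric(as.character(y))  # 4 1 2
```

Calling as.numeric(y) directly would return the level positions 3 1 2 rather than the original values, which is a common source of confusion.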
If you want to change the names of the levels, you can define them in the factor command with the option “labels”. The labels are assigned to the levels in their sorted order, so here the level 1 is named "eins", the level 2 is named "zwei", and the level 4 is named "!drei":
y <- factor(x, labels = c("eins", "zwei", "!drei"))
y
## [1] !drei eins zwei
## Levels: eins zwei !drei
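If the implicit sorted order feels error-prone, one option is to state the levels explicitly with the "levels" argument of factor(), so that the pairing of values and labels is visible in the code. The sketch below uses base R only; the labels "first", "second", "third" and the object z are invented for illustration.

```r
x <- c(4, 1, 2)

# Same result as above, but the pairing 1 -> "eins", 2 -> "zwei",
# 4 -> "!drei" is now written out explicitly.
y <- factor(x, levels = c(1, 2, 4), labels = c("eins", "zwei", "!drei"))
as.character(y)  # "!drei" "eins" "zwei"

# Listing the levels in a different order changes which label goes where.
z <- factor(x, levels = c(4, 1, 2), labels = c("first", "second", "third"))
as.character(z)  # "first" "second" "third"
levels(z)        # "first" "second" "third"
```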