A factor is an R data structure for fields that take only a predefined, finite set of values. Factors store such categorical data as a set of levels. The underlying values can be strings or integers, and factors are most useful for columns that have a limited number of unique values.
For example, a gender column whose only values are male and female is naturally stored as a factor.
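As a small sketch of this idea, the following stores a gender field as a factor and inspects its levels and the integer codes behind them. The variable names are illustrative and do not come from the original text.

```r
# Hypothetical gender field with two distinct values.
gender <- c("male", "female", "female", "male")
gf <- factor(gender)

# The distinct values, sorted alphabetically by default.
print(levels(gf))      # "female" "male"

# The integer codes stored internally, one per element.
print(as.integer(gf))  # 2 1 1 2

# How many elements fall into each level.
print(table(gf))
```

Each element is stored as a small integer that points into the levels, which is why a factor suits columns with few distinct values.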
Arguments of the factor() function
The factor() function takes the following arguments:
x
The input vector that is to be transformed into a factor.
levels
The set of unique values that x can take.
labels
A character vector of labels for the levels, given in the same order as the levels.
exclude
The values to be excluded when the levels are formed.
ordered
A logical flag that determines whether the levels are treated as ordered.
nmax
An upper bound on the number of levels.
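A minimal sketch of how these fit together in a factor() call, using made-up shirt-size data (the values and variable names are assumptions for illustration):

```r
# Hypothetical input: shirt sizes, including one value that is not a valid size.
sizes <- c("s", "l", "m", "unknown", "s", "m")

sf <- factor(
  sizes,
  levels  = c("s", "m", "l"),               # allowed values, in the desired order
  labels  = c("Small", "Medium", "Large"),  # display names for those levels
  ordered = TRUE                            # the levels have a meaningful order
)
print(sf)

# "unknown" is not among the levels, so it becomes NA.
print(is.na(sf[4]))   # TRUE

# Ordered factors support comparisons between elements.
print(sf[1] < sf[2])  # TRUE, because Small comes before Large

# exclude removes a value from the levels that would otherwise be inferred.
ef <- factor(sizes, exclude = "unknown")
print(levels(ef))     # "l" "m" "s"
```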
How to create an R factor?
A factor can be created in two steps:
- First, create a vector.
- Second, convert the vector into a factor.
The factor() function converts a vector into a factor. In its simplest form the call is factor(v), where v is the input vector.
Let’s understand the use of the factor() function with an example:
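# Creating a vector as input.
v <- c("Python", "R", "Java", "C++", "Ruby", "HTML")
print(v)
print(is.factor(v))
# Applying the factor function.
f <- factor(v)
print(f)
print(is.factor(f))
Output
[1] "Python" "R"      "Java"   "C++"    "Ruby"   "HTML"
[1] FALSE
[1] Python R      Java   C++    Ruby   HTML
Levels: C++ HTML Java Python R Ruby
[1] TRUE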
Accessing components of a factor
The components of a factor are accessed in the same way as the elements of a vector: by index (positive or negative) or with a logical vector.
Have a look at the example below to see the different ways of accessing the components of a factor.
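# Creating a vector as input.
v <- c("Python", "R", "Java", "C++", "Ruby", "HTML")
# Applying the factor function.
f <- factor(v)
# Printing all elements of the factor.
print(f)
# Accessing the 4th element of the factor.
print(f[4])
# Accessing the 5th and 6th elements.
print(f[c(5, 6)])
# Accessing all elements except the 4th one.
print(f[-4])
# Accessing elements using a logical vector.
print(f[c(TRUE, FALSE, FALSE, FALSE, TRUE, TRUE)])
Output
[1] Python R      Java   C++    Ruby   HTML
Levels: C++ HTML Java Python R Ruby
[1] C++
Levels: C++ HTML Java Python R Ruby
[1] Ruby HTML
Levels: C++ HTML Java Python R Ruby
[1] Python R      Java   Ruby   HTML
Levels: C++ HTML Java Python R Ruby
[1] Python Ruby   HTML
Levels: C++ HTML Java Python R Ruby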
Modification of an R factor
A factor can be modified in much the same way as a data frame: you change a value by re-assigning it. The new value must be one of the factor's predefined levels. Assigning anything else produces NA together with a warning. To use a new value, first add it to the levels and then assign it.
Have a look at the example to see how factors are modified:
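# Creating a vector as input.
v <- c("Python", "R", "Java", "C++", "Ruby", "HTML")
# Applying the factor function.
f <- factor(v)
# Changing the 5th element to a value that is already a level.
f[5] <- "R"
print(f)
# Assigning a value outside the levels gives NA and a warning.
f[4] <- "JavaScript"
print(f)
# Adding the new value to the levels.
levels(f) <- c(levels(f), "JavaScript")
# Assigning the new level now works.
f[4] <- "JavaScript"
print(f)
Output
[1] Python R      Java   C++    R      HTML
Levels: C++ HTML Java Python R Ruby
Warning message:
In `[<-.factor`(`*tmp*`, 4, value = "JavaScript") :
  invalid factor level, NA generated
[1] Python R      Java   <NA>   R      HTML
Levels: C++ HTML Java Python R Ruby
[1] Python     R          Java       JavaScript R          HTML
Levels: C++ HTML Java Python R Ruby JavaScript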
Factor in Data Frame
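Before R 4.0.0, data.frame() converted text columns to factors by default. Since R 4.0.0 the default is stringsAsFactors = FALSE, so text columns stay as character vectors unless you pass stringsAsFactors = TRUE or convert them with factor().
Have a look at the example to see how a data frame with a factor column is created: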
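# Creating the vectors.
h <- c(152, 176, 152, 166, 156, 147, 122)
w <- c(44, 47, 44, 43, 67, 52, 33)
g <- c("m", "m", "f", "f", "m", "f", "m")
# stringsAsFactors = TRUE turns the text column into a factor.
data <- data.frame(h, w, g, stringsAsFactors = TRUE)
print(data)
# Testing if the gender column is a factor.
print(is.factor(data$g))
# Printing the column together with its levels.
print(data$g)
Output
    h  w g
1 152 44 m
2 176 47 m
3 152 44 f
4 166 43 f
5 156 67 m
6 147 52 f
7 122 33 m
[1] TRUE
[1] m m f f m f m
Levels: f m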
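A column can also be converted after the data frame has been built. The following is a minimal sketch, assuming a small illustrative data frame named df. stringsAsFactors = FALSE is set explicitly so that the column starts out as plain text on any R version.

```r
# Hypothetical data frame with one numeric and one text column.
df <- data.frame(
  h = c(152, 176, 152),
  g = c("m", "m", "f"),
  stringsAsFactors = FALSE
)
print(is.factor(df$g))  # FALSE: the column is a character vector

# Convert the text column into a factor in place.
df$g <- factor(df$g)
print(is.factor(df$g))  # TRUE
print(levels(df$g))     # "f" "m"
```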
Changing the order of the levels
The factor() function can also change the order of the levels of an existing factor: pass the factor together with a levels argument that lists the levels in the new order. Have a look at the example to see how this is done:
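data <- c("Python", "R", "Java", "C++", "Ruby", "HTML")
f <- factor(data)
print(f)
# Applying the factor function with the levels in the new order.
nf <- factor(f, levels = c("C++", "Python", "R", "HTML", "Ruby", "Java"))
print(nf)
Output
[1] Python R      Java   C++    Ruby   HTML
Levels: C++ HTML Java Python R Ruby
[1] Python R      Java   C++    Ruby   HTML
Levels: C++ Python R HTML Ruby Java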
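The level order is more than cosmetic, because functions such as sort() follow it. Here is a short sketch using the same language names as the example above (the variable names are illustrative):

```r
langs <- c("Python", "R", "Java", "C++", "Ruby", "HTML")
f  <- factor(langs)
nf <- factor(f, levels = c("C++", "Python", "R", "HTML", "Ruby", "Java"))

# Sorting follows the level order of each factor.
print(as.character(sort(f)))   # C++ HTML Java Python R Ruby
print(as.character(sort(nf)))  # C++ Python R HTML Ruby Java

# The values are unchanged; only the underlying integer codes differ.
print(all(as.character(f) == as.character(nf)))  # TRUE
print(as.integer(f))   # 4 5 3 1 6 2
print(as.integer(nf))  # 2 3 6 1 5 4
```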
Generating Factor Levels
The gl() function generates a factor from a regular pattern of levels. Its basic call has the form gl(n, k, labels).
Here:
- n indicates the number of levels.
- k indicates the number of replications.
- labels is a vector of labels for the resulting factor levels.
Have a look at the example to see how to generate factor levels:
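f <- gl(3, 5, labels = c("C++", "Ruby", "HTML"))
f
Output
 [1] C++  C++  C++  C++  C++  Ruby Ruby Ruby Ruby Ruby HTML HTML HTML HTML HTML
Levels: C++ Ruby HTML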
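One way to read gl() is as shorthand for repeating each level index k times and labelling the result. The sketch below checks that reading against the same call. The manual construction is an illustration only, not a claim about how gl() is implemented internally, and the variable names are assumptions.

```r
n <- 3
k <- 5
labs <- c("C++", "Ruby", "HTML")

g1 <- gl(n, k, labels = labs)

# Manual equivalent: codes 1..n, each repeated k times, then labelled.
g2 <- factor(rep(seq_len(n), each = k), levels = seq_len(n), labels = labs)

print(all(g1 == g2))                        # TRUE
print(identical(levels(g1), levels(g2)))    # TRUE
print(length(g1))                           # 15, that is n * k
print(table(g1))
```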