R Factor

A factor is an R data structure for fields that take only a predefined, finite set of values, that is, categorical data. The distinct values a factor can take are called its levels. Factors can be built from strings or integers, and they are useful for any column that has a limited number of unique values.

For example, a gender field might take only the values "male" and "female".
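To make the idea concrete, here is a minimal sketch showing that a factor keeps its distinct values as levels and stores each element as an integer code pointing into them. The variable names (gender, codes) are chosen purely for illustration.

```r
# Illustrative data; the variable names are assumptions made for this sketch.
gender <- factor(c("male", "female", "female", "male"))

# The distinct values (levels), sorted alphabetically by default.
print(levels(gender))

# The integer codes stored internally, one per element.
codes <- as.integer(gender)
print(codes)

# How many elements fall into each level.
print(table(gender))
```

Because "female" sorts before "male", it becomes level 1, so the codes refer to positions in the level list rather than to the order in which the values first appeared.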

Attributes of R factor

The factor() function accepts the following arguments, which determine the attributes of the resulting factor:


x

The input vector to be transformed into a factor.

levels

The set of unique values that x can take.

labels

An optional character vector of labels for the levels, given in the same order as the levels.

exclude

The values to be excluded when the set of levels is formed.

ordered

A logical flag that determines whether the levels are treated as ordered.

nmax

It specifies an upper bound on the number of levels.
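As a rough illustration of how these arguments work together, the sketch below uses levels, labels and ordered in one call and exclude in another. The data and variable names (sizes, answers, is_smaller) are hypothetical.

```r
# Hypothetical data; the values and names are assumptions made for this sketch.
sizes <- c("small", "large", "medium", "small")

# levels fixes the set and order of categories, labels renames them,
# and ordered = TRUE makes the levels comparable with < and >.
f <- factor(sizes,
            levels  = c("small", "medium", "large"),
            labels  = c("S", "M", "L"),
            ordered = TRUE)
print(f)

# Comparing the first element (S) with the second (L) is allowed
# because the factor is ordered.
is_smaller <- f[1] < f[2]
print(is_smaller)

# exclude removes a value from the levels; matching elements become NA.
answers <- factor(c("a", "b", "n/a", "a"), exclude = "n/a")
print(levels(answers))
print(answers)
```

Comparison operators such as < only make sense for ordered factors; on an unordered factor they produce a warning and NA.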

How to create R factor?

A factor can be created in two steps:

  1. The first step is to create a vector.
  2. The second step is to convert the vector into a factor.

The factor() function converts a vector into a factor. Its syntax is:

factor(x = character(), levels, labels = levels, exclude = NA, ordered = is.ordered(x), nmax = NA)

Let’s understand the use of the factor() function with an example:

# Creating a vector as input.  
v <- c("Python", "R", "Java", "C++", "Ruby", "HTML")
  
print(v)  
print(is.factor(v))  
  
# Applying the factor function.  
f <- factor(v)  
  
print(f)  
print(is.factor(f))  

Output

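[1] "Python" "R"      "Java"   "C++"    "Ruby"   "HTML"
[1] FALSE
[1] Python R      Java   C++    Ruby   HTML
Levels: C++ HTML Java Python R Ruby
[1] TRUE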

Accessing components of factor

The components of a factor can be accessed in much the same way as those of a vector: by positional indexing, by negative indexing to leave elements out, or with a logical vector.

The example below shows the different ways of accessing the components of a factor.

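# Creating a vector as input.
v <- c("Python", "R", "Java", "C++", "Ruby", "HTML")
# Applying the factor function.
f <- factor(v)

# Printing all elements of the factor
print(f)

# Accessing the 4th element of the factor
print(f[4])

# Accessing the 5th and 6th elements
print(f[c(5, 6)])

# Accessing all elements except the 4th one
print(f[-4])

# Accessing elements using a logical vector
print(f[c(TRUE, FALSE, FALSE, FALSE, TRUE, TRUE)])

Output

[1] Python R      Java   C++    Ruby   HTML
Levels: C++ HTML Java Python R Ruby
[1] C++
Levels: C++ HTML Java Python R Ruby
[1] Ruby HTML
Levels: C++ HTML Java Python R Ruby
[1] Python R      Java   Ruby   HTML
Levels: C++ HTML Java Python R Ruby
[1] Python Ruby   HTML
Levels: C++ HTML Java Python R Ruby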

Modification of R factor

R allows a factor to be modified by reassigning its elements, much as with a data frame. An element can only be assigned a value that is already one of the factor's levels; assigning any other value produces a warning and stores NA instead. To insert a new value, first add it to the levels and then assign it to the element.

Have a look at the example to know how the modification is done in factors:

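# Creating a vector as input.
v <- c("Python", "R", "Java", "C++", "Ruby", "HTML")

# Applying the factor function.
f <- factor(v)

# Changing the 5th element to a value that is already a level.
f[5] <- "Python"
print(f)

# Changing the 4th element to a value outside the levels:
# R issues a warning and stores NA instead.
f[4] <- "JavaScript"
print(f)

# Adding the new level first, then assigning it.
levels(f) <- c(levels(f), "JavaScript")
f[5] <- "JavaScript"
print(f)

Output

[1] Python R      Java   C++    Python HTML
Levels: C++ HTML Java Python R Ruby
Warning message:
In `[<-.factor`(`*tmp*`, 4, value = "JavaScript") :
  invalid factor level, NA generated
[1] Python R      Java   <NA>   Python HTML
Levels: C++ HTML Java Python R Ruby
[1] Python     R          Java       <NA>       JavaScript HTML
Levels: C++ HTML Java Python R Ruby JavaScript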

Factor in Data Frame

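Before R 4.0.0, data.frame() treated text columns as categorical data and converted them to factors by default. Since R 4.0.0 the default is stringsAsFactors = FALSE, so the conversion has to be requested explicitly.

Have a look at the example to see how a data frame with a factor column is created:

# Creating the vectors
h <- c(152, 176, 152, 166, 156, 147, 122)
w <- c(44, 47, 44, 43, 67, 52, 33)
g <- c("m", "m", "f", "f", "m", "f", "m")

# stringsAsFactors = TRUE converts the text column g into a factor.
data <- data.frame(h, w, g, stringsAsFactors = TRUE)
print(data)

# Testing if the gender column is a factor.
print(is.factor(data$g))

# Printing the gender column together with its levels.
print(data$g)

Output

    h  w g
1 152 44 m
2 176 47 m
3 152 44 f
4 166 43 f
5 156 67 m
6 147 52 f
7 122 33 m
[1] TRUE
[1] m m f f m f m
Levels: f m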
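A text column can also be converted after the data frame has been built, which behaves the same way on any R version. The following is a minimal sketch; the data frame df and its contents are made up for illustration.

```r
# Illustrative data frame; the name df and its values are assumptions.
df <- data.frame(h = c(152, 176, 152), g = c("m", "m", "f"))

# On R 4.0.0 or later, g is a character column at this point;
# on older versions it is already a factor.
print(class(df$g))

# Explicit conversion of the column into a factor.
df$g <- factor(df$g)
print(is.factor(df$g))
print(levels(df$g))
```

Converting columns one at a time keeps genuinely free-text columns, such as names or comments, as plain character data.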

Changing order of the levels

The order of the levels in a factor can be changed by passing the factor to factor() again and giving the desired order in the levels argument. Have a look at the example to see how to change the order of the levels:

data <- c("Python", "R", "Java", "C++", "Ruby", "HTML")  

f<- factor(data)  
print(f)  
  
# Apply the factor function  
nf<- factor(f,levels = c("C++", "Python", "R", "HTML", "Ruby", "Java"))
print(nf)  

Output

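[1] Python R      Java   C++    Ruby   HTML
Levels: C++ HTML Java Python R Ruby
[1] Python R      Java   C++    Ruby   HTML
Levels: C++ Python R HTML Ruby Java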
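One place where the level order becomes visible is in summaries such as table(), which list categories in level order. The sketch below illustrates this under made-up data; the names langs, reordered and same_values are assumptions.

```r
# Illustrative data; the variable names are assumptions made for this sketch.
langs <- factor(c("R", "Python", "R", "C++"))

# Counts are listed in the default (alphabetical) level order.
print(table(langs))

# Re-create the factor with an explicit level order.
reordered <- factor(langs, levels = c("R", "Python", "C++"))

# Counts now follow the new level order.
print(table(reordered))

# The values themselves are unchanged; only the level order differs.
same_values <- all(as.character(langs) == as.character(reordered))
print(same_values)
```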

Generating Factor Levels

The gl() function generates a factor by specifying the pattern of its levels. The syntax of the gl() function is given below:

gl(n, k, length = n*k, labels = seq_len(n), ordered = FALSE)

Here:

  • n indicates the number of levels.
  • k indicates the number of replications.
  • labels is a vector of labels for the resulting factor levels.

Have a look at the example to see how to generate factor levels:

f<- gl(3,5,labels=c("C++", "Ruby", "HTML"))
f

Output:

 [1] C++  C++  C++  C++  C++  Ruby Ruby Ruby Ruby Ruby HTML HTML HTML HTML HTML
Levels: C++ Ruby HTML
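For comparison, the sketch below builds the same kind of factor by hand with rep() and factor(), and also tries the length argument of gl(), which repeats the pattern until the requested length is reached. The variable names (a, b, same, cycled) are illustrative.

```r
# gl(): 3 levels, each repeated 2 times.
a <- gl(3, 2, labels = c("C++", "Ruby", "HTML"))

# A hand-built equivalent using rep() and factor().
b <- factor(rep(c("C++", "Ruby", "HTML"), each = 2),
            levels = c("C++", "Ruby", "HTML"))

# Both hold the same values and the same levels.
same <- identical(as.character(a), as.character(b)) &&
  identical(levels(a), levels(b))
print(same)

# With k = 1 and length = 6, the two levels alternate until 6 elements exist.
cycled <- gl(2, 1, length = 6, labels = c("low", "high"))
print(cycled)
```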