R features
Well-developed and effective programming
language – R is a simple yet powerful language specifically
designed for data analysis and statistical computing.
Efficient data handling and storage – R provides
robust structures for handling and storing large datasets
efficiently.
Rich set of operators – R includes a variety of
operators for performing calculations on arrays, lists, vectors, and
matrices.
Comprehensive tools for data analysis – R offers
numerous built-in functions and packages that support a wide range of
data analysis tasks.
Advanced graphical capabilities – R supports
both basic and advanced graphical techniques for data visualization,
aiding in better interpretation of results.
Extensive packages for advanced applications – R
has a vast collection of packages for data mining, machine learning, and
big data analytics, making it highly suitable for modern data science
workflows.
R Environment Workspace Description

R Script/File – This is where users write and
save their R code. It allows you to edit, run, and manage reusable
scripts for analysis and automation.
Environment – Displays the current workspace,
including all the variables, data frames, functions, and loaded objects
that are active in your session.
History – Keeps a record of all the R commands
executed during the session, allowing users to review and reuse previous
commands easily.
Console – The command-line interface where users
can directly enter and execute R commands and immediately see the output
or results.
Files – Provides access to the directory
structure of your computer so that you can browse, open, or manage files
and folders.
Plots – Displays visual output generated from
data visualizations, such as graphs and charts created using R’s
plotting functions.
Packages – Shows the list of R packages
installed and loaded in the session. Users can install new packages or
load/unload existing ones from here.
Help – Offers documentation and help for R
functions, packages, and commands. Users can search for and view
detailed descriptions, usage, and examples.
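Several of these panes have console equivalents in base R. The sketch below is only illustrative; the object names `scores` and `n_obs` are made up for the example.

```r
# Create two objects so the workspace is not empty (hypothetical names)
scores <- c(72, 85, 90)
n_obs <- length(scores)

# Environment pane equivalent: list the objects in the current workspace
workspace_objects <- ls()
print(workspace_objects)

# Files pane equivalent: show the working directory and some of its files
print(getwd())
print(head(list.files()))

# Help pane equivalent: open documentation (uncomment in an interactive session)
# help("mean")   # same as ?mean

# Remove an object from the workspace and confirm that it is gone
rm(n_obs)
print(exists("n_obs"))
```

Running these commands in the Console updates the Environment pane in the same way as the corresponding GUI actions.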
Step-by-step instructions to download and install R and RStudio:
Step 1: Download and Install R
Go to the CRAN website: https://cran.r-project.org
Choose your operating system: Click “Download R for Windows”, “Download R for macOS”, or “Download R for Linux”, depending on your system.
Download the R installer: For Windows, click “base”, then click “Download R x.x.x for Windows” (e.g., R 4.3.1). For macOS, click the link for the latest .pkg installer.
Run the installer: Open the downloaded file and follow the instructions to install R (the default settings are usually fine).
Step 2: Download and Install RStudio
Go to the RStudio download page: https://posit.co/download/rstudio-desktop/
Download RStudio Desktop (free): Click “Download RStudio Desktop for [Your OS]”. The page offers the appropriate installer for Windows, macOS, or Linux.
Install RStudio: Run the downloaded installer and follow the on-screen instructions.
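Once both installers have finished, one way to confirm that R works is to ask it for its own version from the Console. This is a minimal sketch using base R only; the variable names `r_major` and `r_minor` are illustrative.

```r
# Print the version string of the running R installation
print(R.version.string)

# Major and minor version components, stored as character strings
r_major <- R.version$major
r_minor <- R.version$minor
cat("R major version:", r_major, "\n")
cat("R minor version:", r_minor, "\n")

# Summary of the session: R version, platform, and attached packages
print(sessionInfo())
```

If these commands print a version number without errors, the installation succeeded.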
Packages and Libraries
R provides access to a vast ecosystem of packages, with more than
20,000 packages available through CRAN (Comprehensive R Archive Network).
These packages extend R’s functionality and support a wide range of
tasks in data science and statistical computing. Some of the key
capabilities offered by R packages include:
Data visualization: Tools for creating charts,
graphs, and interactive plots (e.g., ggplot2, plotly).
Data statistics and exploration: Functions to
summarize, explore, and understand data distributions and
relationships.
Data transformation: Utilities for cleaning,
reshaping, and manipulating data (e.g., dplyr, tidyr).
Outlier detection: Methods for identifying and
handling anomalies in data.
Feature selection: Techniques to select important
variables that contribute most to prediction or classification.
Dimension reduction: Tools like PCA (Principal
Component Analysis) to reduce the number of variables while preserving
data structure.
Classification: Algorithms for supervised learning
and assigning data to categories (e.g., decision trees, SVM).
Clustering: Unsupervised learning techniques for
grouping similar data points (e.g., k-means, hierarchical
clustering).
Regression validation: Tools to evaluate the
performance of regression models.
Classification validation: Functions to assess
accuracy, precision, recall, and F1-score of classification models.
Clustering validation: Metrics to validate the
quality and effectiveness of clustering results (e.g., silhouette
score).
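Some of these capabilities are available without installing anything, because the `stats` and `datasets` packages ship with R. The sketch below uses the built-in `iris` data to touch on exploration, dimension reduction, clustering, and a simple regression check. The choice of three clusters and the fixed seed are assumptions made for the example.

```r
# Built-in iris data: 150 flowers, 4 numeric measurements, 1 species label
measurements <- iris[, 1:4]

# Data statistics and exploration
print(summary(measurements))

# Dimension reduction: PCA on scaled measurements
pca <- prcomp(measurements, scale. = TRUE)
print(summary(pca))

# Clustering: k-means with 3 clusters (seed fixed for reproducibility)
set.seed(42)
clusters <- kmeans(measurements, centers = 3)

# Compare the clusters with the known species labels
print(table(cluster = clusters$cluster, species = iris$Species))

# Regression with a simple validation measure (R-squared)
fit <- lm(Petal.Width ~ Petal.Length, data = iris)
r_squared <- summary(fit)$r.squared
print(r_squared)
```

Contributed packages such as ggplot2 or dplyr build on the same ideas with richer interfaces.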
Installing and Loading Packages in R Using RStudio GUI
Open RStudio: Start by opening the RStudio application on your
computer.
Install a Package Using the GUI: Go to the “Packages” tab in the
bottom-right pane.
Click on the “Install” button (usually at the top left of the
“Packages” tab). A dialog box will appear. In the “Packages” field, type
the name of the package you want to install (e.g., ggplot2). Make sure
the checkbox “Install dependencies” is selected.
Click the “Install” button. RStudio will now download and install
the package from CRAN.
Load a Package Using the GUI: After installation, stay in the
“Packages” tab. Find the installed package in the list (e.g., ggplot2).
Tick the checkbox next to the package name to load it into your R
session.
Steps to Install and Load Packages in R
- Installing a Package: To install a package in R,
you use the install.packages() function. This step downloads the package
from CRAN and installs it on your system.
Example: install.packages("ggplot2")
This command installs the ggplot2 package, which is widely used for
data visualization.
- Loading a Package : Once a package is installed,
you need to load it into your current R session using the library( )
function.
Example: library(ggplot2)
This command loads the ggplot2 package so that you can use its
functions.
- Check if a Package is Already Installed: The require() function
tries to load a package and returns TRUE if it succeeds, so it can be used
as a check. You can also look the package up in the installed packages list.
Example: require(ggplot2)
If the package is not installed, require() returns FALSE and gives a
warning instead of stopping with an error.
- Install and Load Multiple Packages: You can install
and load multiple packages using a character vector and lapply().
Example:
packages <- c("dplyr", "tidyr", "ggplot2")
install.packages(packages) # Install all
lapply(packages, library, character.only = TRUE) # Load all
- Viewing Installed Packages : You can view all
installed packages using:
installed.packages()
- Update Packages (Optional) : To update all
installed packages:
update.packages()
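To check for a package without attaching it, base R also provides `requireNamespace()`. The sketch below wraps it in a small helper. The helper name `is_installed` and the package name `notARealPackage12345` are made up for illustration, and the install step is left commented out because it needs internet access.

```r
# Hypothetical helper: TRUE if a package is installed, without attaching it
is_installed <- function(pkg) {
  requireNamespace(pkg, quietly = TRUE)
}

# stats ships with every R installation
print(is_installed("stats"))

# A made-up package name that should not exist
print(is_installed("notARealPackage12345"))

# Find which of the wanted packages are missing
wanted <- c("dplyr", "tidyr", "ggplot2")
missing_pkgs <- wanted[!vapply(wanted, is_installed, logical(1))]
print(missing_pkgs)

# Install only the missing ones (uncomment to run)
# if (length(missing_pkgs) > 0) install.packages(missing_pkgs)
```

Installing only what is missing avoids reinstalling packages that are already on the system.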
Basic Data Types in R
R is a powerful programming language used for statistical computing
and data analysis. It supports several basic data
types, which are fundamental for handling and manipulating
data. These data types include: numeric,
integer, complex, character
(string), and logical (boolean).
1. Numeric
- The numeric data type represents real
numbers (i.e., numbers with decimal points).
- It is the default type for numbers in R unless specified
otherwise.
Example:
x <- 3.14
class(x) # Output: "numeric"
[1] "numeric"
Here, x is a numeric variable storing the value 3.14.
2. Integer
- An integer is a whole number (without a decimal
point).
- In R, you must use an L suffix to explicitly define an integer.
Example:
y <- 10L
class(y) # Output: "integer"
[1] "integer"
The L tells R to treat 10 as an integer
rather than a numeric (which would be a double).
3. Complex
- The complex data type represents complex
numbers, which have both a real and imaginary part (e.g., 1 +
2i).
- Useful in mathematical computations involving complex
arithmetic.
Example:
z <- 2 + 3i
class(z) # Output: "complex"
[1] "complex"
Here, z is a complex number with a real part 2 and an imaginary part 3i.
4. Character (String)
- The character data type is used to store
text or string values, such as words, phrases, or any
sequence of characters.
- Character data must be enclosed in either single (') or double (") quotes.
Example:
name <- "Srinivas"
class(name) # Output: "character"
[1] "character"
Here, "Srinivas" is a character string.
5. Logical (Boolean)
- The logical data type stores TRUE
or FALSE values.
- This type is often used in conditions, comparisons, and filtering
data.
Example:
is_senior <- TRUE
class(is_senior) # Output: "logical"
[1] "logical"
age <- 25
age > 30 # Output: FALSE
[1] FALSE
Here, is_senior is a logical variable. The expression age > 30
also returns a logical value.
Summary Table
Data Type | Description | Example
Numeric | Decimal or floating-point numbers | x <- 5.6
Integer | Whole numbers with L suffix | y <- 100L
Complex | Numbers with real + imaginary parts | z <- 3 + 4i
Character | Text or strings | name <- "R User"
Logical | Boolean values (TRUE or FALSE) | flag <- FALSE
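Beyond `class()`, base R offers `typeof()`, the `is.*()` family for testing a type, and the `as.*()` family for converting between types. The following sketch shows how they behave on the data types described above; the variable names are illustrative.

```r
x <- 3.14
y <- 10L

# class() gives the high-level type; typeof() gives the internal storage type
print(class(x))    # "numeric"
print(typeof(x))   # "double"
print(typeof(y))   # "integer"

# is.*() functions test for a type
print(is.numeric(y))     # TRUE: integers also count as numeric
print(is.character(x))   # FALSE

# as.*() functions convert between types
print(as.integer(3.9))      # 3: the fractional part is dropped
print(as.numeric("2.5"))    # 2.5
print(as.character(100L))   # "100"
print(as.logical("TRUE"))   # TRUE

# Combining types in one vector coerces everything to the most general type
mixed <- c(1L, 2.5, "three")
print(class(mixed))         # "character"
```

Note that a vector holds a single type, which is why mixing numbers and text produces a character vector.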
Arithmetic Operators in R
R provides a set of arithmetic operators that are used to perform
basic mathematical operations on numeric values, vectors, or variables.
These operators include addition,
subtraction, multiplication,
division, modulus, and
exponentiation.
1. Addition (+)
The + operator is used to add two or more numbers.
Example:
a <- 10
b <- 5
result <- a + b
print(result) # Output: 15
[1] 15
In this example, 10 and 5 are added to get 15.
2. Subtraction (-)
The - operator is used to subtract one number from another.
Example:
a <- 10
b <- 5
result <- a - b
print(result) # Output: 5
[1] 5
Here, 5 is subtracted from 10 to get 5.
3. Multiplication (*)
The * operator is used to multiply numbers.
Example:
a <- 10
b <- 5
result <- a * b
print(result) # Output: 50
[1] 50
The multiplication of 10 and 5 results in 50.
4. Division (/)
The / operator is used to divide one number by another.
Example:
a <- 10
b <- 5
result <- a / b
print(result) # Output: 2
[1] 2
Here, 10 divided by 5 gives the result 2.
5. Modulus (%%)
The %% operator gives the remainder when one number is divided by another.
Example:
a <- 10
b <- 3
result <- a %% b
print(result) # Output: 1
[1] 1
When 10 is divided by 3, the remainder is 1.
6. Exponentiation (^)
The ^ operator is used to raise a number to the power of another.
Example:
a <- 2
b <- 3
result <- a ^ b
print(result) # Output: 8
[1] 8
In this case, 2 raised to the power of 3 equals 8.
Summary Table
Operator | Description | Example | Result
+ | Addition | 5 + 3 | 8
- | Subtraction | 5 - 3 | 2
* | Multiplication | 5 * 3 | 15
/ | Division | 6 / 3 | 2
%% | Modulus (remainder) | 10 %% 3 | 1
^ | Exponentiation | 2 ^ 4 | 16
These arithmetic operators are frequently used in mathematical
computations, statistical analysis, and data manipulation tasks in
R.
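The same operators also work element by element on whole vectors, which is how they are typically used in data analysis. The sketch below illustrates this, along with the integer division operator `%/%` that complements `%%`. The vectors `prices` and `quantities` are made-up example data.

```r
prices <- c(10, 20, 30)
quantities <- c(1, 2, 3)

# Arithmetic is applied element by element
print(prices * quantities)   # 10 40 90
print(prices + 5)            # 15 25 35 (the single value is recycled)
print(prices ^ 2)            # 100 400 900

# Integer division (%/%) and modulus (%%) belong together
print(10 %/% 3)   # 3
print(10 %% 3)    # 1

# With a negative left operand, the remainder takes the sign of the divisor
print(-7 %% 3)    # 2
print(-7 %/% 3)   # -3

# Operator precedence: ^ first, then * and /, then + and -
print(2 + 3 * 4 ^ 2)   # 50
```

Because the operations are vectorized, there is usually no need to write an explicit loop for element-wise calculations.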
Relational Operators in R
Relational operators in R are used to compare two values or
expressions. The result of a relational operation is always a
logical value: either TRUE or FALSE. These are essential for decision-making and
conditional programming.
1. < (Less than)
This operator checks whether the value on the left is less than the
value on the right.
Example:
a <- 10
b <- 20
c <- a < b
print(c) # Output: TRUE
[1] TRUE
Explanation: Since 10 is less than 20, the result is TRUE.
2. > (Greater than)
Checks if the value on the left is greater than the value on the
right.
Example:
a <- 10
b <- 20
c <- a > b
print(c) # Output: FALSE
[1] FALSE
Explanation: 10 is not greater than 20, so the result is FALSE.
3. <= (Less than or equal to)
Checks if the left-hand side is less than or equal to the right-hand
side.
Example:
a <- 10
b <- 10
c <- a <= b
print(c) # Output: TRUE
[1] TRUE
Explanation: 10 is equal to 10, so it satisfies the condition.
4. >= (Greater than or equal to)
Checks if the value on the left is greater than or equal to the one
on the right.
Example:
a <- 15
b <- 10
c <- a >= b
print(c) # Output: TRUE
[1] TRUE
Explanation: 15 is greater than 10, so the result is TRUE.
5. == (Equal to)
Tests whether two values are equal.
Example:
a <- 10
b <- 10
c <- a == b
print(c) # Output: TRUE
[1] TRUE
Explanation: Both values are equal, so it returns TRUE.
6. != (Not equal to)
Checks if the two values are not equal.
Example:
a <- 10
b <- 20
c <- a != b
print(c) # Output: TRUE
[1] TRUE
Explanation: 10 and 20 are not equal, so the result is TRUE.
Summary Table
Operator | Description | Example | Result
< | Less than | 10 < 20 | TRUE
> | Greater than | 10 > 20 | FALSE
<= | Less than or equal to | 10 <= 10 | TRUE
>= | Greater than or equal to | 15 >= 10 | TRUE
== | Equal to | 10 == 10 | TRUE
!= | Not equal to | 10 != 20 | TRUE
These relational operators are particularly useful in if-else
conditions, filtering data, and loop
control in R programming.
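As a brief illustration of those uses, the sketch below filters a vector with a comparison, drives an if-else branch, and shows why `==` should be used with care on decimal numbers. The data and variable names are made up for the example.

```r
ages <- c(25, 32, 47, 18, 60)

# A comparison on a vector returns one logical value per element
is_over_30 <- ages > 30
print(is_over_30)         # FALSE TRUE TRUE FALSE TRUE

# The logical vector can be used to filter the data
print(ages[is_over_30])   # 32 47 60

# TRUE counts as 1, so sum() counts the matches
print(sum(is_over_30))    # 3

# A comparison inside an if-else condition
age <- 25
if (age >= 18) {
  status <- "adult"
} else {
  status <- "minor"
}
print(status)             # "adult"

# Floating-point caveat: exact equality can fail for decimal arithmetic
print(0.1 + 0.2 == 0.3)                    # FALSE
print(isTRUE(all.equal(0.1 + 0.2, 0.3)))   # TRUE
```

For decimal values, `all.equal()` compares within a small tolerance and is usually the safer choice.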
Logical Operators in R
Logical operators are used to combine or modify logical values
(TRUE or FALSE) in R. They are especially
useful in conditional statements, data
filtering, and vectorized operations.
1. ! (Logical NOT)
This operator reverses the logical state of its operand. If a
condition is TRUE, applying ! will make it FALSE, and vice versa.
Example:
x <- TRUE
result <- !x
print(result) # Output: FALSE
[1] FALSE
Explanation: The value of x is TRUE. The ! operator flips it to FALSE.
2. & (Element-wise Logical AND)
This operator performs a logical AND element by
element on two logical vectors. It returns TRUE only where
both elements are TRUE.
Example:
a <- c(TRUE, FALSE, TRUE)
b <- c(TRUE, TRUE, FALSE)
result <- a & b
print(result) # Output: TRUE FALSE FALSE
[1] TRUE FALSE FALSE
Explanation: It checks each corresponding pair:
TRUE & TRUE → TRUE
FALSE & TRUE → FALSE
TRUE & FALSE → FALSE
3. && (Logical AND for single values)
This operator also performs logical AND, but it expects each operand
to be a single logical value. It is typically used in control
structures such as if statements, and it short-circuits: the right-hand
side is evaluated only if the left-hand side is TRUE.
Since R 4.3.0, passing a vector with more than one element to && is an
error. Older versions used only the first element of each vector (with a
warning from R 4.2.0). To compare single elements of longer vectors,
index them explicitly, for example a[1] && b[1].
Example:
a <- c(TRUE, FALSE)
b <- c(TRUE, TRUE)
result <- (a & b)
print(result) # Output: TRUE FALSE
[1] TRUE FALSE
a <- c(TRUE, FALSE)
b <- c(TRUE, TRUE)
result <- (a && b )
Error in a && b : 'length = 2' in coercion to 'logical(1)'
a <- c(TRUE, FALSE)
b <- c(TRUE, TRUE)
result <- (a[1] && b[1])
print(result) # Output: TRUE
[1] TRUE
Explanation: It checks only the first elements:
TRUE && TRUE → TRUE
4. | (Element-wise Logical OR)
Performs logical OR element by element. It returns
TRUE if either of the elements is TRUE.
Example:
a <- c(TRUE, FALSE, FALSE)
b <- c(FALSE, TRUE, FALSE)
result <- a | b
print(result) # Output: TRUE TRUE FALSE
[1] TRUE TRUE FALSE
Explanation:
TRUE | FALSE → TRUE
FALSE | TRUE → TRUE
FALSE | FALSE → FALSE
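5. || (Logical OR for single values)
This performs logical OR on two single logical values and
short-circuits: the right-hand side is evaluated only if the left-hand
side is FALSE. As with &&, operands with more than one element cause an
error since R 4.3.0.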
Example:
a <- c(FALSE, FALSE)
b <- c(TRUE, FALSE)
result <- a || b
Error in a || b : 'length = 2' in coercion to 'logical(1)'
a <- c(FALSE, FALSE)
b <- c(TRUE, FALSE)
result <- a[1] || b[1]
print(result) # Output: TRUE
[1] TRUE
Explanation: It checks only the first elements:
FALSE || TRUE → TRUE
Summary Table
Operator | Description | Operates on | Example | Result
`!` | Logical NOT | Single values or vectors | `!TRUE` | FALSE
`&` | Element-wise AND | Vectors | `c(TRUE, FALSE) & c(TRUE, TRUE)` | TRUE FALSE
`&&` | Scalar AND (short-circuit) | Single values | `TRUE && FALSE` | FALSE
`|` | Element-wise OR | Vectors | `c(FALSE, FALSE) | c(TRUE, FALSE)` | TRUE FALSE
`||` | Scalar OR (short-circuit) | Single values | `FALSE || TRUE` | TRUE
These logical operators are fundamental for filtering rows in
data frames, applying conditions, and
building control flows in R programming.
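A short sketch of these uses follows, with a small made-up data frame named `people`. It filters rows with `&` and `|`, summarizes conditions with `any()` and `all()`, and relies on the short-circuit behaviour of `&&` in an if statement.

```r
# Hypothetical example data
people <- data.frame(
  name = c("Asha", "Ben", "Chen", "Dina"),
  age = c(25, 41, 35, 62),
  member = c(TRUE, FALSE, TRUE, TRUE)
)

# Element-wise AND: rows where both conditions hold
selected <- people[people$age > 30 & people$member, ]
print(selected$name)   # "Chen" "Dina"

# Element-wise OR combined with NOT
either <- people[people$age < 30 | !people$member, ]
print(either$name)     # "Asha" "Ben"

# any() and all() reduce a logical vector to a single value
print(any(people$age > 60))   # TRUE
print(all(people$member))     # FALSE

# && short-circuits: values[1] is never evaluated for an empty vector
values <- numeric(0)
if (length(values) > 0 && values[1] > 10) {
  message_text <- "first value is large"
} else {
  message_text <- "no values or first value is small"
}
print(message_text)
```

The element-wise operators belong in filtering expressions, while `&&` and `||` belong in `if` conditions that must yield exactly one TRUE or FALSE.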
Assignment Operators in R
Assignment operators in R are used to assign values to variables. R
provides multiple ways to assign values, including leftward, rightward,
and equal sign assignments.
1. <- (Leftward Assignment)
- Description: This is the most commonly used
assignment operator in R. The arrow points from the value to the
variable that receives it.
- Example:
x <- 10
print(x)
[1] 10
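- Explanation: The value 10 is assigned
to the variable x. When you print x, it outputs 10.
2. <<- (Global or Upward Assignment)
- Description: This operator assigns a value to a
variable in a parent environment. R searches the enclosing environments
for an existing variable with that name and updates it; if none is found,
the variable is created in the global environment. It is typically used
inside functions to update global variables.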
- Example:
updateVar <- function() {
  y <<- 50
}
updateVar()
print(y)
[1] 50
- Explanation: The variable y is created
in the global environment even though it is assigned inside a function
using <<-.
3. = (Equal Sign Assignment)
- Description: It assigns a value just like <-, but it’s more commonly used when passing
arguments to functions.
- Example:
z = 20
print(z)
[1] 20
- Explanation: The value 20 is assigned
to the variable z. Both <- and = can be used for simple assignments, but
<- is preferred in R programming style.
Summary Table
Operator | Scope | Typical Use | Example
<- | Local/Global | General assignment | a <- 5
<<- | Parent/Global env | Inside functions (global) | x <<- 10
= | Local | Assign values or function args | x = 20
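The introduction to this section mentions rightward assignment, which R writes as `->`. The sketch below shows it, together with a case where `<<-` updates a variable in an enclosing function rather than in the global environment. The names `w`, `make_counter`, and `counter` are illustrative.

```r
# Rightward assignment: the value is on the left, the name on the right
30 -> w
print(w)   # 30

# Inside a call, = names an argument rather than creating a variable
m <- mean(x = c(1, 2, 3))
print(m)   # 2

# <<- updates the nearest existing variable in an enclosing environment.
# Here that is `count` inside make_counter(), not a global variable.
make_counter <- function() {
  count <- 0
  function() {
    count <<- count + 1
    count
  }
}

counter <- make_counter()
first_call <- counter()
second_call <- counter()
print(second_call)   # 2

# No global variable named count was created
print(exists("count", envir = globalenv(), inherits = FALSE))
```

This counter pattern is one of the few places where `<<-` is commonly considered appropriate; elsewhere, returning a value from the function is usually clearer.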
FUNCTIONS IN R
What is a Function in R?
A function in R is a block of reusable code designed
to perform a specific task. R provides many built-in functions, and it
also allows users to define their own functions.
Why Use Functions?
- To avoid repetition of code
- To make code modular, clean, and readable
- To simplify debugging and testing
- To break complex problems into manageable parts
Syntax of a Function in R
function_name <- function(arg1, arg2, ...) {
  # Code block
  return(result)
}
Example: Simple Function in R
add_numbers <- function(a, b) {
  result <- a + b
  return(result)
}
add_numbers(5, 3)
[1] 8
Components of a Function
Component | Description
function_name | Name of the function
function() | Keyword to define a function
arguments | Input values passed to the function
return() | Optional – returns output to the calling environment
Body | Block of code that defines what the function does
Built-in Functions in R
R comes with many built-in functions:
Function | Description | Example
sum() | Calculates sum of values | sum(c(1, 2, 3))
mean() | Calculates average | mean(c(4, 5, 6))
sqrt() | Square root | sqrt(25)
length() | Number of elements in a vector | length(c(1,2,3))
User-defined Function with Default Values
greet <- function(name = "User") {
  paste("Hello", name)
}
greet("Srinivas")
[1] "Hello Srinivas"
greet()
[1] "Hello User"
Function Without Return Statement
If return() is not used, R will return the last
evaluated expression:
multiply <- function(a, b) {
  a * b
}
multiply(4, 5)
[1] 20
Nested Function Example
outer <- function(x) {
  inner <- function(y) {
    return(y + 1)
  }
  return(inner(x) * 2)
}
outer(4) # returns (4+1)*2 = 10
[1] 10
Anonymous Functions (No Name)
sapply(c(1, 2, 3), function(x) x^2)
[1] 1 4 9
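Anonymous functions are most often passed to other functions that apply them to each element. The following sketch uses base R helpers for this: `vapply()`, `Filter()`, `Reduce()`, and `mapply()`. The vector `numbers` is example data.

```r
numbers <- c(1, 2, 3, 4, 5, 6)

# vapply() works like sapply() but checks the type of each result
squares <- vapply(numbers, function(x) x^2, numeric(1))
print(squares)   # 1 4 9 16 25 36

# Filter() keeps the elements for which the function returns TRUE
evens <- Filter(function(x) x %% 2 == 0, numbers)
print(evens)     # 2 4 6

# Reduce() combines the elements pairwise into a single value
total <- Reduce(function(acc, x) acc + x, numbers)
print(total)     # 21

# mapply() applies a function to several vectors in parallel
products <- mapply(function(a, b) a * b, c(1, 2, 3), c(4, 5, 6))
print(products)  # 4 10 18
```

When the same anonymous function is needed in more than one place, giving it a name is usually the better choice.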
Variable Scope in Functions
Variables created inside a function are
local to that function and cannot be accessed from outside it:
myfunc <- function() {
  local_x <- 10 # Local variable, exists only inside myfunc()
  print(local_x)
}
myfunc()
[1] 10
print(local_x) # Error: object 'local_x' not found
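A related point is what happens when a function assigns to a name that already exists outside it. As a minimal sketch (the names `total` and `add_bonus` are made up), the function below reads the outer value but changes only its own local copy.

```r
# A variable in the calling environment
total <- 100

add_bonus <- function(bonus) {
  # Reads the outer `total`, then creates a separate local `total`
  total <- total + bonus
  total
}

print(add_bonus(10))   # 110

# The outer variable is unchanged
print(total)           # 100

# Names created only inside a function do not exist afterwards
make_temp <- function() {
  temp_value_inside <- 1
  invisible(NULL)
}
make_temp()
print(exists("temp_value_inside"))   # FALSE
```

To change the outer value, the usual approach is to assign the function's result back, as in `total <- add_bonus(10)`.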
Summary Table
Function Type | Description | Example
Built-in | Predefined R functions | sum(), mean(), sqrt()
User-defined | Functions created by users | my_function <- function() {...}
With defaults | Functions with default argument values | greet <- function(name = "User")
Anonymous | Functions without a name | function(x) x+1 in sapply()
