R features

  1. Well-developed and effective programming language – R is a simple yet powerful language specifically designed for data analysis and statistical computing.

  2. Efficient data handling and storage – R provides robust structures for handling and storing large datasets efficiently.

  3. Rich set of operators – R includes a variety of operators for performing calculations on arrays, lists, vectors, and matrices.

  4. Comprehensive tools for data analysis – R offers numerous built-in functions and packages that support a wide range of data analysis tasks.

  5. Advanced graphical capabilities – R supports both basic and advanced graphical techniques for data visualization, aiding in better interpretation of results.

  6. Extensive packages for advanced applications – R has a vast collection of packages for data mining, machine learning, and big data analytics, making it highly suitable for modern data science workflows.
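As a small illustration of point 3, the sketch below applies a few operators to a vector and to a matrix. The data and the variable names (prices, quantities, m) are made up for this example.

```r
# Hypothetical data for illustration only
prices     <- c(2.5, 4.0, 1.5)
quantities <- c(4, 1, 10)

# Operators work element by element on whole vectors
totals <- prices * quantities
print(totals)        # 10  4 15
print(sum(totals))   # 29

# A 2 x 2 matrix, filled column by column
m <- matrix(c(1, 2, 3, 4), nrow = 2)

print(m * 2)      # element-wise multiplication
print(m %*% m)    # matrix multiplication
print(t(m))       # transpose
```

Note that * multiplies matching elements, while %*% performs true matrix multiplication.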

R Environment Workspace Description

Figure: R Environment Workspace

  1. R Script/File – This is where users write and save their R code. It allows you to edit, run, and manage reusable scripts for analysis and automation.

  2. Environment – Displays the current workspace, including all the variables, data frames, functions, and loaded objects that are active in your session.

  3. History – Keeps a record of all the R commands executed during the session, allowing users to review and reuse previous commands easily.

  4. Console – The command-line interface where users can directly enter and execute R commands and immediately see the output or results.

  5. Files – Provides access to the directory structure of your computer so that you can browse, open, or manage files and folders.

  6. Plots – Displays visual output generated from data visualizations, such as graphs and charts created using R’s plotting functions.

  7. Packages – Shows the list of R packages installed and loaded in the session. Users can install new packages or load/unload existing ones from here.

  8. Help – Offers documentation and help for R functions, packages, and commands. Users can search for and view detailed descriptions, usage, and examples.

Step-by-step instructions to download and install R and RStudio:

Step 1: Download and Install R

  1. Go to the CRAN R Project website: https://cran.r-project.org

  2. Choose your operating system: Click on “Download R for Windows”, “Download R for macOS”, or “Download R for Linux” depending on your system.

  3. Download the R installer: For Windows, click “base”, then click “Download R x.x.x for Windows” (e.g., R 4.3.1). For macOS, click the link for the latest .pkg installer.

  4. Run the installer: Open the downloaded file and follow the instructions to install R (default settings are usually fine).

Step 2: Download and Install RStudio

  1. Go to the RStudio website, https://posit.co/download/rstudio-desktop/, and choose RStudio Desktop (Free).

  2. Click on “Download RStudio Desktop for [Your OS]”. It will redirect you to the appropriate installer for Windows, macOS, or Linux.

  3. Install RStudio: Run the downloaded installer and follow the on-screen instructions.
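Once both programs are installed, one quick way to confirm that RStudio can find R is to run a few commands in the Console. This is only a sanity check, and the version string you see depends on the release you installed.

```r
# Print the version of R that is currently running
print(R.version.string)

# Show the directory R treats as the working directory
print(getwd())

# Summarize the R version, operating system, and attached packages
print(sessionInfo())
```

If these commands print without errors, the installation is working.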

Packages and Libraries

R provides access to a vast ecosystem of packages, with many thousands available through CRAN (the Comprehensive R Archive Network). These packages extend R’s functionality and support a wide range of tasks in data science and statistical computing. Some of the key capabilities offered by R packages include:

Data visualization: Tools for creating charts, graphs, and interactive plots (e.g., ggplot2, plotly).

Data statistics and exploration: Functions to summarize, explore, and understand data distributions and relationships.

Data transformation: Utilities for cleaning, reshaping, and manipulating data (e.g., dplyr, tidyr).

Outlier detection: Methods for identifying and handling anomalies in data.

Feature selection: Techniques to select important variables that contribute most to prediction or classification.

Dimension reduction: Tools like PCA (Principal Component Analysis) to reduce the number of variables while preserving data structure.

Classification: Algorithms for supervised learning and assigning data to categories (e.g., decision trees, SVM).

Clustering: Unsupervised learning techniques for grouping similar data points (e.g., k-means, hierarchical clustering).

Regression validation: Tools to evaluate the performance of regression models.

Classification validation: Functions to assess accuracy, precision, recall, and F1-score of classification models.

Clustering validation: Metrics to validate the quality and effectiveness of clustering results (e.g., silhouette score).
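To make a few of these capabilities concrete, here is a minimal sketch that uses only functions shipped with R (summary, prcomp, and kmeans from the stats package) rather than add-on packages. It uses the built-in iris data set; the choice of three clusters is an assumption made for this example.

```r
# iris ships with R; keep the four numeric measurement columns
measurements <- iris[, 1:4]

# Data exploration: summary statistics for each column
print(summary(measurements))

# Dimension reduction: PCA on standardized variables
pca <- prcomp(measurements, scale. = TRUE)
print(summary(pca))

# Clustering: k-means with 3 clusters (3 is an assumption for this sketch)
set.seed(42)  # k-means starts from random centers, so fix the seed
clusters <- kmeans(scale(measurements), centers = 3, nstart = 10)

# Clustering validation: compare clusters with the known species labels
print(table(cluster = clusters$cluster, species = iris$Species))
```

The cross-table at the end is a simple form of validation: it shows how well the clusters found without labels line up with the actual species.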

Installing and Loading Packages in R Using RStudio GUI

Figure: Installing and Loading Packages in R

  1. Open RStudio: Start by opening the RStudio application on your computer.

  2. Install a Package Using the GUI: Go to the “Packages” tab in the bottom-right pane.

  3. Click on the “Install” button (usually at the top left of the “Packages” tab). A dialog box will appear. In the “Packages” field, type the name of the package you want to install (e.g., ggplot2). Make sure the checkbox “Install dependencies” is selected.

  4. Click the “Install” button. RStudio will now download and install the package from CRAN.

  5. Load a Package Using the GUI: After installation, stay in the “Packages” tab. Find the installed package in the list (e.g., ggplot2). Tick the checkbox next to the package name to load it into your R session.

Steps to Install and Load Packages in R

  1. Installing a Package: To install a package in R, you use the install.packages() function. This step downloads the package from CRAN and installs it on your system.

Example: install.packages("ggplot2")

This command installs the ggplot2 package, which is widely used for data visualization.

  2. Loading a Package: Once a package is installed, you need to load it into your current R session using the library() function.

Example: library(ggplot2)

This command loads the ggplot2 package so that you can use its functions.

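  3. Check if a Package is Already Installed: You can check whether a package is available using the require() function or by checking the installed packages list. Note that require() also tries to load the package, just as library() does.

Example: require(ggplot2)

If the package is not installed, require() returns FALSE with a warning instead of stopping with an error. To check without loading the package, use requireNamespace("ggplot2", quietly = TRUE).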

  4. Install and Load Multiple Packages: You can install several packages at once by passing a character vector to install.packages(), and load them by applying library() to each name.

Example:

packages <- c("dplyr", "tidyr", "ggplot2")

install.packages(packages)   # Install all

lapply(packages, library, character.only = TRUE)  # Load all

  5. Viewing Installed Packages: You can view all installed packages using:

installed.packages()

  6. Update Packages (Optional): To update all installed packages:

update.packages()
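Calling install.packages() on every run re-downloads packages that are already present. One possible way to avoid that is sketched below; missing_packages() and install_if_missing() are hypothetical helper names invented for this example, not functions that come with R.

```r
# Return the names in pkgs that are not installed yet
missing_packages <- function(pkgs) {
  is_installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!is_installed]
}

# Install only what is missing (hypothetical helper, not part of R)
install_if_missing <- function(pkgs) {
  to_install <- missing_packages(pkgs)
  if (length(to_install) > 0) {
    install.packages(to_install)
  }
  invisible(to_install)
}

# stats and utils ship with R, so nothing is reported as missing
print(missing_packages(c("stats", "utils")))
```

In a real script you might call install_if_missing(c("dplyr", "tidyr", "ggplot2")) once at the top and then load the packages with library().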



Basic Data Types in R

R is a powerful programming language used for statistical computing and data analysis. It supports several basic data types, which are fundamental for handling and manipulating data. These data types include: numeric, integer, complex, character (string), and logical (boolean).


1. Numeric

  • The numeric data type represents real numbers (i.e., numbers with decimal points).
  • It is the default type for numbers in R unless specified otherwise.

Example:

x <- 3.14
class(x)  # Output: "numeric"
[1] "numeric"

Here, x is a numeric variable storing the value 3.14.


2. Integer

  • An integer is a whole number (without a decimal point).
  • In R, you must use an L suffix to explicitly define an integer.

Example:

y <- 10L
class(y)  # Output: "integer"
[1] "integer"

The L tells R to treat 10 as an integer rather than a numeric (which would be a double).


3. Complex

  • The complex data type represents complex numbers, which have both a real and imaginary part (e.g., 1 + 2i).
  • Useful in mathematical computations involving complex arithmetic.

Example:

z <- 2 + 3i
class(z)  # Output: "complex"
[1] "complex"

Here, z is a complex number with a real part of 2 and an imaginary part of 3 (written 3i).


4. Character (String)

  • The character data type is used to store text or string values, such as words, phrases, or any sequence of characters.
  • Character data must be enclosed in either single ' or double " quotes.

Example:

name <- "Srinivas"
class(name)  # Output: "character"
[1] "character"

Here, "Srinivas" is a character string.


5. Logical (Boolean)

  • The logical data type stores TRUE or FALSE values.
  • This type is often used in conditions, comparisons, and filtering data.

Example:

is_senior <- TRUE
class(is_senior)  # Output: "logical"
[1] "logical"
age <- 25
age > 30  # Output: FALSE
[1] FALSE

Here, is_senior is a logical variable. The expression age > 30 also returns a logical value.


Summary Table

Data Type    Description                          Example
Numeric      Decimal or floating-point numbers    x <- 5.6
Integer      Whole numbers with L suffix          y <- 100L
Complex      Numbers with real + imaginary parts  z <- 3 + 4i
Character    Text or strings                      name <- "R User"
Logical      Boolean values (TRUE or FALSE)       flag <- FALSE
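The sketch below shows how these types can be inspected and converted. It relies only on base R functions (class, typeof, is.integer, and the as.* converters); the variable names are illustrative.

```r
# class() gives the high-level type; typeof() gives the internal storage type
x <- 3.14
print(class(x))    # "numeric"
print(typeof(x))   # "double"

y <- 10L
print(typeof(y))   # "integer"

# A plain 10 without the L suffix is stored as a double
print(is.integer(10))    # FALSE
print(is.integer(10L))   # TRUE

# Explicit conversion between types
print(as.integer("42"))     # 42
print(as.character(3.14))   # "3.14"
print(as.numeric(TRUE))     # 1

# Mixing types in one vector coerces everything to the most general type
mixed <- c(1, "a", TRUE)
print(class(mixed))   # "character"
```

The last example matters in practice: a single stray text value in a column of numbers turns the whole vector into character data.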


Arithmetic Operators in R

R provides a set of arithmetic operators that are used to perform basic mathematical operations on numeric values, vectors, or variables. These operators include addition, subtraction, multiplication, division, modulus, and exponentiation.


1. Addition (+)

The + operator is used to add two or more numbers.

Example:

a <- 10
b <- 5
result <- a + b
print(result)  # Output: 15
[1] 15

In this example, 10 and 5 are added to get 15.


2. Subtraction (-)

The - operator is used to subtract one number from another.

Example:

a <- 10
b <- 5
result <- a - b
print(result)  # Output: 5
[1] 5

Here, 5 is subtracted from 10 to get 5.


3. Multiplication (*)

The * operator is used to multiply numbers.

Example:

a <- 10
b <- 5
result <- a * b
print(result)  # Output: 50
[1] 50

The multiplication of 10 and 5 results in 50.


4. Division (/)

The / operator is used to divide one number by another.

Example:

a <- 10
b <- 5
result <- a / b
print(result)  # Output: 2
[1] 2

Here, 10 divided by 5 gives the result 2.


5. Modulus (%%)

The %% operator gives the remainder when one number is divided by another.

Example:

a <- 10
b <- 3
result <- a %% b
print(result)  # Output: 1
[1] 1

When 10 is divided by 3, the remainder is 1.


6. Exponentiation (^)

The ^ operator is used to raise a number to the power of another.

Example:

a <- 2
b <- 3
result <- a ^ b
print(result)  # Output: 8
[1] 8

In this case, 2 raised to the power of 3 equals 8.


Summary Table

Operator  Description          Example   Result
+         Addition             5 + 3     8
-         Subtraction          5 - 3     2
*         Multiplication       5 * 3     15
/         Division             6 / 3     2
%%        Modulus (remainder)  10 %% 3   1
^         Exponentiation       2 ^ 4     16

These arithmetic operators are frequently used in mathematical computations, statistical analysis, and data manipulation tasks in R.
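The examples above use single numbers, but the same operators also work on whole vectors. The sketch below shows that, along with two details not covered above: the integer division operator %/% and operator precedence.

```r
# Arithmetic operators work element by element on vectors
v <- c(1, 2, 3)
print(v * 2)               # 2 4 6
print(v + c(10, 20, 30))   # 11 22 33

# %/% is integer division, the counterpart of the modulus operator %%
print(10 %/% 3)   # 3
print(10 %% 3)    # 1

# With a negative left operand, %% takes the sign of the divisor
print(-7 %% 3)    # 2

# Precedence: ^ is applied before unary minus, and * before +
print(-2 ^ 2)      # -4
print((-2) ^ 2)    # 4
print(2 + 3 * 4)   # 14
```

When in doubt about precedence, parentheses make the intended order explicit.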


Relational Operators in R

Relational operators in R are used to compare two values or expressions. The result of a relational operation is always a logical value: either TRUE or FALSE. These are essential for decision-making and conditional programming.


1. < (Less than)

This operator checks whether the value on the left is less than the value on the right.

Example:

a <- 10
b <- 20
c <- a < b
print(c)  # Output: TRUE
[1] TRUE

Explanation: Since 10 is less than 20, the result is TRUE.


2. > (Greater than)

Checks if the value on the left is greater than the value on the right.

Example:

a <- 10
b <- 20
c <- a > b
print(c)  # Output: FALSE
[1] FALSE

Explanation: 10 is not greater than 20, so the result is FALSE.


3. <= (Less than or equal to)

Checks if the left-hand side is less than or equal to the right-hand side.

Example:

a <- 10
b <- 10
c <- a <= b
print(c)  # Output: TRUE
[1] TRUE

Explanation: 10 is equal to 10, so it satisfies the condition.


4. >= (Greater than or equal to)

Checks if the value on the left is greater than or equal to the one on the right.

Example:

a <- 15
b <- 10
c <- a >= b
print(c)  # Output: TRUE
[1] TRUE

Explanation: 15 is greater than 10, so the result is TRUE.


5. == (Equal to)

Tests whether two values are equal.

Example:

a <- 10
b <- 10
c <- a == b
print(c)  # Output: TRUE
[1] TRUE

Explanation: Both values are equal, so it returns TRUE.


6. != (Not equal to)

Checks if the two values are not equal.

Example:

a <- 10
b <- 20
c <- a != b
print(c)  # Output: TRUE
[1] TRUE

Explanation: 10 and 20 are not equal, so the result is TRUE.


Summary Table

Operator  Description               Example    Result
<         Less than                 10 < 20    TRUE
>         Greater than              10 > 20    FALSE
<=        Less than or equal to     10 <= 10   TRUE
>=        Greater than or equal to  15 >= 10   TRUE
==        Equal to                  10 == 10   TRUE
!=        Not equal to              10 != 20   TRUE

These relational operators are particularly useful in if-else conditions, filtering data, and loop control in R programming.
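As a sketch of the filtering use case, the example below compares a whole vector against a threshold and uses the result to select elements. It also shows a common pitfall with == on decimal numbers. The ages vector is made-up data.

```r
# Hypothetical data for illustration only
ages <- c(15, 42, 30, 67)

# A comparison on a vector returns one logical value per element
is_over_30 <- ages > 30
print(is_over_30)         # FALSE  TRUE FALSE  TRUE

# The logical vector can be used to filter the original vector
print(ages[is_over_30])   # 42 67

# TRUE counts as 1, so sum() counts the matches
print(sum(is_over_30))    # 2

# Decimal numbers are stored approximately, so == can surprise you
print(0.1 + 0.2 == 0.3)                    # FALSE
print(isTRUE(all.equal(0.1 + 0.2, 0.3)))   # TRUE
```

For decimal values, all.equal() compares within a small tolerance and is usually the safer choice than ==.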


Logical Operators in R

Logical operators are used to combine or modify logical values (TRUE or FALSE) in R. They are especially useful in conditional statements, data filtering, and vectorized operations.


1. ! (Logical NOT)

This operator reverses the logical state of its operand. If a condition is TRUE, applying ! will make it FALSE, and vice versa.

Example:

x <- TRUE
result <- !x
print(result)  # Output: FALSE
[1] FALSE

Explanation: The value of x is TRUE. The ! operator flips it to FALSE.


2. & (Element-wise Logical AND)

This operator performs a logical AND element by element on two logical vectors. It returns TRUE if both elements are TRUE.

Example:

a <- c(TRUE, FALSE, TRUE)
b <- c(TRUE, TRUE, FALSE)
result <- a & b
print(result)  # Output: TRUE FALSE FALSE
[1]  TRUE FALSE FALSE

Explanation: It checks each corresponding pair:

  • TRUE & TRUE → TRUE
  • FALSE & TRUE → FALSE
  • TRUE & FALSE → FALSE

3. && (Logical AND – single values only)

This operator also performs logical AND, but it is meant for single (length-one) logical values. It is often used in control structures like if statements.

In older versions of R, && looked only at the first element of each operand and ignored the rest.

Since R 4.3.0, passing a vector with more than one element to && is an error, as the second example below shows.

To compare the first elements of two vectors, index them explicitly, as in a[1] && b[1].

Example:

a <- c(TRUE, FALSE)
b <- c(TRUE, TRUE)
result <- (a & b)
print(result)  # Output: TRUE FALSE
[1]  TRUE FALSE
a <- c(TRUE, FALSE)
b <- c(TRUE, TRUE)
result <- (a && b )
Error in a && b : 'length = 2' in coercion to 'logical(1)'
a <- c(TRUE, FALSE)
b <- c(TRUE, TRUE)
result <- (a[1] && b[1])
print(result)  # Output: TRUE
[1] TRUE

Explanation: Only the first elements are compared: TRUE && TRUE → TRUE


4. | (Element-wise Logical OR)

Performs logical OR element by element. It returns TRUE if either of the elements is TRUE.

Example:

a <- c(TRUE, FALSE, FALSE)
b <- c(FALSE, TRUE, FALSE)
result <- a | b
print(result)  # Output: TRUE TRUE FALSE
[1]  TRUE  TRUE FALSE

Explanation:

  • TRUE | FALSE → TRUE
  • FALSE | TRUE → TRUE
  • FALSE | FALSE → FALSE

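5. || (Logical OR – single values only)

This performs logical OR on single (length-one) logical values. As with &&, passing longer vectors is an error since R 4.3.0, so the example below indexes the first elements explicitly.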

Example:

a <- c(FALSE, FALSE)
b <- c(TRUE, FALSE)
result <- a || b
Error in a || b : 'length = 2' in coercion to 'logical(1)'
a <- c(FALSE, FALSE)
b <- c(TRUE, FALSE)
result <- a[1] || b[1]
print(result)  # Output: TRUE
[1] TRUE

Explanation: Only the first elements are compared: FALSE || TRUE → TRUE


Summary Table

Operator  Name              Works On       Example                            Result
!         Logical NOT       Single value   !TRUE                              FALSE
&         Element-wise AND  Vectors        c(TRUE, FALSE) & c(TRUE, TRUE)     TRUE FALSE
&&        Scalar AND        Single values  TRUE && FALSE                      FALSE
|         Element-wise OR   Vectors        c(FALSE, FALSE) | c(TRUE, FALSE)   TRUE FALSE
||        Scalar OR         Single values  FALSE || TRUE                      TRUE

These logical operators are fundamental for filtering rows in data frames, applying conditions, and building control flows in R programming.
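The sketch below shows both uses side by side: & and | for filtering rows of a data frame, and && inside an if condition. The people data frame and its column names are invented for this example.

```r
# Hypothetical data for illustration only
people <- data.frame(
  name   = c("Asha", "Ben", "Chen", "Dee"),
  age    = c(25, 41, 67, 35),
  member = c(TRUE, FALSE, TRUE, TRUE),
  stringsAsFactors = FALSE
)

# & keeps rows where both conditions hold
adult_members <- people[people$age > 30 & people$member, ]
print(adult_members$name)        # "Chen" "Dee"

# | keeps rows where at least one condition holds
young_or_nonmember <- people[people$age < 30 | !people$member, ]
print(young_or_nonmember$name)   # "Asha" "Ben"

# && stops as soon as the result is known (short-circuit evaluation),
# so the right-hand side is never evaluated when value is NULL
value <- NULL
is_large <- !is.null(value) && value > 5
print(is_large)   # FALSE
```

Use & and | when working with whole columns, and && and || when a condition must produce exactly one TRUE or FALSE.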



Assignment Operators in R

Assignment operators in R are used to assign values to variables. R provides multiple ways to assign values, including leftward, rightward, and equal sign assignments.


1. <- (Leftward Assignment)

  • Description: This is the most commonly used assignment operator in R. It assigns a value to a variable by pointing the arrow from the value to the variable.
  • Example:
x <- 10
print(x)
[1] 10
  • Explanation: The value 10 is assigned to the variable x. When you print x, it outputs 10.

2. <<- (Global or Upward Assignment)

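  • Description: This operator assigns a value to a variable in a parent environment. R searches the enclosing environments for an existing variable of that name and, if none is found, creates it in the global environment. It is typically used inside functions to update variables defined outside them.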
  • Example:
updateVar <- function() {
  y <<- 50
}
updateVar()
print(y)
[1] 50
  • Explanation: The variable y is created in the global environment even though it is assigned inside a function using <<-.

3. = (Equal Sign Assignment)

  • Description: It assigns a value just like <-, but it’s more commonly used when passing arguments to functions.
  • Example:
  z = 20
  print(z)
[1] 20
  • Explanation: The value 20 is assigned to the variable z. Both <- and = can be used for simple assignments, but <- is preferred in R programming style.

Summary Table

Operator  Usage Scope        Common Use Case                 Example
<-        Local/Global       General assignment              a <- 5
<<-       Parent/Global env  Inside functions (global)       x <<- 10
=         Local              Assign values or function args  x = 20
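The introduction to this section mentions rightward assignment, which has no example above. The sketch below shows it, together with a case where <<- updates a variable in an enclosing function rather than in the global environment. make_counter() and n_calls are hypothetical names chosen for this example.

```r
# Rightward assignment: the value is on the left, the name on the right
10 -> x
print(x)   # 10

# <<- updates n_calls in the enclosing function, not a new local copy
make_counter <- function() {
  n_calls <- 0
  function() {
    n_calls <<- n_calls + 1
    n_calls
  }
}

counter <- make_counter()
first  <- counter()
second <- counter()
print(c(first, second))   # 1 2

# n_calls lives inside the counter, so no global variable was created
print(exists("n_calls"))  # FALSE
```

Because <<- changes state outside the current function, it is best kept for cases like this counter where that is the explicit intent.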



FUNCTIONS IN R


What is a Function in R?

A function in R is a block of reusable code designed to perform a specific task. R provides many built-in functions, and it also allows users to define their own functions.


Why Use Functions?

  • To avoid repetition of code
  • To make code modular, clean, and readable
  • To simplify debugging and testing
  • To break complex problems into manageable parts

Syntax of a Function in R

function_name <- function(arg1, arg2, ...) {
  # Code block
  return(result)
}

Example: Simple Function in R

add_numbers <- function(a, b) {
  result <- a + b
  return(result)
}
add_numbers(5, 3)
[1] 8

Components of a Function

Component      Description
function_name  Name of the function
function()     Keyword to define a function
arguments      Input values passed to the function
return()       Optional – returns output to the calling environment
Body           Block of code that defines what the function does

Built-in Functions in R

R comes with many built-in functions:

Function  Description                     Example
sum()     Calculates sum of values        sum(c(1, 2, 3))
mean()    Calculates average              mean(c(4, 5, 6))
sqrt()    Square root                     sqrt(25)
length()  Number of elements in a vector  length(c(1, 2, 3))

User-defined Function with Default Values

greet <- function(name = "User") {
  paste("Hello", name)
}
greet("Srinivas")
[1] "Hello Srinivas"
greet()
[1] "Hello User"
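Default values become more useful when a function has several arguments. The sketch below, built around a hypothetical power() function, shows that arguments can be matched by position or by name, and that only the arguments without defaults must be supplied.

```r
# power() is a made-up example; exponent defaults to 2
power <- function(base, exponent = 2) {
  base ^ exponent
}

print(power(3))                        # 9   (exponent uses its default)
print(power(3, 3))                     # 27  (matched by position)
print(power(exponent = 3, base = 2))   # 8   (matched by name, any order)
```

Naming arguments in the call makes the code easier to read and protects against passing values in the wrong order.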

Function Without Return Statement

If return() is not used, R will return the last evaluated expression:

multiply <- function(a, b) {
  a * b
}
multiply(4, 5)
[1] 20

Nested Function Example

outer <- function(x) {
  inner <- function(y) {
    return(y + 1)
  }
  return(inner(x) * 2)
}
outer(4)  # returns (4+1)*2 = 10
[1] 10

Anonymous Functions (No Name)

sapply(c(1, 2, 3), function(x) x^2)
[1] 1 4 9
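Anonymous functions are also commonly passed to other base R helpers such as Filter(), Map(), and vapply(). A minimal sketch follows. R 4.1.0 and later also accept \(x) as shorthand for function(x); the long form is used here so the code runs on older versions too.

```r
numbers <- 1:6

# Filter() keeps the elements for which the function returns TRUE
evens <- Filter(function(n) n %% 2 == 0, numbers)
print(evens)          # 2 4 6

# Map() applies a function to matching elements of several vectors
sums <- Map(function(a, b) a + b, 1:3, 4:6)
print(unlist(sums))   # 5 7 9

# vapply() is like sapply() but checks the type of each result
squares <- vapply(numbers, function(n) n^2, numeric(1))
print(squares)        # 1  4  9 16 25 36
```

vapply() is often preferred over sapply() in scripts because it fails early if a result has an unexpected type or length.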

Variable Scope in Functions

Variables created inside a function are local to that function:

myfunc <- function() {
  x <- 10  # Local variable
  print(x)
}
myfunc()
[1] 10
print(x)  # Fails if no x exists in the global environment
Error: object 'x' not found
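A related point is what happens when a local variable has the same name as a global one. The sketch below uses made-up names (total, change_local, read_global) to show that assigning inside a function creates a separate local variable, while reading a name that is not defined locally falls back to the global one.

```r
total <- 1   # global variable

change_local <- function() {
  total <- 99   # creates a new local variable; the global one is untouched
  total
}

print(change_local())   # 99
print(total)            # 1

# A function can still read a global variable it does not redefine
read_global <- function() {
  total + 1
}
print(read_global())    # 2
```

Passing values in as arguments and returning results is usually clearer than relying on global variables.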

Summary Table

Type           Description                             Example
Built-in       Predefined R functions                  sum(), mean(), sqrt()
User-defined   Functions created by users              my_function <- function() {...}
With defaults  Functions with default argument values  greet <- function(name = "User")
Anonymous      Functions without a name                function(x) x+1 in sapply()

