##################################################################.
#
# TOPICS
#
# - functions: sqrt abs max min ceiling floor sum mean
# trunc round
#
# - vector arithmetic and recycling rule
#
# - combining vectors with c function
#
# - functions: c length sum rep seq range
#
# - colon operator (e.g. 3:5 5:-3)
#
##################################################################.
# It's recommended to start each coding session by removing all variables
# left over from the last time you used R. This prevents confusion caused by
# a stale variable from an earlier session.
rm( list=ls() )
4 4. Using some built-in functions
4.1 rm(list=ls())
4.2 sqrt() abs() NaN nesting function calls
############################################################.
#
# Intro to functions
#
# Intro to R vectors
#
############################################################.
#-----------------------------------------------------------.
# sqrt function - eg. sqrt(49) ####
#
# abs function - eg. abs(-49) ####
#
# NaN is "not a number" - eg. sqrt(-49) ####
#
# nesting function calls - eg. sqrt(abs(-49)) ####
#----------------------------------------------------------.
# To take the square-root of a number in R, use the sqrt function
# For example:
sqrt(25) # get the square root of 25
[1] 5
sqrt(10) # get the square root of 10
[1] 3.162278
sqrt(-5) # square roots of negative numbers return NaN (i.e. "not a number")
Warning in sqrt(-5): NaNs produced
[1] NaN
# sqrt is an example of a "function".
# A function takes some information as input (e.g. 25)
# and returns a value as output, (e.g. 5)
#
# To see R's help page for the sqrt function type the following:
#
# ?sqrt # show the help page for sqrt ####
# Some R help pages show information for multiple functions.
# The help page for sqrt also shows information about the abs function.
#
# abs gives you the absolute value of a number (i.e. the positive version of the number)
abs(2) # 2
[1] 2
abs(-2) # 2
[1] 2
# We can "nest" one function call inside another function call.
#
# When we do so the value that is "returned" by the "inner" function call
# is then "passed" to the "outer" function call.
sqrt(-49) # NaN
Warning in sqrt(-49): NaNs produced
[1] NaN
sqrt(abs(-49)) # 7
[1] 7
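# The nesting described above can be sketched as two separate steps.
# This is only an illustration: innerResult and outerResult are hypothetical
# variable names that are not used anywhere else in these notes.

```r
innerResult <- abs(-49)            # the inner call runs first and returns 49
outerResult <- sqrt(innerResult)   # the outer call then receives 49
outerResult                        # 7
outerResult == sqrt(abs(-49))      # TRUE - nesting gives the same answer
```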
4.4 function call
#.......................................................................
# A particular use of a function is known as a "function call" ####
#.......................................................................
sqrt(100) # this is a function call of the sqrt function
[1] 10
sqrt(64) # this is a different function call of the sqrt function
[1] 8
4.5 return value
#.......................................................................
# The output of a function is known as the "return value" of the function. ####
#.......................................................................
sqrt(64) # The "return value" of this "function call" is 8
[1] 8
4.6 max() min() ceiling() floor() sum()
# Some functions can take more than one argument.
# However, all functions return exactly one item.
# (we will describe an exception to this later).
#
# max and min functions return the maximum and minimum value of all of their arguments. ####
# For example:
max(4,10,2,5) # four arguments, 4,10,2,5 - one return value, i.e. 10
[1] 10
min(4,10,2,5) # four arguments, 4,10,2,5 - one return value, i.e. 2
[1] 2
# another example
joesSalary <- 50
suesSalary <- 70
bobsSalary <- 60
# three arguments - joesSalary, suesSalary, bobsSalary
# one return value, i.e. 70
max(joesSalary, suesSalary, bobsSalary)
[1] 70
4.7 arguments (AKA parameters)
#.......................................................................
# The input values to a function are known as the argument(s) or the parameter(s) of
# a function. (Some people/books may draw a distinction between the word argument
# and the word parameter but for our purposes they mean the same thing.)
#.......................................................................
# In the following code:
# 36 is an argument (or parameter), i.e. 36 is "passed" to the sqrt function.
# the return value is 6
sqrt(36)
[1] 6
4.8 “passing values” to a function
#.......................................................................
# Specifying a value as an argument to a function is known as "passing" that value to the function. ####
#.......................................................................
sqrt(36) # 36 is being "passed" to the sqrt function.
[1] 6
#.......................................................................
# The arguments to a function may be expressions, not just single values. ####
#.......................................................................
2 * max ( pi ^ 2 , pi * 2) # 1st argument: pi^2 , 2nd argument: pi*2
[1] 19.73921
4.9 more functions: ceiling, floor, sum
ceiling(3.2) # ceiling rounds up to the next higher whole number ####
[1] 4
ceiling(-3.2) # ... be careful with negatives
[1] -3
floor (3.2) # floor rounds down to nearest whole number ####
[1] 3
floor(-3.2) # ... be careful with negatives
[1] -4
sum(2,10,4) # sum returns the sum of its arguments ####
[1] 16
# we will speak about averages, or the "mean function" later ...
4.10 R’s “help” system ?someFunction ??anyWord
########################################################.
#
# R's "help" system ####
#
########################################################.
#----------------------------------------------------------------------------.
# To get more information about a particular function,
# use the "help" function. You may put the name of the R function you
# want help with in "quotes", but the quotes are usually optional.
# The "help page" or "manual page" for that function (or group of
# functions) will appear in the "help" window.
#
# help("sum") # show the R documentation page for the sum function.
#
# help(sum) # same thing - you don't need the quotes
#
# ?sum # same thing - ? is shorthand for the help function
#
# ?help # you can even get help on the help function
#
# ??max # The double question mark ?? searches for a particular word in any help page.
#----------------------------------------------------------------------------.
####
# Some help pages describe several different R functions in a single page
#
# ?ceiling # this describes ceiling, floor and several other functions all in one help page
#
# ?floor # this shows the same thing
# NOTE:
#
# In posit.cloud you can press F1 when the cursor is on the name of a function ####
# (this only works in the "script" window)
4.11 pi
# pi is a built-in variable that contains the value of pi
# (only the first few digits are displayed)
pi # value of pi
[1] 3.141593
pi * 2 # pi times 2
[1] 6.283185
pi ^ 2 # pi squared
[1] 9.869604
4.12 trunc()
#-----------------------------------------------------------------------------.
# trunc function ####
#
# trunc stands for "truncate" which means to "shorten" or to "chop off"
# The trunc function "chops off" the digits after the decimal point.
#-----------------------------------------------------------------------------.
trunc(3.2) # chops off the digits after the decimal point
[1] 3
trunc(-3.2) # compare this with "floor and ceiling" ... how are they different?
[1] -3
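# One possible way to answer the question above, using only the functions
# already shown: trunc always moves toward zero, so it should agree with floor
# for positive numbers and with ceiling for negative numbers.

```r
floor(-3.2)                     # -4  (rounds down, away from zero)
ceiling(-3.2)                   # -3  (rounds up, toward zero)
trunc(-3.2)                     # -3  (chops off the .2)
trunc(3.2) == floor(3.2)        # TRUE
trunc(-3.2) == ceiling(-3.2)    # TRUE
```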
4.13 round() function
#-----------------------------------------------------------------------------.
# round function ####
#
# first argument - value to round
# second argument - number of decimal places to round to
#-----------------------------------------------------------------------------.
# round a value to a particular number of decimal places
round(1.129, 2) # 1.13
[1] 1.13
round(1.129, 1) # 1.1
[1] 1.1
pi # display the value of pi ####
[1] 3.141593
round(pi, 2) # round a value to a particular number of decimal places
[1] 3.14
round(pi, 3) # round a value to a particular number of decimal places
[1] 3.142
# . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
# if 2nd argument is 0, the number is rounded to the closest whole number
# . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
round(pi, 0) # round pi to the closest whole number
[1] 3
#..........................................................................
# You can also supply a negative value for digits
#..........................................................................
round(1939, -1) # negative values are allowed, e.g. round to closest multiple of 10
[1] 1940
round(1939, -2) # round to closest multiple of 100
[1] 1900
round(1939.1598, 2) # 1939.16
[1] 1939.16
round(1939.1598, -2) # 1900
[1] 1900
#.................................................................
# Default value for the digits argument of the round function
#.................................................................
# Some arguments for some functions have a "default value".
# The default value is used when the argument does not appear in the function call.
# For example, 0 is the "default value" for the digits argument of the round function.
#
# This is described in the Usage section on the help page for the round function (?round)
# The usage section includes the following information:
#
# USAGE:
# round(x, digits = 0)
#
# "digits = 0" means that the default value of the
# digits argument (i.e. the 2nd argument) is 0 (zero).
#
round(pi) # answer is 3 because 0 is the default number of digits
[1] 3
round(1.234) # answer is 1 because 0 is the default number of digits
[1] 1
# ?round # view the help page for the round function
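# A short sketch of the default value at work. It assumes only what the
# Usage line above states, i.e. that digits is 0 when it is left out.

```r
round(pi) == round(pi, 0)          # TRUE - leaving out digits is the same as passing 0
round(1.234) == round(1.234, 0)    # TRUE
round(1.234, 2)                    # 1.23 - the default is overridden by passing 2
```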
4.14 Default values for arguments
#--------------------------------------------------------------------------------.
# NAMES AND DEFAULT VALUES OF ARGUMENTS ARE SHOWN ON THE HELP PAGES ####
#
# The arguments for each function have "names"
#
# Some arguments have "default values". The default value for an argument is
# used when the function call does NOT specify a value for that argument.
# (see examples below).
#--------------------------------------------------------------------------------.
# Every argument for every function in R has a "name".
# SOME arguments for SOME functions have a "default value".
# All of this information is shown in the "Usage" section on the help page
# for the function.
#
# FOR EXAMPLE
# Look at the help page for the round function (i.e. ?round).
# The "Usage" section includes the following information:
#
# USAGE:
# round(x, digits = 0)
#
# This means
#
# - The name of the 1st argument is "x"
#
# - The name of the 2nd argument is "digits".
# The default value for the "digits" argument is 0.
# This is shown in the documentation as "digits = 0".
#
# - Note that the first argument, x, does NOT have a default value.
#
# View the help page by typing:
#
# ?round # arguments are "x" and "digits", the default value for digits is 0
4.15 specifying arguments in function calls
# You may specify the names of the arguments when calling a function,
# but you don't have to (see examples below).
#
# Specifying the names of the arguments allows you to:
#
# (a) type the arguments out of order (see below) and/or
#
# (b) skip some arguments (examples of this to be shown later ...)
# The following function call will round 12345 to the
# nearest hundred (i.e. the 2nd argument is -2) to result in 12300.
#
# The arguments must be specified in same order as specified on the help page
# (see ?round). i.e. first the number to be rounded (12345 in this case)
# and then the position to round it to (-2 in this case).
round( 12345, -2) # round 12345 to the nearest hundred
[1] 12300
# You don't have to but you may specify the names of the arguments if you like.
round ( x=12345, digits=-2 )
[1] 12300
# If you specify the names of the arguments (see below), then you
# may write the arguments out of order.
#
# Otherwise, the arguments must be typed in the same order as they appear
# in the "Usage" section on the help page.
#
# In the following command the arguments are not in the order as specified
# on the help page. However, that is OK since we specified the names of the
# arguments.
round( digits = -2, x=12345) # specify arguments out of order, same result as above
[1] 12300
# You may omit the names of the first few arguments in a function call.
# If you do so then the first few arguments, without names in the function call,
# are assumed to be the first few arguments as specified on the help page.
#
# For example, in the following command the first argument, 12345,
# does not include a name. Since this is the first argument in the function
# call, it is assumed to be the "x" argument (which is the first argument
# specified in the help page (?round)).
round (12345, digits = -2) # you can specify names for some args but not others
[1] 12300
# If you want to, you MAY always specify the names of the arguments
# However, it is not necessary to type the names of the arguments as long as
# you type the arguments in the expected order (as defined in the help pages).
#
# Many R programmers choose to leave out names for the first argument or
# two and then specify names for the subsequent arguments,
# e.g. seq(2, 10, by=2) (this returns 2 4 6 8 10 - see ?seq).
#
# The reason for this is that the first argument or two
# of most functions are obvious as to their meaning. After that, it becomes
# less clear as to what the additional arguments mean. By specifying the names
# of these additional arguments it becomes easier to read the code.
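# The naming styles described above can be compared side by side with seq.
# This sketch assumes the argument names from, to and by that are listed
# on the help page for seq (see ?seq).

```r
seq(2, 10, by = 2)               # 2 4 6 8 10 - first two arguments unnamed, third named
seq(from = 2, to = 10, by = 2)   # same result, every argument named
seq(by = 2, to = 10, from = 2)   # same result, named arguments typed out of order
```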
4.16 What’s a vector? The is.vector() function.
###############################################.
#
# VECTORS
#
###############################################.
#-----------------------------------------------------------------------.
# A "vector" is a collection of values that can be processed as a group. ####
#-----------------------------------------------------------------------.
#-----------------------------------------------------------------------.
# The is.vector function returns TRUE if its argument is a vector ####
# and FALSE otherwise.
#-----------------------------------------------------------------------.
#-----------------------------------------------------------------------.
# The simplest vector is just a single value ... ####
# (it is technically a collection of just one value).
#-----------------------------------------------------------------------.
# View the help page by typing:
#
# ?is.vector
is.vector ( 3 )
[1] TRUE
is.vector( 99923141.32412431 )
[1] TRUE
# A variable that contains a vector is a vector ####
priceOfApple = 1.99
is.vector(priceOfApple) # TRUE
[1] TRUE
# The c() function is used to combine multiple values into a single vector. ####
#
# You can think of the "c" as standing for the word "combine".
# "c" actually stands for the word "concatenate" which
# is a technical fancy shmancy word for "combine things together".
# The following is a vector with multiple values.
# The c function combines (i.e. "concatenates") the multiple values into a
# single "vector"
c(100,200,300, 50, -2, 25)
[1] 100 200 300 50 -2 25
is.vector(c(100,200,300, 50, -2, 25)) # this works
[1] TRUE
is.vector(100,200,300, 50, -2, 25) # ERROR: use c() to tie together different values
Error in is.vector(100, 200, 300, 50, -2, 25): unused arguments (300, 50, -2, 25)
someNumbers = c(100,200,300, 50, -2, 25) # combine (or concatenate) values into one vector
someNumbers
[1] 100 200 300 50 -2 25
is.vector(someNumbers) # TRUE
[1] TRUE
4.17 range() function
#-----------------------------------------------------.
# Other functions can also create vectors.
#-----------------------------------------------------.
#.............................................................................
# The range function returns a vector
#
# The range function returns the minimum and maximum values that are in a vector ####
#.............................................................................
range(someNumbers)
[1] -2 300
is.vector(range(someNumbers))
[1] TRUE
# You can also capture the result in a variable
lowestAndHighest = range(someNumbers)
lowestAndHighest # -2 300
[1] -2 300
is.vector(lowestAndHighest) # TRUE
[1] TRUE
4.18 seq() function
#.............................................................................
# The seq function returns a vector. In its simplest use,
# seq returns the sequence starting with the 1st argument, ending with the 2nd argument ####
#
# NOTE - we will come back to the seq function to learn about
# much more complex ways of using it.
#.............................................................................
# Example 1
seq(5,10) # 5 6 7 8 9 10
[1] 5 6 7 8 9 10
is.vector( seq(5, 10) ) # TRUE
[1] TRUE
# Example 2
seq(10,5) # 10 9 8 7 6 5
[1] 10 9 8 7 6 5
is.vector( seq(10,5) ) # TRUE
[1] TRUE
# Example 3
seq(0.5, 2.5) # 0.5 1.5 2.5
[1] 0.5 1.5 2.5
is.vector( seq(0.5, 2.5) ) # TRUE
[1] TRUE
# We can also capture the results in variables
example1 = seq(5,10)
example1 # 5 6 7 8 9 10
[1] 5 6 7 8 9 10
is.vector(example1) # TRUE
[1] TRUE
example2 = seq(10,5)
example2 # 10 9 8 7 6 5
[1] 10 9 8 7 6 5
is.vector(example2) # TRUE
[1] TRUE
example3 = seq(.5, 2.5)
example3 # 0.5 1.5 2.5
[1] 0.5 1.5 2.5
is.vector(example3) # TRUE
[1] TRUE
seq(0.5, 3)
[1] 0.5 1.5 2.5
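# A small sketch of why seq(0.5, 3) stops at 2.5: the sequence moves up in
# steps of 1 and never goes past the 2nd argument, so the next value (3.5)
# is left out.

```r
seq(0.5, 3)         # 0.5 1.5 2.5  (3.5 would be past 3)
max(seq(0.5, 3))    # 2.5
seq(0.5, 3.5)       # 0.5 1.5 2.5 3.5  (now 3.5 is included)
```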
4.19 rep() function
#.............................................................................
# The rep function returns a vector ####
#
# In its simplest use, the rep function returns a vector of its first
# argument repeated the number of times specified by its 2nd argument.
#
# NOTE - we will come back to the rep function to learn about
# more complex ways of using it.
#.............................................................................
rep(100,3) # 100 100 100
[1] 100 100 100
rep( seq(1,3) , 2) # 1 2 3 1 2 3
[1] 1 2 3 1 2 3
# QUESTION
# Create a vector that has the numbers 1 3 1 3 1 3 etc. for a total
# of 20 numbers. Store the resulting vector into a variable named nums.
nums = rep( c(1,3) , 10) # ANSWER
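# A quick way to check an answer like this one. The sketch repeats the
# assignment so that it runs on its own; length and sum are used here only
# to confirm that the vector has 20 values.

```r
nums <- rep(c(1, 3), 10)   # the pair 1 3 repeated 10 times
nums                       # 1 3 1 3 1 3 ... (20 values)
length(nums)               # 20
sum(nums)                  # 40, i.e. ten 1's plus ten 3's
```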
4.20 Use c() to combine vectors
#-------------------------------------------------------------------------------------.
# DO NOT WRITE INDIVIDUAL VALUES WITHOUT COMBINING THEM TOGETHER WITH A FUNCTION CALL!
#-------------------------------------------------------------------------------------.
# 100,200,300 # ERROR - individual values separated by commas are meaningless to R ####
# REMEMBER - if no other function call is being used, you can use the
# c function to combine individual values
c(100,200,300) # 100 200 300 (no error)
#-------------------------------------------------------------.
#
# More about the c function ####
#
#-------------------------------------------------------------.
#..............................................................................
# If you "nest" calls to "c", ie. if you combine one vector inside of another
# vector by using the c function, the result is a single vector
#..............................................................................
c(100, 200, c(30, 20, 10), 600) # same as c(100,200,30,20,10,600)
c(100, 200, 30, 20, 10, 600) # same thing
#..............................................................................
# You can use the c function to combine multiple vectors into a single vector.
#..............................................................................
x <- c(10,20,30)
y <- c(40, 50)
z <- c(x, y) # combine the values from x and y into z
z
# z <- x, y # ERROR - use the c function to combine vectors into a single vector
# QUESTION ####
# Find the sum of all the values that are in x and y, without using z
# ANSWER
sum(c(x,y)) # This works (the result is 150)
sum(x,y) # This works too (also 150) - sum allows multiple vectors to be summed
# QUESTION ####
#
# Find the average (i.e. mean) of all the values that are in x and y,
# without using z
# ANSWER
mean(c(x,y)) # This works (the result is 30)
mean(x,y) # ERROR - y is treated as the trim argument, not as values to be averaged
# QUESTION
# Why did we get an error in the last example?
# ANSWER
#
# From the documentation for sum and mean (i.e. ?sum and ?mean) we can
# see that the sum function allows multiple vectors that contain the numbers
# to be passed as separate arguments. However, the mean function requires
# all of the numbers to be averaged to be in a single vector that is passed
# to the argument named x. It's true that one might expect these functions
# to be more similar in how they are called. However, the designers of the
# language decided otherwise. The underlying reasons for the difference in
# the design of these functions is irrelevant - bottom line is you
# need to know how to call the functions. The place to learn this is
# in the documentation for the functions (i.e. ?sum and ?mean)
#
# Look at the documentation for sum and for mean (i.e. ?sum and ?mean).
# The "Usage" section shows the names of the arguments and their default values.
# The "Arguments" section explains what each argument is expected to contain.
# The "Value" section explains how the return value for the function is calculated.
#
# It takes some time and practice to be proficient at reading R's help pages.
# However, understanding how to read and interpret R's help pages
# is a critical skill that allows you to become familiar with R's built in
# functions.
#
# An "ellipsis" (i.e. three periods, ... ) in the help pages
# stands for the ability to type several values in place of the
# ellipsis. For example, the ... in the help page for sum, indicates
# the ability to type several different values to be summed. This is
# described in the ARGUMENTS section where it explains that ... stands
# for "numeric or complex or logical vectors".
# View the help page by typing:
#
# ?sum
# USAGE: sum(..., na.rm = FALSE)
# ARGUMENTS:
# ... numeric or complex or logical vectors
# na.rm (see the help page)
# However, for the mean function, there is a single argument named x that
# is expected to contain the values to be averaged. The ellipsis shown
# in the help page for mean is used for a more subtle reason. It shows where
# additional arguments, not listed on this help page, might be specified
# (this is an advanced concept that we'll return to later).
# View the help page by typing:
#
# ?mean
# USAGE: mean(x, trim = 0, na.rm = FALSE, ...)
# ARGUMENTS:
# x An R object. (i.e. a vector - these are the numbers)
# trim (see help page)
# na.rm (see help page)
# ... further arguments passed to or from other methods.
# You can use the c function to combine values from different functions.
# Make sure that you match parentheses correctly.
c( rep(100,3) , seq(-5,-7) ) # 100 100 100 -5 -6 -7
[1] 100 100 100 -5 -6 -7
# DON'T FORGET THE c( ... )
#rep(100,3), seq(-5,-7) # ERROR
range( rep(100,3) , seq(990,1005) , seq(-5,-7) ) # range accepts several vectors as separate arguments
[1] -7 1005
range( c( rep(100,3) , seq(990,1005) , seq(-5,-7) ) )
[1] -7 1005
4.21 — Practice —
#----------------------------------------------------.
# QUESTION
# Write R code that takes the average of the first
# 200 even numbers.
#----------------------------------------------------.
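# One possible answer, offered as a sketch. It assumes that "the first
# 200 even numbers" means 2, 4, 6, ... 400 (i.e. starting from 2), and
# firstEvens is a hypothetical variable name.

```r
firstEvens <- seq(2, 400, by = 2)   # 2 4 6 ... 400
length(firstEvens)                  # 200 - confirms there are 200 numbers
mean(firstEvens)                    # 201
```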
4.22 non-vectors (later in the course).
#----------------------------------------------------------------------------.
# Things that aren't vectors (e.g. dataframes, factors, matrices, etc) ####
#----------------------------------------------------------------------------.
# A vector is the simplest arrangement of values in R.
# R allows for more complex arrangements of data, which we will learn about
# later in the course, such as factors, matrices, dataframes, etc.
# These more complex arrangements of data are created from vectors but are
# technically not vectors themselves. One example of such an arrangement
# of data is a data.frame.
# We will cover dataframes later in the course.
# For now, I just want to demonstrate that R has structures that are NOT vectors.
# A dataframe is made up of vectors, but it itself is NOT a vector.
example = data.frame(students = c("joe", "sue", "bob"),
                     test1 = c(71,85,90),
                     test2 = c(83, 92, 95), stringsAsFactors = FALSE)
example
  students test1 test2
1      joe    71    83
2      sue    85    92
3      bob    90    95
is.vector(example) # FALSE
[1] FALSE
is.data.frame(example) # TRUE
[1] TRUE
4.23 Vector arithmetic
#--------------------.
# Vector arithmetic ####
#--------------------.
# When you perform arithmetic with a vector each item in the vector is operated upon
c(100,200,300) + 5 # return a vector that contains c(105, 205, 305)
[1] 105 205 305
# vector arithmetic also respects the order of operations
# In the following example the multiplication is done before the addition
# to yield the value c(205, 405, 605)
5 + c(100, 200, 300) * 2 # do the multiplication first
[1] 205 405 605
# This works as follows
#
# original: 5 + c(100, 200, 300) * 2
#
# do the *: 5 + c(200, 400, 600)
#
# then do the +: c(205, 405, 605)
#
# result is displayed as: 205 405 605
# we can change the order of operations with parentheses
# This yields a different result.
(5 + c(100,200,300)) * 2 # pay close attention to the parentheses!!!
[1] 210 410 610
# This works as follows
#
# original: (5 + c(100, 200, 300)) * 2
#
# do the +: c(105, 205, 305) * 2
#
# then do the *: c(210, 410, 610)
#
# result is displayed as: 210 410 610
###########################################.
#
# You may assign a vector to a variable
#
###########################################.
grades <- c(72,95,79,85)
grades # show the values
[1] 72 95 79 85
# QUESTION:
#
# Modify the grades variable by adding 2 points to each grade
# ANSWER ####
grades = grades + 2 # you must assign the answer back to grades
grades
[1] 74 97 81 87
4.24 length( SOME_VECTOR )
#-----------------------------------------------------------------------.
#
# length(vector) returns the number of values in the vector ####
#
#-----------------------------------------------------------------------.
# Set the value of grades
grades <- c(72,95,79,85)
# the length function returns the number of values in a vector
length(grades) #4
[1] 4
length(c(25, 10)) #2
[1] 2
length(c(100,200,300)) #3
[1] 3
# A single value is a vector - but it doesn't need to be surrounded with c()
length(c(100)) # the length of a vector that contains a single item is 1
[1] 1
length(100) # ... same thing ... don't use the c - it's not necessary
[1] 1
c(100) # this is the same as just 100, the "c" is not necessary if you have just one value.
[1] 100
100 # same thing - don't use the c for a single value
[1] 100
grades # show all grades
[1] 72 95 79 85
grades + 5 # show what the values would be if we added 5 to each grade
[1] 77 100 84 90
grades # however, grades did NOT actually change
[1] 72 95 79 85
# If you want to change the value of grades, you need to
# use the = sign or the <- or the ->. For example:
grades # show grades
[1] 72 95 79 85
grades <- grades + 10 # add 10 to each grade and update grades with the new values
grades # grades now has the new values
[1] 82 105 89 95
prices = c(1.99, 2.99, 3.99)
doublePrices = 2 * prices
doublePrices
[1] 3.98 5.98 7.98
4.25 Counting arguments
#############################################################.
#
# Arguments (AKA "parameters") to a function. ####
#
# It is important to know how many arguments are being passed
# to a function. The arguments to a function appear in the (parentheses)
# next to the function name and are separated from each other with commas.
#
#########################################################################.
# Remember that the round function takes TWO arguments
#
# x is the values to round
#
# digits is the position to round to
round(100.729, 1) # 100.7
[1] 100.7
round(100.729, 2) # 100.73
[1] 100.73
round (100.729) # 101
[1] 101
# The first argument is allowed to be a vector with multiple values
round ( c(100.729, 200.618) , 1) # 100.7 200.6
[1] 100.7 200.6
grades = c(82, 105, 89, 95)
sum(grades) # one argument - add up all grades (not very useful for grading ...)
[1] 371
sum(c(82,105,89,95)) # also one argument - same exact thing, sum is given 1 vector
[1] 371
sum(82,105,89,95) # four arguments - sum is given 4 separate vectors, but the result is the same
[1] 371
# The sum function will sum all of the values in all
# of its arguments. The following all produce the same
# result (i.e. 306) but in different ways.
sum( c(100,200) , c(1,2,3)) # 2 arguments
[1] 306
sum( c(100,200,1,2,3) ) # 1 argument
[1] 306
sum( 100,200,1,2,3 ) # 5 arguments
[1] 306
4.26 To get an average use the mean function
# IMPORTANT: the mean function works a little differently than the sum function.
#
# The mean function requires that all values being averaged are passed as a single vector. ####
grades # show all the grades
[1] 82 105 89 95
grades = c(82, 105, 89, 95)
mean(grades) # get the average
[1] 92.75
mean( c(82,105,89,95) ) # same thing - there is ONE argument, i.e. the vector c(82,105,89,95)
[1] 92.75
mean(82,105,89,95) # I didn't use the c() function here - there are 4 vectors!!!
[1] 82
# To summarize:
# sum and mean are not consistent in the way they handle multiple arguments
sum(1,2,3) # works as expected
[1] 6
mean(1,2,3) # does not work as most people would expect - answer is 1
[1] 1
# View the help page by typing:
#
# ?mean
# Examine the documentation for mean to see why. The Usage section of the
# documentation includes the following: mean(x, trim = 0, na.rm = FALSE, ...)
# The "x" corresponds to a single vector that contains
# the values to be averaged. If you pass the values without
# the c() function, then the 2nd value listed is actually
# passed to the "trim" argument of mean. If you want to know
# what the "trim" argument is used for, see the help
# page for "mean". If you don't specify any value for "trim"
# then "mean" will work as you expect.
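# The explanation above can be sketched by writing out the argument names
# that R assigns by position. This assumes the Usage line quoted above,
# mean(x, trim = 0, na.rm = FALSE, ...).

```r
mean(1, 2, 3)                      # 1 - only the first value is treated as x
mean(x = 1, trim = 2, na.rm = 3)   # 1 - the same call with the names written out
mean(c(1, 2, 3))                   # 2 - all three values are passed as x
```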
# Arguments passed to mean:
#
# x - a vector that contains the values to be averaged
#
# trim - a fraction (0 to 0.5) of observations to be ignored (i.e. trimmed) from each end of the sorted values
#
# na.rm - WE WILL DISCUSS THIS LATER ...
# Return the average of the numbers in the vector.
#
# Return value is 400 , i.e. (100+200+300+500+900) / 5
mean(c(100,200,300,500,900))
[1] 400
# the code above does the same as the next line
sum(c(100,200,300,500,900)) / 5
[1] 400
# DO NOT DO THE FOLLOWING !!!!
# The mean function is being passed a SINGLE value and
# does nothing meaningful in this case.
mean(sum(100,200,300,500,900) / 5) # basically same as: sum(100,200,300,500,900) / 5
[1] 400
# This is because by the time the mean function
# starts working, the value sum(100,200,300,500,900) / 5
# has already been calculated as 400.
# It would be just as ridiculous as running the following code
# which just returns the number 400 - the mean function does
# nothing meaningful in this case.
mean ( 400 ) # This is the same as 400 / 1
[1] 400
trim argument to mean
# the "trim" argument to mean ####
#
# trim (ie. remove) 0.2 (ie. 1/5) of the values (ie. 1 value)
# from each end of the sorted values (i.e. the lowest and the highest)
#
# Return value is 333.333, ie. mean(c(200,300,500))
mean(c(100,200,300,500,900), 0.2)
[1] 333.3333
mean(c(200,300,500)) # same result
[1] 333.3333
grades = c(5, 82, 85, 89, 105)
mean(grades) # mean ( c(5,82,85,89,105))
[1] 73.2
mean(grades, trim = 0.2) # mean(c(82,85,89))
[1] 85.33333
grades
[1] 5 82 85 89 105
# trim (ie. remove) 0.4 (ie. 2/5) of the values (i.e. 2 values)
# from each end of the sorted values
#
# Return value is 300, i.e. mean(300)
mean(c(100,200,300,500,900), 0.4) # trim 0.4 = 2/5 of the values from the beginning and end
[1] 300
mean(c(500,200,300,900,100), 0.4) # same result - the values are sorted before they are trimmed
[1] 300
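# A sketch showing that trimming works on the sorted values, so the order in
# which the values are typed should not matter. The variable names are
# hypothetical, and sort (which returns the values in increasing order) is
# used only to display what gets trimmed.

```r
values <- c(500, 200, 300, 900, 100)
sort(values)                               # 100 200 300 500 900
trimmedMean <- mean(values, trim = 0.2)    # drops the lowest (100) and highest (900)
manualMean <- mean(c(200, 300, 500))       # average of the values that remain
trimmedMean                                # 333.3333
all.equal(trimmedMean, manualMean)         # TRUE
```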
# In the following the result is 100
# This is because the arguments are assigned in the following order
#
# x, ie. the values to be averaged = first argument = 100
# trim = second argument = 200
# na.rm = 3rd argument = 300
# ... = all other arguments = c(500,900)
#
# Only x=100 holds values to be averaged. The other arguments are not
# averaged, so the result is calculated from 100 alone, which is 100.
mean(100,200,300,500,900)
[1] 100
# PROBLEM:
#
# REMEMBER that mean requires that all values being averaged are in a SINGLE vector
# Therefore to take the average of the values in x and in y the following WILL NOT WORK:
mean(x, y) # will not work - y is treated as the "trim" argument, NOT as more values to average
Error in eval(expr, envir, enclos): object 'x' not found
# SOLUTION:
#
# Remember that you can combine multiple vectors into a single vector with
# the c function.
x <- c(10,20,30)
y <- c(40, 50)
mean(c(x,y)) # combine x and y into a single vector and take the mean of that vector
[1] 30
#--------------------------------------------------------.
# QUESTION : ####
#
# Grades for class1 and class2 are as shown below.
#
# class1grades <- c(80,90,100)
# class2grades <- c(85, 88)
#
# (a) get the two averages, one for each class
# (b) get the average for all the students in both classes
#--------------------------------------------------------.
::: {.callout-note icon=false collapse=“true”} ### Click here for the answer
# ANSWER
class1grades <- c(80,90,100) # ANSWER
class2grades <- c(85, 88) # ANSWER
class1average <- mean(class1grades) # ANSWER
class2average <- mean(class2grades) # ANSWER
allStudentsAverage <- mean ( c(class1grades, class2grades)) # ANSWER - remember the c(...)
class1average # ANSWER
[1] 90
class2average # ANSWER
[1] 86.5
allStudentsAverage # ANSWER
[1] 88.6
4.27 The “recycling” rule.
#-------------------------------------------------------------.
#
# VECTOR ARITHMETIC WITH TWO VECTORS
#
# RECYCLING RULE - used in vector arithmetic with TWO vectors when the vectors are different lengths
#
#-------------------------------------------------------------.
# When you perform arithmetic between two vectors that are the same length,
# the operation is applied to each pair of corresponding values.
# For example:
c(100,200,300) + c(1, 2, 3) # 101 202 303
[1] 101 202 303
c(100 + 1 , 200 + 2 , 300 + 3) # ... same thing
[1] 101 202 303
# Another example : remember the order of operations
c(40, 20, 30) - c(4,5,6) * c(1, 2, 3) # remember the order of operations!
[1] 36 10 12
# original : c(40, 20, 30) - c(4,5,6) * c(1, 2, 3)
# do the * : c(40, 20, 30) - c(4*1,5*2,6*3)
# : c(40, 20, 30) - c(4,10,18)
# do the - : c(40-4, 20-10, 30-18)
# : c(36 , 10, 12)
# if one vector is shorter than the other then ...
#
# Step 1: In R's memory (you don't see this) the values from the
# shorter vector are repeated over and over until the shorter vector
# is the same length as the longer vector.
#
# Step 2: The math happens ...
#
# For example the following
c(10,20,30,40) + c(1,7) # same as c(10,20,30,40)+c(1,7,1,7) ... ie. 11 27 31 47
[1] 11 27 31 47
# Original: c(10,20,30,40) + c(1,7)
# Recycling rule: c(10,20,30,40) + c(1,7,1,7)
# addition: c(10+1,20+7,30+1,40+7)
# final result: c(11, 27, 31, 47)
# It doesn't matter if the shorter vector is first or last
c(1,7) + c(10,20,30,40) # same as c(1,7,1,7)+c(10,20,30,40) ... i.e. 11 27 31 47
[1] 11 27 31 47
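# The two steps of the recycling rule can be imitated by hand. This is a
# sketch for illustration only: the variable names (longer, shorter, stretched)
# are made up, and it borrows the length.out argument of rep, which is
# explained later in this chapter.

```r
longer  <- c(10, 20, 30, 40)
shorter <- c(1, 7)

# Step 1 done by hand: stretch the shorter vector to the longer vector's length
stretched <- rep(shorter, length.out = length(longer))
stretched             # 1 7 1 7

# Step 2: ordinary value-by-value addition
longer + stretched    # 11 27 31 47
longer + shorter      # 11 27 31 47 - R did the stretching for us
```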
# Doing math with a vector that contains a single value is just a special case of
# this rule. Example:
3 * c(2,3,4) # same as c(3,3,3) * c(2,3,4)
[1] 6 9 12
# original: 3 * c(2,3,4)
# recycling rule: c(3,3,3) * c(2,3,4)
# multiplication: c(3*2, 3*3, 3*4)
# final answer: c(6,9,12)
# REMEMBER THE ORDER OF OPERATIONS!!!
c(1,2) + 3 * c(10,20,30,40) # 31 62 91 122
[1] 31 62 91 122
# original : c(1,2) + 3 * c(10,20,30,40)
# do the * : c(1,2) + c(3,3,3,3) * c(10,20,30,40)
# : c(1,2) + c(3*10 , 3*20, 3*30, 3*40)
# : c(1,2) + c(30 , 60, 90, 120)
# do the + : c(1,2,1,2) + c(30 , 60, 90, 120)
# : c(1+30 , 2+60 , 1+90 , 2+120)
# : c(31, 62, 91, 122)
c(1,2) + 3 * sum(10,20,30,40)
[1] 301 302
# original : c(1,2) + 3 * sum(10,20,30,40)
# sum function : c(1,2) + 3 * 100
# multiplication: c(1,2) + 300
# addition : c(1+300 , 2+300)
# : c(301, 302)
c(1,2) + 3 * sum(2 ^ 3 , 3-4*5)
[1] -26 -25
# original: c(1,2) + 3 * sum(2 ^ 3 , 3-4*5)
# figure out the values of the arguments: c(1,2) + 3 * sum( 8 , -17)
# do the sum function: c(1,2) + 3 * -9
# do the * : c(1,2) + -27
# recycling rule: c(1,2) + c(-27,-27)
# final answer: c(1 + -27 , 2 + -27)
# final answer: c( -26 , -25)
# If the length of the longer vector is not a multiple of the length of
# the shorter vector, it WILL work, but you will get a WARNING.
#
# The warning is to alert you to the fact that you might have done something
# that you didn't intend to, however, it will still work.
#
# The recycling will continue as usual for the full length of the longer
# vector.
# Example:
c(1,2) + c(100,200,300,400,500, 600) # 101 202 301 402 501 602
[1] 101 202 301 402 501 602
c(1,2) + c(100,200,300,400,500) # 101 202 301 402 501 (with a warning)
Warning in c(1, 2) + c(100, 200, 300, 400, 500): longer object length is not a
multiple of shorter object length
[1] 101 202 301 402 501
# The above command is processed as follows
# original: c(1,2) + c(100,200,300,400,500)
# recycling rule: c(1,2,1,2,1) + c(100,200,300,400,500)
# final answer : c (101, 202, 301, 402, 501)
#
# Because the first vector was recycled a non-whole-number of times
# R displays a warning.
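# One way to predict whether the warning will appear is to check the
# remainder when the longer length is divided by the shorter length.
# The sketch below uses the %% (remainder) operator and the suppressWarnings
# function, neither of which is covered in this section. The variable names
# are made up for this example.

```r
longer  <- c(100, 200, 300, 400, 500)
shorter <- c(1, 2)

leftover <- length(longer) %% length(shorter)   # remainder after dividing 5 by 2
leftover          # 1, i.e. not zero, so expect a warning
leftover == 0     # FALSE

# The addition still produces a full-length result (the warning is hidden here on purpose)
suppressWarnings(longer + shorter)   # 101 202 301 402 501
```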
4.28 ERRORs vs WARNINGs
# When you get an ERROR, the command doesn't have ANY effect. ####
#
# When you have an error, the command ends at the time the error happens.
# Therefore, in the following command no value is assigned to the variable.
#
# NOTE - just in case you have a variable, x, that was already created, we
# "remove" that variable below with rm(x). This helps us to prove our point.
x = 100
x + 5
[1] 105
rm(x) # remove x (just in case it already exists) to help us prove our point
x + 5 # ERROR - x doesn't exist (GOOD - that's what we wanted)
Error in eval(expr, envir, enclos): object 'x' not found
x = combine_stuff(10,20,30) + c(1,2,3) # ERROR - the function combine_stuff doesn't exist
Error in combine_stuff(10, 20, 30): could not find function "combine_stuff"
x # ERROR - x still doesn't exist, i.e. the above command did NOTHING
Error in eval(expr, envir, enclos): object 'x' not found
# When you get a WARNING, the command DOES have an effect. ####
#
# If you assign the result of something that produces a warning, the assignment
# will still happen and you can use that value without getting any more warnings.
rm(nums)
nums
Error in eval(expr, envir, enclos): object 'nums' not found
nums <- c(1,2) + c(100,200,300,400,500) # 101 202 301 402 501 (with a warning)
Warning in c(1, 2) + c(100, 200, 300, 400, 500): longer object length is not a
multiple of shorter object length
nums
[1] 101 202 301 402 501
# The value of nums was still assigned and can be used normally
nums # this will NOT generate a warning
[1] 101 202 301 402 501
nums - 50 # this will NOT generate a warning
[1] 51 152 251 352 451
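# The difference between errors and warnings can also be checked with the
# exists function, which reports whether a variable currently exists.
# This is a sketch under stated assumptions: demoValue and
# function_that_does_not_exist are made-up names, and try, exists and
# suppressWarnings are base R functions that are not covered in this section.

```r
# make sure the made-up variable does not exist before we start
if (exists("demoValue")) rm(demoValue)

# ERROR case: the function below is deliberately undefined, so the command fails.
# try() lets the rest of the code keep running after the error.
try(demoValue <- function_that_does_not_exist(10, 20), silent = TRUE)
exists("demoValue")   # FALSE - nothing was assigned

# WARNING case: lengths 2 and 3 trigger the recycling warning (hidden here),
# but the assignment still happens.
demoValue <- suppressWarnings(c(1, 2) + c(100, 200, 300))
exists("demoValue")   # TRUE
demoValue             # 101 202 301
```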
4.29 — PRACTICE —
#----------------------------------------------------------------------------.
# QUESTION
#
# Use the recycling rule to generate the first ten multiples of 5 in a single
# command.
# The result should be as shown below.
#
# > YOUR COMMAND GOES HERE # replace this line with your command
#
# [1] 5 10 15 20 25 30 35 40 45 50
#----------------------------------------------------------------------------.
#----------------------------------------------------------------------------.
# QUESTION
#
# numValues is a variable that contains a number.
# Write code that produces the first numValues values of the 5 times table.
#
# EXAMPLE 1
# > numValues = 3
# > YOUR CODE GOES HERE
# [1] 5 10 15
#
# EXAMPLE 2
# > numValues = 7
# > YOUR CODE GOES HERE
# [1] 5 10 15 20 25 30 35
#----------------------------------------------------------------------------.
# THINK ABOUT THIS - to answer the question you need ONE command that will
# use the numValues variable to create the following results
5 * c(1,2,3) # when numValues is 3
[1] 5 10 15
5 * c(1,2,3,4,5,6,7) # when numValues is 7
[1] 5 10 15 20 25 30 35
#----------------------------------------------------------------------------.
# QUESTION ####
#
# Use the recycling rule to generate the first five multiples
# of 2 and 100 using a single command. The result should be as shown below.
#
# > YOUR COMMAND GOES HERE # replace this line with your command
#
# [1] 2 100 4 200 6 300 8 400 10 500
#----------------------------------------------------------------------------.
#----------------------------------------------------------------------------.
# QUESTION ####
#
# Use the recycling rule to generate the first five multiples
# of 2 and 100 using a single command. The result should be as shown below.
# Write the command using the least amount of typing.
#
# > YOUR COMMAND GOES HERE # replace this line with your command
# [1] 2 4 6 8 10 100 200 300 400 500
#
# The following is NOT the answer but helps you to think about how to generate
# the same thing using shorter code.
# c( 2 * 1 , 2 * 2 , 2*3 , 2*4 , 2*5 , 100 * 1 , 100 * 2 , 100 * 3 , 100 * 4, 100*5)
#----------------------------------------------------------------------------.
4.30 More about the rep function.
###########################################################.
#
# More about the rep function.
#
###########################################################.
# The rep function can be used in several ways.
# In the simplest use of the rep function, rep returns a vector
# that contains the values from the first argument to the function
# repeated the number of times specified in the 2nd argument.
# Examples:
# Repeat the number 3 five times
rep(3, 5)
[1] 3 3 3 3 3
# repeat the number 5 three times:
rep(5, 3)
[1] 5 5 5
# View the help page by typing:
#
# ?rep # see the documentation for rep function
# since the rep function returns a vector, you can do anything with the
# return value that you can do with any other vector
threeFives <- rep(5,3)
threeFives
[1] 5 5 5
threeFives * 10
[1] 50 50 50
# The default value for the number of repetitions is 1 (i.e. one)
rep(100) # same as 100 ... why would you do this ??? you probably wouldn't ... yet ...
[1] 100
# You can use rep to repeat entire vectors
nums <- c(10,20,30,40)
rep(nums, 2) # 10 20 30 40 10 20 30 40
[1] 10 20 30 40 10 20 30 40
times argument
########################################################################.
# Other arguments of the rep function.
#
# The "Details" section of the rep documentation shows the
# following:
#
# rep(x, times = 1, length.out = NA, each = 1)
#
# See the "Arguments" section of the rep documentation for
# an explanation of what each of the arguments means.
########################################################################.
# Let's start with some data:
nums <- c(10,20)
nums # show the value in nums
[1] 10 20
# The rep documentation shows the following:
#
# rep(x, times = 1, length.out = NA, each = 1)
#
# "x" is the first argument - "x" is the vector that will be repeated.
# "times" is the 2nd argument - "times" is the number of times to repeat "x" (default is 1 time)
#
# Therefore the following are the same thing:
rep(nums, 5) # 5 is the value of 2nd argument to rep
[1] 10 20 10 20 10 20 10 20 10 20
rep(x=nums, times=5) # same thing - specify 5 as the value of the "times" argument
[1] 10 20 10 20 10 20 10 20 10 20
rep(times=5, x=nums) # same thing - specify 5 as the value of the "times" argument
[1] 10 20 10 20 10 20 10 20 10 20
length.out argument
#--------------------------------------------------------.
# rep ( SOME_VECTOR, length.out=SOME_NUMBER )
#
# is the same as
#
# rep_len ( SOME_VECTOR, SOME_NUMBER )
#--------------------------------------------------------.
# The length.out argument causes the values in the x argument, i.e. the 1st argument, to be repeated to the specified length. ####
nums # nums didn't change
[1] 10 20
rep(nums, length.out=5) # repeat the values in nums to a length of 5
[1] 10 20 10 20 10
# rep_len is just a shorthand for using the length.out argument in the rep function ####
# to accomplish the same thing
nums # show the values in nums
[1] 10 20
rep(nums, length.out=5) # repeat the values in nums to a length of 5
[1] 10 20 10 20 10
rep_len(nums,5) # same thing, another way
[1] 10 20 10 20 10
rep(nums, length.out=15) # repeat the values in nums to a length of 15
[1] 10 20 10 20 10 20 10 20 10 20 10 20 10 20 10
rep_len(nums,15) # same thing, another way
[1] 10 20 10 20 10 20 10 20 10 20 10 20 10 20 10
each argument
#-----------------------------------------------------------.
# rep ( SOME_VECTOR, each = SOME_NUMBER )
#-----------------------------------------------------------.
# The each argument causes each value in the x argument to be repeated consecutively the specified number of times ####
nums # show the values in nums
[1] 10 20
rep(nums, each=5) # repeat each value of nums 5 times ####
[1] 10 10 10 10 10 20 20 20 20 20
#-----------------------------------------------------------------------------------------.
# Sometimes it's hard to know what a function will do. The help page
# doesn't really explain what will happen for all the different possible combinations of
# the arguments, times, length.out and each.
#
# We can experiment to find out ...
#-----------------------------------------------------------------------------------------.
times and each
# rep with times and each
nums
[1] 10 20
rep(nums, times=2, each=3) # 10 10 10 20 20 20 10 10 10 20 20 20 ####
[1] 10 10 10 20 20 20 10 10 10 20 20 20
length.out and each
# rep with length.out and each
nums
[1] 10 20
rep(nums, length.out=8, each=3) # 10 10 10 20 20 20 10 10 ####
[1] 10 10 10 20 20 20 10 10
rep(nums, each=3, length.out=8 ) # same results : 10 10 10 20 20 20 10 10
[1] 10 10 10 20 20 20 10 10
times, length.out, each
# rep with times, length.out and each
nums
[1] 10 20
rep(nums, times=2, length.out=5, each=3) # 10 10 10 20 20 ####
[1] 10 10 10 20 20
# Look at the help file for specifics ...
#
# ?rep
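# The experiments above are consistent with the following reading: "each"
# is applied first, and then "times" or "length.out" is applied to that result.
# The sketch below checks this reading step by step. The variable name
# eachDone is made up for this example, and the last comparison only reflects
# what the output above shows when times and length.out are both given.

```r
nums <- c(10, 20)

# "each" is applied first
eachDone <- rep(nums, each = 3)
eachDone                          # 10 10 10 20 20 20

# then "times" repeats that whole result ...
rep(eachDone, times = 2)          # compare with rep(nums, times = 2, each = 3)

# ... or "length.out" stretches or cuts it to the requested length
rep(eachDone, length.out = 8)     # compare with rep(nums, length.out = 8, each = 3)

# with both times and length.out given, the earlier output matches length.out alone
rep(eachDone, length.out = 5)     # compare with rep(nums, times = 2, length.out = 5, each = 3)
```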
4.31 Understanding R’s help files
#######################################################################################.
# Understanding R's help files ####
#
# R functions can be used in many many different ways. You must become familiar with
# the R help files in order to understand how each function can be used.
#
# Pay attention to the following in the R help files
#
# - what arguments can be specified
#
# - what are the names of the arguments
#
# - what are the default values (if any) of the arguments. The default values of
# an argument appear after an = sign next to the argument in the help file.
#
# - how the function works when different arguments are specified
#######################################################################################.
4.32 sort function
############################################################################.
#
# sort( SOME_VECTOR ) # returns the vector sorted in increasing order ####
#
# sort( SOME_VECTOR, decreasing=TRUE ) # returns the vector sorted in decreasing order ####
#
############################################################################.
grades = c(93, 76 , 69, 83, 77, 98, 100, 25, 89, 92, 91, 52)
grades
[1] 93 76 69 83 77 98 100 25 89 92 91 52
sort(grades) # show the grades in sorted order, i.e. 25 52 69 76 ... etc
[1] 25 52 69 76 77 83 89 91 92 93 98 100
grades # the variable grades is still in the original order
[1] 93 76 69 83 77 98 100 25 89 92 91 52
# REMEMBER - as always, if you want to change a variable, you MUST use an
# assignment statement.
#
# If you want to change the value of the grades variable, then you must
# assign the result back to the grades variable.
grades = sort(grades) # now the variable grades contains the sorted values
grades
[1] 25 52 69 76 77 83 89 91 92 93 98 100
#---------------------------------------------------------------------.
# The decreasing argument may be TRUE or FALSE (default is FALSE) ####
#---------------------------------------------------------------------.
sort(grades, decreasing = FALSE) # same thing (default for decreasing is FALSE)
[1] 25 52 69 76 77 83 89 91 92 93 98 100
sort(grades, decreasing = TRUE)
[1] 100 98 93 92 91 89 83 77 76 69 52 25
# See the help page for advanced options that can be used with sort
#
# ?sort
4.33 More about the seq function.
############################################################################.
#
# More about the seq function. ####
#
############################################################################.
# Review of the basic use of seq
# We already covered the following:
#
# ?seq # see the help page for seq
seq(from=8, to=10) # 8 9 10 count up
[1] 8 9 10
seq(from=10, to=8) # 10 9 8 count down
[1] 10 9 8
seq(10,8) # 10 9 8 - same thing - the names aren't necessary if you write the arguments in the expected order
[1] 10 9 8
# ... seq can also accept other arguments:
#-----------------------------------------------------------------------------.
# seq( ... by=SOME_POSITIVE_OR_NEGATIVE_NUMBER .... ) ####
#
# The by argument tells seq what number to "count by". (by can be positive or negative) ####
#
# 1st value in the output vector is the "from" value.
# 2nd value in the output vector is "from" + "by"
# 3rd value in the output vector is "from" + "by" + "by"
# 4th value in the output vector is "from" + "by" + "by" + "by"
# etc ...
#
# By default, the value of by is 1.
#
# See examples below.
#-----------------------------------------------------------------------------.
# count by threes ... up until but not past the to value
seq(from=20, to=30, by=3) # 20 23 26 29
[1] 20 23 26 29
# To count down by any number other than 1 you must use a negative value for by.
#
# In the following command we count down
# from 30 to 20 by threes, so by must be MINUS three (i.e. by = -3)
seq(from=30, to=20, by=-3) # 30 27 24 21 count down by threes
[1] 30 27 24 21
# if you use the wrong sign (+ or -) for by you'll get an error
#seq(from=30, to=20, by=3) # ERROR - counting down - must have negative value for by
#seq(from=20, to=30, by=-3) # ERROR - counting up - must have positive value for by
#-----------------------------------------------------------------------------.
# The return value of seq always starts with the "from" value and goes no further than the "to" value. ####
#
# NOTE that the result might not actually include the "to" value if the "to" value
# doesn't naturally arise from the implied sequence.
#
# See the examples below.
#-----------------------------------------------------------------------------.
seq( from = 10 , to = 20, by=4 ) # 10 14 18 - result does NOT include 20.
[1] 10 14 18
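# Why does the result stop at 18? One way to think about it is to count how
# many whole steps of size "by" fit between "from" and "to". This is a sketch
# of that reasoning, not necessarily how seq is implemented. The variable
# names (fromValue, toValue, byValue, wholeSteps) are made up for this example.

```r
fromValue <- 10
toValue   <- 20
byValue   <- 4

wholeSteps <- floor((toValue - fromValue) / byValue)   # 2 whole steps of size 4 fit
wholeSteps

fromValue + byValue * seq(0, wholeSteps)    # 10 14 18
seq(fromValue, toValue, by = byValue)       # 10 14 18
```

# A third step would land on 22, which is past the "to" value, so it is left out.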
#-----------------------------------------------------------------------------.
#
# the arguments "from" and "to" do NOT have to be whole numbers ####
#
#-----------------------------------------------------------------------------.
seq( from = .5, to = 3.5) # 0.5 1.5 2.5 3.5
[1] 0.5 1.5 2.5 3.5
seq( from = 0.75 , to = 3 ) # 0.75 1.75 2.75 - result does NOT include 3. ####
[1] 0.75 1.75 2.75
4.34 — PRACTICE —
#########################################################################.
# QUESTION ####
#########################################################################.
# Write code to generate the numbers from -5 down to -200 but no further. Count down by 5's
# The code should produce
# -5 -10 -15 .... -200
#########################################################################.
#########################################################################.
# QUESTION ####
#########################################################################.
#
# Based on the documentation for seq, what will the following command display?
#
# > seq() # what will this display???
#
# How did you figure out your answer from the documentation?
#########################################################################.
4.35 Even more about the seq function.
#-----------------------------------------------------------------------------.
# Other arguments:
#
# length.out - total length of the resulting vector
#
# along.with - specify a vector whose length should be used as the length of the result
#
# See examples below
#-----------------------------------------------------------------------------.
# View the help page by typing:
#
# ?seq
# from,to,length.out (without by) - start with from, end with to, total of 5 numbers
seq( from=1, to=2, length.out=5) #
[1] 1.00 1.25 1.50 1.75 2.00
# from,by,length.out (without to) - start with from, keep adding by, for a total of length.out numbers
seq(from=2, by=3, length.out=20) # start from 2, add 3 each time until you get 20 numbers
[1] 2 5 8 11 14 17 20 23 26 29 32 35 38 41 44 47 50 53 56 59
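# In both examples above, seq works out the piece of information that was
# left out. The arithmetic can be sketched by hand as shown below. The
# variable names (fromValue, toValue, numValues, stepSize, lastValue) are
# made up for this example.

```r
# Case 1: from, to, length.out - the step size is worked out for you
fromValue <- 1
toValue   <- 2
numValues <- 5
stepSize  <- (toValue - fromValue) / (numValues - 1)   # 0.25
stepSize
fromValue + stepSize * seq(0, numValues - 1)    # 1.00 1.25 1.50 1.75 2.00
seq(fromValue, toValue, length.out = numValues) # same values

# Case 2: from, by, length.out - the last value is worked out for you
lastValue <- 2 + 3 * (20 - 1)                   # 59
lastValue
seq(from = 2, by = 3, length.out = 20)          # ends at 59
```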
#-----------------------------------------------------------------------------.
# QUESTION: What will the following produce ?
#
# I guess you can run it to find out but you should know how to answer this
# WITHOUT needing to run the code.
#-----------------------------------------------------------------------------.
seq(2, 3, length.out=20)
[1] 2.000000 2.052632 2.105263 2.157895 2.210526 2.263158 2.315789 2.368421
[9] 2.421053 2.473684 2.526316 2.578947 2.631579 2.684211 2.736842 2.789474
[17] 2.842105 2.894737 2.947368 3.000000
# to,by,length.out (without from)
seq(to=100, by=3, length.out=4) # generate 4 numbers, each one 3 greater than the previous, ending at 100
[1] 91 94 97 100
#-----------------------------------------------------------------------------.
# QUESTION: What will the following produce ?
#
# See if you can figure out what each of the following will
# display BEFORE running the command
#-----------------------------------------------------------------------------.
seq(from=2, to=3, by=0.2)
[1] 2.0 2.2 2.4 2.6 2.8 3.0
seq(from=1, to=3, length.out=6)
[1] 1.0 1.4 1.8 2.2 2.6 3.0
seq(from=0.5, to=1, by=.05)
[1] 0.50 0.55 0.60 0.65 0.70 0.75 0.80 0.85 0.90 0.95 1.00
length( seq(from=0.5, to=1, by=.05) )
[1] 11
seq(10, 1000, by=10)
[1] 10 20 30 40 50 60 70 80 90 100 110 120 130 140 150
[16] 160 170 180 190 200 210 220 230 240 250 260 270 280 290 300
[31] 310 320 330 340 350 360 370 380 390 400 410 420 430 440 450
[46] 460 470 480 490 500 510 520 530 540 550 560 570 580 590 600
[61] 610 620 630 640 650 660 670 680 690 700 710 720 730 740 750
[76] 760 770 780 790 800 810 820 830 840 850 860 870 880 890 900
[91] 910 920 930 940 950 960 970 980 990 1000
length.out argument
#--------------------------------------------------------------------.
#
# length.out argument ####
#
#--------------------------------------------------------------------.
# length.out is similar to the length.out for the rep function.
# If you specify length.out you do not have to specify the to argument
seq(3, length.out=7, by=-1) # 3 2 1 0 -1 -2 -3
[1] 3 2 1 0 -1 -2 -3
seq(3, length.out=7) # 3 4 5 6 7 8 9
[1] 3 4 5 6 7 8 9
seq(3, length.out=7, by=-2) # 3 1 -1 -3 -5 -7 -9
[1] 3 1 -1 -3 -5 -7 -9
seq(3, length.out=7, by=10) # 3 13 23 33 43 53 63
[1] 3 13 23 33 43 53 63
along.with=SOME_VECTOR
#--------------------------------------------------------------------.
#
# along.with=SOME_VECTOR ####
#
# This argument is the same as length.out=length(SOME_VECTOR)
#
# see examples below
#--------------------------------------------------------------------.
# Example: suppose a professor wanted to give a curve so that people
# with lower grades got a higher curve.
#
# The professor could do the following:
# Here are the original grades
grades = c(98,77,64,79,76, 84, 92, 78)
grades
[1] 98 77 64 79 76 84 92 78
length(grades) # how many grades are there?
[1] 8
# Sort the grades in decreasing order
sortedGrades = sort(grades, decreasing=TRUE)
sortedGrades
[1] 98 92 84 79 78 77 76 64
# Generate a vector with the amount to curve each grade.
# Highest grade has a curve of 1 point, 2nd highest grade has a curve
# of 2 points, etc.
curveAmounts = seq(from=1, along.with=sortedGrades)
curveAmounts
[1] 1 2 3 4 5 6 7 8
curvedGrades = sortedGrades + curveAmounts
sortedGrades # original grades
[1] 98 92 84 79 78 77 76 64
curvedGrades # curved grades
[1] 99 94 87 83 83 83 83 72
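# Since along.with=SOME_VECTOR is described above as the same as
# length.out=length(SOME_VECTOR), the curve amounts could be produced in a
# few equivalent ways. The sketch below also uses seq_along, a base R
# shorthand that is not covered in this section.

```r
sortedGrades <- c(98, 92, 84, 79, 78, 77, 76, 64)

seq(from = 1, along.with = sortedGrades)            # 1 2 3 4 5 6 7 8
seq(from = 1, length.out = length(sortedGrades))    # same values
seq_along(sortedGrades)                             # same values, less typing
```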
4.36 Skipping arguments
#############################################################################.
# You can skip an argument by repeating commas.
# This works but it is not usually how R programmers write code. Therefore
# others might not understand your code if you do this. You should know
# that it works but I recommend that you don't do it in practice.
#############################################################################.
# value of 1st argument (ie. "from") is 2
# value of 2nd argument (ie. "to") is 3
seq(2, 3, length.out=20)
[1] 2.000000 2.052632 2.105263 2.157895 2.210526 2.263158 2.315789 2.368421
[9] 2.421053 2.473684 2.526316 2.578947 2.631579 2.684211 2.736842 2.789474
[17] 2.842105 2.894737 2.947368 3.000000
# value of 1st argument (ie. "from") is 2
# value of 2nd argument (ie. "to") is blank - i.e. the default is used
# value of 3rd argument (ie. "by") is 3
seq(2, , 3, length.out=20) # now the 3 is being passed to 3rd argument, ie. by
[1] 2 5 8 11 14 17 20 23 26 29 32 35 38 41 44 47 50 53 56 59
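# The usual alternative to skipping an argument with repeated commas is to
# name the arguments you do want to pass. The sketch below compares the two
# styles. The variable names (skippedStyle, namedStyle) are made up for this
# example.

```r
skippedStyle <- seq(2, , 3, length.out = 20)             # "to" skipped with a blank between commas
namedStyle   <- seq(from = 2, by = 3, length.out = 20)   # same request using argument names

skippedStyle
namedStyle
all.equal(skippedStyle, namedStyle)    # checks that both styles give the same values
```

# The named version is longer to type but makes it obvious which argument
# each value belongs to.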
4.37 — PRACTICE —
#######################################################.
# QUESTION ####
#
# A professor wants to curve the grades of his students.
# The grades are in the variable named grades.
#
# The highest grade should get a 1 point curve,
# ... the next lower grade a 3 point curve
# ... the next lower grade a 5 point curve
# etc.
#
# Write R code to store the curved grades in a variable named curvedGrades.
# Your code should work unchanged no matter what values are stored in
# the grades vector.
#
# EXAMPLE 1
# > grades = c(98,77,64,79,76, 84, 92, 78)
# > # YOUR CODE GOES HERE
# > curvedGrades
# [1] 99 95 89 86 87 88 89 79
#
# EXAMPLE 2
# > grades = c(70, 90, 60, 80)
# > # YOUR CODE GOES HERE
# > curvedGrades
# [1] 91 83 75 67
#######################################################.
#######################################################.
# QUESTION ####
#
# Do the same as the previous question, however, this time
# the professor wants to give the highest 25% of the class no curve.
# The first grade below the highest 25% of the class a 1 point curve,
# ... the next lower grade a 3 point curve
# ... the next lower grade a 5 point curve
# etc.
#
# EXAMPLE 1
# > grades = c(98,77,64,79,76, 84, 92, 78)
# > # YOUR CODE GOES HERE
# [1] 98 92 85 82 83 84 85 75
#
# EXAMPLE 2
# > grades = c(70, 90, 60, 80)
# > # YOUR CODE GOES HERE
# [1] 90 81 73 65
#
#######################################################.
#################################################################.
# QUESTION ####
#
# This time make the curve amounts the square root of 100-grade for each
# person's grade. All students should get this curve.
#
# EXAMPLE 1
# > grades = c(100, 99, 96, 91, 84, 75, 19, 0)
# # YOUR CODE GOES HERE
# [1] 100 100 98 94 88 80 28 10
#
#
# EXAMPLE 2
# > grades = c(98,77,64,79,76, 84, 92, 78)
# # YOUR CODE GOES HERE
# [1] 99.41421 81.79583 70.00000 83.58258 80.89898 88.00000 94.82843 82.69042
#################################################################.
4.38 Practice with NESTING functions one inside the other
#################################################################.
# QUESTION ####
#
# create a vector that contains the even #rs from 2 through 30 followed by
# the odd #rs from 1 through 30. Write your command using the least amount
# of typing possible.
#
# The result should be as shown below.
#
# > # YOUR COMMAND GOES HERE
#
# [1] 2 4 6 8 10 12 14 16 18 20 22 24 26 28 30 1 3 5 7 9 11 13 15 17 19 21 23 25 27 29
#################################################################.
#################################################################.
# QUESTION ####
#
# Use R's functions that we learned about to
# create a vector of the evens from 2 through 10 repeated to a length of 27
# DO NOT SIMPLY JUST TYPE THE NUMBERS IN A c(). Use one or more functions other than just c().
#
# The result should be as shown below.
#
# > # YOUR COMMAND GOES HERE
#
# [1] 2 4 6 8 10 2 4 6 8 10 2 4 6 8 10 2 4 6 8 10 2 4 6 8 10 2 4
#################################################################.
#################################################################.
# QUESTION ####
#
# Generate a vector that contains the even #rs from 1 through 10
# followed by the odd #rs from 1 through 10.
# All of these numbers should be repeated 3 times
#
# The output should look like this:
#
# > # YOUR COMMAND GOES HERE
#
# [1] 2 4 6 8 10 1 3 5 7 9 2 4 6
# [14] 8 10 1 3 5 7 9 2 4 6 8 10 1
# [27] 3 5 7 9
#
# MAKE SURE YOU
# - use the c function when necessary to combine the evens and odds into a single vector
# - put the commas in the correct place
# - put all parentheses in the correct places
#################################################################.
#################################################################.
# QUESTION ####
#
# Create a vector that has the numbers 0.3, 0.6, 0.9, 1.2, 1.5 ... for a total of 300 values
#################################################################.
4.39 The : operator
#----------------------------------------------------------------------------.
# The : operator is a shorthand for a basic usage of the seq function that only uses from and to arguments ####
#
# For example:
#
# > 3:6 # is the same as seq(from=3, to=6)
#
# [1] 3 4 5 6
#
# See more examples below.
#----------------------------------------------------------------------------.
3:5 # 3 4 5
[1] 3 4 5
seq(3,5) # 3 4 5 (same thing)
[1] 3 4 5
5:3 # 5 4 3
[1] 5 4 3
seq(5,3) # 5 4 3 (same thing)
[1] 5 4 3
-3:5 # -3 -2 -1 0 1 2 3 4 5
[1] -3 -2 -1 0 1 2 3 4 5
seq(-3,5) # -3 -2 -1 0 1 2 3 4 5 (same thing)
[1] -3 -2 -1 0 1 2 3 4 5
3:-5 # 3 2 1 0 -1 -2 -3 -4 -5
[1] 3 2 1 0 -1 -2 -3 -4 -5
seq(3,-5) # 3 2 1 0 -1 -2 -3 -4 -5 (same thing)
[1] 3 2 1 0 -1 -2 -3 -4 -5
4.40 Order of operations in R
###########################################################################.
# Order of operations in R ####
#
# To see the full list of the order of operations for R (or "operator precedence")
# type the following (notice the CAPITAL "S" in ?Syntax).
#
# ?Syntax # Operators that appear higher in the list are done first. ####
#
# or see the following webpage:
#
# https://stat.ethz.ch/R-manual/R-devel/library/base/html/Syntax.html
###########################################################################.
# Let's look at the complete order of operations for R's operators.
# as mentioned above operators that appear higher in the list are done
# before operators that appear lower in the list.
# View the help page by typing:
#
# ?Syntax
# Notice that the colon operator is done AFTER exponentiation but
# BEFORE multiplication, division, addition and subtraction are done!
#
# Be careful of the order of operations!
# The colon operator is done BEFORE the subtraction operator
15-4:2 # result is 11 12 13 (might not be what you would have thought)
[1] 11 12 13
#original: 15-4:2
# colon is first: 15-c(4,3,2)
# minus is next: c(15-4, 15-3, 15-2)
# c(11, 12, 13)
(15-4):2 # this is different
[1] 11 10 9 8 7 6 5 4 3 2
#original: (15-4):2
# minus first: 11:2
# colon is next: c(11,10,9,8,7,6,5,4,3,2)
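# A few more comparisons show where the colon operator sits in the order of
# operations. The variable n is made up for this example. The first pair is
# a common slip when you want the numbers from 1 up to one less than n.

```r
n <- 5

1:n-1       # colon first: (1:5) - 1   gives 0 1 2 3 4
1:(n-1)     # parentheses first: 1:4   gives 1 2 3 4

2^2:6       # exponent first: 4:6      gives 4 5 6
2*2:6       # colon first: 2 * (2:6)   gives 4 6 8 10 12

-3:5        # the minus sign on the 3 is applied first: (-3):5
-(3:5)      # parentheses first        gives -3 -4 -5
```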
4.41 Help pages for R’s operators
#------------------------------------------------------------------.
#
# ?`:` # type this to see the help page for the colon operator (e.g. 3:5) ####
#
#------------------------------------------------------------------.
# To get more info about the colon operator,
# you can read the R help documentation for the : operator.
#
# To do so, you must enclose the colon in `backticks` (also known as `grave accents`).
# The backtick (or grave accent) character is on most USA keyboards
# in the upper left hand corner under the ESC key.
# It is on the same key as the "~" (tilde) character.
# ?`:` # You must enclose the colon in `backticks` (also known as `grave accents`) ####
# help(`:`) # this does the same thing
# If you leave out the `backticks` (AKA `grave accents`) you will get an error.
# Note the red "x" in the left margin in RStudio next to the following command.
#
# ?: # ERROR
# You can also use backticks for help topics that contain other symbols or spaces
#
# ?`+` # Shows help topic for + (and other arithmetic operators)
#############################################################################.
# The following was added in 2022
# NOTE - in recent versions of R, in addition to `backticks`
# 'single quotes' (i.e. 'apostrophes')
# and "double quotes" (i.e. "quotes")
# also work.
#############################################################################.
# as of 2022, 'single quotes' "double quotes" and `backticks` all work
#
# ?':' # single quotes
# ?":" # double quotes
# ?`:` # backticks
#
# ?'+' # single quotes
# ?"+" # double quotes
# ?`+` # backticks