12  12a. Indexing with logicals, e.g. someVector [ c(TRUE, FALSE, TRUE) ]

12.1 Some review

Quick review of “indexing”

Remember that “indexing a vector” means to use one vector to identify the values that you want from anther vector. The “index” (which is a vector) gets placed in the [square brackets], e.g. someVector [ theIndex ]

So far we learned two different types of indexes (a) positive numbers, e.g. someVector[c(1,3)] gets only the 1st and 3rd values. (b) negative numbers, e.g. someVector[c(-2,-4,-6)] gets all values EXCEPT for the 2nd, 4th and 6th values.

In this section we cover a new way to index, ie. using logical values as the index e.g. someVector[c(TRUE, FALSE, TRUE)] - this is explained below.

Quick review of logical vectors

Remember that a “logical vector” is contains only the values TRUE and/or FALSE. See examples below:

3 > 2
[1] TRUE
c(100,200) < c(50,400)  # FALSE TRUE
[1] FALSE  TRUE
c(1,2,3,4)  <= c(1,2)  # recycling rule: c(1,2,3,4) <= c(1,2,1,2) 
[1]  TRUE  TRUE FALSE FALSE

12.2 Indexing with logicals, e.g. someVector [ c(TRUE, FALSE, TRUE) ]

You can index any vector by using a logical vector to identify which values you want.

For example assume that x is a vector. The expression x[c(TRUE, FALSE, TRUE)] will return the 1st and 3rd values from x but not the 2nd value (see examples below). In general the values returned from the vector outside the [square brackets] are the values in those positions that correspond to the TRUE values in the vector inside the [square brackets].

# an example
grades <- c(70, 72, 80, 85, 88, 95)

grades
[1] 70 72 80 85 88 95
grades[c(TRUE, TRUE, FALSE, FALSE, FALSE, TRUE)]   # 70 72 95
[1] 70 72 95

As usual, the value of the grades vector doesn’t change unless you assign a new value to it (which we did not do in this example).

grades    # grades didn't change
[1] 70 72 80 85 88 95
# another example
grades[c(FALSE, FALSE, FALSE, TRUE, TRUE, FALSE)]  # 85 88
[1] 85 88

12.3 recycling rule

Remember that a command that involves more than one vector is often affected by R’s “recycling rule”. In this case, the recycling rule works to repeat the logical vector as many times as necessary to equal the length of the vector being indexed. See examples below.

grades <- c(70, 72, 80, 85, 88, 95)
grades
[1] 70 72 80 85 88 95
grades[c(TRUE, FALSE)]   # 70  80  88 (i.e. every other grade starting with 1st grade)
[1] 70 80 88
# original      : grades[c(TRUE, FALSE)]
# recycling rule: grades[c(TRUE, FALSE, TRUE, FALSE, TRUE, FALSE)]
#               : 70 80 88

grades                   # 70 72 80 85 88 95
[1] 70 72 80 85 88 95
grades[c(FALSE, TRUE)]   #    72    85    95 (every other grade starting with 2nd grade)
[1] 72 85 95
# original      : grades[c(FALSE, TRUE)]
# recycling rule: grades[c(FALSE, TRUE, FALSE, TRUE, FALSE, TRUE)]
#               : 72 85 95

grades                               # 70 72 80 85 88 95
[1] 70 72 80 85 88 95
grades[c(TRUE, FALSE, TRUE, TRUE)]   # 70    80 85 88 
[1] 70 80 85 88
# original      : grades[c(TRUE, FALSE, TRUE, TRUE)]
# recycling rule: grades[c(TRUE, FALSE, TRUE, TRUE, TRUE, FALSE)]
#               : 70 80 85 88

12.4 Generating TRUE/FALSE values (i.e. a logical vector) to describe some condition.

We can generate a logical vector with a comparison operator (i.e. > < >= <= == !=).

grades <- c(70, 72, 80, 85, 88, 95)
grades
[1] 70 72 80 85 88 95
grades >= 80   # FALSE FALSE TRUE TRUE TRUE TRUE
[1] FALSE FALSE  TRUE  TRUE  TRUE  TRUE
# original       : grades >= 80
#                : c(70,72,80,85,88,95) >= 80
# recycling rule : c(70,72,80,85,88,95) >= c(80,80,80,80,80,80)
#                : c(70>=80 , 72>=80 , 80>=80 , 85>=80 , 88>=80 , 95>=80)
#                : c(FALSE,   FALSE,   TRUE,    TRUE,    TRUE,    TRUE)
#                : FALSE FALSE TRUE TRUE TRUE TRUE


grades >= c(80, 90)  # FALSE FALSE TRUE FALSE TRUE TRUE
[1] FALSE FALSE  TRUE FALSE  TRUE  TRUE
# original       : grades >= c(80, 90)
#                : c(70,72,80,85,88,95) >= c(80,90)
# recycling rule : c(70,72,80,85,88,95) >= c(80,90,80,90,80,90)
#                : c(70>=80,72>=90,80>=80,85>=90,88>=80,95>=90)
#                : c(FALSE, FALSE ,TRUE  ,FALSE  ,TRUE ,TRUE)

12.5 Retrieving values that match the condition

Often we want to get only those values from a vector which match a specific condition. To do so use a logical vector that represents that condition as the [index].

For example, if we want to see all the grades that are 80 or above. We use a logical vector INSIDE the brackets that represents the condition - see the examples below.

grades <- c(70, 72, 80, 85, 88, 95)
grades
[1] 70 72 80 85 88 95
# EXAMPLE: Display only those grades that are 80 or greater
grades[ grades >= 80 ]   #       80 85 88 95
[1] 80 85 88 95
# original                                     : grades[ grades >= 80 ]  
# create the logical vector                    : grades[ c(FALSE,FALSE,TRUE,TRUE,TRUE,TRUE)]
# extract the grades that are in TRUE positions: 80 85 88 95
# ANOTHER EXAMPLE: Show grades that are above average.
grades                           # 70 72 80 85 88 95
[1] 70 72 80 85 88 95
grades[grades >  mean(grades)]   # 85 88 95
[1] 85 88 95
# original      : grades[grades > mean(grades)]
# mean          : grades[grades > 81.66667 ]
# logical vector: grades[c(FALSE, FALSE, FALSE, TRUE, TRUE, TRUE)]
#               :  85 88 95

12.6 — Practice —

QUESTION

#################################################################.
# QUESTION: grades is a vector that contains student grades, e.g.
#
#               grades <- c(70, 72, 80, 85, 88, 95)
#
# (a) Write a command that will only show those grades that are 
#     an even multiple of 10 (i.e. that end in zero).
#     HINT - think about using the %% operator as part of your
#     answer. 
#
# (b) Write a command that displays all the other grades.
#################################################################.
##########.
# ANSWER
##########.

# Start with some sample grades
grades <- c(70, 72, 80, 85, 88, 95)

# ANSWER
grades [ grades %% 10 == 0    ]
[1] 70 80
# REMEMBER - you must use two == signs for comparisons. 
# Using one = sign is an error:
grades [ grades %% 10 = 0    ]   # ERROR - remember - use == for comparisons
Error: <text>:3:23: unexpected '='
2: # Using one = sign is an error:
3: grades [ grades %% 10 =
                         ^
###################################.
# How to think through the problem
###################################.

# step 1: You need to get some values from the grades vector. Therefore the command
#         needs to specify 
#                    grades[ SOME_VECTOR ]   
#         You need a vector in the [brackets] that identifies which grades you want.
#
#         Exactly which values you need does not depend on position numbers. Rather
#         the grades that you want depends on a particular condition, ie. those
#         grades that are evenly divisible by 10. Therefore formulate the command
#         in the following way:
# 
#             grades[ LOGICAL-VECTOR-THAT-HAS-TRUEs-AND-FALSEs-IN-THE-RIGHT-PLACES ]
#
# step 2: REMEMBER - the %% operator gets you a remainder so
#                   42 %% 10 is 2
#                   75 %% 10 is 5
#                       but
#                   40 %% 10 is 0
#                   70 %% 10 is also 0
#                   0 %% 0   is also 0
#         The condition we are looking for is the grade%%10 is 0 (i.e. the grade is divisible by 10)
#         The condition we are looking for is    :    grades %% 10 == 0
#
# step 3: put it all together
#           grades [ grades %% 10 == 0]

# show all the grades
grades
[1] 70 72 80 85 88 95
# show the grades that are an even multiple of 10
grades [  grades %% 10 == 0]
[1] 70 80
# original                   : grades [  grades %% 10 == 0]
# expand grades inside the []: grades [ c(70,72,80,85,88,95) %% 10 == 0]
# apply the %% to each grade : grades [ c(70%%10,72%%10,80%%10,85%%10,88%%10,95%%10) == 0]
#                            : grades [ c(0     ,2     ,0     ,5     ,8     ,5     ) == 0]
# apply the == to each value : grades [ c(0==0,2==0,0==0,5==0,8==0,5==0) ]
#                            : grades [ c(TRUE,FALSE,TRUE,FALSE,FALSE,FALSE) ]
# pull out the values of grades that are in the TRUE positions: 70 80 
##########.
# ANSWER
##########.
grades [ grades %% 10 != 0    ]
[1] 72 85 88 95

QUESTION

###########################################################################.
# QUESTION: Write a command to display the values in the grades vector 
#           that are at least 5 points above average
#
#
# EXAMPLE 1: 
#   > grades <- c(70, 72, 80, 85, 88, 95)  # average is 81.6667
#   > YOUR CODE GOES HERE     
#   [1] 88 95
#
#
# EXAMPLE 2:
#   > grades <- c(50, 70, 72, 80, 85, 88, 95)  # average is 77.143
#   > YOUR CODE GOES HERE     
#   [1] 85 88 95
#########################################################################.
#---------.
# ANSWER
#---------.
grades <- c(70, 72, 80, 85, 88, 95)  # average is 81.6667
grades [ grades >= mean(grades) + 5 ]
[1] 88 95

QUESTION

#########################################################################.
# QUESTION: Show grades that are within 5 points of the highest grade
# HINT: use max function and a little math
#
# 
# EXAMPLE 1: 
#   > grades <- c(70, 72, 80, 85, 91, 93, 93, 95)
#   > YOUR CODE GOES HERE     
#  [1] 91 93 93 95
#
#
# EXAMPLE 2: 
#   > grades <- c(95, 92, 89, 89, 88, 80, 75)
#   > YOUR CODE GOES HERE     
#  [1] 95 92
#
#
# EXAMPLE 3: 
#   > grades <- c(95, 89, 89, 88, 80, 75)
#   > YOUR CODE GOES HERE     
#  [1] 95
#########################################################################.
#---------.
# ANSWER
#---------.
grades <- c(70, 72, 80, 85, 91, 93, 93, 95)
grades[ grades >= max(grades) - 5]
[1] 91 93 93 95