6. Functions

Although there are many functions available in Splus, there are times when a function tailored to a specific situation is needed. Functions are written in the form:

          name_function(arguments){expressions}
where name is the name given to the function, arguments is a list of arguments separated by commas which may be used by the function, and expressions are any legal Splus expressions. The following is an example of a function in Splus:

std.dev_function(x)

{
      std.x_sqrt(var(x))
      std.x
}
The last line in the function gives the name of the value to be returned by the function.

> std.dev(c(1,2,3,4,5))
[1] 1.581139
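As a further sketch of the same pattern, one function can call another. The name coef.var below is hypothetical, chosen only for illustration, and the block uses <- for assignment, which has the same effect as _ here. std.dev is repeated so that the block stands on its own.

```r
# std.dev as defined above, repeated so the block is self-contained
std.dev <- function(x)
{
      std.x <- sqrt(var(x))
      std.x
}

# coef.var is a hypothetical name: the standard deviation divided by the mean
coef.var <- function(x)
{
      cv <- std.dev(x) / mean(x)
      cv
}

coef.var(c(1,2,3,4,5))   # sqrt(2.5)/3, about 0.527
```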

Conditional Computations

if, else

     if( condition ) expr1

            or

     if( condition ) expr1 else expr2
If the logical expression condition is evaluated to TRUE then the expression expr1 is evaluated. If condition is evaluated to FALSE, then either the value of the whole expression is NULL, or when an else part is specified, expr2 is evaluated:

> x_2

> if (x > 0) log(x)
[1] 0.6931472

> if (x > 100) (x - 100)
NULL

> if (x < 0) x else log(x)
[1] 0.6931472

Logical Operators

   &&    sequential and
   ||    sequential or
The operators & and | always evaluate both sides of a logical expression, although this is not always necessary. The sequential operators && and || evaluate the right operand only when it is needed, which avoids errors that could cause the function to fail. If the left operand is FALSE, && returns FALSE and the rest of the expression is not evaluated. If the left operand is TRUE, || returns TRUE and the rest of the expression is not evaluated. Both operators can be used only with scalars.

> y_"Jo"
> if (is.numeric(y) && (y > 0)) log(y)
NULL
Because the first logical expression is FALSE, the second expression, which is not meaningful for a character string, is never evaluated.
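The same test can be wrapped in a small function. This is only a sketch: safe.log is a hypothetical name, not part of Splus, and returning NA for unsuitable input is an assumption made for the example. The argument is assumed to be a single value, since && works only with scalars.

```r
# safe.log is a hypothetical helper, not a built-in function
safe.log <- function(y)
{
      # (y > 0) is only evaluated when is.numeric(y) is TRUE
      if (is.numeric(y) && (y > 0)) log(y) else NA
}

safe.log(2)      # log(2)
safe.log("Jo")   # NA, the comparison y > 0 is never evaluated
safe.log(-1)     # NA
```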

stop, warning

stop() issues an error message from a function and causes the function to be exited:

     if (y > 0) log(y) else stop("y is not greater than 0")
warning() prints a warning message but does not cause the function to be exited:

>  x_c(1,2,3,4,5,NA)

>  if (sum(is.na(x)) > 0) {x_x[!is.na(x)]
+                          warning("Missing values have been deleted")}

Warning messages:
  Missing values have been deleted

> x
[1] 1 2 3 4 5
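Both functions are typically used together inside a function: stop() for input that cannot be handled, warning() for input that can be repaired. The sketch below assumes a hypothetical function named clean.mean; the messages are illustrative.

```r
# clean.mean is a hypothetical name used only for this sketch
clean.mean <- function(x)
{
      # unusable input: exit the function with an error message
      if (!is.numeric(x)) stop("x must be numeric")

      # repairable input: drop the missing values, warn, and carry on
      if (sum(is.na(x)) > 0) {
            x <- x[!is.na(x)]
            warning("Missing values have been deleted")
      }
      mean(x)
}

clean.mean(c(1,2,3,4,5,NA))   # issues the warning, then returns 3
```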

switch

The function switch() takes as its first argument an expression which is then matched to the names assigned to the remaining arguments to decide on the course of action to take. The first argument may be either a character string or a single number.

     y_switch(transformation,
            logarithm = log(x),
            "square root" = sqrt(x),
            stop("Invalid transformation"))
Here, transformation should be a character string matching one of the two named arguments. The argument whose name matches the character string transformation is evaluated. If there is no match, an unnamed argument, if any, is taken to be the default expression and evaluated. Notice the quotes used around the argument name square root. Without the quotes, the name would be treated as two separate names rather than a single string.
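A switch() expression is often the whole body of a function, with the selected value returned directly. The wrapper below is a sketch; do.transform is a hypothetical name.

```r
# do.transform is a hypothetical wrapper around a switch() call
do.transform <- function(x, transformation)
{
      switch(transformation,
             logarithm = log(x),
             "square root" = sqrt(x),
             stop("Invalid transformation"))   # unnamed default
}

do.transform(c(1,4,9), "square root")   # 1 2 3
do.transform(1, "logarithm")            # 0
```

Calling do.transform(x, "cube") matches neither name, so the default is evaluated and the function exits with the error message.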

Iteration

for

for() loops are written in the form:

               for( index in values) { expressions }
where the curly brackets are optional when only one expression is specified.

> x_0
> for(i in 1:10) x_x+i
> x
[1] 55
for() loops may also be used to construct data objects by using index as a subscript for the variable being created. In this case, the variable must be initialized outside the for() loop.

> rm(x)
> for(i in 1:10) x[i]_i
Error: Object "x" not found

> x_NULL
> for(i in 1:10) x[i]_i
> x
 [1]  1  2  3  4  5  6  7  8  9 10
To create a matrix using two for() loops, a matrix of the correct dimensions (here filled with zeros) must be created outside the for() loops.

> x_matrix(0, nrow=3, ncol=4)

> for(i in 1:3) {
+                for(j in 1:4) { x[i,j]_i+j }}

> x
     [,1] [,2] [,3] [,4]
[1,]    2    3    4    5
[2,]    3    4    5    6
[3,]    4    5    6    7
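The same idea applies to vectors whose final length is known in advance: the result can be created at full length before the loop and filled in by subscript. The sketch below computes running totals; the variable names are illustrative.

```r
x <- c(3,1,4,1,5)

# create the result at its final length before the loop
running <- numeric(length(x))
total <- 0

for(i in 1:length(x)) {
      total <- total + x[i]
      running[i] <- total      # fill in position i
}

running   # 3 4 8 9 14
```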

while

The while() expression keeps testing a condition and, so long as the condition is TRUE, evaluates the expressions provided. while() iterations are written in the form:

              while(condition) { expressions}


> x_2
> i_0
> while(x*2 < 1000000) { x_x*2
+                        i_i+1}

> x
[1] 524288

> i
[1] 18
Notice that in order for the last value of x to be less than 1000000, the condition in the while() expression must be x*2 < 1000000 and NOT x < 1000000.
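The loop can also be placed inside a function so that the starting value and the limit become arguments. This is a sketch only; doublings is a hypothetical name.

```r
# doublings is a hypothetical function: how many times can start be
# doubled while the result stays below limit?
doublings <- function(start, limit)
{
      x <- start
      i <- 0
      while(x*2 < limit) {
            x <- x*2
            i <- i+1
      }
      i
}

doublings(2, 1000000)   # 18, the value of i in the example above
```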

repeat

A repeat() expression could also have been used in the previous example:

> x_2
> i_0
> repeat{ i_i+1
+         x_x*2
+         if(x*2 > 1000000) break}

> x
[1] 524288

> i
[1] 18
repeat() will keep repeating the expressions forever unless some condition is set within the iteration to break out of the loop. Here, a break expression terminates the loop. When Splus evaluates a break, it exits the innermost enclosing for(), while(), or repeat() loop.

Default arguments, return

As mentioned in previous sections, a function may have a default setting for some of its arguments. An example of this was the mean() function, where the default for the argument trim= is trim=0. The first line of the function mean() looks like this:


 mean_function(x, trim = 0, na.rm = F)

Specifying default values for some of the arguments in a function allows the user to omit those arguments when making a call to the function. As seen in section 4, typing

> mean(x)

returns a regular mean, and specifying the trim= argument, for example

> mean(x, trim=0.2)

will return a trimmed mean. The same reasoning applies to the na.rm= argument.
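Defaults are given in the same way when writing a new function. In the sketch below, rescale and its arguments factor and centre are hypothetical names chosen for illustration.

```r
# rescale is a hypothetical function with two default arguments
rescale <- function(x, factor = 1, centre = F)
{
      if (centre) x <- x - mean(x)   # only done when centre = T is given
      x * factor
}

rescale(c(1,2,3))                 # defaults used: 1 2 3
rescale(c(1,2,3), factor = 10)    # 10 20 30
rescale(c(1,2,3), centre = T)     # -1 0 1
```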

Objects which are created or modified inside a function are temporary, have no effect outside the function, and disappear when the evaluation of the function is complete. There are several ways of returning the value of a function, one of which is to name the value to be returned on the last line of the function. This value can be stored permanently by assigning a name to the function evaluation. For example:

> x.mean_mean(x)
creates a data object, x.mean, containing the mean of the variable x. It is possible to return more than one data object using the return() function. The return() function creates a list with its arguments as the components of the list.

example_function(x)
{
        x.mean <- mean(x)
        x.var <- var(x)
        x.sum <- sum(x)
        return(x.mean, x.var, x.sum)
}

> x_c(1,2,3,4,5)
> example(x)

$x.mean:
[1] 3

$x.var:
[1] 2.5

$x.sum:
[1] 15
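Once the result is stored, the individual components of the list can be extracted by name with the $ operator. The sketch below builds the list explicitly with list() as the last expression of the function, which gives the same named components; describe is a hypothetical name.

```r
# describe is a hypothetical function returning a named list
describe <- function(x)
{
      list(x.mean = mean(x), x.var = var(x), x.sum = sum(x))
}

result <- describe(c(1,2,3,4,5))
result$x.mean   # 3
result$x.var    # 2.5
result$x.sum    # 15
```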

Input/Output

cat

The cat() function coerces its arguments to mode character and prints the result. This function is very useful for keeping track of the progress of loops.

> y_0
> for(i in 1:3){
+ y_y+i
+ cat("i is", i, "\n")}
i is 1
i is 2
i is 3

readline

The readline() function returns a character string read in from a terminal. The input ends after a return or enter.

  example_function(){
  cat("Type in the name of the distribution from which to generate","\n")
  cat("10 random variables","\n")
  cat("normal or uniform:")
  distrib_readline()
  switch(distrib,
         normal = rnorm(10),
         uniform = runif(10),
         stop("Invalid choice"))}

scan

The scan() function reads in numeric data interactively or from a file. The argument n= specifies how many data values are to be read in.

  example_function(){
  cat("Enter 10 numbers","\n")
  x_scan(n=10)
  total_sum(x)
  cat("The total is",total,"\n")}
The scan() function can also be used to read numeric data into a matrix. Suppose the file rain contains rainfall for five months from 1900-1950, and that the data is stored in 5 columns. The data would be read in as follows:

> rain_matrix(scan("rain"),ncol=5, byrow=T)
To read in character data, the argument what=character() must be specified. More information on the options for the scan(), readline(), and cat() functions is available in the help documentation.
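The matrix example can be tried without an existing data file by writing a few numbers to a temporary file first. This is a sketch under the assumption that tempfile() and unlink() are available for creating and removing the file; datafile is an illustrative name and the six numbers are made up.

```r
# datafile is a temporary file standing in for a data file such as rain
datafile <- tempfile()
cat(1, 2, 3, 4, 5, 6, "\n", file = datafile)

# read the six values and arrange them row by row in 3 columns
m <- matrix(scan(datafile), ncol = 3, byrow = T)
m             # first row 1 2 3, second row 4 5 6

unlink(datafile)   # remove the temporary file
```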

source

Simple functions may be typed directly into Splus, but this is not advisable in the case of longer functions. These should be typed in a UNIX file and read into Splus using the source() function. Suppose the function example was stored in a file called ex1. The following command would read the function into Splus:

> source("ex1")
The source() function may be used to read in any Splus expression. Rather than typing the expressions directly into Splus, they can be stored in a file and evaluated using the source() function. Suppose the file ex2 contains the following Splus expressions:

     friends_c("Jack","Jill")
     phones_c(5554321,5551234)
     cat(friends[1],phones[1],"\n")
     cat(friends[2],phones[2],"\n")
The expressions are then evaluated using the source() function:

> source("ex2")
Jack 5554321
Jill 5551234
Unlike data objects created within a function, those created from a file outside Splus and evaluated using source() remain in the .Data directory after the expressions have been evaluated:

> friends
[1] "Jack" "Jill"
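The whole cycle can be sketched in one session by writing the expressions to a temporary file and then evaluating them with source(). The names scriptfile, squares and total are illustrative, and tempfile() and unlink() are assumed to be available.

```r
# scriptfile is a temporary file standing in for a file such as ex2
scriptfile <- tempfile()
cat("squares <- (1:5)^2\n", "total <- sum(squares)\n", file = scriptfile)

source(scriptfile)   # evaluate the expressions stored in the file

total                # 55; the object remains after source() has finished

unlink(scriptfile)   # remove the temporary file
```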

Further Reading

Richard A. Becker, John M. Chambers, Allan R. Wilks, The New S Language: A Programming Environment for Data Analysis and Graphics, Wadsworth & Brooks/Cole Advanced Books & Software, Pacific Grove, California, 1988, pp. 24, 25, 151-233.
