> x <- c(1,2,3,3,3,4,7,8,9,NA)
   * when there are missing values in the data, the functions max(), min(), range(), mean(), and median() return NA, and the functions var(), cor(), and quantile() return an error message
> max(x, na.rm=T)
[1] 9
   * specifying na.rm=T in the function max() forces Splus to remove any missing values from the vector x and to return the maximum of the remaining values
> min(x, na.rm=T)
[1] 1
> range(x, na.rm=T)
[1] 1 9
> mean(x, na.rm=T)
[1] 4.444444
> mean(x, trim=0.2, na.rm=T)
[1] 4.285714
   * the argument trim is the fraction of observations (any value between 0 and 0.5 inclusive) to be trimmed from each end of the ordered data before the mean is computed
   * if trim=0.5, the result is the median
> median(x, na.rm=T)
[1] 3
> quantile(x, probs=c(0,0.1,0.9), na.rm=T)
 0% 10% 90%
  1 1.8 8.2
   * the function quantile() returns the quantiles of x specified in the argument probs

If there are no missing values in the vector x, it is not necessary to specify na.rm=T; simply use min(x), max(x), etc.
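As a rough illustration of what the trim argument does, the trimmed mean from the session above can be reproduced by hand. This is only a sketch, not a description of how mean() works internally. The names xs, n, k, trimmed and by.hand are introduced here for illustration, and the sketch assumes that the number of values dropped from each end is the integer part of trim times the number of non-missing values, which agrees with the result shown above.

```r
x <- c(1,2,3,3,3,4,7,8,9,NA)
xs <- sort(x[!is.na(x)])       # drop the missing value, then order the data
n <- length(xs)                # 9 non-missing values
k <- floor(n * 0.2)            # assumed number trimmed from each end (here 1)
trimmed <- xs[(k+1):(n-k)]     # 2 3 3 3 4 7 8
by.hand <- mean(trimmed)       # 30/7, about 4.285714
by.hand
```

The value of by.hand should agree with mean(x, trim=0.2, na.rm=T).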
These functions may also be used on matrices. They are not applied to the rows or columns individually; instead they return the max, min, etc. of the whole matrix.
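If row-by-row or column-by-column results are wanted, one common approach is the function apply(), which is not covered in the text above. The following is a minimal sketch; the matrix m and its values are made up purely for illustration.

```r
# hypothetical 2 x 3 matrix, filled row by row
m <- matrix(c(1,5,3, 8,2,7), nrow=2, byrow=TRUE)

whole <- max(m)                # maximum of the whole matrix: 8
by.row <- apply(m, 1, max)     # maximum of each row: 5 8
by.col <- apply(m, 2, max)     # maximum of each column: 8 5 7
```

The second argument of apply() selects the dimension: 1 for rows, 2 for columns.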
> var(x[!is.na(x)])
[1] 8.027778
   * missing values are removed from the vector x using the subscript !is.na(x)
   * when two arguments are given to the var() function, var(x,y) returns the covariance between the two arguments
   * arguments may be vectors or matrices
> y <- c(1,2,3,4,5,6,7,8,9,10)
> cor(x[!is.na(x)], y[!is.na(x)])
[1] 0.9504597
   * because the cor() function requires x and y to be of the same length, it is necessary to remove the value of y corresponding to the missing value in x; this is done using y[!is.na(x)]
> summary(x)
 Min. 1st Qu. Median  Mean 3rd Qu. Max. NA's
    1       3      3 4.444       7    9    1
> z <- c(5,4,3,2,1,9,8,7,6,5)
> pmax(x,y,z)
 [1]  5  4  3  4  5  9  8  8  9 NA
> pmin(x,y,z)
 [1]  1  2  3  2  1  4  7  7  6 NA
   * pmax() returns the maximum value for each position in a number of vectors
   * likewise, pmin() returns the minimum value for each position
   * na.rm=T may also be specified to remove missing values
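The subscripting idea above can be made a little more general by building one logical vector that marks the positions where both x and y are present, and reusing it. This is a sketch under the assumption that the same vectors as in the session are used; the names ok, r, cv and pm are introduced here only for illustration.

```r
x <- c(1,2,3,3,3,4,7,8,9,NA)
y <- c(1,2,3,4,5,6,7,8,9,10)
z <- c(5,4,3,2,1,9,8,7,6,5)

ok <- !is.na(x) & !is.na(y)        # TRUE where both x and y are present
r <- cor(x[ok], y[ok])             # correlation, about 0.9504597
cv <- var(x[ok], y[ok])            # covariance of x and y: 7.375
pm <- pmax(x, y, z, na.rm=TRUE)    # last position is now max(10, 5) = 10
```

Using a single ok vector guards against dropping different positions from x and y by mistake.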
<dist>    Parameters          Defaults    Distributions
beta      shape1, shape2      -, -        Beta
binom     size, prob          -, -        Binomial
cauchy    location, scale     0, 1        Cauchy
chisq     df                  -           Chisquare
exp       rate (1/mean)       1           Exponential
f         df1, df2            -, -        F
gamma     shape               -           GAMMA
geom      prob                -           Geometric
hyper     m, n, k             -, -, -     Hypergeometric
lnorm     mean, sd (of log)   0, 1        Lognormal
logis     location, scale     0, 1        Logistic
norm      mean, sd            0, 1        Normal
nrange    size, nevals        -, 200      Normal Range
                              -, - for rnrange
pois      lambda              -           Poisson
t         df                  -           Student's t
unif      min, max            0, 1        Uniform
weibull   shape               -           Weibull
wilcox    m, n                -, -        Wilcoxon

For help on the use of the d<dist>(), p<dist>(), q<dist>(), and r<dist>() functions for each of these distributions, use help with the name of the distribution as it appears in the column Distributions (e.g., help(GAMMA)), with the following exceptions:
   * for logis, type help(dlogis)
   * for nrange, type help(dnrange)
   * for the F distribution and Student's t distribution, type help.start(gui='motif'), click on Probability Distributions and Random Numbers under the column Categories, then click on F or T in the left-hand column
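To make the naming scheme concrete, the four prefixes can be combined with one entry from the table, here binom. This is a small sketch; the parameter values size=10 and prob=0.5 are chosen arbitrarily for illustration, and the variable names d, p, q and r are not part of the original text.

```r
d <- dbinom(3, size=10, prob=0.5)     # density: P(X = 3) = 120/1024
p <- pbinom(3, size=10, prob=0.5)     # cumulative probability: P(X <= 3) = 176/1024
q <- qbinom(0.5, size=10, prob=0.5)   # quantile: smallest k with P(X <= k) >= 0.5
r <- rbinom(5, size=10, prob=0.5)     # 5 random binomial values
```

Because binom has no defaults for size and prob, both must always be supplied.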
> dnorm(0)
[1] 0.3989423
   * returns the density at 0 for the standard normal distribution
> X11()
> plot(seq(-3,3,0.1), dnorm(seq(-3,3,0.1)), type="l")
   * the d<dist>() functions can be used to plot the density function for each of the above distributions
> pnorm(1.96)
[1] 0.9750021
   * returns the cumulative probability at 1.96 for the standard normal distribution
> qnorm(0.9750021)
[1] 1.96
   * returns the quantile of the standard normal distribution at cumulative probability 0.9750021 (approximately the 97.5th percentile)
> rnorm(5)
[1] -0.7160094  0.3953744  1.2587492  0.3022640 -0.4109508
   * generates 5 random standard normal values
> rexp(5,1/3)
[1] 0.1204068 0.1937435 9.3637550 0.8051347 1.0450249
   * generates 5 random exponential values with rate 1/3 (mean 3)
   * this could also have been written as
> rexp(5, rate=1/3)
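The p<dist>() and q<dist>() functions are inverses of one another, which the session above shows for 1.96. The sketch below checks this round trip and evaluates the density on the same grid used in the plot; the names p, q, crit, grid and dens are introduced here only for illustration.

```r
p <- pnorm(1.96)         # cumulative probability, about 0.9750021
q <- qnorm(p)            # back to 1.96
crit <- qnorm(0.975)     # exact 97.5th percentile, about 1.959964

grid <- seq(-3, 3, 0.1)  # same x values as in the plot
dens <- dnorm(grid)      # density values; largest at 0, symmetric about 0
```

The small difference between crit and 1.96 is why pnorm(1.96) is slightly larger than 0.975.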