Row means with a minimum number of valid values

This function is similar to the SPSS MEAN.n function. It computes row means from a data.frame or matrix, but only for rows with at least n valid (non-NA) values.

mean_n(dat, n, digits = 2)

Arguments

dat

A data frame with at least two columns, for which row means are computed.

n

May either be

a numeric value that indicates the number of valid values per row required to calculate the row mean;
or a value between 0 and 1, indicating the proportion of valid values per row required to calculate the row mean (see 'Details').

If a row's number of valid values is less than n, NA is returned as the row mean.

digits

Numeric value indicating the number of decimal places used for rounding the mean values. Negative values are allowed (see 'Details').

Value

A vector with row mean values of dat for those rows with at least n valid values. Else, NA is returned.

Details

Rounding to a negative number of digits means rounding to a power of ten, so for example mean_n(dat, 3, digits = -2) rounds to the nearest hundred.
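
The effect of a negative digits value can be seen with base R's round() alone. The short sketch below assumes that mean_n() rounds its result the same way round() does; the vector x is made up for illustration and is not part of the package.

```r
# Illustrative values only; x is a hypothetical vector of row means.
x <- c(1234.567, 449)

two_decimals <- round(x, digits = 2)   # keep two decimal places
nearest_ten  <- round(x, digits = -1)  # round to the nearest 10
nearest_hund <- round(x, digits = -2)  # round to the nearest 100

print(two_decimals)  # 1234.57  449.00
print(nearest_ten)   # 1230  450
print(nearest_hund)  # 1200  400
```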

n must be a numeric value from 0 to ncol(dat). If a row in dat has at least n non-missing values, the row mean is returned. If n is a non-integer value from 0 to 1, n is considered to indicate the proportion of necessary non-missing values per row. E.g., if n = .75, a row must have at least ncol(dat) * n non-missing values for the row mean to be calculated. See 'Examples'.
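
The threshold rule described above can be sketched in base R as follows. This is a minimal illustration under stated assumptions, not the package's actual implementation: the helper name mean_n_sketch is hypothetical, and it assumes that any n strictly between 0 and 1 is treated as a proportion of ncol(dat).

```r
# Hypothetical helper for illustration; NOT the source of mean_n().
mean_n_sketch <- function(dat, n, digits = 2) {
  dat <- as.data.frame(dat)
  # Assumption: 0 < n < 1 is a proportion, converted to a column count.
  if (n > 0 && n < 1) {
    n <- ncol(dat) * n
  }
  # Number of non-missing values in each row.
  valid <- rowSums(!is.na(dat))
  # Row means over the non-missing values only.
  means <- unname(rowMeans(dat, na.rm = TRUE))
  # Rows with too few valid values, or with none at all, become NA.
  means[valid < n | valid == 0] <- NA
  round(means, digits)
}

dat <- data.frame(c1 = c(1, 2, NA, 4),
                  c2 = c(NA, 2, NA, 5),
                  c3 = c(NA, 4, NA, NA),
                  c4 = c(2, 3, 7, 8))

# n = .75 with 4 columns requires 4 * .75 = 3 non-missing values per row.
res_prop  <- mean_n_sketch(dat, .75)
# n = 2 requires at least 2 non-missing values per row.
res_count <- mean_n_sketch(dat, 2)

print(res_prop)   # NA 2.75 NA 5.67
print(res_count)  # 1.50 2.75 NA 5.67
```

Counting valid values with rowSums(!is.na(dat)) keeps the threshold check separate from the mean itself, so the same comparison works for both the count form and the proportion form of n.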

References

r4stats.com

Examples

dat <- data.frame(c1 = c(1,2,NA,4),
                  c2 = c(NA,2,NA,5),
                  c3 = c(NA,4,NA,NA),
                  c4 = c(2,3,7,8))

# needs at least 4 non-missing values per row
mean_n(dat, 4) # 1 valid return value
#> [1]   NA 2.75   NA   NA

# needs at least 3 non-missing values per row
mean_n(dat, 3) # 2 valid return values
#> [1]   NA 2.75   NA 5.67

# needs at least 2 non-missing values per row
mean_n(dat, 2)
#> [1] 1.50 2.75   NA 5.67

# needs at least 1 non-missing value per row
mean_n(dat, 1) # all means are shown
#> [1] 1.50 2.75 7.00 5.67

# needs at least 50% non-missing values per row
mean_n(dat, .5) # 3 valid return values
#> [1] 1.50 2.75   NA 5.67

# needs at least 75% non-missing values per row
mean_n(dat, .75) # 2 valid return values
#> [1]   NA 2.75   NA 5.67