This function computes a number of descriptive statistics for your data, similar to what SPSS's DESCRIPTIVES command (often called as DESCR) does.

descr(
  x,
  items = names(x),
  varLabels = NULL,
  mean = TRUE,
  meanCI = TRUE,
  median = TRUE,
  mode = TRUE,
  var = TRUE,
  sd = TRUE,
  se = FALSE,
  min = TRUE,
  max = TRUE,
  q1 = FALSE,
  q3 = FALSE,
  IQR = FALSE,
  skewness = TRUE,
  kurtosis = TRUE,
  dip = TRUE,
  totalN = TRUE,
  missingN = TRUE,
  validN = TRUE,
  histogram = FALSE,
  boxplot = FALSE,
  digits = 2,
  errorOnFactor = FALSE,
  convertFactor = FALSE,
  maxModes = 1,
  maxPlotCols = 4,
  t = FALSE,
  headingLevel = 3,
  conf.level = 0.95,
  quantileType = 2
)

rosettaDescr_partial(
  x,
  digits = attr(x, "digits"),
  show = attr(x, "show"),
  headingLevel = attr(x, "headingLevel"),
  maxPlotCols = attr(x, "maxPlotCols"),
  echoPartial = FALSE,
  partialFile = NULL,
  quiet = TRUE,
  ...
)

# S3 method for rosettaDescr
knit_print(
  x,
  digits = attr(x, "digits"),
  show = attr(x, "show"),
  headingLevel = attr(x, "headingLevel"),
  maxPlotCols = attr(x, "maxPlotCols"),
  echoPartial = FALSE,
  partialFile = NULL,
  quiet = TRUE,
  ...
)

# S3 method for rosettaDescr
print(
  x,
  digits = attr(x, "digits"),
  show = attr(x, "show"),
  maxPlotCols = attr(x, "maxPlotCols"),
  headingLevel = attr(x, "headingLevel"),
  forceKnitrOutput = FALSE,
  ...
)

Arguments

x

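For descr, the vector or data frame for which to produce the descriptives; for the print, knit_print and rosettaDescr_partial functions, the object to print (i.e. as produced by descr).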

items

Optionally, if x is a data frame, the variable names for which to produce the descriptives.

varLabels

Optionally, a named vector with 'pretty labels' to show for the variables. This has to be a vector of the same length as items, and if it is not a named vector with the names corresponding to the items, it has to be in the same order.

mean, meanCI, median, mode

Whether to compute the mean, its confidence interval, the median, and/or the mode (all logical, so TRUE or FALSE).

var, sd, se

Whether to compute the variance, standard deviation, and standard error (all logical, so TRUE or FALSE).

min, max, q1, q3, IQR

Whether to compute the minimum, maximum, first and third quartile, and inter-quartile range (all logical, so TRUE or FALSE).

skewness, kurtosis, dip

Whether to compute the skewness, kurtosis and dip test (all logical, so TRUE or FALSE).

totalN, missingN, validN

Whether to show the total sample size, the number of missing values, and the number of valid (i.e. non-missing) values (all logical, so TRUE or FALSE).

histogram, boxplot

Whether to show a histogram and/or boxplot (both logical, so TRUE or FALSE).

digits

The number of digits to round the results to when showing them.

errorOnFactor, convertFactor

If errorOnFactor is TRUE, factors throw an error. Otherwise, if convertFactor is TRUE, factors are converted to numeric values using as.numeric(as.character(x)), and the same output is generated as for numeric variables. If convertFactor is FALSE, a frequency table is produced instead.

maxModes

Maximum number of modes to display: if more modes than this are found, "(multi)" is displayed instead.

maxPlotCols

The maximum number of columns when plotting multiple histograms and/or boxplots.

t

Whether to transpose the data frames when printing them to the screen (this is easier for users relying on screen readers). Note: this functionality has not yet been implemented!

headingLevel

The number of hashes to print in front of the headings when knitting.

conf.level

Confidence level of the confidence interval around the mean in the central tendency measures.

quantileType

The type of quantiles to be used to compute the interquartile range (IQR). See quantile for more information.

show

A vector of elements to show in the results, based on the arguments that activate/deactivate the descriptives (from mean to validN).

echoPartial

Whether to show the executed code in the R Markdown partial (TRUE) or not (FALSE).

partialFile

This can be used to specify a custom partial file. The file will have object x available.

quiet

Passed on to knitr::knit(): whether it should be chatty (FALSE) or quiet (TRUE).

...

Any additional arguments are passed to the default print method by the print method, and to rmdpartials::partial() when knitting an RMarkdown partial.

forceKnitrOutput

Force knitr output.
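
The conversion applied when convertFactor is TRUE, as.numeric(as.character(x)), is not the same as calling as.numeric() directly on a factor, which returns the internal level codes. A minimal base R sketch of the difference, using a made-up factor (the variable names are illustrative and not part of rosetta):

```r
# Hypothetical factor whose labels are numbers stored as text
f <- factor(c("10", "20", "20", "5"))

# Going through the labels recovers the numbers themselves
viaLabels <- as.numeric(as.character(f))

# Converting directly returns the integer level codes instead
# (the levels are sorted as text: "10", "20", "5")
viaCodes <- as.numeric(f)

print(viaLabels)
print(viaCodes)
```

This only gives meaningful results when the factor labels are themselves numbers; labels that cannot be parsed become NA, with a warning.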

Value

A list of data frames with the requested values.

Details

Note that R (of course) has many similar functions, such as summary() and psych::describe() in the excellent psych package.
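
One point to keep in mind when comparing output across these functions: descr defaults to quantileType = 2, whereas base R's quantile() and IQR() default to type 7, so quartiles and the inter-quartile range can differ slightly. The following base R sketch, on made-up data, illustrates the difference; it assumes that quantileType is passed on as the type argument of quantile(), as the argument description suggests.

```r
# Made-up data, purely for illustration
x <- 1:10

# Quartiles under the two definitions
q_type2 <- quantile(x, probs = c(0.25, 0.75), type = 2)
q_type7 <- quantile(x, probs = c(0.25, 0.75), type = 7)

# Inter-quartile range under the two definitions
iqr_type2 <- IQR(x, type = 2)
iqr_type7 <- IQR(x, type = 7)

print(rbind(type2 = q_type2, type7 = q_type7))
print(c(type2 = iqr_type2, type7 = iqr_type7))
```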

Hartigans' Dip Test may be unfamiliar to users; it is a measure of uni- vs. multimodality, computed by the dip.test() function from the {diptest} package. Depending on the sample size, values over .025 can be seen as mildly indicative of multimodality, while values over .05 probably warrant closer inspection (the p-value can be obtained using dip.test(); also see Table 1 of Hartigan & Hartigan (1985) for an indication of critical values).
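
To obtain the p-value mentioned above, dip.test() can be called directly. The sketch below assumes the {diptest} package is installed and skips the computation otherwise; the object names are illustrative.

```r
# Requires the {diptest} package; skipped if it is not installed
if (requireNamespace("diptest", quietly = TRUE)) {
  res <- diptest::dip.test(mtcars$mpg)

  # The dip statistic and its p-value, as stored in the htest object
  dipStatistic <- unname(res$statistic)
  dipPvalue <- res$p.value

  print(round(dipStatistic, 2))
  print(dipPvalue)
}
```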

References

Hartigan, J. A., & Hartigan, P. M. (1985). The Dip Test of Unimodality. Ann. Statist. 13, no. 1, 70-84. doi:10.1214/aos/1176346577. https://projecteuclid.org/euclid.aos/1176346577.

See also

summary(), psych::describe()

Author

Gjalt-Jorn Peters

Maintainer: Gjalt-Jorn Peters <gjalt-jorn@userfriendlyscience.com>

Examples

### Simplest example with default settings
descr(mtcars$mpg);
#> Descriptives for mtcars$mpg
#> 
#>                 Mean :  20.09
#>   95% Conf. Interval :  [17.92; 22.26]
#>               Median :  19.20
#>                 Mode :  (multi)
#>             Variance :  36.32
#>   Standard Deviation :  6.03
#>              Minimum :  10.40
#>              Maximum :  33.90
#>             Skewness :  0.67
#>             Kurtosis : -0.02
#>             Dip test :  0.06
#>    Total sample size :  32
#>       Missing values :  0
#>    Valid sample size :  32

### Also requesting a histogram and boxplot
descr(mtcars$mpg, histogram=TRUE, boxplot=TRUE);
#> Descriptives for mtcars$mpg
#> 
#>                 Mean :  20.09
#>   95% Conf. Interval :  [17.92; 22.26]
#>               Median :  19.20
#>                 Mode :  (multi)
#>             Variance :  36.32
#>   Standard Deviation :  6.03
#>              Minimum :  10.40
#>              Maximum :  33.90
#>             Skewness :  0.67
#>             Kurtosis : -0.02
#>             Dip test :  0.06
#>    Total sample size :  32
#>       Missing values :  0
#>    Valid sample size :  32

### To show the output as Rmd Partial in the viewer
rosetta::rosettaDescr_partial(
  rosetta::descr(
    mtcars$mpg
  )
);
#> No viewer found, probably documenting or testing
#> 
#> 
#> <div style="display:block;clear:both;" class="rosetta-descr-start"></div>
#> <div class="rosetta-descr-container">
#> 
#> 
#> 
#> 
#> 
#> 
#> 
#> ### Descriptives for mtcars$mpg
#> 
#> 
#> 
#> 
#> 
#> 
#> 
#> <table style='border:0px solid black !important; font-family: "Arial Narrow", "Source Sans Pro", sans-serif; margin-left: auto; margin-right: auto;' class="table table-condensed">
#> <tbody>
#>   <tr>
#>    <td style="text-align:right;"> Mean : </td>
#>    <td style="text-align:left;"> 20.09 </td>
#>   </tr>
#>   <tr>
#>    <td style="text-align:right;"> 95% Conf. Interval : </td>
#>    <td style="text-align:left;"> [17.92; 22.26] </td>
#>   </tr>
#>   <tr>
#>    <td style="text-align:right;"> Median : </td>
#>    <td style="text-align:left;"> 19.2 </td>
#>   </tr>
#>   <tr>
#>    <td style="text-align:right;"> Mode : </td>
#>    <td style="text-align:left;"> (multi) </td>
#>   </tr>
#>   <tr>
#>    <td style="text-align:right;"> Variance : </td>
#>    <td style="text-align:left;"> 36.32 </td>
#>   </tr>
#>   <tr>
#>    <td style="text-align:right;"> Standard Deviation : </td>
#>    <td style="text-align:left;"> 6.03 </td>
#>   </tr>
#>   <tr>
#>    <td style="text-align:right;"> Minimum : </td>
#>    <td style="text-align:left;"> 10.4 </td>
#>   </tr>
#>   <tr>
#>    <td style="text-align:right;"> Maximum : </td>
#>    <td style="text-align:left;"> 33.9 </td>
#>   </tr>
#>   <tr>
#>    <td style="text-align:right;"> Skewness : </td>
#>    <td style="text-align:left;"> 0.67 </td>
#>   </tr>
#>   <tr>
#>    <td style="text-align:right;"> Kurtosis : </td>
#>    <td style="text-align:left;"> -0.02 </td>
#>   </tr>
#>   <tr>
#>    <td style="text-align:right;"> Dip test : </td>
#>    <td style="text-align:left;"> 0.06 </td>
#>   </tr>
#>   <tr>
#>    <td style="text-align:right;"> Total sample size : </td>
#>    <td style="text-align:left;"> 32 </td>
#>   </tr>
#>   <tr>
#>    <td style="text-align:right;"> Missing values : </td>
#>    <td style="text-align:left;"> 0 </td>
#>   </tr>
#>   <tr>
#>    <td style="text-align:right;"> Valid sample size : </td>
#>    <td style="text-align:left;"> 32 </td>
#>   </tr>
#> </tbody>
#> </table>
#> 
#> 
#> 
#> 
#> 
#> 
#> 
#> 
#> 
#> 
#> 
#> 
#> 
#> 
#> 
#> 
#> </div>
#> <div style="display:block;clear:both;" class="rosetta-descr-end"></div>

### Multiple variables, including one factor
rosetta::rosettaDescr_partial(
  rosetta::descr(
    iris
  )
);
#> No viewer found, probably documenting or testing
#> 
#> 
#> <div style="display:block;clear:both;" class="rosetta-descr-start"></div>
#> <div class="rosetta-descr-container">
#> 
#> 
#> 
#> 
#> 
#> 
#> 
#> 
#> 
#> ### Descriptives for variables in data frame iris
#> 
#> 
#> 
#> 
#> 
#> 
#> 
#> 
#> 
#> 
#> 
#> <table style='border:0px solid black !important; font-family: "Arial Narrow", "Source Sans Pro", sans-serif; margin-left: auto; margin-right: auto;' class="table table-condensed">
#>  <thead>
#>   <tr>
#>    <th style="text-align:left;">   </th>
#>    <th style="text-align:right;"> mean </th>
#>    <th style="text-align:left;"> meanCI </th>
#>    <th style="text-align:right;"> median </th>
#>    <th style="text-align:left;"> mode </th>
#>    <th style="text-align:right;"> var </th>
#>    <th style="text-align:right;"> sd </th>
#>    <th style="text-align:right;"> min </th>
#>    <th style="text-align:right;"> max </th>
#>    <th style="text-align:right;"> skewness </th>
#>    <th style="text-align:right;"> kurtosis </th>
#>    <th style="text-align:right;"> dip </th>
#>    <th style="text-align:right;"> totalN </th>
#>    <th style="text-align:right;"> missingN </th>
#>    <th style="text-align:right;"> validN </th>
#>   </tr>
#>  </thead>
#> <tbody>
#>   <tr>
#>    <td style="text-align:left;"> Sepal.Length </td>
#>    <td style="text-align:right;"> 5.84 </td>
#>    <td style="text-align:left;"> [5.71; 5.98] </td>
#>    <td style="text-align:right;"> 5.80 </td>
#>    <td style="text-align:left;"> 5 </td>
#>    <td style="text-align:right;"> 0.69 </td>
#>    <td style="text-align:right;"> 0.83 </td>
#>    <td style="text-align:right;"> 4.3 </td>
#>    <td style="text-align:right;"> 7.9 </td>
#>    <td style="text-align:right;"> 0.31 </td>
#>    <td style="text-align:right;"> -0.55 </td>
#>    <td style="text-align:right;"> 0.04 </td>
#>    <td style="text-align:right;"> 150 </td>
#>    <td style="text-align:right;"> 0 </td>
#>    <td style="text-align:right;"> 150 </td>
#>   </tr>
#>   <tr>
#>    <td style="text-align:left;"> Sepal.Width </td>
#>    <td style="text-align:right;"> 3.06 </td>
#>    <td style="text-align:left;"> [2.99; 3.13] </td>
#>    <td style="text-align:right;"> 3.00 </td>
#>    <td style="text-align:left;"> 3 </td>
#>    <td style="text-align:right;"> 0.19 </td>
#>    <td style="text-align:right;"> 0.44 </td>
#>    <td style="text-align:right;"> 2.0 </td>
#>    <td style="text-align:right;"> 4.4 </td>
#>    <td style="text-align:right;"> 0.32 </td>
#>    <td style="text-align:right;"> 0.23 </td>
#>    <td style="text-align:right;"> 0.05 </td>
#>    <td style="text-align:right;"> 150 </td>
#>    <td style="text-align:right;"> 0 </td>
#>    <td style="text-align:right;"> 150 </td>
#>   </tr>
#>   <tr>
#>    <td style="text-align:left;"> Petal.Length </td>
#>    <td style="text-align:right;"> 3.76 </td>
#>    <td style="text-align:left;"> [3.47; 4.04] </td>
#>    <td style="text-align:right;"> 4.35 </td>
#>    <td style="text-align:left;"> (multi) </td>
#>    <td style="text-align:right;"> 3.12 </td>
#>    <td style="text-align:right;"> 1.77 </td>
#>    <td style="text-align:right;"> 1.0 </td>
#>    <td style="text-align:right;"> 6.9 </td>
#>    <td style="text-align:right;"> -0.27 </td>
#>    <td style="text-align:right;"> -1.40 </td>
#>    <td style="text-align:right;"> 0.12 </td>
#>    <td style="text-align:right;"> 150 </td>
#>    <td style="text-align:right;"> 0 </td>
#>    <td style="text-align:right;"> 150 </td>
#>   </tr>
#>   <tr>
#>    <td style="text-align:left;"> Petal.Width </td>
#>    <td style="text-align:right;"> 1.20 </td>
#>    <td style="text-align:left;"> [1.08; 1.32] </td>
#>    <td style="text-align:right;"> 1.30 </td>
#>    <td style="text-align:left;"> 0.2 </td>
#>    <td style="text-align:right;"> 0.58 </td>
#>    <td style="text-align:right;"> 0.76 </td>
#>    <td style="text-align:right;"> 0.1 </td>
#>    <td style="text-align:right;"> 2.5 </td>
#>    <td style="text-align:right;"> -0.10 </td>
#>    <td style="text-align:right;"> -1.34 </td>
#>    <td style="text-align:right;"> 0.09 </td>
#>    <td style="text-align:right;"> 150 </td>
#>    <td style="text-align:right;"> 0 </td>
#>    <td style="text-align:right;"> 150 </td>
#>   </tr>
#> </tbody>
#> </table>
#> 
#> #### Frequencies for Species
#> 
#> <table style='border:0px solid black !important; font-family: "Arial Narrow", "Source Sans Pro", sans-serif; margin-left: auto; margin-right: auto;' class="table table-condensed">
#>  <thead>
#>   <tr>
#>    <th style="text-align:left;">   </th>
#>    <th style="text-align:right;"> Frequencies </th>
#>    <th style="text-align:right;"> Perc.Total </th>
#>    <th style="text-align:right;"> Perc.Valid </th>
#>    <th style="text-align:right;"> Cumulative </th>
#>   </tr>
#>  </thead>
#> <tbody>
#>   <tr>
#>    <td style="text-align:left;"> setosa </td>
#>    <td style="text-align:right;"> 50 </td>
#>    <td style="text-align:right;"> 33.3 </td>
#>    <td style="text-align:right;"> 33.3 </td>
#>    <td style="text-align:right;"> 33.3 </td>
#>   </tr>
#>   <tr>
#>    <td style="text-align:left;"> versicolor </td>
#>    <td style="text-align:right;"> 50 </td>
#>    <td style="text-align:right;"> 33.3 </td>
#>    <td style="text-align:right;"> 33.3 </td>
#>    <td style="text-align:right;"> 66.7 </td>
#>   </tr>
#>   <tr>
#>    <td style="text-align:left;"> virginica </td>
#>    <td style="text-align:right;"> 50 </td>
#>    <td style="text-align:right;"> 33.3 </td>
#>    <td style="text-align:right;"> 33.3 </td>
#>    <td style="text-align:right;"> 100.0 </td>
#>   </tr>
#>   <tr>
#>    <td style="text-align:left;"> Total valid </td>
#>    <td style="text-align:right;"> 150 </td>
#>    <td style="text-align:right;"> 100.0 </td>
#>    <td style="text-align:right;"> 100.0 </td>
#>    <td style="text-align:right;">  </td>
#>   </tr>
#> </tbody>
#> </table>
#> 
#> 
#> 
#> 
#> 
#> 
#> 
#> 
#> 
#> 
#> 
#> 
#> </div>
#> <div style="display:block;clear:both;" class="rosetta-descr-end"></div>
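
As a cross-check on the first example, the confidence interval and mode lines for mtcars$mpg can be approximated in base R. This is a sketch that assumes a t-based interval around the mean and a simple frequency-table mode; it is not necessarily how descr computes them internally, although the t-based interval does reproduce the [17.92; 22.26] shown in that example's output.

```r
x <- mtcars$mpg
n <- sum(!is.na(x))
confLevel <- 0.95   # mirrors the conf.level default

# t-based confidence interval around the mean (assumed method)
m <- mean(x, na.rm = TRUE)
se <- sd(x, na.rm = TRUE) / sqrt(n)
margin <- qt(1 - (1 - confLevel) / 2, df = n - 1) * se
ci <- c(m - margin, m + margin)

# All values that share the highest frequency
freqs <- table(x)
modes <- as.numeric(names(freqs)[freqs == max(freqs)])

# Mimic maxModes = 1: collapse to "(multi)" when more modes are found
maxModes <- 1
modeShown <- if (length(modes) > maxModes) "(multi)" else as.character(modes)

print(round(ci, 2))
print(modeShown)
```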