The regr function wraps a number of linear regression functions into one convenient interface that provides output similar to the regression function in SPSS. It automatically provides confidence intervals and standardized coefficients. Note that this function is meant for teaching purposes and is therefore only suitable for very basic regression analyses; for more functionality, use the base R function lm or, for example, the lme4 package.
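For orientation, here is a minimal base-R sketch of roughly the pieces such a wrapper combines (an illustration only, not the package's internal code):

fit <- lm(age ~ circumference, data = Orange);
summary(fit);                        ### omnibus F test, R squared, raw coefficients
confint(fit, level = 0.95);          ### confidence intervals for the raw coefficients
### Standardized ('scaled') coefficients: refit with standardized variables
lm(scale(age) ~ scale(circumference), data = Orange);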
regr(
formula,
data = NULL,
conf.level = 0.95,
digits = 2,
pvalueDigits = 3,
coefficients = c("raw", "scaled"),
plot = FALSE,
pointAlpha = 0.5,
collinearity = FALSE,
influential = FALSE,
ci.method = c("widest", "r.con", "olkinfinn"),
ci.method.note = FALSE,
headingLevel = 3,
env = parent.frame()
)
rosettaRegr_partial(
x,
digits = x$input$digits,
pvalueDigits = x$input$pvalueDigits,
headingLevel = x$input$headingLevel,
echoPartial = FALSE,
partialFile = NULL,
quiet = TRUE,
...
)
# S3 method for rosettaRegr
knit_print(
x,
digits = x$input$digits,
headingLevel = x$input$headingLevel,
pvalueDigits = x$input$pvalueDigits,
echoPartial = FALSE,
partialFile = NULL,
quiet = TRUE,
...
)
# S3 method for rosettaRegr
print(
x,
digits = x$input$digits,
pvalueDigits = x$input$pvalueDigits,
headingLevel = x$input$headingLevel,
forceKnitrOutput = FALSE,
...
)
# S3 method for rosettaRegr
pander(x, digits = x$input$digits, pvalueDigits = x$input$pvalueDigits, ...)
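As a hedged usage sketch for the printing methods above (the print call follows the signature shown; the pander call assumes the pander package is installed and that the method shown above is registered for its pander generic):

res <- rosetta::regr(age ~ circumference, dat = Orange);
print(res, digits = 3, pvalueDigits = 4);   ### console output with more digits than the default
pander::pander(res);                        ### Pandoc-formatted output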
formula: The formula of the regression analysis, of the form y ~ x1 + x2, where y is the dependent variable and x1 and x2 are the predictors.
data: If the terms in the formula aren't vectors but variable names, this should be the dataframe where those variables are stored.
conf.level: The confidence level of the confidence interval around the regression coefficients.
digits: The number of digits to round the output to.
pvalueDigits: The number of digits to show for p-values; smaller p-values will be shown as <.001 or <.0001, etc.
coefficients: Which coefficients to show; can be "raw" to only show the raw (unstandardized) coefficients, "scaled" to only show the scaled (standardized) coefficients, or c("raw", "scaled") to show both.
plot: For regression analyses with only one predictor (also sometimes confusingly referred to as 'univariate' regression analyses), scatterplots with regression lines and their standard errors can be produced.
pointAlpha: The alpha channel (transparency, or rather: 'opaqueness') of the points drawn in the plot.
collinearity: Whether to compute and show collinearity diagnostics, specifically the tolerance (1 - R^2, where R^2 is the one obtained when regressing each predictor on all the other predictors) and the Variance Inflation Factor (VIF), which is the reciprocal of the tolerance (VIF = 1 / tolerance); see the hand-rolled sketch after this argument list.
influential: Whether to compute diagnostics for influential cases. These are stored in the lm.influence.raw and lm.influence.scaled objects in the intermediate object of the returned result; they are not printed.
ci.method, ci.method.note: Which method to use for the confidence interval around R squared, and whether to display a note about this choice.
headingLevel: The number of hashes to print in front of the headings when printing while knitting.
env: The environment in which to evaluate the formula.
x: The object to print (i.e. as produced by regr).
echoPartial: Whether to show the executed code in the R Markdown partial (TRUE) or not (FALSE).
partialFile: This can be used to specify a custom partial file. The file will have object x available.
quiet: Passed on to knitr::knit(); whether it should be chatty (FALSE) or quiet (TRUE).
...: Any additional arguments are passed to the default print method by the print method, and to rmdpartials::partial() when knitting an R Markdown partial.
forceKnitrOutput: Force knitr output.
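As referenced in the description of the collinearity argument, here is a hand-rolled sketch of those diagnostics (an illustration only, not the package's internal code; the choice of mtcars predictors is arbitrary):

predictors <- mtcars[, c("wt", "hp", "disp")];
collinearityByHand <- sapply(names(predictors), function(p) {
  ### Regress predictor p on all other predictors and take the R squared
  r2 <- summary(
    lm(reformulate(setdiff(names(predictors), p), response = p),
       data = predictors)
  )$r.squared;
  tolerance <- 1 - r2;
  c(tolerance = tolerance, VIF = 1 / tolerance);
});
t(collinearityByHand);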
A list of three elements:
A list with the input arguments (available in the returned object as input).
A list of intermediate objects, such as the lm and confint objects (available as intermediate).
A list with two dataframes, one with the raw coefficients and one with the scaled coefficients.
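A brief sketch of inspecting the returned object (input and intermediate are referenced elsewhere on this page; the element name lm inside intermediate is an assumption here):

res <- rosetta::regr(age ~ circumference, dat = Orange);
names(res);                   ### the three elements listed above
res$input$digits;             ### input arguments as passed (or their defaults)
res$intermediate$lm;          ### assumed location of the underlying lm object
coef(res$intermediate$lm);    ### raw coefficients straight from lm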
### Do a simple regression analysis
rosetta::regr(age ~ circumference, dat=Orange);
#> Regression analysis
#> Formula: age ~ circumference
#> Sample size: 35
#>
#> Significance test of the entire model (all predictors together):
#> Multiple R-squared: [.74; .92] (point estimate = 0.83, adjusted = 0.83)
#> Test for significance: F[1, 33] = 166.42, p < .001
#>
#> Raw regression coefficients (unstandardized beta values, called 'B' in SPSS):
#>
#> 95% conf. int. estimate se t p
#> (Intercept) [-142.37; 175.58] 16.60 78.14 0.21 .833
#> circumference [6.58; 9.05] 7.82 0.61 12.90 <.001
#>
#> Scaled regression coefficients (standardized beta values, called 'Beta' in SPSS):
#>
#> 95% conf. int. estimate se t p
#> (Intercept) [-0.14; 0.14] 0.00 0.07 0.0 1
#> circumference [0.77; 1.06] 0.91 0.07 12.9 <.001
### Show more digits for the p-value
rosetta::regr(Orange$age ~ Orange$circumference, pvalueDigits=18);
#> Regression analysis
#> Formula: age ~ circumference
#> Sample size: 35
#>
#> Significance test of the entire model (all predictors together):
#> Multiple R-squared: [.74; .92] (point estimate = 0.83, adjusted = 0.83)
#> Test for significance: F[1, 33] = 166.42, p = .0000000000000193059999999999993
#>
#> Raw regression coefficients (unstandardized beta values, called 'B' in SPSS):
#>
#> 95% conf. int. estimate se t
#> (Intercept) [-142.37; 175.58] 16.60 78.14 0.21
#> circumference [6.58; 9.05] 7.82 0.61 12.90
#> p
#> (Intercept) .833036764288624054
#> circumference .0000000000000193059999999999993
#>
#> Scaled regression coefficients (standardized beta values, called 'Beta' in SPSS):
#>
#> 95% conf. int. estimate se t
#> (Intercept) [-0.14; 0.14] 0.00 0.07 0.0
#> circumference [0.77; 1.06] 0.91 0.07 12.9
#> p
#> (Intercept) .999999999999999112
#> circumference .0000000000000193059999999999993
if (FALSE) {
### An example with an interaction term, showing in the
### viewer
rosetta::rosettaRegr_partial(
rosetta::regr(
mpg ~ wt + hp + wt:hp,
dat=mtcars,
coefficients = "raw",
plot=TRUE,
collinearity=TRUE
)
);
}