vignettes/ggcoef_model.Rmd
library(GGally)
#> Loading required package: ggplot2
#> Registered S3 method overwritten by 'GGally':
#> method from
#> +.gg ggplot2
GGally::ggcoef_model()
The purpose of this function is to quickly plot the coefficients of a model. It is an updated and improved version of GGally::ggcoef() based on broom.helpers::tidy_plus_plus(). For displaying a nicely formatted table of the same models, look at gtsummary::tbl_regression().
To work automatically, this function requires the broom.helpers package. Simply call ggcoef_model() with a model object. It could be the result of stats::lm(), stats::glm() or any other model covered by broom.helpers.
data(tips, package = "reshape")
mod_simple <- lm(tip ~ day + time + total_bill, data = tips)
ggcoef_model(mod_simple)
In the case of a logistic regression (or any other model for which coefficients are usually exponentiated), simply indicate exponentiate = TRUE. Note that a logarithmic scale will be used for the x-axis.
d_titanic <- as.data.frame(Titanic)
d_titanic$Survived <- factor(d_titanic$Survived, c("No", "Yes"))
mod_titanic <- glm(
  Survived ~ Sex * Age + Class,
  weights = Freq,
  data = d_titanic,
  family = binomial
)
ggcoef_model(mod_titanic, exponentiate = TRUE)
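As a rough illustration of what exponentiation means for a logistic regression, the sketch below uses only base R. The simplified formula (no interaction) and the object names mod, log_odds and odds_ratios are choices made for this example, not part of the package.

```r
# Rebuild the Titanic data used above (base R only)
d_titanic <- as.data.frame(Titanic)
d_titanic$Survived <- factor(d_titanic$Survived, c("No", "Yes"))

# A simpler model than mod_titanic, to keep the output short
mod <- glm(
  Survived ~ Sex + Class,
  weights = Freq,
  data = d_titanic,
  family = binomial
)

# Raw coefficients are on the log-odds scale
log_odds <- coef(mod)

# Exponentiating turns them into odds ratios (always positive)
odds_ratios <- exp(log_odds)

round(cbind(log_odds, odds_ratios), 3)
```

On a logarithmic axis, an odds ratio of 2 and an odds ratio of 0.5 sit at the same distance from 1, which is why a log scale suits exponentiated coefficients.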
You can use the labelled package to define variable labels. They will be automatically used by ggcoef_model(). Note that variable labels should be defined before computing the model.
library(labelled)
tips_labelled <- tips %>%
  set_variable_labels(
    day = "Day of the week",
    time = "Lunch or Dinner",
    total_bill = "Bill's total"
  )
mod_labelled <- lm(tip ~ day + time + total_bill, data = tips_labelled)
ggcoef_model(mod_labelled)
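A minimal sketch of why labels have to be set before fitting: labelled stores a variable label as a "label" attribute on the column, and a model fitted earlier holds its own copy of the data. The data frame d and the label text below are made up for this illustration.

```r
library(labelled)

d <- iris

# Model fitted BEFORE any label is defined
mod_before <- lm(Sepal.Length ~ Petal.Width, data = d)

# Define a variable label; it is stored as an attribute of the column
var_label(d$Petal.Width) <- "Petal width (cm)"
var_label(d$Petal.Width)
attr(d$Petal.Width, "label")

# Model fitted AFTER the label is defined
mod_after <- lm(Sepal.Length ~ Petal.Width, data = d)

# Compare the labels found in the data stored with each model
var_label(model.frame(mod_before)$Petal.Width)
var_label(model.frame(mod_after)$Petal.Width)
```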
You can also define custom variable labels directly by passing a named vector to the variable_labels option.
ggcoef_model(
  mod_simple,
  variable_labels = c(
    day = "Week day",
    time = "Time (lunch or dinner ?)",
    total_bill = "Total of the bill"
  )
)
If variable labels are too long, you can pass ggplot2::label_wrap_gen() or any other labeller function to facet_labeller.
ggcoef_model(
  mod_simple,
  variable_labels = c(
    day = "Week day",
    time = "Time (lunch or dinner ?)",
    total_bill = "Total of the bill"
  ),
  facet_labeller = label_wrap_gen(10)
)
Use facet_row = NULL to hide variable names.
ggcoef_model(mod_simple, facet_row = NULL, colour_guide = TRUE)
Several options allow you to customize term labels.
ggcoef_model(mod_titanic, exponentiate = TRUE)
ggcoef_model(
  mod_titanic,
  exponentiate = TRUE,
  show_p_values = FALSE,
  signif_stars = FALSE,
  add_reference_rows = FALSE,
  categorical_terms_pattern = "{level} (ref: {reference_level})",
  interaction_sep = " x "
) +
  scale_y_discrete(labels = scales::label_wrap(15))
#> Scale for 'y' is already present. Adding another scale for 'y', which will
#> replace the existing scale.
By default, for categorical variables using treatment and sum contrasts, reference rows will be added and displayed on the graph.
mod_titanic2 <- glm(
  Survived ~ Sex * Age + Class,
  weights = Freq,
  data = d_titanic,
  family = binomial,
  contrasts = list(Sex = contr.sum, Class = contr.treatment(4, base = 3))
)
ggcoef_model(mod_titanic2, exponentiate = TRUE)
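To see why a reference row is needed at all, it can help to look at the contrast matrices themselves. The base R sketch below only inspects the coding; the names ct and cs are arbitrary.

```r
# Treatment contrasts for a 4-level factor, with level 3 as the reference
ct <- contr.treatment(4, base = 3)
ct
# The reference level (row 3) is coded as all zeros, so it gets no
# coefficient of its own: its effect is implicitly 0.

# Sum contrasts for a 2-level factor (such as Sex)
cs <- contr.sum(2)
cs
# Each column sums to zero: the effect of the last level is minus the
# sum of the effects of the other levels.
```

In both cases the model estimates one coefficient fewer than the number of levels, and the missing level is the one that a reference row puts back on the plot.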
Continuous variables with polynomial terms defined with stats::poly() are also properly managed.
mod_poly <- lm(Sepal.Length ~ poly(Petal.Width, 3) + Petal.Length, data = iris)
ggcoef_model(mod_poly)
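For readers less familiar with stats::poly(), this short base R sketch shows what the polynomial term contributes to the model: one column, and therefore one coefficient, per degree. The names P and mod_poly_check are used only in this example.

```r
# poly() expands one variable into orthogonal polynomial columns
P <- poly(iris$Petal.Width, 3)
dim(P)

# The columns are orthonormal: their cross-product is the identity matrix
round(crossprod(P), 10)

# In a model, this yields one coefficient per polynomial degree
mod_poly_check <- lm(Sepal.Length ~ poly(Petal.Width, 3) + Petal.Length, data = iris)
names(coef(mod_poly_check))
```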
Use no_reference_row to indicate which variables should not have a reference row added.
ggcoef_model(
  mod_titanic2,
  exponentiate = TRUE,
  no_reference_row = "Sex"
)
ggcoef_model(
  mod_titanic2,
  exponentiate = TRUE,
  no_reference_row = broom.helpers::all_dichotomous()
)
ggcoef_model(
  mod_titanic2,
  exponentiate = TRUE,
  no_reference_row = broom.helpers::all_categorical(),
  categorical_terms_pattern = "{level}/{reference_level}"
)
Use intercept = TRUE to display intercepts.
ggcoef_model(mod_simple, intercept = TRUE)
You can remove confidence intervals with conf.int = FALSE.
ggcoef_model(mod_simple, conf.int = FALSE)
By default, significant terms (i.e. with a p-value below 5%) are highlighted using two types of dots. You can control the level of significance with significance or remove it with significance = NULL.
ggcoef_model(mod_simple, significance = NULL)
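The quantities behind the dots and error bars can be reproduced by hand. The sketch below uses base R on the swiss dataset; the 0.05 threshold and the <= comparison are assumptions made for this illustration, and the column names simply mimic the usual broom conventions.

```r
mod <- lm(Fertility ~ Agriculture + Education + Catholic, data = swiss)

coefs <- summary(mod)$coefficients   # estimates, standard errors, p-values
ci <- confint(mod, level = 0.95)     # 95% confidence intervals

res <- data.frame(
  term = rownames(coefs),
  estimate = coefs[, "Estimate"],
  conf.low = ci[, 1],
  conf.high = ci[, 2],
  p.value = coefs[, "Pr(>|t|)"],
  row.names = NULL
)

# Flag terms considered significant at the chosen threshold
res$significant <- res$p.value <= 0.05
res
```

For a linear model, a term is flagged here exactly when its 95% confidence interval excludes zero.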
By default, dots are colored by variable. You can deactivate this behavior with colour = NULL.
ggcoef_model(mod_simple, colour = NULL)
You can display only a subset of terms with include.
ggcoef_model(mod_simple, include = c("time", "total_bill"))
It is possible to use tidyselect helpers.
ggcoef_model(mod_simple, include = dplyr::starts_with("t"))
You can remove the striped row background with stripped_rows = FALSE.
ggcoef_model(mod_simple, stripped_rows = FALSE)
Do not hesitate to consult the help file of ggcoef_model() to see all available options.
The plot returned by ggcoef_model() is a classic ggplot2 plot. You can therefore apply ggplot2 functions to it.
ggcoef_model(mod_simple) +
  xlab("Coefficients") +
  ggtitle("Custom title") +
  scale_color_brewer(palette = "Set1") +
  theme(legend.position = "right")
#> Scale for 'colour' is already present. Adding another scale for 'colour',
#> which will replace the existing scale.
For multinomial models, simply use ggcoef_multinom(). Two types of visualizations are available: "dodged" and "faceted".
library(nnet)
data(happy)
mod <- multinom(happy ~ age + degree + sex, data = happy, trace = FALSE)
ggcoef_multinom(mod, exponentiate = TRUE)
ggcoef_multinom(mod, exponentiate = TRUE, type = "faceted")
ggcoef_multinom(
  mod,
  exponentiate = TRUE,
  y.level = c(
    "pretty happy" = "pretty happy\n(ref: not too happy)",
    "very happy" = "very happy"
  )
)
You can easily compare several models with ggcoef_compare().
mod1 <- lm(Fertility ~ ., data = swiss)
mod2 <- step(mod1, trace = 0)
mod3 <- lm(Fertility ~ Agriculture + Education * Catholic, data = swiss)
models <- list("Full model" = mod1, "Simplified model" = mod2, "With interaction" = mod3)
ggcoef_compare(models)
ggcoef_compare(models, type = "faceted")
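Comparing models amounts to lining up the estimates of each term across a named list of models. The sketch below builds such a long table by hand with base R. It is only an illustration of the idea, not how ggcoef_compare() is implemented, and coef_table is a name chosen for this example.

```r
mod1 <- lm(Fertility ~ ., data = swiss)
mod3 <- lm(Fertility ~ Agriculture + Education * Catholic, data = swiss)
models <- list("Full model" = mod1, "With interaction" = mod3)

# One row per model and term, stacked into a single data frame
coef_table <- do.call(rbind, lapply(names(models), function(nm) {
  est <- coef(models[[nm]])
  data.frame(model = nm, term = names(est), estimate = unname(est))
}))

coef_table

# Terms that appear in both models
intersect(names(coef(mod1)), names(coef(mod3)))
```

The names of the list are what identify each model, so it is worth choosing them with care.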
Advanced users can build their own dataset and pass it to ggcoef_plot(). Such a dataset can be produced by ggcoef_model(), ggcoef_compare() or ggcoef_multinom() with the option return_data = TRUE, or by using broom::tidy() or broom.helpers::tidy_plus_plus().
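A minimal sketch of that workflow, assuming GGally and broom.helpers are installed. The model and the object names d and p are chosen for this example, and ggcoef_plot() is called with its default arguments.

```r
library(GGally)

mod <- lm(Fertility ~ Agriculture + Education, data = swiss)

# Get the data that would be plotted, instead of the plot itself
d <- ggcoef_model(mod, return_data = TRUE)
names(d)

# The data can be inspected or modified here before plotting

# Pass the dataset to the plotting function
p <- ggcoef_plot(d)
p
```

A dataset built with broom::tidy() may lack some of the columns present here, in which case the corresponding arguments of ggcoef_plot() would need to be adjusted.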