volkeR-Package

Lifecycle: experimental R-CMD-check CRAN status

High-level functions for tabulating, charting and reporting survey data.

Getting started

# Install the package (see below), then load it
library(volker)

# Load example data from the package
data <- volker::chatgpt

# Create your first plot, counting answers to an item battery
plot_counts(data, starts_with("cg_adoption_social"))

# Create your first table, summarising the item battery
tab_metrics(data, starts_with("cg_adoption_social"))

See further examples in the introduction vignette.

Don’t miss the template feature: within RStudio, create a new R Markdown document, select From Template, choose the volkeR Report, and finally knit it. It’s a blueprint for your own tidy reports.

Concept

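The volkeR package is made for creating quick and easy overviews of datasets. It handles standard cases with a handful of functions. Basically, you select one of the following functions and throw your data in:

- Charts: plot_metrics() and plot_counts()
- Tables: tab_metrics() and tab_counts()
- Reports: report_metrics() and report_counts()

Which one is best? That depends on your objective:

- Table or plot? Functions for tables start with tab_, functions for plots start with plot_.
- Categorical or metric variables? Use the counts functions for categorical data and the metrics functions for metric data.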

Examples

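|                  | Metric                | Categorical            |
|------------------|-----------------------|------------------------|
| One variable     | Density plot          | Bar chart              |
| Group comparison | Group comparison      | Stacked bar chart      |
| Multiple items   | Item battery boxplots | Item battery bar chart |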


All functions take a data frame as their first argument, followed by column selections, and optionally a grouping column.

Examples:
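The call pattern can be sketched as follows. This is a minimal illustration using the chatgpt example data and only column names that appear elsewhere in this README; treating sd_gender as a categorical column is an assumption.

```r
library(volker)

# Example data shipped with the package
data <- volker::chatgpt

# One metric variable
age_table <- tab_metrics(data, sd_age)

# One categorical variable (assumes sd_gender is categorical)
gender_table <- tab_counts(data, sd_gender)

# One metric variable, compared across the groups of a second column
age_by_gender <- tab_metrics(data, sd_age, sd_gender)

# Multiple items, selected with a tidyselect helper
items_table <- tab_metrics(data, starts_with("cg_adoption_social"))

# Print one of the results
age_by_gender
```

Each call follows the same order: data frame first, then the column selection, then the optional grouping column.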

Hint: replace tab_ with plot_ to reproduce the examples above. You’ll find different table, plot and report types in the introduction vignette. For further options to customize the results, see the built-in function help (F1 key).

After deciding whether to plot or tabulate, and whether to handle metric or counted data, the column selections determine which of the following methods are called under the hood. When you provide two sets of columns in the first two parameters, the data is crossed. By default, the second parameter is handled as a categorical variable, resulting in grouped tables and plots. For handling metric variables and their correlations, set the metric parameter to TRUE. (Note: methods marked "not yet" are not implemented.)

| #  | function                   | implemented | output | scale   | columns  | crossings  |
|----|----------------------------|-------------|--------|---------|----------|------------|
| 1  | tab_counts_one             |             | table  | counts  | one      |            |
| 2  | tab_counts_one_grouped     |             | table  | counts  | one      | grouped    |
| 3  | tab_counts_one_cor         | not yet     | table  | counts  | one      | correlated |
| 4  | tab_counts_items           |             | table  | counts  | multiple |            |
| 5  | tab_counts_items_grouped   | not yet     | table  | counts  | multiple | grouped    |
| 6  | tab_counts_items_cor       | not yet     | table  | counts  | multiple | correlated |
| 7  | tab_metrics_one            |             | table  | metrics | one      |            |
| 8  | tab_metrics_one_grouped    |             | table  | metrics | one      | grouped    |
| 9  | tab_metrics_one_cor        |             | table  | metrics | one      | correlated |
| 10 | tab_metrics_items          |             | table  | metrics | multiple |            |
| 11 | tab_metrics_items_grouped  |             | table  | metrics | multiple | grouped    |
| 12 | tab_metrics_items_cor      |             | table  | metrics | multiple | correlated |
| 13 | plot_counts_one            |             | plot   | counts  | one      |            |
| 14 | plot_counts_one_grouped    |             | plot   | counts  | one      | grouped    |
| 15 | plot_counts_one_cor        | not yet     | plot   | counts  | one      | correlated |
| 16 | plot_counts_items          |             | plot   | counts  | multiple |            |
| 17 | plot_counts_items_grouped  | not yet     | plot   | counts  | multiple | grouped    |
| 18 | plot_counts_items_cor      | not yet     | plot   | counts  | multiple | correlated |
| 19 | plot_metrics_one           |             | plot   | metrics | one      |            |
| 20 | plot_metrics_one_grouped   |             | plot   | metrics | one      | grouped    |
| 21 | plot_metrics_one_cor       |             | plot   | metrics | one      | correlated |
| 22 | plot_metrics_items         |             | plot   | metrics | multiple |            |
| 23 | plot_metrics_items_grouped |             | plot   | metrics | multiple | grouped    |
| 24 | plot_metrics_items_cor     |             | plot   | metrics | multiple | correlated |
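The crossing behavior described above might look like this in practice. It is a sketch under the assumption that sd_gender is categorical and sd_age is metric in the chatgpt example data; the parameter name metric is taken from the text above.

```r
library(volker)

data <- volker::chatgpt

# Default: the second selection is treated as categorical,
# so this yields a table of sd_age grouped by sd_gender
grouped <- tab_metrics(data, sd_age, sd_gender)

# metric = TRUE: the second selection is treated as metric,
# so the items are correlated with sd_age instead of grouped by it
correlated <- tab_metrics(
  data,
  starts_with("cg_adoption_social"),
  sd_age,
  metric = TRUE
)

grouped
correlated
```

The first call corresponds to a "grouped" row in the table, the second to a "correlated" row.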

Effect sizes and statistical tests

You can calculate effect sizes and conduct basic statistical tests using effect_counts() and effect_metrics(). Effect calculation is included in the reports if you request it by the effect-parameter of report_counts() or report_metrics().

A word of warning: Statistics is the world of uncertainty. All procedures require mindful interpretation. Counting stars might evoke illusions.

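| #  | function                     | implemented | effect size                 | confidence intervals | significance test |
|----|------------------------------|-------------|-----------------------------|----------------------|-------------------|
| 1  | effect_counts_one            | not yet     |                             |                      |                   |
| 2  | effect_counts_one_grouped    |             | Cramér’s V                  | proportions          | Chi squared       |
| 3  | effect_counts_one_cor        | not yet     |                             |                      |                   |
| 4  | effect_counts_items          | not yet     |                             |                      |                   |
| 5  | effect_counts_items_grouped  | not yet     |                             |                      |                   |
| 6  | effect_counts_items_cor      | not yet     |                             |                      |                   |
| 7  | effect_metrics_one           | not yet     |                             |                      |                   |
| 8  | effect_metrics_one_grouped   |             | R squared                   | means                | t-test            |
| 9  | effect_metrics_one_cor       |             | Pearson’s r, Spearman’s rho | correlation          | t-test            |
| 10 | effect_metrics_items         |             | R squared                   | means                | t-test            |
| 11 | effect_metrics_items_grouped | not yet     |                             |                      |                   |
| 12 | effect_metrics_items_cor     |             | Pearson’s r, Spearman’s rho | correlation          | t-test            |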
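As a rough sketch of how the effect functions fit in, the following compares a metric variable across groups. The column roles (sd_age metric, sd_gender categorical) are assumptions about the example data, and passing effect = TRUE is an assumed way of using the effect parameter mentioned above.

```r
library(volker)

data <- volker::chatgpt

# Effect size and significance test for a group comparison of means
age_effect <- effect_metrics(data, sd_age, sd_gender)
age_effect

# The same comparison as a report with the effect calculation included
# (effect = TRUE is an assumption about the accepted value)
age_report <- report_metrics(data, sd_age, sd_gender, effect = TRUE)
age_report
```

Treat the resulting statistics as a starting point for interpretation, in line with the warning above.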

Where do all the labels go?

One of the strongest package features is labeling. You know the pain. Labels are stored in the column attributes. Inspect current labels of columns and values by the codebook()-function:

codebook(data)

This results in a table with item names, item values, value names and value labels. The same table format can be used to manually set labels with labs_apply():

library(tidyverse)  # provides tribble() and %>%

newlabels <- tribble(
  ~item_name,                 ~item_label,
  "cg_adoption_advantage_01", "Allgemeine Vorteile",
  "cg_adoption_advantage_02", "Finanzielle Vorteile",
  "cg_adoption_advantage_03", "Vorteile bei der Arbeit",
  "cg_adoption_advantage_04", "Macht mehr Spaß"
)

data %>%
  labs_apply(newlabels) %>%
  tab_metrics(starts_with("cg_adoption_advantage_"))

Be aware that some data operations, such as mutate() from the tidyverse, lose labels along the way. In this case, store the labels (in the codebook attribute of the data frame) before the operation and restore them afterwards:

library(dplyr)  # provides mutate() and %>%

data %>%
  labs_store() %>%
  mutate(sd_age = 2024 - sd_age) %>%
  labs_restore() %>%
  tab_metrics(sd_age)
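To see why labels can get lost at all, here is a small base R sketch that does not use the package. It assumes, purely for illustration, that a label lives in an attribute named "comment"; the vector and its label are made up.

```r
# A numeric vector with a label stored in an attribute.
# The attribute name "comment" is an assumption for this sketch.
age <- c(25, 31, 47)
attr(age, "comment") <- "Age in years"

# as.numeric() strips all attributes, so the label is gone afterwards
converted <- as.numeric(age)

label_before <- attr(age, "comment")
label_after <- attr(converted, "comment")

print(label_before)
print(is.null(label_after))

# Putting the label back by hand; a store and restore step
# around the operation saves you from doing this per column
attr(converted, "comment") <- label_before
```

Any operation that builds a new vector without copying attributes behaves the same way, which is why storing labels once for the whole data frame is convenient.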

SoSci Survey integration

The labeling mechanisms follow a technique used, for example, on SoSci Survey. Sidenote for techies: Labels are stored in the column attributes. That’s why you can directly throw in labeled data from the SoSci Survey API:

library(volker)

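# Get your API link from SoSci Survey with the setting "Daten als CSV für R abrufen" ("Retrieve data as CSV for R")
# The script behind the link is expected to create the data frame ds used below
eval(parse("https://www.soscisurvey.de/YOURPROJECT/?act=YOURKEY&rScript", encoding="UTF-8"))

# Generate reports
report_counts(ds, A002)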

For best results, use sensible prefixes and captions for your SoSci questions. The labels come directly from your questionnaire.

Please note: The values -9 and [NA] nicht beantwortet are automatically recoded to missing values within all plot, tab, effect, and report functions. See the negatives parameter and the clean parameter for how to disable automatic residual removal.

Customization

You can change plot colors using the theme_vlkr()-function:

library(ggplot2)  # provides theme_set()

theme_set(
  theme_vlkr(
    base_fill = c("#F0983A","#3ABEF0","#95EF39","#E35FF5","#7A9B59"),
    base_gradient = c("#FAE2C4","#F0983A")
  )
)
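If you only want the custom colors for part of a script, one option is to keep the previous theme and switch back later. This sketch relies on theme_set() from ggplot2 returning the previously active theme; using sd_gender as the plotted column is an assumption about the example data.

```r
library(volker)
library(ggplot2)

data <- volker::chatgpt

# theme_set() returns the previously active theme, so keep it for later
old_theme <- theme_set(
  theme_vlkr(
    base_fill = c("#F0983A", "#3ABEF0", "#95EF39", "#E35FF5", "#7A9B59"),
    base_gradient = c("#FAE2C4", "#F0983A")
  )
)

# Plots created from now on pick up the custom colors
themed_plot <- plot_counts(data, sd_gender)
themed_plot

# Switch back to the earlier theme when done
theme_set(old_theme)
```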

Plot and table functions share a number of parameters that can be used to customize the outputs. Look up the available parameters in the help of the specific function.

Data preparation

Calculations

Labeling

Tables

Plots

Installation

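1. As with all other packages, you’ll have to install the package first. The name strohne/volker refers to a GitHub repository, so install it with the remotes package:

install.packages("remotes")
remotes::install_github("strohne/volker")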

You can try alternative versions:

2. After installing the package, load it:

library(volker)

3. Finally, use it:

# Example data
data <- volker::chatgpt

# Example table
tab_metrics(data, sd_age, sd_gender)

Special features

Troubleshooting

The kableExtra package produces an error in R 4.3 when knitting documents: .onLoad in loadNamespace() für 'kableExtra' fehlgeschlagen (in English: .onLoad failed in loadNamespace() for 'kableExtra'). As a workaround, remove the PDF and Word settings from the output options in your Markdown document (the YAML section at the top). Alternatively, install the latest development version:

remotes::install_github("kupietz/kableExtra")

Roadmap

| Version | Features          | Status           |
|---------|-------------------|------------------|
| 1.0     | Descriptives      | work in progress |
| 2.0     | Regression tables | work in progress |
| 3.0     | Topic modeling    | work in progress |

Similar packages

The volker package is inspired by outputs used in the textbook Einfache Datenauswertung mit R (Gehrau & Maubach et al., 2022), which provides an introduction to univariate and bivariate statistics and data representation using RStudio and R Markdown.

Other packages with high-level reporting functions:
- https://github.com/joon-e/tidycomm
- https://github.com/kassambara/rstatix

Authors and citation

Authors
Jakob Jünger (University of Münster)
Henrieke Kotthoff (University of Münster)

Contributors
Chantal Gärtner (University of Münster)

Citation
Jünger, J. & Kotthoff, H. (2024). volker: High-level functions for tabulating, charting and reporting survey data. R package version 2.0.