16  End

This book combines practical guides to exploring missing data with some of the deeper details of how naniar works, to help you better explore your own missing data. A large component of the book is the exercises that accompany each section in each chapter.

library(tidyverse)
── Attaching packages ─────────────────────────────────────── tidyverse 1.3.1 ──
✔ ggplot2 3.3.6     ✔ purrr   0.3.4
✔ tibble  3.1.7     ✔ dplyr   1.0.9
✔ tidyr   1.2.0     ✔ stringr 1.4.0
✔ readr   2.1.2     ✔ forcats 0.5.1
── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
✖ dplyr::filter() masks stats::filter()
✖ dplyr::lag()    masks stats::lag()
library(naniar)

16.0.1 This is only the beginning!

As they say, this is only the beginning. This course covered an often overlooked area of statistics: missing data. Within the world of missing data, we focused on another area that is often overlooked: how to handle, explore, and visualise missing values.

To continue your journey, and learn more about missing data, you should check out the naniar package, which contains many useful functions to explore and evaluate your missing data, as well as numerous vignettes.
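As a reminder of the kind of functions naniar offers, here is a minimal sketch. The choice of the built-in `airquality` dataset is an assumption made for illustration; any data frame with missing values would do.

```r
library(naniar)

# airquality ships with base R; it is used here purely as an example dataset
total_missing <- n_miss(airquality)           # total number of missing values
miss_summary  <- miss_var_summary(airquality) # count and percent missing per variable
miss_summary

# Shadow columns (suffix _NA) record whether each original value was missing
aq_shadow <- bind_shadow(airquality)
names(aq_shadow)
```

Running `vignette(package = "naniar")` lists the vignettes installed with the package.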

The visdat package provides more than just heatmaps of missing data, and is well worth looking into to learn more about preliminary exploratory visualisation.
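A short sketch of a few visdat functions, again assuming the built-in `airquality` dataset as an illustrative example:

```r
library(visdat)

# Each call returns a ggplot object that can be printed or customised further
p_types <- vis_dat(airquality)   # column classes, with missing values overlaid
p_miss  <- vis_miss(airquality)  # missingness only
p_cor   <- vis_cor(airquality)   # correlation heatmap of the numeric columns

p_types
```

Because the results are ordinary ggplot objects, they can be extended with the usual ggplot2 layers and themes.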

From here, you might want to explore other workflows for imputing your missing data.

There are many ways to decide how to impute data. We didn’t have time for it in the course, but multiple imputation is another great area of research. To learn more about it, I highly recommend Stef van Buuren’s package, mice, and his book, Flexible Imputation of Missing Data.
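To give a flavour of what a multiple imputation workflow looks like, here is a minimal sketch with mice. The dataset (`airquality`), the number of imputations, the seed, and the regression model are all assumptions chosen for illustration, not recommendations from the course.

```r
library(mice)

# Create m = 5 imputed datasets using predictive mean matching (pmm)
imp <- mice(airquality, m = 5, method = "pmm", seed = 123, printFlag = FALSE)

# Fit the same (illustrative) model to each imputed dataset
fit <- with(imp, lm(Ozone ~ Wind + Temp))

# Pool the estimates across imputations using Rubin's rules
pooled <- pool(fit)
summary(pooled)

# Extract the first completed dataset; the mice:: prefix avoids a clash
# with tidyr::complete() when the tidyverse is also loaded
aq_complete <- mice::complete(imp, 1)
```

The key idea is that the analysis is run once per imputed dataset and the results are then pooled, so the uncertainty due to the missing values carries through to the final estimates.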

Links: naniar.njtierney.com, visdat.njtierney.com, the mice R package, and Flexible Imputation of Missing Data.