Abstract
Clustering is one of the most widely used approaches for exploring and understanding non-random structure in data in a largely assumption-free manner. In this paper, we detail two original Shiny apps, written in R, openly developed on GitHub, and archived at Zenodo, for exploring and comparing two major unsupervised clustering algorithms: k-means and Gaussian mixture models fit via Expectation-Maximization. The first app uses simulated data, and the second uses Fisher’s Iris data set, to compare the clustering algorithms visually and numerically on data familiar to many applied researchers. In addition to being valuable tools for comparing these clustering techniques, the apps have an open-source architecture that allows for wide engagement and extension by the broader open science community, for example by adding other data sets and algorithms.
DOI: https://doi.org/10.5334/jors.269 | Journal eISSN: 2049-9647
Language: English
Submitted on: Mar 19, 2019
Accepted on: Sep 16, 2020
Published on: Oct 7, 2020
Published by: Ubiquity Press
In partnership with: Paradigm Publishing Services
Publication frequency: 1 issue per year
© 2020 Marc Lavielle, Philip D. Waggoner, published by Ubiquity Press
This work is licensed under the Creative Commons Attribution 4.0 License.
