A Package Usage

The goal of this work is to make segmentr as accessible as possible. For that reason, extra care was taken to make the package easy to install and use.

A.1 Installation

Given the algorithms and their applications discussed in Castro et al. (2018), the segmentr package for R is proposed to help researchers segment their data sets. It can be installed with the standard install.packages command, as shown below.

install.packages("segmentr")
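After installing, it can be useful to confirm that R can find the package. The snippet below is a minimal sketch using only base R and utils functions; the variable names pkg and installed are illustrative and not part of segmentr.

```r
# Check whether segmentr is available without attaching it.
pkg <- "segmentr"
installed <- requireNamespace(pkg, quietly = TRUE)

if (installed) {
  # packageVersion() reports the version found in the library path.
  message(pkg, " version ", as.character(utils::packageVersion(pkg)))
} else {
  message(pkg, " is not installed; run install.packages(\"", pkg, "\")")
}
```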

A.2 Usage

The main entry point of the package is segment(). Its data argument takes a two-dimensional matrix in which the rows represent different samples and the columns span the length of the data set we wish to segment. The function also accepts an algorithm argument, which can be "exact", "hierarchical" or "hybrid", and which specifies the type of algorithm used to explore the data set. Finally, a cost function must also be supplied as an argument, as it is used to compare candidate segments and pick the ones that fit best according to the chosen criterion.

library("segmentr")

data <- rbind(
  c(1, 1, 0, 0, 0, 1023, 134521, 12324),
  c(1, 1, 0, 0, 0, -20941, 1423, 14334),
  c(1, 1, 0, 0, 0, 2398439, 1254, 146324),
  c(1, 1, 0, 0, 0, 24134, 1, 15323),
  c(1, 1, 0, 0, 0, -231, 1256, 13445),
  c(1, 1, 0, 0, 0, 10000, 1121, 331)
)

segment(
  data,
  algorithm = "exact",
  cost = function(X) -multivariate(X) + 0.01*exp(ncol(X))
)
## Segments (total of 6):
## 
## 1:1
## 2:2
## 3:3
## 4:4
## 5:5
## 6:8
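The cost function in the example above adds the term 0.01*exp(ncol(X)) to the negative of multivariate(X). To see how that term behaves on its own, the sketch below evaluates it for segment widths from 1 to 8 using base R only. It assumes that the total cost of a segmentation is the sum of the per-segment costs and that lower cost is preferred; both are assumptions of this sketch, and the names penalty, one_wide and two_narrow are illustrative.

```r
# Penalty term copied from the cost function in the example above.
penalty <- function(n_cols) 0.01 * exp(n_cols)

# Penalty for every possible segment width in an 8-column matrix.
widths <- 1:8
penalties <- penalty(widths)
print(data.frame(width = widths, penalty = round(penalties, 3)))

# Hypothetical comparison (assumes costs add up across segments):
# one segment of 4 columns versus two segments of 2 columns each.
one_wide <- penalty(4)
two_narrow <- penalty(2) + penalty(2)
print(c(one_wide = one_wide, two_narrow = two_narrow))
```

Because the term grows exponentially with the number of columns, a wide segment is only worthwhile under these assumptions when the multivariate(X) term improves by more than the extra penalty.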

A vignette version of Chapter 5 is also provided with the package. It can be opened directly in R and contains embedded code examples, so the user can follow along with the analysis of the data set.
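One way to locate the vignette from an R session is sketched below. It relies only on utils::vignette() from base R; the vignette's name is not stated here, so the code lists whatever the installed package provides instead of assuming one. The variable names are illustrative.

```r
# List the vignettes shipped with segmentr, if the package is installed.
pkg <- "segmentr"
vignette_items <- character(0)

if (requireNamespace(pkg, quietly = TRUE)) {
  listing <- utils::vignette(package = pkg)
  # The "Item" column holds the names accepted by vignette().
  vignette_items <- listing$results[, "Item"]
  print(listing$results[, c("Item", "Title"), drop = FALSE])
} else {
  message(pkg, " is not installed, so no vignettes can be listed")
}

# In an interactive session, open the first vignette found.
if (interactive() && length(vignette_items) > 0) {
  print(utils::vignette(vignette_items[1], package = pkg))
}
```

Alternatively, utils::browseVignettes("segmentr") opens an index of the package's vignettes in the browser.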

References

Castro, Bruno M., Renan B. Lemes, Jonatas Cesar, Tábita Hünemeier, and Florencia Leonardi. 2018. “A Model Selection Approach for Multiple Sequence Segmentation and Dimensionality Reduction.” Journal of Multivariate Analysis 167: 319–30. https://doi.org/10.1016/j.jmva.2018.05.006.