A Package Usage
The goal of this work is to make the segmentr package as accessible as possible.
For that reason, extra care was put into making it easy to install and use.
A.1 Installation
Given the algorithms and their applications discussed in Castro et al. (2018), the segmentr package for R is proposed to help researchers segment their data sets. Installation can be done with the standard install.packages command.
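As a minimal sketch, assuming the package is published on CRAN under the name segmentr (the same name passed to library() later in this appendix), installation is a single call from an R session:

```r
# Assumption: the package is available from the configured CRAN mirror
# under the name "segmentr".
install.packages("segmentr")

# Load the package to confirm that the installation succeeded.
library("segmentr")
```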
A.2 Usage
The package is used through the segment() function. It takes a data argument containing a two-dimensional matrix in which the rows represent different samples and the columns represent the sequence of variables we wish to segment. The function also accepts an algorithm argument, which can be "exact", "hierarchical" or "hybrid", and which specifies the type of algorithm that will be used when exploring the data set. Finally, it is also necessary to specify a cost function as an argument, as it is used to compare candidate segments and pick those that fit better according to the chosen criterion.
library("segmentr")
data <- rbind(
  c(1, 1, 0, 0, 0, 1023, 134521, 12324),
  c(1, 1, 0, 0, 0, -20941, 1423, 14334),
  c(1, 1, 0, 0, 0, 2398439, 1254, 146324),
  c(1, 1, 0, 0, 0, 24134, 1, 15323),
  c(1, 1, 0, 0, 0, -231, 1256, 13445),
  c(1, 1, 0, 0, 0, 10000, 1121, 331)
)
segment(
  data,
  algorithm = "exact",
  cost = function(X) -multivariate(X) + 0.01 * exp(ncol(X))
)
## Segments (total of 6):
##
## 1:1
## 2:2
## 3:3
## 4:4
## 5:5
## 6:8
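The same call can be repeated with each of the three algorithm values described above. The sketch below reuses the toy matrix and the cost expression from the example and only changes the algorithm argument. The names penalized_cost, algorithms and results are introduced here for illustration and are not part of the package. No output is shown, since the segments found by the hierarchical and hybrid algorithms may differ from the exact solution.

```r
library("segmentr")

# Same toy matrix as in the example above: rows are samples,
# columns are the variables to be segmented.
data <- rbind(
  c(1, 1, 0, 0, 0, 1023, 134521, 12324),
  c(1, 1, 0, 0, 0, -20941, 1423, 14334),
  c(1, 1, 0, 0, 0, 2398439, 1254, 146324),
  c(1, 1, 0, 0, 0, 24134, 1, 15323),
  c(1, 1, 0, 0, 0, -231, 1256, 13445),
  c(1, 1, 0, 0, 0, 10000, 1121, 331)
)

# Hypothetical helper name: the cost expression from the example,
# i.e. the negated multivariate() score plus a penalty that grows
# exponentially with the number of columns in the candidate segment.
penalized_cost <- function(X) -multivariate(X) + 0.01 * exp(ncol(X))

# Run segment() once per algorithm value and keep the results by name.
algorithms <- c("exact", "hierarchical", "hybrid")
results <- lapply(algorithms, function(alg) {
  segment(data, algorithm = alg, cost = penalized_cost)
})
names(results) <- algorithms

# Print each result so the segmentations can be compared side by side.
for (alg in algorithms) {
  cat("Algorithm:", alg, "\n")
  print(results[[alg]])
}
```

Defining the cost as a named function keeps the three calls identical apart from the algorithm, which makes any difference in the resulting segments attributable to the search strategy alone.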
A vignette version of Chapter 5 is also provided. Package users can open it directly in R, with embedded code examples, and follow along with the analysis of the data set.
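The vignette's name is not stated here, so the sketch below does not assume one. It only uses the standard R helpers for discovering the vignettes shipped with an installed package, assuming segmentr has already been installed.

```r
# List the vignettes bundled with the installed segmentr package.
vignette(package = "segmentr")

# Alternatively, open an index of the package's vignettes in the browser,
# from which the one corresponding to Chapter 5 can be selected.
browseVignettes("segmentr")
```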
References
Castro, Bruno M., Renan B. Lemes, Jonatas Cesar, Tábita Hünemeier, and Florencia Leonardi. 2018. “A Model Selection Approach for Multiple Sequence Segmentation and Dimensionality Reduction.” Journal of Multivariate Analysis 167: 319–30. https://doi.org/10.1016/j.jmva.2018.05.006.