This vignette of package
groupedHyperframe (CRAN, GitHub, RPubs)
documents the creation of groupedHyperframe objects, the
batch processes for a groupedHyperframe, and the aggregation
of various statistics over a multi-level grouping structure.
Package groupedHyperframe may require
the development versions of the spatstat
family.
devtools::install_github('spatstat/spatstat')
devtools::install_github('spatstat/spatstat.data')
devtools::install_github('spatstat/spatstat.explore')
devtools::install_github('spatstat/spatstat.geom')
devtools::install_github('spatstat/spatstat.linnet')
devtools::install_github('spatstat/spatstat.model')
devtools::install_github('spatstat/spatstat.random')
devtools::install_github('spatstat/spatstat.sparse')
devtools::install_github('spatstat/spatstat.univar')
devtools::install_github('spatstat/spatstat.utils')
Examples in this vignette require that the search path
has
library(groupedHyperframe)
library(spatstat.data)
library(survival) # to help hyperframe understand Surv object
Users should remove the parameter mc.cores = 1L from all
examples to engage all CPU cores on the current host under macOS. The
authors of package groupedHyperframe are
forced to have mc.cores = 1L in this vignette to pass
CRAN’s submission check.
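The effect of mc.cores can be illustrated with parallel::mclapply() alone. The following is a minimal sketch, not taken from the package: the task slow_square() is hypothetical, and using parallel::detectCores() as the core count is an assumption about what "all CPU cores" means here.

```r
library(parallel)

# Hypothetical per-item task, standing in for a per-image computation
slow_square <- function(i) {
  i^2
}

# Serial evaluation, as forced in this vignette by mc.cores = 1L
res_serial <- mclapply(1:4, FUN = slow_square, mc.cores = 1L)

# Number of cores reported by the host (may be NA on some platforms)
n_cores <- detectCores()

# Forking is unavailable on Windows, where mc.cores must be 1
if (.Platform$OS.type == 'unix' && !is.na(n_cores)) {
  res_parallel <- mclapply(1:4, FUN = slow_square, mc.cores = n_cores)
  # The results do not depend on the number of cores
  stopifnot(identical(res_serial, res_parallel))
}
```

Only the wall-clock time should change with mc.cores; the returned values stay the same.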
| Term / Abbreviation | Description | Reference |
|---|---|---|
| \|> | Forward pipe operator | ?base::pipeOp introduced in R 4.1.0 |
| attr | Attributes | base::attr; base::attributes |
| CRAN, R | The Comprehensive R Archive Network | https://cran.r-project.org |
| data.frame | Data frame | base::data.frame |
| formula | Formula | stats::formula |
| fv, fv.object, fv.plot | (Plot of) function value table | spatstat.explore::fv.object, spatstat.explore::plot.fv |
| groupedData, ~ g1/.../gm | Grouped data frame; nested grouping structure | nlme::groupedData; nlme::lme |
| hypercolumns, hyperframe | (Hyper columns of) hyper data frame | spatstat.geom::hyperframe |
| inherits | Class inheritance | base::inherits |
| kerndens | Kernel density | stats::density.default()$y |
| mc.cores | Number of CPU cores to use | parallel::mclapply; parallel::detectCores |
| multitype | Multitype object | spatstat.geom::is.multitype |
| object.size | Memory allocation | utils::object.size |
| pmean, pmedian | Parallel mean and median | groupedHyperframe::pmean; groupedHyperframe::pmedian |
| pmax, pmin | Parallel maxima and minima | base::pmax; base::pmin |
| ppp, ppp.object | (Marked) point pattern | spatstat.geom::ppp.object |
| quantile | Quantile | stats::quantile |
| save, xz | Save with xz compression | base::save(., compress = 'xz'); base::saveRDS(., compress = 'xz'); https://en.wikipedia.org/wiki/XZ_Utils |
| S3, generic, methods | S3 object oriented system | base::UseMethod; utils::methods; utils::getS3method; https://adv-r.hadley.nz/s3.html |
| search | Search path | base::search |
| Surv | Survival object | survival::Surv |
| trapz, cumtrapz | (Cumulative) trapezoidal integration | pracma::trapz; pracma::cumtrapz; https://en.wikipedia.org/wiki/Trapezoidal_rule |
This work is supported by NCI R01CA222847 (I. Chervoneva, T. Zhan, and H. Rui) and R01CA253977 (H. Rui and I. Chervoneva).
groupedHyperframe Class
The S3 class groupedHyperframe
inherits from the hyperframe class, in a
similar fashion as the groupedData class inherits from the
data.frame class.
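The inheritance pattern described above can be sketched in base R. The class name toyGrouped and the data values are hypothetical and only illustrate how an S3 subclass sits in front of its parent class; this is not the package's constructor.

```r
# Toy data with a two-level nested structure (hypothetical values)
d <- data.frame(
  id = c('a', 'a', 'b', 'b'),
  brick = c(1L, 2L, 1L, 2L),
  depth = c(45, 60, 55, 60)
)

# Hypothetical subclass 'toyGrouped' placed in front of 'data.frame'
g <- structure(d, class = c('toyGrouped', 'data.frame'))

inherits(g, what = 'data.frame') # TRUE: the parent class is retained
inherits(g, what = 'toyGrouped') # TRUE: the subclass is recognised
inherits(d, what = 'toyGrouped') # FALSE: a plain data.frame is not a subclass

# No dim method exists for 'toyGrouped', so the data.frame method is used
nrow(g)
```

Because the parent class stays in the class vector, every method written for the parent remains available to the subclass unless it is overridden.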
A groupedHyperframe object, in addition to a
hyperframe object, has the attribute
attr(., 'group'), a formula that specifies the
(nested) grouping structure.
groupedHyperframe
hyperframe
The S3 method
as.groupedHyperframe.hyperframe() converts a
hyperframe to a groupedHyperframe. In the data set
spatstat.data::osteo, the serial number of the sampling
volume (brick) is nested in the bone sample (id).
osteo |> as.groupedHyperframe(group = ~ id/brick)
#> Grouped Hyperframe: ~id/brick
#>
#> 40 brick nested in
#> 4 id
#>
#> id shortid brick pts depth
#> 1 c77za4 4 1 (pp3) 45
#> 2 c77za4 4 2 (pp3) 60
#> 3 c77za4 4 3 (pp3) 55
#> 4 c77za4 4 4 (pp3) 60
#> 5 c77za4 4 5 (pp3) 85
#> 6 c77za4 4 6 (pp3) 90
#> 7 c77za4 4 7 (pp3) 95
#> 8 c77za4 4 8 (pp3) 65
#> 9 c77za4 4 9 (pp3) 100
#> 10 c77za4 4 10 (pp3) 100
data.frame
The S3 method
as.groupedHyperframe.data.frame() converts a
data.frame to a groupedHyperframe. This
function inspects the input by the (nested) grouping structure,
identifies the column(s) whose elements are not identical within the lowest
group, and converts them into hypercolumns. The data set
Ki67. in this package has the non-identical
column logKi67 in the nested grouping structure
~ patientID/tissueID.
(Ki67g = Ki67. |> as.groupedHyperframe(group = ~ patientID/tissueID, mc.cores = 1L))
#> Grouped Hyperframe: ~patientID/tissueID
#>
#> 6 tissueID nested in
#> 6 patientID
#>
#> logKi67 tissueID Tstage PFS recfreesurv_mon recurrence adj_rad adj_chemo
#> 1 (numeric) TJUe_I17 2 100+ 100 0 FALSE FALSE
#> 2 (numeric) TJUe_G17 1 22 22 1 FALSE FALSE
#> 3 (numeric) TJUe_F17 1 99+ 99 0 FALSE NA
#> 4 (numeric) TJUe_D17 1 99+ 99 0 FALSE TRUE
#> 5 (numeric) TJUe_J18 1 112 112 1 TRUE TRUE
#> 6 (numeric) TJUe_N17 4 12 12 1 TRUE FALSE
#> histology Her2 HR node race age patientID
#> 1 3 TRUE TRUE TRUE White 66 PT00037
#> 2 3 FALSE TRUE FALSE Black 42 PT00039
#> 3 3 FALSE TRUE FALSE White 60 PT00040
#> 4 3 FALSE TRUE TRUE White 53 PT00042
#> 5 3 FALSE TRUE TRUE White 52 PT00054
#> 6 2 TRUE TRUE TRUE Black 51 PT00059
Converting a data.frame with cell intensities, etc.,
into a groupedHyperframe reduces memory allocation, but
does not much reduce the size of the saved file if
xz compression is used.
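The conversion idea described above (find the columns that vary within the lowest group, then collapse them) can be sketched in base R. This is an illustration under stated assumptions, not the package's implementation: the toy data, the names long, wide and varies are hypothetical, and a list-column stands in for a hypercolumn.

```r
# Long toy data: one row per cell, covariates repeated within each tissue
long <- data.frame(
  patientID = rep(c('P1', 'P2'), times = c(3L, 2L)),
  tissueID = rep(c('T1', 'T2'), times = c(3L, 2L)),
  age = rep(c(66, 42), times = c(3L, 2L)),
  logKi67 = c(6.1, 6.4, 5.9, 7.0, 6.8)
)

# Lowest grouping level of ~ patientID/tissueID, in order of appearance
key <- paste(long$patientID, long$tissueID, sep = '/')
f <- factor(key, levels = unique(key))

# A column varies if any lowest-level group holds more than one distinct value
varies <- vapply(long, FUN = function(x) {
  any(tapply(x, INDEX = f, FUN = function(v) length(unique(v))) > 1L)
}, FUN.VALUE = NA)

# Constant columns: keep one row per group
wide <- long[!duplicated(f), !varies, drop = FALSE]

# Varying columns: store the per-group values as list-columns
for (nm in names(varies)[varies]) {
  wide[[nm]] <- unname(split(long[[nm]], f = f))
}
```

In this toy case only logKi67 varies within a tissue, so wide has one row per tissue and a list-column holding the cell-level values.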
groupedHyperframe with ppp-hypercolumn
Function grouped_ppp() creates a
groupedHyperframe with one-and-only-one
ppp-hypercolumn. In the following example, the
argument formula specifies
the numeric mark hladr and the multitype mark
phenotype, on the left-hand side;
OS, gender and age, before the | separator on the
right-hand side; and
image_id nested in patient_id, after the | separator
on the right-hand side.
(s = grouped_ppp(formula = hladr + phenotype ~ OS + gender + age | patient_id/image_id,
  data = wrobel_lung, mc.cores = 1L))
#> Grouped Hyperframe: ~patient_id/image_id
#>
#> 25 image_id nested in
#> 5 patient_id
#>
#> OS gender age patient_id image_id ppp.
#> 1 3488+ F 85 #01 0-889-121 [40864,18015].im3 (ppp)
#> 2 3488+ F 85 #01 0-889-121 [42689,19214].im3 (ppp)
#> 3 3488+ F 85 #01 0-889-121 [42806,16718].im3 (ppp)
#> 4 3488+ F 85 #01 0-889-121 [44311,17766].im3 (ppp)
#> 5 3488+ F 85 #01 0-889-121 [45366,16647].im3 (ppp)
#> 6 1605 M 66 #02 1-037-393 [56576,16907].im3 (ppp)
#> 7 1605 M 66 #02 1-037-393 [56583,15235].im3 (ppp)
#> 8 1605 M 66 #02 1-037-393 [57130,16082].im3 (ppp)
#> 9 1605 M 66 #02 1-037-393 [57396,17896].im3 (ppp)
#> 10 1605 M 66 #02 1-037-393 [57403,16934].im3 (ppp)
ppp-hypercolumn
In this section, we outline the batch processes of spatial point
pattern analyses applicable to the one-and-only-one
ppp-hypercolumn of a hyperframe.
These batch processes are not intended for a hyperframe
with multiple ppp-hypercolumns in the
foreseeable future, as that would require checking for name clashes in
the $marks from multiple
ppp-hypercolumns.
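The name-clash concern can be made concrete with a small base-R sketch. The mark names below are hypothetical, and the naming rule (mark name followed by a suffix such as .E) is assumed only for illustration.

```r
# Hypothetical mark names carried by two ppp-hypercolumns of one hyperframe
marks_ppp1 <- c('hladr', 'area')
marks_ppp2 <- c('hladr')

# Assumed naming rule: <mark><suffix>, here with the suffix '.E'
new_names <- paste0(c(marks_ppp1, marks_ppp2), '.E')

# A duplicated name would make the resulting hypercolumn ambiguous
clash <- unique(new_names[duplicated(new_names)])
clash
```

With a single ppp-hypercolumn the mark names are unique by construction, so no such check is needed.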
fv-hypercolumn
| Batch Process | Workhorse in spatstat.explore | Applicable To | fv-hypercolumn Suffix |
|---|---|---|---|
| Emark_() | Emark() | numeric marks | .E |
| Vmark_() | Vmark() | numeric marks | .V |
| markcorr_() | markcorr() | numeric marks | .k |
| markvario_() | markvario() | numeric marks | .gamma |
| Gcross_() | Gcross() | multitype marks | .G |
| Kcross_() | Kcross() | multitype marks | .K |
| Jcross_() | Jcross() | multitype marks | .J |
numeric-hypercolumn
| Batch Process | Workhorse in spatstat.geom | Applicable To | numeric-hypercolumn Suffix |
|---|---|---|---|
| nncross_() | nncross.ppp(., what = 'dist') | multitype marks | .nncross |
Multiple batch processes may be applied to a hyperframe
(or groupedHyperframe) in a pipeline.
r = seq.int(from = 0, to = 250, by = 10)
out = s |>
  Emark_(r = r, correction = 'best', mc.cores = 1L) |> # slow
  # Vmark_(r = r, correction = 'best', mc.cores = 1L) |> # slow
  # markcorr_(r = r, correction = 'best', mc.cores = 1L) |> # slow
  # markvario_(r = r, correction = 'best', mc.cores = 1L) |> # slow
  Gcross_(i = 'CK+.CD8-', j = 'CK-.CD8+', r = r, correction = 'best', mc.cores = 1L) |> # fast
  # Kcross_(i = 'CK+.CD8-', j = 'CK-.CD8+', r = r, correction = 'best', mc.cores = 1L) |> # fast
  nncross_(i = 'CK+.CD8-', j = 'CK-.CD8+', correction = 'best', mc.cores = 1L) # fast
The returned hyperframe (or groupedHyperframe) has
the fv-hypercolumn hladr.E, created by function Emark_()
on the numeric mark hladr;
the fv-hypercolumn phenotype.G, created by function
Gcross_() on the multitype mark phenotype; and
the numeric-hypercolumn phenotype.nncross, created by function
nncross_() on the multitype mark phenotype.
out
#> Grouped Hyperframe: ~patient_id/image_id
#>
#> 25 image_id nested in
#> 5 patient_id
#>
#> OS gender age patient_id image_id ppp. hladr.E phenotype.G
#> 1 3488+ F 85 #01 0-889-121 [40864,18015].im3 (ppp) (fv) (fv)
#> 2 3488+ F 85 #01 0-889-121 [42689,19214].im3 (ppp) (fv) (fv)
#> 3 3488+ F 85 #01 0-889-121 [42806,16718].im3 (ppp) (fv) (fv)
#> 4 3488+ F 85 #01 0-889-121 [44311,17766].im3 (ppp) (fv) (fv)
#> 5 3488+ F 85 #01 0-889-121 [45366,16647].im3 (ppp) (fv) (fv)
#> 6 1605 M 66 #02 1-037-393 [56576,16907].im3 (ppp) (fv) (fv)
#> 7 1605 M 66 #02 1-037-393 [56583,15235].im3 (ppp) (fv) (fv)
#> 8 1605 M 66 #02 1-037-393 [57130,16082].im3 (ppp) (fv) (fv)
#> 9 1605 M 66 #02 1-037-393 [57396,17896].im3 (ppp) (fv) (fv)
#> 10 1605 M 66 #02 1-037-393 [57403,16934].im3 (ppp) (fv) (fv)
#> phenotype.nncross
#> 1 (numeric)
#> 2 (numeric)
#> 3 (numeric)
#> 4 (numeric)
#> 5 (numeric)
#> 6 (numeric)
#> 7 (numeric)
#> 8 (numeric)
#> 9 (numeric)
#> 10 (numeric)
When a nested grouping structure ~g1/g2/.../gm is present,
we may aggregate over the fv-hypercolumns, the
numeric-hypercolumns, and the numeric marks in the
ppp-hypercolumn, by any one of the grouping levels ~g1,
~g2, …, or ~gm. If the lowest grouping level
~gm is specified, then no aggregation is performed.
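Aggregation over a higher grouping level can be sketched in base R with toy data. The helper toy_pmean() is a hypothetical element-wise mean written in the spirit of base::pmax; it is not groupedHyperframe::pmean, whose signature is not assumed here, and a plain list stands in for a numeric-hypercolumn.

```r
# Toy stand-in for a numeric-hypercolumn: one vector per image, on a common grid
vals <- list(
  c(1, 2, 3),    # image 1 of patient 'A'
  c(3, 4, 5),    # image 2 of patient 'A'
  c(10, 20, 30)  # image 1 of patient 'B'
)
patient <- c('A', 'A', 'B')

# Hypothetical element-wise ("parallel") mean across several vectors
toy_pmean <- function(...) rowMeans(cbind(...))

# Aggregate by the higher grouping level: one vector per patient
aggr <- lapply(split(vals, f = patient), FUN = function(v) {
  do.call(toy_pmean, args = v)
})
aggr
```

Patient 'A' receives the element-wise mean of its two images, while patient 'B', with a single image, is returned unchanged.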
fv-hypercolumns
Function aggregate_fv() aggregates
fv.plot. In the following example, we have
the numeric-hypercolumns
hladr.E.value and
phenotype.G.value, the aggregated function values from the
fv-hypercolumns hladr.E
and phenotype.G; and
the numeric-hypercolumns
hladr.E.cumtrapz and
phenotype.G.cumtrapz, the aggregated cumulative
trapezoidal integration from the fv-hypercolumns
hladr.E and phenotype.G.
(afv = out |>
  aggregate_fv(by = ~ patient_id, f_aggr_ = pmean, mc.cores = 1L))
#> Column(s) image_id removed; as they are not identical per aggregation-group
#> Hyperframe:
#> OS gender age patient_id hladr.E.value hladr.E.cumtrapz
#> 1 3488+ F 85 #01 0-889-121 (numeric) (numeric)
#> 2 1605 M 66 #02 1-037-393 (numeric) (numeric)
#> 3 176 M 84 #03 2-080-378 (numeric) (numeric)
#> 4 2042+ M 79 #04 2-223-153 (numeric) (numeric)
#> 5 3747+ M 68 #05 2-286-740 (numeric) (numeric)
#> phenotype.G.value phenotype.G.cumtrapz
#> 1 (numeric) (numeric)
#> 2 (numeric) (numeric)
#> 3 (numeric) (numeric)
#> 4 (numeric) (numeric)
#> 5 (numeric) (numeric)
Each of the numeric-hypercolumns contains
tabulated values on the common grid of r. One “slice” of
this grid may be extracted by
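As a rough illustration with toy data, the sketch below tabulates a cumulative trapezoidal integral on a common grid of r and then takes the slice at one value of r. The helper toy_cumtrapz() is a base-R stand-in (not pracma::cumtrapz), the function values are hypothetical, and a named list stands in for a numeric-hypercolumn; none of this is the package's own accessor.

```r
# Common grid of r, as used in the batch processes above
r <- seq.int(from = 0, to = 250, by = 10)

# Toy function values on the grid, one vector per patient (hypothetical)
value <- list(
  A = sqrt(r),
  B = r / 10
)

# Cumulative trapezoidal integration written in base R
toy_cumtrapz <- function(x, y) {
  c(0, cumsum(diff(x) * (y[-1L] + y[-length(y)]) / 2))
}
cumtrapz <- lapply(value, FUN = toy_cumtrapz, x = r)

# Position of r = 50 on the grid, then one value per patient at that position
id <- match(50, r)
slice <- vapply(cumtrapz, FUN = function(v) v[id], FUN.VALUE = NA_real_)
slice
```

Each element of cumtrapz has the same length as r, so a single grid position yields one scalar per patient.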
numeric-hypercolumns and numeric mark(s) in ppp-hypercolumn
Function aggregate_quantile() aggregates the quantile
of the numeric-hypercolumns and of the numeric mark(s) in the
ppp-hypercolumn. In the following example, we have
the numeric-hypercolumn
phenotype.nncross.quantile, the aggregated quantile of the
numeric-hypercolumn phenotype.nncross; and
the numeric-hypercolumn
hladr.quantile, the aggregated quantile of the
numeric mark hladr in the ppp-hypercolumn.
out |>
  aggregate_quantile(by = ~ patient_id, probs = seq.int(from = 0, to = 1, by = .1), mc.cores = 1L)
#> Column(s) image_id removed; as they are not identical per aggregation-group
#> Hyperframe:
#> OS gender age patient_id phenotype.nncross.quantile hladr.quantile
#> 1 3488+ F 85 #01 0-889-121 (numeric) (numeric)
#> 2 1605 M 66 #02 1-037-393 (numeric) (numeric)
#> 3 176 M 84 #03 2-080-378 (numeric) (numeric)
#> 4 2042+ M 79 #04 2-223-153 (numeric) (numeric)
#> 5 3747+ M 68 #05 2-286-740 (numeric) (numeric)
Function aggregate_kerndens() aggregates the kernel
density of the numeric-hypercolumns and of the numeric mark(s) in the
ppp-hypercolumn. In the following example, we have
the numeric-hypercolumn
phenotype.nncross.kerndens, the aggregated kernel
density of the numeric-hypercolumn phenotype.nncross; and
the numeric-hypercolumn
hladr.kerndens, the aggregated kernel density of the
numeric mark hladr in the ppp-hypercolumn.
(mdist = out$phenotype.nncross |> unlist() |> max())
#> [1] 354.2968
out |>
  aggregate_kerndens(by = ~ patient_id, from = 0, to = mdist, mc.cores = 1L)
#> Column(s) image_id removed; as they are not identical per aggregation-group
#> Hyperframe:
#> OS gender age patient_id phenotype.nncross.kerndens hladr.kerndens
#> 1 3488+ F 85 #01 0-889-121 (numeric) (numeric)
#> 2 1605 M 66 #02 1-037-393 (numeric) (numeric)
#> 3 176 M 84 #03 2-080-378 (numeric) (numeric)
#> 4 2042+ M 79 #04 2-223-153 (numeric) (numeric)
#> 5 3747+ M 68 #05 2-286-740 (numeric) (numeric)
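The role of the shared from and to arguments, and of a shared probs vector, can be sketched with stats::density() and stats::quantile() on toy distances. The data are hypothetical, and the aggregation rule used here (element-wise mean of per-image results within patient) is an assumption for illustration, not a statement of how the package aggregates.

```r
# Toy nearest-neighbour distances, one numeric vector per image (hypothetical)
d <- list(
  c(12, 30, 45, 51, 80),
  c(5, 22, 37, 64, 90, 120),
  c(40, 41, 58, 77)
)
patient <- c('A', 'A', 'B')

# A shared range gives every image the same evaluation grid
mdist <- max(unlist(d))
kd <- lapply(d, FUN = function(x) density(x, from = 0, to = mdist)$y)

# A shared probs vector makes the quantiles comparable across images
probs <- seq.int(from = 0, to = 1, by = .1)
qt <- lapply(d, FUN = function(x) quantile(x, probs = probs))

# Assumed aggregation rule: element-wise mean within patient
aggr <- function(v) {
  lapply(split(v, f = patient), FUN = function(z) rowMeans(do.call(cbind, z)))
}
kd_patient <- aggr(kd)
qt_patient <- aggr(qt)
```

Without a common from and to, each call to density() would pick its own grid and the resulting vectors could not be averaged position by position.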