Generates all combinations of the specified columns and returns a nested data structure with one entry per combination. Within each nested subset, the combined columns are renamed to value1, value2, ... (up to pairs_n), so the same analysis code, such as genetic correlation estimation, can be applied to every subset.

c2p_nest(data, cols2bind, by = NULL, pairs_n = 2L, sep = "-", nest_type = "dt")

Arguments

data

A data.frame or data.table to be transformed.

cols2bind

A character vector of column names or a numeric vector of column indices to be combined into pairs (or larger combinations when pairs_n > 2). Must not overlap with by.

by

A character vector of column names or a numeric vector of column indices to group by. Default is NULL.

pairs_n

An integer >= 2 giving the number of columns in each combination (e.g., 2 for pairwise). Default is 2L.

sep

A single character string used as a separator when constructing the pairs identifier column. Default is "-".

nest_type

A character string specifying the class of each nested object: "dt" (data.table, default) or "df" (data.frame).

Value

A data.table with columns:

pairs

Character. The column-combination identifier, e.g. "Sepal.Length-Sepal.Width".

...

Any by grouping columns, one per variable.

data

List-column. Each cell holds a data.table (or data.frame when nest_type = "df") containing value1, value2, ..., plus any extra columns that were neither in cols2bind nor by.

Details

The columns specified in cols2bind are renamed to value1, value2, ... within each nested subset. The original column names are preserved in the pairs column (e.g., "Sepal.Length-Sepal.Width"), ensuring full traceability for downstream iterative analyses such as genetic correlation estimation.

Columns that belong to neither cols2bind nor by (referred to internally as "extra columns") are retained inside the nested subsets so that covariates or ID fields remain accessible. Grouping columns (by) are not duplicated inside the nested data because they are already present as outer key columns in the returned table.
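The renaming and extra-column behaviour described above can be sketched in base R. This is an illustration only, not the package's implementation: it returns a plain named list of data.frames rather than a nested data.table, it uses no by grouping, and the names combos, extra_cols, and nested are hypothetical.

```r
# Illustrative sketch in base R; NOT the actual c2p_nest() implementation.
cols2bind <- c("Sepal.Length", "Sepal.Width", "Petal.Length")

# "Extra" columns: everything not listed in cols2bind (no `by` in this sketch)
extra_cols <- setdiff(names(iris), cols2bind)

# combn() returns a matrix with one column per combination
combos <- combn(cols2bind, 2)

nested <- lapply(seq_len(ncol(combos)), function(i) {
  pair <- combos[, i]
  # Keep the two paired columns plus the extra columns
  sub <- iris[, c(pair, extra_cols)]
  # Rename the paired columns to value1, value2
  names(sub)[seq_along(pair)] <- paste0("value", seq_along(pair))
  sub
})

# The original column names survive only in the pair identifier
names(nested) <- apply(combos, 2, paste, collapse = "-")

names(nested)
names(nested[[1]])
```

Each element has the same column layout (value1, value2, then the extra columns), which is what lets a single function be mapped over every pair.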

When the number of requested combinations exceeds 500, a message is emitted; above 5000, a warning is raised, because memory usage grows linearly with the combination count.
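To gauge whether a call is likely to cross these thresholds, the combination count can be estimated beforehand. The sketch below assumes the count equals choose(length(cols2bind), pairs_n), which is the number of combinations combn produces; whether by groups also enter the function's internal count is not stated here. The helper n_combinations is hypothetical and not part of the package.

```r
# Hypothetical helper: number of combinations of n_cols columns taken
# pairs_n at a time (assumed to match the count that c2p_nest() checks).
n_combinations <- function(n_cols, pairs_n = 2L) {
  choose(n_cols, pairs_n)
}

n_combinations(3)     # 3 pairs, as in the iris examples
n_combinations(33)    # 528, above the 500 message threshold
n_combinations(101)   # 5050, above the 5000 warning threshold
```

Because the count grows quickly with the number of columns and with pairs_n, restricting cols2bind to the columns of interest is the simplest way to stay below the thresholds.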

See also

combn for the underlying combination generator.

Examples

# Example data preparation: Define column names for combination
col_names <- c("Sepal.Length", "Sepal.Width", "Petal.Length")

# Example 1: Basic column-to-pairs nesting with custom separator
c2p_nest(
  iris,                   # Input iris dataset
  cols2bind = col_names,  # Columns to be combined as pairs
  pairs_n = 2,            # Create pairs of 2 columns
  sep = "&"               # Custom separator for pair names
)
#>                        pairs                data
#>                       <char>              <list>
#> 1:  Sepal.Length&Sepal.Width <data.table[150x4]>
#> 2: Sepal.Length&Petal.Length <data.table[150x4]>
#> 3:  Sepal.Width&Petal.Length <data.table[150x4]>
# Returns a nested data.table where:
# - pairs: combined column names (e.g., "Sepal.Length&Sepal.Width")
# - data: list column containing data.tables with value1, value2 columns

# Example 2: Column-to-pairs nesting with numeric indices and grouping
c2p_nest(
  iris,                   # Input iris dataset
  cols2bind = 1:3,        # First 3 columns to be combined
  pairs_n = 2,            # Create pairs of 2 columns
  by = 5                  # Group by 5th column (Species)
)
#>                        pairs    Species               data
#>                       <char>     <fctr>             <list>
#> 1:  Sepal.Length-Sepal.Width     setosa <data.table[50x3]>
#> 2:  Sepal.Length-Sepal.Width versicolor <data.table[50x3]>
#> 3:  Sepal.Length-Sepal.Width  virginica <data.table[50x3]>
#> 4: Sepal.Length-Petal.Length     setosa <data.table[50x3]>
#> 5: Sepal.Length-Petal.Length versicolor <data.table[50x3]>
#> 6: Sepal.Length-Petal.Length  virginica <data.table[50x3]>
#> 7:  Sepal.Width-Petal.Length     setosa <data.table[50x3]>
#> 8:  Sepal.Width-Petal.Length versicolor <data.table[50x3]>
#> 9:  Sepal.Width-Petal.Length  virginica <data.table[50x3]>
# Returns a nested data.table where:
# - pairs: combined column names
# - Species: grouping variable
# - data: list column with one data.table per pair and Species level
