w2l_split reshapes a wide-format data.frame or data.table into long format, then splits the result into a named list keyed by the pivoted column identifier (variable) and any optional grouping variables supplied via by. List element names are derived directly from the grouping key combinations produced by split(), guaranteeing name-to-content alignment.

w2l_split(data, cols2l = NULL, by = NULL, split_type = "dt", sep = "_")

Arguments

data

data.frame or data.table. Wide-format input dataset. Converted in-place to data.table via setDT() if necessary (no copy).

cols2l

numeric or character. Columns to pivot from wide to long, specified as integer indices or column names. Default NULL: when NULL, by must be provided and the function splits the data as-is without melting.

by

numeric or character. Additional grouping variables used as secondary split keys, specified as integer indices or column names. Default NULL.

split_type

character. Class of each list element: "dt" for data.table (default) or "df" for data.frame.

sep

character. Separator used when concatenating multiple grouping key values into a single list-element name. Default "_".

Value

A named list of data.table or data.frame objects (controlled by split_type). Names reflect the key combination of variable (and by levels if provided), joined by sep.

  • If by is NULL, the list is keyed by the pivoted column names only.

  • If by is specified, the list is keyed by variable and all by level combinations.

Details

Name safety: list names are produced by data.table's split() method (split.data.table) through its by argument, not reconstructed from raw row order. This avoids the name-to-content misalignment that arises when the order returned by unique() on the original data differs from the group order that split() uses.
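
The misalignment described above can be reproduced on a toy table. The following is a minimal sketch, not the package source; the table dt and the objects pieces, wrong, and parts are made up for illustration.

```r
library(data.table)

# Toy data: group "b" appears before group "a".
dt <- data.table(g = c("b", "a", "b", "a"), x = 1:4)

# Fragile pattern: base split() orders groups by sorted level ("a", "b"),
# while unique() returns first-appearance order ("b", "a").
pieces <- split(dt$x, dt$g)
wrong  <- setNames(unname(pieces), unique(dt$g))
wrong$b      # 2 4, which are actually the rows of group "a"

# Safer pattern: let split() name the elements from the by key itself.
parts <- split(dt, by = "g")
parts$b$x    # 1 3
parts$a$x    # 2 4
```

Because the names in parts come from the same grouping step that builds the elements, they cannot drift apart.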

Column resolution: both cols2l and by accept integer column positions or character column names. Out-of-bounds indices and unknown names are caught early with informative error messages.

Overlap guard: columns appearing in both cols2l and by raise an error before melting to prevent id.vars / measure.vars conflicts.
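
Column resolution and the overlap guard could look roughly like the sketch below. resolve_cols() and check_overlap() are hypothetical helpers written for this illustration; the actual internals and error messages of w2l_split may differ.

```r
# Hypothetical helper: turn integer positions or names into column names.
resolve_cols <- function(data, cols, arg) {
  if (is.null(cols)) return(character(0))
  if (is.numeric(cols)) {
    # Reject NA, non-integer, and out-of-bounds positions.
    bad <- cols[is.na(cols) | cols != trunc(cols) | cols < 1 | cols > ncol(data)]
    if (length(bad) > 0) {
      stop(sprintf("`%s` has invalid column indices: %s",
                   arg, paste(bad, collapse = ", ")), call. = FALSE)
    }
    return(names(data)[cols])
  }
  if (is.character(cols)) {
    bad <- setdiff(cols, names(data))
    if (length(bad) > 0) {
      stop(sprintf("`%s` has unknown column names: %s",
                   arg, paste(bad, collapse = ", ")), call. = FALSE)
    }
    return(cols)
  }
  stop(sprintf("`%s` must be numeric or character", arg), call. = FALSE)
}

# Hypothetical helper: fail before melting if a column is in both sets.
check_overlap <- function(cols2l, by) {
  shared <- intersect(cols2l, by)
  if (length(shared) > 0) {
    stop(sprintf("Columns in both `cols2l` and `by`: %s",
                 paste(shared, collapse = ", ")), call. = FALSE)
  }
  invisible(TRUE)
}

measure <- resolve_cols(iris, 1:3, "cols2l")  # positions -> names
keys    <- resolve_cols(iris, 5, "by")        # "Species"
check_overlap(measure, keys)                  # passes silently
```

Resolving both arguments to names first means the overlap check compares like with like, even when one argument is given as positions and the other as names.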

Factor-free melting: melt() is called with variable.factor = FALSE so the variable column is always character rather than factor. split() then derives groups and names from the column values themselves, not from factor levels.
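
The melt-then-split sequence can be sketched with plain data.table calls. This is an illustrative stand-in under assumed column choices, not the function's source; long and parts are names invented here, and the element names in this sketch use split()'s own default separator rather than the sep argument of w2l_split.

```r
library(data.table)

dt <- as.data.table(iris)

# Pivot two measure columns to long format; keep the variable column as character.
long <- melt(
  dt,
  id.vars         = c("Species", "Petal.Length", "Petal.Width"),
  measure.vars    = c("Sepal.Length", "Sepal.Width"),
  variable.factor = FALSE
)
is.character(long$variable)   # TRUE

# Split on the pivoted identifier plus the grouping column;
# keep.by = FALSE drops the key columns from each element.
parts <- split(long, by = c("variable", "Species"), keep.by = FALSE)
length(parts)                 # 6: 2 pivoted columns x 3 species
names(parts[[1]])
```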

Memory efficiency:

  • setDT() converts data.frame inputs by reference — no full copy.

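  • For split_type = "df", each element is converted with setattr(copy(x), "class", "data.frame"). data.table's copy() makes a deep copy of the element, and setattr() then sets the class by reference, so the conversion adds no duplication beyond that single copy and leaves the data.table produced by split() untouched.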
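
Both by-reference operations mentioned in the list above can be tried in isolation. The sketch below uses throwaway objects (df, x, y) and only shows the class changes; it makes no claim about how w2l_split sequences these calls internally.

```r
library(data.table)

# setDT() changes the object in place: df itself becomes a data.table.
df <- data.frame(a = 1:3, b = c("x", "y", "z"))
setDT(df)
is.data.table(df)   # TRUE

# setattr() on a copy: y becomes a plain data.frame, x is left untouched.
# copy() duplicates x first, so the class change cannot leak back into x.
x <- data.table(a = 1:3)
y <- setattr(copy(x), "class", "data.frame")
class(y)            # "data.frame"
is.data.table(x)    # TRUE
```

One consequence of the in-place conversion is that a data.frame passed by the caller is a data.table afterwards, which is worth keeping in mind when the same object is reused later.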

Note

  • An empty input table (0 rows) triggers a warning() and returns an empty list immediately.

  • cols2l and by must not overlap; shared columns raise an error.

  • split_type values other than "dt" or "df" raise an error.
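
The guards in these notes amount to a few early checks before any splitting happens. A minimal sketch follows, assuming a hypothetical wrapper split_guarded() that covers only the by-only case; the order of the checks and the message wording are assumptions, not taken from the package.

```r
library(data.table)

# Hypothetical wrapper illustrating the early checks only.
split_guarded <- function(data, by, split_type = "dt") {
  # Reject anything other than the two documented values.
  if (!is.character(split_type) || length(split_type) != 1L ||
      !split_type %in% c("dt", "df")) {
    stop("`split_type` must be \"dt\" or \"df\".", call. = FALSE)
  }
  # Empty input: warn and return an empty list immediately.
  if (nrow(data) == 0L) {
    warning("`data` has 0 rows; returning an empty list.", call. = FALSE)
    return(list())
  }
  split(as.data.table(data), by = by, keep.by = FALSE)
}

res_empty <- suppressWarnings(split_guarded(iris[0, ], by = "Species"))
length(res_empty)                       # 0
res_full <- split_guarded(iris, by = "Species")
names(res_full)                         # "setosa" "versicolor" "virginica"
```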

See also

tidytable::group_split() for a tidyverse-style equivalent.

Examples

# Example: Wide to long format splitting demonstrations

# Example 1: Basic splitting by Species
w2l_split(
  data = iris,                    # Input dataset
  by = "Species"                  # Split by Species column
) |> 
  lapply(head)                    # Show first 6 rows of each split
#> $setosa
#>    Sepal.Length Sepal.Width Petal.Length Petal.Width
#>           <num>       <num>        <num>       <num>
#> 1:          5.1         3.5          1.4         0.2
#> 2:          4.9         3.0          1.4         0.2
#> 3:          4.7         3.2          1.3         0.2
#> 4:          4.6         3.1          1.5         0.2
#> 5:          5.0         3.6          1.4         0.2
#> 6:          5.4         3.9          1.7         0.4
#> 
#> $versicolor
#>    Sepal.Length Sepal.Width Petal.Length Petal.Width
#>           <num>       <num>        <num>       <num>
#> 1:          7.0         3.2          4.7         1.4
#> 2:          6.4         3.2          4.5         1.5
#> 3:          6.9         3.1          4.9         1.5
#> 4:          5.5         2.3          4.0         1.3
#> 5:          6.5         2.8          4.6         1.5
#> 6:          5.7         2.8          4.5         1.3
#> 
#> $virginica
#>    Sepal.Length Sepal.Width Petal.Length Petal.Width
#>           <num>       <num>        <num>       <num>
#> 1:          6.3         3.3          6.0         2.5
#> 2:          5.8         2.7          5.1         1.9
#> 3:          7.1         3.0          5.9         2.1
#> 4:          6.3         2.9          5.6         1.8
#> 5:          6.5         3.0          5.8         2.2
#> 6:          7.6         3.0          6.6         2.1
#> 

# Example 2: Split specific columns using numeric indices
w2l_split(
  data = iris,                    # Input dataset
  cols2l = 1:3,                   # Pivot the first 3 columns to long format
  by = 5                          # Split by column index 5 (Species)
) |> 
  lapply(head)                    # Show first 6 rows of each split
#> $Sepal.Length_setosa
#>    Petal.Width value
#>          <num> <num>
#> 1:         0.2   5.1
#> 2:         0.2   4.9
#> 3:         0.2   4.7
#> 4:         0.2   4.6
#> 5:         0.2   5.0
#> 6:         0.4   5.4
#> 
#> $Sepal.Length_versicolor
#>    Petal.Width value
#>          <num> <num>
#> 1:         1.4   7.0
#> 2:         1.5   6.4
#> 3:         1.5   6.9
#> 4:         1.3   5.5
#> 5:         1.5   6.5
#> 6:         1.3   5.7
#> 
#> $Sepal.Length_virginica
#>    Petal.Width value
#>          <num> <num>
#> 1:         2.5   6.3
#> 2:         1.9   5.8
#> 3:         2.1   7.1
#> 4:         1.8   6.3
#> 5:         2.2   6.5
#> 6:         2.1   7.6
#> 
#> $Sepal.Width_setosa
#>    Petal.Width value
#>          <num> <num>
#> 1:         0.2   3.5
#> 2:         0.2   3.0
#> 3:         0.2   3.2
#> 4:         0.2   3.1
#> 5:         0.2   3.6
#> 6:         0.4   3.9
#> 
#> $Sepal.Width_versicolor
#>    Petal.Width value
#>          <num> <num>
#> 1:         1.4   3.2
#> 2:         1.5   3.2
#> 3:         1.5   3.1
#> 4:         1.3   2.3
#> 5:         1.5   2.8
#> 6:         1.3   2.8
#> 
#> $Sepal.Width_virginica
#>    Petal.Width value
#>          <num> <num>
#> 1:         2.5   3.3
#> 2:         1.9   2.7
#> 3:         2.1   3.0
#> 4:         1.8   2.9
#> 5:         2.2   3.0
#> 6:         2.1   3.0
#> 
#> $Petal.Length_setosa
#>    Petal.Width value
#>          <num> <num>
#> 1:         0.2   1.4
#> 2:         0.2   1.4
#> 3:         0.2   1.3
#> 4:         0.2   1.5
#> 5:         0.2   1.4
#> 6:         0.4   1.7
#> 
#> $Petal.Length_versicolor
#>    Petal.Width value
#>          <num> <num>
#> 1:         1.4   4.7
#> 2:         1.5   4.5
#> 3:         1.5   4.9
#> 4:         1.3   4.0
#> 5:         1.5   4.6
#> 6:         1.3   4.5
#> 
#> $Petal.Length_virginica
#>    Petal.Width value
#>          <num> <num>
#> 1:         2.5   6.0
#> 2:         1.9   5.1
#> 3:         2.1   5.9
#> 4:         1.8   5.6
#> 5:         2.2   5.8
#> 6:         2.1   6.6
#> 

# Example 3: Split specific columns using column names
list_res <- w2l_split(
  data = iris,                    # Input dataset
  cols2l = c("Sepal.Length",      # Select columns by name
             "Sepal.Width"),
  by = "Species"                  # Split by Species column
)
lapply(list_res, head)            # Show first 6 rows of each split
#> $Sepal.Length_setosa
#>    Petal.Length Petal.Width value
#>           <num>       <num> <num>
#> 1:          1.4         0.2   5.1
#> 2:          1.4         0.2   4.9
#> 3:          1.3         0.2   4.7
#> 4:          1.5         0.2   4.6
#> 5:          1.4         0.2   5.0
#> 6:          1.7         0.4   5.4
#> 
#> $Sepal.Length_versicolor
#>    Petal.Length Petal.Width value
#>           <num>       <num> <num>
#> 1:          4.7         1.4   7.0
#> 2:          4.5         1.5   6.4
#> 3:          4.9         1.5   6.9
#> 4:          4.0         1.3   5.5
#> 5:          4.6         1.5   6.5
#> 6:          4.5         1.3   5.7
#> 
#> $Sepal.Length_virginica
#>    Petal.Length Petal.Width value
#>           <num>       <num> <num>
#> 1:          6.0         2.5   6.3
#> 2:          5.1         1.9   5.8
#> 3:          5.9         2.1   7.1
#> 4:          5.6         1.8   6.3
#> 5:          5.8         2.2   6.5
#> 6:          6.6         2.1   7.6
#> 
#> $Sepal.Width_setosa
#>    Petal.Length Petal.Width value
#>           <num>       <num> <num>
#> 1:          1.4         0.2   3.5
#> 2:          1.4         0.2   3.0
#> 3:          1.3         0.2   3.2
#> 4:          1.5         0.2   3.1
#> 5:          1.4         0.2   3.6
#> 6:          1.7         0.4   3.9
#> 
#> $Sepal.Width_versicolor
#>    Petal.Length Petal.Width value
#>           <num>       <num> <num>
#> 1:          4.7         1.4   3.2
#> 2:          4.5         1.5   3.2
#> 3:          4.9         1.5   3.1
#> 4:          4.0         1.3   2.3
#> 5:          4.6         1.5   2.8
#> 6:          4.5         1.3   2.8
#> 
#> $Sepal.Width_virginica
#>    Petal.Length Petal.Width value
#>           <num>       <num> <num>
#> 1:          6.0         2.5   3.3
#> 2:          5.1         1.9   2.7
#> 3:          5.9         2.1   3.0
#> 4:          5.6         1.8   2.9
#> 5:          5.8         2.2   3.0
#> 6:          6.6         2.1   3.0
#> 
# Same list layout as Example 2, but Petal.Length stays as an id column because only two columns are pivoted