Reshape Wide Data to Long Format and Split into List — w2l_split • mintyr

The w2l_split function reshapes wide-format data into long format and splits the result into a list by variable name and, optionally, by grouping columns. It accepts both data.frame and data.table objects.

w2l_split(data, cols2l = NULL, by = NULL, split_type = "dt", sep = "_")

Arguments

data

data.frame or data.table

Input dataset in wide format
Automatically converted to data.table if necessary

cols2l

numeric or character columns to transform

Specifies columns for wide-to-long conversion
Can be column indices or column names
Default is NULL (no columns are melted; as Example 1 shows, the data is then split by the by columns only)

by

numeric or character grouping variables

Optional columns for data splitting
Can be column indices or column names
Used to create hierarchical split structure
Default is NULL

split_type

character output data type

Defines split data object type
Possible values:
- "dt": split data.table objects
- "df": split data.frame objects
Default is "dt"

sep

character separator

Used for combining split names
Default is "_"

Value

A list of data.table or data.frame objects (depending on split_type), split by variable names and optional grouping columns.

If by is NULL, returns a list split by variable names only.
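If by is specified, returns a list split by both variable names and grouping variables; element names join the variable name and the group value with sep (for example, Sepal.Length_setosa).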
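
Because the return value is an ordinary named list, it can be processed with the usual list tools. The sketch below does not call w2l_split; it builds a stand-in list with data.table's split() method that has the same shape as the Example 1 output (one element per Species). The names parts, parts_df, and combined are illustrative assumptions, not part of mintyr.

```r
library(data.table)

# Stand-in for the Example 1 result: one data.table per Species,
# with the Species column itself dropped from each element.
parts <- split(as.data.table(iris), by = "Species", keep.by = FALSE)

# Apply a summary to every element of the list.
mean_sepal_length <- vapply(parts, function(x) mean(x$Sepal.Length), numeric(1))

# Convert each element to a plain data.frame
# (the kind of element the "df" split_type is documented to return).
parts_df <- lapply(parts, as.data.frame)

# Stack the pieces back together, restoring the group label from the list names.
combined <- rbindlist(parts, idcol = "Species")
```

Accessing a single piece by name, such as parts[["setosa"]], works the same way for data.table and data.frame elements.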

Details

The function melts the columns given in cols2l into long format, then splits the result into a list by variable name and by any grouping variables given in by. The split_type parameter controls whether the list elements are data.table or data.frame objects.

Both cols2l and by parameters accept either column indices or column names, providing flexible ways to specify the columns for transformation and splitting.
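
The melt-then-split flow described above can be approximated directly with data.table. This is a minimal sketch under stated assumptions, not the mintyr implementation: the helper column part_name and the object names long and parts are hypothetical, and the real function may build its names and drop columns differently.

```r
library(data.table)

dt <- as.data.table(iris)

# Step 1: melt the chosen wide columns into variable/value pairs.
# Columns not listed in measure.vars are kept as identifier columns.
long <- melt(
  dt,
  measure.vars  = c("Sepal.Length", "Sepal.Width"),
  variable.name = "variable",
  value.name    = "value"
)

# Step 2: build one name per piece by joining the variable name and the
# group value with the separator (hypothetical helper column).
long[, part_name := paste(variable, Species, sep = "_")]

# Step 3: split on that name, then drop the columns already encoded in it.
parts <- split(long, by = "part_name", keep.by = FALSE)
parts <- lapply(parts, function(x) x[, !c("variable", "Species")])

names(parts)
```

Each element keeps the columns that were not melted (here Petal.Length and Petal.Width) plus value, which is the layout shown in Example 3.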

Note

Numeric indices passed to cols2l or by must be valid column positions in the data (1 to ncol(data)).
When using character names, all specified columns must exist in the data.
The function converts data.frame to data.table if necessary.
The split_type parameter controls whether split data are data.table ("dt") or data.frame ("df") objects.
If split_type is not "dt" or "df", the function will stop with an error.
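
The input rules in this note can be sketched as two small checks. The helpers resolve_cols and check_split_type are hypothetical names introduced only for illustration; they show one way such validation could be written in base R and are not taken from the package source.

```r
# Hypothetical helper: turn a column spec (indices or names) into column names,
# stopping with an error when the spec does not match the data.
resolve_cols <- function(data, cols) {
  if (is.null(cols)) {
    return(NULL)
  }
  if (is.numeric(cols)) {
    bad <- cols[cols < 1 | cols > ncol(data)]
    if (length(bad) > 0) {
      stop("Column index out of range: ", paste(bad, collapse = ", "))
    }
    return(names(data)[cols])
  }
  if (is.character(cols)) {
    missing_cols <- setdiff(cols, names(data))
    if (length(missing_cols) > 0) {
      stop("Column not found: ", paste(missing_cols, collapse = ", "))
    }
    return(cols)
  }
  stop("Columns must be NULL, numeric indices, or character names")
}

# Hypothetical helper: accept only the two documented split_type values.
check_split_type <- function(split_type) {
  ok <- is.character(split_type) && length(split_type) == 1 &&
    split_type %in% c("dt", "df")
  if (!ok) {
    stop("split_type must be \"dt\" or \"df\"")
  }
  invisible(split_type)
}

resolve_cols(iris, 1:3)        # indices resolved to column names
resolve_cols(iris, "Species")  # names are returned unchanged
check_split_type("dt")         # passes silently
```

Resolving indices to names up front means the later melt and split steps only ever deal with one kind of column specification.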

See also

Related functions and packages:

tidytable::group_split(): Split a data frame by groups

Examples

# Example: Wide to long format splitting demonstrations

# Example 1: Basic splitting by Species (cols2l omitted, so no columns are melted)
w2l_split(
  data = iris,                    # Input dataset
  by = "Species"                  # Split by Species column
) |> 
  lapply(head)                    # Show first 6 rows of each split
#> $setosa
#>    Sepal.Length Sepal.Width Petal.Length Petal.Width
#>           <num>       <num>        <num>       <num>
#> 1:          5.1         3.5          1.4         0.2
#> 2:          4.9         3.0          1.4         0.2
#> 3:          4.7         3.2          1.3         0.2
#> 4:          4.6         3.1          1.5         0.2
#> 5:          5.0         3.6          1.4         0.2
#> 6:          5.4         3.9          1.7         0.4
#> 
#> $versicolor
#>    Sepal.Length Sepal.Width Petal.Length Petal.Width
#>           <num>       <num>        <num>       <num>
#> 1:          7.0         3.2          4.7         1.4
#> 2:          6.4         3.2          4.5         1.5
#> 3:          6.9         3.1          4.9         1.5
#> 4:          5.5         2.3          4.0         1.3
#> 5:          6.5         2.8          4.6         1.5
#> 6:          5.7         2.8          4.5         1.3
#> 
#> $virginica
#>    Sepal.Length Sepal.Width Petal.Length Petal.Width
#>           <num>       <num>        <num>       <num>
#> 1:          6.3         3.3          6.0         2.5
#> 2:          5.8         2.7          5.1         1.9
#> 3:          7.1         3.0          5.9         2.1
#> 4:          6.3         2.9          5.6         1.8
#> 5:          6.5         3.0          5.8         2.2
#> 6:          7.6         3.0          6.6         2.1
#> 

# Example 2: Split specific columns using numeric indices
w2l_split(
  data = iris,                    # Input dataset
  cols2l = 1:3,                   # Melt the first 3 columns into long format
  by = 5                          # Split by column index 5 (Species)
) |> 
  lapply(head)                    # Show first 6 rows of each split
#> $Sepal.Length_setosa
#>    Petal.Width value
#>          <num> <num>
#> 1:         0.2   5.1
#> 2:         0.2   4.9
#> 3:         0.2   4.7
#> 4:         0.2   4.6
#> 5:         0.2   5.0
#> 6:         0.4   5.4
#> 
#> $Sepal.Length_versicolor
#>    Petal.Width value
#>          <num> <num>
#> 1:         1.4   7.0
#> 2:         1.5   6.4
#> 3:         1.5   6.9
#> 4:         1.3   5.5
#> 5:         1.5   6.5
#> 6:         1.3   5.7
#> 
#> $Sepal.Length_virginica
#>    Petal.Width value
#>          <num> <num>
#> 1:         2.5   6.3
#> 2:         1.9   5.8
#> 3:         2.1   7.1
#> 4:         1.8   6.3
#> 5:         2.2   6.5
#> 6:         2.1   7.6
#> 
#> $Sepal.Width_setosa
#>    Petal.Width value
#>          <num> <num>
#> 1:         0.2   3.5
#> 2:         0.2   3.0
#> 3:         0.2   3.2
#> 4:         0.2   3.1
#> 5:         0.2   3.6
#> 6:         0.4   3.9
#> 
#> $Sepal.Width_versicolor
#>    Petal.Width value
#>          <num> <num>
#> 1:         1.4   3.2
#> 2:         1.5   3.2
#> 3:         1.5   3.1
#> 4:         1.3   2.3
#> 5:         1.5   2.8
#> 6:         1.3   2.8
#> 
#> $Sepal.Width_virginica
#>    Petal.Width value
#>          <num> <num>
#> 1:         2.5   3.3
#> 2:         1.9   2.7
#> 3:         2.1   3.0
#> 4:         1.8   2.9
#> 5:         2.2   3.0
#> 6:         2.1   3.0
#> 
#> $Petal.Length_setosa
#>    Petal.Width value
#>          <num> <num>
#> 1:         0.2   1.4
#> 2:         0.2   1.4
#> 3:         0.2   1.3
#> 4:         0.2   1.5
#> 5:         0.2   1.4
#> 6:         0.4   1.7
#> 
#> $Petal.Length_versicolor
#>    Petal.Width value
#>          <num> <num>
#> 1:         1.4   4.7
#> 2:         1.5   4.5
#> 3:         1.5   4.9
#> 4:         1.3   4.0
#> 5:         1.5   4.6
#> 6:         1.3   4.5
#> 
#> $Petal.Length_virginica
#>    Petal.Width value
#>          <num> <num>
#> 1:         2.5   6.0
#> 2:         1.9   5.1
#> 3:         2.1   5.9
#> 4:         1.8   5.6
#> 5:         2.2   5.8
#> 6:         2.1   6.6
#> 

# Example 3: Split specific columns using column names
list_res <- w2l_split(
  data = iris,                    # Input dataset
  cols2l = c("Sepal.Length",      # Melt these columns, selected by name
             "Sepal.Width"),
  by = "Species"                  # Split by Species column
)
lapply(list_res, head)            # Show first 6 rows of each split
#> $Sepal.Length_setosa
#>    Petal.Length Petal.Width value
#>           <num>       <num> <num>
#> 1:          1.4         0.2   5.1
#> 2:          1.4         0.2   4.9
#> 3:          1.3         0.2   4.7
#> 4:          1.5         0.2   4.6
#> 5:          1.4         0.2   5.0
#> 6:          1.7         0.4   5.4
#> 
#> $Sepal.Length_versicolor
#>    Petal.Length Petal.Width value
#>           <num>       <num> <num>
#> 1:          4.7         1.4   7.0
#> 2:          4.5         1.5   6.4
#> 3:          4.9         1.5   6.9
#> 4:          4.0         1.3   5.5
#> 5:          4.6         1.5   6.5
#> 6:          4.5         1.3   5.7
#> 
#> $Sepal.Length_virginica
#>    Petal.Length Petal.Width value
#>           <num>       <num> <num>
#> 1:          6.0         2.5   6.3
#> 2:          5.1         1.9   5.8
#> 3:          5.9         2.1   7.1
#> 4:          5.6         1.8   6.3
#> 5:          5.8         2.2   6.5
#> 6:          6.6         2.1   7.6
#> 
#> $Sepal.Width_setosa
#>    Petal.Length Petal.Width value
#>           <num>       <num> <num>
#> 1:          1.4         0.2   3.5
#> 2:          1.4         0.2   3.0
#> 3:          1.3         0.2   3.2
#> 4:          1.5         0.2   3.1
#> 5:          1.4         0.2   3.6
#> 6:          1.7         0.4   3.9
#> 
#> $Sepal.Width_versicolor
#>    Petal.Length Petal.Width value
#>           <num>       <num> <num>
#> 1:          4.7         1.4   3.2
#> 2:          4.5         1.5   3.2
#> 3:          4.9         1.5   3.1
#> 4:          4.0         1.3   2.3
#> 5:          4.6         1.5   2.8
#> 6:          4.5         1.3   2.8
#> 
#> $Sepal.Width_virginica
#>    Petal.Length Petal.Width value
#>           <num>       <num> <num>
#> 1:          6.0         2.5   3.3
#> 2:          5.1         1.9   2.7
#> 3:          5.9         2.1   3.0
#> 4:          5.6         1.8   2.9
#> 5:          5.8         2.2   3.0
#> 6:          6.6         2.1   3.0
#> 
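# Same <variable>_<group> naming as Example 2, but with 6 elements instead of 9;
# Petal.Length is not melted here, so it stays as a column in each element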