The w2l_split function reshapes wide-format data into long format and splits it into a list by variable names and optional grouping columns. It handles both data.frame and data.table objects.
w2l_split(data, cols2l = NULL, by = NULL, split_type = "dt", sep = "_")
data: data.frame or data.table. Input dataset in wide format. Automatically converted to data.table if necessary.
cols2l: numeric or character. Columns to transform from wide to long format. Can be column indices or column names. Default is NULL.
by: numeric or character. Optional grouping columns used to split the data. Can be column indices or column names. Used to create a hierarchical split structure. Default is NULL.
split_type: character. Type of the split data objects in the output. Possible values: "dt" (split data.table objects) or "df" (split data.frame objects). Default is "dt".
sep: character. Separator used when combining split names. Default is "_".
A list of data.table or data.frame objects (depending on split_type), split by variable names and optional grouping columns.
If by is NULL, returns a list split by variable names only.
If by is specified, returns a list split by both variable names and grouping variables.
The function melts the specified wide columns into long format and splits the resulting data into a list based on the variable names and any additional grouping variables specified in by.
The split data can be data.table or data.frame objects, controlled by the split_type parameter.
Both cols2l and by accept either column indices or column names, providing flexible ways to specify the columns for transformation and splitting.
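The melt-then-split behavior described above can be approximated with data.table directly. The following is only a sketch under stated assumptions, not the package's actual source: melt_and_split is a hypothetical name, columns are assumed to be given as character names, and the melted columns are assumed to be called variable and value (the value column matches the example output on this page).

```r
library(data.table)

# Hypothetical re-implementation for illustration only; not the package source.
# Assumes cols2l and by are character column names.
melt_and_split <- function(data, cols2l, by = NULL, sep = "_") {
  dt <- as.data.table(data)          # accepts data.frame or data.table input

  # Stack the wide columns: one row per (original row, melted column) pair.
  long <- melt(
    dt,
    measure.vars  = cols2l,
    variable.name = "variable",
    value.name    = "value"
  )

  # Split keys: the melted variable name first, then any grouping columns.
  keys <- c("variable", by)

  # One label per row, e.g. "Sepal.Length_setosa".
  label <- do.call(
    paste,
    c(lapply(keys, function(k) as.character(long[[k]])), sep = sep)
  )
  label <- factor(label, levels = unique(label))  # keep first-appearance order

  # Drop the key columns from each piece and split by label.
  keep <- setdiff(names(long), keys)
  split(long[, keep, with = FALSE], label)
}

res <- melt_and_split(iris, c("Sepal.Length", "Sepal.Width"), by = "Species")
names(res)
```

Building the labels by hand with paste() is what lets the sep argument control the list names in this sketch. How w2l_split constructs its names internally is not stated in this page.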
Both cols2l and by can be specified using either numeric indices or character column names.
Numeric indices must be valid column positions in the data (1 to ncol(data)).
Character names must all exist in the data.
A data.frame input is converted to data.table if necessary.
The split_type parameter controls whether the split data are data.table ("dt") or data.frame ("df") objects.
If split_type is not "dt" or "df", the function stops with an error.
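The input rules above can be illustrated with a small base R sketch. The helpers resolve_cols and check_split_type are hypothetical names introduced here for illustration; the error messages and exact checks in the package may differ.

```r
# Hypothetical helper (not part of the package): turn indices or names
# into validated column names.
resolve_cols <- function(data, cols) {
  if (is.null(cols)) return(NULL)
  if (is.numeric(cols)) {
    # Indices must be valid positions: 1 to ncol(data).
    if (anyNA(cols) || any(cols < 1 | cols > ncol(data))) {
      stop("Column indices must be between 1 and ", ncol(data), ".")
    }
    return(names(data)[cols])
  }
  if (is.character(cols)) {
    # Every requested name must exist in the data.
    missing_cols <- setdiff(cols, names(data))
    if (length(missing_cols) > 0) {
      stop("Columns not found in data: ",
           paste(missing_cols, collapse = ", "))
    }
    return(cols)
  }
  stop("Columns must be given as numeric indices or character names.")
}

# Hypothetical helper: only "dt" and "df" are accepted.
check_split_type <- function(split_type) {
  if (!is.character(split_type) || length(split_type) != 1 ||
      !split_type %in% c("dt", "df")) {
    stop("split_type must be either \"dt\" or \"df\".")
  }
  invisible(split_type)
}

resolve_cols(iris, 1:2)        # indices resolved to names
resolve_cols(iris, "Species")  # names returned unchanged after the check
```

Resolving indices to names up front is one possible design choice: after melting, column positions shift, so names are the safer handle for later steps.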
Related functions and packages:
tidytable::group_split(): split a data frame by groups
# Example: Wide to long format splitting demonstrations
# Example 1: Basic splitting by Species
w2l_split(
data = iris, # Input dataset
by = "Species" # Split by Species column
) |>
lapply(head) # Show first 6 rows of each split
#> $setosa
#> Sepal.Length Sepal.Width Petal.Length Petal.Width
#> <num> <num> <num> <num>
#> 1: 5.1 3.5 1.4 0.2
#> 2: 4.9 3.0 1.4 0.2
#> 3: 4.7 3.2 1.3 0.2
#> 4: 4.6 3.1 1.5 0.2
#> 5: 5.0 3.6 1.4 0.2
#> 6: 5.4 3.9 1.7 0.4
#>
#> $versicolor
#> Sepal.Length Sepal.Width Petal.Length Petal.Width
#> <num> <num> <num> <num>
#> 1: 7.0 3.2 4.7 1.4
#> 2: 6.4 3.2 4.5 1.5
#> 3: 6.9 3.1 4.9 1.5
#> 4: 5.5 2.3 4.0 1.3
#> 5: 6.5 2.8 4.6 1.5
#> 6: 5.7 2.8 4.5 1.3
#>
#> $virginica
#> Sepal.Length Sepal.Width Petal.Length Petal.Width
#> <num> <num> <num> <num>
#> 1: 6.3 3.3 6.0 2.5
#> 2: 5.8 2.7 5.1 1.9
#> 3: 7.1 3.0 5.9 2.1
#> 4: 6.3 2.9 5.6 1.8
#> 5: 6.5 3.0 5.8 2.2
#> 6: 7.6 3.0 6.6 2.1
#>
# Example 2: Split specific columns using numeric indices
w2l_split(
data = iris, # Input dataset
cols2l = 1:3, # Melt the first 3 columns into long format
by = 5 # Split by column index 5 (Species)
) |>
lapply(head) # Show first 6 rows of each split
#> $Sepal.Length_setosa
#> Petal.Width value
#> <num> <num>
#> 1: 0.2 5.1
#> 2: 0.2 4.9
#> 3: 0.2 4.7
#> 4: 0.2 4.6
#> 5: 0.2 5.0
#> 6: 0.4 5.4
#>
#> $Sepal.Length_versicolor
#> Petal.Width value
#> <num> <num>
#> 1: 1.4 7.0
#> 2: 1.5 6.4
#> 3: 1.5 6.9
#> 4: 1.3 5.5
#> 5: 1.5 6.5
#> 6: 1.3 5.7
#>
#> $Sepal.Length_virginica
#> Petal.Width value
#> <num> <num>
#> 1: 2.5 6.3
#> 2: 1.9 5.8
#> 3: 2.1 7.1
#> 4: 1.8 6.3
#> 5: 2.2 6.5
#> 6: 2.1 7.6
#>
#> $Sepal.Width_setosa
#> Petal.Width value
#> <num> <num>
#> 1: 0.2 3.5
#> 2: 0.2 3.0
#> 3: 0.2 3.2
#> 4: 0.2 3.1
#> 5: 0.2 3.6
#> 6: 0.4 3.9
#>
#> $Sepal.Width_versicolor
#> Petal.Width value
#> <num> <num>
#> 1: 1.4 3.2
#> 2: 1.5 3.2
#> 3: 1.5 3.1
#> 4: 1.3 2.3
#> 5: 1.5 2.8
#> 6: 1.3 2.8
#>
#> $Sepal.Width_virginica
#> Petal.Width value
#> <num> <num>
#> 1: 2.5 3.3
#> 2: 1.9 2.7
#> 3: 2.1 3.0
#> 4: 1.8 2.9
#> 5: 2.2 3.0
#> 6: 2.1 3.0
#>
#> $Petal.Length_setosa
#> Petal.Width value
#> <num> <num>
#> 1: 0.2 1.4
#> 2: 0.2 1.4
#> 3: 0.2 1.3
#> 4: 0.2 1.5
#> 5: 0.2 1.4
#> 6: 0.4 1.7
#>
#> $Petal.Length_versicolor
#> Petal.Width value
#> <num> <num>
#> 1: 1.4 4.7
#> 2: 1.5 4.5
#> 3: 1.5 4.9
#> 4: 1.3 4.0
#> 5: 1.5 4.6
#> 6: 1.3 4.5
#>
#> $Petal.Length_virginica
#> Petal.Width value
#> <num> <num>
#> 1: 2.5 6.0
#> 2: 1.9 5.1
#> 3: 2.1 5.9
#> 4: 1.8 5.6
#> 5: 2.2 5.8
#> 6: 2.1 6.6
#>
# Example 3: Split specific columns using column names
list_res <- w2l_split(
data = iris, # Input dataset
cols2l = c("Sepal.Length", # Select columns by name
"Sepal.Width"),
by = "Species" # Split by Species column
)
lapply(list_res, head) # Show first 6 rows of each split
#> $Sepal.Length_setosa
#> Petal.Length Petal.Width value
#> <num> <num> <num>
#> 1: 1.4 0.2 5.1
#> 2: 1.4 0.2 4.9
#> 3: 1.3 0.2 4.7
#> 4: 1.5 0.2 4.6
#> 5: 1.4 0.2 5.0
#> 6: 1.7 0.4 5.4
#>
#> $Sepal.Length_versicolor
#> Petal.Length Petal.Width value
#> <num> <num> <num>
#> 1: 4.7 1.4 7.0
#> 2: 4.5 1.5 6.4
#> 3: 4.9 1.5 6.9
#> 4: 4.0 1.3 5.5
#> 5: 4.6 1.5 6.5
#> 6: 4.5 1.3 5.7
#>
#> $Sepal.Length_virginica
#> Petal.Length Petal.Width value
#> <num> <num> <num>
#> 1: 6.0 2.5 6.3
#> 2: 5.1 1.9 5.8
#> 3: 5.9 2.1 7.1
#> 4: 5.6 1.8 6.3
#> 5: 5.8 2.2 6.5
#> 6: 6.6 2.1 7.6
#>
#> $Sepal.Width_setosa
#> Petal.Length Petal.Width value
#> <num> <num> <num>
#> 1: 1.4 0.2 3.5
#> 2: 1.4 0.2 3.0
#> 3: 1.3 0.2 3.2
#> 4: 1.5 0.2 3.1
#> 5: 1.4 0.2 3.6
#> 6: 1.7 0.4 3.9
#>
#> $Sepal.Width_versicolor
#> Petal.Length Petal.Width value
#> <num> <num> <num>
#> 1: 4.7 1.4 3.2
#> 2: 4.5 1.5 3.2
#> 3: 4.9 1.5 3.1
#> 4: 4.0 1.3 2.3
#> 5: 4.6 1.5 2.8
#> 6: 4.5 1.3 2.8
#>
#> $Sepal.Width_virginica
#> Petal.Length Petal.Width value
#> <num> <num> <num>
#> 1: 6.0 2.5 3.3
#> 2: 5.1 1.9 2.7
#> 3: 5.9 2.1 3.0
#> 4: 5.6 1.8 2.9
#> 5: 5.8 2.2 3.0
#> 6: 6.6 2.1 3.0
#>
# Returns similar structure to Example 2