A robust and flexible function for importing data from one or multiple
XLSX files, offering comprehensive options for sheet selection,
data combination, and source tracking.
import_xlsx(file, rbind = TRUE, sheet = NULL, ...)A character vector of file paths to Excel files.
Must point to existing .xlsx or .xls files.
A logical value controlling data combination strategy:
TRUE: Combines all data into a single data.table
FALSE: Returns a list of data.tables
Default is TRUE.
A numeric vector or NULL specifying sheet import strategy:
NULL (default): Imports all sheets
numeric: Imports only specified sheet indices
Additional arguments passed to readxl::read_excel(),
such as col_types, skip, or na.
Depends on the rbind parameter:
If rbind = TRUE: A single data.table with additional tracking columns:
- excel_name: Source file name (without extension)
- sheet_name: Source sheet name
If rbind = FALSE: A named list of data.tables with format
"filename_sheetname"
The function provides a comprehensive solution for importing Excel data with the following features:
Supports multiple files and sheets
Automatic source tracking for files and sheets
Flexible combining options
Handles missing columns across sheets when combining
Preserves original data types through readxl
Critical Import Considerations:
Requires all specified files to be accessible Excel files
Sheet indices must be valid across input files
rbind = TRUE assumes compatible data structures
Missing columns are automatically filled with NA
File extensions are automatically removed in tracking columns
readxl::read_excel() for underlying Excel reading
data.table::rbindlist() for data combination
# Example: Excel file import demonstrations
# Setup test files
xlsx_files <- mintyr_example(
mintyr_examples("xlsx_test") # Get example Excel files
)
# Example 1: Import and combine all sheets from all files
import_xlsx(
xlsx_files, # Input Excel file paths
rbind = TRUE # Combine all sheets into one data.table
)
#> excel_name sheet_name col1 col2 col3
#> <char> <char> <num> <char> <lgcl>
#> 1: xlsx_test1 Sheet1 4 d FALSE
#> 2: xlsx_test1 Sheet1 5 f TRUE
#> 3: xlsx_test1 Sheet1 6 e TRUE
#> 4: xlsx_test1 Sheet2 1 a TRUE
#> 5: xlsx_test1 Sheet2 2 b FALSE
#> 6: xlsx_test1 Sheet2 3 c TRUE
#> 7: xlsx_test2 Sheet1 15 o FALSE
#> 8: xlsx_test2 Sheet1 16 p TRUE
#> 9: xlsx_test2 Sheet1 17 q FALSE
#> 10: xlsx_test2 a 7 g FALSE
#> 11: xlsx_test2 a 9 h TRUE
#> 12: xlsx_test2 a 8 i FALSE
#> 13: xlsx_test2 b 10 J FALSE
#> 14: xlsx_test2 b 11 K TRUE
#> 15: xlsx_test2 b 12 L FALSE
# Example 2: Import specific sheets separately
import_xlsx(
xlsx_files, # Input Excel file paths
rbind = FALSE, # Keep sheets as separate data.tables
sheet = 2 # Only import first sheet
)
#> $xlsx_test1_Sheet2
#> col1 col2 col3
#> <num> <char> <lgcl>
#> 1: 1 a TRUE
#> 2: 2 b FALSE
#> 3: 3 c TRUE
#>
#> $xlsx_test2_a
#> col1 col2 col3
#> <num> <char> <lgcl>
#> 1: 7 g FALSE
#> 2: 9 h TRUE
#> 3: 8 i FALSE
#>