A robust and flexible function for importing data from one or multiple
XLSX
files, offering comprehensive options for sheet selection,
data combination, and source tracking.
import_xlsx(file, rbind = TRUE, sheet = NULL, ...)
A character
vector of file paths to Excel
files.
Must point to existing .xlsx
or .xls
files.
A logical
value controlling data combination strategy:
TRUE
: Combines all data into a single data.table
FALSE
: Returns a list of data.table
s
Default is TRUE
.
A numeric
vector or NULL
specifying sheet import strategy:
NULL
(default): Imports all sheets
numeric
: Imports only specified sheet indices
Additional arguments passed to readxl::read_excel()
,
such as col_types
, skip
, or na
.
Depends on the rbind
parameter:
If rbind = TRUE
: A single data.table
with additional tracking columns:
- excel_name
: Source file name (without extension)
- sheet_name
: Source sheet name
If rbind = FALSE
: A named list of data.table
s with format
"filename_sheetname"
The function provides a comprehensive solution for importing Excel data with the following features:
Supports multiple files and sheets
Automatic source tracking for files and sheets
Flexible combining options
Handles missing columns across sheets when combining
Preserves original data types through readxl
Critical Import Considerations:
Requires all specified files to be accessible Excel
files
Sheet indices must be valid across input files
rbind = TRUE
assumes compatible data structures
Missing columns are automatically filled with NA
File extensions are automatically removed in tracking columns
readxl::read_excel()
for underlying Excel reading
data.table::rbindlist()
for data combination
# Example: Excel file import demonstrations
# Setup test files
xlsx_files <- mintyr_example(
mintyr_examples("xlsx_test") # Get example Excel files
)
# Example 1: Import and combine all sheets from all files
import_xlsx(
xlsx_files, # Input Excel file paths
rbind = TRUE # Combine all sheets into one data.table
)
#> excel_name sheet_name col1 col2 col3
#> <char> <char> <num> <char> <lgcl>
#> 1: xlsx_test1 Sheet1 4 d FALSE
#> 2: xlsx_test1 Sheet1 5 f TRUE
#> 3: xlsx_test1 Sheet1 6 e TRUE
#> 4: xlsx_test1 Sheet2 1 a TRUE
#> 5: xlsx_test1 Sheet2 2 b FALSE
#> 6: xlsx_test1 Sheet2 3 c TRUE
#> 7: xlsx_test2 Sheet1 15 o FALSE
#> 8: xlsx_test2 Sheet1 16 p TRUE
#> 9: xlsx_test2 Sheet1 17 q FALSE
#> 10: xlsx_test2 a 7 g FALSE
#> 11: xlsx_test2 a 9 h TRUE
#> 12: xlsx_test2 a 8 i FALSE
#> 13: xlsx_test2 b 10 J FALSE
#> 14: xlsx_test2 b 11 K TRUE
#> 15: xlsx_test2 b 12 L FALSE
# Example 2: Import specific sheets separately
import_xlsx(
xlsx_files, # Input Excel file paths
rbind = FALSE, # Keep sheets as separate data.tables
sheet = 2 # Only import first sheet
)
#> $xlsx_test1_Sheet2
#> col1 col2 col3
#> <num> <char> <lgcl>
#> 1: 1 a TRUE
#> 2: 2 b FALSE
#> 3: 3 c TRUE
#>
#> $xlsx_test2_a
#> col1 col2 col3
#> <num> <char> <lgcl>
#> 1: 7 g FALSE
#> 2: 9 h TRUE
#> 3: 8 i FALSE
#>