A robust and flexible function for importing data from one or multiple XLSX files, offering comprehensive options for sheet selection, data combination, and source tracking.

import_xlsx(file, rbind = TRUE, sheet = NULL, ...)

Arguments

file

A character vector of file paths to Excel files. Must point to existing .xlsx or .xls files.

rbind

A logical value controlling data combination strategy:

  • TRUE: Combines all data into a single data.table

  • FALSE: Returns a list of data.tables Default is TRUE.

sheet

A numeric vector or NULL specifying sheet import strategy:

  • NULL (default): Imports all sheets

  • numeric: Imports only specified sheet indices

...

Additional arguments passed to readxl::read_excel(), such as col_types, skip, or na.

Value

Depends on the rbind parameter:

  • If rbind = TRUE: A single data.table with additional tracking columns: - excel_name: Source file name (without extension) - sheet_name: Source sheet name

  • If rbind = FALSE: A named list of data.tables with format "filename_sheetname"

Details

The function provides a comprehensive solution for importing Excel data with the following features:

  • Supports multiple files and sheets

  • Automatic source tracking for files and sheets

  • Flexible combining options

  • Handles missing columns across sheets when combining

  • Preserves original data types through readxl

Note

Critical Import Considerations:

  • Requires all specified files to be accessible Excel files

  • Sheet indices must be valid across input files

  • rbind = TRUE assumes compatible data structures

  • Missing columns are automatically filled with NA

  • File extensions are automatically removed in tracking columns

See also

Examples

# Example: Excel file import demonstrations

# Setup test files
xlsx_files <- mintyr_example(
  mintyr_examples("xlsx_test")    # Get example Excel files
)

# Example 1: Import and combine all sheets from all files
import_xlsx(
  xlsx_files,                     # Input Excel file paths
  rbind = TRUE                    # Combine all sheets into one data.table
)
#>     excel_name sheet_name  col1   col2   col3
#>         <char>     <char> <num> <char> <lgcl>
#>  1: xlsx_test1     Sheet1     4      d  FALSE
#>  2: xlsx_test1     Sheet1     5      f   TRUE
#>  3: xlsx_test1     Sheet1     6      e   TRUE
#>  4: xlsx_test1     Sheet2     1      a   TRUE
#>  5: xlsx_test1     Sheet2     2      b  FALSE
#>  6: xlsx_test1     Sheet2     3      c   TRUE
#>  7: xlsx_test2     Sheet1    15      o  FALSE
#>  8: xlsx_test2     Sheet1    16      p   TRUE
#>  9: xlsx_test2     Sheet1    17      q  FALSE
#> 10: xlsx_test2          a     7      g  FALSE
#> 11: xlsx_test2          a     9      h   TRUE
#> 12: xlsx_test2          a     8      i  FALSE
#> 13: xlsx_test2          b    10      J  FALSE
#> 14: xlsx_test2          b    11      K   TRUE
#> 15: xlsx_test2          b    12      L  FALSE

# Example 2: Import specific sheets separately
import_xlsx(
  xlsx_files,                     # Input Excel file paths
  rbind = FALSE,                  # Keep sheets as separate data.tables
  sheet = 2                       # Only import first sheet
)
#> $xlsx_test1_Sheet2
#>     col1   col2   col3
#>    <num> <char> <lgcl>
#> 1:     1      a   TRUE
#> 2:     2      b  FALSE
#> 3:     3      c   TRUE
#> 
#> $xlsx_test2_a
#>     col1   col2   col3
#>    <num> <char> <lgcl>
#> 1:     7      g  FALSE
#> 2:     9      h   TRUE
#> 3:     8      i  FALSE
#>