Imports one or more CSV or TXT files through either the data.table or the arrow package, then either combines them into a single data object or returns them as a list.

import_csv(
  file,
  package = "data.table",
  rbind = TRUE,
  rbind_label = "_file",
  full_path = FALSE,
  keep_ext = FALSE,
  ...
)

Arguments

file

A character vector of paths to CSV or TXT files. Every path must point to an existing, readable file.

package

A character string specifying the backend package: "data.table" (default) or "arrow".

rbind

A logical value controlling data combination strategy:

  • TRUE: Combines all files into a single data object (default)

  • FALSE: Returns a list of individual data objects

rbind_label

A character string or NULL for source file tracking:

  • character: Specifies the column name for file source labeling (default: "_file")

  • NULL: Disables source file tracking

full_path

A logical value controlling path display in file labels:

  • TRUE: Uses full file path

  • FALSE: Uses only filename (default)

keep_ext

A logical value controlling file extension in labels:

  • TRUE: Retains file extension (e.g., .csv)

  • FALSE: Removes file extension (default)

...

Additional arguments passed to the reading function of the chosen backend (e.g., na.strings or skip with data.table, col_types with arrow).
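As a rough illustration of how arguments supplied through ... reach the reader, the sketch below uses a hypothetical wrapper, read_many(), with base R's read.csv() standing in for the real backends. Neither the wrapper nor the stand-in is taken from the package source; only the forwarding pattern is the point.

```r
# Hypothetical wrapper (an assumption for illustration, not the package's code).
# read.csv() stands in for the backend reader; `...` is forwarded unchanged.
read_many <- function(files, ...) {
  lapply(files, function(f) utils::read.csv(f, ...))
}

# Write a small temporary file in which "-" marks a missing value
tmp <- tempfile(fileext = ".csv")
writeLines(c("col1,col2", "1,-", "2,b"), tmp)

# na.strings is not an argument of read_many(); it travels through `...`
res <- read_many(tmp, na.strings = "-")
res[[1]]$col2
```

Arguments that the chosen reader does not recognise would typically raise an error in that reader, so the set of valid names depends on the backend.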

Value

Depends on the rbind parameter:

  • If rbind = TRUE: A single data object (of the type produced by the chosen package) containing all imported data, with the source file recorded in the column named by rbind_label

  • If rbind = FALSE: A named list of data objects with names derived from input file paths based on full_path and keep_ext settings

Details

The function provides a unified interface for reading CSV files using either the data.table or the arrow package. When reading multiple files, it can either combine them into a single data object or return them as a list. File source tracking is supported through the rbind_label parameter.

File labeling behavior is controlled by full_path and keep_ext parameters:

  • full_path = FALSE, keep_ext = FALSE: Filename without extension (e.g., "data")

  • full_path = FALSE, keep_ext = TRUE: Filename with extension (e.g., "data.csv")

  • full_path = TRUE, keep_ext = FALSE: Full path without extension (e.g., "/path/to/data")

  • full_path = TRUE, keep_ext = TRUE: Full path with extension (e.g., "/path/to/data.csv")
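The four combinations above can be reproduced with base R alone. The helper below, make_label(), is hypothetical and written only to illustrate the rules; it is a sketch, not the package's implementation.

```r
# Hypothetical helper illustrating the labeling rules; not part of the package.
make_label <- function(path, full_path = FALSE, keep_ext = FALSE) {
  # Keep the directory part only when full_path = TRUE
  label <- if (full_path) path else basename(path)
  # Drop the trailing extension unless keep_ext = TRUE
  if (!keep_ext) label <- tools::file_path_sans_ext(label)
  label
}

make_label("/path/to/data.csv")                                     # "data"
make_label("/path/to/data.csv", keep_ext = TRUE)                    # "data.csv"
make_label("/path/to/data.csv", full_path = TRUE)                   # "/path/to/data"
make_label("/path/to/data.csv", full_path = TRUE, keep_ext = TRUE)  # "/path/to/data.csv"
```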

Note

Critical Import Considerations:

  • Requires all specified files to be accessible CSV/TXT files

  • Supports flexible backend selection via package parameter

  • rbind = TRUE assumes compatible data structures across files

  • Missing columns are automatically aligned when combining data

  • File labeling is customizable through full_path and keep_ext parameters
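To make the column-alignment point concrete, the sketch below combines two toy tables with data.table::rbindlist(), using idcol for the source label and fill = TRUE to pad a missing column. It assumes the combination step behaves similarly; the toy tables and their names are invented for illustration.

```r
library(data.table)

# Two toy tables standing in for two imported files; the second lacks col2
tables <- list(
  file_a = data.table(col1 = 1:2, col2 = c("x", "y")),
  file_b = data.table(col1 = 3L)
)

# idcol adds a source column taken from the list names;
# fill = TRUE matches columns by name and pads missing ones with NA
combined <- rbindlist(tables, idcol = "_file", fill = TRUE)
print(combined)
```

In this sketch the row from file_b receives NA in col2, so files with differing columns can still be stacked, although incompatible column types may still need attention.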

Examples

# Example: CSV file import demonstrations

# Setup test files
csv_files <- mintyr_example(
  mintyr_examples("csv_test")     # Get example CSV files
)

# Example 1: Import and combine CSV files using data.table
import_csv(
  csv_files,                      # Input CSV file paths
  package = "data.table",         # Use data.table for reading
  rbind = TRUE,                   # Combine all files into one data.table
  rbind_label = "_file",          # Column name for file source
  keep_ext = TRUE,                # Include .csv extension in _file column
  full_path = TRUE                # Show complete file paths in _file column
)
#>                                                           _file  col1   col2
#>                                                          <char> <int> <char>
#> 1: /home/runner/work/_temp/Library/mintyr/extdata/csv_test1.csv     4      d
#> 2: /home/runner/work/_temp/Library/mintyr/extdata/csv_test1.csv     5      f
#> 3: /home/runner/work/_temp/Library/mintyr/extdata/csv_test1.csv     6      e
#> 4: /home/runner/work/_temp/Library/mintyr/extdata/csv_test2.csv    15      o
#> 5: /home/runner/work/_temp/Library/mintyr/extdata/csv_test2.csv    16      p
#> 6: /home/runner/work/_temp/Library/mintyr/extdata/csv_test2.csv    17      q
#>      col3
#>    <lgcl>
#> 1:  FALSE
#> 2:   TRUE
#> 3:   TRUE
#> 4:  FALSE
#> 5:   TRUE
#> 6:  FALSE

# Example 2: Import files separately using arrow
import_csv(
  csv_files,                      # Input CSV file paths
  package = "arrow",              # Use arrow for reading
  rbind = FALSE                   # Keep files as separate data objects
)
#> $csv_test1
#> # A tibble: 3 × 3
#>    col1 col2  col3 
#>   <int> <chr> <lgl>
#> 1     4 d     FALSE
#> 2     5 f     TRUE 
#> 3     6 e     TRUE 
#> 
#> $csv_test2
#> # A tibble: 3 × 3
#>    col1 col2  col3 
#>   <int> <chr> <lgl>
#> 1    15 o     FALSE
#> 2    16 p     TRUE 
#> 3    17 q     FALSE
#>