Imports one or more CSV or TXT files using either the data.table or the arrow package
as the reader, and optionally combines the results into a single data object.
import_csv(
file,
package = "data.table",
rbind = TRUE,
rbind_label = "_file",
full_path = FALSE,
keep_ext = FALSE,
...
)
file
A character vector of file paths to CSV or TXT files.
Must point to existing and accessible files.
package
A character string specifying the backend package used to read the files:
"data.table": Uses data.table::fread() (default)
"arrow": Uses arrow::read_csv_arrow()
rbind
A logical value controlling data combination strategy:
TRUE: Combines all files into a single data object (default)
FALSE: Returns a list of individual data objects
rbind_label
A character string or NULL for source file tracking:
character: Specifies the column name for file source labeling (default: "_file")
NULL: Disables source file tracking
full_path
A logical value controlling path display in file labels:
TRUE: Uses full file path
FALSE: Uses only filename (default)
keep_ext
A logical value controlling file extension in labels:
TRUE: Retains file extension (e.g., .csv)
FALSE: Removes file extension (default)
...
Additional arguments passed to the backend reading function
(e.g., na.strings or skip for data.table::fread(), col_types for arrow::read_csv_arrow()).
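As an illustration of how extra arguments reach the reader, the following sketch forwards `...` to data.table::fread(). The wrapper name read_one and the sample file are hypothetical and are not part of mintyr; the actual forwarding inside import_csv() may differ.

```r
library(data.table)

# Hypothetical wrapper (not mintyr's implementation): pass extra
# arguments straight through to the backend reader.
read_one <- function(path, ...) fread(path, ...)

# A small temporary file with a leading junk line and a custom NA marker
tmp <- tempfile(fileext = ".csv")
writeLines(c("exported by some tool", "id,score", "1,10", "2,missing"), tmp)

# skip and na.strings are fread() arguments supplied via `...`
dt <- read_one(tmp, skip = 1, na.strings = "missing")
print(dt)
```

With the arrow backend the same idea applies, but the argument names must be those accepted by arrow::read_csv_arrow().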
Depends on the rbind parameter:
If rbind = TRUE: A single data object (of the class produced by the chosen package)
containing all imported data, with the source file recorded in the column named by rbind_label
If rbind = FALSE: A named list of data objects, with names
derived from the input file paths according to the full_path and keep_ext settings
The function provides a unified interface for reading CSV files using either the data.table
or the arrow package. When reading multiple files, it can either combine them into a single
data object or return them as a list. File source tracking is supported through the
rbind_label parameter.
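The two return shapes described above can be approximated with data.table alone. This is a minimal sketch under assumptions: the temporary files, their contents, and the variable names are made up for illustration, and import_csv() may build its result differently.

```r
library(data.table)

# Two small temporary CSV files standing in for real input
tmp_dir <- tempfile("csv_demo_")
dir.create(tmp_dir)
paths <- file.path(tmp_dir, c("a.csv", "b.csv"))
fwrite(data.table(x = 1:2, y = c("p", "q")), paths[1])
fwrite(data.table(x = 3:4, y = c("r", "s")), paths[2])

# List form (analogous to rbind = FALSE): one table per file, named by file
tables <- lapply(paths, fread)
names(tables) <- tools::file_path_sans_ext(basename(paths))

# Combined form (analogous to rbind = TRUE): idcol stores each list name
# in a source column, here called "_file" like the default rbind_label
combined <- rbindlist(tables, idcol = "_file")
print(combined)
```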
File labeling behavior is controlled by full_path and keep_ext parameters:
full_path = FALSE, keep_ext = FALSE: Filename without extension (e.g., "data")
full_path = FALSE, keep_ext = TRUE: Filename with extension (e.g., "data.csv")
full_path = TRUE, keep_ext = FALSE: Full path without extension (e.g., "/path/to/data")
full_path = TRUE, keep_ext = TRUE: Full path with extension (e.g., "/path/to/data.csv")
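The four combinations above can be reproduced with base R path helpers. The function make_label below is a hypothetical illustration of the labeling rule, not the code mintyr uses internally.

```r
# Hypothetical helper (not part of mintyr): derive a label from a file path
make_label <- function(path, full_path = FALSE, keep_ext = FALSE) {
  # Keep the directory part only when full_path = TRUE
  label <- if (full_path) path else basename(path)
  # Drop the trailing extension unless keep_ext = TRUE
  if (!keep_ext) label <- tools::file_path_sans_ext(label)
  label
}

path <- "/path/to/data.csv"
make_label(path)                                     # "data"
make_label(path, keep_ext = TRUE)                    # "data.csv"
make_label(path, full_path = TRUE)                   # "/path/to/data"
make_label(path, full_path = TRUE, keep_ext = TRUE)  # "/path/to/data.csv"
```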
Critical Import Considerations:
Requires all specified files to be accessible CSV/TXT files
Supports flexible backend selection via package parameter
rbind = TRUE assumes compatible data structures across files
Missing columns are automatically aligned when combining data
File labeling is customizable through full_path and keep_ext parameters
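To show what aligning missing columns can look like, here is a sketch using data.table::rbindlist() with use.names = TRUE and fill = TRUE. Whether import_csv() passes exactly these arguments is an assumption; the tables and list names are invented for the example.

```r
library(data.table)

first  <- data.table(col1 = 1:2, col2 = c("a", "b"))
second <- data.table(col1 = 3L, col3 = TRUE)  # lacks col2, adds col3

# use.names matches columns by name; fill pads absent columns with NA;
# idcol records which list element each row came from
combined <- rbindlist(
  list(f1 = first, f2 = second),
  use.names = TRUE, fill = TRUE, idcol = "_file"
)
print(combined)
```

Rows from f1 get NA in col3 and the row from f2 gets NA in col2, so the combined table carries the union of all columns.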
data.table::fread() for data.table backend
arrow::read_csv_arrow() for arrow backend
data.table::rbindlist() for data combination
# Example: CSV file import demonstrations
# Setup test files
csv_files <- mintyr_example(
mintyr_examples("csv_test") # Get example CSV files
)
# Example 1: Import and combine CSV files using data.table
import_csv(
csv_files, # Input CSV file paths
package = "data.table", # Use data.table for reading
rbind = TRUE, # Combine all files into one data.table
rbind_label = "_file", # Column name for file source
keep_ext = TRUE, # Include .csv extension in _file column
full_path = TRUE # Show complete file paths in _file column
)
#> _file col1 col2
#> <char> <int> <char>
#> 1: /home/runner/work/_temp/Library/mintyr/extdata/csv_test1.csv 4 d
#> 2: /home/runner/work/_temp/Library/mintyr/extdata/csv_test1.csv 5 f
#> 3: /home/runner/work/_temp/Library/mintyr/extdata/csv_test1.csv 6 e
#> 4: /home/runner/work/_temp/Library/mintyr/extdata/csv_test2.csv 15 o
#> 5: /home/runner/work/_temp/Library/mintyr/extdata/csv_test2.csv 16 p
#> 6: /home/runner/work/_temp/Library/mintyr/extdata/csv_test2.csv 17 q
#> col3
#> <lgcl>
#> 1: FALSE
#> 2: TRUE
#> 3: TRUE
#> 4: FALSE
#> 5: TRUE
#> 6: FALSE
# Example 2: Import files separately using arrow
import_csv(
csv_files, # Input CSV file paths
package = "arrow", # Use arrow for reading
rbind = FALSE # Keep files as separate data objects (tibbles with arrow)
)
#> $csv_test1
#> # A tibble: 3 × 3
#> col1 col2 col3
#> <int> <chr> <lgl>
#> 1 4 d FALSE
#> 2 5 f TRUE
#> 3 6 e TRUE
#>
#> $csv_test2
#> # A tibble: 3 × 3
#> col1 col2 col3
#> <int> <chr> <lgl>
#> 1 15 o FALSE
#> 2 16 p TRUE
#> 3 17 q FALSE
#>