get_path_info is a merged, upgraded replacement for get_path_segment and
get_filename. It operates in two modes:
Mode A (when n is specified): Extract a specific path segment by
position, supporting forward indexing, reverse indexing, and range extraction.
Mode B (when n = NULL): Extract the filename, with optional removal
of the file extension and/or the directory prefix.
get_path_info(paths, n = NULL, rm_extension = TRUE, rm_path = TRUE)
paths: A character vector of file system paths.
Supports mixed separators (/ and \\) and Windows drive letters (e.g. C:).
n: A numeric segment index. Defaults to NULL, which selects Mode B (filename mode).
Positive integer: forward index from the path start; 1 = first segment.
Negative integer: reverse index from the path end; -1 = last segment
(i.e. the filename segment).
Length-2 vector: extract a contiguous range, e.g. c(2, 4) or c(-3, -1).
0 is not allowed.
rm_extension: A logical(1) flag controlling extension removal.
Defaults to TRUE.
In Mode B (n = NULL): the flag is always honoured.
In Mode A: only applied when n == -1 (explicitly targeting the filename
segment). Has no effect for intermediate directory segments (e.g. n = 2).
rm_path: A logical(1) flag controlling whether the directory prefix is
stripped, keeping only the filename. Defaults to TRUE.
Only applies in Mode B (n = NULL); ignored when n is specified.
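The indexing rules for n can be illustrated with a small sketch. This is not the package's implementation: pick_segments is a hypothetical helper that operates on the already-split segments of a single path, and returning NA for a reversed or partly out-of-range range is an assumption made for the example.

```r
# Hypothetical helper (not part of the package): select one segment, or a
# contiguous range of segments, from the split segments of a single path.
pick_segments <- function(segs, n) {
  stopifnot(is.numeric(n), length(n) %in% c(1L, 2L), !anyNA(n), all(n != 0))
  depth <- length(segs)
  # Map negative indices to positions counted from the end (-1 -> depth).
  idx <- ifelse(n < 0, depth + n + 1, n)
  # A position outside 1..depth means the segment does not exist.
  if (any(idx < 1 | idx > depth)) return(NA_character_)
  if (length(idx) == 2L) {
    # Assumption: a reversed range yields NA rather than a reversed path.
    if (idx[1] > idx[2]) return(NA_character_)
    idx <- seq(idx[1], idx[2])
  }
  paste(segs[idx], collapse = "/")
}

segs <- c("Users", "foo", "Documents", "report.xlsx")
pick_segments(segs, 2)          # "foo"
pick_segments(segs, -1)         # "report.xlsx"
pick_segments(segs, c(2, 3))    # "foo/Documents"
pick_segments(segs, c(-3, -1))  # "foo/Documents/report.xlsx"
pick_segments(segs, 5)          # NA
```

Converting negative indices to forward positions first lets the single-index and range cases share the same bounds check.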
Return value: a character vector of the same length as paths.
Returns the extracted segment string when the segment exists.
Returns NA_character_ when the segment index exceeds the path depth,
the input element is NA, or the path reduces to empty after normalisation
(e.g. "C:/", "/").
Path normalisation (internal, fully vectorised):
Backslashes are converted to / and runs of consecutive slashes are collapsed to a single /.
Windows drive letter prefixes (C:, D:, etc.) are stripped.
Leading and trailing / characters are removed.
Paths that are empty after the above steps (e.g. original inputs "C:/",
"/", "") are coerced to NA_character_.
Extension-stripping behaviour (internal .strip_ext helper):
| Input | Output | Notes |
|---|---|---|
| "report.txt" | "report" | Standard file: last extension removed |
| "data.tar.gz" | "data.tar" | Compound extension: only the last level removed |
| ".bashrc" | ".bashrc" | Pure dot-file (no second dot): unchanged |
| ".report.xlsx" | ".report" | Dot-file with extension: extension removed |
| "no_ext" | "no_ext" | No extension: returned as-is |
| "file." | "file." | Trailing isolated dot: returned as-is |
NA safety:
strsplit(NA_character_, ...) returns a list whose single element is
NA_character_ (a vector of length 1), not character(0). Consequently, every
vapply callback guards against NA paths with an explicit anyNA(x) check; a
length(x) == 0 check alone would not catch them.
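The difference between the two guards can be seen in a short base R demonstration. The callbacks below are illustrative only and are not taken from the package.

```r
parts <- strsplit(c("a/b/c", NA_character_), "/", fixed = TRUE)
lengths(parts)  # 3 1 -- the NA element has length 1, not 0

# A length-only guard lets the NA element through, and paste() then
# converts it to the literal string "NA".
unsafe <- vapply(parts, function(x) {
  if (length(x) == 0L) return(NA_character_)
  paste(x, collapse = "/")
}, character(1))

# An anyNA() guard returns a real missing value instead.
safe <- vapply(parts, function(x) {
  if (anyNA(x)) return(NA_character_)
  paste(x, collapse = "/")
}, character(1))

unsafe  # "a/b/c" "NA"
safe    # "a/b/c" NA
```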
paths <- c("C:/Users/foo/Documents/report.xlsx",
           "/home/user/.bashrc",
           "relative/path/to/data.csv",
           ".hidden.tar.gz",
           NA_character_)
# Mode B: filename only, extension stripped (default)
get_path_info(paths)
#> [1] "report" ".bashrc" "data" ".hidden.tar" NA
# Mode B: filename only, extension preserved
get_path_info(paths, rm_extension = FALSE)
#> [1] "report.xlsx" ".bashrc" "data.csv" ".hidden.tar.gz"
#> [5] NA
# Mode B: full normalised path, extension stripped
get_path_info(paths, rm_path = FALSE)
#> [1] "Users/foo/Documents/report" "home/user/.bashrc"
#> [3] "relative/path/to/data" ".hidden.tar"
#> [5] NA
# Mode A: extract the 2nd path segment
get_path_info(paths, n = 2)
#> [1] "foo" "user" "path" NA NA
# Mode A: extract the last segment; rm_extension takes effect because n = -1
get_path_info(paths, n = -1, rm_extension = TRUE)
#> [1] "report" ".bashrc" "data" ".hidden.tar" NA
# Mode A: range extraction
get_path_info(paths, n = c(2, 3))
#> [1] "foo/Documents" "user/.bashrc" "path/to" NA
#> [5] NA