get_path_info is a merged, upgraded replacement for get_path_segment and get_filename. It operates in two modes:

  • Mode A (when n is specified): Extract a specific path segment by position, supporting forward indexing, reverse indexing, and range extraction.

  • Mode B (when n = NULL): Extract the filename, with optional removal of the file extension and/or the directory prefix.

get_path_info(paths, n = NULL, rm_extension = TRUE, rm_path = TRUE)

Arguments

paths

A character vector of file system paths. Supports mixed separators (/ and \\) and Windows drive letters (e.g. C:).

n

A numeric segment index. Defaults to NULL (enters filename mode).

  • Positive integer: forward index from the path start; 1 = first segment.

  • Negative integer: reverse index from the path end; -1 = last segment (i.e. the filename segment).

  • Length-2 vector: extract a contiguous range of segments, e.g. c(2, 4) or c(-3, -1). The selected segments are returned joined by /.

  • 0 is not allowed.
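The indexing rules above can be sketched on a single, already-split path. This is a minimal illustration, not the package's internal code: pick_segments is a hypothetical helper, and returning NA when either end of a range falls outside the path is an assumption the documentation does not confirm.

```r
# Hypothetical helper: select segment(s) from one already-split path.
pick_segments <- function(segs, n) {
  stopifnot(is.numeric(n), length(n) %in% 1:2, !any(n == 0))
  if (anyNA(segs)) return(NA_character_)      # NA path stays NA
  len <- length(segs)
  idx <- ifelse(n < 0, len + n + 1, n)        # map negative indices to positions
  # Assumption: any out-of-range index yields NA, including range endpoints.
  if (any(idx < 1 | idx > len)) return(NA_character_)
  if (length(idx) == 2) idx <- seq(idx[1], idx[2])
  paste(segs[idx], collapse = "/")
}

segs <- c("Users", "foo", "Documents", "report.xlsx")
pick_segments(segs, 2)          # "foo"
pick_segments(segs, -1)         # "report.xlsx"
pick_segments(segs, c(-3, -1))  # "foo/Documents/report.xlsx"
pick_segments(segs, 5)          # NA
```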

rm_extension

A logical(1) flag controlling extension removal. Defaults to TRUE.

  • In Mode B (n = NULL): always honoured; the extension is removed whenever rm_extension = TRUE.

  • In Mode A: only applied when n == -1 (explicitly targeting the filename segment). Has no effect for intermediate directory segments (e.g. n = 2).

rm_path

A logical(1) flag controlling whether the directory prefix is stripped, keeping only the filename. Defaults to TRUE. Only applies in Mode B (n = NULL); ignored when n is specified.

Value

A character vector of the same length as paths:

  • Returns the extracted segment string when the segment exists.

  • Returns NA_character_ when the segment index exceeds the path depth, the input element is NA, or the path reduces to empty after normalisation (e.g. "C:/", "/").

Details

Path normalisation (internal, fully vectorised):

  1. Backslashes are converted to /, and any run of consecutive slashes is collapsed to a single /.

  2. Windows drive letter prefixes (C:, D:, etc.) are stripped.

  3. Leading and trailing / characters are removed.

  4. Paths that are empty after the above steps (e.g. original inputs "C:/", "/", "") are coerced to NA_character_.
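The four steps above could be written with base R string functions roughly as follows. This is a sketch under the assumption that the steps run in the listed order; normalise_paths is a hypothetical name, and the actual internal code may differ.

```r
# Hypothetical helper mirroring the four documented normalisation steps.
normalise_paths <- function(paths) {
  x <- gsub("\\", "/", paths, fixed = TRUE)   # 1a. backslashes -> forward slashes
  x <- gsub("/+", "/", x)                     # 1b. collapse runs of slashes
  x <- sub("^[A-Za-z]:", "", x)               # 2.  strip a leading drive letter
  x <- gsub("^/|/$", "", x)                   # 3.  trim leading and trailing slash
  x[!is.na(x) & x == ""] <- NA_character_     # 4.  empty result becomes NA
  x
}

normalise_paths(c("C:\\Users//foo\\", "/", "C:/", "", NA))
#> [1] "Users/foo" NA          NA          NA          NA
```

Every call here is vectorised, so NA inputs pass through unchanged without an explicit loop.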

Extension-stripping behaviour (internal .strip_ext helper):

Input             Output       Notes
"report.txt"      "report"     Standard file: last extension removed
"data.tar.gz"     "data.tar"   Compound extension: only the last level removed
".bashrc"         ".bashrc"    Pure dot-file (no second dot): unchanged
".report.xlsx"    ".report"    Dot-file with extension: extension removed
"no_ext"          "no_ext"     No extension: returned as-is
"file."           "file."      Trailing isolated dot: returned as-is
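One regular expression that reproduces the behaviour in this table is shown below. It is only a sketch: strip_ext here is a hypothetical stand-in for the internal .strip_ext helper, whose real implementation is not shown in this documentation.

```r
# Hypothetical stand-in for the internal .strip_ext helper.
# The pattern requires at least one non-slash character before the final dot
# (so a leading dot is never treated as an extension separator) and at least
# one character after it (so a trailing dot is left alone).
strip_ext <- function(x) {
  sub("^(.*[^/])\\.[^./]+$", "\\1", x)
}

strip_ext(c("report.txt", "data.tar.gz", ".bashrc",
            ".report.xlsx", "no_ext", "file."))
#> [1] "report"   "data.tar" ".bashrc"  ".report"  "no_ext"   "file."
```

Excluding / around the dot means the same pattern leaves a dot-file at the end of a full path, such as "home/user/.bashrc", untouched.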

NA safety: strsplit(NA_character_, ...) returns a list whose single element is NA_character_ (a vector of length 1), not character(0). A length(x) == 0 test therefore never detects an NA path, so every vapply callback guards against NA with an explicit anyNA(x) check instead.
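The difference is easy to see in base R. The snippet below is an illustrative sketch rather than the package's code; the callback simply rejoins the segments to show what goes wrong without the guard.

```r
parts <- strsplit(c("a/b", NA_character_), "/", fixed = TRUE)
lengths(parts)
#> [1] 2 1

# Without a guard, paste() turns the NA element into the string "NA".
unguarded <- vapply(parts, function(x) paste(x, collapse = "/"), character(1))
unguarded
#> [1] "a/b" "NA"

# With an anyNA() guard, the missing value is preserved.
guarded <- vapply(parts, function(x) {
  if (anyNA(x)) return(NA_character_)
  paste(x, collapse = "/")
}, character(1))
guarded
#> [1] "a/b" NA
```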

Examples

paths <- c("C:/Users/foo/Documents/report.xlsx",
           "/home/user/.bashrc",
           "relative/path/to/data.csv",
           ".hidden.tar.gz",
           NA_character_)

# Mode B: filename only, extension stripped (default)
get_path_info(paths)
#> [1] "report"      ".bashrc"     "data"        ".hidden.tar" NA           

# Mode B: filename only, extension preserved
get_path_info(paths, rm_extension = FALSE)
#> [1] "report.xlsx"    ".bashrc"        "data.csv"       ".hidden.tar.gz"
#> [5] NA              

# Mode B: full normalised path, extension stripped
get_path_info(paths, rm_path = FALSE)
#> [1] "Users/foo/Documents/report" "home/user/.bashrc"         
#> [3] "relative/path/to/data"      ".hidden.tar"               
#> [5] NA                          

# Mode A: extract the 2nd path segment
get_path_info(paths, n = 2)
#> [1] "foo"  "user" "path" NA     NA    

# Mode A: extract the last segment; rm_extension takes effect because n = -1
get_path_info(paths, n = -1, rm_extension = TRUE)
#> [1] "report"      ".bashrc"     "data"        ".hidden.tar" NA           

# Mode A: range extraction
get_path_info(paths, n = c(2, 3))
#> [1] "foo/Documents" "user/.bashrc"  "path/to"       NA             
#> [5] NA