mintyr is a high-performance data processing toolkit designed specifically for animal breeding and genomic selection. Leveraging the zero-copy and multi-threading capabilities of data.table, it significantly simplifies the construction of automated data pipelines in large-scale commercial breeding programs (e.g., coordinating data across nucleus and multiplier farms, or handling multi-trait growth test records).
The package is not only highly optimized for iterative analysis workflows with the ASReml-R package (supporting dynamic modeling and multi-trait/multi-breed nested grouping), but is also capable of generating and batch-exporting formatted phenotypic data files required for automated pipeline analyses in other mainstream command-line breeding software (e.g., HIBLUP, DMU).
mintyr covers five critical stages in the lifecycle of breeding data analysis:
🚀 High-Performance Data I/O (import_xlsx, import_csv, export_xlsx)
A transparent round-trip for multi-file, multi-sheet tabular data: import many files into one tidy data.table, transform freely, then write the original file/sheet structure back out — no bookkeeping required.
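As a rough illustration of the round-trip idea, the sketch below uses plain data.table rather than mintyr itself. It stacks several in-memory tables, adds tracking columns named after the ones mintyr appends (excel_name, sheet_name), then splits the result back into its original pieces. The sample tables and the names `sheets`, `stacked`, and `pieces` are hypothetical, and nothing here should be read as how import_xlsx or export_xlsx are actually implemented.

```r
library(data.table)

# Hypothetical stand-ins for sheets already read from two workbooks.
sheets <- list(
  farm_a = list(growth = data.table(id = 1:2, weight = c(98.5, 101.2))),
  farm_b = list(growth = data.table(id = 3:4, weight = c(95.0, 99.8)),
                repro  = data.table(id = 3:4, litter = c(11L, 13L)))
)

# "Import": stack every sheet and record where each row came from.
stacked <- rbindlist(
  lapply(names(sheets), function(f) {
    rbindlist(sheets[[f]], idcol = "sheet_name", fill = TRUE)[, excel_name := f]
  }),
  fill = TRUE
)
setcolorder(stacked, c("excel_name", "sheet_name"))

# Transform freely on the single stacked table.
stacked[, weight := round(weight)]

# "Export": split by the tracking columns and drop them again.
pieces <- split(stacked, by = c("excel_name", "sheet_name"), keep.by = FALSE)

# Remove columns that are entirely NA in a piece (introduced by fill = TRUE).
pieces <- lapply(pieces, function(d) {
  keep <- names(d)[colSums(!is.na(d)) > 0]
  d[, ..keep]
})
```

Each element of `pieces` then corresponds to one original sheet and could be passed to any spreadsheet writer; the writing step is left out to keep the sketch free of extra dependencies.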
- Import (import_xlsx, import_csv): native support for merging multiple files and sheets simultaneously, with source-tracking columns (excel_name, sheet_name) appended automatically to prevent data confusion across farms or batches. In-place data.table conversion keeps the memory footprint minimal, and import_xlsx can spread the per-sheet parse across CPU cores on demand (opt-in via workers): a fork pool on Linux/macOS, a PSOCK cluster on Windows.
- Export (export_xlsx): the round-trip companion, in which a single path argument decides the destination. A directory writes one .xlsx per excel_name value (one sheet per sheet_name); a .xlsx file path writes everything into one workbook. Worksheet splitting follows the data automatically, and the tracking columns are stripped by default so exported sheets match the originals.

🔄 Automated Data Reshaping & Nesting (w2l_nest, c2p_nest, r2p_nest)
- c2p_nest: column-to-pairs nested transformation that automatically renames feature columns, providing uniform inputs for iterative multi-trait genetic correlation evaluations.
- w2l_nest / w2l_split: wide-to-long transformations with subsetting and nesting by grouping variables (e.g., farm, breed, or line).

🧪 Cross-Validation & Model Evaluation (split_cv, nest_cv)
- … data.table structures, facilitating the evaluation of breeding value prediction accuracy (GP).

📊 Batch Exporting for Breeding Software (export_nest, export_list)
- … tempdir()/Line/Breed/data.txt), providing text-file preparation that bridges the gap to command-line breeding evaluation software such as HIBLUP and DMU.

🛠️ Phenotypic Statistics & Preprocessing (top_perc, format_digits, get_path_info)
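Looking back at the reshaping and batch-export stages above, the following is a minimal sketch of the general pattern (wide to long, nest by group, write one text file per group) written with data.table only. It does not call w2l_nest or export_nest, whose exact arguments are not shown here; the `pheno` table, the trait names, and the `nest_export_sketch` folder are assumptions made for illustration.

```r
library(data.table)

# Hypothetical wide phenotype table: one row per animal, one column per trait.
pheno <- data.table(
  Line  = c("L1", "L1", "L2", "L2"),
  Breed = c("Duroc", "Duroc", "Landrace", "Landrace"),
  id    = 1:4,
  adg   = c(850, 910, 780, 805),
  bf    = c(11.2, 10.4, 12.9, 12.1)
)

# Wide to long: one row per animal-trait record.
long <- melt(pheno, id.vars = c("Line", "Breed", "id"),
             variable.name = "trait", value.name = "value")

# Nest: one row per Line/Breed group, with that group's records in a list column.
# copy() keeps each stored subset independent of data.table's grouping buffer.
nested <- long[, .(data = list(copy(.SD))), by = .(Line, Breed)]

# Export: write each nested table to <root>/<Line>/<Breed>/data.txt.
root <- file.path(tempdir(), "nest_export_sketch")
for (i in seq_len(nrow(nested))) {
  out_dir <- file.path(root, nested$Line[i], nested$Breed[i])
  dir.create(out_dir, recursive = TRUE, showWarnings = FALSE)
  fwrite(nested$data[[i]], file.path(out_dir, "data.txt"), sep = "\t")
}
```

The resulting directory tree mirrors the grouping columns, which is the layout that command-line tools reading one phenotype file per line and breed would expect.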
You can install this package from either CRAN or GitHub:
### From CRAN
install.packages("mintyr")
### From GitHub
pak::pak("tony2015116/mintyr")