basic-usage.Rmd
library(pptsda)
The import_csv() function imports multiple CSV files into a single R
data object. Supply a character vector of CSV file paths to the
file_list argument. An optional second argument selects the package used
to read the files. Valid options are “data.table”, “vroom”, and “readr”;
the default is “data.table”, which reads with data.table::fread for fast
parsing. The function validates its input, loads the required package,
reads each file with the chosen CSV reader, and combines the results
into a data.table or data.frame, depending on the package used. The
result is a consolidated data object containing the contents of all
CSVs, so there is no need to write a for loop each time several files
must be combined into an analysis-ready dataset.
# pattern is a regular expression: match only file names ending in ".csv"
list_csv <- list.files("C:/Users/Dell/Downloads/location301", pattern = "\\.csv$", full.names = TRUE)
data <- import_csv(list_csv)
#> Loading required package: data.table
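The batch-import pattern described above can be sketched in base R. This is only an illustration of the idea (validate the paths, read each file, bind the rows), not the pptsda implementation: the helper name combine_csv_sketch, the temporary files, and their columns are all hypothetical.

```r
# Hypothetical stand-in for a batch CSV import; NOT the pptsda code.
combine_csv_sketch <- function(file_list) {
  # Validate the input before reading anything
  stopifnot(is.character(file_list), length(file_list) > 0)
  missing_files <- file_list[!file.exists(file_list)]
  if (length(missing_files) > 0) {
    stop("Files not found: ", paste(missing_files, collapse = ", "))
  }
  # Read every file, then stack the resulting data frames row-wise
  tables <- lapply(file_list, utils::read.csv, stringsAsFactors = FALSE)
  do.call(rbind, tables)
}

# Two small temporary files with identical columns (made-up values)
tmp_dir <- tempfile("csv_demo_")
dir.create(tmp_dir)
write.csv(data.frame(responder = c("a", "b"), weight = c(20.5, 21.0)),
          file.path(tmp_dir, "day1.csv"), row.names = FALSE)
write.csv(data.frame(responder = c("a", "b"), weight = c(21.3, 21.9)),
          file.path(tmp_dir, "day2.csv"), row.names = FALSE)

demo_files <- list.files(tmp_dir, pattern = "\\.csv$", full.names = TRUE)
combined <- combine_csv_sketch(demo_files)
nrow(combined)  # one row per record across both files
```

Row-binding like this assumes every file has the same columns; files with differing layouts would need extra handling.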
The adg_get() function processes raw pig feed intake data to calculate
average daily gain (ADG) and generate growth curves. It takes the raw
data frame as input, cleans invalid records in several steps, runs
robust regression to detect outliers, fits simple linear models to
estimate ADG over the desired weight ranges, and produces growth curve
plots. The my_break parameter specifies a weight range that segments the
ADG calculation. The outlier detection threshold and the path where
output images are saved can also be customized. The function returns ADG
summary information and the analyzed dataset with outlier flags, which
can be used for downstream modeling or evaluation. To use it, supply the
raw data and optionally set parameters such as my_break.
nedap_csv_data <- mintyr::nedap
adg_results <- adg_get(data = nedap_csv_data)
#> • There are no duplicate responders in different locations.
#> • The removing of weight < 15kg will not delete responder.
#> • Removing records of missing will delete responders: 1
#> • Deleted responders:
#> c("15964")
#> • Running RANSAC Robust Regression:
#> • RANSAC Robust Regression succeeded!
#> • The outliers detected by Robust model will not delete responder.
#> • All responders' begin_test_weight are less than or equal to 60kg.
#> • Removing end_test_weight <85kg records will delete responders: 1
#> • Deleted responders:
#> c("15967")
#> • Running Simple Linear Regression
#> • Calculate ADG using Simple Linear Regression succeeded!
head(adg_results$adg_info)
#> Key: <responder>
#> responder location start_date_origin min_weight_origin end_date_origin
#> <char> <char> <Date> <num> <Date>
#> 1: 13913 101 2024-02-19 21866.22 2024-05-25
#> 2: 13918 101 2024-02-19 16529.06 2024-05-25
#> 3: 13935 102 2024-02-19 21414.80 2024-05-25
#> 4: 13954 101 2024-02-20 30876.24 2024-05-25
#> 5: 13996 101 2024-02-19 22753.31 2024-05-25
#> 6: 14260 102 2024-02-19 27757.93 2024-05-25
#> max_weight_origin r_squared lm_slope
#> <num> <num> <num>
#> 1: 132895.4 0.9914949 1180.5834
#> 2: 121102.8 0.9917870 1089.6744
#> 3: 138871.4 0.9976035 1233.5471
#> 4: 118602.7 0.9781564 941.2985
#> 5: 142956.6 0.9957625 1281.2277
#> 6: 142547.7 0.9962627 1221.0429
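The lm_slope and r_squared columns above come from a simple linear regression of weight on time. As a minimal sketch of that idea, the following fits one such model to synthetic weighings for a single pig. The data, the column names day and weight, and the choice of grams as the unit are assumptions made for illustration; the cleaning and outlier steps of adg_get() are not reproduced.

```r
# Synthetic weighings for one pig (made-up values; weight assumed in grams)
weighings <- data.frame(
  day    = 0:9,
  weight = 20000 + 1000 * (0:9) +
    c(50, -30, 20, -40, 10, 0, -20, 30, -10, -10)  # small measurement noise
)

# Simple linear regression of weight on day
fit <- lm(weight ~ day, data = weighings)

# The slope is the estimated gain per day, i.e. the ADG
adg_estimate <- unname(coef(fit)["day"])
r_squared    <- summary(fit)$r.squared

adg_estimate
r_squared
```

Restricting the data frame to a weight range before calling lm() would give an ADG for that segment only, which is the kind of segmentation the my_break parameter is described as providing.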
The adfi_get() function processes raw pig feed intake visit data to
correct outliers and errors and calculate daily feed intake (DFI). It
takes the raw data and the output of adg_get() as input. Invalid visits
are flagged according to predefined rules, and errors are corrected by
redistributing feed intake to valid visits. The final DFI is calculated
after the errors are removed. The user can optionally specify a weight
range so that the DFI calculation matches the segmented ADG. The key
outputs are DFI summary information and the analyzed dataset, with
corrected DFI metrics ready for evaluation. To use it, supply the raw
data and the adg_get() output.
nedap_csv_data <- mintyr::nedap
adg_results <- adg_get(data = nedap_csv_data)
#> • There are no duplicate responders in different locations.
#> • The removing of weight < 15kg will not delete responder.
#> • Removing records of missing will delete responders: 1
#> • Deleted responders:
#> c("15964")
#> • Running RANSAC Robust Regression:
#> • RANSAC Robust Regression succeeded!
#> • The outliers detected by Robust model will not delete responder.
#> • All responders' begin_test_weight are less than or equal to 60kg.
#> • Removing end_test_weight <85kg records will delete responders: 1
#> • Deleted responders:
#> c("15967")
#> • Running Simple Linear Regression
#> • Calculate ADG using Simple Linear Regression succeeded!
adfi_results <- adfi_get(data = nedap_csv_data, adg_res = adg_results)
#> • There are no duplicate responders in different locations.
#> • Successfully generated the following 3 variables:
#> - FIV:feed intake per visit;
#> - OTV:occupation time per visit;
#> - FRV:feeding rate per visit;
#> • Successfully generated 10 error types from 3 variables:
#> - FIV-lo; FIV-hi; FIV-0; OTV-lo; OTV-hi; FRV-hi-FIV-lo; FRV-hi-strict; FRV-hi; FRV-0; FRV-lo;
#> • Running linear mixed model with equation:
#> dfi_right_part ~ otd_2 + otd_6 + otd_9 + otd_10 + otv_hi_p + frv_hi_fiv_lo_p + frv_lo_p + location + lm_slope + weight + (1 | responder)
head(adfi_results$adfi_info)
#> Key: <responder, location>
#> responder location test_days origin_dfi corrected_dfi
#> <char> <char> <int> <num> <num>
#> 1: 13913 101 95 2562.365 2577.739
#> 2: 13918 101 96 2319.814 2305.710
#> 3: 13935 102 96 2703.371 2706.317
#> 4: 13954 101 95 2317.260 2308.288
#> 5: 13996 101 96 2909.309 2921.329
#> 6: 14260 102 96 2846.546 2819.889
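The log above mentions three visit-level variables (FIV, OTV, FRV) and error types derived from them. The sketch below shows only the flagging step on synthetic visits, using base R. The column names, units, and every threshold are hypothetical, and the sketch simply drops flagged visits; it does not reproduce the redistribution of feed intake or the linear mixed model that adfi_get() reports running.

```r
# Synthetic visit records (made-up values; names and units are assumptions)
visits <- data.frame(
  responder = c("a", "a", "a", "b", "b"),
  date      = as.Date(c("2024-02-19", "2024-02-19", "2024-02-20",
                        "2024-02-19", "2024-02-19")),
  fiv       = c(400, 2500, 450, 380, 0),  # feed intake per visit (g)
  otv       = c(300, 20, 320, 280, 15)    # occupation time per visit (s)
)

# Feeding rate per visit (g/min), derived from the two measured variables
visits$frv <- visits$fiv / (visits$otv / 60)

# Illustrative rules only; these thresholds are NOT the package's rules
visits$fiv_hi   <- visits$fiv > 2000
visits$fiv_0    <- visits$fiv == 0
visits$frv_hi   <- visits$frv > 500
visits$is_error <- visits$fiv_hi | visits$fiv_0 | visits$frv_hi

# Daily feed intake summed over the visits that pass every rule
valid <- visits[!visits$is_error, ]
dfi   <- aggregate(fiv ~ responder + date, data = valid, FUN = sum)
dfi
```

Dropping flagged visits understates intake on days with errors, which is presumably why the function corrects errors rather than discarding them.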
The fcr_get() function computes the feed conversion ratio (FCR) by
combining average daily gain (ADG) and daily feed intake (DFI) results.
It takes the output lists from adg_get() and adfi_get() as input, merges
the ADG and DFI summary tables by pig ID and location, calculates FCR as
corrected DFI divided by the ADG slope, and optionally adjusts FCR for a
specified weight range. The output contains pig ID, location, FCR, and
adjusted FCR if a weight range is provided. To use it, supply the
adg_get() and adfi_get() result lists. The resulting FCR metrics can
then be used for evaluation and modeling of feed efficiency.
nedap_csv_data <- mintyr::nedap
adg_results <- adg_get(data = nedap_csv_data)
#> • There are no duplicate responders in different locations.
#> • The removing of weight < 15kg will not delete responder.
#> • Removing records of missing will delete responders: 1
#> • Deleted responders:
#> c("15964")
#> • Running RANSAC Robust Regression:
#> • RANSAC Robust Regression succeeded!
#> • The outliers detected by Robust model will not delete responder.
#> • All responders' begin_test_weight are less than or equal to 60kg.
#> • Removing end_test_weight <85kg records will delete responders: 1
#> • Deleted responders:
#> c("15967")
#> • Running Simple Linear Regression
#> • Calculate ADG using Simple Linear Regression succeeded!
adfi_results <- adfi_get(data = nedap_csv_data, adg_res = adg_results)
#> • There are no duplicate responders in different locations.
#> • Successfully generated the following 3 variables:
#> - FIV:feed intake per visit;
#> - OTV:occupation time per visit;
#> - FRV:feeding rate per visit;
#> • Successfully generated 10 error types from 3 variables:
#> - FIV-lo; FIV-hi; FIV-0; OTV-lo; OTV-hi; FRV-hi-FIV-lo; FRV-hi-strict; FRV-hi; FRV-0; FRV-lo;
#> • Running linear mixed model with equation:
#> dfi_right_part ~ otd_2 + otd_6 + otd_9 + otd_10 + otv_hi_p + frv_hi_fiv_lo_p + frv_lo_p + location + lm_slope + weight + (1 | responder)
fcr_results <- fcr_get(adg_res = adg_results, adfi_res = adfi_results)
head(fcr_results$fcr_res)
#> Key: <responder, location>
#> responder location start_date_origin min_weight_origin end_date_origin
#> <char> <char> <Date> <num> <Date>
#> 1: 13913 101 2024-02-19 21866.22 2024-05-25
#> 2: 13918 101 2024-02-19 16529.06 2024-05-25
#> 3: 13935 102 2024-02-19 21414.80 2024-05-25
#> 4: 13954 101 2024-02-20 30876.24 2024-05-25
#> 5: 13996 101 2024-02-19 22753.31 2024-05-25
#> 6: 14260 102 2024-02-19 27757.93 2024-05-25
#> max_weight_origin r_squared lm_slope test_days origin_dfi corrected_dfi
#> <num> <num> <num> <int> <num> <num>
#> 1: 132895.4 0.9914949 1180.5834 95 2562.365 2577.739
#> 2: 121102.8 0.9917870 1089.6744 96 2319.814 2305.710
#> 3: 138871.4 0.9976035 1233.5471 96 2703.371 2706.317
#> 4: 118602.7 0.9781564 941.2985 95 2317.260 2308.288
#> 5: 142956.6 0.9957625 1281.2277 96 2909.309 2921.329
#> 6: 142547.7 0.9962627 1221.0429 96 2846.546 2819.889
#> fcr
#> <num>
#> 1: 2.183445
#> 2: 2.115962
#> 3: 2.193931
#> 4: 2.452238
#> 5: 2.280101
#> 6: 2.309410
head(fcr_results$fcr_summary)
#> traits N min max median mean sd cv
#> <char> <char> <char> <char> <char> <char> <char> <char>
#> 1: min_weight_origin 28 7.09 30.88 22.22 21.52 4.87 22.64%
#> 2: max_weight_origin 28 105.51 146.80 135.95 132.66 10.57 7.97%
#> 3: lm_slope 28 907.51 1321.20 1203.42 1173.66 109.56 9.33%
#> 4: test_days 28 95.00 96.00 96.00 95.68 0.48 0.50%
#> 5: origin_dfi 28 1867.24 3361.74 2823.68 2739.91 357.71 13.06%
#> 6: corrected_dfi 28 1878.89 3336.57 2817.39 2737.01 353.74 12.92%
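The FCR calculation described above (merge by pig ID and location, then divide corrected DFI by the ADG slope) can be checked by hand. The sketch below copies the first three rows of lm_slope and corrected_dfi from the printed output into two small data frames and repeats the calculation in base R. The object names adg_demo, dfi_demo, and fcr_demo are hypothetical, and the optional weight-range adjustment is not shown.

```r
# Values copied from the first three rows of the output shown above
adg_demo <- data.frame(
  responder = c("13913", "13918", "13935"),
  location  = c("101", "101", "102"),
  lm_slope  = c(1180.5834, 1089.6744, 1233.5471)
)
dfi_demo <- data.frame(
  responder     = c("13913", "13918", "13935"),
  location      = c("101", "101", "102"),
  corrected_dfi = c(2577.739, 2305.710, 2706.317)
)

# Merge the two summaries by pig ID and location
fcr_demo <- merge(adg_demo, dfi_demo, by = c("responder", "location"))

# FCR = corrected daily feed intake / average daily gain (slope)
fcr_demo$fcr <- fcr_demo$corrected_dfi / fcr_demo$lm_slope
fcr_demo
```

Because the inputs are the rounded values as printed, the results should agree with the fcr column above only up to rounding.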
The import_csv() function imports and consolidates multiple CSV files,
handling the repetitive work of loading raw data files. The adg_get()
function then processes the imported data to calculate average daily
gain over the desired weight ranges, with data cleaning, outlier
detection and growth curve plotting. Building on adg_get(), adfi_get()
works on visit-level feed intake data, correcting errors and computing
daily feed intake matched to the ADG segments. Finally, fcr_get()
combines the ADG and DFI results to compute the feed conversion ratio.
Together these four functions take pig feed efficiency analysis from raw
CSV data to cleaned datasets, growth curves, ADG estimates, DFI
calculations and FCR metrics ready for downstream evaluation and
modeling. The output at each step can be customized based on analysis
needs. That’s it! This is the end of the documented story of our
package.