Basic Usage • pptsda

library(pptsda)

import_csv

The import_csv() function imports multiple CSV files into a single R data object. Its main argument, file_list, is a character vector of file paths. An optional second argument selects the package used to read the files: “data.table” (the default, which parses with data.table::fread), “vroom”, or “readr”. The function checks that the input is valid, loads the required package, reads each file with the chosen reader, and combines the results into a data.table or data.frame, depending on the package used. This replaces the for loop you would otherwise write for every batch import, so a folder of CSV files becomes one analysis-ready dataset in a single call.

# pattern is a regular expression: "\\.csv$" matches only names ending in ".csv"
list_csv <- list.files("C:/Users/Dell/Downloads/location301", pattern = "\\.csv$", full.names = TRUE)
data <- import_csv(list_csv)
#> Loading required package: data.table
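
The read-and-combine pattern that import_csv() wraps can be sketched in base R. This is a minimal illustration, not the package's implementation: it uses utils::read.csv instead of data.table::fread, and the temporary files and their columns are made up for the example.

```r
# Sketch only: base R stand-in for the read-and-combine pattern.
# The temporary files and their columns are hypothetical.
tmp_dir <- file.path(tempdir(), "import_csv_demo")
dir.create(tmp_dir, showWarnings = FALSE)
write.csv(data.frame(responder = c("a", "b"), weight = c(21.5, 22.0)),
          file.path(tmp_dir, "day1.csv"), row.names = FALSE)
write.csv(data.frame(responder = c("a", "b"), weight = c(22.4, 23.1)),
          file.path(tmp_dir, "day2.csv"), row.names = FALSE)

# "\\.csv$" matches only file names that end in ".csv"
files <- list.files(tmp_dir, pattern = "\\.csv$", full.names = TRUE)

# Read every file, then stack the pieces row-wise
pieces <- lapply(files, utils::read.csv, stringsAsFactors = FALSE)
combined <- do.call(rbind, pieces)

nrow(combined)
```

With two files of two rows each, combined holds four rows. Stacking with rbind assumes every file has the same columns.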

adg_get

The adg_get() function processes raw pig feed intake data to calculate average daily gain (ADG) and generate growth curves. It takes the raw data frame as input and works in four stages: it cleans invalid records in several steps, runs a robust regression to detect outliers, fits simple linear models to estimate ADG over the desired weight range, and produces growth curve plots. The my_break parameter sets the weight range used to segment the ADG calculation. The outlier detection threshold and the save path for the output images can also be customized. The function returns the ADG summary (adg_info) together with the analyzed dataset, including outlier flags, so the results can go straight into downstream modeling or evaluation.

nedap_csv_data <- mintyr::nedap
adg_results <- adg_get(data = nedap_csv_data)
#> • There are no duplicate responders in different locations.
#> • The removing of weight < 15kg will not delete responder.
#> • Removing records of missing will delete responders: 1
#> • Deleted responders: 
#>  c("15964")
#> • Running RANSAC Robust Regression:
#> • RANSAC Robust Regression succeeded!
#> • The outliers detected by Robust model will not delete responder.
#> • All responders' begin_test_weight are less than or equal to 60kg.
#> • Removing end_test_weight <85kg records will delete responders: 1
#> • Deleted responders: 
#>  c("15967")
#> • Running Simple Linear Regression
#> • Calculate ADG using Simple Linear Regression succeeded!
head(adg_results$adg_info)
#> Key: <responder>
#>    responder location start_date_origin min_weight_origin end_date_origin
#>       <char>   <char>            <Date>             <num>          <Date>
#> 1:     13913      101        2024-02-19          21866.22      2024-05-25
#> 2:     13918      101        2024-02-19          16529.06      2024-05-25
#> 3:     13935      102        2024-02-19          21414.80      2024-05-25
#> 4:     13954      101        2024-02-20          30876.24      2024-05-25
#> 5:     13996      101        2024-02-19          22753.31      2024-05-25
#> 6:     14260      102        2024-02-19          27757.93      2024-05-25
#>    max_weight_origin r_squared  lm_slope
#>                <num>     <num>     <num>
#> 1:          132895.4 0.9914949 1180.5834
#> 2:          121102.8 0.9917870 1089.6744
#> 3:          138871.4 0.9976035 1233.5471
#> 4:          118602.7 0.9781564  941.2985
#> 5:          142956.6 0.9957625 1281.2277
#> 6:          142547.7 0.9962627 1221.0429
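
The lm_slope and r_squared columns come from a simple linear regression of weight on time. The sketch below illustrates that idea for one animal in base R. It is an assumption-laden illustration, not the code inside adg_get(): the weights are simulated, the variable names are hypothetical, and the robust (RANSAC) outlier step is not reproduced.

```r
# Sketch only: ADG as the slope of weight on test day for one animal.
# The weights are simulated, not taken from mintyr::nedap.
set.seed(1)
day <- 0:95
weight_g <- 22000 + 1180 * day + rnorm(length(day), sd = 800)

# Slope of the fitted line = weight gained per day
fit <- lm(weight_g ~ day)
adg <- unname(coef(fit)["day"])
r_squared <- summary(fit)$r.squared

round(c(adg = adg, r_squared = r_squared), 3)
```

The estimated slope lands close to the simulated gain of 1180 per day. A high r_squared, like the values printed in adg_info, indicates near-linear growth over the test period.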

adfi_get

The adfi_get() function processes raw feed intake visit data, corrects outliers and errors, and calculates daily feed intake (DFI). It takes the raw data (data) and the output of adg_get() (adg_res) as input. Invalid visits are flagged according to predefined rules, their feed intake is redistributed to valid visits, and the final DFI is calculated after the errors have been removed. An optional weight range matches the DFI calculation to the segmented ADG. The function returns the DFI summary (adfi_info) and the analyzed dataset, giving corrected DFI values matched to the ADG weight segments in a single call.

nedap_csv_data <- mintyr::nedap
adg_results <- adg_get(data = nedap_csv_data)
#> • There are no duplicate responders in different locations.
#> • The removing of weight < 15kg will not delete responder.
#> • Removing records of missing will delete responders: 1
#> • Deleted responders: 
#>  c("15964")
#> • Running RANSAC Robust Regression:
#> • RANSAC Robust Regression succeeded!
#> • The outliers detected by Robust model will not delete responder.
#> • All responders' begin_test_weight are less than or equal to 60kg.
#> • Removing end_test_weight <85kg records will delete responders: 1
#> • Deleted responders: 
#>  c("15967")
#> • Running Simple Linear Regression
#> • Calculate ADG using Simple Linear Regression succeeded!
adfi_results <- adfi_get(data = nedap_csv_data, adg_res = adg_results)
#> • There are no duplicate responders in different locations.
#> • Successfully generated the following 3 variables:
#>  - FIV:feed intake per visit;
#>  - OTV:occupation time per visit;
#>  - FRV:feeding rate per visit;
#> • Successfully generated 10 error types from 3 variables:
#>  - FIV-lo; FIV-hi; FIV-0; OTV-lo; OTV-hi; FRV-hi-FIV-lo; FRV-hi-strict; FRV-hi; FRV-0; FRV-lo;
#> • Running linear mixed model with equation: 
#>  dfi_right_part ~ otd_2 + otd_6 + otd_9 + otd_10 + otv_hi_p + frv_hi_fiv_lo_p + frv_lo_p + location + lm_slope + weight + (1 | responder)
head(adfi_results$adfi_info)
#> Key: <responder, location>
#>    responder location test_days origin_dfi corrected_dfi
#>       <char>   <char>     <int>      <num>         <num>
#> 1:     13913      101        95   2562.365      2577.739
#> 2:     13918      101        96   2319.814      2305.710
#> 3:     13935      102        96   2703.371      2706.317
#> 4:     13954      101        95   2317.260      2308.288
#> 5:     13996      101        96   2909.309      2921.329
#> 6:     14260      102        96   2846.546      2819.889
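
The step from visit-level records to a per-animal DFI can be sketched as two aggregations. This is a simplified illustration, not the adfi_get() implementation: the column names and values are hypothetical, and the error flagging and mixed-model correction shown in the log above are left out.

```r
# Sketch only: visit-level feed intake rolled up to daily feed intake.
# Column names and values are hypothetical.
visits <- data.frame(
  responder = c("a", "a", "a", "a", "b", "b", "b"),
  date = as.Date(c("2024-02-19", "2024-02-19", "2024-02-20", "2024-02-20",
                   "2024-02-19", "2024-02-20", "2024-02-20")),
  feed_intake = c(900, 1100, 1200, 1000, 2100, 1500, 700)
)

# Step 1: total intake per animal per day
daily <- aggregate(feed_intake ~ responder + date, data = visits, FUN = sum)

# Step 2: mean of the daily totals per animal
dfi <- aggregate(feed_intake ~ responder, data = daily, FUN = mean)
names(dfi)[2] <- "dfi"
dfi
```

Animal a eats 2000 and 2200 over the two days, giving a DFI of 2100. Animal b eats 2100 and 2200, giving 2150.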

fcr_get

The fcr_get() function computes the feed conversion ratio (FCR) from average daily gain (ADG) and daily feed intake (DFI). It takes the result lists from adg_get() and adfi_get() as input (adg_res and adfi_res), merges the two summary tables by pig ID (responder) and location, and calculates FCR as corrected DFI divided by the ADG slope (corrected_dfi / lm_slope). If a weight range is specified, it also adjusts FCR for that range. The output is an FCR table with pig ID, location, FCR, and, when a weight range is provided, the adjusted FCR. These values can then be used to evaluate and model feed efficiency.

nedap_csv_data <- mintyr::nedap
adg_results <- adg_get(data = nedap_csv_data)
#> • There are no duplicate responders in different locations.
#> • The removing of weight < 15kg will not delete responder.
#> • Removing records of missing will delete responders: 1
#> • Deleted responders: 
#>  c("15964")
#> • Running RANSAC Robust Regression:
#> • RANSAC Robust Regression succeeded!
#> • The outliers detected by Robust model will not delete responder.
#> • All responders' begin_test_weight are less than or equal to 60kg.
#> • Removing end_test_weight <85kg records will delete responders: 1
#> • Deleted responders: 
#>  c("15967")
#> • Running Simple Linear Regression
#> • Calculate ADG using Simple Linear Regression succeeded!
adfi_results <- adfi_get(data = nedap_csv_data, adg_res = adg_results)
#> • There are no duplicate responders in different locations.
#> • Successfully generated the following 3 variables:
#>  - FIV:feed intake per visit;
#>  - OTV:occupation time per visit;
#>  - FRV:feeding rate per visit;
#> • Successfully generated 10 error types from 3 variables:
#>  - FIV-lo; FIV-hi; FIV-0; OTV-lo; OTV-hi; FRV-hi-FIV-lo; FRV-hi-strict; FRV-hi; FRV-0; FRV-lo;
#> • Running linear mixed model with equation: 
#>  dfi_right_part ~ otd_2 + otd_6 + otd_9 + otd_10 + otv_hi_p + frv_hi_fiv_lo_p + frv_lo_p + location + lm_slope + weight + (1 | responder)
fcr_results <- fcr_get(adg_res = adg_results, adfi_res = adfi_results)
head(fcr_results$fcr_res)
#> Key: <responder, location>
#>    responder location start_date_origin min_weight_origin end_date_origin
#>       <char>   <char>            <Date>             <num>          <Date>
#> 1:     13913      101        2024-02-19          21866.22      2024-05-25
#> 2:     13918      101        2024-02-19          16529.06      2024-05-25
#> 3:     13935      102        2024-02-19          21414.80      2024-05-25
#> 4:     13954      101        2024-02-20          30876.24      2024-05-25
#> 5:     13996      101        2024-02-19          22753.31      2024-05-25
#> 6:     14260      102        2024-02-19          27757.93      2024-05-25
#>    max_weight_origin r_squared  lm_slope test_days origin_dfi corrected_dfi
#>                <num>     <num>     <num>     <int>      <num>         <num>
#> 1:          132895.4 0.9914949 1180.5834        95   2562.365      2577.739
#> 2:          121102.8 0.9917870 1089.6744        96   2319.814      2305.710
#> 3:          138871.4 0.9976035 1233.5471        96   2703.371      2706.317
#> 4:          118602.7 0.9781564  941.2985        95   2317.260      2308.288
#> 5:          142956.6 0.9957625 1281.2277        96   2909.309      2921.329
#> 6:          142547.7 0.9962627 1221.0429        96   2846.546      2819.889
#>         fcr
#>       <num>
#> 1: 2.183445
#> 2: 2.115962
#> 3: 2.193931
#> 4: 2.452238
#> 5: 2.280101
#> 6: 2.309410
head(fcr_results$fcr_summary)
#>               traits      N     min     max  median    mean     sd     cv
#>               <char> <char>  <char>  <char>  <char>  <char> <char> <char>
#> 1: min_weight_origin     28    7.09   30.88   22.22   21.52   4.87 22.64%
#> 2: max_weight_origin     28  105.51  146.80  135.95  132.66  10.57  7.97%
#> 3:          lm_slope     28  907.51 1321.20 1203.42 1173.66 109.56  9.33%
#> 4:         test_days     28   95.00   96.00   96.00   95.68   0.48  0.50%
#> 5:        origin_dfi     28 1867.24 3361.74 2823.68 2739.91 357.71 13.06%
#> 6:     corrected_dfi     28 1878.89 3336.57 2817.39 2737.01 353.74 12.92%
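
The fcr column can be checked by hand from the printed output. The sketch below copies lm_slope and corrected_dfi for the first three animals, merges them by responder and location, and divides. It uses base R merge() for illustration only and does not claim to mirror the internals of fcr_get().

```r
# Values copied from the first three rows of the printed output above
adg <- data.frame(
  responder = c("13913", "13918", "13935"),
  location = c("101", "101", "102"),
  lm_slope = c(1180.5834, 1089.6744, 1233.5471)
)
adfi <- data.frame(
  responder = c("13913", "13918", "13935"),
  location = c("101", "101", "102"),
  corrected_dfi = c(2577.739, 2305.710, 2706.317)
)

# Merge on the shared keys, then FCR = corrected DFI / ADG slope
fcr <- merge(adg, adfi, by = c("responder", "location"))
fcr$fcr <- fcr$corrected_dfi / fcr$lm_slope
round(fcr$fcr, 4)
```

The results agree with the printed fcr values (2.183445, 2.115962, 2.193931) up to the rounding of the displayed inputs.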

The import_csv() function imports and consolidates multiple CSV files, handling the repetitive work of loading raw data. The adg_get() function then processes the imported data to calculate average daily gain over the desired weight ranges, with data cleaning, outlier detection, and growth curve plotting. Building on adg_get(), adfi_get() works on the visit-level feed intake data, correcting errors and computing daily feed intake matched to the ADG segments. Finally, fcr_get() combines the ADG and DFI results to compute the feed conversion ratio. Together, these four functions take pig feed intake data from raw CSV files to cleaned datasets, growth curves, and ADG, DFI, and FCR estimates ready for downstream evaluation and modeling. The output at each step can be customized to suit the analysis. That’s it! This is the end of the documented story of our package.