Skip to contents

An R package for analyses of bioassays and probit graphs

Piyal Karunarathne, Nicolas Pocquet, Pascal Milesi, and Pierrick Labbé

This package is designed to analyze mortality data from bioassays of one or several strains/lines/populations. As of now, the functions in the package allow adjusting for mortality in the controls with Abott’s correction. For each strain, functions are available to generate a mortality-dose regression using a generalized linear model (which takes over-dispersion into account and allow mortality of 0 or 1), and plot the regressions with or without the desired confidence interval (e.g. 95%).

The package also provides functions to test the linearity of the log-dose response using a chi-square test between model predictions and observed data (significant deviations from linearity may reflect mixed populations for example).

The package also allows determining the lethal doses for 25%, 50% and 95% of the population (LD25, LD50 and LD95 respectively) or the level as specified by the user, with their 95% confidence intervals (CI) and variance of each (e.g., LD25var, LD50var, etc.), following Johnson et al. 2013 approach, which allows taking the heterogeneity of the data into account (Finney 1971) to calculate the CI (i.e. a larger heterogeneity will increase the CI).

The methods implemented here use likelihood ratio tests (LRT) to test for differences in resistance levels among different strains. Finally, resistance ratios (RR) at LD25, LD50 and LD95, i.e. the LD ratios between a given strain and the strain with the lowest LD50 (or LD25,LD50, and LD95; usually it is the susceptible reference), with their 95% confidence intervals are calculated according to Robertson and Preisler (1992).

  • Installing BioRssay
#1. CRAN version
install.packages("BioRssay")
#2. Developmental version
if (!requireNamespace("devtools", quietly = TRUE)) 
        install.packages("devtools") 
    devtools::install_github("milesilab/BioRssay", build_vignettes = TRUE)

1. DATA PREPARATION

BioRssay can import data in any format that is compatible with base R data import functions (e.g. read.table, read.csv). However, for the functions in BioRssay to work, the data must have at least the following columns (other columns won’t be used, but are no hindrance).

  • strain: a column containing the strains tested
  • dose: dosage tested on each strain/sample (controls should be entered as 0)
  • total: total number of samples tested
  • dead: number of dead (or knock down) samples

See the examples below.

Example 1

data(bioassay)
head(bioassay$assay2)
#>   insecticide  strain  dose total dead replicate     date
#> 1    temephos KIS-ref 0.000   100    1         1 26/01/11
#> 2    temephos KIS-ref 0.002    97   47         1 26/01/11
#> 3    temephos KIS-ref 0.003    96   68         1 26/01/11
#> 4    temephos KIS-ref 0.004    98   89         1 26/01/11
#> 5    temephos KIS-ref 0.005    95   90         1 26/01/11
#> 6    temephos KIS-ref 0.007    99   97         1 26/01/11

Also download the test data at https://github.com/milesilab/DATA/blob/main/BioAssays/Test.BioRssay.txt and find more example data sets at https://github.com/milesilab/DATA/tree/main/BioAssays

Example 2

file <- paste0(path.package("BioRssay"), "/Test.BioRssay.txt")
test<-read.table(file,header=TRUE)
head(test)
#>   insecticide strain dose total dead
#> 1  bendiocarb Kisumu 0.00    25    0
#> 2  bendiocarb Kisumu 0.00    25    0
#> 3  bendiocarb Kisumu 0.00    25    0
#> 4  bendiocarb Kisumu 0.00    25    0
#> 5  bendiocarb Kisumu 0.01    25    0
#> 6  bendiocarb Kisumu 0.01    25    0

NOTE: It is also possible to include a reference strain/population with the suffix “ref” in the strain column (see example 1), or the reference strain can be specified later in the function resist.ratio to obtain the resistance ratios for each strain (see below).

2. Analysis

The workflow is only succinctly described here, for more information on the functions and their options, see individual one in the reference index.

Example 1

Let’s have a quick look at the data again.

assays<-bioassay
exm1<-assays$assay2
head(exm1)
#>   insecticide  strain  dose total dead replicate     date
#> 1    temephos KIS-ref 0.000   100    1         1 26/01/11
#> 2    temephos KIS-ref 0.002    97   47         1 26/01/11
#> 3    temephos KIS-ref 0.003    96   68         1 26/01/11
#> 4    temephos KIS-ref 0.004    98   89         1 26/01/11
#> 5    temephos KIS-ref 0.005    95   90         1 26/01/11
#> 6    temephos KIS-ref 0.007    99   97         1 26/01/11
unique(as.character(exm1$strain))
#> [1] "KIS-ref" "DZOU"    "DZOU2"

This example contains the mortality data of three strains (KIS-ref, DZOU, and DZOU2 ); KIS is used as the reference, as indicated by the “ref” suffix.

The first step is to check whether the controls have a non-negligible mortality, in which case a correction should be applied to the data, before probit transformation. This is easily achieved with the function probit.trans().

dataT<-probit.trans(exm1) #additionally an acceptable threshold for controls' mortality can be set as desired with "conf="; default is 0.05.
dataT$convrg
#> NULL
head(dataT$tr.data)
#>   insecticide  strain  dose total dead replicate     date      mort   probmort
#> 2    temephos KIS-ref 0.002    97   47         1 26/01/11 0.4845361 -0.0387720
#> 3    temephos KIS-ref 0.003    96   68         1 26/01/11 0.7083333  0.5485223
#> 4    temephos KIS-ref 0.004    98   89         1 26/01/11 0.9081633  1.3295291
#> 5    temephos KIS-ref 0.005    95   90         1 26/01/11 0.9473684  1.6198563
#> 6    temephos KIS-ref 0.007    99   97         1 26/01/11 0.9797980  2.0495943
#> 7    temephos KIS-ref 0.010    99   99         1 26/01/11 0.9940000  2.5121443

The output of probit.trans is a list of which the first element (convrg) contains the results of Abott’s correction and convergence values.

However, since the mortality in the controls (dose=0) is below 5% (conf=0.05) in the present example, data$convrg is NULL and thus no correction is applied to the data . The second element of the list dataT is the probid transformed data with two additional columns: mort, the observed mortalities, and probmort, the observed probit-transformed mortalities. This data frame is what we’ll use in the next steps of the analysis.

If you set the threshold to conf=0.01 with example 1, you can assess the effects of the Abbot’s correction: all mortalities are slightly reduced to take the base control mortality into account.

The second step is to compute the lethal dose values (25%, 50% and 95%, LD25, LD50 and LD95 respectively) and the corresponding resistance ratios. The function resist.ratio allows you to do just that (user also has the option to calculate these values for different LD values). If no reference strain has been specified in the data file (using the suffix “ref” as mentioned above), it can be specified in ref.strain=. Otherwise, the strain with the lowest LD50 will be considered as such. By default, the LDs’ 95% confidence intervals are computed (the min and max values are reported); you can adjust this using conf.level=.

data<-dataT$tr.data #probid transformed data
RR<-resist.ratio(data)
RR
#>         Slope SlopeSE Intercept InterceptSE    h      g Chi2 df Chi(p)   LD25
#> DZOU     1.75  0.1537      3.72      0.3579 1.91 0.0397 9.88 10 0.4514 0.0030
#> DZOU2    1.69  0.1657      3.74      0.3907 2.33 0.0490 9.92 10 0.4473 0.0025
#> KIS-ref  3.17  0.3812      8.95      0.9849 3.26 0.0716 8.59 11 0.6597 0.0009
#>         LD25min LD25max   LD50 LD50min LD50max   LD95 LD95min LD95max rr25
#> DZOU      2e-04  0.0192 0.0074   6e-04  0.0404 0.0645  0.0087  0.2472 3.24
#> DZOU2     1e-04  0.0194 0.0062   3e-04  0.0412 0.0576  0.0055  0.2578 2.64
#> KIS-ref   0e+00  0.0142 0.0015   0e+00  0.0209 0.0050  0.0001  0.0537 1.00
#>         rr25min rr25max rr50 rr50min rr50max rr95 rr95min rr95max
#> DZOU     2.7000    3.91 4.84  4.1600    5.64   13  7.6800   22.00
#> DZOU2    2.1900    3.18 4.05  3.5000    4.69   11  7.1100   19.00
#> KIS-ref  0.7947    1.26 1.00  0.8488    1.18    1  0.8085    1.24

Note that we did not specify the reference strain here as it is already labeled in the data

For each strain, you have first the LD25, LD50 and LD95 and their upper and lower limits (defaults is 95% CI), then the slope and intercept of the regression (with their standard error), the heterogeneity (h) and the g factor (“With almost all good sets of data, g will be substantially smaller than 1.0 and seldom greater than 0.4.” Finney, 1971).

The result of the chi test (Chi(p)) is then indicated to judge whether the data follow a linear regression: here all the p-values are over 0.05 so the fits are acceptable. Finally the resistance ratios are indicated for LD25, LD50 and LD95 (RR25, RR50 and RR95), as well as their upper and lower limits.

The third step, when analyzing more than one strain, is now to test for difference in dose-mortality responses between strains using the model.signif() function.

model.signif(dataT$tr.data)
#> Analysis of Deviance Table
#> 
#> Model 1: mortality ~ log10(data$dose)
#> Model 2: mortality ~ log10(data$dose) * data$strain
#>   Resid. Df Resid. Dev Df Deviance  Pr(>Chi)    
#> 1        32     936.46                          
#> 2        28      70.69  4   865.77 < 2.2e-16 ***
#> ---
#> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#> complete model is significant against a NULL model 
#>  continueing to pair-wise comparison
#> Output details
#>   model.pval - significance value of ANOVA on the binomial GLM test of the strain pair
#>   bonferroni - significance of the model.pval with bonferroni correction
#>   res.Dv - residual deviance
#>   thr - threshold for the significance of the pvalue
#>   str - values for the strains
#>   int - values for the interaction between the strain and the dose
#> 
#> $model
#>   strain1 strain2 model.pval bonferroni res.Dv.Null res.Dv.str res.Dv.int
#> 1    DZOU   DZOU2    0.26402    non-sig          NA         NA         NA
#> 2    DZOU KIS-ref          0        sig     1150.41     87.892     49.747
#> 3   DZOU2 KIS-ref          0        sig     1064.79     95.043     53.537
#>   str.pval str.thr int.pval int.thr
#> 1       NA      NA       NA      NA
#> 2        0 0.01250    0.001   0.025
#> 3        0 0.01667    0.001   0.050

As there are 3 strains, the function first tests whether all strains are similar (i.e. equivalent to 1 strain) or not (i.e. at least one is different from others), using a likelihood ratio test. Here, the test is highly significant, some strains are thus different in terms of dose response.

Pairwise tests are then performed and reported below. Here, the KIS strain is different from DZOU and from DZOU2 strains (model.pval <0.05). DZOU and DZOU2 are not different (model.pval >0.05). The bonferroni column indicates whether the p-values <0.05 remain significant (sig vs non-sig) after correction for multiple testing.

Further, the function outputs seven more columns with statistical outputs from the model evaluation between strains and strain-dose to a null model. The abbreviations are as follows:
res.Dv - residual deviance
thr - threshold for the significance of the pvalue
str - values for the strains
int - values for the interaction between the strain and the dose

Note: the pvalues for strain and strain-dose interaction is from a F-test for a binomial model.

Data Visualization The data and the regression can be plotted with confidence levels using the mort.plot() function. It is also possible to take the validity of the linearity test into account for the plots using the test.validity= option. The probit-transformed mortalities (probit.trans() function) are plotted as a function of the log10 of the doses.

strains<-levels(data$strain)
par(mfrow=c(1,2)) # set plot rows
# plot without confidence intervals and test of validity of the model
mort.plot(data,plot.conf=FALSE,test.validity=FALSE) 
# plot only the regression lines
mort.plot(data,plot.conf=FALSE,test.validity=FALSE,pch=NA) 

# same plots with confidence level
par(mfrow=c(1,2))
mort.plot(data,plot.conf=TRUE,test.validity=FALSE)
mort.plot(data,plot.conf=TRUE,test.validity=FALSE,pch=NA)

It is also possible to plot different confidence intervals with the conf.level= option (the default is 0.95). It is possible to plot only a subset of strains using the strains= option to list the desired strains; if not provided, all the strains will be plotted.

Note that the plots can be generated directly from the “resist.ratio” function using the plot=TRUE option.

Example 2

We follow the same workflow (using the plot option in resist.ratio()). However, there are more than one insecticide tested in this experiment. Therefore, we need to subset the data for each insecticide, and carry out the analysis as before.

head(test)
#>   insecticide strain dose total dead
#> 1  bendiocarb Kisumu 0.00    25    0
#> 2  bendiocarb Kisumu 0.00    25    0
#> 3  bendiocarb Kisumu 0.00    25    0
#> 4  bendiocarb Kisumu 0.00    25    0
#> 5  bendiocarb Kisumu 0.01    25    0
#> 6  bendiocarb Kisumu 0.01    25    0
unique(test$insecticide)
#> [1] "bendiocarb"         "chlorpyrifos-metyl" "permethrin"
bend<-test[test$insecticide=="bendiocarb",]
head(bend)
#>   insecticide strain dose total dead
#> 1  bendiocarb Kisumu 0.00    25    0
#> 2  bendiocarb Kisumu 0.00    25    0
#> 3  bendiocarb Kisumu 0.00    25    0
#> 4  bendiocarb Kisumu 0.00    25    0
#> 5  bendiocarb Kisumu 0.01    25    0
#> 6  bendiocarb Kisumu 0.01    25    0

We will use a subset of the data for the insecticide “bendiocarb” only.

dataT.b<-probit.trans(bend)
data.b<-dataT.b$tr.data

RR.b<-resist.ratio(data.b,plot = T,ref.strain = "Kisumu",plot.conf = T, test.validity = T)

head(RR.b)
#>         Slope SlopeSE Intercept InterceptSE    h      g  Chi2 df Chi(p)    LD25
#> Acerkis  5.88  0.3656  -10.2992      0.6734 1.00 0.0149 11.00 31 0.9997 43.0000
#> AgRR5    8.52  0.6962  -16.1686      1.3600 1.61 0.0279 18.00 31 0.9716 66.0000
#> Kisumu   7.69  0.6055    4.3400      0.3282 1.00 0.0238  7.34 39 1.0000  0.2228
#>         LD25min LD25max    LD50 LD50min  LD50max     LD95 LD95min  LD95max rr25
#> Acerkis 18.0000 132.000 57.0000  23.000 178.0000 108.0000 41.0000 371.0000  195
#> AgRR5   19.0000 376.000 79.0000  22.000 468.0000 123.0000 33.0000 798.0000  296
#> Kisumu   0.1355   0.321  0.2726   0.172   0.3824   0.4461  0.3077   0.5862    1
#>          rr25min rr25max rr50  rr50min rr50max rr95 rr95min rr95max
#> Acerkis  86.0000  442.00  207 103.0000   418.0  241 43.0000    1360
#> AgRR5   131.0000  666.00  290 144.0000   583.0  277 49.0000    1560
#> Kisumu    0.3352    2.98    1   0.3849     2.6    1  0.0884      11

Note that we have enabled the arguments “plot=” with “plot.conf=” and test.validity=. When the log-dose-response is not linear for a strain (Chi-square p-value < 0.05), it will be plotted without forcing linearity as for “Acerkis or AgRR5” strains in this example.

#To then test the difference in dose-mortality response between the strains
t.models<-model.signif(data.b)
#> Analysis of Deviance Table
#> 
#> Model 1: mortality ~ log10(data$dose)
#> Model 2: mortality ~ log10(data$dose) * data$strain
#>   Resid. Df Resid. Dev Df Deviance  Pr(>Chi)    
#> 1       102     1981.5                          
#> 2        98      107.8  4   1873.7 < 2.2e-16 ***
#> ---
#> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#> complete model is significant against a NULL model 
#>  continueing to pair-wise comparison
#> Output details
#>   model.pval - significance value of ANOVA on the binomial GLM test of the strain pair
#>   bonferroni - significance of the model.pval with bonferroni correction
#>   res.Dv - residual deviance
#>   thr - threshold for the significance of the pvalue
#>   str - values for the strains
#>   int - values for the interaction between the strain and the dose
#> 
t.models
#> $model
#>   strain1 strain2 model.pval bonferroni res.Dv.Null res.Dv.str res.Dv.int
#> 1 Acerkis   AgRR5          0        sig     1437.16     94.149     76.989
#> 2 Acerkis  Kisumu          0        sig     1824.23     66.860     59.425
#> 3   AgRR5  Kisumu          0        sig     1822.46     80.266     79.193
#>   str.pval str.thr int.pval int.thr
#> 1        0 0.00833    0.000 0.01667
#> 2        0 0.01000    0.145 0.02500
#> 3        0 0.01250    0.594 0.05000

Note that at least one of the strains failed the linearity test, the validity of the pairwise dose-mortality response test is, at best, highly questionable. We do not recommend it.

If many strains are present and only one (few) fails the linearity tests, we do recommend users to remove specific strains from the analyses.

These steps can be repeated for the different insecticides, either one by one or or in a loop (e.g. “for” loop function).

Example 3

file <- paste0(path.package("BioRssay"), "/Example3.txt") #import the example file from the package
exm3<-read.table(file,header=TRUE)
trnd<-probit.trans(exm3) #probit transformation and correction of data
resist.ratio(trnd$tr.data,LD.value = c(50,95),plot = T) #get LD and RR values with the mortality plot

#>      Slope SlopeSE Intercept InterceptSE    h      g Chi2 df Chi(p)   LD50
#> DZOU  1.64  0.1857      3.77      0.4515 7.94 0.0579   61 17  0.000 0.0050
#> KIS   3.46  0.2648      9.64      0.7206 3.09 0.0279   14 13  0.358 0.0016
#>      LD50min LD50max   LD95 LD95min LD95max rr50 rr50min rr50max rr95 rr95min
#> DZOU   2e-04  0.0408 0.0504  0.0034  0.2650 3.07  2.7700    3.40   10  7.6200
#> KIS    1e-04  0.0099 0.0048  0.0005  0.0254 1.00  0.8991    1.11    1  0.8157
#>      rr95max
#> DZOU   14.00
#> KIS     1.23
model.signif(trnd$tr.data) # test the models significance for each strain
#> Output details
#>   model.pval - significance value of ANOVA on the binomial GLM test of the strain pair
#>   bonferroni - significance of the model.pval with bonferroni correction
#>   res.Dv - residual deviance
#>   thr - threshold for the significance of the pvalue
#>   str - values for the strains
#>   int - values for the interaction between the strain and the dose
#> 
#> $model
#> $model$pairT
#> Analysis of Deviance Table
#> 
#> Model 1: mortality ~ log10(data$dose)
#> Model 2: mortality ~ log10(data$dose) * data$strain
#>   Resid. Df Resid. Dev Df Deviance  Pr(>Chi)    
#> 1        30     703.60                          
#> 2        28     164.17  2   539.43 < 2.2e-16 ***
#> ---
#> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#> 
#> $model$fullM
#>                              Df  Deviance Resid. Df Resid. Dev         F
#> NULL                         NA        NA        31  1839.6531        NA
#> log10(data$dose)              1 1136.0509        30   703.6021 225.62945
#> data$strain                   1  404.0593        29   299.5428  80.24964
#> log10(data$dose):data$strain  1  135.3709        28   164.1719  26.88581
#>                                    Pr(>F)
#> NULL                                   NA
#> log10(data$dose)             6.302655e-15
#> data$strain                  1.029903e-09
#> log10(data$dose):data$strain 1.671886e-05

3. REFERENCES

  1. Finney DJ(1971). Probitanalysis. Cambridge:Cambridge UniversityPress. 350p.

  2. HommelG(1988). A stage wise rejective multiple test procedure based on a modified Bonferroni test. Biometrika 75, 383-6.

  3. Johnson RM, Dahlgren L, Siegfried BD,EllisMD(2013). Acaricide,fungicide and druginteractions in honeybees (Apis mellifera). PLoSONE8(1): e54092.

  4. Robertson, J. L., and H.K. Preisler.1992. Pesticide bioassays with arthropods. CRC, Boca Raton, FL.