Abstract
Abstract
Zebrafish embryos are widely used as a model to monitor the effect of chemicals on their survival and hatching at different time epochs. This experimental design generates longitudinal data in which the observations for a given subject are correlated and they are statistically independent across the subjects. This particular nature of the observations suggests the use of generalized estimating equation (GEE) methodology for performing meaningful statistical analysis. However, it has been observed that the researchers1,2 working in this area have been routinely employing statistical methodologies such as analysis of variance (ANOVA) if the data are continuous and logit or probit models if the data are discrete. In our opinion, it is grossly incorrect to use these methods as they do not take into account the correlation structure mentioned above. The sole purpose of this article is to bring out this serious flaw clearly to the attention of the researchers. For illustration, we have studied the effects of two Ayurvedic bhasmas—Tamra bhasma and Suwarnamakshik bhasma—on survival and hatching of zebrafish embryos over certain time duration. The statistical analysis using GEE reveals a weak promotional effect of Suwarnamakshik bhasma and an inhibitory effect of Tamra bhasma on hatching.
Introduction
Z
Bhasmas are herbomineral medicinal preparations described in Ayurveda. They are prepared from several metals and minerals (including heavy metals like Hg and Pb, and toxic elements like S) by repeated grinding with herbal juices and heat treatments. 15 This repeated grinding leads to generation of small fraction of nanoparticles.16,17 The final product is in the form of fine powders comprising a mixture of partial and complete oxides of the starting metal(s). The selected bhasmas for this study, Tamra bhasma (prepared from copper foils) and Suwarnamakshik bhasma (prepared from iron pyrite), have been prescribed in several diseases, for example, liver disorders, arthritis, eczema, leucoderma, anemia, urinary diseases, and eye diseases. 15 The Tamra bhasma is found to reduce the superoxide dismutase levels in liver homogenates of rats treated orally with a dose of 5 mg/kg body weight indicating its antioxidant property. 18 Oral administration of Suwarnamakshik bhasma to rats at therapeutic dose increases the hemoglobin levels significantly accompanied with a significant decrease in the levels of cholesterol, triglycerides, and very-low-density lipoprotein, in serum without any adverse effects on liver, supporting its therapeutic use. 19 Also, attempts to understand the mode of action of bhasma at molecular level have led to the use of simpler model systems, such as yeast and mammalian cell lines. The previous studies with Jasad bhasma have shown oxygen-deficient zinc oxide nanoparticles in the bhasma preparation. They exert antioxidant effect in yeast and mammalian cell line A-549 by elevating the levels of antioxidant enzymes and protecting the cellular components better compared with the control.20–22 However, a simpler multicellular model, which is more similar to the one for human, and yet easy to handle and manipulate, is needed to screen the bhasmas for their bioactivity and quality. The zebrafish provides a vertebrate model system having these properties.
To study the effect of chemical entities or particles (that constitute these bhasmas too) on the zebrafish embryos, the embryos were incubated in a solution (or suspension) with the desired concentration(s) at 26°C to 28°C for several hours postfertilization (hpf ). Along with other specific biochemical tests, survival (or mortality), hatching, and developmental deformities are the most commonly observed general markers for estimating the gross effect of the chemical(s) or particles. A common way of representing the data about survival, hatching, or deformations of embryos is to report the percentage of embryos survived, hatched, and deformed.23–28 A few authors also report the data as the number of embryos survived, hatched, or deformed in the form of proportion.29–31 The observations thus obtained at a given time epoch principally follow binomial (and hence nonnormal) distribution.
Despite this, over the years, statistical significance of various treatments has been routinely determined by incorrectly employing analysis of variance (ANOVA) methodology meant for normal data to binomial data, at times without testing the data for normal distribution.2,32 Many authors have adopted the approach of testing the data for normality and equality of variance and then applying ANOVA methodology for analysis of the statistical differences between treatments.23,26,29,33–35 The Dunnett test or Tukey test has been used to find the significant difference among the treatments.27,32,34 Analysis for statistical significance has also been performed by using nonparametric methods, in case if the data do not seem to comply with the prerequisites of ANOVA.31,36–39
The probit model has been used to obtain the LC50 and EC50 values for the substance(s) having effect on hatching and survival of the embryos under treatment1,35,37,40,41 The significance of difference between the treatments has also been confirmed by constructing nonoverlapping confidence intervals for the LC50 and EC50 values. 33 The probit analysis has also been used to obtain the median effective time (MET) for survival and hatching of embryos to compare the effect of various metal ions on hatching and survival of the embryos under treatment.3,42
The method of logistic regression analysis considers the binomial nature of responses, such as hatching/no hatching and survival/no survival of embryos. 43 Although it has enriched the modeling scenario by providing a better suitable tool for the analysis of data, it has not been adopted widely. Very few authors have used the MET and/or concentration for hatching and survival of embryos from logistic regression analysis to compare the effect of treatment of chemicals on these parameters.24,30,44
It is to be noted that the use of aforementioned statistical methodologies for the analysis of longitudinal observations is grossly incorrect and that is because none of them takes into account the correlation structure that is present across the observations for a given subject. The only objective of this article was to bring to the attention of researchers working in this area this serious lapse on their part in performing statistical analysis of such data sets. It may, further be noted that, the correct statistical methodology to be used for analyzing longitudinal data set is generalized estimating equation (GEE). 45 GEE is basically used to capture correlation between observations made at different time epochs for a given subject. GEE methodology is available for response variables having standard probability distributions, such as normal, binomial, and Poisson. However, this statistical methodology has not so far been used to study the survival and hatching of zebrafish embryos under the treatment of chemical entities.
Here, the use of GEE methodology has been demonstrated by analyzing the data set that arise from treatment of zebrafish embryos with the selected bhasmas. The next section titled, Materials and Methods, describes the experimental setup used to generate the data and the details of the GEE methodology used for the statistical analysis. In the following sections, we report the findings of our work and provide detailed discussion on it. It is to be highlighted that this is the first attempt in applying GEE methodology to the analysis of zebrafish-related studies. To corroborate the effective use of GEE methodology for longitudinal data that arise in zebrafish-related studies, the help of simulated data from multinomial distribution has also been made.
Materials and Methods
Fish, bhasma, reagents
The wild-type zebrafish (Danio rerio) embryos were purchased from the local market for this study. All experiments were conducted as per the guidelines laid down by the Institute Animal Ethics Committee (IAEC). Tamra bhasma was prepared by Vaidya Ajit Joshi (Private practitioner of Ayurveda, Pune, India) as per the protocols described in Ayurveda. Suwarnamakshik bhasma (Bagewadikar, batch–SM-030708-BAG) was obtained as a gift. The embryo medium was prepared as described previously in Milli Q water using chemicals from SD Fine and Merck. 46 For microscopic observations of development, survival, and hatching of embryos, the Nikon TE2000U microscope was used.
Experiment details
The embryos of age 4 hpf were washed with the embryo medium and placed in 12-well plates as 10 embryos per well. The embryos were then incubated with bhasma suspension (Tamra bhasma and Suwarnamakshik bhasma). Three suspensions of concentrations 1000, 300, and 100 ppm were prepared by adding weighed amounts of the bhasma powder in the embryo medium. Concentrations of 5 and 10 ppm were prepared by diluting the 100 ppm suspension and that of 20 ppm was prepared by diluting the 300 ppm suspension of the respective bhasmas. The plates were observed for survival, hatching, and deformities of embryos at several time points. Each treatment was duplicated in each plate, and three such plates were maintained for each of the runs (total six replicates for a treatment). The details of treatment and time points are given in Table 1.
Statistical treatment of data
The data were recorded as number of embryos survived and hatched, out of 10 embryos per group, on treatment with several concentrations of Tamra bhasma and Suwarnamakshik bhasma. The experiment design included 300 embryos in Run 1 and 360 embryos in Run 2. The observations for survival or hatching of these embryos were made at several time points (Table 1). Hence, the methodology used for data analysis should take into account the longitudinal nature of the data along with its binary nature to understand the effect of the bhasmas on survival and hatching of embryos. Thus, the effect of concentration levels of bhasmas on survival and hatching of the embryos was analyzed by GEE approach, wherein each embryo was considered as an observation.
The GEE methodology is an extension of generalized linear model methodology developed to accommodate the correlation structure that exists between the responses measured at different time points on the same individual.
45
It is a semiparametric method based on marginal model approach, in which the likelihood for the marginal response is considered instead of full specification of the distribution function. The marginal responses are related to the linear combination of the covariates through a link function g(μij) (Equation 1) and the variance Var(Yij) is defined as a function of mean (Equation 2).
where μ
ij
=E (Yij) and Yij is the response for the subject i at time j,
The longitudinal and hence the correlated nature of data accounted for by including the working covariance matrix in the estimating equations. The design of experiments suggests that the correlation between any two observations over the same subject will depend on the distance between the time points of those observations. Thus, among several suggested models of covariance matrix, the first-order autoregressive–AR(1) covariance matrix is a natural one for the measurements taken repeatedly over the time.
45
Because in this model, the correlation between any two observations decreases as the distance between the time points increases (Equation 4).
where
where, Ai is the ti×ti.
The parameter estimation is by quasi-likelihood function where all the distributional assumptions are not necessary and deviations of the response variable from the standard distributions can be accounted for. The GEE estimate of β is the solution of Equation 6, where
The solution is found by using iterative reweighted least square approach. Even with the inaccurate specification of the structure of covariance matrix, the method yields consistent estimator of the regression coefficients and their variances. 47
Thus, the GEE methodology has been found to be more suitable one for the analysis of longitudinal data where the observations are made repeatedly over the same subject. A model selection is usually made using Akaike Information Criterion (AIC). AIC is based on the maximized value of the likelihood function L of the estimated model and a penalty function, the one that measures model complexity. A related criterion called quasi-likelihood under the independence model criterion (QIC) is obtained by replacing L by the quasi-likelihood function 48 and is used to select the models with most appropriate working covariance matrix and explanatory variables. The corrected quasi-likelihood under the independence model criterion (QICC), a corrected version of QIC that penalizes for model complexity, is a more suitable method for selection of covariates. 49 Among the models fitted using several working covariance matrices and combinations of covariates, the one with the least QIC and QICC scores is the best-fitting model to the data.
Using the selected models, the MET for hatching, which is the time at which the probability of hatching of an embryo is 0.5 for given concentration, is estimated. The statistical analysis software SPSS 16.0 has been used to compute the coefficients in the estimating equations.
Results
Figure 1a and b show the raw data of the number of embryos survived and hatched (out of 10 embryos in each group, 6 such groups per treatment) with respect to time at several concentration levels of Tamra bhasma and Suwarnamakshik bhasma, respectively. The preliminary observations made using the raw data show that the Tamra bhasma treatment delays hatching without exerting any adverse effect on the survival of embryos. Although 10 ppm Tamra bhasma (Run 1) seems to improve hatching and all higher concentrations appear to inhibit it when compared to the control group, the spread of data points restrains one from concluding in any specific manner (Fig. 1a). Similar observation can be noted in case of Figure 1b for Suwarnamakshik bhasma. Thus, the analysis of raw data using appropriate statistical method is necessary. In both the cases, no deformities were observed in embryos even at higher concentration levels of the bhasmas (Fig. 2).


Effect of Tamra bhasma and Suwarnamakshik bhasma on development of zebrafish embryos. Scale bar (500 μm).
Analysis of data using ANOVA
The data generated in these experiments are binary and longitudinal. The number of embryos survived (hatched) at a given time point is equal or less (more) than the number recorded on the previous time point. Hence, the prerequisites of ANOVA, that is, normality and independence of observations, are not met by the data. 47 However, when ANOVA methodology is forcefully applied to transformed data (square root transformation), it turns out that both the bhasmas do not have any effect on survival of embryos at all time points and all concentration levels (Supplementary Table S1; Supplementary Data are available online at www.liebertpub.com/zeb). The ANOVA for effect of the bhasma treatments on hatching of embryos along with the Tukey test reveals that the Tamra bhasma has significantly different effect at concentration levels (100, 300, and 1000 ppm). The average values for the treatment groups suggest inhibitory effect of Tamra bhasma on hatching. In a similar vein, Suwarnamakshik bhasma shows slight promotional effect on hatching of embryos at higher concentration level and during early time points (300 ppm) (Supplementary Tables S2 and S3).
Even though an attempt to transform the data using square root transformation to seek normality has been made, the independence assumption required for the use of ANOVA approach cannot be justified for analyzing this data set. In view of this, a technically sound statistical method, namely GEE method, has been employed here with the hope that the true picture regarding the effects of various concentrations of the bhasmas on the survival and hatching of embryos under the treatment will emerge.
Analysis of data using GEE methodology
The GEE methodology, which has been developed using generalized linear models for analyzing the correlated data with non-normal distribution, is more suitable in this case, where the binary observations for survival and hatching of an embryo were recorded repeatedly over several time epochs. 47
The models selected on the basis of lowest QIC and QICC score for Tamra bhasma and Suwarnamakshik bhasma (Tables 2 and 3) in case of survival of embryos show that only time variable is statistically significant (p-value<0.01) and concentration of each of the bhasmas does not affect the survival of embryos. The time affects the survival of the embryos adversely as manifested by the negative sign of the coefficient for time (Tables 2 and 3).
HS, the effect of variable is highly statistically significant (p-value<0.05).
SE, standard error.
HS, the effect of variable is highly statistically significant (p-value<0.05); MS, the effect of variable is marginally statistically significant (p-value 0.05–0.1); IS, the effect of variable is statistically insignificant (p-value>0.1).
The GEE model selected for the effect of Tamra bhasma on hatching of embryos clearly indicates that the concentration variable is highly statistically significant (p-value<0.01) with adverse effect, whereas the time variable turns out to be statistically significant (p-value<0.01) having a favorable effect. The model also indicates that the interaction effect such as “concentration of the bhasma”דtime” is highly statistically significant (Table 2). Figure 3a and b show the individual and cumulative probabilities for survival and hatching estimated using the GEE model for Tamra bhasma. It is to be noted that the MET for hatching, where the probability of hatching for an embryo is 0.5, increases with the concentration indicating the adverse effect of concentration levels of Tamra bhasma. For example, MET for hatching is 48 hpf in case of 10 ppm bhasma treatment and for 100 ppm bhasma, it increases to 63 hpf (Fig. 3c, d). However, in case of Run 2, the parameter estimates for the effect of Tamra bhasma on hatching of embryos (Table 2) are such that the MET estimates for concentration values greater than 256.33 ppm turn out to be negative and hence have been discarded (Supplementary Data).

In case of Suwarnamakshik bhasma, the concentration of bhasma is statistically insignificant variable (p-value>0.1) in model for hatching—Run 1. However, the concentration variable for Suwarnamakshik bhasma is significant (p-value<0.05) in case of hatching—Run 2 (Table 3) with a positive value for the β estimate. Figure 4a and b show the individual and cumulative probabilities estimated from the corresponding selected GEE models for Suwarnamakshik bhasma. The small enhancement in hatching due to the bhasma treatment can be seen in Figure 4b and d. The MET for hatching decreases with increasing concentration, indicating a positive effect of the bhasma on hatching of embryos (Fig. 4d).

Discussion
The longitudinal data are often collected in toxicological studies, pharmacological efficacy studies, and clinical trials. The GEE methodology has been specially developed for performing statistical analysis of longitudinal data.45,47,50 The longitudinal categorical data set often arise during the studies involving the effects of several compounds on the zebrafish embryos as the observations are made in terms of number of embryos survived or dead, hatched or unhatched, presence or absence of deformities, and so on over a certain time period. It may be noted that ANOVA and other statistical methods, such as probit, logit, and logistic regression analysis, have been incorrectly used for statistical analysis of such data sets.1,41 The reason that these methods are invalid is that they do not take into account the longitudinal nature of the data that arise from such studies.
The Tamra bhasma is prepared from copper foils, and the final product is in the form of fine particles of copper oxide. 17 Although appropriate, zebrafish model has not been used to study the bioactivity and quality of the bhasmas. Previous studies on the effect of copper ions on zebrafish embryos report delay in hatching of zebrafish embryos due to treatment of copper ions.42,51,52 The complete inhibition of hatching of zebrafish embryos incubated in 58 μg/L of copper solution has been observed. 51 Even though the survival of the embryos was unaffected at low concentration levels of copper ions (0.125 μg/L), higher concentrations of copper ions (128 μg/L) killed all embryos within 24 hpf. 42 The copper ions also induced the deformities, such as malformation of yolk sack and notochord, yolk sac edema, and slower heart rate in the developing embryos when treated with 58 μg/L of copper followed by 58 μg/L of lead solution. 51 The copper ion-treated embryos also had decreased larval lengths, increased heart rate, and impaired neuromast development when compared with controls. 52 The yolk sack edema was attributed to the slower development, whereas the increased heart rate was considered as the sign of stress response in case of copper-treated embryos. Thus, impaired development along with the inhibition of hatching enzyme (chorionase) due to copper ions can be attributed to the delayed hatching in these studies.53,54 The recent studies with copper oxide nanoparticles have confirmed that the dissolved copper ions from the nanoparticles inhibit the hatching enzyme. 55
In our studies, the statistical analysis performed on the data set to study the effect of Tamra bhasma on survival and hatching of zebrafish, using a more formal and suitable method such as GEE, confirms the preliminary observations made using the raw data. Basically, the GEE analysis shows that (a) the bhasma does not have any statistically significant effect on survival of embryos and (b) it does have statistically significant negative impact on hatching of embryos. This bhasma treatment results in delaying the hatching process. However, the response of a fish embryo to the treatment may depend on factors, such as age of an embryo before starting the treatment and genetic variability from batch to batch. Here, the wild-type embryos from two separate batches procured from market separately were used. Hence, the difference in response to the treatments noted here as delay in hatching and evaluated as MET for hatching may be due to the genetic variability across the embryos.
The Suwarnamakshik bhasma is prepared from the mineral copper–iron pyrite, and the final product contains micron- and submicron-sized particles of copper and iron oxides. 56 However, the Suwarnamakshik bhasma procured by the authors is found to be mainly Fe2O3 in fine particulate form. In 1985, Dave reported the effect of Fe3+ on survival and hatching of zebrafish embryos at different pH. Fe gets precipitated out from the solution at neutral pH even at low concentration as hydroxides. Hence, it fails to exert any effect on survival at this pH. However, small particles of iron hydroxides are capable of blocking the pores of the egg chorion. The blocking of pores induces hypoxia and stress, which may trigger the hatching of embryos by the release of chorionase. 3 However, findings from the studies by Zhu et al. to assess the toxicity of iron oxide nanoparticles are not in agreement; wherein the incubation of embryos with iron oxide (Fe2O3) nanoparticles at concentrations >10 mg/L affected the survival and hatching negatively and also caused developmental deformities. 2 The GEE analysis of the data presented here for the effect of Suwarnamakshik bhasma on survival and hatching of zebrafish embryos shows that the survival is unaffected by the bhasma and the hatching of embryos is slightly promoted by the bhasma. The slight promotion of hatching by fine Fe2O3 particles of the bhasma, present in sufficient number at higher concentration levels, may follow the mechanism reported by Dave. However, the promotional effect can only be observed during early hours of hatching. In case of Run 2, the time points are sufficiently closely spaced to observe this promotional effect of the bhasma. The Run 1 fails to identify this effect due to more distantly spaced time points.
In case of both the bhasmas, the development of embryos was not affected even at higher concentration levels. This indicates no developmental toxicity is posed by these bhasmas at least at the dose levels studied here. 4 These studies also demonstrate that the zebrafish embryos can be used to evaluate the effects of bhasmas and hence can successfully be used as model systems with appropriate statistical analytical tool, such as GEE.
It is to be noted with caution here that, in this particular study, the results provided by ANOVA and GEE turn out to be similar. However, this may not be true in all such cases. To verify this, it was not possible to perform the statistical analysis of previously published similar data using GEE methodology due to nonavailability of the same. This has prompted us to simulate the data similar to the experimental data using multinomial distribution, and the software used for the same is MatLab R2010a (version 7.10.0.499).
Generation and statistical analysis of simulated data sets
The data, similar to experimental data, have been simulated using multinomial distributions having (n,
>>n=10;
>>p=[p1, p2, p3, p4];
>>r=mnrnd (n,p,4).
Here, we have used p=(0.085, 0.12, 0.14, 0.655) for concentration level 1 (0 ppm),
This data set was then represented in manner similar to experimental data (Table 4) and was analyzed using both ANOVA and GEE for comparison of results.
The replicates are entered as multiple entries in a cell.
The simulated data set when analyzed with ANOVA showed no significant (p-value>0.05) effect of concentration levels on survival of embryos (Fig. 5b) at different time points. On the other hand, GEE analysis of same data set shows that the concentration levels do have a highly significant (p-value<0.01) negative effect on survival of embryos (Fig. 5c).

Simulated data set demonstrating the difference in analyses and inferences by ANOVA and GEE.
The use of ANOVA methodology is appropriate only when
(A) observations under consideration are continuous, normally distributed, and statistically independent
or
(B) observations (discrete or continuous) represent partial sums or sample means of independent observations and the sample size is very large. In this case, the central limit theorem (CLT) can be invoked to ensure the assumption of normality. However, if the number of observations is not sufficiently large (say n≥30) (also in case of examples discussed here), then the CLT cannot be used and the ANOVA methodology does not become applicable. Moreover, ANOVA cannot be used when the observations are longitudinal in nature.
One may use MANOVA (multivariate version of ANOVA) for the data
Typically, the number of observations used in zebrafish-related studies being small, neither ANOVA (for fixed time points) nor MANOVA (for vector valued data) can be employed to study the statistical significance of the treatment effects.
In light of the above discussion, the only suitable methodology that could be used to analyze this type of repeated measurement or longitudinal data is GEE and the results obtained using GEE are more reliable.
Conclusion
The valid statistical methodology, GEE, for analyzing longitudinal data set that arises in the study of effects of an ancient Ayurvedic preparations, Tamra bhasma and Suwarnamakshik bhasma, on zebrafish embryos reveals that (a) the treatment of Tamra bhasma results in delaying the hatching process. The MET for hatching increases with the bhasma concentration. The delay is attributed to the inhibition of hatching enzymes by the copper ions from the bhasma particles. Also (b) the Suwarnamakshik bhasma promotes hatching at higher concentrations. None of the bhasmas affects the development of the zebrafish embryos or led to the deformations. The minute but statistically significant effect of Suwarnamakshik bhasma on hatching of zebrafish is realized through GEE. Simulated data sets have been used to indicate the appropriateness of GEE methodology over ANOVA for statistical analysis of longitudinal data sets.
Footnotes
Acknowledgment
We are thankful to S. G. Kane and Industrial Research and Consultancy Centre (IRCC), IIT Bombay, and Nano Mission of the Department of Science and Technology (DST), New Delhi, for funding this work.
Disclosure Statement
No competing financial interests exist.
References
Supplementary Material
Please find the following supplemental material available below.
For Open Access articles published under a Creative Commons License, all supplemental material carries the same license as the article it is associated with.
For non-Open Access articles published, all supplemental material carries a non-exclusive license, and permission requests for re-use of supplemental material or any part of supplemental material shall be sent directly to the copyright owner as specified in the copyright notice associated with the article.
