Abstract: Probability distributions and their generalizations have contributed greatly to the modeling and analysis of random variables.
However, with the growing number of new distributions in the literature, a major problem has arisen in their application: deciding which distribution is most appropriate for a given set of data. Often the data set in question fits two or more probability distributions, and one has to be chosen among them. The Lomax-Weibull and Lomax-Log-Logistic distributions, introduced in an earlier study using a Lomax-based generator, were found to be positively skewed and may be subject to this problem, especially when modeling positively skewed datasets. In this article, we apply the two distributions to some selected datasets to compare their performance and provide useful insight on how to select the better-fitting of the two in real-life situations. We used the value of the log-likelihood function, AIC, CAIC, BIC, HQIC, and the Cramér-von Mises (W*) and Anderson-Darling (A*) statistics as performance evaluation tools for selecting between the two distributions.
Keywords: Lomax-Weibull distribution, Lomax-Log-Logistic distribution, Lomax-based generator, Performance Evaluation.
1. Introduction
A clear choice between two related probability distribution functions is vital and has been addressed by researchers such as Atkinson (1969, 1970), Dumonceaux et al. (1973), Dumonceaux and Antle (1973) and Kundu and Manglick (2004, 2005), among others.
The Lomax distribution was pioneered by Lomax (1954) to model business failure data. This distribution has found wide application in different fields of human endeavor, including income and wealth inequality, size of cities, actuarial science, medical and biological sciences, engineering, and lifetime and reliability modeling.
A random variable X is said to follow a Lomax distribution with a shape parameter and a scale parameter if its probability density function (pdf) is given by (1), with the corresponding cumulative distribution function (cdf) given by (2).
According to Cordeiro et al. (2014), the cdf and pdf of the Lomax-G family of distributions (based on a Lomax generator) are respectively given by (3) and (4), where g(x) and G(x) are the pdf and cdf of any continuous distribution to be generalized, and the two additional positive parameters control the scale and the shape of the distribution.
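For reference, expressions (1) to (4) can be written in their commonly cited forms as below. This is a sketch under stated assumptions: the symbols \alpha (shape) and \lambda (scale) for the Lomax distribution, and \alpha (shape) and \beta (scale) for the two additional parameters of the Lomax-G family, are chosen here for illustration, and the parameterisation may differ from the one intended in the text.

```latex
% (1)-(2): Lomax pdf and cdf, assumed symbols \alpha (shape) and \lambda (scale)
f(x) = \frac{\alpha}{\lambda}\left(1 + \frac{x}{\lambda}\right)^{-(\alpha + 1)},
\qquad
F(x) = 1 - \left(1 + \frac{x}{\lambda}\right)^{-\alpha},
\qquad x > 0.

% (3)-(4): Lomax-G cdf and pdf, obtained by evaluating a Lomax cdf with
% scale \beta at t = -\log[1 - G(x)] and differentiating with respect to x
F(x) = 1 - \left\{\frac{\beta}{\beta - \log[1 - G(x)]}\right\}^{\alpha},
\qquad
f(x) = \frac{\alpha\,\beta^{\alpha}\, g(x)}
            {[1 - G(x)]\,\{\beta - \log[1 - G(x)]\}^{\alpha + 1}}.
```

The pdf follows from the cdf by the chain rule, since the derivative of -\log[1 - G(x)] with respect to x is g(x)/[1 - G(x)].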
The rest of this article is organized as follows. Section 2 defines the Lomax-Weibull and Lomax-Log-Logistic distributions and describes the goodness-of-fit tests. Section 3 presents the datasets, their summaries and their analysis. Finally, Section 4 offers some concluding remarks.
2 Materials and Methods
2.1 The Lomax-Weibull Distribution (LWD)
The Weibull distribution is a very popular continuous probability distribution named after the Swedish engineer, scientist and mathematician Waloddi Weibull (1887-1979). The distribution was proposed and applied in 1939 to analyze the breaking strength of materials. Since then, it has been widely used for analyzing lifetime data in reliability engineering. It is a versatile distribution that can take on the characteristics of other types of distributions, depending on the value of the shape parameter.
The Weibull distribution is a widely used statistical model for studying fatigue and endurance life in engineering devices and materials.
If a random variable X follows a Weibull distribution with scale parameter a > 0 and shape parameter b > 0, then its cdf and pdf are respectively given by (5) and (6).
By substituting equations (5) and (6) into (3) and (4) and simplifying, we obtain the cdf and pdf of the Lomax-Weibull distribution as (7) and (8) respectively.
The following is a plot of the pdf of the LWD at different parameter values.
Figure 1: The graph of the pdf of the LWD at different parameter values.
Considering the plot above, we can say that the LWD is skewed to the right with a very high degree of peakedness and can be used for modeling positively skewed data sets with high kurtosis.
2.2 The Lomax-Log-Logistic Distribution (LLD)
The log-logistic distribution, also referred to as the Fisk distribution in economics, is a continuous probability distribution for a non-negative random variable.
The log-logistic distribution is often used to model random lifetime data and hence has applications in reliability analyses.
The cdf and pdf of the log-logistic distribution are respectively given by (9) and (10), where a > 0 and b > 0 are the scale and shape parameters respectively.
By substituting equations (9) and (10) into (3) and (4) and simplifying, we obtain the cdf and pdf of the Lomax-Log-Logistic distribution as (11) and (12) respectively.
Below is a graph of the pdf of the LLD for some selected values of the model parameters.
Figure 2: The graph of the pdf of the LLD at different parameter values.
The plot of the pdf shows that the LLD is positively skewed with a very low coefficient of kurtosis and is therefore suited only to right-skewed datasets with moderate kurtosis.
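The substitution step used for both distributions can be sketched in code. This is a minimal illustration, not the authors' implementation. It assumes the commonly cited Lomax-G form with shape `alpha` and scale `beta`, a Weibull baseline G(x) = 1 - exp(-(x/a)^b) and a log-logistic baseline G(x) = z/(1 + z) with z = (x/a)^b. These parameterisations, all function names and the parameter values in the demo loop are assumptions made for illustration.

```python
import math

def lomax_g_cdf(G, alpha, beta):
    """Lomax-G cdf for a baseline cdf value G in [0, 1) (assumed generator form)."""
    return 1.0 - (beta / (beta - math.log(1.0 - G))) ** alpha

def lomax_g_pdf(g, G, alpha, beta):
    """Lomax-G pdf for baseline pdf value g and baseline cdf value G in [0, 1)."""
    t = beta - math.log(1.0 - G)
    return alpha * beta ** alpha * g / ((1.0 - G) * t ** (alpha + 1))

def weibull(x, a, b):
    """(pdf, cdf) of the assumed Weibull baseline with scale a and shape b, x > 0."""
    z = (x / a) ** b
    return (b / a) * (x / a) ** (b - 1) * math.exp(-z), 1.0 - math.exp(-z)

def log_logistic(x, a, b):
    """(pdf, cdf) of the assumed log-logistic baseline with scale a and shape b, x > 0."""
    z = (x / a) ** b
    return (b / a) * (x / a) ** (b - 1) / (1.0 + z) ** 2, z / (1.0 + z)

def lwd_pdf(x, alpha, beta, a, b):
    g, G = weibull(x, a, b)
    return lomax_g_pdf(g, G, alpha, beta)

def lwd_cdf(x, alpha, beta, a, b):
    _, G = weibull(x, a, b)
    return lomax_g_cdf(G, alpha, beta)

def lld_pdf(x, alpha, beta, a, b):
    g, G = log_logistic(x, a, b)
    return lomax_g_pdf(g, G, alpha, beta)

def lld_cdf(x, alpha, beta, a, b):
    _, G = log_logistic(x, a, b)
    return lomax_g_cdf(G, alpha, beta)

# Illustrative parameter values only; they are not estimates from any dataset.
for x in (0.5, 1.0, 2.0):
    print(x, lwd_pdf(x, 1.5, 2.0, 1.0, 2.0), lld_pdf(x, 1.5, 2.0, 1.0, 2.0))
```

Keeping the generator separate from the baselines means any other continuous baseline can be plugged in by supplying its own (pdf, cdf) pair. For very large x the baseline cdf rounds to 1 in floating point and the logarithm fails, so a production version would work with the baseline survival function directly.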
2.3 Goodness-of-Fit Test
To compare these two distributions, we considered the following criteria: the value of the log-likelihood function evaluated at the MLEs (ll), the AIC (Akaike Information Criterion), CAIC (Consistent Akaike Information Criterion), BIC (Bayesian Information Criterion) and HQIC (Hannan-Quinn Information Criterion). These statistics are given as:
$AIC = -2ll + 2k$, $CAIC = -2ll + \frac{2kn}{n-k-1}$, $BIC = -2ll + k\log(n)$ and $HQIC = -2ll + 2k\log[\log(n)]$,
where ll denotes the log-likelihood function evaluated at the MLEs, k is the number of model parameters and n is the sample size.
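As a worked check, the four criteria can be recomputed for the LWD fit to dataset I (n = 128, four estimated parameters) reported in Table 8. The sketch below assumes the usual definitions of AIC, CAIC, BIC and HQIC. It also assumes that the tabulated "log-likelihood value" is the negated log-likelihood, since only that sign reproduces the tabulated AIC. The function name `information_criteria` is hypothetical.

```python
import math

def information_criteria(neg_loglik, k, n):
    """AIC, CAIC, BIC and HQIC from the negated maximised log-likelihood.

    neg_loglik: -ll evaluated at the MLEs (assumed to be what Table 8 reports)
    k: number of model parameters
    n: sample size
    """
    base = 2.0 * neg_loglik  # equals -2 * ll
    return {
        "AIC": base + 2 * k,
        "CAIC": base + 2 * k * n / (n - k - 1),
        "BIC": base + k * math.log(n),
        "HQIC": base + 2 * k * math.log(math.log(n)),
    }

# LWD fitted to dataset I: value 420.7675, four parameters, 128 observations.
crit = information_criteria(420.7675, k=4, n=128)
for name, value in crit.items():
    print(f"{name} = {value:.4f}")
```

The results agree with the tabulated AIC, CAIC, BIC and HQIC for this row to within about 0.001. The small gap is consistent with the log-likelihood having been rounded to four decimal places before tabulation.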
We also used goodness-of-fit tests to determine which distribution fits the data better, applying the Cramér-von Mises (W*) and Anderson-Darling (A*) statistics. Further information about these statistics can be obtained from Chen and Balakrishnan (1995). They are computed from the known cdf of the fitted model, which depends on a k-dimensional parameter vector, and from the standard normal quantile function.
Note: In decision making, the model with the lowest values of these statistics is chosen as the best-fitting model.
3 Results and Discussions
3.1 Analysis of Data
In this section, both the LWD and the LLD were fitted to seven different datasets, and the test statistics of Section 2.3 were applied in order to discriminate between the two distributions.
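The W* and A* statistics applied here can be computed along the following lines. This is a minimal sketch, assuming the modified forms usually attributed to Chen and Balakrishnan (1995). The fitted cdf values are mapped to normal scores, standardised and mapped back to the unit interval. The classical Cramér-von Mises and Anderson-Darling sums are then scaled by small-sample factors. The function name `cvm_ad_statistics`, the exponential cdf and the six data values are hypothetical and are not taken from the datasets analysed in this article.

```python
import math
from statistics import NormalDist, mean, stdev

def cvm_ad_statistics(data, cdf):
    """Return (W*, A*) for `data` under a fitted model with cdf `cdf`.

    `cdf` must return values strictly between 0 and 1 for every observation.
    """
    std_normal = NormalDist()
    x = sorted(data)
    n = len(x)
    v = [cdf(xi) for xi in x]                      # fitted cdf at the ordered data
    y = [std_normal.inv_cdf(vi) for vi in v]       # normal scores
    y_bar, s_y = mean(y), stdev(y)                 # stdev uses the n - 1 divisor
    u = [std_normal.cdf((yi - y_bar) / s_y) for yi in y]

    # Classical Cramér-von Mises and Anderson-Darling sums on the u values.
    w2 = sum((ui - (2 * i - 1) / (2 * n)) ** 2
             for i, ui in enumerate(u, start=1)) + 1 / (12 * n)
    a2 = -n - sum((2 * i - 1) * math.log(ui)
                  + (2 * n + 1 - 2 * i) * math.log(1 - ui)
                  for i, ui in enumerate(u, start=1)) / n

    # Small-sample modifications (assumed correction factors).
    w_star = w2 * (1 + 0.5 / n)
    a_star = a2 * (1 + 0.75 / n + 2.25 / n ** 2)
    return w_star, a_star

# Hypothetical example: six made-up observations and an exponential cdf.
sample = [0.2, 0.5, 0.9, 1.4, 2.3, 3.1]
w_star, a_star = cvm_ad_statistics(sample, lambda t: 1 - math.exp(-t / 1.4))
print(w_star, a_star)
```

In practice the `cdf` argument would be the fitted LWD or LLD cdf evaluated at its maximum likelihood estimates, and the model giving the smaller W* and A* would be preferred.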
The available data sets and their respective summary statistics are as follows.

Dataset I: This dataset represents the remission times (in months) of a random sample of 128 bladder cancer patients. It has previously been used by Lee and Wang (2003). It is summarized as follows:
Table 1: Summary statistics for dataset I
Parameters  n    Minimum  Q1     Median  Q3      Mean   Maximum  Variance  Skewness  Kurtosis
Values      128  0.0800   3.348  6.395   11.840  9.366  79.05    110.425   3.3257    19.1537

Dataset II: This dataset is the strength data of glass of the aircraft window reported by Fuller et al. (1994).
Table 2: Summary statistics for dataset II
Parameters  n    Minimum  Q1     Median  Q3      Mean   Maximum  Variance  Skewness  Kurtosis
Values      31   18.83    25.51  29.90   35.83   30.81  45.38    52.61     0.43      2.38

Dataset III: This dataset represents the waiting times (in minutes) before service of 100 bank customers, examined and analyzed by Ghitany et al. (2013) for fitting the Lindley distribution.
Table 3: Summary statistics for dataset III
Parameters  n    Minimum  Q1     Median  Q3      Mean   Maximum  Variance  Skewness  Kurtosis
Values      100  0.80     4.675  8.10    13.020  9.877  38.500   52.3741   1.4953    5.7345

Dataset IV: This dataset represents lifetime data relating to relief times (in minutes) of 20 patients receiving an analgesic, reported by Gross et al. (1975) and used by Shanker et al. (2016).
Table 4: Summary statistics for dataset IV
Parameters  n    Minimum  Q1     Median  Q3      Mean   Maximum  Variance  Skewness  Kurtosis
Values      20   1.10     1.475  1.70    2.05    1.90   4.10     0.4958    1.8625    7.1854

Dataset V: This dataset represents the survival times in weeks for male rats (Lawless, 2003).
Table 5: Summary statistics for dataset V
Parameters  n    Minimum  Q1     Median  Q3      Mean    Maximum  Variance  Skewness  Kurtosis
Values      20   40.00    86.75  119.00  140.80  113.45  165.00   1280.892  -0.3552   2.2120

Dataset VI: This dataset is from Lawless (1982).
The data arose in tests on the endurance of deep groove ball bearings and give the number of million revolutions before failure for each of the 23 ball bearings in the life tests. Its summary is given as follows:
Table 6: Summary statistics for dataset VI
Parameters  n    Minimum  Q1     Median  Q3     Mean   Maximum  Variance  Skewness  Kurtosis
Values      23   17.88    47.20  67.80   95.88  72.23  173.40   1404.78   1.0089    3.9288

Dataset VII: This dataset represents 66 observations of the breaking stress of carbon fibres of 50 mm length (in GPa) given by Nicholas and Padgett (2006). The descriptive statistics for this data are as follows:
Table 7: Descriptive statistics for dataset VII
Parameters  n    Minimum  Q1     Median  Q3     Mean   Maximum  Variance  Skewness  Kurtosis
Values      66   0.390    2.178  2.835   3.278  2.760  4.900    0.795     -0.1285   3.2230

From the summary statistics of the seven data sets, we found that data sets I, II, III, IV and VI are positively skewed, while V is approximately normal. Also, data sets I, III and IV have higher kurtosis, while the others have a moderate level of peakedness.

Table 8: Performance of the distributions using the AIC, CAIC, BIC and HQIC values of the models' MLEs based on datasets I-VII.
Dataset  Model  Log-likelihood value  Parameter estimates             AIC       CAIC      BIC       HQIC      Rank
I        LWD    420.7675              0.3928, 0.8735, 4.4202, 6.5906  849.5355  849.8607  860.9437  854.1707  2
I        LLD    411.4727              7.9519, 1.6252, 8.1254, 5.4517  830.9454  831.2707  842.3536  835.5806  1
II       LWD    146.435               0.0987, 0.7832, 7.1911, 5.3806  300.8701  302.4085  306.606   302.7398  1
II       LLD    148.548               9.5745, 3.3012, 2.2311, 6.2759  305.096   306.6345  310.832   306.9658  2
III      LWD    342.2547              0.5010, 0.7455, 3.4439, 8.6494  692.5095  692.9305  702.9302  696.7269  2
III      LLD    319.8772              9.5864, 2.2868, 7.5884, 4.8861  647.7543  648.1754  658.175   651.9718  1