
52.1 Introduction to distrib

Package distrib contains a set of functions for making probability computations on both discrete and continuous univariate models.

What follows is a short reminder of basic definitions from probability theory.

Let \(f(x)\) be the density function of an absolutely continuous random variable \(X\). The distribution function is defined as

\[F\left(x\right)=\int_{ -\infty }^{x}{f\left(u\right)\;du} \]

which equals the probability \({\rm Pr}(X \le x)\).
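As a quick check of this definition, the sketch below compares the package's distribution function with a numerical integral of the density. It assumes a standard normal model and the evaluation point \(x=1\), both chosen only for illustration; the variable names F_pkg and F_int are hypothetical, and quad_qagi is Maxima's QUADPACK routine for infinite intervals rather than part of distrib.

```maxima
load("distrib")$

/* Distribution function of a Normal(0,1) variable at x = 1,
   as provided by the package, converted to a float. */
F_pkg : float(cdf_normal(1, 0, 1));

/* The same probability from the definition: integrate the density
   from minus infinity to 1. quad_qagi returns a list whose first
   element is the approximate value of the integral. */
F_int : first(quad_qagi(pdf_normal(u, 0, 1), u, minf, 1));

/* The two values should agree up to quadrature error. */
abs(F_pkg - F_int);
```

The prompts (%i1), (%i2), ... are left out so that the lines can be pasted into a session or run as a batch file.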

The mean value is a location parameter and is defined as

\[E\left[X\right]=\int_{ -\infty }^{\infty }{x\,f\left(x\right)\;dx} \]

The variance is a measure of variation,

\[V\left[X\right]=\int_{ -\infty }^{\infty }{f\left(x\right)\,\left(x -E\left[X\right]\right)^2\;dx} \]

which is a nonnegative real number. The square root of the variance is the standard deviation, \(D[X]=\sqrt{V[X]}\), and it is another measure of variation.

The skewness coefficient is a measure of asymmetry,

\[SK\left[X\right]={{\int_{ -\infty }^{\infty }{f\left(x\right)\, \left(x-E\left[X\right]\right)^3\;dx}}\over{D\left[X\right]^3}} \]

The kurtosis coefficient measures the peakedness of the distribution,

\[KU\left[X\right]={{\int_{ -\infty }^{\infty }{f\left(x\right)\, \left(x-E\left[X\right]\right)^4\;dx}}\over{D\left[X\right]^4}}-3 \]

If \(X\) is Gaussian, \(KU[X]=0\). Both skewness and kurtosis are shape parameters used to measure how far a distribution departs from the Gaussian one.
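The definitions above can be evaluated numerically and compared with the closed forms returned by the package. The following is a minimal sketch, assuming a Normal(1,2) model picked only for illustration; central_moment is a hypothetical helper, not a distrib function, and the integrals are approximated with quad_qagi.

```maxima
load("distrib")$

/* Hypothetical helper: integral of (x - c)^n * f(x) over the real
   line, where f is the density of a Normal(m, s) variable. */
central_moment(n, c, m, s) :=
  first(quad_qagi((x - c)^n * pdf_normal(x, m, s), x, minf, inf))$

/* Mean, variance, standard deviation, skewness and kurtosis
   computed directly from their integral definitions. */
mu : first(quad_qagi(x * pdf_normal(x, 1, 2), x, minf, inf));
v  : central_moment(2, mu, 1, 2);
d  : sqrt(v);
sk : central_moment(3, mu, 1, 2) / d^3;
ku : central_moment(4, mu, 1, 2) / d^4 - 3;

/* Closed forms from the package, for comparison. */
[mean_normal(1, 2), var_normal(1, 2), std_normal(1, 2),
 skewness_normal(1, 2), kurtosis_normal(1, 2)];
```

Up to quadrature error, sk and ku should both be close to zero, in line with the remark on Gaussian variables.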

If the random variable \(X\) is discrete, the density, or probability, function \(f(x)\) takes positive values on a countable set of numbers \(x_i\), and is zero elsewhere. In this case, the distribution function is

\[ F\left(x\right)=\sum_{x_{i}\leq x}{f\left(x_{i}\right)} \]

The mean, variance, standard deviation, skewness coefficient and kurtosis coefficient take the form

\[\eqalign{ E\left[X\right]&=\sum_{x_{i}}{x_{i}f\left(x_{i}\right)}, \cr V\left[X\right]&=\sum_{x_{i}}{f\left(x_{i}\right)\left(x_{i}-E\left[X\right]\right)^2},\cr D\left[X\right]&=\sqrt{V\left[X\right]},\cr SK\left[X\right]&={{\sum_{x_{i}}{f\left(x_{i}\right)\, \left(x_{i}-E\left[X\right]\right)^3}}\over{D\left[X\right]^3}}, \cr KU\left[X\right]&={{\sum_{x_{i}}{f\left(x_{i}\right)\, \left(x_{i}-E\left[X\right]\right)^4}}\over{D\left[X\right]^4}}-3, } \]

respectively.
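For a model with finite support the sums can be evaluated exactly. The sketch below assumes a binomial model with 4 trials and success probability 1/3, chosen only for illustration, and compares the sums with the corresponding package functions; the variable names are hypothetical.

```maxima
load("distrib")$

/* Binomial model with 4 trials and success probability 1/3;
   its support is the finite set {0, 1, 2, 3, 4}. */
mu : sum(k * pdf_binomial(k, 4, 1/3), k, 0, 4);
v  : sum(pdf_binomial(k, 4, 1/3) * (k - mu)^2, k, 0, 4);
sk : sum(pdf_binomial(k, 4, 1/3) * (k - mu)^3, k, 0, 4) / sqrt(v)^3;
ku : sum(pdf_binomial(k, 4, 1/3) * (k - mu)^4, k, 0, 4) / v^2 - 3;

/* Compare with the closed forms provided by the package.
   Mean and variance are rational numbers, so they can be
   compared exactly; skewness and kurtosis are shown as floats. */
is(mu = mean_binomial(4, 1/3));
is(v = var_binomial(4, 1/3));
float([sk, skewness_binomial(4, 1/3)]);
float([ku, kurtosis_binomial(4, 1/3)]);
```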

Package distrib follows a naming convention. Every function name has two parts. The first part refers to the function or parameter we want to calculate:

Functions:
   Density function            (pdf_*)
   Distribution function       (cdf_*)
   Quantile                    (quantile_*)
   Mean                        (mean_*)
   Variance                    (var_*)
   Standard deviation          (std_*)
   Skewness coefficient        (skewness_*)
   Kurtosis coefficient        (kurtosis_*)
   Random variate              (random_*)

The second part is an explicit reference to the probabilistic model:

Continuous distributions:
   Normal              (*normal)
   Student             (*student_t)
   Chi^2               (*chi2)
   Noncentral Chi^2    (*noncentral_chi2)
   F                   (*f)
   Exponential         (*exp)
   Lognormal           (*lognormal)
   Gamma               (*gamma)
   Beta                (*beta)
   Continuous uniform  (*continuous_uniform)
   Logistic            (*logistic)
   Pareto              (*pareto)
   Weibull             (*weibull)
   Rayleigh            (*rayleigh)
   Laplace             (*laplace)
   Cauchy              (*cauchy)
   Gumbel              (*gumbel)

Discrete distributions:
   Binomial             (*binomial)
   Poisson              (*poisson)
   Bernoulli            (*bernoulli)
   Geometric            (*geometric)
   Discrete uniform     (*discrete_uniform)
   Hypergeometric       (*hypergeometric)
   Negative binomial    (*negative_binomial)
   Finite discrete      (*general_finite_discrete)

For example, pdf_student_t(x,n) is the density function of the Student distribution with n degrees of freedom; std_pareto(a,b) is the standard deviation of the Pareto distribution with parameters a and b; and kurtosis_poisson(m) is the kurtosis coefficient of the Poisson distribution with mean m.

To use package distrib, first load it by typing

(%i1) load("distrib")$
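Once the package is loaded, any prefix from the first table can be combined with any suffix from the second. The sketch below illustrates this with a few calls; the numerical arguments are arbitrary values chosen for illustration, and q and sample are hypothetical variable names.

```maxima
load("distrib")$

/* pdf_ + student_t: Student density with 3 degrees of freedom at x = 1. */
float(pdf_student_t(1, 3));

/* std_ + pareto: standard deviation of a Pareto model with
   parameters a = 3 and b = 1. */
float(std_pareto(3, 1));

/* kurtosis_ + poisson: kurtosis coefficient of a Poisson model
   with mean 5. */
kurtosis_poisson(5);

/* quantile_ + normal: the quantile inverts the distribution function,
   so applying cdf_normal to q should recover the probability 0.975. */
q : quantile_normal(0.975, 0, 1);
cdf_normal(q, 0, 1);

/* random_ + normal: the optional third argument is the sample size,
   so this returns a list of 5 simulated Normal(0,1) values. */
sample : random_normal(0, 1, 5);
```

The random variates differ from run to run, so no particular output should be expected from the last call.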

For comments, bugs or suggestions, please contact the author at ’riotorto AT yahoo DOT com’.

Categories: Statistical functions · Share packages · Package distrib
