The Gram–Charlier A series (named in honor of Jørgen Pedersen Gram and Carl Charlier) and the Edgeworth series (named in honor of Francis Ysidro Edgeworth) are series that approximate a probability distribution in terms of its cumulants.[1] The two series contain the same terms, but they arrange them differently, and so the accuracy obtained by truncating them differs.[2] The key idea of these expansions is to write the characteristic function of the distribution whose probability density function f is to be approximated in terms of the characteristic function of a distribution with known and suitable properties, and to recover f through the inverse Fourier transform.
Gram–Charlier A series
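We examine a continuous random variable. Let $\hat{f}$ be the characteristic function of its distribution, whose density function is $f$, and let $\kappa_r$ be its cumulants. We expand in terms of a known distribution with probability density function $\psi$, characteristic function $\hat{\psi}$, and cumulants $\gamma_r$. The density $\psi$ is generally chosen to be that of the normal distribution, but other choices are possible as well. By the definition of the cumulants, we have (see Wallace, 1958)[3]
$$\hat{f}(t) = \exp\left[\sum_{r=1}^{\infty} \kappa_r \frac{(it)^r}{r!}\right]$$
and
$$\hat{\psi}(t) = \exp\left[\sum_{r=1}^{\infty} \gamma_r \frac{(it)^r}{r!}\right],$$
which gives the following formal identity:
$$\hat{f}(t) = \exp\left[\sum_{r=1}^{\infty} (\kappa_r - \gamma_r) \frac{(it)^r}{r!}\right] \hat{\psi}(t).$$
By the properties of the Fourier transform, $(it)^r \hat{\psi}(t)$ is the Fourier transform of $(-1)^r [D^r \psi](-x)$, where $D$ is the differential operator with respect to $x$. Thus, after replacing $x$ with $-x$ on both sides of the equation, we find for $f$ the formal expansion
$$f(x) = \exp\left[\sum_{r=1}^{\infty} (\kappa_r - \gamma_r) \frac{(-D)^r}{r!}\right] \psi(x).$$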
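If $\psi$ is chosen as the normal density
$$\phi(x) = \frac{1}{\sqrt{2\pi}\,\sigma} \exp\left[-\frac{(x-\mu)^2}{2\sigma^2}\right]$$
with mean and variance as given by $f$, that is, mean $\mu = \kappa_1$ and variance $\sigma^2 = \kappa_2$, then the expansion becomes
$$f(x) = \exp\left[\sum_{r=3}^{\infty} \kappa_r \frac{(-D)^r}{r!}\right] \phi(x),$$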
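since $\gamma_r = 0$ for all $r > 2$, as higher cumulants of the normal distribution are 0. By expanding the exponential and collecting terms according to the order of the derivatives, we arrive at the Gram–Charlier A series. Such an expansion can be written compactly in terms of Bell polynomials as
$$\exp\left[\sum_{r=3}^{\infty} \kappa_r \frac{(-D)^r}{r!}\right] = \sum_{n=0}^{\infty} B_n(0, 0, \kappa_3, \ldots, \kappa_n) \frac{(-D)^n}{n!}.$$
Since the n-th derivative of the Gaussian function $\phi$ is given in terms of Hermite polynomials as
$$\phi^{(n)}(x) = \frac{(-1)^n}{\sigma^n} He_n\left(\frac{x-\mu}{\sigma}\right) \phi(x),$$
this gives us the final expression of the Gram–Charlier A series as
$$f(x) = \phi(x) \sum_{n=0}^{\infty} \frac{1}{n!\,\sigma^n} B_n(0, 0, \kappa_3, \ldots, \kappa_n)\, He_n\left(\frac{x-\mu}{\sigma}\right).$$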
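The Bell-polynomial coefficients $B_n(0, 0, \kappa_3, \ldots, \kappa_n)$ can be evaluated numerically. The following is a minimal sketch, assuming the standard recurrence for complete Bell polynomials, $B_{n+1}(x_1, \ldots, x_{n+1}) = \sum_{i=0}^{n} \binom{n}{i} B_{n-i}\, x_{i+1}$ with $B_0 = 1$. The function name `complete_bell` and the example cumulants (those of an exponential distribution with unit rate, $\kappa_r = (r-1)!$) are illustrative assumptions and do not come from the text above.

```python
from math import comb


def complete_bell(x):
    """Return [B_0, B_1, ..., B_N] for the arguments x = [x_1, ..., x_N].

    Uses the recurrence B_{n+1} = sum_{i=0}^{n} C(n, i) * B_{n-i} * x_{i+1}.
    """
    bell = [1]
    for n in range(len(x)):
        # x[i] holds x_{i+1} because Python lists are zero-indexed.
        bell.append(sum(comb(n, i) * bell[n - i] * x[i] for i in range(n + 1)))
    return bell


# Assumed example: cumulants kappa_3..kappa_6 of a unit-rate exponential
# distribution. The first two arguments are set to zero, as in the series.
kappa = {3: 2, 4: 6, 5: 24, 6: 120}
coefficients = complete_bell([0, 0, kappa[3], kappa[4], kappa[5], kappa[6]])
```

With the first two arguments set to zero, the coefficients for n = 3, 4, 5 reduce to the cumulants themselves, and the first mixed term appears at n = 6, where $B_6 = \kappa_6 + 10\kappa_3^2$.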
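Integrating the series gives us the cumulative distribution function
$$F(x) = \int_{-\infty}^{x} f(u)\,du = \Phi(x) - \phi(x) \sum_{n=3}^{\infty} \frac{1}{n!\,\sigma^{n-1}} B_n(0, 0, \kappa_3, \ldots, \kappa_n)\, He_{n-1}\left(\frac{x-\mu}{\sigma}\right),$$
where $\Phi$ is the CDF of the normal distribution.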
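If we include only the first two correction terms to the normal distribution, we obtain
$$f(x) \approx \frac{1}{\sqrt{2\pi}\,\sigma} \exp\left[-\frac{(x-\mu)^2}{2\sigma^2}\right] \left[1 + \frac{\kappa_3}{3!\,\sigma^3} He_3\left(\frac{x-\mu}{\sigma}\right) + \frac{\kappa_4}{4!\,\sigma^4} He_4\left(\frac{x-\mu}{\sigma}\right)\right],$$
with $He_3(x) = x^3 - 3x$ and $He_4(x) = x^4 - 6x^2 + 3$.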
Note that this expression is not guaranteed to be positive, and is therefore not a valid probability distribution. The Gram–Charlier A series diverges in many cases of interest: it converges only if $f(x)$ falls off faster than $\exp(-x^2/4)$ at infinity (Cramér 1957). When it does not converge, the series is also not a true asymptotic expansion, because it is not possible to estimate the error of the expansion. For this reason, the Edgeworth series (see next section) is generally preferred over the Gram–Charlier A series.
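The loss of positivity is easy to see numerically. The sketch below evaluates the truncated Gram–Charlier A density with the two correction terms $He_3$ and $He_4$; the function name `gram_charlier_pdf` and the test case (cumulants of a unit-rate exponential distribution, $\mu = 1$, $\sigma = 1$, $\kappa_3 = 2$, $\kappa_4 = 6$) are assumptions chosen for illustration.

```python
import math


def gram_charlier_pdf(x, mu, sigma, kappa3, kappa4):
    """Normal density times the He_3 and He_4 correction terms."""
    z = (x - mu) / sigma
    he3 = z**3 - 3 * z
    he4 = z**4 - 6 * z**2 + 3
    phi = math.exp(-0.5 * z * z) / (math.sqrt(2 * math.pi) * sigma)
    correction = (
        1
        + kappa3 / (6 * sigma**3) * he3   # 3! = 6
        + kappa4 / (24 * sigma**4) * he4  # 4! = 24
    )
    return phi * correction


# Assumed example: cumulants of an exponential distribution with rate 1.
value_in_left_tail = gram_charlier_pdf(-1.0, 1.0, 1.0, 2.0, 6.0)
```

At x = −1 the bracketed correction factor is 1 − 2/3 − 5/4, which is negative, so the truncated series returns a negative "density" there even though the true density is zero for x < 0.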
The Edgeworth series
Edgeworth developed a similar expansion as an improvement to the central limit theorem.[4] The advantage of the Edgeworth series is that the error is controlled, so that it is a true asymptotic expansion.
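Let $\{Z_i\}$ be a sequence of independent and identically distributed random variables with finite mean $\mu$ and variance $\sigma^2$, and let $X_n$ be their standardized sums:
$$X_n = \frac{1}{\sqrt{n}} \sum_{i=1}^{n} \frac{Z_i - \mu}{\sigma}.$$
Let $F_n$ denote the cumulative distribution functions of the variables $X_n$. Then by the central limit theorem,
$$\lim_{n \to \infty} F_n(x) = \Phi(x) \equiv \int_{-\infty}^{x} \frac{1}{\sqrt{2\pi}}\, e^{-q^2/2}\, dq$$
for every $x$, as long as the mean and variance are finite.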
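The standardization of $\{Z_i\}$ ensures that the first two cumulants of $X_n$ are $\kappa_1^{F_n} = 0$ and $\kappa_2^{F_n} = 1$. Now assume that, in addition to having mean $\mu$ and variance $\sigma^2$, the i.i.d. random variables $Z_i$ have higher cumulants $\kappa_r$. From the additivity and homogeneity properties of cumulants, the cumulants of $X_n$ in terms of the cumulants of $Z_i$ are, for $r \geq 2$,
$$\kappa_r^{F_n} = \frac{n \kappa_r}{\sigma^r n^{r/2}} = \frac{\lambda_r}{n^{r/2 - 1}} \quad \text{where} \quad \lambda_r = \frac{\kappa_r}{\sigma^r}.$$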
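As a small numerical check of this scaling, the sketch below computes the r-th cumulant of the standardized sum from the r-th cumulant of a single summand. The function name `standardized_sum_cumulant` and the example values (a unit-rate exponential summand with $\sigma = 1$, $\kappa_3 = 2$, $\kappa_4 = 6$, and $n = 4$) are assumptions for illustration only.

```python
def standardized_sum_cumulant(kappa_r, sigma, r, n):
    """r-th cumulant of the standardized sum of n i.i.d. summands (r >= 2).

    kappa_r and sigma refer to a single summand.
    """
    lam = kappa_r / sigma**r        # scale-free cumulant lambda_r
    return lam / n ** (r / 2 - 1)   # additivity and homogeneity of cumulants


# Assumed example: exponential summands with rate 1 and n = 4.
third = standardized_sum_cumulant(2.0, 1.0, 3, 4)   # 2 / 4**0.5
fourth = standardized_sum_cumulant(6.0, 1.0, 4, 4)  # 6 / 4
second = standardized_sum_cumulant(1.0, 1.0, 2, 4)  # always 1 for r = 2
```

The third cumulant decays like $n^{-1/2}$ and the fourth like $n^{-1}$, which is why collecting terms by powers of $n$ orders the corrections by importance.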
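If we expand the formal expression of the characteristic function $\hat{f}_n(t)$ of $F_n$ in terms of the standard normal distribution, that is, if we set
$$\phi(x) = \frac{1}{\sqrt{2\pi}} \exp\left(-\tfrac{1}{2} x^2\right),$$
then the cumulant differences in the expansion are
$$\kappa_1^{F_n} - \gamma_1 = 0,$$
$$\kappa_2^{F_n} - \gamma_2 = 0,$$
$$\kappa_r^{F_n} - \gamma_r = \frac{\lambda_r}{n^{r/2 - 1}}; \qquad r \geq 3.$$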
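The Gram–Charlier A series for the density function of $X_n$ is now
$$f_n(x) = \phi(x) \sum_{r=0}^{\infty} \frac{1}{r!} B_r\left(0, 0, \frac{\lambda_3}{n^{1/2}}, \ldots, \frac{\lambda_r}{n^{r/2 - 1}}\right) He_r(x).$$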
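The Edgeworth series is developed similarly to the Gram–Charlier A series, only that now terms are collected according to powers of $n$. The coefficients of the $n^{-m/2}$ term can be obtained by collecting the monomials of the Bell polynomials corresponding to the integer partitions of $m$. Thus, we have the characteristic function as
$$\hat{f}_n(t) = \left[1 + \sum_{j=1}^{\infty} \frac{P_j(it)}{n^{j/2}}\right] \exp(-t^2/2),$$
where $P_j(x)$ is a polynomial of degree $3j$. Again, after inverse Fourier transform, the density function $f_n$ follows as
$$f_n(x) = \phi(x) + \sum_{j=1}^{\infty} \frac{P_j(-D)}{n^{j/2}} \phi(x).$$
Likewise, integrating the series, we obtain the distribution function
$$F_n(x) = \Phi(x) + \sum_{j=1}^{\infty} \frac{1}{n^{j/2}} \frac{P_j(-D)}{D} \phi(x).$$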
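We can explicitly write the polynomial $P_m(-D)$ as
$$P_m(-D) = \sum \prod_i \frac{1}{k_i!} \left(\frac{\lambda_{l_i}}{l_i!}\right)^{k_i} (-D)^s,$$
where the summation is over all the integer partitions of $m$ such that $\sum_i i\,k_i = m$, with $l_i = i + 2$ and $s = \sum_i k_i l_i$.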
For example, if $m = 3$, then there are three ways to partition this number: 1 + 1 + 1, 1 + 2, and 3. As such we need to examine three cases:
- $1 + 1 + 1 = 1 \cdot k_1$, so we have $k_1 = 3$, $l_1 = 3$, and $s = 9$.
- $1 + 2 = 1 \cdot k_1 + 2 \cdot k_2$, so we have $k_1 = 1$, $k_2 = 1$, $l_1 = 3$, $l_2 = 4$, and $s = 7$.
- $3 = 3 \cdot k_3$, so we have $k_3 = 1$, $l_3 = 5$, and $s = 5$.
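Thus, the required polynomial is
$$P_3(-D) = \frac{1}{3!} \left(\frac{\lambda_3}{3!}\right)^3 (-D)^9 + \frac{1}{1!\,1!} \left(\frac{\lambda_3}{3!}\right) \left(\frac{\lambda_4}{4!}\right) (-D)^7 + \frac{1}{1!} \left(\frac{\lambda_5}{5!}\right) (-D)^5.$$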
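The partition rule can be mechanized. The sketch below enumerates the integer partitions of m and, for each one, forms the coefficient $\prod_i \frac{1}{k_i!}(1/l_i!)^{k_i}$, the powers of $\lambda_{l_i}$, and the derivative order $s$, using $l_i = i + 2$ and $s = \sum_i k_i l_i$. The helper names `partitions` and `edgeworth_polynomial` are hypothetical, and exact rational arithmetic via `fractions.Fraction` is a design choice made here so the coefficients can be compared with hand calculations.

```python
from fractions import Fraction
from math import factorial


def partitions(m, max_part=None):
    """Yield partitions of m as dicts {part i: multiplicity k_i}."""
    if max_part is None:
        max_part = m
    if m == 0:
        yield {}
        return
    for i in range(min(m, max_part), 0, -1):
        for k in range(m // i, 0, -1):
            # Remaining parts must be strictly smaller than i.
            for rest in partitions(m - i * k, i - 1):
                yield {i: k, **rest}


def edgeworth_polynomial(m):
    """Return the terms of P_m(-D) as (coefficient, {l: power of lambda_l}, s)."""
    terms = []
    for part in partitions(m):
        coef, s, powers = Fraction(1), 0, {}
        for i, k in part.items():
            l = i + 2
            coef *= Fraction(1, factorial(k)) * Fraction(1, factorial(l)) ** k
            s += k * l
            powers[l] = k
        terms.append((coef, powers, s))
    return terms


terms_m3 = edgeworth_polynomial(3)
```

For m = 3 this yields the coefficients 1/120, 1/144 and 1/1296 attached to $(-D)^5$, $(-D)^7$ and $(-D)^9$ respectively, in agreement with the three cases worked out by hand.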
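The first five terms of the expansion are[5]
$$\begin{aligned}
f_n(x) = {} & \phi(x) - \frac{1}{n^{1/2}} \left(\tfrac{1}{6} \lambda_3\, \phi^{(3)}(x)\right) + \frac{1}{n} \left(\tfrac{1}{24} \lambda_4\, \phi^{(4)}(x) + \tfrac{1}{72} \lambda_3^2\, \phi^{(6)}(x)\right) \\
& - \frac{1}{n^{3/2}} \left(\tfrac{1}{120} \lambda_5\, \phi^{(5)}(x) + \tfrac{1}{144} \lambda_3 \lambda_4\, \phi^{(7)}(x) + \tfrac{1}{1296} \lambda_3^3\, \phi^{(9)}(x)\right) \\
& + \frac{1}{n^2} \left(\tfrac{1}{720} \lambda_6\, \phi^{(6)}(x) + \left(\tfrac{1}{1152} \lambda_4^2 + \tfrac{1}{720} \lambda_3 \lambda_5\right) \phi^{(8)}(x) + \tfrac{1}{1728} \lambda_3^2 \lambda_4\, \phi^{(10)}(x) + \tfrac{1}{31104} \lambda_3^4\, \phi^{(12)}(x)\right) \\
& + O\left(n^{-5/2}\right).
\end{aligned}$$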
Here, $\phi^{(j)}(x)$ is the j-th derivative of $\phi(\cdot)$ at point $x$. The derivatives of the density of the normal distribution are related to the normal density by $\phi^{(n)}(x) = (-1)^n He_n(x)\, \phi(x)$ (where $He_n$ is the Hermite polynomial of order $n$), which explains the alternative representations in terms of the density function. Blinnikov and Moessner (1998) have given a simple algorithm to calculate higher-order terms of the expansion.
Note that in the case of lattice distributions (which take discrete values), the Edgeworth expansion must be adjusted to account for the discontinuous jumps between lattice points.[6]
Illustration: density of the sample mean of three χ² distributions
Figure: Density of the sample mean of three χ² variables. The chart compares the true density, the normal approximation, and two Edgeworth expansions.
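Take $X_i \sim \chi^2(k)$, $i = 1, 2, 3$ ($n = 3$), and the sample mean $\bar{X} = \frac{1}{3} \sum_{i=1}^{3} X_i$.
We can use several distributions for $\bar{X}$:
- The exact distribution, which follows a gamma distribution: $\bar{X} \sim \mathrm{Gamma}(\alpha = nk/2,\ \theta = 2/n)$.
- The asymptotic normal distribution: $\bar{X} \xrightarrow{n \to \infty} N(k,\ 2k/n)$.
- Two Edgeworth expansions, of degrees 2 and 3.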
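The comparison can be reproduced with the Python standard library alone. The sketch below is written under several stated assumptions: the degrees of freedom are set to k = 2 purely for illustration, the cumulants of a χ²(k) variable are taken as $\kappa_r = 2^{r-1}(r-1)!\,k$ (so that $\lambda_3 = \sqrt{8/k}$ and $\lambda_4 = 12/k$), and the Edgeworth density keeps terms through $n^{-1/2}$ (`order=1`) or $n^{-1}$ (`order=2`), which may not correspond exactly to the "degrees 2 and 3" of the chart. All function names are hypothetical.

```python
import math


def he3(x): return x**3 - 3 * x
def he4(x): return x**4 - 6 * x**2 + 3
def he6(x): return x**6 - 15 * x**4 + 45 * x**2 - 15


def exact_pdf(y, k, n):
    """Gamma(shape = n*k/2, scale = 2/n) density of the sample mean, y > 0."""
    alpha, theta = n * k / 2, 2 / n
    log_pdf = ((alpha - 1) * math.log(y) - y / theta
               - math.lgamma(alpha) - alpha * math.log(theta))
    return math.exp(log_pdf)


def normal_pdf(y, k, n):
    """N(k, 2k/n) density, the central-limit approximation."""
    var = 2 * k / n
    return math.exp(-(y - k) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)


def edgeworth_pdf(y, k, n, order=2):
    """Edgeworth density of the sample mean, through n^(-order/2), order <= 2."""
    lam3, lam4 = math.sqrt(8 / k), 12 / k
    x = (y - k) / math.sqrt(2 * k / n)  # standardized argument
    correction = 1.0
    if order >= 1:
        correction += lam3 * he3(x) / (6 * math.sqrt(n))
    if order >= 2:
        correction += (lam4 * he4(x) / 24 + lam3**2 * he6(x) / 72) / n
    return normal_pdf(y, k, n) * correction


if __name__ == "__main__":
    k, n = 2, 3  # k = 2 is an assumed value for illustration
    for y in (0.5, 1.0, 2.0, 3.0, 4.0):
        print(y, exact_pdf(y, k, n), normal_pdf(y, k, n), edgeworth_pdf(y, k, n))
```

The correction is applied to the standardized variable and then rescaled through `normal_pdf`, which already contains the Jacobian factor $\sqrt{n}/\sigma$ of the change of variable from $X_n$ to $\bar{X}$.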
Discussion of results
- For finite samples, an Edgeworth expansion is not guaranteed to be a proper probability distribution, as the CDF values at some points may fall outside $[0, 1]$.
- Edgeworth expansions control absolute errors (asymptotically); relative errors can be assessed by comparing the leading Edgeworth term in the remainder with the overall leading term.[7]
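The first point can be checked directly with the one-term Edgeworth approximation to the distribution function, $F_n(x) \approx \Phi(x) - \frac{\lambda_3}{6\sqrt{n}} He_2(x)\,\phi(x)$, which follows from applying $P_1(-D)/D$ with $P_1(-D) = \frac{\lambda_3}{3!}(-D)^3$ to $\phi$. The sketch below is illustrative only: the function name `edgeworth_cdf` and the example values $\lambda_3 = 2$ and $n = 3$ are assumptions.

```python
import math


def edgeworth_cdf(x, lam3, n):
    """One-term Edgeworth approximation to the CDF of a standardized sum."""
    big_phi = 0.5 * (1 + math.erf(x / math.sqrt(2)))        # normal CDF
    small_phi = math.exp(-0.5 * x * x) / math.sqrt(2 * math.pi)  # normal PDF
    he2 = x * x - 1
    return big_phi - lam3 / (6 * math.sqrt(n)) * he2 * small_phi


# Assumed example: skewness lambda_3 = 2 and n = 3 summands.
left_tail_value = edgeworth_cdf(-3.0, 2.0, 3)
```

At x = −3 the correction term is larger in magnitude than Φ(−3), so the approximate "CDF" is slightly negative there; with $\lambda_3 = 0$ the function reduces to the normal CDF.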
See also
- Cornish–Fisher expansion
- Edgeworth binomial tree
References
- [1] Stuart, A., & Kendall, M. G. (1968). The advanced theory of statistics. Hafner Publishing Company.
- [2] Kolassa, J. E. (2006). Series approximation methods in statistics (Vol. 88). Springer Science & Business Media.
- [3] Wallace, D. L. (1958). "Asymptotic Approximations to Distributions". Annals of Mathematical Statistics. 29 (3): 635–654. doi:10.1214/aoms/1177706528. JSTOR 2237255.
- [4] Hall, P. (2013). The bootstrap and Edgeworth expansion. Springer Science & Business Media.
- [5] Weisstein, Eric W. "Edgeworth Series". MathWorld.
- [6] Kolassa, John E.; McCullagh, Peter (1990). "Edgeworth series for lattice distributions". Annals of Statistics. 18 (2): 981–985. doi:10.1214/aos/1176347637. JSTOR 2242145.
- [7] Kolassa, John E. (2006). Series approximation methods in statistics (3rd ed.). Springer. ISBN 0387322272.
Further reading
- H. Cramér. (1957). Mathematical Methods of Statistics. Princeton University Press, Princeton.
- Wallace, D. L. (1958). "Asymptotic approximations to distributions". Annals of Mathematical Statistics. 29 (3): 635–654. doi:10.1214/aoms/1177706528.
- M. Kendall & A. Stuart. (1977), The advanced theory of statistics, Vol 1: Distribution theory, 4th Edition, Macmillan, New York.
- P. McCullagh (1987). Tensor Methods in Statistics. Chapman and Hall, London.
- D. R. Cox and O. E. Barndorff-Nielsen (1989). Asymptotic Techniques for Use in Statistics. Chapman and Hall, London.
- P. Hall (1992). The Bootstrap and Edgeworth Expansion. Springer, New York.
- "Edgeworth series", Encyclopedia of Mathematics, EMS Press, 2001 [1994]
- Blinnikov, S.; Moessner, R. (1998). "Expansions for nearly Gaussian distributions" (PDF). Astronomy and Astrophysics Supplement Series. 130: 193–205. arXiv:astro-ph/9711239. Bibcode:1998A&AS..130..193B. doi:10.1051/aas:1998221.
- Martin, Douglas; Arora, Rohit (2017). "Inefficiency and bias of modified value-at-risk and expected shortfall". Journal of Risk. 19 (6): 59–84. doi:10.21314/JOR.2017.365.
- J. E. Kolassa (2006). Series Approximation Methods in Statistics (3rd ed.). (Lecture Notes in Statistics #88). Springer, New York.