Johnson Distributions: Part II ... a continuation of Part I
One thing about inventing a density distribution, say f(x) (where x can vary from -∞ to ∞), is that it must satisfy:
- f(x) ≥ 0 for all values of x ... since f(x) dx is the probability that x lies between x and x+dx and probabilities ain't negative
- ∫ f(x) dx = 1 (integrating from -∞ to ∞) ... since it's guaranteed that x lies between -∞ and ∞ ... so probability = 1
Conclusion?
If we invent a distribution such as:
[A] f(x) =
where z = h(x) is some (invented) increasing function of x, (like the Johnson distributions)
then
[B] ∫ f(x) dx = 1 (integrating from -∞ to ∞)
... and that puts restrictions on the function z = h(x)
Indeed, we might just as well do this:
- Consider g(x) = e^(-z²) ... instead of f(x)
- Replace z by h(x) ... where h(x) is our increasing function of x
- Evaluate
∫ g(x) dx = ∫ e^(-z²) dx = ∫ e^(-h²(x)) dx = C ... some constant that'll depend upon our choice of h(x)
- Then write our distribution as f(x) = (1/C) g(x).
Then we'll guarantee that ∫ f(x) dx = (1/C) ∫ g(x) dx = (1/C) C = 1
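Here's a minimal sketch of that recipe in Python, assuming h(x) = asinh(x) purely for illustration. The helper names (integrate, h, g, f) and the integration range are made up for this sketch; the integral is done numerically with a plain trapezoid rule.

```python
import math

def integrate(func, lo, hi, n=200_000):
    """Composite trapezoid rule on [lo, hi] using n equal steps."""
    step = (hi - lo) / n
    total = 0.5 * (func(lo) + func(hi))
    for i in range(1, n):
        total += func(lo + i * step)
    return total * step

def h(x):
    # Illustrative increasing function (an assumption for this sketch).
    return math.asinh(x)

def g(x):
    # Unnormalized shape e^(-z^2) with z = h(x).
    return math.exp(-h(x) ** 2)

# For this h, g is negligible beyond |x| = 500, so a finite range will do.
LO, HI = -500.0, 500.0
C = integrate(g, LO, HI)

def f(x):
    # Normalized density f(x) = (1/C) g(x).
    return g(x) / C

print("C =", C)
print("integral of f =", integrate(f, LO, HI))
```

Any other increasing h(x) can be swapped in, though the finite range would need rethinking if the tails of g die off more slowly.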
>So you'll be talking about e^(-z²), right?
Right and ...
>But you'll need to evaluate that integral ∫ e^(-h²(x)) dx, right?
Yes, but ...
>I suspect that won't be easy since ...
Be quiet and I'll explain!
We can invent z = h(x) so that the integration can be performed explicitly.
For example:
- If z = sinh⁻¹(x) then x = sinh(z) = (1/2)(e^z - e^(-z))
- Then dx = cosh(z) dz = (1/2)(e^z + e^(-z)) dz
- Then
∫ e^(-z²) dx
= (1/2) ∫ e^(-z²) (e^z + e^(-z)) dz
... and the integrals that arise are pretty easy to evaluate.
>Easy? That's easy for you to say!
Oh, I've long ago forgotten how to do these guys. I just look them up.
In the case of sinh(z), the hyperbolic sine, we just get a bunch of integrals like ∫ e^(-(z-k)²) dz = SQRT(π) (integrating from -∞ to ∞)
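For the sinh example, the look-up can be spelled out by completing the square in each exponent. This is a worked sketch for the unscaled case z = sinh⁻¹(x) only, with all integrals running from -∞ to ∞:

```latex
\begin{align*}
e^{-z^2}\,e^{\pm z} &= e^{1/4}\,e^{-(z \mp 1/2)^2} \\
\int_{-\infty}^{\infty} e^{-(z-k)^2}\,dz &= \sqrt{\pi}
  \quad\text{for any constant } k \\
\int_{-\infty}^{\infty} e^{-z^2}\cosh(z)\,dz
  &= \tfrac{1}{2}\,e^{1/4}\left(\sqrt{\pi}+\sqrt{\pi}\right)
   = \sqrt{\pi}\,e^{1/4} \approx 2.276
\end{align*}
```

So, for this particular choice of h(x), the normalizing constant would be C = SQRT(π) e^(1/4).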
>So you invent your probability functions so the integrals are easy? Shouldn't you invent them so that ... ?
So they match stock data? Sure. Go ahead. I'll wait ...
>Why can't I just use the spreadsheet in Part I?
Sure. Match historical data using the ol' EB-method.
>Huh?
You mean you don't know the ol' EyeBall method?
I asked Jay, my son-in-law, to use the inverse hyperbolic sine distribution
z = A + B sinh⁻¹((x-m)/s)
and pick the parameters A, B, m and s, to match the distribution of thirty years of S&P500 returns ... using the ol' EB-method.
Then I determined the parameters again so that the mean squared error
(between the invented distribution and the actual returns) was a minimum.
It turned out that ...
>Jay was bang on, right?
You got it ... which says something interesting about abstruse mathematical rituals and the ol' EyeBall, eh?
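For the curious, here's a rough sketch of what "mean squared error between the invented distribution and the data" might look like in Python. It assumes z is standard normal (the usual Johnson convention, which may not be exactly what the spreadsheet does), and the "observed" numbers are a stand-in generated from the model itself at made-up parameters. They are NOT S&P500 returns. The function names are hypothetical.

```python
import math

def johnson_su_density(x, A, B, m, s):
    """Density of x when z = A + B*asinh((x - m)/s) is standard normal (assumed)."""
    u = (x - m) / s
    z = A + B * math.asinh(u)
    dz_dx = B / (s * math.sqrt(1.0 + u * u))  # chain rule: dz/dx
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi) * dz_dx

def mean_squared_error(params, xs, observed):
    """Average squared gap between the model density and the observed densities."""
    A, B, m, s = params
    gaps = [johnson_su_density(x, A, B, m, s) - obs for x, obs in zip(xs, observed)]
    return sum(gap * gap for gap in gaps) / len(gaps)

# Stand-in "observations": the model evaluated at made-up parameters (not real data).
true_params = (0.1, 1.5, 0.01, 0.05)
xs = [-0.20 + 0.01 * i for i in range(41)]
observed = [johnson_su_density(x, *true_params) for x in xs]

eyeball_guess = (0.0, 1.4, 0.01, 0.05)  # a hypothetical EyeBall pick
print(mean_squared_error(true_params, xs, observed))    # zero by construction
print(mean_squared_error(eyeball_guess, xs, observed))  # positive
```

With real data, observed would be a histogram of returns scaled to a density, and the parameters would be nudged (by eye or by an optimizer) to push the error down.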
Here's a couple of EyeBall fits to thirty years of monthly S&P500 data ... using two functions:
Note:
The average absolute error (between the EB-fit and the actual returns) is minimized in each case.
Although J(u) = u gives a smaller error (that's the normal distribution), J(u) = asinh(u) has fatter tails.
However, there's not that much difference, eh?
That's because there's little difference between J(u) = u and J(u) = asinh(u) when u is small
... and u measures the distance of returns from their Mean
(measured in units of Standard Deviation).
Indeed, they differ when u is larger (them's the tails, eh?)
>Why not invent J(u) so it looks like u for small values and ... ?
And deviates for larger u? Good idea.
Figure 1A
Aside:
Note that sinh(u) = (1/2)(e^u - e^(-u)) and the inverse is asinh(u) = log{ u + SQRT(1 + u²) }.
For small values of u, SQRT(1 + u²) ≈ 1 + (1/2)u²
so
log{ u + SQRT(1 + u²) } ≈ log{ 1 + u + (1/2)u² }.
Further, for small values of w, log(1 + w) ≈ w - (1/2)w² + (1/3)w³
so, for w = u + (1/2)u²
log{ 1 + u + (1/2)u² }
≈ u + (1/2)u² - (1/2)[ u + (1/2)u² ]² + (1/3)[ u + (1/2)u² ]³
Ignoring all the higher powers of the small number u, we get:
asinh(u) ≈ u - (1/6)u³
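A quick numerical check of that approximation, as a small Python sketch (the name asinh_series is made up here). It should agree well for small u and fall apart once u is no longer small:

```python
import math

def asinh_series(u):
    # Two-term approximation from the aside: u - u^3/6.
    return u - u ** 3 / 6.0

for u in (0.1, 0.5, 1.0, 3.0):
    exact = math.asinh(u)
    approx = asinh_series(u)
    print(f"u={u:4.1f}  asinh={exact:9.6f}  series={approx:9.6f}  gap={exact - approx:9.6f}")
```

The approximation is only meant for small u; out in the tails the exact asinh(u) is the one to use.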
Figure 1B
>But Figure 1B looks like Figure 1A.
Yes.
It'd be nice if we could find a J(u) so that:
- J(u) ≈ u when u is small
- J(u) is smaller, in absolute value, when u is large
This'd make J(u) like the asinh(u) function, as in Figure 1, above. It deviates from u as |u| increases.
- There are several parameters to select, to match historical data.
- The invented J(u) gives fatter tails.
- AND ... e^(-z²) can be integrated if z = A + B J(u)
>You're dreaming, right?
Well ... I'm not sure.
For example, we could choose:
J(u) = u when 0 < u < 0.6
J(u) = 0.6 + 0.5*(u - 0.6) when u > 0.6
and give J(u) odd symmetry ... as in Figure 2A.
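In Python, that broken line might be sketched like so. The parameter names knee and slope are made up for this sketch; the values 0.6 and 0.5 are the ones above.

```python
import math

def J(u, knee=0.6, slope=0.5):
    """Broken-line J(u): equals u up to the knee, then grows at a reduced slope."""
    size = abs(u)
    if size <= knee:
        value = size
    else:
        value = knee + slope * (size - knee)
    # Odd symmetry: J(-u) = -J(u).
    return math.copysign(value, u)

for u in (-2.0, -0.3, 0.0, 0.3, 0.6, 2.0):
    print(u, J(u))
```

Changing knee and slope gives two extra parameters to play with when matching data.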
>Is it any good?
It's simple, right? It's just a broken line, right?
>Yeah, but ...
It gives a pretty good match, right? Look at Figure 2B.
>And how about integrating? If you invent f(z), can you easily ... ?
Can I calculate F(x) = ∫ f(z) dz? Well, not exactly, but ...
>Then why don't you invent F(x)?
Huh? Well ... uh ...
Why didn't I think of that?
>Do you really want to know? It involves cerebral prowess and ...
Figure 2A
Figure 2B
Okay, I have another idea. Here's what I'll do:
- Invent F(u), where u = (x - m)/s, such that it goes from 0 to 1 as x goes from -∞ to +∞.
... and introduce a few parameters so F can be "adjusted".
- Calculate f(x) = dF/dx and adjust parameters so that
∫ f(x) dx = 1.
(Note that u = (x - m)/s is the x-deviation from the mean, in units of standard deviation.
Hence our f(u) and F(u) are really functions of x, eh?)
- Adjust the parameters to match historical data.
- Pray that you get a good match ... with fat tails
>So?
So we'll try F(u) = (1/2)(1 + tanh(u)), so F'(u) = f(u) = (1/2)/cosh²(u).
>Huh?
tanh(u) = (e^u - e^(-u)) / (e^u + e^(-u))
cosh(u) = (1/2)(e^u + e^(-u))
>Yeah, but what do they look like?
Like this:
Now pay attention:
F(u) = (1/2)(1 + tanh(u)) where u = (x-m)/s, so
f(x) = dF/dx = (dF/du)(du/dx) = (1/2)/cosh²(u) · (1/s) = (1/(2s))/cosh²(u) and
∫ f(x) dx = F(u) = (1/2)(1 + tanh(u)) evaluated between -∞ and +∞, which is 1 - 0 = 1
>And your probability density is ... what?
f(x) = (1/(2s)) / cosh²(u) where u = (x-m)/s
and the cumulative probability is:
F(x) = (1/2)(1 + tanh(u)) where u = (x-m)/s.
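Here's a small Python sketch of that pair, just to check that f really is the slope of F and that F runs from 0 to 1. The values of m and s are made up for illustration and not fitted to anything; the function names are hypothetical.

```python
import math

def cumulative(x, m, s):
    """F(x) = (1/2)(1 + tanh(u)) with u = (x - m)/s."""
    return 0.5 * (1.0 + math.tanh((x - m) / s))

def density(x, m, s):
    """f(x) = (1/(2s)) / cosh^2(u) with u = (x - m)/s."""
    u = (x - m) / s
    # math.cosh overflows for |u| beyond roughly 700; fine for sensible returns.
    return 0.5 / (s * math.cosh(u) ** 2)

m, s = 0.01, 0.04   # made-up parameters (assumption for this sketch)
x, step = 0.03, 1e-6

# Central-difference slope of F should match f at the same point.
slope = (cumulative(x + step, m, s) - cumulative(x - step, m, s)) / (2.0 * step)
print("f(x)        =", density(x, m, s))
print("slope of F  =", slope)
print("F far left  =", cumulative(m - 20.0 * s, m, s))
print("F far right =", cumulative(m + 20.0 * s, m, s))
```

Note that s here is a scale parameter to be adjusted; the sketch makes no claim that it equals the standard deviation of this particular distribution.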
for Part III