Volatility: a hidden Markov process in financial time series
Oh'
Zoltan EisleiQ
Department of Theoretical Physics,
Budapest University of Technology and Economics
Budafoki lit 8., H-llll, Budapest, Hungary
Josep Perelkfj] and Jaume Masolivci@
Departament de Fisica Fonamental, Universitat de Barcelona
Diagonal 647, E-08028 Barcelona, Spain
(Dated: February 2, 2008)
04 ■ The volatility characterizes the amplitude of price return fluctuations. It is a central magnitude in
' finance closely related to the risk of holding a certain asset. Despite its popularity on trading floors,
the volatility is unobservable and only the price is known. Diffusion theory has many common points
with the research on volatility, the key of the analogy being that volatility is the time-dependent
£ — ■ diffusion coefficient of the random walk for the price return. We present a formal procedure to extract
£Nj ' volatility from price data, by assuming that it is described by a hidden Markov process which together
with the price form a two-dimensional diffusion process. We derive a maximum likelihood estimate
of the volatility path valid for a wide class of two-dimensional diffusion processes. The choice of
the exponential Ornstein-Uhlenbeck (expOU) stochastic volatility model performs remarkably well
in inferring the hidden state of volatility. The formalism is applied to the Dow Jones index. The
Q ■ main results are: (i) the distribution of estimated volatility is lognormal, which is consistent with
O ' the expOU model and (ii) the estimated volatility is related to trading volume by a power law of
& \ the form a tx y ' 55 ; and (iii) future returns are proportional to the current volatility which suggests
C/j , some degree of predictability for the size of future returns.
o ■
; PACS numbers: 89.65.Gh, 02.50.Ey, 05.40. Jc, 05.45.Tp
Keywords: random diffusion, econophysics, stochastic volatility
43 :
£3.;
m ! I. INTRODUCTION
> ■
00 ■ The volatility measures the amplitude of return fluctuations and it is one of the central quantities in finance [l[ .
Investors sometimes place even greater emphasis on the level of volatility than on the market trend itself. The reason
C\| ■ for this is mainly that the risk of holding an asset is classically associated with its volatility Q. The theoretical
framework used to quantify aspects of price fluctuations has many common points with areas of physics dealing with
noisy signals. The research of random diffusion aims to describe the dynamics of particles in random media and its
methods have been applied to a large variety of phenomena in statistical physics and condensed matter Q . Time series
& describing solar flares, earthquakes, the human heartbeat or climate records show strong correlations, multi-scaling,
^ ■ non-Gaussian statistics and self-organized behavior [IJMS, 0] • These are properties also observed in financial time
series where volatility is considered to play a key role |8[ .
The picture that prices follow a simple diffusion process was first proposed by Bachelier in 1900 Q. Later in 1959,
the physicist Osborne introduced the geometric Brownian motion and suggested that volatility can be viewed as
the diffusion coefficient of this random walk (Io| . The simplest possible assumption - that it is a time-independent
. ^ ' constant - lies at the heart of classical models such as the Black-Scholes option pricing formula [ll[ ■ More recently it
has become widely accepted that such an assumption is inadequate to explain the richness of the markets' behavior
5^ ' [l|. Instead, volatility itself should be treated as a random quantity with its own particular dynamics.
Among its most relevant properties [l], [l2|, [IH, [ld[ , volatility is the responsible for the observed clustering in price
changes. That is: large fluctuations are commonly followed by other large fluctuations and similarly for small changes
P, i|. Another related feature is that, in clear contrast with price changes which show negligible autocorrelations,
volatility autocorrelation is still significant for time lags longer than a year [H, [l2|, EH ■ Most of these studies introduce
a subordinated process which is associated with the volatility in one way or another [lj| El, [l?], EH EH US HH HH US
O
'Electronic address: eislcr@maxwcll.phy.bme.hu
^Electronic address: joscp.pcrcllo@ub.edu
t Electronic address: jaumc.masolivcr@ub.edu
2
The main obstacle of the appropriate analysis of volatility is that it is directly unobservable. As we have mentioned,
volatility provides important information to traders but it is very unclear how reliable the estimates of such a hidden
process can be. Investors use several proxies to infer the level of current asset volatility. The most common ways are:
(i) to make it equivalent to the absolute value of return changes, (ii) to assume a proportional law between volatility
and market volume prl I27L l28l . l29l | , and (iii) to use the information contained in option prices and obtain the so-called
"implied volatility" which, in fact, corresponds to the market's belief of volatility
This already raises the questions: What process is a proper model of volatility and how to adjust the possible
parameters to describe various stocks and markets? Among the possible candidates, multifractals [2, E3, El, El, l20j
and stochastic volatility models [TH, [2l|, [22|, [23|, [24|, HH, HH are the most promising. The fact that volatility is directly
unobservable makes even more difficult to get a conclusive answer about the best model. This doubled challenge has
deserved the attention from the most diverse disciplines that look at financial markets. All of them have converged on
the description of what is technically known as realized volatility and on methodologies for reconstructing volatility
path. Several procedures for reconstructing volatility have already been presented being more or less dependent on
the volatility model chosen. Mathematics, econometrics, and finance research have a large number of publications
during the last decade on this issue (see e.g. [H, H3, SH El, |H, H, [H, H, [H, [H, [H, \M, EH ) .
Our research presents an alternative procedure to estimate volatility from the price dynamics only. Likewise most of
the papers in the literature, the price dynamics is represented by a two-dimensional diffusion process: one dimension
for price and second dimension for a volatility described by a hidden Markov process. Our case estimates the volatility
subordinated time series through maximum likelihood but, in contrast with other studies, having fixed in advance the
parameters of the model.
We have decided to focus on a particular stochastic volatility model that is able to circumvent both mathematical and
computational difficulties: the exponential Ornstein-Uhlenbeck volatility model [2l|, HI] although the same procedure
can be used for a larger class of models. As its name indicates, the model assumes that the logarithm of the volatility
follows an Ornstein-Uhlenbeck process, that is: a mean reverting process with linear drift. The resulting model is
capable of reproducing the statistical properties of the financial markets fairly well [2l| .
The paper is organized as follows. In Sec. |TT] we outline the general stochastic volatility framework and more
specifically the exponential Ornstein-Uhlenbeck model. In Sec. IIIII we present a maximum likelihood estimator for
a wide class of stochastic volatility models. For the case of the exponential Ornstein-Uhlenbeck (expOU) model we
present Monte Carlo simulations to show that it performs remarkably well in inferring the hidden state of the volatility
process. In Sec. IIVI the procedure is applied to the Dow Jones Industrial Average. Conclusions are drawn in Sec. [V]
and some more technical details are left to the appendices.
II. STOCHASTIC VOLATILITY MODELS
The geometric Brownian motion (GBM) [l(| is the most widely used model in finance. In this setting the asset
price S(t) is described through the following Langcvin equation (in Ito sense):
^=Vdt + *dW 1 (t), (1)
where a is the volatility, assumed to be constant, (i is some deterministic drift indicating an eventual trend in the
market, and W\(t) is the Wiener process. We define the zero-mean return X(t) as
X(t) = In [S(t + t Q )/S(t )} - (In [S(t + t )/S(t Q )}) , (2)
where the symbol (• • • ) designates the average value and t is the initial time which is usually set to be zero. In terms
of X(t) the GBM is simply written as
dX(t) =adWi(t). (3)
However, especially after the 1987 market crash, compelling empirical evidence has become available that the
assumption of a constant volatility is doubtful @ . Neither is volatility a deterministic function of time as one might
expect on account of the non-stationarity of financial data, but a random quantity [22l | .
In the most general setting one therefore assumes that the volatility er is a given function of a random process Y(t):
a(t) = f[Y(t)).
(4)
3
Most stochastic volatility (SV) models assume Y(t) is also a diffusion process that may or may not be correlated with
price. The main difference between various models is only the parametrization of this scheme. In a general notation
the zero- mean return X(t) defined above is described by the following set of stochastic differential equations
dX(t) = flY^dWxit) (5)
dY(t) = -g[Y(t)]dt + h[Y(t)]dW 2 (t), (6)
where dX = dS/S — (dS/S) and /, g and h are given functions of Y(t). As shown in Eq. f[Y(t)] corresponds to
the volatility, i.e., the amplitude of return fluctuations. However, since f(x) is usually chosen to be a monotonically
increasing function, it is not misleading to think of Y as a measure of volatility. Thus, as far as there is no confusion, we
will refer to the process Y(t) as "volatility" as well. On the other hand, the function g[Y(i)] describes a reverting force
that drives the volatility toward the so-called "normal level". This force brings the volatility process to a stationary
regime for long time horizons and the normal level is related to the average volatility in that limit. Finally, the
subordinated process Y(t) may have a non-constant diffusion coefficient defined in terms of the function h[Y(t)} which
is called the volatility-of- volatility ("vol of vol"). The functions g and h fully describe the volatility process. The
resulting dynamics is comparable with the one described by a Gaussian particle trapped in a potential well V(y)
whose associated force is —g{y), where g(y) = V'(y). In finance one typically proposes convex potentials with only
one minimum whose value is related to the normal level of the volatility.
In what follows we will mostly work with one particular SV model, the exponential Ornstcin-Uhlcnbcck (cxpOU)
model, which follows from the substitutions:
f(x) = me 1 , g(x) = ax, h(x) = k,
that is,
dX(t) = me Y ^dW 1 (t), (7)
dY(t) = -aY(t)dt + kdW 2 (t). (8)
Note that in this model the process Y(i) is precisely the logarithm of the volatility, or "log- volatility" for short. The
main statistical properties of the model are thoroughly discussed in Ref. [2l[ . We simply recall that the stationary
distribution of the process Y(t) is a Gaussian (i.e., a lognormal distribution for a) with zero mean and variance (3:
p(y) = -^=cxp(-y 2 /2(3), (9)
y Z7T/3
where
(3=k 2 /2a. (10)
III. VOLATILITY ESTIMATION
A. The Wiener measure and volatility estimation
Let X and Y denote a simultaneous realization of the variables X(t) and Y(t) in the time interval t — s < r <
t. Omitting all Y-indepcndent terms, we show in Appendix [A] that the probability density (likelihood) of such a
realization is approximately given by
InP(X,Y)~~ f
^ J t—s
Before proceeding further, we will discuss the meaning of this expression. We first note that Eq. pd|) has to be
understood in the sense of generalized functions [42j since the Wiener process is only differentiable in this sense and,
hence, X(t) and Y(t) do not exist as ordinary functions and Eq. is just a symbolic expression. Nevertheless, the
formula is still valid when the integral and the derivatives therein are discretized with arbitrary small time steps, a
requirement that is indeed necessary for numerical computations.
Let us now see some qualitative properties of Eq. pip. The first summand measures the fluctuations of the
zero-mean return with respect to the volatility, [X (t) / f (Y (t))] 2 , and their contribution to the likelihood (probability
Mr)
f[Y(r)]
dr--
2
Y + 9[Y(t)\
h[Y(r)]
dr
(11)
4
density) of a given return realization. Note that the higher this contribution is, the lower those "relative" fluctuations
are. In the same fashion, the second summand in Eq. pip measures the fluctuations of the volatility process Y(t)
with respect to the vol of vol h(Y), although in this case these fluctuations are gauged with the mean reverting force
—g(Y). As before, the lower these fluctuations, the higher their contribution to the log-likelihood (TTT|) .
While Eqs. (J5j)- (J6]) represent a joint model for return and volatility, the stock market data only include recordings
of the return process X . The Y process and, hence, the volatility f(Y), must be inferred indirectly in a Bayesian
fashion through Eq. (fTTj) . Indeed, the conditional probability density that the realization of the hidden Y"-process is
Y, given that the observed return is X, reads
lnP(Y|X) = lnP(X,Y) -lnP(X). (12)
In consequence, we can find the maximum likelihood sample path of the (hidden ) volatility process by maximizing Eq.
ilty) with respect to Y (recall that Y is a realization of Y(r) in the interval t — s < r < t). Since the second summand
of Eq. (|12p is independent of Y, we can neglect P(X) in this maximization process. Therefore, the maximization of
lnP(Y|X) yields the same result as that of lnP(X, Y).
Note that, besides the specification of the stochastic volatility model (that is, the explicit forms of /, g, and h), the
only free parameter is s: the duration of past return data to take into account. After substituting the observed return
history as X, we will obtain by maximum likelihood the quantity:
Y = argmax Y lnP(Y|X) = argmax Y lnP(X, Y). (13)
We should mention that similar maximization problems have been studied in the context of hidden Markov models,
where this procedure is called "decoding". When the state space (the number of possible X and Y values) is finite, the
optimization can be done exactly by the Viterbi algorithm [431 ] , while there has been limited success in the continuous
case [441 ] . A similar technique has been applied to the forecasting of volatility assuming a binomial cascade model which,
unlike stochastic volatility models, has a finite state space (see Ref. [13] )■ More focused on the stochastic volatility,
we should also mention the efforts based on Kalman or Particle filtering, Bayesian inference, conditional likelihood or
Fourier methods among other similar techniques (see e.g. Refs. [H, H3, H3, [H, [H, [H, [35|, [H, H3, [H, [H, 0, 5l[ ) . Most
of the cited works assume an specific model, although they never focuss on the expOU model. In addition, they are
mainly worried about intraday (high-frequency) data while we are here focussed on reproducing daily (low-frequency)
data. It deserves an special attention the work on a superposition of Levy Ornstein-Uhlenbeck processes for a 2 crafted
to describe high-frequency (intraday) data and provide a power law slow decay for the volatility autocorrelation [T5II32I].
All the techniques mentioned above provide an efficient way to reproduce the volatility path but, in contrast to our
case, the method also serves to estimate the parameters of the model. We have decided to provide the parameters
beforehand thus using an independent way of estimating them (see Sec. IIV A| . It should be left for a future research
an accurate comparison between our method and others. We may look for a way to implement parameter estimation
in our method.
As we have already stated, our main objective is to design a method able to filter the Wiener noise dW\(t) out of
Eq. and thus to obtain a reliable estimate Y(t) of the hidden volatility process Y(t). The method, an extension
of the deconvolution procedure previously presented in plj . has basically the following five steps:
(i) We simulate a random sample path of the Wiener process dWi(r) for t — s < r < t.
(ii) Then a surrogate realization of Y is generated as
Y(r) = r 1
dX(r)
dWx{T)
(14)
where t — s < r < t. Note that this equation requires that f(x) is invertible which implies that f(x) be chosen
to be a monotonic function.
(iii) Substitute Y s and X into Eq. (fTTj) to calculate the log-likelihood of this realization.
(iv) Iterate (i)-(iii) for / steps, keep the highest likelihood random realization (the conditional median) and assume
this to be the proper estimate Y(t).
(v) The estimate of the hidden process at time t is then Y(t). The estimate of the volatility is given by
a(t) = f[Y(t)].
(15)
5
B. Interpretation of the method
Let us further elaborate the meaning of such an estimate. In finance, the volatility is often identified with the
absolute value of returns variations. Indeed, as a first approximation, we can replace in Eq. ^ the noise term by its
expected value and write
r ,. \dX(t)\
which shows that the volatility is approximately proportional to the absolute returns. Eq. ([5]) can be thus thought of
as a first approximation toward estimating volatility. Our method, based on the maximization of Eq. (|12|) . takes this
estimation two steps further. In effect, the first step was taken in Ref. [2l[ where we replaced the average (|G?Wi(i)|)
by a simulated sample path. We are now taking a second and more refined step in which we are not only replacing
the Wiener noise by a random simulation but, in addition, we perform the maximum likelihood method described by
items (i)-(v).
Thus we arc basically separating the observed returns dX(t) into two sources: a(t) and dW\{t). To do this, we
have first considered a specific form of stochastic volatility. Secondly, we have taken the driving Wiener noises dW\ (t)
and dW2(t) appearing in Eqs. ((5]) and ([6]) to be uncorrelated. Finally, we have assumed that a(t) is approximately
constant over the time step during which we numerically evaluate the derivatives Y(t) and X(t) appearing in Eq.
(fTTj) . We incidentally note that if h(x) = - the vol of vol is equal to zero - then the stationary solution of Eq. ^ is
Y(t) = 0. Thus the model reduces to the Wiener process in which the volatility is constant and absolute returns are
uncorrelated @.
C. The performance of the estimator
In order to test the performance of the estimator, we simulate the expOU process by using Eqs. 0-© with the
realistic parameters obtained in Section HVl The relationship between the simulated value of the log- volatility Y(t)
and its estimate Y{t) is given in Fig. [1] The two quantities agree within error bars, so we may state that
Y{t) « Y(t) (in mean square sense).
In what follows we will always use s = 10 days of past data and I = 10 5 iterations for maximization. The time step
for discretization will be At = 1 day. For random optimization, the necessary number of iterations / grows very fastly,
perhaps exponentially, with s/ At. The values of s and At cited here were chosen to keep the task computationally
feasible. As for the value of /, in this case an increase to I = 10 6 does not improve the estimates noticeably. The
estimates generated with this parameter set have negligible bias and they can efficiently distinguish between low and
high volatility periods (see Appendix [B] for a discussion on the robustness and possible bias of the procedure) .
An additional verification of the good performance of our estimate is shown in Fig. [2J where we give the actual
sample path of Y for a single realization of the expOU model together with the estimated Y. As we can see the
estimate follows the true log- volatility Y(t) very closely.
D. Volatility forecasting
Another possible approach to the utility of the expOU model is to evaluate its forecasting performance. One can
for example use the mean error of absolute returns \dX(t + h)\ for a given Monte Carlo simulated path:
E(h) = (j\dX(t + h)\ - \dX(t + h)\fj, (17)
where \dX(t + h)\ is our estimate for the absolute return \dX(t + h)\ given an estimate Y(t) while h is the time
horizon (number of days) to forecast ahead. We recall that for our estimates we take the conditional median of our
MonteCarlo simulations. Thus for the expOU model we have
\dX(t + h)\ =M[\dWi(t + h)\]mexp[Y(t)exp(-ah)], (18)
where M[|(iVFi(f + /i)|] denotes the median of the absolute value of Wiener increment. Figure [3] compares five methods
to forecast \dX(t + h)\ using only information available before time t. For clarity we also give the titles used in Fig.
1 in brackets ("..."):
6
3
>-
£
estimated Y
estimated = true
true Y
Figure 1: Estimated log- volatility Y as a function of the actual log- volatility Y taken from 2 x 10 5 simulations of the expOU
model. Reconstruction used the last s — 10 values of returns, and I = 10 5 iterations. The error bars represent the 25% and
75% quantiles of the distribution estimated volatility.
Figure 2: Estimated and actual volatility for a typical sample path of the expOU model. The estimated values were smoothed
by 5-neighbor averaging to reduce noise.
1. The median of the last 5 absolute returns. This method clearly has predictive power for h < 150 days. ("5-day
abs. ret.")
2. The median of the last 15 absolute returns. The longer averaging period brings a substantial improvement for
short-term forecasts, but not for long-term. ("15-day abs. ret.")
3. Eq. (fl~8|) with Y(t) = Y(t), which is the true value in the underlying simulation of the expOU model. This gives
a substantial decrease in forecasting error, which persists up to h ~ 500 days. This curve is also a theoretical
lower bound for error achievable by the expOU model in any time series: Here the underlying data are perfectly
described by expOU, and the parameters and Y(t) are known "perfectly". Neither of these usually happen in
real data, and so there one expects worse performance, ("perfect")
4. Eq. ([T8|) with Y(t) estimated by the reconstruction procedure of Sec. IIIIl This estimator does not perform well,
due to noise in the optimization procedure. ("1-day forecast")
5. Eq. (fT8|) with Y(t) estimated by the reconstruction procedure of Sec. IIIIl then averaged for the last 5 days.
Such averaging greatly decreases noise, and the accuracy of the forecast is improved for short times (cf. Fig. [2]).
("5-day forecast")
7
c
03
<u
E
m
>
<u
250 500 750
forecast horizon (day)
Figure 3: Median forecasting error E(h) of absolute returns calculated according to Eq. (|18fl by five methods. The errors were
normalized by the lowest level of error achievable with the assumption of a time-independent constant volatility (horizontal
dashed line). The result was averaged over 150000 independent realizations of the expOU process, parameters were chosen as
outlined in Sec. IIV Al
Finally, note that for h ~ 500 days all estimators based on the expOU model lose any information included in Y(t),
and converge to the same error level. This is consistent with real data (cf. Sec. IIV Dp .
IV. APPLICATION TO STOCK MARKET DATA
In this section we present an application of the method to actual stock market data. We analyze the Dow Jones
Industrial Average (DJIA) index in the period 01 /01/1900-05/26/2006, a total of 29, 038 days. In order to work with
zero-mean returns, the mean return was subtracted from the actual data. Trading volumes for the index are only
available for the period 04/01/1993 - 05/26/2006, a total of 3, 375 days.
Parameter estimation
We recall that in the estimation procedure presented here one necessarily needs to assume a theoretical model for
the volatility. Having done this, the next step is to estimate the parameters involved in the model chosen. For the
expOU model, Eqs. (O-©, these parameters are: to, k and a.
Before proceeding further we remark that time increments in real data have a finite size since the market always
works on discrete times (for daily data the minimum time increment is 1 day). Thus, in practice, the (infinitesimal)
return variation dX (t) = X(t + dt) — X(t) corresponds to a (finite) return increment A A" (t) = X (t + At) — X (t ) where
At is the time step between two consecutive ticks. Also the Wiener differentials dW(t) correspond, in mean square
sense [45[, to the increments
(19)
AW(t) ps e(f)VAt,
where e(t) is a Gaussian process with zero mean and unit variance [45j |. In the present case our time step has a fixed
width and is equal to At = 1 day.
Coming back to the estimation of parameters, we show in Appendix [Q that
lnm ps ( 7 + ln2)/2+ (Jin (j AA|/VaT) ^ , (20)
where 7 = 0.5772 • • • is the Euler constant. Taking into account that the third summand can be evaluated from data
we see that Eq. (f2"0| provides a direct estimation of m.
On the other hand, if the expOU model is appropriate then the empirical estimate Y(t) of the hidden volatility
Y(t) should also be a Gaussian process with a stationary distribution of zero mean and variance given by j3 = k 2 /2a
8
[see Eq. (|10[) ]. As shown in Fig. [5J if one takes (3 = 0.61 ± 0.05, the distribution of Y{t) is Gaussian and coincides
with the theoretical distribution of Y(t) given by Eq. @. The assumption of a Gaussian distribution for our estimate
is robust and it holds for a wide range of parameters.
To fully specify the model we have to obtain the parameter a. We have chosen the value found in Ref. [2l[ which
was obtained in order to capture the long range correlations of the volatility, at least up to 500 days. The model is
able to provide the appropriate long range behavior with an infinite sum of exponentials but it is also true that it
does not provide a pure power law decay like the models in Refs. [l5lll7||. In any case, we have seen in Ref. [2l[ that
at least for daily data our approach is satisfactory. Our final set of parameter estimates is thus
m = (7.5 ± 0.5) x 10~ 3 days~ 1/2 ,
a = (1.82 ± 0.03) x 10 -3 days" 1 ,
and
k = (4.7 ± 0.3) x 10~ 2 days" 1/2 .
The errors were determined based on Fig. [4j similarly to the error of f3. In this parameter range the distributions of
the estimated Y(t) and simulated Y(t) series agree well. The results reported throughout this paper are insensitive
to the misspecification of these parameters, even beyond these error bars (see also Appendix |B|) .
The distributions of log- volatility are compared in Fig. 0]for four cases: our maximum likelihood procedure applied
to Dow Jones, a simulation of cxpOU, the simple estimate of Eq. (|21[) and the deconvolution procedure introduced
in Ref. \2M which can be written as
Y d ccon(t) = In —
171
dX{t)
dW x {t)
(21)
where Wi(t) is a simulation of the Wiener process. Note that this deconvoluted log-return estimator has indeed a
Gaussian distribution but with a larger variance as hinted in Fig. 2]in view of its wider density. We finally mention that
the estimate for the log-volatility \n(a/m) sa In \dX\ given by Eq. (flU]) shows a non Gaussian and biased distribution
as was also reported in Ref. [2l[. This suggests that l^decon is an appropriate "null model" to contrast with Y. Both
quantities are generated by dividing dX(t) by the increments of a realization of the Wiener process and then taking
the logarithm of the absolute value of this ratio. The difference lays in the fact that Y takes the realizations that
satisfy a maximum likelihood requirement while Idecon takes a Wiener process realization that is purely random. In
such a way, our method keeps the divisor correlated with dX and, as we will see next, it seems to conserve clustering
and memory effects in the log-volatility time series.
B. Clustering and the estimated volatility
To support our claim that the technique presented is powerful enough in filtering the noise dW\ out of returns we
will give a visual comparison based on the following qualitative experiments.
Figure [5] [Left] displays a comparison over a 1000-day time interval. One can observe there that the noise level in
Y is substantially smaller than in Ydccon, as also inferred from Fig. |U
In order to show that such a correlation is responsible for suppressing large fluctuations in the ratio, we can perform
a second experiment. Thus in Fig. O [Right] we see a comparison between the logarithm of absolute returns variations,
In \dX\, and the estimated volatility Y, Note that the proper clustering of volatility becomes clearly visible.
C. A comparison with trading volume
The hidden nature of the volatility process has been addressed by several authors [2(1, [2?], For instance, Ref.
[2(| suggests that, instead of the volatility, a good estimate would be the square root of the daily trading volume.
That is,
M[a(t)} cx V{t) a , (22)
where a = 0.5 and M[-] denotes the median. In Fig. [5] we show evidence that supports this assumption. Again,
the first estimation for the volatility is Therefore, we regress ln|c?X(t)| versus \nV(t) as shown in Fig. [5]
9
Figure 4: A comparison of estimates of log- volatility for Dow Jones. Black boxes ("estimated Y, D JIA") represent our maximum
likelihood method applied to empirical data. Empty circles ("simulated Y, expOU") represent the distribution of the simulated
sample path of the log- volatility, assuming that Y(t) follows the expOU model. We also plot the empirical distributions of two
estimates: the red line ("In [return], DJIA") was obtained through Eq. (|16[l and the black line ("In [deconv. ret], DJIA") via
Eq. dHJ.
Figure 5: [Left] A comparison between the estimate Y(t) and Y& ecorl (t) for a typical 1000-day period of Dow Jones. These curves
were not smoothed in order to show the substantial reduction of both the noise level and the asymmetry in Y(t) compared to
^decon(i) with a random approximation of the noise term dW. [Right] The estimate Y(t) and the logarithm of absolute return
variations for the whole sample of Dow Jones.
In the same figure we also present the regression between the maximum likelihood estimate Y(t) and hiV(t) which
appears to be less noisy than the former regression in accordance with the smaller variance of Y(t) compared to that
of ln|dAT(i)|. Nevertheless, the exponent a « 0.55 is the same for both regressions. There have been similar [Z(|,
albeit controversial [13, [H|, findings for the price impact of single transactions. However, Eq. does not yet imply
that volatility is proportional to volume, only that its typical value is (i.e., the median). Fluctuations around the
average behavior due to changes in liquidity might have a key role in the process (48j .
10
9 10
In(volume) (arb.)
11
Figure 6: Logarithm of daily absolute return and estimated log-volatility Y as the function of daily volume. Days with similar
volumes were binned for better visibility. The symbols represent the medians, and the error bars the 25 — 75% quantiles in the
bins.
D. The predictive power of volatility
From Eqs. ([T])-©, we know that for the expOU model simple relationship can be given between ln|dX(t)| and
Y(ty.
\n\dX(t)\ =ln(m\dW 1 (t)\)+Y(t).
Therefore, the conditional median of In |gLY(£)| given Y(t) is
M
In|dX(t)|
Y(t)
const. + Y(t).
(23)
We point out that this relationship implies some degree of predictability of the absolute changes in return through
their median, if one knows the current value of the log- volatility Y(t). We test Eq. (|23[) for real data and with Y(t)
replaced by its estimate Y(t). As shown by the bottom curve of Fig. [7l the slope of the linear regression between
M[ln |F] and Y(t) is not equal to 1 - as would have been implied by Eq. (f2"3"| - but 0.9 which still suggests
strong predictive power.
Recall that the minimum time step of the empirical data used is 1 day. Hence, Eq. (|23p implies the prediction of
tomorrow's return based on today's volatility and return. Wc now want to extend the prediction horizon. To this end
we generalize Eq. (|23[) and propose the following ansatz:
M
]n\dX(t + h)\
Y(t)
const. +^(h)Y(t),
(24)
where h = 0, 1, 2, ■ • • . In Fig. [7] we test this ansatz for several values of the horizon: h = 0,5, 20, 100 and 1000 days.
We find that the slope "f{h) is a decreasing function from the value 7(0) = 0.9 to practically zero when h = 1000 days
which means a complete loss of memory. Note that, when h = 100 trading days we have 7 = 0.25 still implying a
slight degree of prediction after approximately five months, which is of the same order of magnitude than the DJIA
characteristic time scale, 1/a ~ 500 days, for the relaxation of the volatility plj .
V. CONCLUSIONS
The volatility is a crucial quantity for financial markets since it provides a measure of the amplitude of price
fluctuations. Traders try to follow carefully the level of volatility because it gives the perception of the risk associated
with any asset. Although volatility was originally conceived to be the diffusion coefficient of the price return random
walk, there is compelling evidence not to consider it a constant, but a subordinated random process. The framework is
11
2T
-10-L-, . , . , . , . , . 1
-2-10 1 2 3
estimated Y (5 day average)
Figure 7: The proportionality between the estimated volatility and of the logarithm of absolute return variations. In order to
decrease noise, 5-day moving averages have been used. Numbers on the right indicate the slopes of the corresponding regression
lines. Time shifts from bottom to top: h = days (■), 5 days (•), 20 days (a), 100 days (▼), 1000 days (♦). Days with
similar absolute returns were binned for better visibility. The symbols represent the medians, and the error bars the 25 — 75%
quantiles in the bins.
analogous to that of random diffusion processes which have been applied to a large variety of phenomena in statistical
mechanics and condensed matter physics.
The main obstacle to get a better knowledge of the volatility's nature is that it is not directly observed. In fact, this
is precisely the motivation behind the present research. Our main objective has been to develop a tool which visualizes
the sample path of volatility. The procedure derives a maximum likelihood estimate assuming that the volatility is
a hidden Markov process. To do so, one needs also to assume a specific model for the volatility. We have chosen
a class of two dimensional diffusions commonly known as stochastic volatility models, where the volatility acts as a
diffusion particle trapped in a potential well. We have focused on the expOU model and obtained promising results,
especially for three reasons: (i) the model is computationally feasible; (ii) its parameters can be easily obtained and
fit the data reasonably well; and (iii) the distribution of the estimated volatility is log-normal, which is consistent
with the assumed expOU model.
We have shown for the Dow Jones index daily data that the sample path of our estimated volatility improves other
estimates. We have compared our estimation with a rather typical one which identifies volatility with absolute return
changes. Our estimation is able to remove the existing bias in the stationary distribution of volatility while still
preserving the clustering in volatility time scries. We have also studied the estimate of volatility that deconvolutcs
the return by the simulation of a random Wiener path [2l| . This last procedure also provides a Gaussian distribution
for the log-volatility, albeit the distribution has too fat tails and pays the price of losing clustering and memory in
the volatility time series. Our new procedure is in fact a more sophisticated variant of this estimate since it filters out
Wiener realizations via maximum likelihood. The estimate drastically reduces the noise in the volatility path thus
preserving data clustering. In this way we have thus proposed an alternative methods to those already provided by
mathematical finance literature (see e.g. [H, H3, l3ll l32l. [H, [HI, [H, [H, H3, [H, [H, H(| EI)- ^ should however be
left for a future research an accurate comparison between our method and others. We may even look for a way to
implement parameter estimation in our method.
The median of the estimated volatility has also been related to trading volume by the power-law expression M[a] oc
t/0.55^ jjjjj between volatility and trading volume has been previously mentioned in different studies however our
estimate is again capable to provide a less noisy regression. We must, indeed, stress the fact that this does not imply
that volatility is proportional to a power of the volume, but only that its typical value is and that fluctuations around
the average might play an important role.
We have also seen that current returns are proportional to the estimated volatility, as otherwise expected. How-
ever, the main novelty is that we have observed how future returns are proportional to current volatility and their
12
predictability diminishes monotonically with the number of time steps ahead. This last finding implies that our
estimation method can be applied to predict the size of future returns with the knowledge of current volatility.
As a final remark we stress the fact that the technique herein presented can be applied to a variety of physical
phenomena besides finance. One typical problem of this sort is provided by the Brownian motion inside a field of
force in which inertial effects are not negligible (49[. In this situation the dynamics of the particle is described by
a two dimensional diffusion process (X(t), V(t)) representing the position and the velocity of the Brownian particle.
The maximum likelihood technique might provide a reliable estimate of the velocity in the case that, for instance, the
only accessible experimental measures are the positions of the particle at wide time steps, so that a measure of the
velocity - which implies the knowledge of two very close positions - is too noisy and unreliable.
Acknowledgments
ZE is grateful to Janos Kertesz for his support and to the Universitat de Barcelona for its hospitality during his visit
at the Departamcnt de Fisica Fonamental; also support by OTKA T049238 is acknowledged. JP and JM acknowledge
support from Direccion General de Investigation under contract No. FIS2006-05204.
Appendix A: DERIVATION OF THE LIKELIHOOD FUNCTION
In order to make notations more compact, in this Appendix the time dependence of the stochastic processes is
mostly indicated as a lower index. A generic stochastic volatility model is defined as
dX t = f{Y t )dW 1 (t), (Al)
dY t = -g(Y t )dt + h(Y t )dW 2 (t). (A2)
To explain the procedure it is more convenient to work with the discrete time version of the model. To this end,
suppose that At is a small time step and that the driving noises in Eqs. (|Alj) - (|A2|) can be approximated by (c/Eq.
dWi(t)ttSi(t)VAt, (t = l,2), (A3)
where £i(t) are independent standard Gaussian processes with zero mean and unit variance. We remark that the
approximation (|A3p has to be understood in mean square sense (45| . The discrete time equations of the model thus
read
X t - X t _ At = /(y t _ At )£i(i - At)VAi (A4)
Y t - Y t _ At = -g(Y t „ At )At + h(Y t _ At )e 2 (t - At)^At (A5)
from which we get
e l {t-At)= (A6)
/(y t _ At )VAF
e 2 (t-At) = Y *- Yt -* t+9 ^ At . (A7)
For a given number of realizations, the probability of the set {X T , Y T } (r = t — At, t — 2At, . . .,t — s) can be easily
obtained, as we will see next.
Let us denote the set of realizations as {X, Y}. Then the Markov property of the process ensures that one can
decompose the joint probability density function (pdf ) of this set as a chain of products between conditional probability
densities. In consequence, the pdf of the whole sample path can be written as
t-At
p({x,y}) = p(Av s ,y t _ s ) Yl F(x T ,Y T \x T _ At ,Y T _ At ),
r—t — s
(A8)
13
where the first term, ¥(X t - s , Y t - S ), corresponds to the initial realizations of X and Y s/At time steps far from the
present time t; all the remaining terms of the form f(X T , Y T \X T _Ati Y T _At) are the conditional pdf's for transitions
between consecutive steps:
(X T -At, Y T -At) — ► (X T ,Y T ).
The logarithm of Eq. (|A8J) is
t-At
lnP({X,Y}) = lnP(AV s ,y t _ s ) + \nP(X r ,Y T \X T _At,Y T _ At ).
r—t — s
On the other hand from Eqs. (|A4|) - (|A5|) we realize that
¥{X Tl Y r \X T _At,Y T _At) = \J\P(sx(T-At),s 2 (T-At)),
where \ J\ is the Jacobian of the transformation (X T , Y T ) — > — At), e 2 {t — At)) defined by Eqs.
is,
1
\J\ =
(A9)
3, that
f(Y T -At)h(Y T -At)At
But £i and £2 are independent standard Gaussians, hence
P(e 1 ,ea) = (l/27r)ffltp[(e?+e2)/2] )
whence
F(X T ,Y T \X T _ A t,Y T -At)
l/(27rAt)
f{Y T -At)h(Xr-At)
exp
£?(t- At)+sl(r- At)
(A10)
Substituting Eqs. (|A6[) - (|A7p into this equation and the result into Eq. (|A9[) we finally get
i-Ai
lnP({X,Y}) = _ sln (2^At) _ ^ [ m/( y r _ At)+m/l( y T _ At)]
t-At
inP(x t _ s ,y t _ s )-i £
X T — Xf-
Al
i-Ai
Y T -Y T _At 9(Y T -At)
h(Y T -At)At h(Y T —At)
f(Y T _ At )At
-1 2
At
At.
(All)
Let us briefly explain the origin of some of these contributions. The first summand comes from the normalization
constant of the Gaussian distribution (|A10|) . It appears in every conditional probability density and this is the reason
for the factor s/At, which is the number of time steps between t — s and t. The resulting term does not depend on
the realization, so that we can neglect it for a maximization with respect to Y. The term also goes to — 00 in the
At — > limit, which means that any individual realization has a probability measure zero.
The second summand is mostly the sum of the Jacobian transformations of each transition probability and depends
on Y . Stochastic volatility models typically assume that these / and g [cf. Eqs. (|A1|) and (|A2|) ] are continuous and
monotonically increasing functions or even constants. For instance, in the expOU model we have /(x) = mexp(x)
and g(x) = k. Because of this, we will also neglect this term in the maximization procedure. The contribution shifts
the maximum at the excessive cost of adding more noise to the numerical computations. We can however look at
the situation from another point of view. Ignoring this term is equivalent to omitting the Jacobian transformation
between the two probability density measures [cf. Eq. (|A10[) ] . In this way, we are stating that what we are really going
to maximize is the probability of the realization of £\{t) and £2(t) -instead of Y(t)- in terms of the past history of
the process expressed by {X, Y}.
The term lnP(X t _ s , Y t - S ) is fixed by the initial conditions of the process. If we assume a known initial return X -
which can be set to zero - and take a random Y t - S following the stationary distribution P st (Yt_ s ) given by Eq. ([5]),
then P(X t _ s ,F t _ s ) = 5(X t _ s - X) P st (Y t _ s ) and hence
lnP(AV s ,y t _ s ) = lnP st (r 4 _ s ) +ln<5(AVs - X).
(A12)
14
>
T3
CD
M — '
CO
E
true Y
Figure 8: Estimated log-volatility Y as a function of the actual log-volatility Y taken from 29038 simulations of the expOU
model in four ways. Reconstruction uses the last s = 10 values of returns, and I = 10 5 iterations. The error bars represent the
25% and 75% quantiles of the distribution estimated volatility.
Had we taken another initial condition, the technique would have given equivalent results (we have checked this by
using several initial distributions). For this reason and in order to improve the convergence of the maximum likelihood
estimate we have neglected also this contribution.
We therefore write
1 t— Ai
lnP({X,Y}) ~ -- £
r=t-
1 t— At
- - y
2 ^
T—t — S
We can represent this equation in the continuous time framework if At is sufficiently small and if /(a;), h(x) and g(x)
are continuous. In such a case, Eq. (|A13p yields the result given in Eq. (fTTj) .
Appendix B: ROBUSTNESS OF THE PROCEDURE
In order to show that our volatility estimation procedure is robust, we carry out four measurements along the lines
of Sec. IIII Cl We generate artificial time series of the same length as the DJIA dataset, with the same parameters as
therein: m = (7.5 ± 0.5) x 1CT 3 days" 1/2 , a = (1.82 ±0.03) x 10~ 3 days -1 , and k = (4.7 ±0.3) x 10~ 2 days" 1/2 .
Then we reproduce Fig. [T] in several variations:
(a) In a way identical to the paper as shown in Fig. [TJ
(b) Data generated as in (a), but for reconstructing Y we use the above parameters minus twice the error specified
above.
(c) Data generated as in (a), but for reconstructing Y we use the above parameters plus twice the error specified
above.
(d) In a way identical to (a), but the data is detrended by the Monte Carlo sample mean /i. We however remind that
the process is driftless as given by Eq. ([3]). In this way we want to check whether detrending causes a systematic
bias in the estimation.
Y
X T — A r _At
f(Y T ^ At )At
Y T - At ,
At
h(Y T -At)At h{Y T -At).
At
(A13)
15
T3
CD
M — '
CO
E
-*— '
CO
CD
1.1,1,
i.i.
i ' i 1 i 1
■ true bias
10x true bias
* 100x true bias
1 1 i 1
1
true Y
Figure 9: Estimated log-volatility Y as a function of the actual log-volatility Y taken from 29038 simulations of the expOU
model in three different levels of "detrending error". Reconstruction used the last s = 10 values of returns, and / = 10 J
iterations. The error bars represent the 25% and 75% quantiles of the distribution estimated volatility. Note that there only
exists a clear deviation when the error is 100 times the true bias.
The results are given in Fig. [8J The plot shows, that either option gives the same width of the Y distribution, and
the bias introduced by either specification error is small. We have also run similar simulations but with very different
parameters values and the procedure still provides consistent results similar to those given by Fig. [8j
Finally, a more detailed test along the lines of point (d) above can be performed (that is: a test on wether the
detrending causes a bias in our estimation procedure). We thus have repeated the test (d) also including errors 10^t
and 100/i. The procedure tolerates up to 10 times more detrending error than expected in the real dataset. Only,
when we have about 100 times the error, the reconstruction shows a strong upward bias for low volatility periods as
shown in Fig. O All these results imply that possible errors in the detrending procedure could affect our procedure
only when any wrong specification of the drift is far outside the error domain involved in our DJIA data set.
Appendix C: DERIVATION OF EQ. (20)
We start from Eq. IJ7J which we write in the approximate form
AX(t) ~ me Y ^AW{t),
thus In \AX(t)\ = m + Y(t) + In |AVFi(t)| and, taking into account that (Y(t)) = 0, we have
(ln|AX(i)|) ~lnm+(ln|AVFi(t)|). (CI)
On the other hand, we know that AWi(i) ~ e^/At, where e is a standard Gaussian variable (c/Eq. (fH))) ). Hence
(ln|AVFi(t)|) « (ln|e|) + (lnAi)/2.
But
1
(ln| £ |) =
~ e / 2 ln|e|cie,
which, after a simple change of variables inside the integral, yields [50|, [51(
16
(In lei) = —= / X - 1/2 e- x/2 \nxdx = -y/^/2h + In 2),
2\/2Wo
where 7 = 0.5772 • • • is the Euler constant. Therefore,
(ln|AWi(i)|) « (lnAi)/2- ( 7 + ln2)/2. (C2)
Substituting Eq. jC2]) into Eq. JUT]) proves Eq. (120)) .
J. -P. Bouchaud and M. Potters, Theory of Financial Risk (Cambridge University Press, Cambridge, 2000).
H. Markowitz, Journal of Finance 7, 77 (1952).
D. Ben-Avraham and S. Havlin, Diffusion and Reactions in Fractals and Disordered Systems (Cambridge University Press,
Cambridge, 2000).
M. Paczuski, S. Boettcher, and M. Baiesi, Phys. Rev. Lett. 95, 181102 (2005).
P. Bak, K. Christensen, L. Danon, and T. Scanlon, Phys. Rev. Lett. 88, 178501 (2002).
A. Bunde, S. Havlin, J. W. Kantelhardt, T. Penzel, J.-H. Peter, and K. Voigt, Phys. Rev. Lett. 85, 3736 (2000).
A. Bunde, J. F. Eichner, J. W. Kantelhardt, and S. Havlin, Phys. Rev. Lett. 94, 048701 (2005).
R. Cont, Quant. Finance 1, 223 (2001).
L. Bachelier, Annales Scientifiques de l'Ecole Normale Superieure 111-17, 21 (1900).
M. Osborne, Operations Research 7, 145 (1959).
F. Black and M. Scholes, The Journal of Political Economy 81, 637 (1973).
A. Lo, Econometrica 59, 1279 (1991).
Z. Ding, C. W. J. Granger, and R. F. Engle, Journal of Empirical Finance 1, 83 (1993).
Y. Liu, P. Gopikrishnan, P. Cizeau, C.-K. Peng, M. Meyer, and H. Stanley, Phys. Rev. E 60, 1390 (1999).
O. E. Barndorff-Nielsen and N. Shephard, J. R. Statist. Soc. B 63, 167 (2001).
A. Saichev and D. Sornette, Phys. Rev. E 74, 011111 (2006).
E. Bacry, J. Delour, and J. F. Muzy, Phys. Rev. E 64, 26103 (2001).
L. Calvet and A. Fisher, Review of Economics and Statistics 84, 381 (2002).
T. Lux, submitted to Journal of Business and Economics Statistics.
L. E. Calvet and A. J. Fisher, Journal of Financial Econometrics 2 (2004).
J. Masoliver and J. Perello, Quantitative Finance 6, 423 (2006).
J. -P. Fouque, G. Papanicolaou, and K. R. Sircar, Derivatives in Financial Markets with Stochastic Volatility (Cambridge
University Press, Cambridge, 2000).
J. Perello and J. Masoliver, Phys. Rev. E 67, 037102 (2003).
J. Perello, J. Masoliver, and N. Anento, Physica A 344, 134 (2004).
J. Masoliver and J. Perello, International Journal of Theoretical and Applied Finance 5, 541 (2002).
A. A. Dragulescu and V. M. Yakovenko, Quantitative Finance 2, 443 (2002).
T. G. Andersen and J. Lund, Journal of Econometrics 77, 343 (1997).
W. A. Brock and B. D. LeBaron, Rev. Econ. Stat. 78, 94 (1996).
C. Lindberg, Mathematical Finance 16, 549 (2006).
V. Genon-Catalot, T. Jeantheau, and C. Laredo, Bernouilli 6, 1051 (2000).
O. Elerian, S. Chib, and N. Shephard, Econometrica 69, 959 (2001).
O. E. Barndorff-Nielsen and N. Shephard, J. R. Statist. Soc. B 64, 253 (2002).
V. Genon-Catalot, T. Jeantheau, and C. Laredo, Scandinavian Journal of Statistics 30, 297 (2003).
F. Comte and V. Genon-Catalot, Scandinavian Journal of Statistics 33, 875 (2006).
J. E. Griffin and M. F. J. Steel, Journal of Econometrics 134, 605 (2006).
G. Jongbloed and F. H. V. der Meulen, Scandinavian Journal of Statistics 33, 825 (2006).
K. Morimune, The Japanese Economic Review 58, 1 (2007).
T. Andersen and T. Bollerslev, Journal of Finance 53, 219 (1998).
E. Barucci and R. Reno, Economics Letters 74, 371 (2002).
Y. Omori, S. Chib, N. Shephard, and J. Nakajima, Journal of Econometrics (2006).
R. Reno, Econometric Theory (2007).
I. M. Gelfand and G. E. Shilov, Generalized Functions (Academic Press, New York, 1964).
T. K. Moon and W. C. Stirling, Mathematical Methods and Algorithms for Signal Processing (Prentice Hall, 2000).
C. R. Champlin and D. Morrell, in Proceedings of the 31 st Conference on Decision and Control (1991), p. 3707.
C. W. Gardiner, Handbook of Stochastic Methods (Springer- Verlag, Berlin, 1997).
X. Gabaix, P. Gopikrishnan, V. Plerou, and H. Stanley, Nature 423, 267 (2003).
Z. Eisler and J. Kertesz, Eur. Phys. J. B 51, 145 (2006).
J. D. Farmer, L. Gillemot, F. Lillo, S. Mike, and A. Sen, Quantitative Finance 4, 383 (2004).
17
[49] J. Masoliver, Phys. Rev. E 48, 121 (1993).
[50] I. S. Gradstein and I. M. Ryzhik, Table of Integrals, Series, and Products (Academic Press, San Diego, 1994).
[51] W. Magnus, F. Oberhettinger, and R. P. Soni, Formulas and Theorems for the Special Functions of the Mathematical
Physics (Springer- Verlag, Berlin, 1966).