# Full text of "Volatility: a hidden Markov process in financial time series"

## See other formats

Volatility: a hidden Markov process in financial time series Oh' Zoltan EisleiQ Department of Theoretical Physics, Budapest University of Technology and Economics Budafoki lit 8., H-llll, Budapest, Hungary Josep Perelkfj] and Jaume Masolivci@ Departament de Fisica Fonamental, Universitat de Barcelona Diagonal 647, E-08028 Barcelona, Spain (Dated: February 2, 2008) 04 ■ The volatility characterizes the amplitude of price return fluctuations. It is a central magnitude in ' finance closely related to the risk of holding a certain asset. Despite its popularity on trading floors, the volatility is unobservable and only the price is known. Diffusion theory has many common points with the research on volatility, the key of the analogy being that volatility is the time-dependent £ — ■ diffusion coefficient of the random walk for the price return. We present a formal procedure to extract £Nj ' volatility from price data, by assuming that it is described by a hidden Markov process which together with the price form a two-dimensional diffusion process. We derive a maximum likelihood estimate of the volatility path valid for a wide class of two-dimensional diffusion processes. The choice of the exponential Ornstein-Uhlenbeck (expOU) stochastic volatility model performs remarkably well in inferring the hidden state of volatility. The formalism is applied to the Dow Jones index. The Q ■ main results are: (i) the distribution of estimated volatility is lognormal, which is consistent with O ' the expOU model and (ii) the estimated volatility is related to trading volume by a power law of & \ the form a tx y ' 55 ; and (iii) future returns are proportional to the current volatility which suggests C/j , some degree of predictability for the size of future returns. o ■ ; PACS numbers: 89.65.Gh, 02.50.Ey, 05.40. Jc, 05.45.Tp Keywords: random diffusion, econophysics, stochastic volatility 43 : £3.; m ! I. INTRODUCTION > ■ 00 ■ The volatility measures the amplitude of return fluctuations and it is one of the central quantities in finance [l[ . Investors sometimes place even greater emphasis on the level of volatility than on the market trend itself. The reason C\| ■ for this is mainly that the risk of holding an asset is classically associated with its volatility Q. The theoretical framework used to quantify aspects of price fluctuations has many common points with areas of physics dealing with noisy signals. The research of random diffusion aims to describe the dynamics of particles in random media and its methods have been applied to a large variety of phenomena in statistical physics and condensed matter Q . Time series & describing solar flares, earthquakes, the human heartbeat or climate records show strong correlations, multi-scaling, ^ ■ non-Gaussian statistics and self-organized behavior [IJMS, 0] • These are properties also observed in financial time series where volatility is considered to play a key role |8[ . The picture that prices follow a simple diffusion process was first proposed by Bachelier in 1900 Q. Later in 1959, the physicist Osborne introduced the geometric Brownian motion and suggested that volatility can be viewed as the diffusion coefficient of this random walk (Io| . The simplest possible assumption - that it is a time-independent . ^ ' constant - lies at the heart of classical models such as the Black-Scholes option pricing formula [ll[ ■ More recently it has become widely accepted that such an assumption is inadequate to explain the richness of the markets' behavior 5^ ' [l|. Instead, volatility itself should be treated as a random quantity with its own particular dynamics. Among its most relevant properties [l], [l2|, [IH, [ld[ , volatility is the responsible for the observed clustering in price changes. That is: large fluctuations are commonly followed by other large fluctuations and similarly for small changes P, i|. Another related feature is that, in clear contrast with price changes which show negligible autocorrelations, volatility autocorrelation is still significant for time lags longer than a year [H, [l2|, EH ■ Most of these studies introduce a subordinated process which is associated with the volatility in one way or another [lj| El, [l?], EH EH US HH HH US O 'Electronic address: eislcr@maxwcll.phy.bme.hu ^Electronic address: joscp.pcrcllo@ub.edu t Electronic address: jaumc.masolivcr@ub.edu 2 The main obstacle of the appropriate analysis of volatility is that it is directly unobservable. As we have mentioned, volatility provides important information to traders but it is very unclear how reliable the estimates of such a hidden process can be. Investors use several proxies to infer the level of current asset volatility. The most common ways are: (i) to make it equivalent to the absolute value of return changes, (ii) to assume a proportional law between volatility and market volume prl I27L l28l . l29l | , and (iii) to use the information contained in option prices and obtain the so-called "implied volatility" which, in fact, corresponds to the market's belief of volatility This already raises the questions: What process is a proper model of volatility and how to adjust the possible parameters to describe various stocks and markets? Among the possible candidates, multifractals [2, E3, El, El, l20j and stochastic volatility models [TH, [2l|, [22|, [23|, [24|, HH, HH are the most promising. The fact that volatility is directly unobservable makes even more difficult to get a conclusive answer about the best model. This doubled challenge has deserved the attention from the most diverse disciplines that look at financial markets. All of them have converged on the description of what is technically known as realized volatility and on methodologies for reconstructing volatility path. Several procedures for reconstructing volatility have already been presented being more or less dependent on the volatility model chosen. Mathematics, econometrics, and finance research have a large number of publications during the last decade on this issue (see e.g. [H, H3, SH El, |H, H, [H, H, [H, [H, [H, \M, EH ) . Our research presents an alternative procedure to estimate volatility from the price dynamics only. Likewise most of the papers in the literature, the price dynamics is represented by a two-dimensional diffusion process: one dimension for price and second dimension for a volatility described by a hidden Markov process. Our case estimates the volatility subordinated time series through maximum likelihood but, in contrast with other studies, having fixed in advance the parameters of the model. We have decided to focus on a particular stochastic volatility model that is able to circumvent both mathematical and computational difficulties: the exponential Ornstein-Uhlenbeck volatility model [2l|, HI] although the same procedure can be used for a larger class of models. As its name indicates, the model assumes that the logarithm of the volatility follows an Ornstein-Uhlenbeck process, that is: a mean reverting process with linear drift. The resulting model is capable of reproducing the statistical properties of the financial markets fairly well [2l| . The paper is organized as follows. In Sec. |TT] we outline the general stochastic volatility framework and more specifically the exponential Ornstein-Uhlenbeck model. In Sec. IIIII we present a maximum likelihood estimator for a wide class of stochastic volatility models. For the case of the exponential Ornstein-Uhlenbeck (expOU) model we present Monte Carlo simulations to show that it performs remarkably well in inferring the hidden state of the volatility process. In Sec. IIVI the procedure is applied to the Dow Jones Industrial Average. Conclusions are drawn in Sec. [V] and some more technical details are left to the appendices. II. STOCHASTIC VOLATILITY MODELS The geometric Brownian motion (GBM) [l(| is the most widely used model in finance. In this setting the asset price S(t) is described through the following Langcvin equation (in Ito sense): ^=Vdt + *dW 1 (t), (1) where a is the volatility, assumed to be constant, (i is some deterministic drift indicating an eventual trend in the market, and W\(t) is the Wiener process. We define the zero-mean return X(t) as X(t) = In [S(t + t Q )/S(t )} - (In [S(t + t )/S(t Q )}) , (2) where the symbol (• • • ) designates the average value and t is the initial time which is usually set to be zero. In terms of X(t) the GBM is simply written as dX(t) =adWi(t). (3) However, especially after the 1987 market crash, compelling empirical evidence has become available that the assumption of a constant volatility is doubtful @ . Neither is volatility a deterministic function of time as one might expect on account of the non-stationarity of financial data, but a random quantity [22l | . In the most general setting one therefore assumes that the volatility er is a given function of a random process Y(t): a(t) = f[Y(t)). (4) 3 Most stochastic volatility (SV) models assume Y(t) is also a diffusion process that may or may not be correlated with price. The main difference between various models is only the parametrization of this scheme. In a general notation the zero- mean return X(t) defined above is described by the following set of stochastic differential equations dX(t) = flY^dWxit) (5) dY(t) = -g[Y(t)]dt + h[Y(t)]dW 2 (t), (6) where dX = dS/S — (dS/S) and /, g and h are given functions of Y(t). As shown in Eq. f[Y(t)] corresponds to the volatility, i.e., the amplitude of return fluctuations. However, since f(x) is usually chosen to be a monotonically increasing function, it is not misleading to think of Y as a measure of volatility. Thus, as far as there is no confusion, we will refer to the process Y(t) as "volatility" as well. On the other hand, the function g[Y(i)] describes a reverting force that drives the volatility toward the so-called "normal level". This force brings the volatility process to a stationary regime for long time horizons and the normal level is related to the average volatility in that limit. Finally, the subordinated process Y(t) may have a non-constant diffusion coefficient defined in terms of the function h[Y(t)} which is called the volatility-of- volatility ("vol of vol"). The functions g and h fully describe the volatility process. The resulting dynamics is comparable with the one described by a Gaussian particle trapped in a potential well V(y) whose associated force is —g{y), where g(y) = V'(y). In finance one typically proposes convex potentials with only one minimum whose value is related to the normal level of the volatility. In what follows we will mostly work with one particular SV model, the exponential Ornstcin-Uhlcnbcck (cxpOU) model, which follows from the substitutions: f(x) = me 1 , g(x) = ax, h(x) = k, that is, dX(t) = me Y ^dW 1 (t), (7) dY(t) = -aY(t)dt + kdW 2 (t). (8) Note that in this model the process Y(i) is precisely the logarithm of the volatility, or "log- volatility" for short. The main statistical properties of the model are thoroughly discussed in Ref. [2l[ . We simply recall that the stationary distribution of the process Y(t) is a Gaussian (i.e., a lognormal distribution for a) with zero mean and variance (3: p(y) = -^=cxp(-y 2 /2(3), (9) y Z7T/3 where (3=k 2 /2a. (10) III. VOLATILITY ESTIMATION A. The Wiener measure and volatility estimation Let X and Y denote a simultaneous realization of the variables X(t) and Y(t) in the time interval t — s < r < t. Omitting all Y-indepcndent terms, we show in Appendix [A] that the probability density (likelihood) of such a realization is approximately given by InP(X,Y)~~ f ^ J t—s Before proceeding further, we will discuss the meaning of this expression. We first note that Eq. pd|) has to be understood in the sense of generalized functions [42j since the Wiener process is only differentiable in this sense and, hence, X(t) and Y(t) do not exist as ordinary functions and Eq. is just a symbolic expression. Nevertheless, the formula is still valid when the integral and the derivatives therein are discretized with arbitrary small time steps, a requirement that is indeed necessary for numerical computations. Let us now see some qualitative properties of Eq. pip. The first summand measures the fluctuations of the zero-mean return with respect to the volatility, [X (t) / f (Y (t))] 2 , and their contribution to the likelihood (probability Mr) f[Y(r)] dr-- 2 Y + 9[Y(t)\ h[Y(r)] dr (11) 4 density) of a given return realization. Note that the higher this contribution is, the lower those "relative" fluctuations are. In the same fashion, the second summand in Eq. pip measures the fluctuations of the volatility process Y(t) with respect to the vol of vol h(Y), although in this case these fluctuations are gauged with the mean reverting force —g(Y). As before, the lower these fluctuations, the higher their contribution to the log-likelihood (TTT|) . While Eqs. (J5j)- (J6]) represent a joint model for return and volatility, the stock market data only include recordings of the return process X . The Y process and, hence, the volatility f(Y), must be inferred indirectly in a Bayesian fashion through Eq. (fTTj) . Indeed, the conditional probability density that the realization of the hidden Y"-process is Y, given that the observed return is X, reads lnP(Y|X) = lnP(X,Y) -lnP(X). (12) In consequence, we can find the maximum likelihood sample path of the (hidden ) volatility process by maximizing Eq. ilty) with respect to Y (recall that Y is a realization of Y(r) in the interval t — s < r < t). Since the second summand of Eq. (|12p is independent of Y, we can neglect P(X) in this maximization process. Therefore, the maximization of lnP(Y|X) yields the same result as that of lnP(X, Y). Note that, besides the specification of the stochastic volatility model (that is, the explicit forms of /, g, and h), the only free parameter is s: the duration of past return data to take into account. After substituting the observed return history as X, we will obtain by maximum likelihood the quantity: Y = argmax Y lnP(Y|X) = argmax Y lnP(X, Y). (13) We should mention that similar maximization problems have been studied in the context of hidden Markov models, where this procedure is called "decoding". When the state space (the number of possible X and Y values) is finite, the optimization can be done exactly by the Viterbi algorithm [431 ] , while there has been limited success in the continuous case [441 ] . A similar technique has been applied to the forecasting of volatility assuming a binomial cascade model which, unlike stochastic volatility models, has a finite state space (see Ref. [13] )■ More focused on the stochastic volatility, we should also mention the efforts based on Kalman or Particle filtering, Bayesian inference, conditional likelihood or Fourier methods among other similar techniques (see e.g. Refs. [H, H3, H3, [H, [H, [H, [35|, [H, H3, [H, [H, 0, 5l[ ) . Most of the cited works assume an specific model, although they never focuss on the expOU model. In addition, they are mainly worried about intraday (high-frequency) data while we are here focussed on reproducing daily (low-frequency) data. It deserves an special attention the work on a superposition of Levy Ornstein-Uhlenbeck processes for a 2 crafted to describe high-frequency (intraday) data and provide a power law slow decay for the volatility autocorrelation [T5II32I]. All the techniques mentioned above provide an efficient way to reproduce the volatility path but, in contrast to our case, the method also serves to estimate the parameters of the model. We have decided to provide the parameters beforehand thus using an independent way of estimating them (see Sec. IIV A| . It should be left for a future research an accurate comparison between our method and others. We may look for a way to implement parameter estimation in our method. As we have already stated, our main objective is to design a method able to filter the Wiener noise dW\(t) out of Eq. and thus to obtain a reliable estimate Y(t) of the hidden volatility process Y(t). The method, an extension of the deconvolution procedure previously presented in plj . has basically the following five steps: (i) We simulate a random sample path of the Wiener process dWi(r) for t — s < r < t. (ii) Then a surrogate realization of Y is generated as Y(r) = r 1 dX(r) dWx{T) (14) where t — s < r < t. Note that this equation requires that f(x) is invertible which implies that f(x) be chosen to be a monotonic function. (iii) Substitute Y s and X into Eq. (fTTj) to calculate the log-likelihood of this realization. (iv) Iterate (i)-(iii) for / steps, keep the highest likelihood random realization (the conditional median) and assume this to be the proper estimate Y(t). (v) The estimate of the hidden process at time t is then Y(t). The estimate of the volatility is given by a(t) = f[Y(t)]. (15) 5 B. Interpretation of the method Let us further elaborate the meaning of such an estimate. In finance, the volatility is often identified with the absolute value of returns variations. Indeed, as a first approximation, we can replace in Eq. ^ the noise term by its expected value and write r ,. \dX(t)\ which shows that the volatility is approximately proportional to the absolute returns. Eq. ([5]) can be thus thought of as a first approximation toward estimating volatility. Our method, based on the maximization of Eq. (|12|) . takes this estimation two steps further. In effect, the first step was taken in Ref. [2l[ where we replaced the average (|G?Wi(i)|) by a simulated sample path. We are now taking a second and more refined step in which we are not only replacing the Wiener noise by a random simulation but, in addition, we perform the maximum likelihood method described by items (i)-(v). Thus we arc basically separating the observed returns dX(t) into two sources: a(t) and dW\{t). To do this, we have first considered a specific form of stochastic volatility. Secondly, we have taken the driving Wiener noises dW\ (t) and dW2(t) appearing in Eqs. ((5]) and ([6]) to be uncorrelated. Finally, we have assumed that a(t) is approximately constant over the time step during which we numerically evaluate the derivatives Y(t) and X(t) appearing in Eq. (fTTj) . We incidentally note that if h(x) = - the vol of vol is equal to zero - then the stationary solution of Eq. ^ is Y(t) = 0. Thus the model reduces to the Wiener process in which the volatility is constant and absolute returns are uncorrelated @. C. The performance of the estimator In order to test the performance of the estimator, we simulate the expOU process by using Eqs. 0-© with the realistic parameters obtained in Section HVl The relationship between the simulated value of the log- volatility Y(t) and its estimate Y{t) is given in Fig. [1] The two quantities agree within error bars, so we may state that Y{t) « Y(t) (in mean square sense). In what follows we will always use s = 10 days of past data and I = 10 5 iterations for maximization. The time step for discretization will be At = 1 day. For random optimization, the necessary number of iterations / grows very fastly, perhaps exponentially, with s/ At. The values of s and At cited here were chosen to keep the task computationally feasible. As for the value of /, in this case an increase to I = 10 6 does not improve the estimates noticeably. The estimates generated with this parameter set have negligible bias and they can efficiently distinguish between low and high volatility periods (see Appendix [B] for a discussion on the robustness and possible bias of the procedure) . An additional verification of the good performance of our estimate is shown in Fig. [2J where we give the actual sample path of Y for a single realization of the expOU model together with the estimated Y. As we can see the estimate follows the true log- volatility Y(t) very closely. D. Volatility forecasting Another possible approach to the utility of the expOU model is to evaluate its forecasting performance. One can for example use the mean error of absolute returns \dX(t + h)\ for a given Monte Carlo simulated path: E(h) = (j\dX(t + h)\ - \dX(t + h)\fj, (17) where \dX(t + h)\ is our estimate for the absolute return \dX(t + h)\ given an estimate Y(t) while h is the time horizon (number of days) to forecast ahead. We recall that for our estimates we take the conditional median of our MonteCarlo simulations. Thus for the expOU model we have \dX(t + h)\ =M[\dWi(t + h)\]mexp[Y(t)exp(-ah)], (18) where M[|(iVFi(f + /i)|] denotes the median of the absolute value of Wiener increment. Figure [3] compares five methods to forecast \dX(t + h)\ using only information available before time t. For clarity we also give the titles used in Fig. 1 in brackets ("..."): 6 3 >- £ estimated Y estimated = true true Y Figure 1: Estimated log- volatility Y as a function of the actual log- volatility Y taken from 2 x 10 5 simulations of the expOU model. Reconstruction used the last s — 10 values of returns, and I = 10 5 iterations. The error bars represent the 25% and 75% quantiles of the distribution estimated volatility. Figure 2: Estimated and actual volatility for a typical sample path of the expOU model. The estimated values were smoothed by 5-neighbor averaging to reduce noise. 1. The median of the last 5 absolute returns. This method clearly has predictive power for h < 150 days. ("5-day abs. ret.") 2. The median of the last 15 absolute returns. The longer averaging period brings a substantial improvement for short-term forecasts, but not for long-term. ("15-day abs. ret.") 3. Eq. (fl~8|) with Y(t) = Y(t), which is the true value in the underlying simulation of the expOU model. This gives a substantial decrease in forecasting error, which persists up to h ~ 500 days. This curve is also a theoretical lower bound for error achievable by the expOU model in any time series: Here the underlying data are perfectly described by expOU, and the parameters and Y(t) are known "perfectly". Neither of these usually happen in real data, and so there one expects worse performance, ("perfect") 4. Eq. ([T8|) with Y(t) estimated by the reconstruction procedure of Sec. IIIIl This estimator does not perform well, due to noise in the optimization procedure. ("1-day forecast") 5. Eq. (fT8|) with Y(t) estimated by the reconstruction procedure of Sec. IIIIl then averaged for the last 5 days. Such averaging greatly decreases noise, and the accuracy of the forecast is improved for short times (cf. Fig. [2]). ("5-day forecast") 7 c 03 <u E m > <u 250 500 750 forecast horizon (day) Figure 3: Median forecasting error E(h) of absolute returns calculated according to Eq. (|18fl by five methods. The errors were normalized by the lowest level of error achievable with the assumption of a time-independent constant volatility (horizontal dashed line). The result was averaged over 150000 independent realizations of the expOU process, parameters were chosen as outlined in Sec. IIV Al Finally, note that for h ~ 500 days all estimators based on the expOU model lose any information included in Y(t), and converge to the same error level. This is consistent with real data (cf. Sec. IIV Dp . IV. APPLICATION TO STOCK MARKET DATA In this section we present an application of the method to actual stock market data. We analyze the Dow Jones Industrial Average (DJIA) index in the period 01 /01/1900-05/26/2006, a total of 29, 038 days. In order to work with zero-mean returns, the mean return was subtracted from the actual data. Trading volumes for the index are only available for the period 04/01/1993 - 05/26/2006, a total of 3, 375 days. Parameter estimation We recall that in the estimation procedure presented here one necessarily needs to assume a theoretical model for the volatility. Having done this, the next step is to estimate the parameters involved in the model chosen. For the expOU model, Eqs. (O-©, these parameters are: to, k and a. Before proceeding further we remark that time increments in real data have a finite size since the market always works on discrete times (for daily data the minimum time increment is 1 day). Thus, in practice, the (infinitesimal) return variation dX (t) = X(t + dt) — X(t) corresponds to a (finite) return increment A A" (t) = X (t + At) — X (t ) where At is the time step between two consecutive ticks. Also the Wiener differentials dW(t) correspond, in mean square sense [45[, to the increments (19) AW(t) ps e(f)VAt, where e(t) is a Gaussian process with zero mean and unit variance [45j |. In the present case our time step has a fixed width and is equal to At = 1 day. Coming back to the estimation of parameters, we show in Appendix [Q that lnm ps ( 7 + ln2)/2+ (Jin (j AA|/VaT) ^ , (20) where 7 = 0.5772 • • • is the Euler constant. Taking into account that the third summand can be evaluated from data we see that Eq. (f2"0| provides a direct estimation of m. On the other hand, if the expOU model is appropriate then the empirical estimate Y(t) of the hidden volatility Y(t) should also be a Gaussian process with a stationary distribution of zero mean and variance given by j3 = k 2 /2a 8 [see Eq. (|10[) ]. As shown in Fig. [5J if one takes (3 = 0.61 ± 0.05, the distribution of Y{t) is Gaussian and coincides with the theoretical distribution of Y(t) given by Eq. @. The assumption of a Gaussian distribution for our estimate is robust and it holds for a wide range of parameters. To fully specify the model we have to obtain the parameter a. We have chosen the value found in Ref. [2l[ which was obtained in order to capture the long range correlations of the volatility, at least up to 500 days. The model is able to provide the appropriate long range behavior with an infinite sum of exponentials but it is also true that it does not provide a pure power law decay like the models in Refs. [l5lll7||. In any case, we have seen in Ref. [2l[ that at least for daily data our approach is satisfactory. Our final set of parameter estimates is thus m = (7.5 ± 0.5) x 10~ 3 days~ 1/2 , a = (1.82 ± 0.03) x 10 -3 days" 1 , and k = (4.7 ± 0.3) x 10~ 2 days" 1/2 . The errors were determined based on Fig. [4j similarly to the error of f3. In this parameter range the distributions of the estimated Y(t) and simulated Y(t) series agree well. The results reported throughout this paper are insensitive to the misspecification of these parameters, even beyond these error bars (see also Appendix |B|) . The distributions of log- volatility are compared in Fig. 0]for four cases: our maximum likelihood procedure applied to Dow Jones, a simulation of cxpOU, the simple estimate of Eq. (|21[) and the deconvolution procedure introduced in Ref. \2M which can be written as Y d ccon(t) = In — 171 dX{t) dW x {t) (21) where Wi(t) is a simulation of the Wiener process. Note that this deconvoluted log-return estimator has indeed a Gaussian distribution but with a larger variance as hinted in Fig. 2]in view of its wider density. We finally mention that the estimate for the log-volatility \n(a/m) sa In \dX\ given by Eq. (flU]) shows a non Gaussian and biased distribution as was also reported in Ref. [2l[. This suggests that l^decon is an appropriate "null model" to contrast with Y. Both quantities are generated by dividing dX(t) by the increments of a realization of the Wiener process and then taking the logarithm of the absolute value of this ratio. The difference lays in the fact that Y takes the realizations that satisfy a maximum likelihood requirement while Idecon takes a Wiener process realization that is purely random. In such a way, our method keeps the divisor correlated with dX and, as we will see next, it seems to conserve clustering and memory effects in the log-volatility time series. B. Clustering and the estimated volatility To support our claim that the technique presented is powerful enough in filtering the noise dW\ out of returns we will give a visual comparison based on the following qualitative experiments. Figure [5] [Left] displays a comparison over a 1000-day time interval. One can observe there that the noise level in Y is substantially smaller than in Ydccon, as also inferred from Fig. |U In order to show that such a correlation is responsible for suppressing large fluctuations in the ratio, we can perform a second experiment. Thus in Fig. O [Right] we see a comparison between the logarithm of absolute returns variations, In \dX\, and the estimated volatility Y, Note that the proper clustering of volatility becomes clearly visible. C. A comparison with trading volume The hidden nature of the volatility process has been addressed by several authors [2(1, [2?], For instance, Ref. [2(| suggests that, instead of the volatility, a good estimate would be the square root of the daily trading volume. That is, M[a(t)} cx V{t) a , (22) where a = 0.5 and M[-] denotes the median. In Fig. [5] we show evidence that supports this assumption. Again, the first estimation for the volatility is Therefore, we regress ln|c?X(t)| versus \nV(t) as shown in Fig. [5] 9 Figure 4: A comparison of estimates of log- volatility for Dow Jones. Black boxes ("estimated Y, D JIA") represent our maximum likelihood method applied to empirical data. Empty circles ("simulated Y, expOU") represent the distribution of the simulated sample path of the log- volatility, assuming that Y(t) follows the expOU model. We also plot the empirical distributions of two estimates: the red line ("In [return], DJIA") was obtained through Eq. (|16[l and the black line ("In [deconv. ret], DJIA") via Eq. dHJ. Figure 5: [Left] A comparison between the estimate Y(t) and Y& ecorl (t) for a typical 1000-day period of Dow Jones. These curves were not smoothed in order to show the substantial reduction of both the noise level and the asymmetry in Y(t) compared to ^decon(i) with a random approximation of the noise term dW. [Right] The estimate Y(t) and the logarithm of absolute return variations for the whole sample of Dow Jones. In the same figure we also present the regression between the maximum likelihood estimate Y(t) and hiV(t) which appears to be less noisy than the former regression in accordance with the smaller variance of Y(t) compared to that of ln|dAT(i)|. Nevertheless, the exponent a « 0.55 is the same for both regressions. There have been similar [Z(|, albeit controversial [13, [H|, findings for the price impact of single transactions. However, Eq. does not yet imply that volatility is proportional to volume, only that its typical value is (i.e., the median). Fluctuations around the average behavior due to changes in liquidity might have a key role in the process (48j . 10 9 10 In(volume) (arb.) 11 Figure 6: Logarithm of daily absolute return and estimated log-volatility Y as the function of daily volume. Days with similar volumes were binned for better visibility. The symbols represent the medians, and the error bars the 25 — 75% quantiles in the bins. D. The predictive power of volatility From Eqs. ([T])-©, we know that for the expOU model simple relationship can be given between ln|dX(t)| and Y(ty. \n\dX(t)\ =ln(m\dW 1 (t)\)+Y(t). Therefore, the conditional median of In |gLY(£)| given Y(t) is M In|dX(t)| Y(t) const. + Y(t). (23) We point out that this relationship implies some degree of predictability of the absolute changes in return through their median, if one knows the current value of the log- volatility Y(t). We test Eq. (|23[) for real data and with Y(t) replaced by its estimate Y(t). As shown by the bottom curve of Fig. [7l the slope of the linear regression between M[ln |F] and Y(t) is not equal to 1 - as would have been implied by Eq. (f2"3"| - but 0.9 which still suggests strong predictive power. Recall that the minimum time step of the empirical data used is 1 day. Hence, Eq. (|23p implies the prediction of tomorrow's return based on today's volatility and return. Wc now want to extend the prediction horizon. To this end we generalize Eq. (|23[) and propose the following ansatz: M ]n\dX(t + h)\ Y(t) const. +^(h)Y(t), (24) where h = 0, 1, 2, ■ • • . In Fig. [7] we test this ansatz for several values of the horizon: h = 0,5, 20, 100 and 1000 days. We find that the slope "f{h) is a decreasing function from the value 7(0) = 0.9 to practically zero when h = 1000 days which means a complete loss of memory. Note that, when h = 100 trading days we have 7 = 0.25 still implying a slight degree of prediction after approximately five months, which is of the same order of magnitude than the DJIA characteristic time scale, 1/a ~ 500 days, for the relaxation of the volatility plj . V. CONCLUSIONS The volatility is a crucial quantity for financial markets since it provides a measure of the amplitude of price fluctuations. Traders try to follow carefully the level of volatility because it gives the perception of the risk associated with any asset. Although volatility was originally conceived to be the diffusion coefficient of the price return random walk, there is compelling evidence not to consider it a constant, but a subordinated random process. The framework is 11 2T -10-L-, . , . , . , . , . 1 -2-10 1 2 3 estimated Y (5 day average) Figure 7: The proportionality between the estimated volatility and of the logarithm of absolute return variations. In order to decrease noise, 5-day moving averages have been used. Numbers on the right indicate the slopes of the corresponding regression lines. Time shifts from bottom to top: h = days (■), 5 days (•), 20 days (a), 100 days (▼), 1000 days (♦). Days with similar absolute returns were binned for better visibility. The symbols represent the medians, and the error bars the 25 — 75% quantiles in the bins. analogous to that of random diffusion processes which have been applied to a large variety of phenomena in statistical mechanics and condensed matter physics. The main obstacle to get a better knowledge of the volatility's nature is that it is not directly observed. In fact, this is precisely the motivation behind the present research. Our main objective has been to develop a tool which visualizes the sample path of volatility. The procedure derives a maximum likelihood estimate assuming that the volatility is a hidden Markov process. To do so, one needs also to assume a specific model for the volatility. We have chosen a class of two dimensional diffusions commonly known as stochastic volatility models, where the volatility acts as a diffusion particle trapped in a potential well. We have focused on the expOU model and obtained promising results, especially for three reasons: (i) the model is computationally feasible; (ii) its parameters can be easily obtained and fit the data reasonably well; and (iii) the distribution of the estimated volatility is log-normal, which is consistent with the assumed expOU model. We have shown for the Dow Jones index daily data that the sample path of our estimated volatility improves other estimates. We have compared our estimation with a rather typical one which identifies volatility with absolute return changes. Our estimation is able to remove the existing bias in the stationary distribution of volatility while still preserving the clustering in volatility time scries. We have also studied the estimate of volatility that deconvolutcs the return by the simulation of a random Wiener path [2l| . This last procedure also provides a Gaussian distribution for the log-volatility, albeit the distribution has too fat tails and pays the price of losing clustering and memory in the volatility time series. Our new procedure is in fact a more sophisticated variant of this estimate since it filters out Wiener realizations via maximum likelihood. The estimate drastically reduces the noise in the volatility path thus preserving data clustering. In this way we have thus proposed an alternative methods to those already provided by mathematical finance literature (see e.g. [H, H3, l3ll l32l. [H, [HI, [H, [H, H3, [H, [H, H(| EI)- ^ should however be left for a future research an accurate comparison between our method and others. We may even look for a way to implement parameter estimation in our method. The median of the estimated volatility has also been related to trading volume by the power-law expression M[a] oc t/0.55^ jjjjj between volatility and trading volume has been previously mentioned in different studies however our estimate is again capable to provide a less noisy regression. We must, indeed, stress the fact that this does not imply that volatility is proportional to a power of the volume, but only that its typical value is and that fluctuations around the average might play an important role. We have also seen that current returns are proportional to the estimated volatility, as otherwise expected. How- ever, the main novelty is that we have observed how future returns are proportional to current volatility and their 12 predictability diminishes monotonically with the number of time steps ahead. This last finding implies that our estimation method can be applied to predict the size of future returns with the knowledge of current volatility. As a final remark we stress the fact that the technique herein presented can be applied to a variety of physical phenomena besides finance. One typical problem of this sort is provided by the Brownian motion inside a field of force in which inertial effects are not negligible (49[. In this situation the dynamics of the particle is described by a two dimensional diffusion process (X(t), V(t)) representing the position and the velocity of the Brownian particle. The maximum likelihood technique might provide a reliable estimate of the velocity in the case that, for instance, the only accessible experimental measures are the positions of the particle at wide time steps, so that a measure of the velocity - which implies the knowledge of two very close positions - is too noisy and unreliable. Acknowledgments ZE is grateful to Janos Kertesz for his support and to the Universitat de Barcelona for its hospitality during his visit at the Departamcnt de Fisica Fonamental; also support by OTKA T049238 is acknowledged. JP and JM acknowledge support from Direccion General de Investigation under contract No. FIS2006-05204. Appendix A: DERIVATION OF THE LIKELIHOOD FUNCTION In order to make notations more compact, in this Appendix the time dependence of the stochastic processes is mostly indicated as a lower index. A generic stochastic volatility model is defined as dX t = f{Y t )dW 1 (t), (Al) dY t = -g(Y t )dt + h(Y t )dW 2 (t). (A2) To explain the procedure it is more convenient to work with the discrete time version of the model. To this end, suppose that At is a small time step and that the driving noises in Eqs. (|Alj) - (|A2|) can be approximated by (c/Eq. dWi(t)ttSi(t)VAt, (t = l,2), (A3) where £i(t) are independent standard Gaussian processes with zero mean and unit variance. We remark that the approximation (|A3p has to be understood in mean square sense (45| . The discrete time equations of the model thus read X t - X t _ At = /(y t _ At )£i(i - At)VAi (A4) Y t - Y t _ At = -g(Y t „ At )At + h(Y t _ At )e 2 (t - At)^At (A5) from which we get e l {t-At)= (A6) /(y t _ At )VAF e 2 (t-At) = Y *- Yt -* t+9 ^ At . (A7) For a given number of realizations, the probability of the set {X T , Y T } (r = t — At, t — 2At, . . .,t — s) can be easily obtained, as we will see next. Let us denote the set of realizations as {X, Y}. Then the Markov property of the process ensures that one can decompose the joint probability density function (pdf ) of this set as a chain of products between conditional probability densities. In consequence, the pdf of the whole sample path can be written as t-At p({x,y}) = p(Av s ,y t _ s ) Yl F(x T ,Y T \x T _ At ,Y T _ At ), r—t — s (A8) 13 where the first term, ¥(X t - s , Y t - S ), corresponds to the initial realizations of X and Y s/At time steps far from the present time t; all the remaining terms of the form f(X T , Y T \X T _Ati Y T _At) are the conditional pdf's for transitions between consecutive steps: (X T -At, Y T -At) — ► (X T ,Y T ). The logarithm of Eq. (|A8J) is t-At lnP({X,Y}) = lnP(AV s ,y t _ s ) + \nP(X r ,Y T \X T _At,Y T _ At ). r—t — s On the other hand from Eqs. (|A4|) - (|A5|) we realize that ¥{X Tl Y r \X T _At,Y T _At) = \J\P(sx(T-At),s 2 (T-At)), where \ J\ is the Jacobian of the transformation (X T , Y T ) — > — At), e 2 {t — At)) defined by Eqs. is, 1 \J\ = (A9) 3, that f(Y T -At)h(Y T -At)At But £i and £2 are independent standard Gaussians, hence P(e 1 ,ea) = (l/27r)ffltp[(e?+e2)/2] ) whence F(X T ,Y T \X T _ A t,Y T -At) l/(27rAt) f{Y T -At)h(Xr-At) exp £?(t- At)+sl(r- At) (A10) Substituting Eqs. (|A6[) - (|A7p into this equation and the result into Eq. (|A9[) we finally get i-Ai lnP({X,Y}) = _ sln (2^At) _ ^ [ m/( y r _ At)+m/l( y T _ At)] t-At inP(x t _ s ,y t _ s )-i £ X T — Xf- Al i-Ai Y T -Y T _At 9(Y T -At) h(Y T -At)At h(Y T —At) f(Y T _ At )At -1 2 At At. (All) Let us briefly explain the origin of some of these contributions. The first summand comes from the normalization constant of the Gaussian distribution (|A10|) . It appears in every conditional probability density and this is the reason for the factor s/At, which is the number of time steps between t — s and t. The resulting term does not depend on the realization, so that we can neglect it for a maximization with respect to Y. The term also goes to — 00 in the At — > limit, which means that any individual realization has a probability measure zero. The second summand is mostly the sum of the Jacobian transformations of each transition probability and depends on Y . Stochastic volatility models typically assume that these / and g [cf. Eqs. (|A1|) and (|A2|) ] are continuous and monotonically increasing functions or even constants. For instance, in the expOU model we have /(x) = mexp(x) and g(x) = k. Because of this, we will also neglect this term in the maximization procedure. The contribution shifts the maximum at the excessive cost of adding more noise to the numerical computations. We can however look at the situation from another point of view. Ignoring this term is equivalent to omitting the Jacobian transformation between the two probability density measures [cf. Eq. (|A10[) ] . In this way, we are stating that what we are really going to maximize is the probability of the realization of £\{t) and £2(t) -instead of Y(t)- in terms of the past history of the process expressed by {X, Y}. The term lnP(X t _ s , Y t - S ) is fixed by the initial conditions of the process. If we assume a known initial return X - which can be set to zero - and take a random Y t - S following the stationary distribution P st (Yt_ s ) given by Eq. ([5]), then P(X t _ s ,F t _ s ) = 5(X t _ s - X) P st (Y t _ s ) and hence lnP(AV s ,y t _ s ) = lnP st (r 4 _ s ) +ln<5(AVs - X). (A12) 14 > T3 CD M — ' CO E true Y Figure 8: Estimated log-volatility Y as a function of the actual log-volatility Y taken from 29038 simulations of the expOU model in four ways. Reconstruction uses the last s = 10 values of returns, and I = 10 5 iterations. The error bars represent the 25% and 75% quantiles of the distribution estimated volatility. Had we taken another initial condition, the technique would have given equivalent results (we have checked this by using several initial distributions). For this reason and in order to improve the convergence of the maximum likelihood estimate we have neglected also this contribution. We therefore write 1 t— Ai lnP({X,Y}) ~ -- £ r=t- 1 t— At - - y 2 ^ T—t — S We can represent this equation in the continuous time framework if At is sufficiently small and if /(a;), h(x) and g(x) are continuous. In such a case, Eq. (|A13p yields the result given in Eq. (fTTj) . Appendix B: ROBUSTNESS OF THE PROCEDURE In order to show that our volatility estimation procedure is robust, we carry out four measurements along the lines of Sec. IIII Cl We generate artificial time series of the same length as the DJIA dataset, with the same parameters as therein: m = (7.5 ± 0.5) x 1CT 3 days" 1/2 , a = (1.82 ±0.03) x 10~ 3 days -1 , and k = (4.7 ±0.3) x 10~ 2 days" 1/2 . Then we reproduce Fig. [T] in several variations: (a) In a way identical to the paper as shown in Fig. [TJ (b) Data generated as in (a), but for reconstructing Y we use the above parameters minus twice the error specified above. (c) Data generated as in (a), but for reconstructing Y we use the above parameters plus twice the error specified above. (d) In a way identical to (a), but the data is detrended by the Monte Carlo sample mean /i. We however remind that the process is driftless as given by Eq. ([3]). In this way we want to check whether detrending causes a systematic bias in the estimation. Y X T — A r _At f(Y T ^ At )At Y T - At , At h(Y T -At)At h{Y T -At). At (A13) 15 T3 CD M — ' CO E -*— ' CO CD 1.1,1, i.i. i ' i 1 i 1 ■ true bias 10x true bias * 100x true bias 1 1 i 1 1 true Y Figure 9: Estimated log-volatility Y as a function of the actual log-volatility Y taken from 29038 simulations of the expOU model in three different levels of "detrending error". Reconstruction used the last s = 10 values of returns, and / = 10 J iterations. The error bars represent the 25% and 75% quantiles of the distribution estimated volatility. Note that there only exists a clear deviation when the error is 100 times the true bias. The results are given in Fig. [8J The plot shows, that either option gives the same width of the Y distribution, and the bias introduced by either specification error is small. We have also run similar simulations but with very different parameters values and the procedure still provides consistent results similar to those given by Fig. [8j Finally, a more detailed test along the lines of point (d) above can be performed (that is: a test on wether the detrending causes a bias in our estimation procedure). We thus have repeated the test (d) also including errors 10^t and 100/i. The procedure tolerates up to 10 times more detrending error than expected in the real dataset. Only, when we have about 100 times the error, the reconstruction shows a strong upward bias for low volatility periods as shown in Fig. O All these results imply that possible errors in the detrending procedure could affect our procedure only when any wrong specification of the drift is far outside the error domain involved in our DJIA data set. Appendix C: DERIVATION OF EQ. (20) We start from Eq. IJ7J which we write in the approximate form AX(t) ~ me Y ^AW{t), thus In \AX(t)\ = m + Y(t) + In |AVFi(t)| and, taking into account that (Y(t)) = 0, we have (ln|AX(i)|) ~lnm+(ln|AVFi(t)|). (CI) On the other hand, we know that AWi(i) ~ e^/At, where e is a standard Gaussian variable (c/Eq. (fH))) ). Hence (ln|AVFi(t)|) « (ln|e|) + (lnAi)/2. But 1 (ln| £ |) = ~ e / 2 ln|e|cie, which, after a simple change of variables inside the integral, yields [50|, [51( 16 (In lei) = —= / X - 1/2 e- x/2 \nxdx = -y/^/2h + In 2), 2\/2Wo where 7 = 0.5772 • • • is the Euler constant. Therefore, (ln|AWi(i)|) « (lnAi)/2- ( 7 + ln2)/2. (C2) Substituting Eq. jC2]) into Eq. JUT]) proves Eq. (120)) . J. -P. Bouchaud and M. Potters, Theory of Financial Risk (Cambridge University Press, Cambridge, 2000). H. Markowitz, Journal of Finance 7, 77 (1952). D. Ben-Avraham and S. Havlin, Diffusion and Reactions in Fractals and Disordered Systems (Cambridge University Press, Cambridge, 2000). M. Paczuski, S. Boettcher, and M. Baiesi, Phys. Rev. Lett. 95, 181102 (2005). P. Bak, K. Christensen, L. Danon, and T. Scanlon, Phys. Rev. Lett. 88, 178501 (2002). A. Bunde, S. Havlin, J. W. Kantelhardt, T. Penzel, J.-H. Peter, and K. Voigt, Phys. Rev. Lett. 85, 3736 (2000). A. Bunde, J. F. Eichner, J. W. Kantelhardt, and S. Havlin, Phys. Rev. Lett. 94, 048701 (2005). R. Cont, Quant. Finance 1, 223 (2001). L. Bachelier, Annales Scientifiques de l'Ecole Normale Superieure 111-17, 21 (1900). M. Osborne, Operations Research 7, 145 (1959). F. Black and M. Scholes, The Journal of Political Economy 81, 637 (1973). A. Lo, Econometrica 59, 1279 (1991). Z. Ding, C. W. J. Granger, and R. F. Engle, Journal of Empirical Finance 1, 83 (1993). Y. Liu, P. Gopikrishnan, P. Cizeau, C.-K. Peng, M. Meyer, and H. Stanley, Phys. Rev. E 60, 1390 (1999). O. E. Barndorff-Nielsen and N. Shephard, J. R. Statist. Soc. B 63, 167 (2001). A. Saichev and D. Sornette, Phys. Rev. E 74, 011111 (2006). E. Bacry, J. Delour, and J. F. Muzy, Phys. Rev. E 64, 26103 (2001). L. Calvet and A. Fisher, Review of Economics and Statistics 84, 381 (2002). T. Lux, submitted to Journal of Business and Economics Statistics. L. E. Calvet and A. J. Fisher, Journal of Financial Econometrics 2 (2004). J. Masoliver and J. Perello, Quantitative Finance 6, 423 (2006). J. -P. Fouque, G. Papanicolaou, and K. R. Sircar, Derivatives in Financial Markets with Stochastic Volatility (Cambridge University Press, Cambridge, 2000). J. Perello and J. Masoliver, Phys. Rev. E 67, 037102 (2003). J. Perello, J. Masoliver, and N. Anento, Physica A 344, 134 (2004). J. Masoliver and J. Perello, International Journal of Theoretical and Applied Finance 5, 541 (2002). A. A. Dragulescu and V. M. Yakovenko, Quantitative Finance 2, 443 (2002). T. G. Andersen and J. Lund, Journal of Econometrics 77, 343 (1997). W. A. Brock and B. D. LeBaron, Rev. Econ. Stat. 78, 94 (1996). C. Lindberg, Mathematical Finance 16, 549 (2006). V. Genon-Catalot, T. Jeantheau, and C. Laredo, Bernouilli 6, 1051 (2000). O. Elerian, S. Chib, and N. Shephard, Econometrica 69, 959 (2001). O. E. Barndorff-Nielsen and N. Shephard, J. R. Statist. Soc. B 64, 253 (2002). V. Genon-Catalot, T. Jeantheau, and C. Laredo, Scandinavian Journal of Statistics 30, 297 (2003). F. Comte and V. Genon-Catalot, Scandinavian Journal of Statistics 33, 875 (2006). J. E. Griffin and M. F. J. Steel, Journal of Econometrics 134, 605 (2006). G. Jongbloed and F. H. V. der Meulen, Scandinavian Journal of Statistics 33, 825 (2006). K. Morimune, The Japanese Economic Review 58, 1 (2007). T. Andersen and T. Bollerslev, Journal of Finance 53, 219 (1998). E. Barucci and R. Reno, Economics Letters 74, 371 (2002). Y. Omori, S. Chib, N. Shephard, and J. Nakajima, Journal of Econometrics (2006). R. Reno, Econometric Theory (2007). I. M. Gelfand and G. E. Shilov, Generalized Functions (Academic Press, New York, 1964). T. K. Moon and W. C. Stirling, Mathematical Methods and Algorithms for Signal Processing (Prentice Hall, 2000). C. R. Champlin and D. Morrell, in Proceedings of the 31 st Conference on Decision and Control (1991), p. 3707. C. W. Gardiner, Handbook of Stochastic Methods (Springer- Verlag, Berlin, 1997). X. Gabaix, P. Gopikrishnan, V. Plerou, and H. Stanley, Nature 423, 267 (2003). Z. Eisler and J. Kertesz, Eur. Phys. J. B 51, 145 (2006). J. D. Farmer, L. Gillemot, F. Lillo, S. Mike, and A. Sen, Quantitative Finance 4, 383 (2004). 17 [49] J. Masoliver, Phys. Rev. E 48, 121 (1993). [50] I. S. Gradstein and I. M. Ryzhik, Table of Integrals, Series, and Products (Academic Press, San Diego, 1994). [51] W. Magnus, F. Oberhettinger, and R. P. Soni, Formulas and Theorems for the Special Functions of the Mathematical Physics (Springer- Verlag, Berlin, 1966).