A Hybrid Fuzzy GJR-GARCH Modeling Approach for Stock Market Volatility Forecasting

Forecasting stock market returns volatility is a challenging task that has attracted the attention of market practitioners, regulators and academics in recent years. This paper proposes a Fuzzy GJR-GARCH model to forecast the volatility of S&P 500 and Ibovespa indexes. The model comprises both the concept of fuzzy inference systems and GJR-GARCH modeling approach in order to consider the principles of time-varying volatility, leverage effects and volatility clustering, in which changes are cataloged by similarity. Moreover, a differential evolution (DE) algorithm is suggested to solve the problem of Fuzzy GJR-GARCH parameters estimation. The results indicate that the proposed method offers signiﬁcant improvements in volatility forecasting performance in comparison with GARCH-type models and with a current Fuzzy-GARCH model reported in the literature. Furthermore, the DE-based algorithm aims to achieve an optimal solution with a rapid convergence rate.


Introduction
Accurately measuring and forecasting stock market volatility plays a crucial role for asset and derivative pricing, hedge strategies, portfolio allocation and risk management.Since the 1987 stock market crash, academics, practitioners and regulators have been investigated the development of financial time series models with changing variance over time in order to avoid huge investments losses due to their exposure to unexpected market movements (Allen & Morzuch, 2006, Carvalho et al., 2005, Lin et al., 2012).Indeed, volatility, as a measure of financial security prices fluctuation around its expected value, is one of the primary inputs in decision making processes under uncertainty, justifying its growing interest in the financial and economic literature (Kapetanios et al., 2006, Lux & Kaizoji, 2007).
Stock returns volatility is often characterized by some stylized facts, such as volatility clusters, persistence, leptokurtic data behavior and time-varying volatility.A convenient framework for dealing with time-dependent volatility in financial markets concerns the autoregressive conditional heteroskedasticity (ARCH) model, proposed by Engle (1982), becoming a popular tool for volatility modeling.Providing a more flexible structure, Bollerslev (1986) introduced the Generalized ARCH (GARCH) model, which combines the ARCH and autoregressive moving average (ARMA) models.The GARCH model estimates jointly a conditional mean and conditional variance equation, and it is characterized by a fat tail and excess of kurtosis, regularly used in studying the daily returns of stock market data (Han & Park, 2008). 1espite the success of GARCH model, it has been criticized for failing to capture the asymmetric volatility (Liu & Hung, 2010), since for stock prices, negative shocks to returns generally have large impacts on their volatility than positive shocks.To overcome these limitations, extensions of the GARCH model have been proposed, comprising a class of asymmetric GARCH models.The exponential GARCH (EGARCH), developed by Nelson (1991), and the GJR-GARCH, proposed by Glosten et al. (1993), are the main representatives of GARCH-family models assuming persistence.These models include the asymmetric responses of volatility to positive and negative shocks, but they do not simulate stock fluctuations with volatility clustering well.Howsoever, these effects can be modeled by modifications of linear models, while other require nonlinear approaches, less commonly used in practice due to their complexity (Hung, 2011b).
Methods based on artificial intelligence have been extensively applied as a flexible way to describe complex dynamics of various economic and financial problems (Haofei et al., 2007, Racine, 2001).Hamid & Iqbal (2004) suggested the use of an artificial neural network (ANN) model to predict the volatility of S&P 500 index futures prices.Bildirici & Ersin (2009) enhanced GARCH-family models with ANN to forecast the volatility of daily returns in Istanbul Stock Exchange.On the other hand, Hajizadeh et al. (2012) proposed a scheme in which the estimates of volatility obtained by an EGARCH model are fed forward to an ANN model, considering the S&P 500 index prices.
A radial basis function neural network with Gaussian activation functions and robust clustering algorithms to model the conditional volatility of the Spanish electricity pool prices was suggested by Coelho & Santos (2011).The authors showed that their model performed better than traditional linear models to predict upward and downward movements in electricity future prices.Concerning the issue of derivative securities pricing, Wang (2009) integrated a GJR-GARCH model into an ANN option-pricing model and indicated that their approach provides higher predictability than other volatility methodologies.Providing similar results, Wang et al. (2012) and Tseng et al. (2008) also evaluated volatility forecasting performance in option pricing combining neural networks and GARCH-family models.Tino et al. (2001) introduced a recurrent neural network model to simulate daily trading of straddles on financial indexes based on predictions of daily volatility.The authors showed that while GARCH models cannot generate any significantly positive profit, the use of recurrent networks can generate a statistically significant excess profit.Moreover, Tung & Quek (2011) jointed a self-organising neural network and option straddle-based approach to financial volatility trading.Compared with several benchmarks, the proposed methodology demonstrated that its ability to forecast the future volatility enhances investments profits.Despite the high ability to deal with the problem of volatility forecasting, ANN drawbacks include its "black box" nature, greater computational burden, proneness to overfitting, and the empirical nature of model development.
Due to these shortcomings, models based on fuzzy theory appear as an alternative methodology to evaluate high nonlinear systems (Zadeh, 2005, Savran, 2007).Popov & Bykhanov (2005) combined the concept of fuzzy rules and GARCH approach to model volatility of financial time series.The conditional volatility forecasting of foreign exchange rates returns was considered by Geng & Ma (2008), using a functional fuzzy inference system applied to the GARCH model.Hung (2009a) adopted the method of fuzzy logic systems to modify the threshold values for an asymmetric GARCH model.Based on simulations, the author showed that the forecasting performance is significantly improved if the leverage effect of clustering is considered along with the use of fuzzy systems and GARCH approaches.Thavaneswaran et al. (2009) proposed a fuzzy weighted possibilistic model for option valuation based on the estimation and forecasting of financial volatility, considered as fuzzy numbers.They stated that fuzzy assumptions are more flexible and reveal promising results for option pricing as an intuitive way to look at the uncertainty in the models parameters.To capture the volatility conditional distribution on higher-order moments such as skewness, a GARCH-Fuzzy-Density method for volatility density forecasting was proposed by Helin & Koivisto (2011).
The model provided more accurate density forecasts for the higher-order moment varying processes than traditional GARCH models.
Combining the concepts of fuzzy systems and artificial neural networks, Chang et al. (2011) suggested the use of a hybrid adaptive network-based fuzzy inference system (ANFIS) to forecast the volatility of the Taiwan stock market.The authors indicated that the proposed model is superior to other methods with regard to error measures.Furthermore, Luna & Ballini (2012) introduced an adaptive fuzzy system to forecast financial time series volatility and compared their method with a GARCH model.The results indicate the higher performance of the adaptive fuzzy approach for volatility forecasting purposes.
A hybrid Fuzzy-GARCH model was suggested by Hung (2009b).The model comprises a functional fuzzy inference system with a GARCH model, optimized using a genetic algorithm framework.Similarly, Hung (2011a) and Hung (2011b) proposed a fuzzy system method to analyze clustering in GARCH models using genetic algorithms and particle swarm optimization to estimate the parameters, respectively.The author indicates that the model offers significant improvements in forecasting stock market volatility, overperforming some GARCH-family models.
Hence, this paper combines a GJR-GARCH model and fuzzy systems to develop a Fuzzy GJR-GARCH model.The proposed model is based on a collection of fuzzy rules in the form of IF-THEN statements, in which its structure comprises a GJR-GARCH model.The methodology considers both volatility asymmetry and volatility clustering as well.Moreover, the Fuzzy GJR-GARCH includes clustering estimation using subtractive clustering algorithm to provide a more autonomous model, based on data-knowledge, different from the approach proposed by Hung (2011a), which considers expert knowledge to determine the number of stock return clusters.Computational experiments illustrating the effectiveness of the Fuzzy GJR-GARCH model are provided by modeling and forecasting the volatility of S&P 500 (United States) and Ibovespa (Brazil) indexes from January 3, 2000 through September 30, 2011, in comparison with some GARCH-family models and a current Fuzzy-GARCH method proposed by Hung (2011a).
The choice regarding a GJR-GARCH specification is justified by the evidence of volatility asymmetry in stock market data (Martens et al., 2009).Furthermore, other studies have found evidence in favor of the GJR-GARCH model (Brailsford & Faff, 1996, Taylor, 2004).For emerging stock market data, Ng & McAleer (2004) suggested that GARCH and GJR-GARCH models are superior to the Risk-Metrics (Morgan, 1996) model in forecasting stock market volatility; however, neither GARCH nor GJR-GARCH dominates the other.Despite extensive literature on volatility model evaluation, no consensus exists suggesting the most appropriate model for providing which model has the optimal performance in forecasting volatility (Liu & Hung, 2010).
The process of optimizing the Fuzzy GJR-GARCH model parameters is highly complex and nonlinear.A Differential Evolution (DE) based parameter estimation algorithm is suggested in this paper in order to provide the optimal solution for the proposed model, since in this approach there is no requirement that the search space is differentiable or continuous, unlike some traditional optimization techniques.Nature-inspired metaheuristic, the DE algorithm, proposed by Storn & Price (1997), performs difference of the parameters vectors in order to search the optimal solution in a fitness function landscape.In a comparison to several other evolutionary computation algorithms, the main features that make the DE algorithm as an attractive optimization tool are: simplicity of implementation;2 high accuracy and robustness to deal with many problems including separable, nonseparable, modal and multimodal objective functions (Zhang & Sanderson, 2003); very few number of control parameters (three in classical DE); and finally, for expensive and large scale optimization problems, the space complexity of DE is quite slow as compared to most other evolutionary algorithms.
The Fuzzy GJR-GARCH model has novel features in comparison to the existing approaches in the literature.First, the proposed method combines a fuzzy scheme with a GRJ-GARCH model, providing a more realistic framework that captures both asymmetries and volatility clustering.Second, the Fuzzy GJR-GARCH model avoids the process of tuning the number of fuzzy rules by considering a subtractive clustering algorithm.Third, this paper proposes a simple and efficient differential evolution algorithm to optimize the model parameters.Finally, empirical results considers comparisons with GARCH-type models and also with a recent Fuzzy-GARCH approach proposed by Hung (2011a), estimated with particle swarm optimization as in its original version and with the differential evolution algorithm suggested.
After this introduction, the reminder of this paper is organized as follows.The proposed Fuzzy GJR-GARCH model and the current Fuzzy-GARCH approach related in the literature are described in Section 2. Section 3 discusses the DEbased optimization algorithm of the Fuzzy GJR-GARCH parameters estimation.Computational experiments evaluating the potential of the proposed method for stock market data volatility modeling and forecasting are reported in Section 4. Finally, Section 5 concludes and suggests issues for further investigation.

GARCH-type models
The Generalized Autoregressive Conditional Heteroskedasticity model, GARCH(p, q), considers the current conditional variance dependent on the p past conditional variances as well as the q past squared innovations.Let y t = 100 × (ln P t − ln P t−1 ) denote the continuously compounded rate of stock returns from time t − 1 to t, where P t is the daily closing stock price at time t.The GARCH(p, q) model can be written as: Maciel, L. (1) where ξ t is a sequence of independent and identically distributed (i.i.d.) random variables with zero-mean and unit variance, σ 2 t is the conditional variance of ξ t , and ω, α i and β j are unknown coefficients to be estimated, assuming (Bollerslev, 1986): ω > 0, α i ≥ 0, i = 1, 2, . . ., q, q > 0, The GARCH model reduces the number of parameters required by considering the information in the lag(s) of the conditional variance in addition to the lagged y 2 t−i term(s) as in ARCH-type models.GARCH models' simplicity and ability to capture persistence of volatility explain its empirical and theoretical appeal.However, it fails to capture the asymmetric influence of negative and positive shocks differently.On the other hand, the GJR-GARCH model provides a mechanism to account for leverage effects of price change on conditional variance.The GJR-GARCH(p, q) model is given by: where S − t−i = 1 if y t−i < 0, and S − t−i = 0 if y t−i ≥ 0, and γ i are unknown parameters to be estimated.In this case, the following conditions must be satisfied (Glosten et al., 1993): Despite the GJR-GARCH model includes the asymmetric responses of volatility to positive and negative shocks, it does not simulate stock fluctuations with volatility clustering well.This fact can lead to poor adequacy and forecast ability.Therefore, the proposal of a Fuzzy GJR-GARCH approach appears as a potential tool for volatility modeling and forecasting in a presence of volatility clustering and leverage effects.

The hybrid Fuzzy GJR-GARCH model
Fuzzy inference systems are universal approximations that can estimate nonlinear continuous functions uniformly with arbitrary accuracy (Hung, 2011b, Savran, 2007, Ji et al., 2007).Besides the GJR-GARCH model considers the differential effects of the propagations of volatility caused by negative and positive shocks, a fuzzy approach provides the capability to simulate stock fluctuations with volatility clustering.The proposed Fuzzy GJR-GARCH model is describe by a collection of fuzzy rules in the form of IF-THEN statements in order to describe the stock market fluctuations via a GJR-GARCH model.Therefore, the kth rule of the Fuzzy GJR-GARCH(p, q) is written as: where Γ k is the kth fuzzy set (membership function) to describe the stock market return y (for k = 1, 2, . . ., R), R is the number of fuzzy rules, and y t−1 is the previous value of the stock market's returns.
To describe the grade of membership of y t−1 in Γ k , it is assumed a Gaussian membership function3 : where c k is the center (focal point) of the kth local model, and r k is a positive constant which defines the zone of influence of the respective cluster.
The collection of the R rules assembles a model as a combination of local models.The contribution of a local model to the overall output is proportional to the normalized degree of firing of each rule, expressed as: The Fuzzy GJR-GARCH model output is calculated by weighted averaging of individual rules' contributions: and σ where 9) is a nonlinear time-varying equation to model the behavior of complex dynamic systems as stock market volatility processes, including mechanisms to ensure the description of volatility stylized facts, such as volatility dependence, leverage effects and volatility clustering.
The objective function in (10) is a highly nonlinear function of x.Simulations analysis showed that conventional gradient search methodologies produce poor estimates or even are not capable to find the global minimum of E(x) in Equation (10).In this paper, to estimate the Fuzzy GJR-GARCH parameters, a differential evolution algorithm was proposed.
The number of rules and its respective focal points (c k ) and spreads (r k ) were set using the Subtractive Clustering (SC) algorithm, proposed by Chiu (1994).SC uses the data points as candidate prototype cluster centers.The capability of a point to be a cluster center is evaluated through its potential, a measure of the spatial proximity between a particular point y i and all other data points: where ρ i denote the potential of the ith data point, r is a positive constant which defines the zone of influence of the cluster, and N is the number of points.
The value of the potential is higher for a data point that is surrounded by a large number of close data points.Therefore, it is reasonable to establish such a point to be the center of a cluster.The potential of all other data points is reduced by an amount proportional to the potential of the chosen point and inversely proportional to the distance to this center.The next center is found also as the data point with the highest (after this subtraction) potential.The procedure is repeated until the potential of all data points is reduced below a certain threshold.4

Current Fuzzy-GARCH model approach
For comparison purposes, the suggested Fuzzy GJR-GARCH model was also compared with a current Fuzzy-GARCH approach related in the literature and proposed by Hung (2011a).Hence, the kth rule of the Fuzzy-GARCH(p, q) method is written as: In this case, rules consequents are comprised by a GARCH(p, q) model, instead of a GRJ-GARCH(p, q) process.Similarly as in Equation ( 9), the Fuzzy-GARCH model output is calculated by weighted averaging of individual rules' contributions: Hung (2011a) applied a particle swarm optimization (PSO) algorithm to estimate the parameter set x = (ω 1 , . . ., ω R , α 11 , . . ., α Rq , β 11 , . . ., β Rp ) of the Fuzzy-GARCH model.Nevertheless, this paper adopts the same methodology as in Hung (2011a), i.e. using PSO, and also optimized Fuzzy-GARCH parameters with the differential evolution algorithm considered.Moreover, the number of rules and its focal points in Fuzzy-GARCH were performed with subtractive clustering as well.
The fitness (objective) function is given as follows: where Alternatively, the objective function associated with the Fuzzy-GARCH model is expressed as: In this paper, the optimization problems in ( 15) and ( 16) were solved by using a differential evolution algorithm for the Fuzzy GJR-GARCH and Fuzzy-GARCH models, respectively.Additionally, the PSO algorithm proposed by Hung (2011a) was considered to optimize the Fuzzy-GARCH approach as in its original version.
Differential Evolution is a vector population-based stochastic method for global optimization, introduced by Storn & Price (1997).The main idea behind the DE algorithm consists in a generation scheme of experimental parameters vectors that will disturb the population vector in order to evolve and find a desirable end.The main stages which comprises the DE algorithm are: initialization, mutation, crossover and selection.DE works through a simple cycle of these stages, illustrated in Figure 1.Each stage is described as follows for the case of Fuzzy GARCH-type parameters optimization.Finally, the PSO algorithm (Hung, 2011a) is also reported.

Figure 1
Main stages of the DE algorithm

Initialization
Consider an optimization problem in a D-dimensional real parameter space ℜ D .The population Φ is comprised by N P × D dimensional real-valued parameter vectors, described as: T , G = 0, 1, . . ., G max (17) where G = 0, 1, . . ., G max denotes the generation counter and N P the number of population vectors.
According to Storn & Price (1997), based on several benchmark optimization problems, a reasonable choice for N P is between 5 • D and 10 • D, but N P must be at least 4 to ensure that DE will have enough mutually different vectors with which to work.Moreover, the literature has been shown that DE control parameters setting is a specific empirical procedure (Das & Suganthan, 2011).In order to avoid non-convergence (low population size) and higher computational time complexity (high population size), N P = 10 • D was considered in this paper.
Each population vector, called also as chromosome, forms a candidate solution to the problem.In the case of the Fuzzy GJR-GARCH model estimation problem the real parameter space has dimension D = (R + R × q + R × p + R × q) = R(1 + 2q + p).Otherwise, when the Fuzzy-GARCH model is take into account D = (R + R × q + R × p) = R(1 + q + p).The initial population (at G = 0), as in the classical DE algorithm, is set as: Maciel, L.
x i,j,1 = ε i,j , j = 1, 2, . . ., D, i = 1, 2, . . ., N P where ε i,j is an uniformly distributed random number lying between 0 and 1 and is instantiated independently for each component of the ith vector.Das & Suganthan (2011) pointed out that evolutionary computation algorithms such as DE have the advantage of not depend on the choice of initial values, since any point in the parameter space has not null probability to be chosen.Moreover, the population is comprised only by viable solutions, i.e. satisfying Equations ( 3) and ( 5) for the Fuzzy-GARCH and Fuzzy GJR-GARCH models, respectively.

Mutation
The mutation is a perturbation in the gene characteristic of a chromosome with a random element.In DE algorithms, a parent vector from the current generation is called target vector, a mutant vector obtained through the differential mutation operation is the donor vector, and an offspring formed by recombining the donor with the target vector is known as trial vector.
In the classical DE-mutation operator, three distinct parameter vectors x r i 1 , x r i 2 and x r i 3 are sampled randomly from the current population, in order to create the donor vector for each ith target vector.The indices r i 1 , r i 2 and r i 3 , different from the base vector index i, are mutually exclusive integers randomly chosen from the range [1, N P ].The mutation operator can be described as: where v i,G is the donor vector and F is the mutation scale factor, a scalar number usually in the interval [0.4,1].
The mutation process is shown in Figure 6 on a two-dimensional parameter space, including constant cost contours of an arbitrary objective function.In this case, the mutation uses a randomly vector x r i 1 and only one weighted difference vector to perturb F • x r i 2 − x r i 3 it.However, there are several variants of mutation operator which are different to Expression (20).The perturbations, for example, which setting the base vector to the current best vector or a linear combination of various vectors are widely applied.In this work, the classical mutation operator described was considered, since performs very well for a wide range of problems, and as stated by Price and Rönkkönen ( 2006), mutation operators variants need further investigation under which circumstances they perform well.

Crossover
After generating the donor vector through mutation, the next step in the DE algorithm is to apply the crossover operator.This stage provides the potential diversity enhancement of the population, which exchanges parameters of the mutation vector v i,G and the target vector x i,G to generate the so-called trial vector u i,G = [u 1,i,G , u 2,i,G , . . ., u D,i,G ].In this paper, it was considered the most common form of crossover operation, uniform (or binomial), defined as: where ε i,j is a uniformly distributed random number lying between 0 and 1, which is called a new for each jth component of the ith parameter vector, j rand ∈ [1, 2, . . ., D] is an index chosen randomly, which guarantees that u i,G gets at least one component from v i,G , and finally, Cr is called the crossover rate, a control parameter of DE crossover operator, which lies in the interval [0, 1] and defines the number of parameters in expectation are changed in a population member.Figure 3 illustrates an example of a crossover process in differential evolution, where it is clear the enhance on the population diversity by exchanging some components of the target (x) and donor (v) vectors.

Selection
Finally, the selection step determines whether the target or the trial vector survives to the next generation G = G + 1, as the following condition: where E(x) is the fitness function to be minimized.The population size over generations is kept constant.Therefore, the new trial vector replaces the corresponding target vector in the next generation if provides an equal or lower value of the objective (fitness) function 5 .Otherwise, the target vector is retained in the population.It must be noted that, this selection criteria never deteriorates the fitness status, since the population either gets better or remains the same, in terms of objective function value.Nevertheless, this criteria enables DE-vectors to move over flat fitness landscapes with generations (Das & Suganthan, 2011).

Algorithm
Once described the classical DE algorithm steps, i.e. initialization, mutation, crossover and selection, the whole algorithm for the Fuzzy GJR-GARCH parameters identification in a pseudo-code is presented bellow.
Algorithm.DE algorithm for the Fuzzy GJR-GARCH identification Initialize randomly a initial population Φ 0 ∈ [0, 1] Initiate G = 1 while G < G max do for i = 1 to N P do for each individual sequentially Generate a donor vector v i,G Generate a trial vector In the algorithm suggested, the termination criteria is the number of iterations or generations, G max .However, it can be defined as termination condition when the best fitness of the population does not change according to a threshold over successive generations, or alternatively, when some pre-specified objective function value was reached.The control parameters are: F , Cr and G max .

The Fuzzy-GARCH PSO-based algorithm
In order to optimize the Fuzzy-GARCH model proposed by Hung (2011a), this paper considered the PSO-based algorithm, as reported on its original formulation by the author, and also the differential evolution methodology suggested, previously described.This subsection presents the PSO algorithm.
Particle swarm optimization is a population-based stochastic algorithm that attempts to generate better solutions to find a desirable end (Hung, 2011a).Each swarm consists of many particles, which represents a possible solution to an optimization problem.Each particle is characterized by its position and velocity.In PSO, particles evolve in search space in three ways: i) by moving in the previous direction; ii) by moving toward the optimum (present location); or iii) by mov-Maciel, L.
ing toward the best solution for the entire swarm (population).Over iterations an improvement in the performance is achieved.
The first step is to randomly generate a swarm of S particles in a form possible solution vectors for the Fuzzy-GARCH model.The position and velocity of the ith particle at the kth iteration are represented as ), respectively.Thus, the second step updates the position and velocity of each particle according to: where χ is the linearly decreasing inertia weight, µ 1 and µ 2 are acceleration coefficients, ε 1 and ε 2 are uniformly distributed random variables lying between 0 and 1, l i,j is the value of the objective function along the jth dimension for the best position for the ith particle, and g j is the value of the fitness function along the hth dimension for the global best position found by all particles in the swarm (Hung, 2011a).Parameters µ 1 , µ 2 and χ are related as follows (Clerc & Kennedy, 2002): In the third step, the fitness function value E(x) of each particle, updated with Equations ( 23) and ( 24), is calculated.If the objective function value for the new particle is higher than that of the local optimum (l i,j ) in its present location, then the local optimum is replaced by the new particle (Hung, 2011a).Moreover, if these previous condition holds and the fitness value for the new particle is higher than that of all the particles in the swarm, then this population's optimum value is also replaced by the new particle.This process repeats until the maximum number of iterations (G) is achieved, as the termination criteria. 6The PSO control parameters are: µ 1 , µ 2 , χ, S and G. Nevertheless, the number of rules and its focal points were set using subtractive clustering algorithm.

Empirical Results and Analysis
To illustrate the performance of the proposed Fuzzy GJR-GARCH model for modeling and forecasting stock market volatility, this paper focus on daily prices of S&P 500 (US) and Ibovespa (Brazil) over the period from January 3, 2000 through September 30, 2011, in comparison with GARCH and GJR-GARCH models, and with the Fuzzy-GARCH model proposed by Hung (2011a).Moreover, the Fuzzy-GARCH approach was optimized using PSO as in its original version and using the DE algorithm suggested in this paper.The daily stock return series were generated by taking the natural logarithm difference of the daily stock index and the previous day's stock index and multiplied by 100, as described previously.The data sample was partitioned into two parts.The in-sample period consists on data from January 3, 2000 through December 29, 2005.On the other hand, the forecast out-of-sample period is from January 2, 2006 through September 30, 2011.
Table 1 shows the basic statistical characteristics of the return series.The average daily returns are negative for S&P 500 and positive for Ibovespa.The daily returns display evidence of skewness and kurtosis.* Significantly at 5%.
The returns series are skewed towards the left, characterized by a distribution with tails that are significantly thicker than for a normal distribution.J-B test statistics further confirms that the daily returns are non-normal distributed.As compared with Gaussian distribution, the kurtosis in S&P 500 and Ibovespa suggest that their daily returns have fat-tailed (Table 1).Ibovespa index has a higher kurtosis than S&P 500, which explains the fact that emerging countries show in general a more leptokurtic behavior.Under the null hypothesis of no serial correlation in the squared returns, the Ljung-Box Q 2 (10) statistics infer a linear dependence for both series considered.Furthermore, the Engle's ARCH test for the squared returns reveals strong ARCH effects, which evidences in support of GARCH effects (i.e., heteroscedasticity).Accordingly, these preliminary analyses of the data encourage the adoption of a sophisticated model, which embody fat-tailed features, and of conditional models to allow for time-varying volatility.
Despite GARCH-type models are able to capture fat-tails and conditional volatility, they do not consider volatility clustering, as characterized by Fama (1965).The stock indexes are shown in Figure 4, and its correspondent returns are given in In order to select best lag parameters for the Fuzzy GJR-GARCH and Fuzzy-GARCH models, and GARCH-type specifications, also considered in this work, the Bayesian information criterion (BIC) and Akaike's information criterion (AIC) were performed (Akaike, 1974, Schwarz, 1978).The models with various combinations of (p, q) parameters ranging from (1, 1) to (15, 15) were calibrated on return data.According to BIC and AIC criteria the best specification for all GARCHtype models was (1, 1), i.e. p = 1 and q = 1.The next step is to select the appropriate number of fuzzy rules for each data series, therefore, the subtractive clustering algorithm was applied.Figure 5 reports the membership functions for S&P 500 and Ibovespa indexes, which illustrates that these financial markets are asymmetric.In both economies evaluated, the clustering algorithm found three rules, that could be interpreted as negative, medium and positive returns.Computational experiments were conducted to chose appropriate control parameters for the Fuzzy GARCH-type models, according to different values for these parameters and compared in terms of accuracy for a fixed number of generations.Simulations are comprised by 100 runs.Figures 7 and 8 display the boxplot analysis of the control parameters for S& P 500 and Ibovespa indexes, respectively, using the Fuzzy GJR-GARCH model.This approach is a convenient way of graphically depicting groups of numerical data through their five summaries: the sample minimum, sample maximum, median, upper quartile, and largest observation.A boxplot may also indicate which observations, if any, might be considered outliers.The spacings between the different parts of the box help indicate the degree of dispersion (spread) and skewness in the data.Therefore, control parameters identified by a low mean, in terms of objective function value, and low dispersion are chose as the ones which provides better results to the problem of volatility forecasting.

Figure 6
The membership functions for the S&P 500 and Ibovespa indexes  Considering the mutation scale factor (F ), shown in Figure 6, one must note that more stable results, in terms of lower median and spread, are attained on the intervals [0.8, 0.9] and [0.8, 0.95] for S&P500 and Ibovespa, respectively.Otherwise, higher spread and fitness value are observed, with high probability to provide suboptimal objective function values.On the other hand, the boxplot dispersions for the crossover rate (Cr) are more stable for both economies evaluated (Figure 7).A range of parameters which performs better accuracy for S&P 500 lies in the interval [0.86, 0.89], whereas for Ibovespa it corresponds the interval [0.89, 0.96].Figure 9 shown the objective function value (E(x)) behavior over the number of generations for S&P 500 and Ibovespa indexes to estimate the Fuzzy GJR-GARCH coefficients using the differential evolution algorithm proposed, and also the Fuzzy-GARCH model using particle swarm optimization and differential evolution.The fitness functions of the DE-based estimation method are exponential and rapidly converge at successful results in 25 generations, mainly when S&P 500 index is considered (Figure 9).Despite its simplicity, lower number of control parameters and no requirements about the parameter space stability, the DE also provides a fast convergence in a few number of iterations. 7In terms of fit and convergence, the Fuzzy GJR-GARCH model provides better results in comparison with the Fuzzy-GARCH model optimized with PSO and DE algorithms.
However, these differences became more clear when the adjustment, in terms of error measures, is considered.The parameters estimates for in-sample data associated with the S&P 500 and Ibovespa indexes are shown in Table 3. Considering stable conditions, the parameters estimates all converge and by taking the t-statistic into account, the proposed Fuzzy GJR-GARCH model effectively estimate the daily return data.8Moreover, it must be noted that US and Brazilian markets are asymmetric and exhibit the leverage effect as confirmed by the estimated coefficients.Volatility forecasts comparison was conducted for one-step ahead horizon in terms of mean squared forecast error (MSFE), mean absolute forecast error (MAFE), and mean percentage forecast error (MPFE) defined as follows: where M is the number of out-of-sample observations, σ 2 i is the actual volatility at forecasting period i, measured as the squared daily return, and σ2 i is the forecast volatility at i.
Table 4 provides the performance of the evaluated models to predict the S&P 500 and Ibovespa stock indexes volatilities.The proposed Fuzzy GJR-GARCH model performs better than all remaining models since their structure provides a combination of rules for estimating forecast errors and also a mechanism to deal with leverage effects.The Fuzzy-GARCH model performs better than GARCHfamily models, however, a better fit was attained when the model is optimized with the DE algorithm.The GJR-GARCH and GARCH models demonstrate similar results, but the GJR model presents slight lower errors than the GARCH model when the Ibovespa index is considered.Although all performance measures of forecasting accuracy that have been extensively employed in practice, they do not reveal whether the forecast of a model is statistically superior to another one.Therefore, it is imperative to use additional tests to help comparison among two or more competing models in terms of forecasting accuracy.
This paper adopts the parametric Morgan-Granger-Newbold (MGN) test, proposed by Diebold & Mariano (1995).This test is often employed when the assumption of contemporaneous correlation of errors is relaxed.The statistic for this test can be computed using: where ρsd is the estimated correlation coefficient between s = e 1 + e 2 , and d = e 1 − e 2 , with e 1 and e 2 the residuals of two models adjusted, e.g.Fuzzy GJR-GARCH model versus GARCH-type approaches.In this case, the statistic is distributed as Student distribution with M − 1 degrees of freedom, and M is the number of out-of-sample observations.For this test, if the estimates are equally accurate (null hypothesis), then the correlation between s and d will be zero.
The results from MGN test, shown in Table 5, are in line with our results.MGN statistics reveal that the proposed Fuzzy GJR-GARCH model is superior predictor for forecasting S&P 500 and Ibovespa indexes than traditional GARCH-type models and the Fuzzy-GARCH model, optimized both by PSO and DE algorithms.
The proposed hybrid model exhibits high capability to forecasting volatility of the real market returns evaluated by considering both stock market asymmetry and volatility clustering, also overperformed Fuzzy-GARCH, GARCH and GJR-GARCH methodologies in statistical terms.Moreover, a DE-based optimization algorithm improves the capability of the Fuzzy GJR-GARCH and Fuzzy-GARCH to obtain accurate parameters in a simple mechanism without restrictions about the parameter space and with a fast convergence.

Conclusion
Since the 1987 stock market crash, modeling and forecasting financial market volatility has received a great deal of attention from risk managers, regulators, academics and all markets participants in general.Volatility forecasting plays a central role in several financial applications such as asset allocation and hedging, option pricing and risk analysis.GARCH-family models are the most applied approach to model and forecast assets returns volatility, since accommodate some volatility stylized facts as persistence and time-varying volatility.However, despite the success of the GARCH model, it has been criticized for failing to capture asymmetric volatility.Therefore, the literature has been introduced a recent class of asymmetric GARCH specifications, as the GJR-GARCH model, for example.These models include the asymmetric responses of volatility to positive and negative shocks, but they do not simulate stock fluctuations with volatility clustering well.
Hence, this paper proposed a new hybrid model named Fuzzy GJR-GARCH for stock market return volatility modeling and forecasting.The model combines both fuzzy inference systems and the conditional variance GJR-GARCH model to deal with time-varying volatility in a presence of leverage effects and also volatility clustering.The proposed model is based on a collection of fuzzy rules in the form of IF-THEN statements, in which its structure comprises a GJR-GARCH model.This paper also suggested the use of a Differential Evolution (DE) algorithm to estimate the parameters of the proposed methodology, since it comprises a high nonlinear and complex optimization problem.Computational experiments illustrating the effectiveness of the Fuzzy GJR-GARCH model were provided by modeling and forecasting the volatility of S&P 500 (United States) and Ibovespa (Brazil) indexes from January 3, 2000 through September 30, 2011, in comparison with some GARCH-family models and with a current Fuzzy-GARCH method proposed by Hung (2011a), optimized with Particle Swarm Optimization (PSO), as in its original version, and with the DE algorithm proposed.
The results revealed the potential of the Fuzzy GJR-GARCH approach to the problem of volatility forecasting, providing more accurate results than GARCHtype models and the Fuzzy-GARCH method, estimated with PSO and DE algorithms, in terms of fitness accuracy and statistical tests.Moreover, the DE-based algorithm converged with stable condition in a simple structure, with low computation complexity and few control parameters, not requiring stable conditions to the parameter space like traditional gradient search methods.Future research shall include applications of the Fuzzy GJR-GARCH approach in financial decision making problems related to volatility such as option pricing, portfolio selection and risk modeling (Value-at-Risk).

Figure 2
Figure 2 DE mutation scheme in two-dimensional parametric space

Figure 3
Figure 3 DE uniform crossover scheme example Figure 4Daily closing stock price indexes for S&P 500 and Ibovespa Figure 5 S&P 500 and Ibovespa daily returns

Figure 7
Figure 7 Boxplot of the DE algorithm for different values of mutation scaling parameter (F ) according to the objective function E(x) value for S&P 500 and Ibovespa

Figure 9
Figure 9Objective function E(x) evaluated using DE algorithm for S&P 500 and Ibovespa

Table 1
Descriptive statistics of daily returns of S&P 500 and Ibovespa.ARCH Test (10) c 1109.1934 * 1082.7409* a is the statistics of Jarque-Bera normal distribution test. is the Ljung-Box Q test for the 10th order serial correlation of the squared returns.
*b c Engle's ARCH test also examines for autocorrelation of the squared returns.
Boxplot of the DE algorithm for different values of crossover rate (Cr) according to the objective function E(x) value for S&P 500 and IbovespaMoreover, the Fuzzy-GARCH control parameters setting adopts the same procedure, i.e., by the boxplot analysis with 100 runs for both DE and PSO-based optimization techniques.Table2indicates the control parameters associated with each model, stock index and optimization procedure.

Table 2
Fuzzy GARCH-family models control parameters

Table 4
Models performance to volatility forecasting for one-step ahead.