On the Statistical Validation of Technical Analysis

Technical analysis, or charting, aims at visually identifying geometrical patterns in price charts in order to anticipate price “trends”. In this paper we revisit the issue of technical analysis validation, which has been tackled in the literature without taking care of (i) the presence of heterogeneity and (ii) statistical dependence in the analyzed data: various agglutinated return time series from distinct financial securities. The main purpose here is to address the first problem by suggesting a validation methodology that also “homogenizes” the securities according to the finite dimensional probability distribution of their return series. The general steps go through the identification of the stochastic processes for the securities' returns, the clustering of similar securities and, finally, the identification of the presence, or absence, of informational content obtained from those price patterns. We illustrate the proposed methodology with a real data exercise including several securities of the global market. Our investigation shows that there is statistically significant informational content in two out of three common patterns usually found through technical analysis, namely: triangle, rectangle and head & shoulders.


Introduction
Technical analysis (or charting) is a long-standing empirical practice whose central target is the identification and anticipation of trends in the prices of financial securities, by means of recognizing geometrical patterns in price charts. Following Murphy (2000), p. 49, we "define" a trend simply as the direction in which the market is moving. This practice, although widely adopted in financial institutions across the world, has been neglected by academia. The main reason is its lack of a scientific formalization that could be directly confronted with empirical evidence, a confrontation that other approaches to investment analysis grounded in orthodox finance theory have undergone; outstanding examples are the Portfolio Selection Theory conceived by Harry M. Markowitz (Markowitz, 1959), William F. Sharpe's Capital Asset Pricing Model (CAPM) (Sharpe, 1964) and the option pricing model developed by Fischer Black and Myron S. Scholes (Black and Scholes, 1973). A second reason for this neglect is that traditional financial theory has been built on the efficient markets theory (Fama, 1965), which is in turn inspired by the random walk theory (Bachelier, 1900).
Nevertheless, references to technical analysis date back to the very first incipient financial markets, such as the rice market in feudal Japan. Books in this field, however, used to be heuristic in style and lacked formalism. It was only after the appearance of some studies rejecting the random walk theory (Lo and MacKinlay, 1988, 1999) that the first studies about this practice appeared in mainstream periodicals. With the growth of empirical results favoring the validation of technical analysis, academia has become more interested in this subject.
One of the papers that aimed at subjecting technical analysis to an econometric framework is the one by Lo et al. (2000). That research covers the mathematical formalization of the geometrical patterns, the automation of pattern identification and the validation of technical analysis by the traditional Chi-Square and Kolmogorov-Smirnov goodness-of-fit tests (De Groot, 1986). Focusing on the last contribution of their paper, the validation of technical analysis, Lo et al. compared, by means of the cited tests applied to various agglutinated return series of several distinct financial securities, the empirical distribution of the returns following the geometrical patterns (the conditional returns) to the empirical distribution of the returns of all the complete series (the unconditional returns). Once the null hypothesis that the empirical distribution of the latter adequately fits the empirical distribution of the former was rejected, they took this result as statistical evidence that there was informational content in the identified patterns.
Here we must make two important and somewhat obvious criticisms of this validation proposal. Firstly, it is well known that those tests presume statistical independence of the data, something that is trivially violated by financial data given the well established stylized fact of conditional heteroscedasticity (Engle, 1995; Mills, 1999). Secondly, and much more harmful, the agglutination of return series of different securities can very plausibly violate the first principle of identically distributed random variables, without which nothing would make sense. In simpler words: that paper applied traditional goodness-of-fit tests and analyzed the results using potentially dependent and/or heterogeneous data sets.
The central objective of this paper is to solve the heterogeneity problem just explained. We attempt to do this by resorting to a clustering device in order to collect series which appear to come from the same "world" or "population". In Section 2 we briefly present the general terms of technical analysis, its assumptions and its practice. In Section 3 we formalize our methodology for the validation of technical analysis, while detailing the pertinent statistical framework, namely: the estimation of AR − GARCH models, the principal component analysis aimed at visual clustering and the goodness-of-fit tests. In Section 4 we illustrate the proposed methodology with several return series from different securities of the global market. Finally, in Section 5 we discuss possible extensions of the methodology.

Foundations of Technical Analysis: A briefing
Technical analysis is based on the idea that prices move in trends, which are naively defined as the directions of market prices (Murphy (2000), p. 49). According to Murphy (2000), a trend has three possible directions: uptrend, downtrend and sideways trend. Each of them is determined by the changing attitudes of investors toward everything that can affect them economically, politically and psychologically. Another assumption behind the effectiveness of technical analysis lies in the belief that history tends to repeat itself: investors' behavior is characterized by well-defined reactions to certain stimuli. Every piece of information that can impact prices is "translated" in the investor's mind into greed or fear. The practice holds that this recurrent behavior of investors can be captured by the identification of geometrical patterns in price charts. Pring (1991) asserts that the art of technical analysis consists of identifying trend changes in their early stages in order to maintain the investment posture until technical evidence indicates that the trend has reversed. There are two categories of price patterns: reversal and continuation. The former category signals the reversal of a previous trend and its five most common examples are: head & shoulders, triple tops and bottoms, double tops and bottoms, spike (or V) tops and bottoms, and rounding (or saucer). The latter category signals the continuation of a previous trend and its most used types are: triangles, flags & pennants, and rectangles. After a pattern has been identified, the analyst gains insight into the direction of the trend (whether it will be maintained or reversed) after the end of the formation, also known as the rupture or breakout point. Figure 1 shows a head & shoulders formation with the new trend delineated after the breakout point.

Figure 1 Head & shoulders formation
There is also a specific rule for each pattern, which defines the minimum size of the trend after the formation. So, the analyst can also determine the minimum target price of the new trend, as illustrated in Figure 2. Authors in the field have recently distinguished between the tools used by a technical analyst: charting, technical or sentiment indicators and oscillators, statistical analysis, and the black box approach. Although this increased specialization calls for distinctions between these many tools, it is commonly accepted that all of them except charting are secondary or tertiary elements in the practice. So we will follow the classical approach and consider technical analysis as charting.
Although most of the papers rejecting technical analysis were based on the random walk theory, there are works with different conclusions about the practice. Saffi (2003) found results discouraging the use of technical indicators and oscillators as a methodology to achieve returns above the market. On the other hand, Neftci (1991) investigated whether Technical Analysis can be algorithmically implemented and whether it explains stock price movements better than Wiener-Kolmogorov predictors; in some cases that ability was confirmed in that paper. Finally, Ratner and Leal (1999) found results that partially support specific technical indicators, based on moving averages, in some emerging markets.
The main criticisms of the practice concern the highly subjective nature of the identification of geometrical patterns in price charts and the fact that there is no scientific evidence validating such patterns. Those criticisms were addressed in a pioneering way by Lo et al. (2000).

A Methodology for Statistical Validation of Technical Analysis: Searching for Homogeneity

The validation proposal of Lo et al.
We start by discussing the part of Lo et al. (2000) concerned with the validation of technical analysis by verifying the existence of informational content in the patterns extracted from price charts. We first detail their methodology and then stress its statistical drawbacks.

The original methodology
Firstly, we borrow some terminology from Lo et al. (2000). By conditional returns we mean the parts of the agglutinated return time series which follow the identification of a given pattern in the corresponding portion of the price series. Figure 3 illustrates this concept for the geometrical pattern known as flags & pennants. By unconditional returns we mean all the observations from all agglutinated return time series.
The first test used by Lo et al. is the Chi-Square goodness-of-fit test, whose statistic is

$$Q = \sum_{i=1}^{10} \frac{(Y_i - 0.1\,n)^2}{0.1\,n}, \qquad (1)$$

where $Y_i$ represents the number of conditional returns assuming values between the $(i-1)$-th and the $i$-th deciles of the empirical distribution of the unconditional returns, $n$ is the total number of conditional returns (so that $\sum_{i} Y_i = n$) and $0.1\,n$ is the expected absolute frequency of conditional returns in any decile under the null hypothesis $H_0$: "The theoretical distribution of the conditional returns is the same as the empirical distribution of the unconditional returns". On rejecting this null hypothesis (the adopted asymptotic null distribution is the usual Chi-Square with 10 − 1 = 9 degrees of freedom), Lo et al. conclude that the distributions of the conditional and the unconditional returns are not the same and interpret this as empirical evidence supporting an informational content identified by technical analysis.
The other test used by Lo et al. is the one due to Kolmogorov and Smirnov, which, in this context, has the same null hypothesis and aims at comparing the empirical distribution function of the conditional returns to the corresponding one of the unconditional returns. This is a traditional goodness-of-fit test and its details can be studied in De Groot (1986).
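The two tests described above can be sketched in a few lines. The following Python example is only an illustration under stated assumptions: the function name `decile_chi_square` and the simulated samples are hypothetical, the expected count per decile is taken as one tenth of the number of conditional returns, and SciPy's two-sample routine `ks_2samp` stands in for the Kolmogorov-Smirnov comparison. It is not the implementation used by Lo et al.

```python
import numpy as np
from scipy import stats

def decile_chi_square(conditional, unconditional):
    """Chi-square statistic comparing conditional returns to the deciles
    of the unconditional returns (10 cells, 9 degrees of freedom)."""
    conditional = np.asarray(conditional, dtype=float)
    unconditional = np.asarray(unconditional, dtype=float)
    # Interior decile boundaries (10%, ..., 90%) of the unconditional returns.
    edges = np.quantile(unconditional, np.linspace(0.1, 0.9, 9))
    # Cell index (0..9) of each conditional return.
    cells = np.searchsorted(edges, conditional, side="right")
    counts = np.bincount(cells, minlength=10)
    # Under the null each decile is expected to hold 10% of the sample.
    expected = 0.1 * conditional.size
    q = float(np.sum((counts - expected) ** 2 / expected))
    p_value = float(stats.chi2.sf(q, df=9))
    return q, p_value, counts

# Hypothetical data: the "conditional" sample is deliberately shifted.
rng = np.random.default_rng(0)
unconditional = rng.standard_normal(5000)
conditional = 0.3 + rng.standard_normal(500)

q, p_chi2, counts = decile_chi_square(conditional, unconditional)
ks = stats.ks_2samp(conditional, unconditional)
print(q, p_chi2, ks.statistic, ks.pvalue)
```

Both p-values rely on the i.i.d. assumption, so the sketch shares the limitations of these tests when applied to dependent or heterogeneous returns.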

Statistical problems
We should first note that both the Chi-Square and the Kolmogorov-Smirnov tests have as basic presuppositions the independence and the homogeneity of the data generating processes (for short: the data must be i.i.d.). Both conditions are trivially violated by the data considered in the applications of Lo et al. Let us concentrate on this point for a while.
It is a well known fact that daily return series are not independent. They could be at most uncorrelated in time, but they surely present at least some form of conditional heteroscedasticity. Conditional volatility is widely discussed in the literature; see for instance Hamilton (1994), Engle (1995) and Mills (1999). Going further, we claim that this dependence becomes more complex whenever different series are agglutinated, since observations from different series from the same market (or sometimes even from different markets) often exhibit high pairwise correlations besides other complicated types of dependence.
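The distinction between "uncorrelated" and "independent" can be illustrated by simulation. The sketch below assumes a Gaussian GARCH(1,1) process with hypothetical parameter values (`omega`, `alpha`, `beta` are illustrative names and numbers, not estimates from any data set) and compares the lag-1 sample autocorrelation of the returns with that of the squared returns.

```python
import numpy as np

def simulate_garch(n, omega, alpha, beta, seed=0):
    """Simulate n returns from a zero-mean Gaussian GARCH(1,1) process."""
    rng = np.random.default_rng(seed)
    z = rng.standard_normal(n)
    r = np.empty(n)
    h = omega / (1.0 - alpha - beta)  # start at the unconditional variance
    for t in range(n):
        r[t] = np.sqrt(h) * z[t]
        h = omega + alpha * r[t] ** 2 + beta * h  # variance for the next day
    return r

def lag1_autocorr(x):
    """Lag-1 sample autocorrelation."""
    x = np.asarray(x, dtype=float) - np.mean(x)
    return float(np.sum(x[1:] * x[:-1]) / np.sum(x * x))

returns = simulate_garch(20000, omega=0.05, alpha=0.10, beta=0.85)
acf_levels = lag1_autocorr(returns)        # close to zero
acf_squares = lag1_autocorr(returns ** 2)  # clearly positive
print(acf_levels, acf_squares)
```

The returns themselves are serially uncorrelated, while their squares are not, which is exactly the kind of dependence that the i.i.d. assumption rules out.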
Secondly, and most importantly: the agglutination of different series and the treatment of all the observations as if they came from the same distribution is strongly open to criticism, since this agglutination very plausibly violates distributional homogeneity.
Actually, Lo et al. (2000), p. 1728, called attention to these two issues and suggested that they would extend their analysis to the non-i.i.d. framework. Here, in our paper, we concentrate on the second and more important problem and propose an alternative methodology for attenuating the violation of homogeneity. We leave the other problem (dependence of the data) for future research.

Preliminaries for an alternative methodology
We now present some general statistical background which is going to be used in the methodology presented in subsection 3.3. Quite briefly, the techniques are: (i) AR − GARCH modelling; (ii) visual clustering under principal component analysis; and (iii) goodness-of-fit tests.

AR − GARCH Models
The AR(1)−GARCH(1, 1) model (see Engle (1995) for a quite exhaustive treatment of the subject) is written as

$$R_t = \phi_0 + \phi_1 R_{t-1} + \varepsilon_t, \qquad \varepsilon_t = \sqrt{h_t}\, z_t, \qquad h_t = \alpha_0 + \alpha_1 \varepsilon_{t-1}^2 + \beta_1 h_{t-1}, \qquad (2)$$

where $R_t$ is the stochastic process representing the daily return of some security and, obviously, $z_t$ is an i.i.d. sequence with zero mean and unit variance. Following Nelson (1990), sufficient conditions for the ergodicity of the model in (2) would be

$$|\phi_1| < 1, \qquad \alpha_0 > 0, \qquad E\left[\ln\left(\beta_1 + \alpha_1 z_t^2\right)\right] < 0.$$

This model, which is a very particular case of a general ARMA(p, q) − GARCH(s, t) structure, has its place here since there is plenty of practical evidence that it captures fairly well the dynamics of many return series. In practice the AR(1) structure tends to be "weak" in the sense that $\phi_1 \approx 0$. This stylized fact should be interpreted as a device to account for some lack of efficiency in the underlying financial market, not as something to rely on when making forecasts.
In general the estimation of AR(1)−GARCH(1, 1) models is consistently accomplished by quasi maximum likelihood estimation (Bollerslev and Wooldridge, 1992), where the adopted quasi likelihood is usually Gaussian (see Greene (2000), p. 802-807, for the analytical expressions and derivatives used in numerical optimization).
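To make the estimation step concrete, the following Python sketch simulates an AR(1)−GARCH(1, 1) series and maximizes a Gaussian quasi likelihood numerically. It is a minimal illustration under stated assumptions: the parameter names (`phi0`, `phi1`, `alpha0`, `alpha1`, `beta1`), the starting values, the box constraints and the choice of SciPy's L-BFGS-B optimizer with numerical gradients are all hypothetical choices made here, not the procedure of any particular econometric package.

```python
import numpy as np
from scipy.optimize import minimize

def simulate_ar_garch(n, phi0, phi1, alpha0, alpha1, beta1, seed=0):
    """Simulate n returns from a Gaussian AR(1)-GARCH(1,1) process."""
    rng = np.random.default_rng(seed)
    z = rng.standard_normal(n)
    r = np.empty(n)
    h = alpha0 / (1.0 - alpha1 - beta1)   # start at the unconditional variance
    r_prev = phi0 / (1.0 - phi1)          # start at the unconditional mean
    for t in range(n):
        eps = np.sqrt(h) * z[t]
        r[t] = phi0 + phi1 * r_prev + eps
        h = alpha0 + alpha1 * eps ** 2 + beta1 * h   # variance for the next day
        r_prev = r[t]
    return r

def neg_quasi_loglik(params, r):
    """Negative Gaussian quasi log-likelihood of an AR(1)-GARCH(1,1) model."""
    phi0, phi1, alpha0, alpha1, beta1 = params
    eps = r[1:] - phi0 - phi1 * r[:-1]    # AR(1) residuals
    h = np.empty_like(eps)
    h[0] = np.var(eps)                    # assumed initial conditional variance
    for t in range(1, eps.size):
        h[t] = alpha0 + alpha1 * eps[t - 1] ** 2 + beta1 * h[t - 1]
    return 0.5 * np.sum(np.log(2.0 * np.pi) + np.log(h) + eps ** 2 / h)

def fit_ar_garch(r):
    """Return the starting values and the optimizer result."""
    r = np.asarray(r, dtype=float)
    start = np.array([r.mean(), 0.0, 0.1 * r.var(), 0.05, 0.85])
    # Box constraints keep the conditional variance strictly positive.
    bounds = [(None, None), (-0.99, 0.99), (1e-8, None), (0.0, 1.0), (0.0, 1.0)]
    result = minimize(neg_quasi_loglik, start, args=(r,),
                      method="L-BFGS-B", bounds=bounds)
    return start, result

returns = simulate_ar_garch(1000, 0.0, 0.05, 0.05, 0.10, 0.85, seed=1)
start, result = fit_ar_garch(returns)
estimates = result.x   # the 5 estimated coefficients
print(estimates)
```

The five numbers in `estimates` play the role of the estimated coefficients that later serve as clustering variables; standard errors would require the robust covariance matrix of Bollerslev and Wooldridge (1992), which this sketch does not compute.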

Principal component analysis
Visual clustering by means of principal component analysis (Johnson and Wichern, 1998, ch. 8) consists of visually grouping experimental units with similar values for the first components, precisely those whose variances account for a great part of the variability of the original variables. This is a kind of dimension reduction in which a few main orthogonal components (say, the very first, or the first and the second together) replace the original variables, thereby permitting a graphical depiction of the experimental units. Surely, the technique is only valuable and recommended if the adopted components "satisfactorily" represent the data, and a crucial condition for this is the presence of at least moderate correlations among the original variables.
Consider that there are $p$ variables observed on $n$ desirably independent individuals (the experimental units), so that we have $X_i = (X_{i1}, \ldots, X_{ip})'$, $i = 1, \ldots, n$. We define the $j$-th principal component of the $i$-th individual as the scalar product between $X_i$ and the normalized eigenvector $\hat{e}_j$ associated with the $j$-th greatest eigenvalue $\hat{\lambda}_j$ of the sample covariance matrix of the original variables. That is,

$$\hat{Y}_{ij} = \hat{e}_j' X_i, \qquad j = 1, \ldots, p.$$

By Johnson and Wichern (1998), ch. 8, the sample variance of the $j$-th component is given by the $j$-th greatest eigenvalue:

$$\widehat{\mathrm{Var}}(\hat{Y}_j) = \hat{\lambda}_j.$$

Readers interested in more theoretical and methodological material concerning principal component analysis are referred to Johnson and Wichern (1998), ch. 8.
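The construction above can be checked numerically. The sketch below, which assumes a hypothetical data matrix with n = 50 rows and p = 5 columns, computes the principal components from the eigen-decomposition of the sample covariance matrix with NumPy. The scores are centered at the sample mean, a choice made here that shifts the components without changing their variances.

```python
import numpy as np

def principal_components(X):
    """Eigenvalues (descending), eigenvectors and centered component scores."""
    X = np.asarray(X, dtype=float)
    S = np.cov(X, rowvar=False)              # sample covariance, divisor n - 1
    eigvals, eigvecs = np.linalg.eigh(S)     # eigh returns ascending eigenvalues
    order = np.argsort(eigvals)[::-1]        # reorder from greatest to smallest
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    scores = (X - X.mean(axis=0)) @ eigvecs  # column j holds the j-th component
    return eigvals, eigvecs, scores

# Hypothetical correlated data: 50 individuals, 5 variables.
rng = np.random.default_rng(0)
mixing = rng.standard_normal((5, 5))
X = rng.standard_normal((50, 5)) @ mixing

eigvals, eigvecs, scores = principal_components(X)
explained = eigvals / eigvals.sum()          # share of total variance per component
print(np.cumsum(explained))
```

The cumulative shares printed at the end are the quantities one inspects when deciding how many components to keep for a scatter plot.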

Goodness-of-fit tests
As already specified in subsection 3.1.1, the statistic given in (1) is used to compare the empirical distributions of the conditional and unconditional returns. Under the null, the squared differences $(Y_i - 0.1\,n)^2$ would assume "small" values. Details on that test and on the Kolmogorov-Smirnov test, which was also discussed before and will be used in this paper, can be found in De Groot (1986).

The methodology itself
Once formalized, the statistical techniques discussed in subsection 3.2 must be combined to form our methodology, which is given in the following 9-step algorithm:
1. Obtain a relatively large number of price series from securities. This is the raw data.
2. Estimate by quasi maximum likelihood AR(1) − GARCH(1, 1) models, as given in (2), for each of the return series calculated from the prices.
3. Consider as the new data set the 5 estimated coefficients (these are the variables!) from all securities. Then search, in an exploratory way, for outliers (that is, estimated coefficient values that are "strange" as compared to the majority of the securities). This can be done by descriptive graphical devices, some "3-sigma" strategy and/or a provisional principal component analysis. Once the outliers are found, remove the corresponding securities from the data and go to the next step.
4. Use the "outlier-free" data set to implement a definitive principal component analysis with those 5 variables (the estimated coefficients) in order to get a reduction from 5 to, let us say, 2 dimensions. Interpretation of the components is optional.
5. Use the adopted principal components to attempt a visual clustering in order to obtain some homogeneous groups of securities in terms of the principal component values and, consequently, in terms of the estimated coefficient values.
6. Without loss of generality, let us consider that the last step has produced one cluster. Within this cluster, search descriptively for "sub-clusters" by looking at the values of the original variables, the estimated coefficients, in the data set.
7. Use the securities from the (sub-)clusters to carry out a technical analysis in order to find potential geometrical patterns.
8. For each type of geometrical pattern found in the technical analysis (cf. section 2 for the possible types), group the parts of the clustered return series which follow the identification of a given pattern. These are the conditional returns. Observe that the number of data sets formed with conditional returns equals the number of patterns found in the last step. Also construct the corresponding data sets formed with the unconditional returns.
9. In this final step, perform the goodness-of-fit tests within each pattern. To say it once more: the null is that the theoretical probability distribution of the conditional returns is adequately fitted by the empirical distribution of the unconditional returns. If the null is rejected, interpret this as evidence of informational content coming from that particular pattern.
A technical word. The sought-after homogeneity is meant to be tackled in steps 2 to 6. The strict stationarity of the AR(1)-GARCH(1,1) model, already discussed in subsection 3.2.1, is the building block of everything: the random variables associated with processes having close estimated coefficient values are believed to have the "same" distribution (even though they still present a rather complicated statistical dependence).
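As an illustration of the outlier screening in step 3, the sketch below implements one possible reading of the "3-sigma" strategy: a security is flagged when any of its 5 estimated coefficients lies more than three sample standard deviations away from the cross-sectional mean of that coefficient. The function name, the threshold handling and the simulated coefficient table are assumptions made for this example; the other devices mentioned in step 3 (graphical inspection, provisional principal components) are not covered.

```python
import numpy as np

def three_sigma_outliers(coefs, k=3.0):
    """Row indices with at least one coefficient beyond k standard deviations."""
    coefs = np.asarray(coefs, dtype=float)
    z = (coefs - coefs.mean(axis=0)) / coefs.std(axis=0, ddof=1)
    return np.where(np.any(np.abs(z) > k, axis=1))[0]

# Hypothetical table: 62 securities (rows) by 5 estimated coefficients (columns).
rng = np.random.default_rng(0)
coefs = rng.uniform(-1.0, 1.0, size=(62, 5))
coefs[7, 3] = 50.0                         # planted discordant value

flagged = three_sigma_outliers(coefs)
kept = np.delete(coefs, flagged, axis=0)   # "outlier-free" data for step 4
print(flagged, kept.shape)
```

Because the mean and the standard deviation are themselves affected by extreme values, a rule of this kind can mask outliers when several are present, which is one reason for combining it with graphical devices.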

General points
In this section we illustrate the methodology proposed in the last section with real financial data. Each step of the methodology, whenever carried out, is indicated in the following subsections, together with the computational framework used. All the implementations have been performed on a Pentium 4 with 3.2 GHz and 512 MB of RAM.

1st step: obtention of the data
We chose to work with daily series, each one comprising 1000 observations, from 62 worldwide securities, such as stocks, commodities, several indexes and exchange rates. The period of analysis ranges from December 20th, 2001 to December 9th, 2005. Appendix A offers a table with general information on those securities. The data were obtained from Reuters (www.reuters.com).

2nd step: estimation of the AR(1) − GARCH(1, 1) models
We estimated AR(1) − GARCH(1, 1) models using quasi maximum likelihood for the 62 return series and stored the 5 estimated coefficients for each security. This step was implemented in the Ox language (www.oxmetrics.net) with the package G@RCH (Laurent and Peters, 2006), and the computational time was 13 seconds. Appendix B shows the estimated coefficients, their associated t statistics and the corresponding p-values. We observe that, for all the securities but AL.N CLose (Appendix B), the Bonferroni joint significance test indicated that at least one of the theoretical coefficients is different from zero at the 1% level.
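A Bonferroni joint significance test can be sketched as follows, assuming its usual form: with m individual p-values, the joint null that all coefficients are zero is rejected at level α when the smallest p-value does not exceed α/m. The function name and the p-values below are hypothetical and are not taken from Appendix B.

```python
def bonferroni_joint_reject(p_values, level=0.01):
    """Reject the joint null if the smallest p-value is at most level / m."""
    m = len(p_values)
    return min(p_values) <= level / m

# Hypothetical p-values of the 5 coefficients of two securities.
security_a = [0.50, 0.30, 0.001, 0.20, 0.90]   # 0.001 <= 0.01 / 5 = 0.002
security_b = [0.50, 0.30, 0.004, 0.20, 0.90]   # 0.004 >  0.002

print(bonferroni_joint_reject(security_a))
print(bonferroni_joint_reject(security_b))
```

The rule controls the probability of a false joint rejection at level α without requiring independence among the individual tests, at the cost of some conservativeness.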

3rd step: eliminating outliers
This step was performed in Minitab 12.1 (www.minitab.com). Although we do not detail the whole procedure in this paper, it should be mentioned that the data on estimated coefficients have been scrutinized with all the devices suggested in the 3rd step of our methodology. The conclusion was that the securities listed in Table 1 were quite discordant in terms of their coefficient values. We move on using the remaining 50 securities.

4th and 5th steps: principal component analysis and clustering
Consider the data set with the 5 estimated coefficients from the 50 securities, which is the data presented in Appendix B without the 12 lines corresponding to the excluded securities listed in Table 1. We now present the details of the definitive principal component analysis, whose main output is in Table 2. The implementation has been done in Minitab 12.1. By looking at the output of the analysis, we find that the first two principal components account for 76.1% of the total variance of the original variables, the estimated coefficients. The interpretation of the components is direct. The first one, as it is more strongly weighted on the GARCH coefficients, is called GARCH Effect, and the second one, being more strongly weighted on the AR coefficients, is called AR Effect.
With these two adopted and interpreted components, we tackle the visual clustering of securities.The following scatter plot in Figure 4 for the two components is an appropriate place to start.
Looking at the scatter plot, which "photographs" (projects) the securities onto two dimensions, we decided to pick out the points respecting the following ranges: −1.5 < GARCH Effect < −0.5 and −0.4 < AR Effect < 0.3. The selected securities are circled in the scatter plot and have their names displayed in Table 3.
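The selection rule stated above is a rectangular filter on the two component scores. The sketch below applies the ranges from the text to a small table of hypothetical securities; the names and the score values are invented for illustration and do not correspond to Table 3.

```python
import numpy as np

# Hypothetical securities and their scores on the two adopted components.
names = np.array(["SEC_A", "SEC_B", "SEC_C", "SEC_D"])
garch_effect = np.array([-1.2, 0.4, -0.7, -1.6])
ar_effect = np.array([0.1, 0.0, 0.5, -0.2])

# Ranges taken from the text: -1.5 < GARCH Effect < -0.5, -0.4 < AR Effect < 0.3.
in_box = (
    (garch_effect > -1.5) & (garch_effect < -0.5)
    & (ar_effect > -0.4) & (ar_effect < 0.3)
)
cluster = names[in_box].tolist()
print(cluster)
```

Only SEC_A satisfies both ranges in this toy table; SEC_B and SEC_D fail the GARCH Effect range and SEC_C fails the AR Effect range.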

Figure 4
Scatter plot for the first two principal components, highlighting the first clustering attempt
The clustered securities have been put together by looking solely at the first two principal components. Some information from the original variables has therefore been neglected. In order to remedy this, we refine this clustering process in the next step.

7th step: technical analysis
At this stage, geometrical patterns were extracted through technical analysis of the price time series of the securities clustered in Tables 4 and 5. We would like to thank the JGP staff (www.jgp.com.br), from whom we obtained these charting results. In Table 6 we enumerate the types of identified patterns and their respective frequencies for each security. Additional information on the beginning and on the breakout points of these patterns, as well as the raw material of the performed technical analysis (the scrutinized price charts), can be found in the appendix of Lorenzoni (2006).

Table 6
General information on the occurrence of the geometrical patterns along the securities from both clusters

8th and 9th steps: grouping return observations and the goodness-of-fit tests
For each of the patterns given in Table 6 we have grouped together the parts of all the return series corresponding to the conditional returns. These are compared to all the agglutinated return series, the unconditional returns. We then have everything needed for the application of the Chi-Square and Kolmogorov-Smirnov tests, which were implemented in the Ox language; the computational time was negligible (much less than a second).
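The grouping of conditional returns can be sketched as follows. The example assumes log returns, breakout positions given as integer indices into the price series, and a fixed number of days (`horizon`) collected after each breakout; the horizon and the toy prices are hypothetical choices, since the length of the post-pattern window is not specified here.

```python
import numpy as np

def conditional_returns(prices, breakout_indices, horizon):
    """Collect the `horizon` log returns that follow each breakout index."""
    prices = np.asarray(prices, dtype=float)
    returns = np.diff(np.log(prices))   # returns[t] is the return from day t to t + 1
    pieces = [returns[b:b + horizon] for b in breakout_indices]
    return np.concatenate(pieces) if pieces else np.empty(0)

# Toy price series with a single breakout at index 1 and a 2-day horizon.
prices = [100.0, 101.0, 102.0, 103.0, 104.0, 105.0]
cond = conditional_returns(prices, breakout_indices=[1], horizon=2)
uncond = np.diff(np.log(prices))        # all returns of the series
print(cond, uncond.size)
```

For several securities of the same cluster, the arrays returned for each series would be concatenated pattern by pattern before applying the goodness-of-fit tests.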
Tables 7 and 8 give information on the tests applied to the first cluster. By reading those tables we see that the triangle is the most uninformative pattern (see the "big" p-values). On the other hand, we find evidence of informational content coming from rectangles and head & shoulders (see the "small" p-values). Now we concentrate on the second cluster. Table 9 tells us that both patterns, triangle and rectangle, are informative according to at least one of the tests (see the p-values). We should mention that, although the result suggests the effectiveness of the rectangle, care must be exercised in interpreting it since the number of observations on conditional returns is not that large (only 122 observations); cf. Table 10.

Complementary analysis
The results from the application of the proposed methodology, restricted to the considered data set, indicated two potentially important patterns (rectangle and head & shoulders) and rejected the triangle.
Although we rely on limited empirical evidence, the failure of triangles seems to be corroborated by several technical analysts, who frequently agree on the unreliability of this particular geometrical pattern. As for the two accepted patterns, we understand that the results are in line with the prior belief in possible trend anticipation from price charts.

Discussion
In this final section we attempt to further discuss the proposed methodology by suggesting extensions. We warn, however, that the dependence issue does not enter into what follows; the next two subsections deal with (i) an econometrically more rigorous framework for improving the sub-clustering 6th step of our methodology and (ii) possible advances in the understanding of how the actual informational content statistically influences the conditional returns whenever it is uncovered by the goodness-of-fit tests.

A statistical test for homogeneity
Recall the 5th and 6th steps of the proposed methodology. They prove to be crucial to the whole work, since they try to overcome the undesirable distributional differences in the data by collecting together securities that seem to present homogeneous statistical properties.
However, some readers may find the chosen clustering device somewhat subjective, in the sense that clusters formed by one analyst may not seem appropriate to another, and vice versa. But we stick to the simple fact that any clustering attempt is subjective, regardless of its nature being more visual or more automatic. In Mardia et al. (1979) or in Johnson and Wichern (1998), for instance, plentiful discussion can be found on how a change of clustering device can dramatically change the obtained groups of individuals.
So, as little can be done to attenuate the subjectivity coming from cluster analysis, we could move on to eliminating it! Indeed, we could try an econometrically more compelling way to form a group of homogeneous securities in order to apply the goodness-of-fit tests. We are working on this task, even though implementations have not been accomplished yet. We can nevertheless outline its general points and leave the empirical results to an upcoming paper.
Firstly we assume something quite reasonable: the sub-clustered securities (say, k securities) harvested from the 6th step of the methodology have their dynamics adequately described by some kind of VAR(1) − GARCH(1, 1) model which necessarily admits AR(1) − GARCH(1, 1) models for each one of its components (the return series of the securities). Denoting the vector of all parameters of the joint VAR(1) − GARCH(1, 1) model by $\psi$ and the parameters of the marginal AR(1) − GARCH(1, 1) models by $\psi_j \equiv g_j(\psi)$, where $g_j$ is an appropriate function, $j = 1, \ldots, k$, we formalize this presupposition by displaying

$$(R_{t1}, \ldots, R_{tk})' \sim \mathrm{VAR}(1) - \mathrm{GARCH}(1,1)\,(\psi) \;\Longrightarrow\; R_{tj} \sim \mathrm{AR}(1) - \mathrm{GARCH}(1,1)\,(\psi_j), \quad j = 1, \ldots, k. \qquad (5)$$

Those acquainted with the GARCH literature and its multivariate extensions certainly know that not every multivariate GARCH structure leads to marginal univariate GARCH structures, a difficulty that does not arise with the model proposed by Bollerslev (1990). So the latter could be an alternative.
Secondly we would establish the grounds for the estimation of the adequate multivariate GARCH model in (5) with the data on returns for the k securities, and we would do this within the (quasi) maximum likelihood framework. Within this setup it becomes possible to test the following hypotheses:

$$H_0: \psi_1 = \psi_2 = \cdots = \psi_k \quad \text{versus} \quad H_1: \psi_i \neq \psi_j \ \text{for some } i \neq j. \qquad (6)$$

Not rejecting the null in (6) is everything we need, because this would reveal the lack of evidence in the data against the claim that the securities present strictly the same marginal processes driving their dynamics. This implies that $R_{11}, \ldots, R_{1k}, \ldots, R_{T1}, \ldots, R_{Tk}$ are identically distributed, and therefore we are done: homogeneity has been rigorously achieved.
There are two issues to be considered. The first concerns the test that should be invoked and performed in order to decide between $H_0$ and $H_1$. Since multivariate GARCH proposals usually involve many parameters, a smart choice would be the LM type tests, as they only require the estimation of the reduced, and more parsimonious, models. Some changes to the original test statistic may be important due to the quasi likelihood framework (White, 1994). The second issue concerns the distribution to be chosen for the multivariate error term associated with the VAR(1) − GARCH(1, 1) model. This is rather relevant because additional parameters coming from fat-tailed distributions (e.g. the degrees of freedom of multivariate Student t distributions) could be used as additional variables in the clustering process.

Comparison of moments
When the null hypothesis is rejected by the goodness-of-fit tests, the data furnish evidence of some differences between the probability distributions of the conditional and unconditional returns. Following the interpretation given in Lo et al. (2000) for this finding, we say that, in such a case, there is informational content coming from technical analysis. But what exactly is this "informational content"? Is it connected to the decisions made by technical analysts in their daily routines in banks, brokerage firms, asset management companies and investment clubs? Some work must be done in order to answer these two questions. In fact, Narasimhan Jegadeesh already tried to address this point in the discussion of Lo et al. (2000) by statistically testing whether there were differences between the means of the conditional and unconditional returns. Although the adopted tests found no evidence against the equality of the two means, this is not too relevant because, in practice, financial decisions are rarely made on the basis of first order moments of return distributions. On the contrary, they are usually grounded on the behavior of higher order moments. Moreover, since the tests applied by Jegadeesh used the same data sets as Lo et al., we tend to disregard those conclusions because the data share the same heterogeneity problems.
Our suggestion for future research is the comparison of higher order moments of the conditional and unconditional returns, with securities clustered by this paper's methodology. As an example of what could be uncovered, we cite possible differences in skewness (third order moments), which could lead to a better use of derivative strategies.
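A descriptive starting point for such a comparison is sketched below. The code computes the mean, standard deviation, skewness and excess kurtosis of two samples with SciPy; the function name and the simulated samples are hypothetical, and a formal comparison would still require tests that are valid for dependent data.

```python
import numpy as np
from scipy import stats

def moment_summary(x):
    """Mean, standard deviation, skewness and excess kurtosis of a sample."""
    x = np.asarray(x, dtype=float)
    return {
        "mean": float(np.mean(x)),
        "std": float(np.std(x, ddof=1)),
        "skewness": float(stats.skew(x)),
        "excess_kurtosis": float(stats.kurtosis(x)),  # Fisher definition
    }

# Hypothetical samples standing in for unconditional and conditional returns.
rng = np.random.default_rng(0)
unconditional = rng.standard_normal(5000)
conditional = rng.standard_normal(300)

summary_unconditional = moment_summary(unconditional)
summary_conditional = moment_summary(conditional)
print(summary_unconditional)
print(summary_conditional)
```

Side-by-side summaries of this kind indicate which moments differ most between the two samples and therefore where a formal test would be most informative.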

Figure 2
Representation of a pattern known as triangle. The rule to measure the minimum size of the subsequent trend consists of drawing a line upward from the top of the baseline (A), parallel to the lower line of the triangle.

Figure 3
Price chart highlighting the portion (circled) which generates the conditional returns

Table 4
First cluster formed from the estimated coefficients.

Table 7
Goodness-of-fit tests for the first cluster

Table 8
Number of observations used in the tests for the first cluster

Table 9
Goodness-of-fit tests for the second cluster.

Table 10
Number of observations used in the tests for the second cluster.