Abstract
This paper describes a hybrid automated trading system (ATS) based on grammatical evolution and microeconomic analysis. The proposed system takes advantage of the flexibility of grammars to introduce and test novel characteristics. The ATS introduces the self-generation of new technical indicators and multi-strategies for stopping unforeseen losses. Additionally, this work presents a novel optimization method that combines multi-objective optimization with a grammatical evolution methodology. We implemented the ATS with three different fitness functions under three mono-objective approaches, together with two multi-objective ATSs. The experimental results compare them with the Buy and Hold strategy and with a previous approach, and the proposed ATS beats both in returns and in number of positive operations. In particular, the multi-objective approach achieved returns of up to 20% in very volatile periods, indicating that the combination of fitness functions is beneficial for the ATS.
Introduction
In July 2015, the World Federation of Exchanges (www.world-exchanges.org/) estimated the total market capitalization of the world's major equity markets at $62.4 trillion, and it is projected to reach $284.2 trillion in 2030. These figures give an idea of the critical role played by stock markets and of the importance of developing automated trading systems (ATSs) able to operate successfully in the markets. Since the first approaches to trading systems some decades ago, development and research in this area have grown explosively. The growing availability of data on financial markets and companies, together with the increasing complexity of the socio-economic and financial environment, makes decision making for real-time investments in stock markets more difficult. The huge number of potentially interrelated factors affecting financial assets, and their changing time patterns, mean that the investment process remains a challenge.
Financial investments are affected by factors such as government policies, natural factors, international trade, market sentiment and political factors. Thus, it is very difficult to follow the flow of information and then infer its consequences. Furthermore, traders are affected by the biases studied in behavioral finance [1]. Brokers, and practitioners in general, can be affected by human emotions, so their behavior in the stock market is not objective. The main reason is the high pressure induced by handling a large volume of money, which can trigger loss aversion, overconfidence, overreaction and other behavioral biases.
The importance of information technology (IT) has grown drastically in the vast majority of areas. IT has become an important component of new capital investment systems, and economists look to computers as the best hope for a sustainable increase in economic growth rates. Furthermore, the development of IT and telecommunications has promoted the emergence of global processes. Globalization is symbolized, for instance, by the tremendous development of the “hedge funds industry”, a class of funds that invest in any kind of asset around the world (stocks, indexes, bonds, commodities, currencies, etc.). This new level of interconnectivity plays out in our financial markets, where problems in one market have inescapable and often unpredictable effects on the rest of the markets worldwide.
ATSs are predictive engines based on rules that use market, business or macroeconomic information. These rules are embedded in algorithms that seek the best combinations of rules to drive stock trades, in an attempt to obtain the maximum possible return for a period. Finding an optimal series of investment decisions involves exploring the related search space, whose complexity has increased as more features have been added to the trading engines. Thus, ATSs have evolved from very simple if-then algorithms to more sophisticated models that use methods such as artificial intelligence, chaos theory, fractals and evolutionary algorithms. This environment points to meta-heuristics as one of the best approaches to find optimal solutions.
Evolutionary meta-heuristics, commonly referred to as evolutionary algorithms, are a set of search and optimization methodologies inspired by and based on principles and theories from biology. In the academic literature, a large number of previous studies document the use of evolutionary algorithms to design and optimize automated trading systems for the stock market ([2–8], among others). The application of meta-heuristics to ATSs has developed quickly in both the scientific and the professional world. This work focuses on a relatively new meta-heuristic methodology referred to as grammatical evolution (GE). The ultimate aim of the work is to build a novel ATS capable of analyzing a high number of companies and providing, as its final solution, a program optimized to earn high returns. We introduce and test innovative characteristics such as the self-generation of new technical indicators and multi-strategies for stopping unforeseen losses. Additionally, this work presents a novel optimization method that combines multi-objective optimization with a GE methodology.
The remainder of the paper is organized as follows. Section 2 introduces some economic definitions and the financial indicators used to develop the work. Section 3, the main part of the paper, details the proposed ATS. Section 4 presents the experimental results in four strands. First, we compare the GE approach, the buy and hold strategy and a previous genetic algorithm implementation published in the literature. Next, we test the three implemented fitness functions and then the multi-objective approaches. Finally, we test the ATS across several industrial sectors to check for the existence of macroeconomic trends. Section 5 briefly reviews the literature on trading systems, especially those using evolutionary algorithms. We conclude the paper in Section 6.
Concepts and financial specifications for the automated trading system
This work is based on the adaptive market hypothesis (AMH), a novel theory about the behavior of the markets proposed by Andrew Lo in Ref. [9]. The AMH holds that the classical theory cannot reflect the behavior of the market in every case. Within the framework of the AMH, investment strategies are systems that evolve over time as investors learn which strategies work better in different circumstances and gradually implement them. Profitable strategies will progressively disappear, while new price patterns will emerge and new strategies will be developed to exploit them. It has been tested empirically that financial decisions, which are ultimately human decisions, have a heuristic component. This heuristic component makes markets deviate from financial theory.
If we observe investment analysts at exchange markets, the majority of them tend to fall into one of two schools of thought, namely fundamental or technical analysis. Technical analysis focuses on the price movement of a security and uses it to predict future price movements. Fundamental analysis, on the other hand, looks at a larger set of economic factors, known as fundamentals. Both analyses are commonly used by investors, even at the same time, as they are somewhat complementary methods. For example, many fundamental investors combine their primary knowledge with the technical methodology to decide entry and exit points. Conversely, many technical investors use fundamental techniques to limit the universe of possible stocks to those of good companies. In this paper we implement a technical analysis methodology for obtaining investment signals, so let us now review some important concepts related to it.
Technical analysis
Technical analysis is based on the idea that all the information of the market is already reflected in the price of the stock. Thus, price predictions are only extrapolations from historical price patterns. This analysis has shown satisfactory behavior in forecasting trends. In fact, most empirical studies report high profits, whereas theoretical evaluations often attribute low predictive power to these strategies [10]. Technical analysis uses technical indicators (TIs), which are variables derived from the price time series of a security. TIs have no predictive reliability by themselves and should be combined with other indicators or investment tools to avoid false signals. Some TIs can exhibit good performance when applied to specific companies or markets and bad results for others. There are countless technical variables; here we only enumerate the TIs used in this paper. For more detailed information on TIs we refer to [11]. The moving averages used are the simple moving average (SMA), the weighted moving average (WMA), the exponential moving average (EMA) and the Hull moving average (HMA) [12].
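For reference, the four moving averages can be written as follows. These are the common textbook definitions over closing prices $P_t$ with period $n$; the smoothing factor $\alpha = 2/(n+1)$ and the rounding of $n/2$ and $\sqrt{n}$ in the HMA are conventions assumed here, not details taken from this paper.

$$
\begin{aligned}
\mathrm{SMA}_n(t) &= \frac{1}{n}\sum_{k=0}^{n-1} P_{t-k} \\
\mathrm{WMA}_n(t) &= \frac{\sum_{k=0}^{n-1} (n-k)\,P_{t-k}}{n(n+1)/2} \\
\mathrm{EMA}_n(t) &= \alpha\,P_t + (1-\alpha)\,\mathrm{EMA}_n(t-1), \qquad \alpha = \frac{2}{n+1} \\
\mathrm{HMA}_n(t) &= \mathrm{WMA}_{\lfloor\sqrt{n}\rfloor}\!\left(2\,\mathrm{WMA}_{\lfloor n/2\rfloor} - \mathrm{WMA}_n\right)(t)
\end{aligned}
$$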
Automated trading systems based on grammatical evolution
Grammatical Evolution
Grammatical Evolution (GE) is a relatively new evolutionary alternative. This evolutionary computation technique was introduced by C. Ryan, J.J. Collins and M. O’Neill in 1998 [16]. We can summarize GE as a type of evolutionary algorithm designed to evolve computer programs defined by a grammar, usually in Backus normal form (BNF notation). The most similar procedure is genetic programming (GP) [17], which is also able to evolve computer programs. Although GP originally used Lisp as the evolving language, there are also many approaches using other languages. The main difference, which makes GE an attractive and elegant solution, is that it does not perform the evolutionary process on a specific language. GE evolves individuals as GA and GP do, and performs a mapping process to generate programs in any language. GE is attractive thanks to its flexibility, which is closely related to the great modularity that a well-structured grammar provides. This feature is the main advantage of GE solutions.
BNF is a notation technique for expressing context-free grammars. A BNF specification is a set of derivation rules, expressed in the form <symbol> ::= <expression>.
The rules are composed of sequences of terminals and non-terminals. Symbols that appear on the left-hand side are non-terminals, while terminals never appear on a left-hand side. In this sense, <symbol> is a non-terminal and, although this is not a complete BNF specification, <expression> is also a non-terminal, since non-terminals are always enclosed between the pair <>. So, in this case, the non-terminal <symbol> will be replaced (indicated by ::=) by an expression. The rest of the grammar must indicate the different possibilities. A grammar is represented by a 4-tuple {N, T, P, S}, where N is the non-terminal set, T is the terminal set, P is the set of production rules that map elements of N to sequences of elements of N and T, and S is a start symbol, which must belong to N. The options within a production rule are separated by a “|” symbol.
We use the individual genotype to map the start symbol into terminals by reading groups (codons) of 8 bits. Each codon is represented by an integer value in the genotype. The mapping process is the transformation from the genotype to the phenotype. Thus, instead of representing the programs as a tree (as in GP), GE uses a chromosome composed of codons (genes in GA). Each codon is connected with a specific rule of the grammar. The chromosome itself is considered the genotype, and the actual code derived from the codons is called the phenotype. To decode the genotype, the modulo operator (MOD) is typically used as the mapping operator. The final solution consists of a combination of terminals from T, which are chosen by mapping the individual to the grammar. Figure 1 contrasts the general decoding processes of classical GA and GE individuals in order to clarify the features of GE chromosomes. The figure compares two similar chromosomes, the work-flows of the decoding processes and two solution instances. As the figure shows, the key difference between both methodologies is the transformation process between the genotype and the phenotype, which in GE is guided by a grammar, as previously explained.
During the mapping of the genotype we may reach the end of the chromosome, that is, the method may run out of codons before all the non-terminals are turned into terminals. At this point, there are two basic options. First, we can consider the individual invalid and assign it a very low fitness value so that it stops reproducing. Second, we can wrap the individual, reusing the chromosome from its beginning with the wrapping operator. This operator is inspired by the gene-overlapping phenomenon that has been observed in many organisms [18]. Wrapping allows chromosome structures to be reused to obtain broader rules, thereby influencing the quality and diversity of the generated individuals, which makes it an advantageous operator [19] (although some authors disagree).
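The mapping and wrapping steps described above can be illustrated with a minimal sketch. The toy grammar, the function name `map_genotype` and the choice to consume one codon for every non-terminal expansion are assumptions made for this example; they are not the trading grammar or the implementation used in this work.

```python
from typing import Dict, List, Optional

# Hypothetical toy grammar: each non-terminal maps to its list of productions.
GRAMMAR: Dict[str, List[List[str]]] = {
    "<expr>": [["<expr>", "<op>", "<expr>"], ["<var>"]],
    "<op>": [["+"], ["-"]],
    "<var>": [["x"], ["y"]],
}


def map_genotype(codons: List[int],
                 grammar: Dict[str, List[List[str]]],
                 start: str = "<expr>",
                 max_wraps: int = 2) -> Optional[str]:
    """Decode a list of integer codons into a phenotype string.

    Returns None when the codons (including wraps) are exhausted before
    every non-terminal has been rewritten, i.e. the individual is invalid.
    """
    symbols = [start]           # pending symbols, leftmost first
    output: List[str] = []      # terminals emitted so far
    used = 0                    # number of codons consumed
    limit = len(codons) * (max_wraps + 1)
    while symbols:
        sym = symbols.pop(0)
        if sym not in grammar:  # terminal: emit it
            output.append(sym)
            continue
        if used >= limit:       # ran out of codons even after wrapping
            return None
        options = grammar[sym]
        codon = codons[used % len(codons)]       # wrapping: reuse from the start
        used += 1
        choice = options[codon % len(options)]   # the MOD mapping rule
        symbols = list(choice) + symbols
    return " ".join(output)


print(map_genotype([220, 203, 18, 51, 101, 7], GRAMMAR))  # x - y
print(map_genotype([2, 1, 0], GRAMMAR))                   # needs one wrap: x + x
```

With `max_wraps=0` the second genotype is rejected (`None`), which corresponds to the first option in the text: treating the individual as invalid.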
GE-ATS: Seeking the best investment operations
The Grammatical Evolution based Automated Trading System (GE-ATS) produces solutions that combine the TIs described in Section 2: MAC, VPCI, MACD, SR and RSI. A final solution provides the set of TIs to apply. It is noteworthy that a particular TI is not limited to one occurrence in the solution set. Each TI can be repeated any number of times, either with the same or with different parameters. Each of these TIs gives a signal for operating with the assets of the company. The signal types are buy, sell or neutral. These indicators have been selected because of their usefulness in the professional and academic world of finance. Additionally, the ATS introduces a novelty that expands the diversity of TIs: the possibility of creating new indicators at execution time, which is detailed in Section 3.3.
Select Companies
grammar ← trading_grammar.bnf
T_B ← Threshold buy
T_S ← Threshold sell
G ← #Generations
N ← #Individuals
P_c ← Crossover Probability
P_m ← Mutation Probability
Optimize TIs parameters by GE (G, N, P_c, P_m)
Algorithm 1 describes the main process of the methodology. The algorithm assumes that we are working with a portfolio of companies. The optimization process is performed on every company in that portfolio. The result of this process is an optimized solution per company, which the ATS uses to invest in the market.
grammar ← trading_grammar.bnf
Pop ← Generate(N, Seed)
Solutions ← Decode(Pop, grammar)
Evaluate(Solutions, T_B, T_S)
Pop ← Selection(Pop)
Pop ← Crossover(Pop, P_c)
Pop ← Mutation(Pop, P_m)
Algorithm 2 shows the pseudo-code of the evolutionary optimization process. It is a standard GE algorithm, i.e. an evolutionary process guided by a grammar. Each individual decoded by the grammar provides a set of rules to invest. The fitness per individual is calculated using the accumulated value of the trading signals for the complete historical period of investments.
The grammar of the GE methodology is one of the key elements of the entire system. As in all systems based on GE, the grammar determines the production of the solutions and the complexity of the search space. The main features of the implemented ATS are coded into the grammar. Figure 2 presents a fragment of the defined grammar. Rule I and Rule III codify the implementation of an auxiliary strategy to stop losses (see Section 3.4). Rule II defines the number of TIs and the weighting factors (see next paragraph). Rule IV indicates the type of TI (see Section 2.1). The remaining rules codify the indicators, both types and parameters.
The grammar presented in Fig. 2 shows how each indicator is associated with a particular weight value (<indicator><weight>). These values represent the importance of each indicator in the investment decision. The default values of the weights are initialized to one. The higher the value of the weight, the higher the number of signals produced by the associated indicator.
To extract the final signal from the accumulated values, the ATS is configured by two preset variables, $Threshold_{buy}$ and $Threshold_{sell}$. The GE-ATS indicates a buy position if the number of buying signals exceeds $Threshold_{buy}$, a sell signal if the value is lower than $Threshold_{sell}$, and a neutral signal if the value lies between both thresholds. The threshold values can be increased or decreased to profile the system behavior. Thus, the program may initially be more aggressive, or more conservative, when it performs investments. After the first GE generations, the GE system can alter both the number of indicators involved in the solution and their respective weights to compensate for the preset thresholds. Next, we review two examples for a better understanding of the operation of the thresholds. Let us consider the following threshold configurations: {A} buy threshold = 5, sell threshold = -4; {B} buy threshold = 20, sell threshold = -20.
The first configuration produces an initially aggressive ATS. The signals produced by the selected TIs can easily exceed the thresholds. Therefore, this configuration provides a higher number of investments. However, if the fitness function assesses a conservative scenario as a more profitable strategy, the solutions will eventually evolve into more conservative strategies, that is, strategies with a reduced number of indicators or indicators with low weighting values. The second configuration represents the opposite situation. Initially, the accumulated value of the signals rarely exceeds the thresholds. Therefore, this configuration provides a lower number of investments. However, if the fitness function assesses an aggressive scenario as a more profitable strategy, the solutions will eventually evolve through the generations towards a more aggressive ATS, that is, strategies with a higher number of indicators or indicators with high weighting values.
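The effect of the two configurations can be illustrated with a small sketch. It assumes that each indicator votes +1 (buy), -1 (sell) or 0 (neutral) and that the accumulated value is the weighted sum of the votes; the function name and this encoding are illustrative choices, not the exact computation used in the GE-ATS.

```python
from typing import Sequence


def final_signal(votes: Sequence[int],
                 weights: Sequence[int],
                 t_buy: float,
                 t_sell: float) -> int:
    """Return +1 (buy), -1 (sell) or 0 (neutral) from weighted indicator votes."""
    # Assumed accumulation rule: each vote counts as many times as its weight.
    total = sum(v * w for v, w in zip(votes, weights))
    if total > t_buy:
        return 1
    if total < t_sell:
        return -1
    return 0


votes = [1, 1, 1, -1, 1, 1]      # hypothetical signals of six indicators
weights = [2, 2, 1, 1, 1, 1]     # accumulated value = 6

print(final_signal(votes, weights, t_buy=5, t_sell=-4))    # configuration {A}: 1
print(final_signal(votes, weights, t_buy=20, t_sell=-20))  # configuration {B}: 0
```

The same set of indicator signals triggers a purchase under configuration {A} but stays neutral under {B}, unless evolution adds indicators or raises their weights.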
Generating technical indicators
As mentioned above, our GE-ATS can produce new indicators as combinations or modifications of the set of TIs initially selected. There are countless TIs in the literature. Most of them are modifications of previous indicators, or combinations of already defined TIs. Furthermore, there is no perfect formula to select and use TIs to beat the market. To provide greater flexibility to our solutions, we implement the four different moving averages introduced in Section 2. The classical operators MAC, MACD, RSIO or RSIOD are defined by one or more moving averages. We implement variations of these classic TIs by combining them with the multiple versions of the moving average operator. Other approaches have also focused on the generation of new indicators, such as the EDDIE project [20, 21]. The EDDIE 8 project [22] presents an ATS with a similar feature, which improves on previous results.
Next, let us consider a solution example related with the presented grammar (Fig. 2). The grammar may produce a solution expression containing the following MACD configuration.
The parameters 9, 18 and 11 are the period values (<params>) optimized by the GE for this indicator, company and period. WMA, HMA and EMA are the moving average types (<type> <type> <type>) optimized by the GE. Using this configuration, the MACD indicator is computed in two steps. First, the MACD line is the difference between two moving averages: WMA(9) and HMA(18). Second, the signal line is a moving average of the MACD line: EMA(11). Thus, the ATS creates a new indicator based on the classic MACD but working together with the HMA, which is a novel and effective moving average. This contribution is an innovative and easily expandable feature, which allows the ATS to extend the original set of TIs.
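The two-step computation just described can be sketched as follows. The moving-average definitions (smoothing factor 2/(n+1) for the EMA, floor rounding in the HMA), the tail alignment of series of different lengths and all function names are assumptions made for this illustration and may differ from the implementation evaluated in this paper.

```python
from math import isqrt
from typing import Callable, Dict, List, Tuple


def sma(x: List[float], n: int) -> List[float]:
    return [sum(x[i - n + 1:i + 1]) / n for i in range(n - 1, len(x))]


def wma(x: List[float], n: int) -> List[float]:
    denom = n * (n + 1) / 2
    # Linear weights: the oldest price in the window gets 1, the newest gets n.
    return [sum((k + 1) * x[i - n + 1 + k] for k in range(n)) / denom
            for i in range(n - 1, len(x))]


def ema(x: List[float], n: int) -> List[float]:
    alpha = 2 / (n + 1)
    out = [x[0]]                       # seeded with the first value
    for v in x[1:]:
        out.append(alpha * v + (1 - alpha) * out[-1])
    return out


def hma(x: List[float], n: int) -> List[float]:
    half, full = wma(x, max(1, n // 2)), wma(x, n)
    offset = len(half) - len(full)     # align both series on their last value
    diff = [2 * h - f for h, f in zip(half[offset:], full)]
    return wma(diff, max(1, isqrt(n)))


MA: Dict[str, Callable[[List[float], int], List[float]]] = {
    "SMA": sma, "WMA": wma, "EMA": ema, "HMA": hma,
}


def custom_macd(prices: List[float], fast: int, slow: int, signal: int,
                fast_type: str, slow_type: str,
                signal_type: str) -> Tuple[List[float], List[float]]:
    """Return (MACD line, signal line), aligned on the most recent day."""
    fast_ma = MA[fast_type](prices, fast)
    slow_ma = MA[slow_type](prices, slow)
    m = min(len(fast_ma), len(slow_ma))
    if m < 1:
        raise ValueError("not enough prices for the requested periods")
    macd_line = [f - s for f, s in zip(fast_ma[-m:], slow_ma[-m:])]
    signal_line = MA[signal_type](macd_line, signal)
    if not signal_line:
        raise ValueError("not enough prices for the signal period")
    return macd_line[-len(signal_line):], signal_line


# The configuration from the text: WMA(9) - HMA(18), smoothed by EMA(11).
macd_line, signal_line = custom_macd([10.0] * 60, 9, 18, 11, "WMA", "HMA", "EMA")
```

Swapping the three type arguments is enough to obtain a different MACD variant, which is the kind of flexibility the grammar exposes to the evolutionary search.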
Solution = SMain + SAux
Fitness ←0
S_x ← Obtain_signals(SMain, T_B, T_S)
X ← Return(S_x)
S_y ← Obtain_signals(SAux, T_B, T_S)
Y ← Return(S_y)
Return_i ← X
Return_i ← Y
Fitness ← Fitness + Return_i
Combining strategies over a period
In line with the previous feature, the ATS implements the ability to apply different sets of rules over the same period (multi-strategies). The additional set of rules is applied as a strategy for stopping losses. The ATS selects a strategy depending on the behavior of the performed investments. First, the grammar provides a main set of rules (SMain), which has been optimized for the training period. Second, the grammar provides an auxiliary set of rules (SAux), which has been optimized over a variable interval that is always smaller than the training period. Thus, when the main strategy does not work properly in a given period, the ATS switches to the auxiliary investment strategy. Therefore, the ATS can adapt to sudden changes in the behavior of stock markets. Additionally, this feature allows the ATS to optimize the size of the historical data used (see Algorithm 3).
Let us consider a possible solution in the following general form:
where $type_n$ and $params_k$ are, respectively, the types and the parameters optimized by the GE algorithm. The expression <range> indicates a fragment of the period analyzed by our ATS. Let us focus on the parameter <range> and suppose a value of 56 for it. Of course, a solution also provides values for the rest of the parameters, but for the sake of clarity they are left in their general form. Within the notation of Algorithm 3, we can identify SMain with the MACD rule of the solution and SAux with its MAC rule, which is evaluated over the last 56 days.
The return of day i for SolutionExample is calculated following Algorithm 3. Thus, we calculate the MAC values of the last 56 days. If the auxiliary strategy (SAux) fits better than the general strategy (SMain), the system uses the MAC; otherwise the system uses the MACD. The auxiliary strategy allows the system to stop losses and react to repetitive patterns over short periods, which are not exploited by the general strategy. We note that the work-flow is generic to any solution, since SMain and SAux can be any possible combinations of indicators.
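One possible reading of this switching rule is sketched below. It assumes that "fits better" means a higher summed return over the trailing window of `window` days (56 in the example), that only days before day i are inspected, and that the daily returns of both strategies are already available; these are illustrative assumptions, not the definition used in Algorithm 3.

```python
from typing import List, Sequence


def combined_returns(main_ret: Sequence[float],
                     aux_ret: Sequence[float],
                     window: int) -> List[float]:
    """Per-day return of a strategy that switches between SMain and SAux."""
    out: List[float] = []
    for i in range(len(main_ret)):
        lo = max(0, i - window)
        x = sum(main_ret[lo:i])   # trailing performance of SMain (before day i)
        y = sum(aux_ret[lo:i])    # trailing performance of SAux (before day i)
        # Assumed rule: follow SAux only when it has recently done better.
        out.append(aux_ret[i] if y > x else main_ret[i])
    return out


main = [0.01, -0.02, -0.02, 0.01]   # hypothetical daily returns of SMain
aux = [0.00, 0.01, 0.01, -0.01]     # hypothetical daily returns of SAux
daily = combined_returns(main, aux, window=2)
fitness = sum(daily)                # accumulated as in Algorithm 3
print(daily)                        # [0.01, -0.02, 0.01, -0.01]
```

Looking only at past days keeps the sketch free of look-ahead; whether the original system applies the same restriction is not stated in the text.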
Operators
The individuals of the population are codified by integer strings and are evolved using classic operators with the following features: The offspring of the population is generated by the single point crossover operation (probability = 0.85). The mutation operator is implemented using the well-known integer flip mutation (probability = 0.02). We use also the distinctive operator of Grammatical Evolution, the wrapping operator (wrapping = 2).
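A minimal sketch of the two classic operators on integer genomes is given below, using the probabilities stated above. The function names, the 0 to 255 codon range (8-bit codons) and the use of a shared cut point for both parents are assumptions for illustration.

```python
import random
from typing import List, Tuple


def single_point_crossover(a: List[int], b: List[int], rng: random.Random,
                           p_c: float = 0.85) -> Tuple[List[int], List[int]]:
    """Swap the tails of two parents after a random cut point."""
    if rng.random() >= p_c or min(len(a), len(b)) < 2:
        return list(a), list(b)              # no crossover: copy the parents
    cut = rng.randint(1, min(len(a), len(b)) - 1)
    return a[:cut] + b[cut:], b[:cut] + a[cut:]


def int_flip_mutation(genome: List[int], rng: random.Random,
                      p_m: float = 0.02, max_codon: int = 255) -> List[int]:
    """Replace each codon by a new random integer with probability p_m."""
    return [rng.randint(0, max_codon) if rng.random() < p_m else g
            for g in genome]


rng = random.Random(42)
parent_a = [rng.randint(0, 255) for _ in range(10)]
parent_b = [rng.randint(0, 255) for _ in range(10)]
child_a, child_b = single_point_crossover(parent_a, parent_b, rng)
child_a = int_flip_mutation(child_a, rng)
child_b = int_flip_mutation(child_b, rng)
```

Because the phenotype is obtained through the grammar mapping, a single mutated codon can change every production chosen after it, so low mutation rates are common in GE.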
Fitness functions and the multi-objective approach
The fitness function is responsible for assigning a value that represents the merit of a particular individual. When applying EAs to finance problems we have a wide range of fitness functions; indeed, any trading strategy performance criterion can serve as a method of evaluating individuals. In the literature there are functions such as the K ratio [23], the maximum draw-down [24] or the pessimistic return on margin (PROM) [24], to mention just some of them. The ATS evaluates the trading strategies with three fitness functions: the accumulated return (AR), the Sharpe index (SI) and the correlation coefficient between the equity curve of a strategy and the perfect profit of the market (CECPP).
The AR is the aggregated return at the end of the trading period. The AR is calculated according to Equations 1 and 2, where $DR_i$ denotes the return of day $i$ and $P_i$ is the closing price of day $i$.
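As an illustration only, the following sketch computes an accumulated return under two assumptions that are not confirmed by the text: the daily return is the simple relative change of the closing price, $(P_i - P_{i-1})/P_{i-1}$, and daily returns are accumulated by addition while the strategy is invested. The function names and the 0/1 position encoding are hypothetical.

```python
from typing import List, Sequence


def daily_returns(prices: Sequence[float]) -> List[float]:
    """Simple daily returns (assumed form): (P_i - P_{i-1}) / P_{i-1}."""
    return [(p1 - p0) / p0 for p0, p1 in zip(prices, prices[1:])]


def accumulated_return(prices: Sequence[float],
                       positions: Sequence[int]) -> float:
    """Sum of the daily returns earned while invested.

    positions[i] is 1 when the strategy holds the asset between day i and
    day i + 1, and 0 when it is out of the market (no return is earned).
    """
    return sum(pos * r for pos, r in zip(positions, daily_returns(prices)))


prices = [100.0, 110.0, 99.0, 108.9]
print(accumulated_return(prices, [1, 0, 1]))  # about 0.2: the loss day is skipped
```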
The Sharpe index (SI), proposed in Ref. [25] (Equation 3), measures the excess return per unit of deviation in an investment asset or a trading strategy. The top half of the index is related to the funds returned over a set period AR, and subtracts the FR as the return that an investor could have earned in a risk-free investment, typically defined as the return of the Treasury bills over the same period. The denominator is the standard deviation of the returns (σ), which measures the deviation of the profits from its average performance which is an indirect measure of the volatility of the strategy. In order to evaluate correctly the population of individuals, we modify the original expression incorporating an absolute value on the denominator of Equation 3. The aim of this refinement is to provide the correct rankings to individuals which provided periods of negative excess returns.
In Equation 3, FR is the risk-free return and $\sigma_{AR}$ represents the standard deviation of the returns.
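A hedged sketch of the modified index follows. It assumes the form $(AR - FR)/|\sigma_{AR}|$ suggested by the description above, takes AR as the sum of the daily returns, uses the population standard deviation, and returns 0 when the deviation is zero; all of these are choices made for the example rather than details given in the text.

```python
from statistics import pstdev
from typing import Sequence


def sharpe_index(daily: Sequence[float], fr: float) -> float:
    """Excess accumulated return per unit of deviation of the daily returns."""
    ar = sum(daily)            # accumulated return (assumed additive)
    sigma = pstdev(daily)      # population standard deviation (assumption)
    if sigma == 0:
        return 0.0             # assumed convention for a flat return series
    return (ar - fr) / abs(sigma)


print(sharpe_index([0.01, 0.03], fr=0.01))  # about 3.0
```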
Finally, we define the CECPP in Equation 4, which was presented in Ref. [24]. The perfect equity is a theoretical measure of market potential, which is related to the perfect investment in a period (maximum return possible). The equity curve is just the representation of the investment strategy to be evaluated. The CECPP is the correlation between both.
In Equation 4, X is the perfect equity curve, Y is the real equity curve and σ represents the standard deviation.
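The correlation can be sketched as a Pearson coefficient between the two curves. Here the perfect equity curve is assumed to be the running sum of absolute daily price changes (every move captured in the right direction), and positions are encoded as +1, 0 or -1; both are assumptions for this example and may differ from the construction in Ref. [24] and from the ATS itself.

```python
from itertools import accumulate
from statistics import mean, pstdev
from typing import List, Sequence


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation: cov(X, Y) / (sigma_X * sigma_Y)."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y)) / len(x)
    sx, sy = pstdev(x), pstdev(y)
    if sx == 0 or sy == 0:
        return 0.0             # assumed convention for a flat curve
    return cov / (sx * sy)


def perfect_equity(prices: Sequence[float]) -> List[float]:
    """Cumulative profit of capturing every daily move (assumed definition)."""
    return list(accumulate(abs(b - a) for a, b in zip(prices, prices[1:])))


def equity_curve(prices: Sequence[float], positions: Sequence[int]) -> List[float]:
    """Cumulative profit of the evaluated strategy."""
    return list(accumulate(pos * (b - a)
                           for pos, a, b in zip(positions, prices, prices[1:])))


def cecpp(prices: Sequence[float], positions: Sequence[int]) -> float:
    return pearson(perfect_equity(prices), equity_curve(prices, positions))


prices = [10.0, 12.0, 11.0, 13.0]
print(cecpp(prices, [1, -1, 1]))   # strategy matches every move: about 1.0
print(cecpp(prices, [-1, 1, -1]))  # strategy misses every move: about -1.0
```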
It is worth mentioning that, in our ATS, neutral signals do not provide any return. Usually, the return assigned to neutral signals is the risk-free return given by the target country of the analysis (e.g. the US Treasury Bills). Therefore, this represents a disadvantage with respect to practitioners or other ATSs. Furthermore, we consider transaction costs both in the training period and in the final strategies. Trading costs are the expenses incurred when buying or selling securities, and they are important because of their relation to the net returns: the more operations in the market, the higher the related expenses. Transaction costs depend on the brokers, the capital invested and the stock exchanges where we invest. On the one hand, the implementation of real trading costs could be inefficient. On the other hand, ignoring transaction costs would give an unfair advantage to our experiments. We solve the problem by setting a fixed commission fee of 0.1% on each operation.
In the literature, some authors point out the lack of proficiency of the AR function in the task of evaluating investments [24]. The AR function ignores a determinant factor in the evaluation of investments, namely risk valuation. Although the trading system itself uses TIs that support the measurement of risk in some way (e.g. the supports and resistances), a highly volatile period requires a better risk assessment. In the search for a better risk assessment, we have introduced SI and CECPP.
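The impact of the fixed commission can be illustrated with a short sketch. It assumes that the 0.1% fee is charged on the traded amount both when buying and when selling; how the fee is actually applied in the experiments is not specified, and the function name is hypothetical.

```python
def net_trade_return(entry_price: float, exit_price: float,
                     fee: float = 0.001) -> float:
    """Return of one buy-then-sell round trip after commissions."""
    cost_in = entry_price * (1 + fee)    # price paid, including the buy fee
    proceeds = exit_price * (1 - fee)    # amount received, net of the sell fee
    return proceeds / cost_in - 1


print(net_trade_return(100.0, 100.0))  # flat trade loses roughly 0.2%
print(net_trade_return(100.0, 110.0))  # a 10% gross gain nets slightly less
```

Under this assumption each round trip costs about 0.2%, so a strategy that trades often needs a correspondingly higher gross return to remain profitable.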
ATSs are usually guided by a single fitness function, but different environments may require different factors and objectives, and financial studies differ in the importance they assign to each objective. The ATS implemented in this study includes a novel methodology able to provide an optimal trade-off among different objectives by integrating a multi-objective optimization process. It provides solutions that fit better the different behaviors of industrial sectors or companies. We implement a GE-based version of the well-known non-dominated sorting genetic algorithm NSGA-II [26]. We have chosen NSGA-II since it works quite well with a reduced number of objectives, which can be easily added or removed. Thus, we evaluate and sort by dominance, obtaining a set of non-dominated solutions as the best individuals. The procedure of the approach, based on NSGA-II and GE, is shown in Fig. 3. The differences between the multi-objective and the classic GE methodology lie in the evaluation and selection processes. Instead of evaluating the population with a single objective, we use several fitness functions. A population of k individuals is sorted and divided into non-dominated fronts $F_i$, where i ranges from 1 to n. The front $F_1$ contains the best individuals: the solutions codified in these individuals are not dominated by any other solution in the population. After the evaluation, the algorithm proceeds to select the best individuals for the next generation. The method transfers k/2 individuals from the fronts to the tentative new population, front by front. If a front $F_m$ would overflow the k/2 individuals, and therefore cannot be fully transferred, its members are sorted according to their crowding distance [26] and selected in descending order of that value. Next, the common grammatical evolution operators are applied.
The process is repeated until the maximum number of generations is achieved.
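The evaluation and selection steps can be sketched as follows. This is a simplified illustration that assumes all objectives are maximized and uses a straightforward repeated-filtering sort instead of the fast non-dominated sort of NSGA-II; the function names and the example scores are hypothetical.

```python
from typing import Dict, List, Sequence, Tuple

Score = Tuple[float, ...]


def dominates(a: Score, b: Score) -> bool:
    """a dominates b if it is no worse in every objective and better in one."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def non_dominated_fronts(scores: Sequence[Score]) -> List[List[int]]:
    """Split individual indices into fronts F_1, F_2, ... (best first)."""
    remaining = set(range(len(scores)))
    fronts: List[List[int]] = []
    while remaining:
        front = sorted(i for i in remaining
                       if not any(dominates(scores[j], scores[i])
                                  for j in remaining if j != i))
        fronts.append(front)
        remaining -= set(front)
    return fronts


def crowding_distance(scores: Sequence[Score], front: List[int]) -> Dict[int, float]:
    """Larger values mean the individual lies in a less crowded region."""
    dist = {i: 0.0 for i in front}
    for m in range(len(scores[front[0]])):
        ordered = sorted(front, key=lambda i: scores[i][m])
        lo, hi = scores[ordered[0]][m], scores[ordered[-1]][m]
        dist[ordered[0]] = dist[ordered[-1]] = float("inf")  # keep extremes
        if hi == lo:
            continue
        for prev, cur, nxt in zip(ordered, ordered[1:], ordered[2:]):
            dist[cur] += (scores[nxt][m] - scores[prev][m]) / (hi - lo)
    return dist


def select(scores: Sequence[Score], n_keep: int) -> List[int]:
    """Fill the next population front by front, breaking ties by crowding."""
    chosen: List[int] = []
    for front in non_dominated_fronts(scores):
        if len(chosen) + len(front) <= n_keep:
            chosen.extend(front)
            continue
        dist = crowding_distance(scores, front)
        by_distance = sorted(front, key=lambda i: dist[i], reverse=True)
        chosen.extend(by_distance[:n_keep - len(chosen)])
        break
    return chosen


# Hypothetical (AR, SI) scores of seven individuals.
scores = [(1, 5), (2, 4), (3, 3), (4, 2), (5, 1), (1, 1), (2, 2)]
print(non_dominated_fronts(scores))  # [[0, 1, 2, 3, 4], [6], [5]]
print(select(scores, 3))             # extremes of the first front are kept first
```

Calling `select(scores, len(scores) // 2)` corresponds to the k/2 transfer described in the text.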
The experimental results presented in this section are organized in four strands. In the first batch of experiments we compare (a) the GE approach, (b) the buy and hold strategy and (c) a previous genetic algorithm implementation published in the literature. The second batch of experiments tests the ATS with the three fitness functions proposed. The third batch of experiments executes several combinations of multi-objective functions. Finally, the last batch of experiments studies the feasibility of including an industry analysis to identify macroeconomic trends.
Dataset: Europe in recession
In summary, the ATS invests in the market through the analysis of historical data. Determining the data periods used to train and validate the ATS is a key decision. The performance of the ATS is biased by the stability and the profitability of the market in the analyzed periods. Thus, it should be easier to achieve better returns in a rising market than in a falling market. So, the challenge of building a trading system able to predict the intricate behavior of the stock market becomes even more difficult and interesting in a hostile environment. We study the performance of the implemented ATSs in one of the most hostile scenarios, the last economic recession. Our current dataset spans from 2001 to 2013, that is, it includes one of the most important economic crises. The data are collected from the most important indexes of Europe, which has been one of the continents with the most financial problems during the economic recession. The companies that compose our dataset are listed in Orbis https://orbis.bvdinfo.com/ and have been chosen according to the following criteria: (i) publicly listed companies in Europe; (ii) active companies with at least the last 5 years of historical data in the stock market; (iii) current market capitalization greater than 25 million euros; (iv) excluding pension funds; (v) excluding financial firms (financial and insurance activities); (vi) excluding public administration; (vii) excluding activities of extraterritorial organisms.
The historical time series have been collected using Bloomberg software http://www.bloomberg.com/. The time series provide prices and volumes at a daily frequency. Additionally, we note that the companies used for our experiments were preselected randomly; therefore, in practice we must diversify our investment over a wide range of companies or base the decision on expert opinion.
Comparing ATSs based on genetic algorithms and grammatical evolution
The first set of experimental results studies the returns provided by different trading strategies over a period of economy recession. The objective of this section is not only to validate and test the implemented ATSs, but also to compare it with a validated approach that uses an alternative optimization methodology. Besides the presented ATS, we include two additional trading strategies. First, we introduce a previous approach presented in Ref. [27]. This work proposed an ATS based on genetic algorithms (GA). It used a similar set of TIs and the AR fitness function. The ATS was validated with historical data from the Standard & Poor’s 500 (S&P500) from 1996 to 2006. The ATS provided annual average returns of 870%. Second, we include the buy and hold investment strategy (B&H) where an investor buys stocks and holds them for the complete period.
To offer a fair comparison between both ATSs, we include some implementation modifications for the first set of experiments. On the one hand, since the ATS based on GA is guided only by the AR function, the ATS based on GE is restricted to using the AR as its only fitness function. On the other hand, the set of indicators implemented in the ATS based on GA was adjusted so that both systems use the six TIs defined in Section 2.1.
The results of this batch of experiments are shown in Fig. 4. The ATS executions focus on a portfolio of Spanish companies, which are listed in one of the markets most affected by the recession. The portfolio consists of 43 companies, i.e., all the Spanish companies listed in our database (Section 4.1). The horizontal axis presents the companies with the related investment strategies. The vertical axis shows the aggregated returns of the companies. The ATS was trained with data from 2001 to 2011. We obtain average returns of –23.63% for the Buy and Hold strategy, 5.89% for the GA approach and 21.08% for the GE approach. The average numbers of operations with positive returns are 8, 28 and 29, respectively.
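The two kinds of figures reported here, returns and counts of profitable operations, can be derived from a list of per-operation returns. The following is a minimal sketch under the assumption that percentage returns are aggregated by simple summation; the text does not specify the exact aggregation, and the function and key names are hypothetical.

```python
def summarize_operations(returns_pct):
    """Summarize a list of per-operation percentage returns.

    Assumption: returns are aggregated by plain summation, which is an
    illustrative choice and not necessarily the one used in the experiments.
    """
    positives = [r for r in returns_pct if r > 0]
    negatives = [r for r in returns_pct if r < 0]
    return {
        "positive_ops": len(positives),
        "negative_ops": len(negatives),
        "aggregated_return": sum(returns_pct),
        "negative_return": sum(negatives),
    }
```

Reporting the negative operations separately, as done in the following experiments, shows how much of the aggregated return is lost to failed investments.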
Testing ATSs based on mono-objective versions
After comparing the performance of the GE- and GA-based ATS implementations, we focus our experiments on the GE approach. The second batch of experiments extends the first one using three ATS versions. Each ATS uses a different implementation of the fitness function: AR, SI and CECPP. In order to study the ATS operation in other countries, we change the sample of companies used in the previous section. The ATS is validated with historical data from 41 randomly selected companies, which are listed in the following countries: Germany, United Kingdom, Spain and France.
Figure 5 shows the results of a series of investments in 41 European companies. As in the previous experiments, the ATS was trained with data from 2001 to 2011 and validated in 2012. The Y axis shows the returns obtained. The horizontal axis presents the fitness functions and the companies which have contributed to the experiment. The average returns provided by the system are 10.94% for the SI, 40.79% for the AR and 20.32% for the CECPP.
Despite obtaining the lowest return, SI is the strategy with the lowest number of operations providing negative returns. The SI provides an average of 12 unprofitable investments in a year, obtaining an average return of –0.7%. The ATS implementing the AR scores a mean of 16 operations with negative returns per year, which provides a total of –6%. Finally, the CECPP shows the worst behavior. It provides 18 failed investments with a negative return of –8%, while the average return slightly exceeds 20%.
The conclusions drawn from the results in Fig. 5 can be summarized as follows: (i) we do not recommend using CECPP without analyzing the effects of this fitness function; (ii) we will use SI when the portfolio is composed of few companies or is selected without any specific criteria; (iii) we will use AR if we want to diversify the investments across a large set of companies or we have expert knowledge about the portfolio analyzed.
A detailed study of the figure shows that, although in general terms SI and AR provide higher confidence, the CECPP achieves good results in some cases where the other functions show worse results. Thus, the CECPP provides some features able to obtain returns where SI and AR are unprofitable. This observation encouraged us to implement a multi-objective approach which tries to capture the advantages of several fitness functions, aiming to improve the final returns (it was presented in Section 3.6).
Testing ATSs based on multi-objective versions
The third batch of experiments was conducted to test the multi-objective approach. As far as we know, the system based on GE and NSGA-II (described in Section 3.6) is the first system combining a multi-objective approach and a GE methodology in finance.
The multi-objective approach is tested using two ATS versions. Each ATS uses a different combination of fitness functions. First, we use a multi-objective approach based on SI, AR and CECPP (3 objectives). Second, we select SI and CECPP to build the second ATS (2 objectives). The second combination is chosen in order to take advantage of the complementary nature of both fitness functions. As in the previous section, the ATS is validated with historical data from 41 companies listed in Germany, United Kingdom, Spain and France.
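Multi-objective methods such as NSGA-II compare candidate solutions by Pareto dominance rather than by a single score. The sketch below shows only that comparison and the extraction of the first non-dominated front; it omits the remaining parts of NSGA-II (further fronts, crowding distance, selection) and assumes every objective is to be maximized. The fitness vectors in the usage note are invented for illustration.

```python
def dominates(a, b):
    """True if vector a is at least as good as b in every objective
    and strictly better in at least one (all objectives maximized)."""
    return (
        all(x >= y for x, y in zip(a, b))
        and any(x > y for x, y in zip(a, b))
    )


def non_dominated(points):
    """Return the points that no other point dominates (first Pareto front)."""
    return [p for p in points if not any(dominates(q, p) for q in points)]
```

With two objectives, the candidates `(1.0, 5.0)`, `(3.0, 3.0)` and `(5.0, 1.0)` are mutually non-dominated, whereas `(2.0, 2.0)` is dominated by `(3.0, 3.0)` and is left out of the front. This is how a strategy that is weak on one fitness function but strong on another can survive selection.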
Figure 6 presents the returns (Y-axis) of the two multi-objective approaches and the B&H strategy. As before, the ATS was trained with data from 2001 to 2011 and validated in 2012. The investments are performed over the 41 European companies used in the previous experiment (X-axis). The approach using 3 objectives reaches an average return of 11.07%, and the version using 2 objectives achieves an average return of 23.31%. First, the version using all the fitness functions provides just a slight increase over the mono-objective version using SI. However, it produces 15 negative operations, which yield an average return of –2%. Therefore, this approach does not improve on any result obtained in the previous experiments. Second, the 2-objective version achieves the lowest number of negative operations together with an increase in the returns: it produces a total of 10 negative operations and ranks second in terms of average returns (23%). It is worth mentioning that, despite the lower number of operations with negative results, its average negative return exceeds that of the mono-objective approaches, reaching –4%; this figure is large mainly because one company almost reaches an average return of –90%.
Testing the feasibility of the industrial analysis
The last batch of experiments is a proof of concept. It focuses on testing the feasibility of implementing a macroeconomic analysis (industrial analysis) in the framework of an ATS. Stock markets are formed by many companies engaged in many different activities. Events such as the economic recession, the construction “boom”, the rising cost of oil, droughts, etc. affect each industrial sector differently. A study showing these trends was presented in Ref. [28]. However, these events have similar effects on companies belonging to the same industrial sector. Thus, some companies are more interrelated than others; for example, the behavior or trends of BMW and Volkswagen are more dependent on each other than those of BMW and Google. We perform an additional series of experimental executions with the aim of verifying the existence of trends among the existing industrial sectors. An ATS able to identify the most attractive sectors for investment could take advantage of this information to restrict the universe of potential companies. We conduct the same experiments as in the previous sections, but preselecting companies according to the statistical classification of economic activities in the European Community (NACE, second revision). The industrial sectors used in the experimental results are: C – Manufacturing; D – Electricity, gas, steam and air conditioning; F – Building; L – Real estate activities; and G – Wholesale and retail trade, except repair of motor vehicles.
Figure 7 shows the returns (Y axis) of the six industrial sectors selected (six companies per sector) using 4 of the approaches implemented (X axis). The approaches used are the AR and CECPP mono-objective versions, and the two multi-objective versions based on 3 and 2 objectives. The results show very different returns depending on the sector invested in. The most profitable sector is the G sector, which is related to wholesale and is composed of companies such as Adolfo Dominguez or Inditex. Therefore, there is a general trend within a specific industry. A trading system able to analyze the sectors and choose the most profitable one could obtain generous returns. Although this experiment is not conclusive, it supports the idea that trends of industrial sectors can be exploited to build a portfolio and increase the returns. This will be our next step of research.
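The sector comparison described above amounts to grouping companies by their NACE section and comparing average returns. The following sketch assumes records of the form (company, NACE section letter, return in percent); the record layout, the function names and any sample values are hypothetical and are not results from Fig. 7.

```python
from collections import defaultdict


def mean_return_by_sector(records, sectors):
    """Average return per NACE section, restricted to the given sections.

    records: iterable of (company, nace_section, return_pct) tuples.
    """
    buckets = defaultdict(list)
    for _company, section, return_pct in records:
        if section in sectors:
            buckets[section].append(return_pct)
    return {s: sum(v) / len(v) for s, v in buckets.items()}


def best_sector(records, sectors):
    """Section with the highest average return among the selected ones."""
    means = mean_return_by_sector(records, sectors)
    if not means:
        raise ValueError("no company falls in the selected sectors")
    return max(means, key=means.get)
```

A system that restricts its universe of companies to the section returned by `best_sector` on the training period would be one simple way to exploit sector trends, although whether such trends persist into the validation period is exactly what remains to be studied.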
Related work
An ATS is a computerized system that automatically submits trading orders to an exchange. The early projects developing ATSs were difficult, although today the use of computers to automate features of the investment process plays an important role. An ATS defines the investment problem as the process of maximizing the risk-adjusted return for a specific time period when taking long or short positions in a financial asset (in our case a stock). The performance of investment decisions in stock markets is influenced by a wide range of factors: political, macroeconomic, regulatory, local, international, etc. Since these factors are uncertain, there is no single, perfect rule with a combination of parameters or threshold values that can be used to maximize future returns. Market investments are addressed by a broad set of potential rules which combine multiple factors. These factors are represented by indicators and ratios, which in turn are driven by a range of parameters. In summary, the investment problem consists of two basic steps: first, finding the best combination of variables; second, fine-tuning the parameters for the chosen set of variables. Thus, the input set of rules for an ATS is built with the indicators and ratios to be used as investment criteria and their related parameters, while the output is the return obtained.
An ATS can be a simple system using only a particular technical indicator, such as the moving average, or a complex system based on methodologies such as Fibonacci retracement [29], linear regression [30], neural networks [31], fuzzy logic [32], genetic programming [33], etc. In this section we present a brief overview of the related work in the literature on both classical and bio-inspired trading systems.
In the work of Ratner and Leal [34] we can find a comparative study of ten static strategies based on moving averages (MAs) and the buy and hold strategy (B&H). This work is complemented in Ref. [35], where the authors compare strategies based on static MAs against dynamic MAs which respond to market volatility. All of these works apply only to MAs and use a reduced range of parameters.
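To make the notion of a static MA strategy concrete, the sketch below generates signals from the crossover of two simple moving averages. It is a generic textbook rule, not the specific strategies evaluated in Refs. [34] or [35]; the window lengths and the +1/-1/0 signal encoding are assumptions made for illustration.

```python
def sma(prices, window):
    """Simple moving average; None until a full window is available."""
    if window <= 0:
        raise ValueError("window must be positive")
    out = []
    for i in range(len(prices)):
        if i < window - 1:
            out.append(None)
        else:
            out.append(sum(prices[i - window + 1 : i + 1]) / window)
    return out


def crossover_signals(prices, short, long):
    """+1 when the short MA is above the long MA, -1 when below, else 0."""
    if short >= long:
        raise ValueError("short window must be smaller than long window")
    signals = []
    for s, l in zip(sma(prices, short), sma(prices, long)):
        if s is None or l is None or s == l:
            signals.append(0)
        elif s > l:
            signals.append(1)
        else:
            signals.append(-1)
    return signals
```

The two window lengths are the kind of parameters that a static strategy fixes in advance and that an evolutionary approach would instead search over.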
GE was presented as a smart solution that fits well in the context of the complexity of stock markets. However, we note that GE is not limited to the financial environment and has been used in many areas. For example, Moore uses GE to generate optimal biochemical network models in Ref. [36]. Additionally, the literature offers GE applications to solve trigonometric identities [37], to optimize dynamic memory [38] and even to compose music automatically [39]. Previous works have already used GE as a methodology to optimize investments. The NCRA Group at the University of Edinburgh has done excellent work on this topic. For example, in Ref. [40] the authors propose a GE to evolve a financial trading system. In this approach the authors show an adaptive grammar with a variant of the moving window. The rules of the grammar evolve constantly during the execution of the trading system as new data arrive. Other works of this group show the proficiency of GE in the foreign-exchange markets or in different indexes of the stock market, as in Ref. [5] or Ref. [6], among others. Other authors have also proposed ATSs based on GE, as in Ref. [7], where the authors develop a system using co-evolution of signals and stop-loss conditions for short and long positions. This work was updated in Ref. [41], where they replace the fitness function with a complex fitness function proposed in Ref. [8]. Other approaches have focused on the generation of new indicators, such as the EDDIE project [20, 21]. The EDDIE 8 software [22] introduced the ability to generate new TIs. It improved the previous results despite the convergence problems triggered by the large space of solutions. Other works have used multi-objective optimization to trade stocks. For example, in Ref. [42] the authors claim high profits with a multi-objective version of an ATS based on the AR and the SI. More recently, Ref. [43] and Ref. [3] showed a multi-objective genetic algorithm in which the parameters of several technical indicators were optimized whenever new data were received.
Conclusions and future research
In this paper we have presented the development of an automated trading system (ATS) able to analyze large amounts of historical prices and volumes as a source of information. We have developed an ATS based on grammatical evolution (GE) which is capable of generating complex strategies. The implemented features of our ATS, such as the capacity to build its own technical indicators, exhibit good performance where other evolutionary approaches failed. Furthermore, we have introduced a novel multi-objective optimization method based on the non-dominated sorting genetic algorithm (NSGA-II) and GE. The multi-objective approach demonstrated high returns (average return of 20%) in very volatile periods, thus combining some of the best features of the three fitness functions employed. The future research that should follow this paper is related to the last batch of experiments. As these have suggested, combining macroeconomic and microeconomic analyses could significantly increase the benefits and reduce the risk of losses in ATSs. We are developing an ATS able to perform a hybrid analysis of technical, fundamental and macroeconomic factors. Other important proposals can be addressed in this research line, such as ranking the utility of the fitness functions, including a grammar production to choose the fitness functions, or extending the set of technical indicators.
The methodology presented here can be applied to other problems. In particular, we are extending the work done on the grammatical and evolutionary process to glucose time series prediction in the framework of the project Development of Adaptive and Bio-inspired Systems for Glycemic Control using Insulin Pumps and Continuous Glucose Monitors in Patients with Diabetes Mellitus and SMART Diabetes. The adaptation to the conditions of the patient is analogous to the process of adapting solutions to the market conditions followed in this paper.
Footnotes
1
The modulo operator calculates the remainder of division of one number by another.
Acknowledgments
This work was partially supported by the Spanish Government Minister of Science and Innovation under grant TIN2014-54806-R and IPT-2011-1198-430000 and the People program (Marie Curie Actions) of the European Union Seventh Framework Program (FP7/2007-2013) with the agreement no. 600388 of the REA and the Agencia per a la Competitivitat de l’Empresa (ACCIÓ). J.I.Hidalgo also acknowledges the support of the Spanish Ministry of Education mobility grant PRX16/00216.
