A hybrid automated trading system based on multi-objective grammatical evolution

Abstract

This paper describes a hybrid automated trading system (ATS) based on grammatical evolution and microeconomic analysis. The proposed system takes advantage of the flexibility of grammars to introduce and test novel characteristics. The ATS introduces the self-generation of new technical indicators and multi-strategies for stopping unforeseen losses. Additionally, this work presents a novel optimization method that combines multi-objective optimization with a grammatical evolution methodology. We implemented the ATS with three different fitness functions under three mono-objective approaches, together with two multi-objective ATSs. The experiments compare them with the Buy and Hold strategy and a previous approach, beating both in returns and in number of positive operations. In particular, the multi-objective approach achieved returns of up to 20% in very volatile periods, which indicates that the combination of fitness functions is beneficial for the ATS.

Keywords

Time series; stock market; technical analysis; grammatical evolution; multi-objective optimization; automatic trading systems

1 Introduction

In July 2015, the World Federation of Exchanges (www.world-exchanges.org/) estimated the total market capitalization of the major equity markets of the world at $62.4 trillion, and it is projected to reach $284.2 trillion in 2030. These figures give an idea of the critical role played by stock markets and of the importance of developing automatic trading systems (ATSs) able to operate successfully in the markets. Since the first trading systems appeared some decades ago, research and development in this area has grown explosively. The growing availability of data on financial markets and companies, together with the increasing complexity of the socio-economic and financial environment, makes real-time investment decisions in stock markets more difficult. The huge number of potentially interrelated factors affecting financial assets, and their changing patterns over time, mean that the investment process remains a challenge.

Financial investments are affected by factors such as government policies, natural factors, international trade, market sentiment and political factors. Thus, it is very difficult to follow the flow of information and then infer its consequences. Furthermore, traders are affected by what is known as behavioral finance [1]. Brokers, and practitioners in general, can be affected by human emotions, so their behavior in the stock market is not objective. The main reason is the high pressure induced by handling a large volume of money, which can trigger loss aversion, overconfidence, overreaction and other behavioral biases.

The importance of information technology (IT) has grown drastically in the vast majority of areas. IT has become an important component of new capital investment systems, and economists look to computers as the best hope for a sustainable increase in economic growth rates. Furthermore, the development of IT and telecommunications has promoted the emergence of global processes. Globalization is symbolized, for instance, by the tremendous development of the “hedge funds industry”, a class of funds which invest in any kind of asset around the world (stocks, indexes, bonds, commodities, currencies, etc.). This new level of interconnectivity plays out in our financial markets, where problems in one market have inescapable and often unpredictable effects on the rest of the markets worldwide.

ATSs are rule-based predictive engines that use market, business or macroeconomic information. The rules are embedded in algorithms that seek the best combinations of them to drive stock trades, with the aim of obtaining the maximum possible return for a period. Finding an optimal series of investment decisions involves exploring the related search space, whose complexity has increased as more features have been added to the trading engines. Thus, ATSs have evolved from very simple if-then algorithms to more sophisticated models that use methods such as artificial intelligence, chaos theory, fractals and evolutionary algorithms. This environment points to meta-heuristics as one of the best approaches to find optimal solutions.

Evolutionary meta-heuristics, commonly referred to as evolutionary algorithms, are a set of search and optimization methodologies inspired by principles and theories from biology. A large body of previous studies documents the use of evolutionary algorithms to design and optimize automated trading systems for the stock market ([2–8], among others). The application of meta-heuristics to ATSs has undergone a fast development in both the scientific and the professional world. This work is focused on a relatively new meta-heuristic methodology referred to as grammatical evolution (GE). The ultimate aim of the work is to build a novel ATS capable of analyzing a high number of companies and providing, as final solution, a program optimized to earn high returns. We introduce and test innovative characteristics such as the self-generation of new technical indicators and multi-strategies for stopping unforeseen losses. Additionally, this work presents a novel optimization method that combines multi-objective optimization with a GE methodology.

The remainder of the paper is organized as follows. Section 2 introduces some economic definitions and the financial indicators used to develop the work. Section 3, the main part of the paper, details the proposed ATS. Section 4 presents the experimental results in four strands. First, we compare the GE approach, the buy and hold strategy and a previous genetic algorithm implementation published in the literature. Next, we test the three implemented fitness functions and the multi-objective approaches. Finally, we run the ATS on several industrial sectors to test the existence of macroeconomic trends. Section 5 briefly reviews the literature on trading systems, especially those using evolutionary algorithms. We conclude the paper in Section 6.

2 Concepts and financial specifications for the automated trading system

This work is based on the adaptive market hypothesis (AMH), a theory about the behavior of the markets proposed by Andrew Lo [9]. The AMH holds that the classical theory cannot reflect the behavior of the market in every case. Within the framework of the AMH, investment strategies are systems that evolve over time as investors learn which strategies work better in different circumstances and gradually implement them. Profitable strategies will progressively disappear, while new price patterns will emerge and new strategies will be developed to exploit them. It has been tested empirically that financial decisions, which are ultimately human decisions, have a heuristic component. This heuristic component makes markets deviate from financial theory.

If we observe investment analysts at exchange markets, the majority of them tend to fall into one of two schools of thought: fundamental analysis or technical analysis. Technical analysis focuses on the price movement of a security and uses it to predict future price movements. Fundamental analysis, on the other hand, looks at a larger set of economic factors, known as fundamentals. Both analyses are commonly used by investors, often at the same time, as they are somewhat complementary methods. For example, many fundamental investors combine their primary knowledge with the technical methodology to decide entry and exit points. Conversely, many technical investors use fundamental techniques to restrict the universe of possible stocks to those of good companies. In this paper we implement a technical analysis methodology for obtaining investment signals, so let us now review some important concepts related to it.

2.1 Technical analysis

Technical analysis is based on the idea that all the information of the market is already reflected in the price of the stock. Thus, price predictions are only extrapolations from historical price patterns. This analysis has shown satisfactory behavior in forecasting trends. In fact, the majority of empirical researchers report high profits, while theoretical evaluations often attribute low predictive power to these strategies [10]. Technical analysis uses technical indicators (TIs), which are variables derived from the price time series of a security. TIs have no predictive reliability by themselves and should be combined with other indicators or investment tools to avoid false signals. Some TIs can exhibit good performance when applied to specific companies or markets and bad results for others. There are countless technical variables; here we only enumerate the TIs used in this paper. For more detailed information on TIs we refer to [11].

Moving average (MA): A simple moving average is a convolution of the function of close prices with a pulse function which represents the period of the moving average. MAs are only used to build the other indicators and have no ability to produce signals by themselves. We use four different MAs:

Simple moving average (SMA)

Weighted moving average (WMA)

Exponential moving average (EMA)

Hull moving average (HMA) [12]
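As an illustration of the four moving averages listed above, the following minimal Python sketch computes each of them on a list of closing prices. The function names, the seeding of the EMA with the first price and the tail alignment used inside the HMA are assumptions made for this example; the paper does not specify these implementation details.

```python
import math

def sma(values, period):
    """Simple moving average: plain mean of each window of `period` values."""
    return [sum(values[i - period + 1:i + 1]) / period
            for i in range(period - 1, len(values))]

def wma(values, period):
    """Weighted moving average: linear weights 1..period, newest value heaviest."""
    weights = range(1, period + 1)
    denom = period * (period + 1) / 2
    return [sum(w * v for w, v in zip(weights, values[i - period + 1:i + 1])) / denom
            for i in range(period - 1, len(values))]

def ema(values, period):
    """Exponential moving average, seeded with the first value (an assumption)."""
    alpha = 2 / (period + 1)
    out = []
    for v in values:
        out.append(v if not out else alpha * v + (1 - alpha) * out[-1])
    return out

def hma(values, period):
    """Hull moving average: WMA(2 * WMA(n/2) - WMA(n), sqrt(n))."""
    half = wma(values, max(1, period // 2))
    full = wma(values, period)
    # Align both series on their most recent values before combining them.
    raw = [2 * h - f for h, f in zip(half[len(half) - len(full):], full)]
    return wma(raw, max(1, int(math.sqrt(period))))
```

All four functions share the signature `(values, period)`, so they can be swapped freely when an indicator is built on top of a moving average.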

Moving average crossover (MAC): A technical indicator formed by two or more moving averages. We use multiple MAs with different periods. The buy and sell signals are triggered by the crosses of two MAs.

Moving average convergence/divergence (MACD): It is an evolution of the MAC strategy [13]. MACD works similarly to MAC but uses a MACD line (the difference between two EMAs) and a signal line (an EMA of the MACD line).

Relative strength index (RSI): It is based on the relative strength factor (RS) of a certain period which compares individual upward or downward movements of successive closing prices [14]. RSI oscillates between 0 and 100. We use RSI to build two indicators:

Overbought/oversold (RSIO): We use the RSI range to identify the overbought and oversold levels. RSI values above the overbought level point to a sell and values below the oversold level point to a buy. The sell signal is generated when the RSI falls back below the overbought level. Similarly, the buy signal is generated when the indicator rises back above the oversold level.

Divergences (RSID): It is based on the signals produced by the indicator when its movement diverges from the price action. A positive divergence means a buy signal. It appears when the RSI builds a positive trend while the price follows a downward trend. Similarly, a negative divergence occurs when the RSI starts a negative trend while the price follows an upward trend.
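The RSI and the overbought/oversold rule can be sketched as follows. This is a minimal example under stated assumptions: it uses plain averages of gains and losses over the window (not Wilder's smoothed variant), and the 70/30 levels are conventional defaults chosen for the illustration, not values taken from the paper. The function names are hypothetical.

```python
def rsi(prices, period=14):
    """RSI from simple averages of upward and downward closing-price moves."""
    changes = [b - a for a, b in zip(prices, prices[1:])]
    values = []
    for i in range(period, len(changes) + 1):
        window = changes[i - period:i]
        gain = sum(c for c in window if c > 0) / period
        loss = sum(-c for c in window if c < 0) / period
        if loss == 0:
            # No downward moves: saturate at 100 (or 50 for a flat window).
            values.append(100.0 if gain > 0 else 50.0)
        else:
            rs = gain / loss  # relative strength factor
            values.append(100 - 100 / (1 + rs))
    return values

def rsio_signals(rsi_values, overbought=70, oversold=30):
    """+1 (buy) when RSI rises back above oversold, -1 (sell) when it
    falls back below overbought, 0 (neutral) otherwise."""
    signals = []
    for prev, cur in zip(rsi_values, rsi_values[1:]):
        if prev > overbought and cur <= overbought:
            signals.append(-1)
        elif prev < oversold and cur >= oversold:
            signals.append(1)
        else:
            signals.append(0)
    return signals
```

In the ATS the period and the two levels would be parameters optimized by the evolutionary process rather than fixed defaults.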

Volume price confirmation indicator (VPCI) [15]. It measures the intrinsic relationship between the prevailing price trend and volume. The price-volume relationship confirms or contradicts the price trend. When volume increases, it confirms price direction; when volume decreases, it contradicts price direction.

Supports and resistances (SR): These are basic concepts of technical analysis that also form a classical indicator. A support is a price level, below the current price, where buying power exceeds selling power. A resistance is the opposite concept.

3 Automated trading systems based on grammatical evolution

3.1 Grammatical Evolution

Grammatical Evolution (GE) is a relatively new evolutionary alternative. This evolutionary computation technique was proposed by C. Ryan, J.J. Collins and M. O’Neill in 1998 [16]. We can summarize GE as a type of evolutionary algorithm designed to evolve computer programs defined by a grammar, usually written in Backus normal form (BNF). The most similar procedure is genetic programming (GP) [17], which is also able to evolve computer programs. Although GP originally used Lisp as the evolving language, there are also many approaches using other languages. The main difference, which makes GE an attractive and elegant solution, is that it does not perform the evolutionary process on a specific language. GE evolves individuals as genetic algorithms (GA) and GP do, and then performs a mapping process to generate programs in any language. GE is therefore attractive for its flexibility, which is closely tied to the great modularity that a well-structured grammar provides. This feature is the main advantage of GE solutions.

BNF is a notation technique for expressing context-free grammars. A BNF specification is a set of derivation rules, expressed in the form: $<symbol> ::= <expression>$

The rules are composed of sequences of terminals and non-terminals. Symbols that appear on the left-hand side are non-terminals, while terminals never appear on a left-hand side. In this sense, <symbol> is a non-terminal and, although this is not a complete BNF specification, <expression> is also a non-terminal, since non-terminals are always enclosed between the pair <>. So, in this case the non-terminal <symbol> will be replaced (indicated by ::=) by an expression. The rest of the grammar must indicate the different possibilities. A grammar is represented by a 4-tuple {N, T, P, S}, where N is the set of non-terminals, T is the set of terminals, P is the set of production rules that map the elements of N to sequences of elements of N and T, and S is the start symbol, which must belong to N. The options within a production rule are separated by a “|” symbol.

We use the individual genotype to map the start symbol into terminals by reading groups (codons) of 8 bits. Each codon is represented by an integer value in the genotype. The mapping process is the transformation from the genotype to the phenotype. Thus, instead of representing the programs as a tree (as in GP), GE uses a chromosome composed of codons (the genes of a GA). Each codon selects a specific rule of the grammar. The chromosome itself is considered the genotype, and the real code derived from the codons is called the phenotype. To decode the genotype, the modulo operator (MOD) is typically used as the mapping operator. The final solution consists of a combination of terminals from T, which are chosen by mapping the individual to the grammar. Figure 1 contrasts the general decoding processes of classical GA and GE individuals, with the goal of clarifying the features of GE chromosomes. The figure compares two similar chromosomes, the work-flows of the decoding processes and two solution instances. As the figure shows, the key difference between both methodologies is the transformation between genotype and phenotype, which in GE is guided by a grammar, as explained above.

During the mapping of the genotype we could reach the end of the chromosome, that is, the method runs out of codons before all the non-terminals are turned into terminals. At this point, there are two basic options. First, we can consider the individual invalid and assign it a very low fitness value so that it stops reproducing. Second, we can wrap the individual, reusing the chromosome from its beginning with the wrapping operator. This operator is inspired by the gene-overlapping phenomenon that has been observed in many organisms [18]. Wrapping allows chromosome structures to be reused to obtain broader rules, which influences the quality and diversity of the generated individuals and makes it an advantageous operator [19] (although some authors disagree).
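The mapping with the modulo operator and the wrapping operator can be sketched in a few lines of Python. The toy grammar below is hypothetical and unrelated to the trading grammar of the paper; the choice of consuming a codon only when a rule offers more than one option is a common convention assumed here, not something the text states.

```python
# Hypothetical toy grammar: each non-terminal maps to its list of options.
GRAMMAR = {
    "<expr>": [["<expr>", "<op>", "<expr>"], ["<var>"]],
    "<op>": [["+"], ["-"]],
    "<var>": [["x"], ["y"]],
}

def map_genotype(genotype, grammar, start, max_wraps=2):
    """Map a list of integer codons to a phenotype string.

    Returns None when the codons (including wraps) run out before every
    non-terminal has been expanded, i.e. the individual is invalid.
    """
    symbols = [start]   # pending symbols, leftmost first
    output = []         # terminals emitted so far
    index = 0           # next codon to read
    wraps = 0           # number of times the chromosome has been reused
    while symbols:
        symbol = symbols.pop(0)
        if symbol not in grammar:       # terminal: emit it
            output.append(symbol)
            continue
        options = grammar[symbol]
        if len(options) == 1:           # no choice to make, no codon read
            choice = options[0]
        else:
            if index >= len(genotype):  # end of chromosome reached
                if not genotype or wraps >= max_wraps:
                    return None
                wraps += 1
                index = 0               # wrapping: reuse the chromosome
            choice = options[genotype[index] % len(options)]
            index += 1
        symbols = list(choice) + symbols
    return " ".join(output)
```

For example, `map_genotype([220, 149, 34, 7, 201, 15], GRAMMAR, "<expr>")` reads six 8-bit codons; each one is reduced modulo the number of options of the current rule, which yields the derivation `<expr>` → `<expr> <op> <expr>` → `x - y`.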

3.2 GE-ATS: Seeking the best investment operations

The Grammatical Evolution based Automated Trading System (GE-ATS) produces solutions combining the TIs described in Section 2: MAC, VPCI, MACD, SR and RSI. A final solution provides the set of TIs to apply. It is noteworthy that the appearance of a particular TI in the solution set is not limited to one occurrence. Each TI can be repeated any number of times, either with the same or with different parameters. Each one of these TIs gives a signal for operating with the assets of the company. The signal types are buying, selling or neutral. These indicators have been selected due to their utility in the professional and academic world of finance. Additionally, the ATS introduces a novelty that expands the diversity of TIs: the possibility of creating new indicators at execution time, which is detailed in Section 3.3.

Algorithm 1 Main process

Require: Set of Technical Indicators (S_TI)

Select Companies

grammar ← trading_grammar.bnf

T_B ← Threshold_buy

T_S ← Threshold_sell

G← #Generations

N← #Individuals

P_c← Crossover Probability

P_m← Mutation Probability

for i = 1 to #Companies do

Optimize TIs parameters by GE (G, N, P_c, P_m)

end for

Ensure: Subset of S_TI & values for parameters

Algorithm 1 describes the main process of the methodology. The algorithm assumes that we are working with a portfolio of companies. The optimization process is performed on every company in that portfolio. The result of this process is an optimized solution per company, which the ATS uses to invest in the market.

Algorithm 2 Optimization of TIs by GE

Require: trading_grammar.bnf

grammar ← trading_grammar.bnf

Pop← Generate (N, Seed)

for i = 1 to G do

Solutions ← Decode (Pop, grammar)

Evaluate (Solutions, T_B, T_S)

Pop← Selection(Pop)

Pop← Crossover(Pop,P_c)

Pop← Mutation(Pop,P_m)

end for

Ensure: Subset of S_TI & values for parameters

Algorithm 2 shows the pseudo-code of the evolutionary optimization process. It is a standard GE algorithm, i.e. an evolutionary process guided by a grammar. Each individual decoded by the grammar provides a set of rules to invest. The fitness per individual is calculated using the accumulated value of the trading signals for the complete historical period of investments.

The grammar of the GE methodology is one of the key elements of the entire system. As in all systems based on GE, the grammar determines the production of the solutions and the complexity of the search space. The main features of the implemented ATS are coded into the grammar. Figure 2 presents a fragment of the defined grammar. Rule I and Rule III codify the implementation of an auxiliary strategy to stop the losses (see Section 3.4). Rule II defines the number of TIs and the weighting factors (see next paragraph). Rule IV indicates the type of TI (see Section 2.1). The remaining rules codify the indicators, both types and parameters.

The grammar presented in Fig. 2 shows how each indicator is associated with a particular weight value (<indicator><weight>). These values represent the importance of each indicator in the investment decision. The default values of the weights are initialized to one. The higher the value of the weight, the higher the number of signals produced by the associated indicator.

To extract the final signal from the accumulated values, the ATS is configured by two preset variables, Threshold_buy and Threshold_sell. The GE-ATS indicates a buy position if the accumulated value of the signals exceeds Threshold_buy, a sell position if the value is lower than Threshold_sell, and a neutral signal if the value lies between both thresholds. The threshold values can be increased or decreased to profile the system behavior. Thus, the program may initially be more aggressive, or more conservative, when it performs investments. After the first GE generations, the GE system can alter both the number of indicators involved in the solution and their respective weights to reduce the dependence on the thresholds. Next, we review two examples for a better understanding of the operation of the thresholds. Let us consider the following threshold configurations:

{A} Buy threshold = 5 Sell threshold = -4

{B} Buy threshold = 20 Sell threshold = -20

The first configuration produces an initially aggressive ATS. The signals produced by the selected TIs can easily exceed the thresholds. Therefore, this configuration provides a higher number of investments. However, if the fitness function assesses a conservative scenario as a more profitable strategy, the solutions will eventually evolve into more conservative strategies, that is, strategies with a reduced number of indicators or indicators with low weighting values. The second configuration represents the opposite situation. Initially, the accumulated value of the signals rarely exceeds the thresholds. Therefore, this configuration provides a lower number of investments. However, if the fitness function assesses an aggressive scenario as a more profitable strategy, the solutions will eventually evolve through the generations towards a more aggressive ATS, that is, strategies with a higher number of indicators or indicators with high weighting values.
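The role of the thresholds can be illustrated with a short Python sketch. It assumes that each indicator signal is coded as +1 (buy), -1 (sell) or 0 (neutral) and that the accumulated value is the weighted sum of these signals; both the coding and the function name are assumptions made for the example.

```python
def combine_signals(signals, weights, threshold_buy, threshold_sell):
    """Return +1 (buy), -1 (sell) or 0 (neutral) from weighted indicator signals."""
    # Accumulated value: each indicator contributes its signal times its weight.
    total = sum(w * s for s, w in zip(signals, weights))
    if total > threshold_buy:
        return 1
    if total < threshold_sell:
        return -1
    return 0

# Four hypothetical indicators: three say buy, one says sell.
signals = [1, 1, -1, 1]
weights = [3, 2, 1, 2]   # accumulated value = 3 + 2 - 1 + 2 = 6

config_a = combine_signals(signals, weights, threshold_buy=5, threshold_sell=-4)
config_b = combine_signals(signals, weights, threshold_buy=20, threshold_sell=-20)
```

With the same indicator signals, configuration {A} already opens a buy position, whereas configuration {B} stays neutral until evolution adds indicators or raises their weights.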

3.3 Generating technical indicators

As we mentioned, our GE-ATS can produce new indicators as combinations or modifications of the set of TIs initially selected. There are countless TIs in the literature. Most of them are modifications of previous indicators, or combinations of already defined TIs. Furthermore, there is no perfect formula to select and use TIs to beat the market. Aiming to provide greater flexibility to our solutions, we implement four different moving averages, which were already explained in Section 2. The classical operators MAC, MACD, RSIO and RSID are defined by one or more moving averages. We implement variations of these classic TIs by combining them with the multiple versions of the moving average operator. Other approaches have also focused on the generation of new indicators, such as the EDDIE project [20, 21]. The EDDIE 8 project [22] presents an ATS with a similar feature which improves previous results.

Next, let us consider a solution example related with the presented grammar (Fig. 2). The grammar may produce a solution expression containing the following MACD configuration. $MACD (WMA, HMA, EMA, 9, 18, 11);$

The parameters 9, 18 and 11 are the period values (<params>) optimized by the GE for this indicator, company and period. WMA, HMA and EMA are the moving average types (<type> <type> <type>) optimized by the GE. Using this configuration, the MACD indicator is computed in two steps. First, the MACD line is the difference between two moving averages: WMA(9) and HMA(18). Second, the signal line is a MA of the MACD line: EMA(11). Thus, the ATS creates a new indicator based on the classic MACD but working together with the HMA, which is a novel and effective MA. This contribution is an innovative and easily expandable feature which allows the ATS to expand the original set of TIs.
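The two-step computation of such a generalized MACD can be sketched as a function that receives the three moving averages as parameters. The sketch is only illustrative: the name `generalized_macd` is hypothetical, the tail alignment of the series is an assumption, and a plain SMA stands in for all three slots so that the example stays self-contained. Any function with the signature `(values, period)`, such as a WMA, HMA or EMA implementation, could be passed instead.

```python
def sma(values, period):
    """Simple moving average, used here only as a stand-in MA type."""
    return [sum(values[i - period + 1:i + 1]) / period
            for i in range(period - 1, len(values))]

def generalized_macd(prices, fast_ma, slow_ma, signal_ma,
                     p_fast, p_slow, p_signal):
    """Return (macd_line, signal_line), aligned on the most recent days."""
    fast = fast_ma(prices, p_fast)
    slow = slow_ma(prices, p_slow)
    # Step 1: MACD line = difference of two moving averages (tail-aligned).
    m = min(len(fast), len(slow))
    macd_line = [f - s for f, s in zip(fast[len(fast) - m:],
                                       slow[len(slow) - m:])]
    # Step 2: signal line = a moving average of the MACD line.
    signal_line = signal_ma(macd_line, p_signal)
    k = len(signal_line)
    return macd_line[len(macd_line) - k:], signal_line

# Same periods as the example in the text, on a synthetic rising price series.
prices = [float(p) for p in range(1, 41)]
macd_line, signal_line = generalized_macd(prices, sma, sma, sma, 9, 18, 11)
```

Because the MA types are ordinary arguments, the grammar only needs to emit three type names and three periods to define a new indicator variant.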

Algorithm 3 Evaluation of a Solution with a main strategy and an auxiliary strategy

Require: Historical Data

Solution = SMain + SAux

Fitness ←0

for i = 1 to #DAYS do

S_x← Obtain_signals (SMain, T_B, T_S)

X← Return (S_x)

S_y← Obtain_signals (SAux, T_B, T_S)

Y← Return (S_y)

if X > Y then

Return_i ← X

else

Return_i ← Y

endif

Fitness = Fitness + Return_i

end for

Ensure: Fitness (Solution) ← Fitness

3.4 Combining strategies over a period

In line with the previous feature, the ATS implements the ability to apply different sets of rules for the same period (multi-strategies). The additional set of rules is applied as a strategy for stopping losses. The ATS selects a strategy depending on the behavior of the performed investments. First, the grammar provides a main set of rules (SMain), which has been optimized for the training period. Second, the grammar provides an auxiliary set of rules (SAux), which has been optimized in a variable interval that is always smaller than the training period. Thus, when the main strategy does not work properly in a given period, the ATS switches to the auxiliary investment strategy. Therefore, the ATS can adapt to sudden changes in the behavior of stock markets. Additionally, this feature allows the ATS to optimize the size of the historical data used (see Algorithm 3).

Let us consider a possible solution in the following general form: $SolutionExample = \{ MACD(type_1, type_2, type_3, \{params_1\});\ AUX(\langle range \rangle, MAC(type_5, type_6, \{params_2\})); \}$

where type_n and params_k are, respectively, the types and the parameters optimized by the GE algorithm. The expression <range> indicates a fragment of the period analyzed by our ATS. Let us focus on the parameter <range> and suppose it takes the value 56. Of course, a solution will also contain values for the rest of the parameters, but for the sake of clarity they are left in their general form. Within the notation of Algorithm 3, we can identify SMain with $MACD(type_1, type_2, type_3, \{params_1\})$ and SAux with $AUX(56, MAC(type_5, type_6, \{params_2\}))$.

The return of day_i for SolutionExample is calculated following Algorithm 3. Thus, we calculate the MAC values of the last 56 days. If the auxiliary strategy (SAux) fits better than the general strategy (SMain), the system uses the MAC; otherwise the system uses the MACD. The auxiliary strategy allows the system to stop losses and react to repetitive patterns of small periods, which are not exploited by the general strategy. We note that the work-flow is generic to any solution, where SMain and SAux can be any possible combinations of indicators.
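The evaluation loop of Algorithm 3 can be sketched as follows, assuming that the daily returns obtained with the signals of SMain and SAux have already been computed. The function name and the returned list of choices are additions made for the illustration.

```python
def evaluate_solution(main_returns, aux_returns):
    """Accumulate, day by day, the better of the two strategy returns.

    Returns the fitness and, for inspection, which strategy was used each day.
    """
    if len(main_returns) != len(aux_returns):
        raise ValueError("both strategies must cover the same days")
    fitness = 0.0
    choices = []
    for x, y in zip(main_returns, aux_returns):
        if x > y:
            fitness += x
            choices.append("SMain")
        else:
            fitness += y
            choices.append("SAux")
    return fitness, choices

# Hypothetical daily returns of the two strategies over three days.
fitness, choices = evaluate_solution([0.01, -0.02, 0.0], [0.0, 0.01, -0.01])
```

On the second day of this example the main strategy loses 2% while the auxiliary one gains 1%, so the auxiliary strategy is the one that contributes to the fitness.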

3.5 Operators

The individuals of the population are codified by integer strings and are evolved using classic operators with the following features:

The offspring of the population is generated by the single point crossover operation (probability = 0.85).

The mutation operator is implemented using the well-known integer flip mutation (probability = 0.02).

We also use the distinctive operator of Grammatical Evolution, the wrapping operator (wrapping = 2).
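A minimal Python sketch of the two classic operators on integer strings is given below. It assumes that both parents are cut at the same position and that a mutated codon is redrawn uniformly from the 8-bit range 0 to 255; these details, like the function names, are assumptions of the example rather than a description of the implemented system.

```python
import random

def single_point_crossover(parent_a, parent_b, p_c, rng):
    """With probability p_c, swap the tails of both parents at one cut point."""
    n = min(len(parent_a), len(parent_b))
    if n < 2 or rng.random() >= p_c:
        return list(parent_a), list(parent_b)   # no crossover: plain copies
    cut = rng.randrange(1, n)                   # cut strictly inside the string
    child_a = list(parent_a[:cut]) + list(parent_b[cut:])
    child_b = list(parent_b[:cut]) + list(parent_a[cut:])
    return child_a, child_b

def int_flip_mutation(genome, p_m, rng, max_codon=255):
    """Replace each codon by a random integer with probability p_m."""
    return [rng.randint(0, max_codon) if rng.random() < p_m else codon
            for codon in genome]

rng = random.Random(42)                         # fixed seed for repeatability
mother = [10, 20, 30, 40, 50, 60]
father = [1, 2, 3, 4, 5, 6]
child_a, child_b = single_point_crossover(mother, father, 0.85, rng)
mutant = int_flip_mutation(child_a, 0.02, rng)
```

Passing the random generator explicitly keeps the operators deterministic under a fixed seed, which is convenient when comparing runs.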

3.6 Fitness functions and the multi-objective approach

The fitness function is responsible for assigning a value representing the merit of a particular individual. When applying EAs to finance problems we have a wide range of fitness functions; indeed, any trading strategy performance criterion can serve as a method of evaluating individuals. In the literature there are functions such as the K ratio [23], the maximum draw-down [24] or the pessimistic return on margin (PROM) [24], to mention just a few. The ATS evaluates the trading strategies by three fitness functions: the accumulated return (AR), the Sharpe index (SI) and the correlation coefficient between the equity curve of a strategy and the perfect profit of the market (CECPP).

The AR is the aggregated return at the end of the trading period. The AR is calculated according to Equations 1 and 2. $AR = \prod_{i = 1}^{n} (1 + {DR}_{i})$ (1)

where DR_i denotes the return of day_i and can be expressed as follows: ${DR}_{i} = \begin{cases} \frac{P_{i} - P_{i - 1}}{P_{i - 1}} & \text{if the ATS signal = Long} \\ \frac{- (P_{i} - P_{i - 1})}{P_{i - 1}} & \text{if the ATS signal = Short} \\ 0 & \text{if the ATS signal = Neutral} \end{cases}$ (2)

P_i is the close price of the day_i.
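Equations 1 and 2 translate directly into code. In the sketch below the signal of each day is coded as +1 (Long), -1 (Short) or 0 (Neutral), and `signals[i]` is taken to be the position held between the closes of day i-1 and day i; this coding and the function names are assumptions of the example. Transaction costs are not included.

```python
import math

LONG, SHORT, NEUTRAL = 1, -1, 0

def daily_returns(prices, signals):
    """Equation 2: signed relative price change of each day (from day 1 on)."""
    return [signals[i] * (prices[i] - prices[i - 1]) / prices[i - 1]
            for i in range(1, len(prices))]

def accumulated_return(returns):
    """Equation 1: product of (1 + DR_i) over the trading period."""
    return math.prod(1 + dr for dr in returns)

# Long while the price rises 10%, then short while it falls 10%.
prices = [100.0, 110.0, 99.0]
signals = [NEUTRAL, LONG, SHORT]   # signals[0] is unused
ar = accumulated_return(daily_returns(prices, signals))
```

Note that, as Equation 1 is written, AR is a growth factor: a value of about 1.21 in this example corresponds to a 21% gain over the period, and a strategy that stays neutral obtains exactly 1.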

The Sharpe index (SI), proposed in Ref. [25] (Equation 3), measures the excess return per unit of deviation in an investment asset or a trading strategy. The numerator is the return obtained over a set period (AR) minus FR, the return that an investor could have earned in a risk-free investment, typically defined as the return of Treasury bills over the same period. The denominator is the standard deviation of the returns (σ_AR), which measures the deviation of the profits from their average and is an indirect measure of the volatility of the strategy. In order to evaluate the population of individuals correctly, we modify the original expression by incorporating an absolute value in the denominator of Equation 3. The aim of this refinement is to provide the correct ranking for individuals with negative excess returns. $SI = \frac{AR - FR}{\sigma_{AR}^{(AR - FR) / |AR - FR|}}$ (3)

FR is the risk-free return.

σ_AR represents the standard deviation of the returns.
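The effect of the exponent in Equation 3 can be checked numerically. The sketch below is a direct transcription of the equation; the function name and the convention of returning 0 when the excess return is exactly zero (where the exponent is undefined) are assumptions of the example.

```python
def modified_sharpe(ar, fr, sigma):
    """Equation 3: excess return divided by sigma ** sign(excess return)."""
    excess = ar - fr
    if excess == 0:
        return 0.0                      # exponent undefined; assumed convention
    exponent = excess / abs(excess)     # +1 for gains, -1 for losses
    return excess / sigma ** exponent

# Two losing strategies with the same excess return and different volatility.
calm = modified_sharpe(ar=-0.06, fr=0.02, sigma=0.1)
wild = modified_sharpe(ar=-0.06, fr=0.02, sigma=0.4)
```

With a positive excess return the expression reduces to the usual ratio. With a negative one the excess return is multiplied by the deviation, so the calmer of the two losing strategies above obtains the higher (less negative) score, whereas the unmodified ratio would rank the more volatile one first.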

Finally, we define the CECPP in Equation 4, which was presented in Ref. [24]. The perfect equity is a theoretical measure of market potential, which is related to the perfect investment in a period (maximum possible return). The equity curve is just the representation of the investment strategy to be evaluated. The CECPP is the correlation between both. $CECPP = \frac{\sum_{i = 1}^{n} (X_{i} - \bar{X}) (Y_{i} - \bar{Y})}{(n - 1)\, σ_{X}\, σ_{Y}}$ (4)

X is the perfect equity curve

Y is the real equity curve

σ represents the standard deviation
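Equation 4 is the sample correlation coefficient between the two curves, and can be sketched with the Python standard library as follows. The function name is hypothetical, and the construction of the perfect equity curve itself is left out because the text does not detail it.

```python
import statistics

def cecpp(perfect_equity, real_equity):
    """Equation 4: sample correlation between two equity curves."""
    n = len(perfect_equity)
    if n != len(real_equity) or n < 2:
        raise ValueError("curves must have the same length (at least 2)")
    mean_x = statistics.mean(perfect_equity)
    mean_y = statistics.mean(real_equity)
    # statistics.stdev is the sample deviation (n - 1 in the denominator).
    sigma_x = statistics.stdev(perfect_equity)
    sigma_y = statistics.stdev(real_equity)
    covariance_sum = sum((x - mean_x) * (y - mean_y)
                         for x, y in zip(perfect_equity, real_equity))
    return covariance_sum / ((n - 1) * sigma_x * sigma_y)
```

A strategy whose equity grows in step with the perfect equity scores close to 1, while one that loses whenever profit was available scores close to -1. A completely flat curve has zero deviation and makes the ratio undefined, so it would need separate handling.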

It is worth mentioning that neutral signals do not provide any return in our system. Usually, the return assigned to neutral signals is the risk-free return of the country under analysis (e.g. US Treasury bills). Therefore, this represents a disadvantage with respect to practitioners or other ATSs. Furthermore, we consider transaction costs both in the training period and in the final strategies. Trading costs are the expenses incurred when buying or selling securities, and they are important due to their relation to the net returns: the more operations in the market, the higher the related expenses. Transaction costs depend on the brokers, the capital invested and the stock exchanges where we invest. On the one hand, the implementation of real trading costs could be inefficient. On the other hand, ignoring transaction costs would give an unfair advantage to our experiments. We solve the problem by setting a fixed commission fee of 0.1% on each operation.

Throughout the literature, some authors point out the limitations of the AR function for evaluating investments [24]. The AR function ignores a decisive factor in the evaluation of investments: risk valuation. Although the trading system itself uses TIs which support in some way the measurement of risk (e.g. supports and resistances), highly volatile periods require a better risk assessment. In the search for a better risk assessment, we have introduced SI and CECPP.

ATSs are usually guided by a single fitness function, but different environments may require different factors and objectives, and financial studies differ in the importance they assign to each objective. The ATS implemented in this study includes a novel methodology able to provide an optimal trade-off among different objectives: it integrates a multi-objective optimization process that provides solutions better fitted to the different behaviors of industrial sectors or companies. We implement a GE-based version of a well-known multi-objective algorithm, the non-dominated sorting genetic algorithm II (NSGA-II) [26]. We have chosen NSGA-II because it works well with a small number of objectives, which can easily be added or removed. Thus, we evaluate the population and sort it by dominance, obtaining a set of non-dominated solutions as the best individuals.

The procedure of the approach, based on NSGA-II and GE, is shown in Fig. 3. The differences between the multi-objective and the classic GE methodology lie in the evaluation and selection processes. Instead of evaluating the population with a single objective, we use several fitness functions. A population of k individuals is sorted and divided into non-dominated fronts F_i, where i ranges from 1 to n. The front F_1 contains the best individuals: the solutions encoded in these individuals are not dominated by any other solution in the population. After the evaluation, the algorithm proceeds to select the best individuals for the next generation. The method transfers k/2 individuals, front by front, to the tentative new population. If a front F_m would make the new population exceed k/2 individuals, it cannot be fully transferred; its members are sorted according to their crowding distance [26] and selected in descending order of that distance until the k/2 slots are filled. Next, the usual grammatical evolution operators are applied. The process is repeated until the maximum number of generations is reached.
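The selection step described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' implementation: it assumes that every objective is maximized, that each individual is represented by its index in a list of objective tuples, and the function names (dominates, non_dominated_fronts, crowding_distance, select_survivors) are hypothetical. The GE mapping and the genetic operators are omitted.

```python
from typing import Dict, List, Sequence


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """a dominates b if it is no worse in every objective and better in one.

    Assumption: all objectives are maximized.
    """
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def non_dominated_fronts(scores: List[Sequence[float]]) -> List[List[int]]:
    """Split individual indices into fronts F_1, F_2, ... by repeated peeling."""
    remaining = set(range(len(scores)))
    fronts = []
    while remaining:
        front = sorted(
            i for i in remaining
            if not any(dominates(scores[j], scores[i]) for j in remaining if j != i)
        )
        fronts.append(front)
        remaining -= set(front)
    return fronts


def crowding_distance(scores: List[Sequence[float]], front: List[int]) -> Dict[int, float]:
    """Crowding distance of each member of one front (larger = more isolated)."""
    dist = {i: 0.0 for i in front}
    for m in range(len(scores[front[0]])):
        ordered = sorted(front, key=lambda i: scores[i][m])
        lo, hi = scores[ordered[0]][m], scores[ordered[-1]][m]
        # Boundary solutions are always preferred.
        dist[ordered[0]] = dist[ordered[-1]] = float("inf")
        if hi == lo:
            continue
        for prev, cur, nxt in zip(ordered, ordered[1:], ordered[2:]):
            dist[cur] += (scores[nxt][m] - scores[prev][m]) / (hi - lo)
    return dist


def select_survivors(scores: List[Sequence[float]], n_keep: int) -> List[int]:
    """Fill n_keep slots front by front; truncate the overflowing front
    by descending crowding distance."""
    survivors: List[int] = []
    for front in non_dominated_fronts(scores):
        if len(survivors) + len(front) <= n_keep:
            survivors.extend(front)
        else:
            dist = crowding_distance(scores, front)
            ranked = sorted(front, key=lambda i: dist[i], reverse=True)
            survivors.extend(ranked[: n_keep - len(survivors)])
            break
    return survivors


# Made-up two-objective scores for k = 6 individuals (illustration only).
scores = [(0.9, 0.1), (0.5, 0.5), (0.1, 0.9), (0.4, 0.4), (0.2, 0.2), (0.05, 0.05)]
survivors = select_survivors(scores, n_keep=len(scores) // 2)
```

With these made-up scores the first front already holds k/2 = 3 individuals, so no truncation is needed; asking for fewer survivors forces the crowding-distance tie-break inside that front.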

4 Experimental results

The experimental results presented in this section are organized in four batches. In the first batch of experiments we compare (a) the GE approach, (b) the buy and hold strategy and (c) a previous genetic algorithm implementation published in the literature. The second batch of experiments tests the ATS with the three proposed fitness functions. The third batch of experiments executes several combinations of multi-objective functions. Finally, the last batch of experiments studies the feasibility of including an industrial analysis to identify macroeconomic trends.

4.1 Dataset: Europe in recession

In summary, the ATS invests in the market through the analysis of historical data, so determining the data periods used to train and validate the ATS is a key decision. The performance of the ATS is biased by the stability and the profitability of the market in the analyzed periods: it should be easier to achieve good returns in a rising market than in a falling one. The challenge of building a trading system able to predict the intricate behavior of the stock market therefore becomes even more difficult, and more interesting, in a hostile environment. We study the performance of the implemented ATSs in one of the most hostile scenarios, the last economic recession. Our dataset spans 2001 to 2013, a period that contains one of the most important economic crashes. The data are retrieved from the most important indexes of Europe, which has been one of the continents with the most financial problems during the recession. The companies that compose our dataset are listed in Orbis (https://orbis.bvdinfo.com/) and have been chosen according to the following criteria:

Publicly listed companies in Europe.

Active companies with at least the last 5 years of historical data in the stock market.

Current market capitalization greater than 25 million euros.

Excluding pension funds.

Excluding financial firms (financial and insurance activities).

Excluding public administration.

Excluding activities of extraterritorial organizations.
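As an illustration, the selection criteria above can be expressed as a simple filter. This is a sketch under stated assumptions, not the query actually run in Orbis: the Company record and its field names are hypothetical, and the exclusions are mapped onto NACE Rev. 2 sections K (financial and insurance activities, which include pension funding), O (public administration) and U (extraterritorial organizations), a mapping the paper does not state explicitly.

```python
from dataclasses import dataclass

# Assumed mapping of the exclusion criteria to NACE Rev. 2 sections.
EXCLUDED_SECTIONS = {"K", "O", "U"}


@dataclass
class Company:
    """Hypothetical record; field names are illustrative only."""
    name: str
    listed_in_europe: bool
    active: bool
    years_of_history: float
    market_cap_meur: float  # market capitalization in millions of euros
    nace_section: str


def passes_screen(c: Company) -> bool:
    """Apply the dataset criteria listed above to one company."""
    return (
        c.listed_in_europe
        and c.active
        and c.years_of_history >= 5
        and c.market_cap_meur > 25
        and c.nace_section not in EXCLUDED_SECTIONS
    )


# Made-up candidates, for illustration only.
candidates = [
    Company("ExampleManufacturer", True, True, 12, 300.0, "C"),
    Company("ExampleBank", True, True, 12, 300.0, "K"),
    Company("ExampleStartup", True, True, 3, 300.0, "G"),
]
selected = [c.name for c in candidates if passes_screen(c)]
```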

The historical time series have been collected using Bloomberg software (http://www.bloomberg.com/). The time series provide prices and volumes at a daily frequency. Additionally, we note that the companies used for our experiments were preselected randomly; therefore, we must either diversify our investment over a wide range of companies or rely on an expert opinion.

4.2 Comparing ATSs based on genetic algorithms and grammatical evolution

The first set of experimental results studies the returns provided by different trading strategies over a period of economic recession. The objective of this section is not only to validate and test the implemented ATSs, but also to compare them with a validated approach that uses an alternative optimization methodology. Besides the presented ATS, we include two additional trading strategies. First, we introduce a previous approach presented in Ref. [27]. This work proposed an ATS based on genetic algorithms (GA). It used a similar set of TIs and the AR fitness function. The ATS was validated with historical data from the Standard & Poor’s 500 (S&P500) from 1996 to 2006. The ATS provided annual average returns of 870%. Second, we include the buy and hold investment strategy (B&H), where an investor buys stocks and holds them for the complete period.
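For reference, the B&H baseline reduces to a single computation per company. The sketch below assumes the return is measured as the percentage change between the first and last closing price of the period and ignores dividends and transaction costs; the paper does not specify these details, and the function name and prices are hypothetical.

```python
from typing import Sequence


def buy_and_hold_return(prices: Sequence[float]) -> float:
    """Percentage return of buying at the first price and selling at the last.

    Assumption: no dividends, commissions or slippage.
    """
    if len(prices) < 2 or prices[0] <= 0:
        raise ValueError("need at least two prices and a positive entry price")
    return 100.0 * (prices[-1] - prices[0]) / prices[0]


# Made-up closing prices for two companies (illustration only).
falling = buy_and_hold_return([10.0, 12.0, 8.0])
rising = buy_and_hold_return([50.0, 55.0])
portfolio_average = (falling + rising) / 2
```

Averaging the per-company values, as in the last line, gives the kind of portfolio-level B&H figure that the ATS returns are compared against.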

In order to offer a fair comparison between both ATSs, we include some implementation modifications for the first set of experiments. On the one hand, since the ATS based on GA is guided only by the AR function, the ATS based on GE is restricted to using AR as its single fitness function. On the other hand, the set of indicators implemented in the ATS based on GA was adjusted. Both systems use the six TIs defined in Section 2.1.

The results of this batch of experiments are shown in Fig. 4. The ATS executions are focused on a portfolio of Spanish companies, which are listed in one of the markets most affected by the recession. The portfolio consists of a total of 43 companies, i.e., all the Spanish companies listed in our database (Section 4.1). The horizontal axis presents the companies with the related investment strategies. The vertical axis shows the aggregated returns of the companies. The ATS was trained with data from 2001 to 2011. We obtain average returns of –23.63% for the Buy and Hold strategy, 5.89% for the GA approach and 21.08% for the GE approach. The average numbers of operations with positive returns are 8, 28 and 29, respectively.

4.3 Testing ATSs based on mono-objective versions

After comparing the performance of the GE- and GA-based ATS implementations, we focus our experiments on the GE approach. The second batch of experiments extends the first batch using three ATS versions. Each ATS uses a different implementation of the fitness function: AR, SI and CECPP. In order to study the ATS operation in other countries, we change the sample of companies used in the previous section. The ATS is validated with historical data from 41 randomly selected companies, which are listed in the following countries: Germany, United Kingdom, Spain and France.

Figure 5 shows the results of a series of investments in 41 European companies. As in the previous experiments, the ATS was trained with data from 2001 to 2011 and validated in 2012. The Y axis shows the returns obtained. The horizontal axis presents the fitness functions and the companies that contributed to the experiment. The average returns provided by the system are 10.94% for the SI, 40.79% for the AR and 20.32% for the CECPP.

Despite obtaining the lowest return, SI is the strategy with the lowest number of operations providing negative returns. The SI provides an average of 12 unprofitable investments in a year, obtaining an average return of –0.7%. The ATS implementing the AR scores a mean of 16 operations with negative returns per year, which provides a total of –6%. Finally, the CECPP shows the worst behavior. It provides 18 failed investments with a negative return of –8%, while its average return slightly exceeds 20%.
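The comparison above rests on two kinds of figures per fitness function: the number of losing operations and the return they accumulate. A minimal sketch of such a summary follows; the paper does not state exactly how the negative returns are aggregated, so both the total and the mean are computed here, and the function name and sample values are hypothetical.

```python
from typing import Dict, Sequence


def summarize_operations(returns_pct: Sequence[float]) -> Dict[str, float]:
    """Count winning and losing operations and aggregate the losses.

    returns_pct holds one percentage return per closed operation.
    """
    losses = [r for r in returns_pct if r < 0]
    gains = [r for r in returns_pct if r > 0]
    return {
        "n_positive": len(gains),
        "n_negative": len(losses),
        "total_return": sum(returns_pct),
        "loss_total": sum(losses),
        "loss_mean": sum(losses) / len(losses) if losses else 0.0,
    }


# Made-up operation returns (illustration only, not the paper's data).
summary = summarize_operations([2.0, -1.0, 3.5, -0.5, 0.0])
```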

The conclusions drawn from the results in Fig. 5 can be summarized as follows:

We do not recommend the use of CECPP without analyzing the effects of this fitness function.

We will use SI when the portfolio is composed by few companies or is selected without any specific criteria.

We will use AR if we want to diversify the investments over a large set of companies or we have expert knowledge about the portfolio analyzed.

A detailed study of the figure shows that, although in general terms SI and AR provide higher confidence, the CECPP achieves good results in some cases where the other functions perform worse. Thus, the CECPP provides some features able to obtain returns where SI and AR are unprofitable. This observation encouraged us to implement a multi-objective approach (presented in Section 3.6) that tries to capture the advantages of several fitness functions in order to improve the final returns.

4.4 Testing ATSs based on multi-objective versions

The third batch of experiments was conducted to test the multi-objective approach. As far as we know, the system based on GE and NSGA-II (described in Section 3.6) is the first system combining a multi-objective approach and a GE methodology in finance.

The multi-objective approach is tested using two ATS versions. Each ATS uses a different combination of fitness functions. First, we use a multi-objective approach based on SI, AR and CECPP (3 objectives). Second, we select SI and CECPP to build the second ATS (2 objectives). The second combination is chosen in order to take advantage of the complementary nature of both fitness functions. As in the previous section, the ATS is validated with historical data from 41 companies listed in Germany, United Kingdom, Spain and France.

Figure 6 presents the returns (Y axis) of the two multi-objective approaches and the B&H strategy. As before, the ATS was trained with data from 2001 to 2011 and validated in 2012. The investments are performed over the 41 European companies used in the previous experiment (X axis). The approach using 3 objectives reaches an average return of 11.07%, and the version using 2 objectives achieves an average return of 23.31%. First, the version using all the fitness functions provides only a slight increase over the mono-objective version using SI. However, it produces 15 negative operations, with an average return of –2%. Therefore, this approach does not improve on any result obtained in the previous experiments. Second, the 2-objective version achieves the lowest number of negative operations together with an increase in returns. This version produces a total of 10 negative operations and ranks second in terms of average returns (23%). It is worth mentioning that, despite the lower number of operations with negative results, the average negative return exceeds that of the mono-objective approach, reaching –4%; this is large mainly because one company almost reaches an average return of –90%.

4.5 Testing the feasibility of the industrial analysis

The last batch of experiments is a proof of concept. It is focused on testing the feasibility of implementing a macroeconomic analysis (industrial analysis) in the framework of an ATS. Stock markets are formed by many companies engaged in many different activities. Events like the economic recession, the housing boom, the rising cost of oil, droughts, etc. affect each industrial sector differently. A study showing these trends was presented in Ref. [28]. However, these events have similar effects on companies belonging to the same industrial sector. Thus, some companies are more interrelated than others; for example, the behaviors or trends of BMW and Volkswagen depend more on each other than those of BMW and Google. We perform an additional series of experimental executions with the aim of verifying the existence of trends among the existing industrial sectors. An ATS able to identify the most attractive sectors for investment could take advantage of this information to restrict the universe of potential companies. We conduct the same experiments as in the previous sections, but preselecting companies according to the statistical classification of economic activities in the European Community (NACE, second revision). The industrial sectors used in the experimental results are:

C – Manufacturing

D – Electricity, gas, steam and air conditioning

F – Construction

L – Real estate activities

G – Wholesale and retail trade, except repair of motor vehicles.

Figure 7 shows the returns (Y axis) of the six industrial sectors (six companies per sector) selected using 4 of the approaches implemented (X axis). The approaches used are the AR and CECPP mono-objective versions, and the two multi-objective versions based on 3 and 2 objectives. The results show very different returns depending on the sector invested in. The most profitable sector is the G sector, which is related to wholesale and retail trade. It comprises companies such as Adolfo Dominguez or Inditex. This suggests that a general trend exists within a specific industry. A trading system able to analyze the sectors and choose the most promising one could obtain generous returns. Although this experiment is not conclusive, it supports the idea that trends of industrial sectors can be exploited to build a portfolio and increase the returns. This will be our next step of research.
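The sector comparison can be illustrated with a short aggregation: average the returns of the companies in each NACE section and rank the sections. This is only a sketch of the idea; the function name is hypothetical and the numbers are made up, not the values plotted in Fig. 7.

```python
from statistics import mean
from typing import Dict, List, Tuple


def rank_sectors(returns_by_sector: Dict[str, List[float]]) -> List[Tuple[str, float]]:
    """Average the company returns of each sector and sort best first."""
    averages = {sector: mean(values) for sector, values in returns_by_sector.items()}
    return sorted(averages.items(), key=lambda item: item[1], reverse=True)


# Made-up percentage returns keyed by NACE section (illustration only).
example = {
    "C": [1.0, 3.0],
    "G": [10.0, 20.0],
    "F": [-5.0, -3.0],
}
ranking = rank_sectors(example)
best_sector = ranking[0][0]
```

An ATS that restricted its universe to the top-ranked section would be applying the sector preselection discussed above; whether such rankings persist out of sample is exactly what the proof of concept leaves open.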

5 Related work

An ATS is a computerized system that automatically submits trading orders to an exchange. Early ATS projects were difficult to develop, but today the use of computers to automate parts of the investment process plays an important role. An ATS defines the investment problem as the process of maximizing the risk-adjusted return over a specific time period when taking long or short positions in a financial asset (in our case, a stock). The performance of investment decisions in stock markets is influenced by a wide range of factors: political, macroeconomic, regulatory, local, international, etc. Since these factors are uncertain, there is no single perfect rule, with a fixed combination of parameters or threshold values, that can be used to maximize future returns. Market investments are addressed by a broad set of potential rules that combine multiple factors. These factors are represented by indicators and ratios, which in turn are driven by a range of parameters. In summary, the investment problem consists of two basic steps: first, finding the best combination of variables; second, fine-tuning the parameters for the chosen set of variables. Thus, the input set of rules for an ATS is built from the indicators and ratios used as investment criteria and their related parameters, while the output is the return obtained.

An ATS can be a simple system using only a particular technical indicator, such as the moving average, or a complex system based on methodologies such as Fibonacci retracement [29], linear regression [30], neural networks [31], fuzzy logic [32], genetic programming [33], etc. In this section we present a brief overview of the related work in the literature on both classical and bio-inspired trading systems.

The work of Ratner and Leal [34] presents a comparative study of ten static strategies based on moving averages (MAs) and the buy and hold strategy (B&H). This work is complemented in Ref. [35], where the authors compare strategies based on static MAs against dynamic MAs, which respond to market volatility. All of these works apply only to MAs, and with a reduced range of parameters.
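To make the notion of a static MA strategy concrete, the sketch below generates buy and sell signals from the crossover of a fast and a slow simple moving average with fixed windows. It is a generic textbook rule shown under assumed window lengths, not the specific rules evaluated in Refs. [34] or [35]; the function names and the price series are hypothetical.

```python
from typing import List, Sequence, Tuple


def sma(prices: Sequence[float], window: int, t: int) -> float:
    """Simple moving average of the `window` prices ending at index t."""
    return sum(prices[t - window + 1 : t + 1]) / window


def crossover_signals(
    prices: Sequence[float], fast: int = 2, slow: int = 4
) -> List[Tuple[int, str]]:
    """Emit ("buy"/"sell") when the fast MA crosses above/below the slow MA.

    The windows are fixed (static); a dynamic variant would adapt them
    to market volatility.
    """
    signals = []
    for t in range(slow, len(prices)):
        prev_diff = sma(prices, fast, t - 1) - sma(prices, slow, t - 1)
        diff = sma(prices, fast, t) - sma(prices, slow, t)
        if prev_diff <= 0 < diff:
            signals.append((t, "buy"))
        elif prev_diff >= 0 > diff:
            signals.append((t, "sell"))
    return signals


# Made-up daily closing prices: flat, then a rise, then a fall.
prices = [10, 10, 10, 10, 10, 12, 14, 16, 14, 12, 10, 8]
signals = crossover_signals(prices)
```

The only tunable quantities here are the two window lengths, which illustrates why such strategies explore a narrow parameter range compared with rules evolved from a grammar.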

GE was presented as a smart solution that fits well in the context of the complexity of stock markets. However, GE is not limited to the financial environment and has been used in many fields. For example, Moore uses GE to generate optimal biochemical network models in Ref. [36]. Additionally, the literature offers GE applications to solve trigonometric identities [37], to optimize dynamic memory [38] and even to compose music automatically [39]. Previous works have already used GE as a methodology to optimize investments. The NCRA Group at University College Dublin has carried out excellent work on this topic. For example, in Ref. [40] the authors propose a GE to evolve a financial trading system. In this approach the authors show an adaptive grammar with a variant of the moving window. The different rules of the grammar are in constant evolution during the execution of the trading system while new data is being loaded. Other works of this group show the proficiency of GE in the foreign-exchange markets or in different stock market indexes, as in Ref. [5] or Ref. [6], among others. Other authors have also proposed ATSs based on GE, as in Ref. [7], where the authors develop a system using co-evolution of signals and stop-loss conditions for short and long positions. This work was updated in Ref. [41], where they replace the fitness function with a complex fitness proposed in Ref. [8]. Other approaches have focused on the generation of new indicators, such as the EDDIE project [20, 21]. The EDDIE 8 software [22] introduced the ability to generate new TIs. It improved the previous results despite the convergence problems triggered by the large space of solutions. Other works have used multi-objective optimization to trade stocks. For example, in Ref. [42] the authors claim high profits with a multi-objective version of an ATS based on the AR and the SI. More recently, Ref. [43] and Ref. [3] showed a multi-objective genetic algorithm in which the parameters of several technical indicators were optimized whenever new data were received.

6 Conclusions and future research

In this paper we have presented the development of an automated trading system (ATS) able to analyze large amounts of historical prices and volumes as a source of information. We have developed an ATS based on grammatical evolution (GE) which is capable of generating complex strategies. The implemented features of our ATS, such as the capacity to build its own technical indicators, exhibit good performance where other evolutionary approaches failed. Furthermore, we have introduced a novel multi-objective optimization method based on the non-dominated sorting genetic algorithm (NSGA-II) and GE. The multi-objective approach demonstrated high returns (average return of 20%) in very volatile periods, thus combining some of the best features of the three fitness functions employed. The future research that should follow this paper is related to the last batch of experiments. As these have suggested, combining macroeconomic and microeconomic analyses could significantly increase the benefits and reduce the risk of losses in ATSs. We are developing an ATS able to perform a hybrid analysis of technical, fundamental and macroeconomic factors. Other important proposals can be addressed in this research line, such as ranking the utility of fitness functions, including the choice of fitness function in the grammar productions, or extending the set of technical indicators.

The methodology presented here can be applied to other problems. In particular, we are extending the work done on the grammatical and evolutionary process to glucose time series prediction in the framework of the project Development of Adaptive and Bio-inspired Systems for Glycemic Control using Insulin Pumps and Continuous Glucose Monitors in Patients with Diabetes Mellitus and SMART Diabetes. The adaptation to the conditions of the patient is analogous to the process of adapting solutions to market conditions followed in this paper.

Footnotes

1

The modulo operator calculates the remainder of division of one number by another.

Acknowledgments

This work was partially supported by the Spanish Government Ministry of Science and Innovation under grants TIN2014-54806-R and IPT-2011-1198-430000, and by the People program (Marie Curie Actions) of the European Union Seventh Framework Program (FP7/2007-2013) under agreement no. 600388 of the REA and the Agencia per a la Competitivitat de l’Empresa (ACCIÓ). J.I. Hidalgo also acknowledges the support of the Spanish Ministry of Education mobility grant PRX16/00216.

References

1. Subrahmanyam, Behavioural finance: A review and synthesis, European Financial Management 14(1) (2008), 12–29.

2. Núñez, Trading systems designed by genetic algorithms, Managerial Finance 8(28) (2002), 87–106.

3. Soltero F.J., Bodas D.J., Hidalgo J.I., Fernández and Fernández De-Vega, Optimization of technical indicators in real time with multiobjective evolutionary algorithms, in Proceedings of the 14th International Conference on Genetic and Evolutionary Computation Companion, 2012, pp. 1535–1536.

4. Lohpetch and Corne, Multiobjective algorithms for financial trading: Multiobjective out-trades single-objective, in IEEE Congress on Evolutionary Computation, 2011, pp. 192–199.

5. Brabazon and O’Neill, Evolving technical trading rules for spot foreign-exchange markets using grammatical evolution, Computational Management Science 1(3) (2004), 311–327.

6. Dempsey, O’Neill and Brabazon, Adaptive trading with grammatical evolution, in Proceedings of the 2006 IEEE Congress on Evolutionary Computation, IEEE Press, 2006, pp. 9137–9142.

7. Adamu and Phelps, Modelling financial time series using grammatical evolution, in Proceedings of the Workshop on Advances in Machine Learning for Computational Finance, 2009.

8. Saks and Maringer D.G., Evolutionary money management, in Applications of Evolutionary Computing, vol. 5484 of Lecture Notes in Computer Science, Springer, 2009, pp. 162–171.

9. A.W., The adaptive markets hypothesis: Market efficiency from an evolutionary perspective, The Journal of Portfolio Management 30 (2004), 15–44.

10. Park C.-H. and Irwin S.H., What do we know about the profitability of technical analysis? Journal of Economic Surveys 21(4) (2007), 786–826.

11. Colby R.W. and Meyers T.A., The encyclopedia of technical market indicators, Irwin, 1988.

12. Hull, Active Investing, Wrightbooks, Wiley, 2010.

13. Appel, Technical Analysis: Power Tools for Active Investors, Prentice Hall, Financial Times, 2005.

14. Wilder J.W., Trend Research, New concepts in technical trading systems 1978.

15. Dormeier, Connection and affinity; between price and volume, Technical Analysis of Stocks and Commodities, 2007.

16. Ryan, Collins J.J. and O’Neill, Grammatical evolution: Evolving programs for an arbitrary language, in Lecture Notes in Computer Science 1391, Proceedings of the First European Workshop on Genetic Programming, Springer-Verlag, 1998, pp. 83–95.

17. Koza J.R., Genetic programming: On the programming of computers by means of natural selection, vol. 1, MIT press, 1992.

18. Lewin, Genes VII, Oxford University Press, 1999.

19. Hugosson, Hemberg, Brabazon and O’Neill, Genotype representations in grammatical evolution, Appl Soft Comput 10(1) (2010), 36–43.

20. Tsang E.P., Li, Markose, Er, Salhi and Iori, Eddie in financial decision making, Journal of Management and Economics 4(4) (2000).

21. Tsang, Yung and Li, Eddie-automation, a decision support tool for financial forecasting, Decision Support Systems 37(4) (2004), 559–565.

22. Kampouridis and Tsang, Eddie for investment opportunities forecasting: Extending the search space of the gp, in Evolutionary Computation (CEC), 2010 IEEE Congress on, 2010, pp. 1–8.

23. Kerstner, Quantitative Trading Strategies: Harnessing the Power of Quantitative Techniques to Create a Winning Trading Program. McGraw-Hill Trader’s Edge Series, 2003.

24. Pardo, The Evaluation and Optimization of Trading Strategies. Wiley Trading, John Wiley & Sons, 2008.

25. Sharpe W.F., Mutual fund performance, The Journal of Business 39(1) (1966), 119–138.

26. Deb, Pratap, Agarwal and Meyarivan, A fast and elitist multiobjective genetic algorithm: NSGA-II, IEEE Trans. Evolutionary Computation 6(2) (2002), 182–197.

27. Contreras, Hidalgo J.I. and Nunez Letamendia, A GA combining technical and fundamental analysis for trading the stock market, in Applications of Evolutionary Computation, vol. 7248 of Lecture Notes in Computer Science, Springer Berlin Heidelberg, 2012, pp. 174–183.

28. Contreras, Hidalgo J.I. and Nuñez, Time series of stock prices and randomness: Undercover patterns, in Proceedings of The 2015 Northeast Decision Sciences Institute (NEDSI) Conference, 2015.

29. Fischer, The New Fibonacci Trader: Tools and Strategies for Trading Success, Wiley Trading, 2006.

30. Lin S.-Y., Chen C.-H. and Lo C.-C., Currency exchange rates prediction based on linear regression analysis using cloud computing, International Journal of Grid and Distributed Computing 6(2) (2013)–.

31. Gately, Neural Networks for Financial Forecasting. John Wiley & Sons, Inc, 1995.

32. Dourra and Siy, Investment using technical analysis and fuzzy logic, Fuzzy Sets and Systems 127(2) (2002), 221–240.

33. Dempster M.A.H. and Jones, A real-time adaptive trading system using genetic programming, Quantitative Finance 1(4) (2001), 397–413.

34. Ratner and Leal R.P., Tests of technical trading strategies in the emerging equity markets of latin america and asia, Journal of Banking & Finance 23(12) (1999), 1887–1905.

35. Ellis C.A. and Parbery S.A., Is smarter better? a comparison of adaptive, and simple moving average trading strategies, Research in International Business and Finance 19(3) (2005), 399–411.

36. Moore J.H. and Hahn L.W., Petri net modeling of high-order genetic systems using grammatical evolution, Biosystems 72(12) (2003), 177–186.

37. Ryan, O’Neill and Collins, Grammatical evolution: Solving trigonometric identities, in Proceedings of Mendel ’98: 4th International Conference on Genetic Algorithms, Optimization Problems, Fuzzy Logic, Neural Networks and Rough Sets, 1998, pp. 111–119.

38. Colmenar J.M., Risco-Martin J.L., Atienza, Garnica, Hidalgo J.I. and Lanchares, Gramáticas evolutivas aplicadas a la optimización de gestores de memoria dinámica, in Congreso Español sobre Metaheurísticas, Algoritmos Evolutivos y Bioinspirados, Ibergarceta Publicaciones, S.L., 2010, pp. 499–506.

39. de la Puente A.O., Alfonso R.S. and Moreno M.A., Automatic composition of music by means of grammatical evolution, SIGAPL APL Quote Quad 32 (2002), 148–155.

40. Dempsey, O’Neill and Brabazon, Live trading with grammatical evolution, in GECCO 2004 Workshop Proceedings, 2004.

41. Adamu and Phelps, Coevolution of technical trading rules for high frequency trading, in World Congress on Engineering, Lecture Notes in Engineering and Computer Science, Newswood Limited, 2010, pp. 96–101.

42. Briza A.C. and Naval P.C. Jr, Stock trading system based on the multi-objective particle swarm optimization of technical indicators on end-of-day market data, Appl Soft Comput 11(1) (2011), 1191–1201.

43. Bodas-Sagi, Soltero, Hidalgo, Fernández and Fernandez, A technique for the optimization of the parameters of technical indicators with multi-objective evolutionary algorithms, in IEEE Congress on Evolutionary Computation, 2012, pp. 1–8.