Neural networks tend to be overconfident, which leads to losses when trading. Simple models can underfit, but layered approaches make up the difference.
Trading in an efficient, complex market is not a cats-versus-dogs problem. Market data series are short, noisy, at times random, and not controlled under tight bounds. Price action is not well understood even by humans.
When 90% of human traders fail to make money in the stock market, how is one model — even if it is a neural network — supposed to make money, and not overfit, when markets are not well understood, non-deterministic, noisy, and have short data series?
Popularized applications of neural networks share a few traits: known rules (games like chess), deterministic behavior (operated machines), tasks humans already do well (image classification and speech recognition), and lots of relevant data (machine translation).
Simple, layered approaches are ideal for generating good trade signals. In his book Advances in Financial Machine Learning, Marcos Lopez de Prado, the 2019 Quant of the Year, discussed layering approaches in the context of meta labeling.
Meta labeling is a way to layer sizing decisions (1K lots, 50K BBLs) on top of direction decisions (long or short). First build a primary model that predicts the positive class well; then build a second (meta) model that predicts whether the primary model's call is correct. Use the direction from the primary model to place your bet; use the confidence of the meta model to size your bet.
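A minimal sketch of that layering with scikit-learn. The synthetic features and labels, the choice of logistic regression for both layers, and the max_position value are illustrative assumptions, not Lopez de Prado's implementation:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# Synthetic example: X are features, y is the true direction (1 = up, 0 = down).
X = rng.normal(size=(500, 4))
y = (X[:, 0] + 0.5 * rng.normal(size=500) > 0).astype(int)

# Primary model: predicts direction (1 = long, 0 = short).
primary = LogisticRegression().fit(X, y)
side = primary.predict(X)

# Meta labels: 1 where the primary model was right, 0 where it was wrong.
meta_y = (side == y).astype(int)

# Meta model: predicts the probability that the primary call is correct.
meta = LogisticRegression().fit(X, meta_y)
confidence = meta.predict_proba(X)[:, 1]

# Direction from the primary model, size from the meta model's confidence.
max_position = 100_000
position = np.where(side == 1, 1, -1) * confidence * max_position
```

In practice the two models would be fit on a backtest window and applied to live data, as in the snippets later in this article.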
To be completely transparent, it is not clear to me whether the same models and signals used to enter a position should also be used to exit it. I am still on a quest to learn this. If you have thoughts on this point, I would love to see your comment below.
How to size a bet with probabilities: multiply the predicted probability by your desired maximum position, then snap that size to a step function to avoid making small adjustments to an existing bet.
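A sketch of that sizing rule; the maximum position and the number of steps are assumed values, not from the article:

```python
import numpy as np

def size_bet(prob, direction, max_position=100_000, steps=5):
    """Multiply a predicted probability by the max position, then snap
    the result onto a step grid so small probability changes do not
    trigger tiny adjustments to an existing bet."""
    step = max_position / steps
    size = np.round(prob * max_position / step) * step
    return size if direction == "long" else -size

print(size_bet(0.65, "long"))   # → 60000.0
print(size_bet(0.68, "long"))   # → 60000.0 (small change, same step)
print(size_bet(0.92, "short"))  # → -100000.0
```

With five steps, a probability has to move roughly 0.2 before the position changes, which keeps turnover and transaction costs down.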
Neural networks are a type of machine learning algorithm. They were first introduced in the 1940s and have gone in and out of popularity from the 1960s until today.
Neural networks are versatile, scalable, and powerful, which makes them ideal for large, complex (even non-linear) machine learning tasks: computer vision, phonetic recognition, voice search, conversational speech recognition, speech and image feature coding, semantic utterance classification, hand-writing recognition, audio processing, visual object recognition, information retrieval, in the analysis of molecules that may lead to discovering new drugs, classifying billions of images, recommending the best videos to watch to hundreds of millions of users every day, learning to beat the world champion in the game of Chess or Go, and even powering common machine learning tasks when there is a lot of data.
I pulled the above paragraph from a PowerPoint slide deck I made in graduate school. Does any of that sound like finance and trading? “Large, complex,” and “a lot of data” are not phrases I hear often.
Some of my favorite models are the shallow and simple ones: Naïve Bayes, logistic regression, decision trees, and support vector machines. You can use them as binary classifiers. Binary classifiers, at times, can be more valuable than continuous regression predictors.
Binary predictions are favorable because they emulate trade decisions: long or short. You can use the probabilities these models produce to help you size your bet, which is also favorable.
Naïve Bayes
Based on Bayes' theorem, Naïve Bayes uses prior probabilities estimated from historical data to make predictions on future data. You can turn these probabilities into trade decisions with a threshold, say 50%: any predicted probability below 50% becomes a short bet; any predicted probability above 50% becomes a long bet.
from sklearn.naive_bayes import GaussianNB

clf = GaussianNB().fit(X_backtest, y_backtest, w_backtest)
predictions = clf.predict(X_live)
probabilities = clf.predict_proba(X_live)[:, 1]
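Applying the 50% threshold to the predicted probabilities might look like this; the synthetic features and labels are placeholders for real backtest data:

```python
import numpy as np
from sklearn.naive_bayes import GaussianNB

rng = np.random.default_rng(1)
X = rng.normal(size=(200, 3))   # placeholder features
y = (X[:, 0] > 0).astype(int)   # placeholder labels: 1 = up, 0 = down

clf = GaussianNB().fit(X, y)
probabilities = clf.predict_proba(X)[:, 1]

# Above 50% becomes a long bet; below becomes a short bet.
signals = np.where(probabilities > 0.5, "long", "short")
```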
Logistic Regression
You can think of logistic regression as a linear equation whose output is squashed between the two binary classes by a non-linear (sigmoid) function. In the end, what you get is a prediction, long or short, and a probability to help you size.
from sklearn.linear_model import LogisticRegression

clf = LogisticRegression().fit(X_backtest, y_backtest, w_backtest)
predictions = clf.predict(X_live)
probabilities = clf.predict_proba(X_live)[:, 1]
Decision Tree
Decision trees split on important market drivers and optimize trade decisions based on driver levels, seeking favorable skew (i.e., probabilities) along each path. In the binary case, you get long and short predictions based on regimes (decision paths) along with a predicted probability.
from sklearn.tree import DecisionTreeClassifier as DTC

clf = DTC().fit(X_backtest, y_backtest, w_backtest)
predictions = clf.predict(X_live)
probabilities = clf.predict_proba(X_live)[:, 1]
Support Vector Machine
Like linear regression, a support vector machine is a linear algorithm, but unlike linear regression, it seeks to separate long and short historical observations with as wide a gap as possible. This is useful because we want to train algorithms to differentiate long and short bets. With probability calibration enabled (probability=True in scikit-learn), it also produces a predicted probability, which linear regression does not.
from sklearn.svm import SVC

# probability=True is required for predict_proba (it enables Platt scaling)
clf = SVC(probability=True).fit(X_backtest, y_backtest, w_backtest)
predictions = clf.predict(X_live)
probabilities = clf.predict_proba(X_live)[:, 1]
The goal of a layered approach is to dampen the strength of a poor prediction and magnify the strength of a good one by ensembling simple, diverse models. Each model has its own strengths and weaknesses. By combining them, we size up a position when all models are in alignment, which increases our confidence.
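One simple way to ensemble the four classifiers above is to average their predicted probabilities, so the combined signal is strong only when the models agree. A sketch with synthetic data; the features, labels, and model settings are assumptions for illustration:

```python
import numpy as np
from sklearn.naive_bayes import GaussianNB
from sklearn.linear_model import LogisticRegression
from sklearn.tree import DecisionTreeClassifier
from sklearn.svm import SVC

rng = np.random.default_rng(2)
X = rng.normal(size=(300, 4))              # placeholder features
y = (X[:, 0] - X[:, 1] > 0).astype(int)    # placeholder long/short labels

models = [
    GaussianNB(),
    LogisticRegression(),
    DecisionTreeClassifier(max_depth=3),
    SVC(probability=True),   # probability=True enables predict_proba
]

# Average the probability of the long class across the ensemble.
probs = np.column_stack([m.fit(X, y).predict_proba(X)[:, 1] for m in models])
ensemble_prob = probs.mean(axis=1)

# The averaged probability nears 1.0 or 0.0 only when all models align;
# disagreement pulls it toward 0.5, shrinking the bet.
direction = np.where(ensemble_prob > 0.5, 1, -1)
```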
This approach is favorable because sizing is where most people get tripped up. If we know the direction a market is going but do not size well, we can still lose money. If we do not know the direction perfectly but size our bets based on likely outcomes, we have a better chance of making good money when markets move and losing less when they do not. The sum is positive.
Layered approaches are favorable in trading. Neural networks are a cool class of algorithms, but they are overconfident when observations are few.
Hope you enjoyed. Let me know if you have questions.
[1] D. Gillham, Trading the Stock Market — Why Most Traders Fail (2020), Wealth Within
[2] A. Rodriguez, Deep Learning Systems: Algorithms, Compilers, and Processors for Large-Scale Production (2020), Morgan & Claypool
[3] M. Lopez de Prado, Advances in Financial Machine Learning (2018), Wiley