Machine Learning for Financial Applications in Theory and Practice

Type: Master's Thesis
Corporate Partner: Zürcher Kantonalbank
Date Published: May 1, 2015

Although machine learning has occupied artificial intelligence researchers and practitioners for the last few decades, the field is still far from a full understanding of the concept. Machine learning, which sits at the intersection of several disciplines including biology, psychology, mathematics, and robotics, is devoted to finding and specifying important but often hidden non-linear relationships in large amounts of data. An artificial system (in other words, a machine) can use these learned relationships to develop decision rules or to predict patterns in data, an ability fundamental to truly intelligent behavior. One can therefore draw a parallel between a learning machine and human behavior: both learn from past mistakes and try to avoid repeating them.

Problem Description
Machine learning has a proven track record in areas such as speech recognition, computer vision, and practical natural language understanding systems. Its application to financial time series, however, is at the leading edge of current research in data mining and computational finance. The recent literature generally agrees that the returns of various financial instruments (e.g., stocks, indices, and exchange rates) can be predicted, with some limitations, from their past returns and from other variables, including technical and macroeconomic indicators.
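To make the idea of predicting a return from past returns concrete, the following is a minimal sketch of a nearest-neighbor forecast, one of the model families surveyed in the thesis. It is an illustration only, not a method taken from the thesis: the function names `lagged_windows` and `knn_forecast`, the parameters `n_lags` and `k`, and the use of Euclidean distance are all assumptions chosen for brevity.

```python
from math import sqrt


def lagged_windows(returns, n_lags):
    """Pair each window of n_lags consecutive returns with the return that followed it."""
    pairs = []
    for t in range(n_lags, len(returns)):
        pairs.append((returns[t - n_lags:t], returns[t]))
    return pairs


def knn_forecast(returns, n_lags=3, k=2):
    """Forecast the next return as the mean outcome of the k most similar past windows.

    Hypothetical helper for illustration: similarity is plain Euclidean
    distance between lagged-return windows.
    """
    history = lagged_windows(returns, n_lags)
    if len(history) < k:
        raise ValueError("not enough observations for the chosen n_lags and k")

    # The most recent n_lags returns form the query; its outcome is unknown.
    query = returns[-n_lags:]

    def distance(window):
        return sqrt(sum((a - b) ** 2 for a, b in zip(window, query)))

    # Keep the k historical windows closest to the query and average
    # the returns that followed them.
    nearest = sorted(history, key=lambda pair: distance(pair[0]))[:k]
    return sum(target for _, target in nearest) / k
```

For example, `knn_forecast(daily_returns, n_lags=5, k=10)` would average the outcomes of the ten historical five-day windows most similar to the latest five days, where `daily_returns` is an assumed list of floats. A real study would also need out-of-sample evaluation and additional inputs such as the technical and macroeconomic indicators mentioned above.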

With hundreds of models and algorithms in use, it is unclear whether any machine learning methods clearly prevail over others. For practitioners, it is therefore hard to settle on the optimal model design, given the variety of options presented in the academic literature.

Goal, Purpose, and Structure
This work attempts to aggregate and systematize the findings of the academic literature on the application of machine learning to financial time series. It addresses the research question: What are the current status, trends, and robustness of the research results on the use of machine learning techniques for forecasting financial time series? By answering this question, the thesis seeks to review and classify the different approaches and to provide well-structured information that helps practitioners working with financial time series choose an effective machine learning model design.

To reach this goal, the thesis follows a three-step strategy: first, it describes current findings from the academic literature, exploring how researchers and financial institutions currently use machine learning techniques (where this information is available); second, it compares results and identifies trends over time; and finally, it provides recommendations and suggestions based on these findings, where possible. The work includes a survey of the academic literature that focuses on the four main groups of models used to predict financial time series (neural networks, support vector machines, nearest neighbors, and hybrid/evolutionary methods) and on all the subsequent steps required to build a model based on machine learning techniques. The research contributes to the current state of knowledge in the field because the survey covers a time period not previously covered in any relevant source. Moreover, only a few surveys dedicated to this topic have been published.

The thesis shows what kinds of algorithms, input data, time intervals, financial instruments, etc. have been used and what the results were. A critical view on the validity of the results based on how they were achieved is provided, where possible.