ML4Trading

From Quant Fund Software Wiki

Machine Learning for Trading

keywords: algorithmic, algorithms, trading, machine learning, ML, finance, equities, markets, quantitative finance, georgia tech, gt, Tucker Balch

Important Announcements

New:

  • Project 3 is posted. It is due Wed Dec 7 at 11:55PM.
  • Project 4 is posted. It is due Thurs Dec 15 at 11:55PM.

Older:

Course Overview

This course introduces students to the real-world challenges of implementing machine-learning-based trading strategies, including the algorithmic steps from information gathering to market orders. The focus is on how to apply probabilistic machine learning approaches to trading decisions. We consider statistical approaches such as linear regression, KNN, and regression trees, and how to apply them to actual stock trading situations.

Who the Course is For

The course is open to and intended for graduate and upper level undergraduate students in Computing, ISYE, Math & Management.

Prerequisites: Students should have strong coding skills and some familiarity with equity markets. No finance or machine learning experience is assumed. Here's a short test to check if you have strong programming skills: quiz. If you don't do well on that quiz, you should either drop the course, or be sure to plan so that you can devote extra time to the course.

Note that this course serves CS major students with machine learning experience, as well as students in other majors such as ISYE, MGMT, or MATH who have different experiences. All types of students are welcome! The ML topics might be "review" for CS students, while finance parts will be review for finance students. However, even if you have experience in these topics, you will find that we consider them in a different way than you might have seen before, in particular with an eye towards implementation for trading.

Student Responsibilities

  • Read the emails sent to the course email list. Check at least daily.
  • Participate in class and via the piazza site.
  • Don't plagiarize.

Course Logistics

  • Instructor: Associate Professor Tucker Balch
    • Office hours: Tu/Th 4:30-5:30 (after class) or by appointment
    • firstname at cc.gatech.edu
    • phone 678-523-8685
  • TA Vishal Shekhar
    • Office hours: Thursdays 5:00-6:00PM (please schedule)
    • Office: CCB 308
    • mailvishalshekhar-at-gmail-dot-com
  • TA Joe Lin
    • Office hours: Tuesdays 4:30-6:00PM (please schedule)
    • Office: CCB 308
    • joelin-at-gmail-dot-com
  • Time/Location: Tuesday & Thursday 3:05 to 4:30, map: [1]
  • Course Website: http://wiki.quantsoftware.org/wiki/ML4Trading. (this webpage)
  • Some parts of the course will be based on readings from this book: Active Portfolio Management by Grinold & Kahn. You will have to complete the readings, but you don't have to buy the book: if you are GT faculty or a GT student, you may be able to read it online [here] at no cost.
  • Prerequisites: Machine learning and portfolio management experience is not assumed; the course is designed to provide students with the necessary background they will need on these topics. Programming will be in the Python language. Students are expected to be strong programmers (or willing to invest significant effort in learning to program in Python).
  • Computing_environment_for_ML4Trading
  • GT T-Square site for the class
  • discussion forum on piazza.com.
  • Resources

Goals For the Course

By the end of this course, students should be able to:

  • Understand data structures used for algorithmic trading.
  • Know how to construct software to access live equity data, assess it, and make trading decisions.
  • Understand 3 popular machine learning algorithms and how to apply them to trading problems.
  • Understand how to assess a machine learning algorithm's performance for time series data (stock price data).
  • Know how and why data mining (machine learning) techniques fail.
  • Construct a stock trading software system that uses current daily data.

Some limitations/constraints:

  • We use daily data. This is not an HFT course, but many of the concepts here are relevant.
  • We don't interact (trade) directly with the market, but we will generate equity allocations that you could trade if you wanted to.

Grading

  • Projects: 65%
  • Exam 1: 6%
  • Exam 2: 7%
  • Homeworks: 10%
  • Class participation: 5%
  • Exam 3: 7%

Plagiarism

I expect that, in most cases, all code you submit was written by you. I will present some libraries in class that you are allowed to use (such as pandas and numpy). Otherwise, all source code, images, and write-ups you provide should have been created by you alone.

What is allowed:

  • Meeting with other students to discuss implementations. You should talk about solutions at the pseudocode level.
  • Sharing snippets of code to solve specific (small) problems such as examples of how to address sections of arrays in Python. In this case the shared code should not be more than 5 lines.
  • Searching the web for other solution outlines that you may draw on (but not copy directly). If you are inspired by a solution on the web, you MUST cite that code with comments in your code.

What is not allowed:

  • Copying sections of code longer than 5 lines. Note that merely changing variable names does not suffice.
  • Copying code from the web.
  • Use of ideas from the web that are not cited in comments.

Projects & Homework

Other Resources

Syllabus

Day Date Lecture Topics Projects Due/Additional Info
August
Tuesday August 23 Class overview
Thursday August 25 lecture 1: Infrastructure of a quant shop.

Policy learning, prediction learning.
see also: Work_Flow_Guide

Tuesday August 30 lecture 2: Why does information impact stock price?

Lecture: What is a company worth?

Thursday August 25 The Capital Asset Pricing Model

Video: Cramer's Guide to the Stock Market
Lecture: Expectation, Risk, and Diversification

Read Chapters 1 & 2 from Grinold & Kahn

Homework 1 due at 11:55PM


Tuesday September 13 Work through setting up environment and tutorial1.py Homework 2 due at 11:55PM
Thursday September 15 Review of hedge fund infrastructure

What goes into an optimizer? And what comes out?
Efficient frontier

Homework 3 due at 11:55PM
Tuesday September 20 Lecture 3: Markets, Data
Thursday September 22 Technical Analysis & Event Profiling Part 1
Tuesday September 27 Technical Analysis & Event Profiling Part 2

paper about event studies Details of Homework 4

Tuesday October 4 Event Profiling Part 3

KNN (statement of problem in terms of regression) Andrew Moore's slides on KNN

Thursday October 6 Anatomy of a trade

KNN intro.

Tuesday October 11 Review of event study code and assignment.

KNN code intro.

Thursday October 13 Jonathan Clarke

Analysis of Spatial Arbitrage

...
Tuesday October 25 Eric Gilbert

Widespread Worry and the Stock Market

Thursday October 27 Full description of KNN project (Project 2)
Tuesday November 1 Assessing learners: Time, Correlation, Error
Thursday November 3 Using and building decision trees (and KD trees)

time complexity of KNN versus decision trees
problems with KNN
features: importance of scaling to roughly -1 to 1: normalize by standard deviation (STD)
momentum
daily return std
"bollinger number"
52 week high/low
price - moving average
days since crossover
volume / avg(52 week volume)
project 3
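
The feature list above can be sketched in pandas as follows. This is only an illustrative sketch, not the course's reference code: the function names (`compute_features`, `normalize`), the 20-day window, the 252-trading-day year, and the exact definition of each indicator (in particular the "bollinger number") are assumptions, and the price and volume series are synthetic.

```python
import numpy as np
import pandas as pd


def compute_features(close, volume, window=20, year=252):
    """Build a few of the listed indicators from daily close and volume Series.

    All definitions here are assumed for illustration.
    """
    daily_ret = close.pct_change()
    sma = close.rolling(window).mean()
    rolling_std = close.rolling(window).std()

    feats = pd.DataFrame(index=close.index)
    # Momentum: fractional price change over the lookback window.
    feats["momentum"] = close / close.shift(window) - 1.0
    # Standard deviation of daily returns over the window.
    feats["daily_ret_std"] = daily_ret.rolling(window).std()
    # "Bollinger number" (assumed): distance from the moving average,
    # measured in rolling standard deviations of price.
    feats["bollinger"] = (close - sma) / rolling_std
    # Price minus moving average.
    feats["price_minus_sma"] = close - sma
    # Price relative to the trailing 52-week high.
    feats["pct_of_52wk_high"] = close / close.rolling(year).max()
    # Volume relative to the trailing 52-week average volume.
    feats["rel_volume"] = volume / volume.rolling(year).mean()
    return feats


def normalize(feats):
    """Center each column and divide by its standard deviation."""
    return (feats - feats.mean()) / feats.std()


# Synthetic data (hypothetical): a random-walk price and random volume.
rng = np.random.default_rng(0)
dates = pd.bdate_range("2010-01-01", periods=600)
close = pd.Series(100 * np.exp(np.cumsum(rng.normal(0, 0.01, 600))), index=dates)
volume = pd.Series(rng.integers(1_000_000, 5_000_000, 600).astype(float), index=dates)

features = normalize(compute_features(close, volume))
```

After normalization each column has zero mean and unit standard deviation, so most values fall roughly between -1 and 1 and no single feature dominates a distance-based learner such as KNN. Rows at the start of the series are NaN until each rolling window fills.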

Tuesday November 8 Finding the best indicators to use

adding features method
lesioning features method

Thursday November 10 Finding the best indicators to use

adding features method
lesioning features method
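
The two indicator-selection methods named above can be sketched with a small KNN regression learner. This is a minimal sketch under stated assumptions, not the course's implementation: "adding features" is read here as greedy forward selection, "lesioning" as removing one feature at a time and measuring the damage, and learners are compared by out-of-sample RMS error on a time-ordered split. All function names and the synthetic data are hypothetical.

```python
import numpy as np


def knn_predict(train_x, train_y, query_x, k=3):
    """Predict each query's Y as the mean Y of its k nearest training points."""
    # Pairwise Euclidean distances, shape (n_query, n_train).
    diffs = query_x[:, None, :] - train_x[None, :, :]
    dists = np.sqrt((diffs ** 2).sum(axis=2))
    nearest = np.argsort(dists, axis=1)[:, :k]
    return train_y[nearest].mean(axis=1)


def rms_error(x, y, cols, split, k=3):
    """Out-of-sample RMS error of KNN using only the feature columns in `cols`."""
    # Train on the earlier rows and test on the later rows.
    pred = knn_predict(x[:split][:, cols], y[:split], x[split:][:, cols], k)
    return np.sqrt(np.mean((pred - y[split:]) ** 2))


def add_features(x, y, split, k=3):
    """Adding method: greedily add the feature that lowers the error the most."""
    chosen, best = [], np.inf
    remaining = list(range(x.shape[1]))
    while remaining:
        trial = {c: rms_error(x, y, chosen + [c], split, k) for c in remaining}
        col = min(trial, key=trial.get)
        if trial[col] >= best:
            break  # no remaining feature improves the learner
        chosen.append(col)
        remaining.remove(col)
        best = trial[col]
    return chosen, best


def lesion_features(x, y, split, k=3):
    """Lesioning method: error increase when each feature is removed in turn."""
    all_cols = list(range(x.shape[1]))
    full = rms_error(x, y, all_cols, split, k)
    return {c: rms_error(x, y, [j for j in all_cols if j != c], split, k) - full
            for c in all_cols}


# Synthetic data (hypothetical): Y depends on features 0 and 1; feature 2 is noise.
rng = np.random.default_rng(0)
x = rng.normal(size=(300, 3))
y = 2.0 * x[:, 0] - x[:, 1] + 0.05 * rng.normal(size=300)

chosen, best_err = add_features(x, y, split=200)
lesion = lesion_features(x, y, split=200)
```

In the lesioning result, a large positive value means the learner got noticeably worse without that feature, while a value near zero or below suggests the feature adds little or only noise.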

