CompInvestI Homework 1

From Quantwiki

Overview

The purpose of this assignment is to

  • Introduce you to historical equity data
  • Introduce you to Python & Numpy, and
  • Give you a first look at portfolio optimization

We also hope it will get you started having opinions about equities. In this assignment you will create and optimize a portfolio for the year 2011.

Important note: This is not a realistic way to build a strong portfolio going forward. The intent is for you to learn how to assess a portfolio.

To do

Part 1: Examine QSTK_Tutorial_1. You can use that code as a template for this assignment.

Part 2: Write a Python function that can simulate and assess the performance of a 4 stock portfolio.

Inputs to the function include:

  • Start date
  • End date
  • Symbols for the equities (e.g., GOOG, AAPL, GLD, XOM)
  • Allocations to the equities at the beginning of the simulation (e.g., 0.2, 0.3, 0.4, 0.1)

The function should return:

  • Standard deviation of daily returns of the total portfolio
  • Average daily return of the total portfolio
  • Sharpe ratio of the total portfolio (always assume 252 trading days in a year and a risk-free rate of 0)
  • Cumulative return of the total portfolio
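As a rough illustration of the four statistics listed above, here is a minimal sketch that computes them from a series of daily portfolio values. The function name portfolio_stats is hypothetical, and the use of NumPy's default (population) standard deviation is an assumption. The first day is given a daily return of 0 so that it is included in the statistics, as the notes later in this assignment request. Cumulative return is expressed as final value divided by initial value, which matches the example output (e.g., 1.16...).

```python
import numpy as np

TRADING_DAYS = 252  # per the assignment


def portfolio_stats(values):
    """Return (vol, avg_daily_ret, sharpe, cum_ret) for daily portfolio values.

    Hypothetical helper, not part of QSTK.
    """
    values = np.asarray(values, dtype=float)
    # Daily return on day t is values[t] / values[t-1] - 1. The first day
    # keeps a return of 0 so that it is included in the statistics.
    daily_rets = np.zeros_like(values)
    daily_rets[1:] = values[1:] / values[:-1] - 1.0
    vol = daily_rets.std()    # population std (ddof=0); an assumption
    avg = daily_rets.mean()
    sharpe = np.sqrt(TRADING_DAYS) * avg / vol  # risk-free rate = 0
    cum_ret = values[-1] / values[0]
    return vol, avg, sharpe, cum_ret
```

If your numbers differ slightly from the examples, the standard deviation convention (population versus sample) is one place to look.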

An example of how you might call the function in your program:

vol, daily_ret, sharpe, cum_ret = simulate(startdate, enddate, ['GOOG','AAPL','GLD','XOM'], [0.2,0.3,0.4,0.1])

Some assumptions:

  • Allocate some amount of value to each equity on the first day. You then "hold" those investments for the entire year.
  • Use adjusted close data. In QSTK, this is 'close'
  • Report statistics for the entire portfolio

Part 2.5: Make sure your simulate() function gives correct output. Check it against the examples below.

Part 3: Use your function to create a portfolio optimizer!

Create a for loop (or nested for loop) that enables you to test every "legal" set of allocations to the 4 stocks. Keep track of the "best" portfolio, and print it out at the end.

  • "Legal" set of allocations means: The allocations sum to 1.0. The allocations are in 10% increments.
    • Example legal allocations: [1.0, 0.0, 0.0, 0.0], [0.1, 0.1, 0.1, 0.7]
  • "Best" portfolio means: Highest Sharpe Ratio.
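One possible way to enumerate the legal allocations is sketched below. The names legal_allocations and best_allocation are hypothetical, and score stands in for whatever function you use to rate an allocation (for example, the Sharpe ratio returned by your simulate()). The sketch counts in integer tenths, because sums of floats such as 0.1 + 0.2 + 0.7 do not always compare equal to 1.0.

```python
from itertools import product


def legal_allocations(n_assets=4, steps=10):
    """Yield every allocation in increments of 1/steps that sums to 1.0."""
    # Work in integer units (tenths by default) so the sum test is exact.
    for combo in product(range(steps + 1), repeat=n_assets):
        if sum(combo) == steps:
            yield [k / float(steps) for k in combo]


def best_allocation(score):
    """Return (best_score, best_alloc) under a caller-supplied score function."""
    best = None
    for alloc in legal_allocations():
        s = score(alloc)
        if best is None or s > best[0]:
            best = (s, alloc)
    return best
```

With 4 equities and 10% increments this generator yields 286 legal allocations, so a brute-force search is cheap.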

Part 4:

  • Create a chart that illustrates the value of your portfolio over the year and compares it to SPY.
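A minimal sketch of such a chart follows, assuming you already have the daily portfolio values and the SPY prices for the same dates. The function names, the axis labels, and the output file name are all hypothetical, and matplotlib is assumed to be installed. Both series are normalized to start at 1.0 so that they can be compared on the same axis.

```python
import numpy as np


def normalize(values):
    """Scale a series so that its first value is 1.0."""
    values = np.asarray(values, dtype=float)
    return values / values[0]


def plot_vs_benchmark(dates, portfolio, benchmark, path="portfolio_vs_spy.png"):
    """Save a line chart of the normalized portfolio and benchmark values."""
    # Imported here so that normalize() can be used without matplotlib.
    import matplotlib
    matplotlib.use("Agg")  # render to a file; no display needed
    import matplotlib.pyplot as plt

    fig, ax = plt.subplots()
    ax.plot(dates, normalize(portfolio), label="Portfolio")
    ax.plot(dates, normalize(benchmark), label="SPY")
    ax.set_xlabel("Date")
    ax.set_ylabel("Normalized value")
    ax.legend()
    fig.autofmt_xdate()  # tilt the date labels so they do not overlap
    fig.savefig(path)
    plt.close(fig)
    return path
```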

Example output

Here's an example output for your program. These are actual correct examples that you can use to check your work.

Start Date: January 1, 2011
End Date: December 31, 2011
Symbols: ['AAPL', 'GLD', 'GOOG', 'XOM']
Optimal Allocations: [0.4, 0.4, 0.0, 0.2]
Sharpe Ratio: 1.02828403099
Volatility (stdev of daily returns):  0.0101467067654
Average Daily Return:  0.000657261102001
Cumulative Return:  1.16487261965

Start Date: January 1, 2010
End Date: December 31, 2010
Symbols: ['AXP', 'HPQ', 'IBM', 'HNZ']
Optimal Allocations:  [0.0, 0.0, 0.0, 1.0]
Sharpe Ratio: 1.29889334008
Volatility (stdev of daily returns): 0.00924299255937
Average Daily Return: 0.000756285585593
Cumulative Return: 1.1960583568

Minor differences in float values may arise due to different implementations.

Note: It might be a good idea to clear the cache before starting the homework. To do so, go to the scratch directory that is printed every time you run the program and delete everything in that QSScratch directory. An easier way to do this is to use:

c_dataobj = da.DataAccess('Yahoo', cachestalltime=0)

Implementation suggestions & assumptions

It is useful to look at QSTK_Tutorial_1 and QSTK_Tutorial_3, but please realize that the method in Tutorial 3 assumes daily rebalancing, which we do not use here.

Here is a suggested outline for your simulate() code:

  • Read in adjusted closing prices for the 4 equities.
  • Normalize the prices according to the first day. The first row for each stock should have a value of 1.0 at this point.
  • Multiply each column by the allocation to the corresponding equity.
  • Sum each row for each day. That is your cumulative daily portfolio value.
  • Compute statistics from the total portfolio value.
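The normalize, multiply, and sum steps of this outline might be sketched as follows. This assumes the adjusted closing prices have already been read into a 2-D array with one row per day and one column per equity; the function name portfolio_values is hypothetical, and reading the data from QSTK is not shown.

```python
import numpy as np


def portfolio_values(prices, allocations):
    """Return the daily value of a buy-and-hold portfolio.

    prices: 2-D array-like, one row per day, one column per equity.
    allocations: one fraction per equity, summing to 1.0.
    """
    prices = np.asarray(prices, dtype=float)
    allocations = np.asarray(allocations, dtype=float)
    normalized = prices / prices[0, :]   # first row becomes all 1.0
    weighted = normalized * allocations  # scale each column by its allocation
    return weighted.sum(axis=1)          # one portfolio value per day
```

Because the allocations sum to 1.0, the returned series starts at 1.0, and its last element is the cumulative return.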

Here are some notes and assumptions:

  • When we compute statistics on the portfolio value, we include the first day.
  • We assume you are using the data provided with QSTK. If you use other data your results may turn out different from ours. Yahoo's online data changes every day. We could not build a consistent "correct" answer based on "live" Yahoo data.
  • Assume 252 trading days/year.

What to expect when you turn in your assignment

First, make sure your program is working correctly by checking your output against a few of our model examples. Once you're ready, take the quiz. The quiz will give you a start and end date, as well as a set of equities to use. You should run your program with those values. The quiz will ask you about the values your program calculates.

Extra challenges

If you felt that this assignment was "too easy" try these additional challenges. Sorry, but we don't offer extra credit:

  • Note that we requested the optimal portfolio in allocation "chunks" of 10%. This was to keep the search space down. There are at most 10,000 legal portfolios (10*10*10*10). If we wanted a more precise answer, say in 1% increments, it would require you to check up to 100,000,000 portfolios, and may take too long for you to check them in a brute-force manner. Challenge: Devise a way to efficiently search the space of possible portfolios so that you can find a more precise answer without having to test 100M portfolios (hint: gradient ascent).
  • Find the optimal portfolio of N equities given M equities as input. For example, what is the best portfolio of 10 stocks given all of the S&P 500?
  • Allow short positions (i.e., negative allocations).
  • Brag about your results on Piazza!
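For the first challenge, one simple starting point is a local search that repeatedly moves 1% of the allocation from one equity to another whenever that improves the score. Note that this is pairwise hill climbing, not true gradient ascent, and it can stop at a local optimum. The function name hill_climb is hypothetical, and score stands in for a function that returns the Sharpe ratio of an allocation.

```python
def hill_climb(score, start, step=1, total=100):
    """Pairwise hill climbing over allocations in integer percent units.

    score: function mapping a list of fractions to a number to maximize.
    start: list of integer percentages that sum to total.
    Returns (allocation_as_fractions, best_score).
    """
    current = list(start)
    best = score([p / float(total) for p in current])
    improved = True
    while improved:
        improved = False
        for i in range(len(current)):
            for j in range(len(current)):
                # Skip no-op moves and moves that would go negative.
                if i == j or current[i] < step:
                    continue
                cand = list(current)
                cand[i] -= step  # take one step from equity i
                cand[j] += step  # and give it to equity j
                s = score([p / float(total) for p in cand])
                if s > best:
                    current, best, improved = cand, s, True
    return [p / float(total) for p in current], best
```

Starting the search from the best 10% allocation found by brute force, or from several different starting points, is one way to reduce the risk of stopping at a poor local optimum.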

Deliverables for on campus GT students

(Note that students taking the course via Coursera should complete this assignment by taking the quiz. It will ask you to optimize a different set of equities and confirm your results.)

GT students, please ensure that your program generates output as per the examples above. Your program should accept a command line like this:

python optimizer.py startyear startmonth startday endyear endmonth endday symbol1 symbol2 symbol3 symbol4

So, for example:

python optimizer.py 2010 1 1 2011 1 1 AAPL GLD GOOG XOM

This would optimize over the period from Jan 1 2010 to Jan 1 2011 using the symbols AAPL, GLD, GOOG and XOM.
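A minimal sketch of parsing that command line is shown below. The function name parse_args is hypothetical, and it assumes you want the dates as datetime objects. In optimizer.py you would call it as parse_args(sys.argv[1:]).

```python
from datetime import datetime


def parse_args(argv):
    """Parse the 10 arguments that follow the script name.

    Expected order: startyear startmonth startday endyear endmonth endday
    symbol1 symbol2 symbol3 symbol4
    """
    if len(argv) != 10:
        raise ValueError("expected 6 date fields followed by 4 symbols")
    nums = [int(x) for x in argv[:6]]
    start = datetime(nums[0], nums[1], nums[2])
    end = datetime(nums[3], nums[4], nums[5])
    symbols = list(argv[6:])
    return start, end, symbols
```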

Run your optimizer with the following command lines and place the output in your report:

python optimizer.py 2010 6 1 2011 6 1 AAPL GLD GOOG XOM
python optimizer.py 2004 1 1 2006 1 1 MMM MO MSFT INTC

Submit the following via T-square:

  • Your report: report.pdf, including
    • The output of the two command lines above
    • Details about any of the extra credit components you attempted, including results and commentary.
  • Your code: optimizer.py
  • Additional code that supports your extra credit components.