Coding Standard

PEP 8

We follow Python Enhancement Proposal 8 (PEP 8). To enforce this, we ask that you install pylint and run it (with the default parameters) on code before you check it in. As a rule of thumb, we expect the code to score at least 8.0/10 as reported by pylint. The existing code does not yet conform, but we are re-formatting it little by little. Please see qstkfeat/classes.py for a 10/10 pylint example.

Naming Conventions

Although PEP 8 allows for various naming schemes, we use lower_case_with_underscores for variables, short lowercase names for modules, and CamelCase for classes. If you are in doubt, look at PEP 8.

Hungarian Notation

Because dealing with complex data types in a duck-typed language can be extremely confusing, we are experimenting with Hungarian notation. We have generally found that programmers already do this to a certain extent, e.g. with names like symbol_list, start_time, and data_array. These are all poor attempts to describe the actual intended type. Under our Hungarian standard they would become ls_symbols, dt_start, and na_data: a list of strings, a datetime, and a numpy array.

This often makes naming a much easier, more brainless procedure. For example, if you have a numpy array of returns and you convert it to a DataFrame, you will have na_returns before and df_returns after. There is no need to think of two different names for exactly the same data. Likewise, when iterating over a list, simply remove the list prefix, e.g.

for s_sym in ls_sym:

Again, most programmers already do this:

for symbol in symbols:

or:

for symbol in all_symbol:

but in a non-standardized way.
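
As a rough illustration of the convention, the sketch below uses only the standard library. The symbols, prices, and date are made-up sample values, not data from QSTK.

```python
import datetime as dt

# Hypothetical sample data, named with the prefixes described above.
ls_symbols = ["AAPL", "GOOG", "IBM"]                       # list of strings
dt_start = dt.datetime(2012, 3, 20)                        # datetime
d_prices = {"AAPL": 600.0, "GOOG": 640.0, "IBM": 205.0}    # dict

# Dropping the list prefix names the loop variable: ls_symbols -> s_symbol.
lf_prices = []
for s_symbol in ls_symbols:
    f_price = d_prices[s_symbol]    # a single float
    lf_prices.append(f_price)       # collected into a list of floats
```

Each name states its intended type, so a reader can tell that lf_prices is a list of floats without tracing where it was built.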

Downsides/Exceptions

One downside is that you cannot reuse variable names, e.g. you have to change from na_returns to df_returns when you do the conversion. We do not view this as much of a downside, since we strongly discourage variable name reuse anyway. The biggest downside arises when writing functions that support multiple types, which is one of the things that makes Python extremely powerful. For these cases you can use o_ and c_ as generic object and class prefixes.
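
One possible reading of the generic prefixes is sketched below. The function build_container is a hypothetical helper written for this example, not part of QSTK.

```python
def build_container(c_container, o_items):
    """
    @summary: Builds a container of the requested class from any iterable.
    @param c_container: Generic class to instantiate, e.g. list or set.
    @param o_items: Generic iterable object holding the items.
    @return: New instance of c_container filled with the items.
    """
    o_result = c_container(o_items)
    return o_result


# Once the concrete type is known, the specific prefixes apply again.
ls_symbols = build_container(list, ("AAPL", "IBM"))
ss_symbols = build_container(set, ["AAPL", "IBM", "AAPL"])
```

Inside the function the types are deliberately unknown, so the generic prefixes are the honest choice; the callers know what they asked for and use ls_ and ss_.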

Expectations

Please use the Hungarian notation below where it is already in place, and if you are writing something from scratch we strongly encourage you to try it. If you encounter a commonly used type or class which does not have a standard prefix, add it to the table. Whether you like or dislike this practice, please report your thoughts to John Cornwell. Please see qstkfeat/classes.py for a good example of this notation.

Standard Types

Prefix    Type                  Examples
i_*       Integer               i_index
l_*       Long                  l_index
b_*       Boolean               b_flag
f_*       Float                 f_returns
s_*       String                s_symbol
s<>_*     Set of <>             ss_stringset
t_*       Tuple                 t_index
d_*       Dict                  d_data
c_*       Generic Class         c_writer
fc_*      Function              fc_callback
l<>_*     List of <>            lf_float_list
ll<>_*    List of list of <>    llf_list_of_float_lists
fr_*      File for reading      fr_data
fw_*      File for writing      fw_data

3rd Party Types

Prefix    Type                  Examples
na_*      Numpy Array           na_data
ts_*      Pandas TimeSeries     ts_close
df_*      Pandas DataFrame      df_close

Whitespace

We indent with spaces only (no tabs), four spaces per indent level, and use Unix line endings. No exceptions here.

Please follow the PEP 8 guidelines on whitespace; pylint will enforce them.

Unit Testing

When adding modules to QSTK, we ask that they be covered as thoroughly as possible by unit tests. Please follow the basic example here. Each package should have a sub-package named tests, with test files named test_<what_this_tests>.py. See quicksim/tests for a working example. If you download nose and run nosetests in the root directory of QSTK, your tests should be run automatically. If they are not, you most likely have something named incorrectly.

Additionally, you should install the coverage module so that you can run nosetests --with-coverage to see how much of the code your test cases actually cover.
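
A minimal sketch of what such a test file might contain is shown below, using the standard unittest module. The function cumulative_return and the file name tests/test_cumulative_return.py are hypothetical and exist only for this example; a real test would import the function from the package under test instead of defining it inline.

```python
import unittest


def cumulative_return(lf_prices):
    """Hypothetical function under test: total return over a price series."""
    return lf_prices[-1] / lf_prices[0] - 1.0


class TestCumulativeReturn(unittest.TestCase):
    """Test case written as a unittest.TestCase subclass with test_* methods."""

    def test_gain(self):
        # 100 -> 110 is a 10% gain (compared approximately, as floats).
        self.assertAlmostEqual(cumulative_return([100.0, 110.0]), 0.1)

    def test_flat(self):
        # An unchanged price gives a zero return.
        self.assertAlmostEqual(cumulative_return([50.0, 50.0]), 0.0)
```

Keeping the test_ prefix on both the file name and the method names is what lets a test runner find the tests without extra configuration.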

Docstrings

Please have the following header at the top of each file you create:

'''
(c) 2011, 2012 Georgia Tech Research Corporation
This source code is released under the New BSD license.  Please see
http://wiki.quantsoftware.org/index.php?title=QSTK_License
for license details.

Created on Mar 20, 2012

@author: <author>
@contact: <contact>
@summary: <summary>
'''

# Python imports


# 3rd party imports


# QSTK imports


In addition, all of your public methods must have a docstring. We use Epydoc; a sample method docstring is shown below. The summary and param fields are required; use all others at your own discretion. A full list of fields can be found here.


def quickSim(alloc, historic, start_cash):
    """
    @summary: Quickly back tests an allocation for certain historical data,
              using a starting fund value.
    @param alloc: DataMatrix containing timestamps to test as indices and
                  Symbols to test as columns, with _CASH symbol as the last
                  column.
    @param historic: Historic dataframe of equity prices.
    @param start_cash: Integer specifying initial fund value.
    @return: TimeSeries with fund values for each day in the back test.
    @rtype: TimeSeries
    @see: See note
    @note: Make sure to check warning
    @warning: Disregard
    """


__init__.py

If you are going to create a module or package, please read Python Modules before structuring your package. We are going to attempt to move away from syntax like the following:

import qstkfeat.featutil
import qstkfeat.features
import qstkfeat.classes  

qstkfeat.features.featFunc()
qstkfeat.featutil.utilFunc()

If you have related module functions or classes that will most likely be used together, simply import them all in __init__.py. That is why the file exists. We then get the following syntax, which is cleaner and still makes it easy to track down which namespace a function comes from.

import qstkfeat as qsfeat

qsfeat.featFunc()
qsfeat.utilFunc()
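
To make the mechanism concrete, the sketch below builds a throwaway package in a temporary directory and imports it through its __init__.py. The package name demo_feat and the functions feat_func and util_func are hypothetical stand-ins for this example; they are not the actual contents of qstkfeat.

```python
import importlib
import os
import shutil
import sys
import tempfile

# Create a temporary directory holding a hypothetical package, demo_feat.
s_root = tempfile.mkdtemp()
s_pkg = os.path.join(s_root, "demo_feat")
os.mkdir(s_pkg)

d_sources = {
    "features.py": "def feat_func():\n    return 'feature'\n",
    "featutil.py": "def util_func():\n    return 'util'\n",
    # __init__.py re-exports the names that callers are expected to use.
    "__init__.py": (
        "from demo_feat.features import feat_func\n"
        "from demo_feat.featutil import util_func\n"
    ),
}
for s_name, s_code in d_sources.items():
    with open(os.path.join(s_pkg, s_name), "w") as fw_module:
        fw_module.write(s_code)

# Make the package importable, then import it once at the top level.
sys.path.insert(0, s_root)
importlib.invalidate_caches()
demofeat = importlib.import_module("demo_feat")

# Both functions are reachable from the package namespace.
s_feature = demofeat.feat_func()
s_util = demofeat.util_func()

# Clean up the temporary files; the loaded module stays usable.
sys.path.remove(s_root)
shutil.rmtree(s_root)
```

In a real package only the two import lines in __init__.py are needed; the rest of the sketch exists so that it can run on its own.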