We follow Python Enhancement Proposal 8 (PEP 8). To enforce this, we ask that you install pylint and run it with the default parameters on your code before you check it in. As a rule of thumb, we expect code to score at least 8.0/10 as reported by pylint. The existing code does not yet conform, but we are reformatting it little by little. Please see qstkfeat/classes.py for a 10/10 pylint example.
Although PEP 8 allows various naming schemes, we use
lower_case_with_underscores for variables, short lowercase names for modules, and
CamelCase for classes. If you are in doubt, consult PEP 8.
Because dealing with complex data types in a duck-typed language can be extremely confusing, we are experimenting with Hungarian notation. We have found that programmers already do this to a certain extent, e.g. with names like
symbol_list, start_time, and data_array. These are all loose attempts to describe the intended type. Under our Hungarian standard they become
ls_symbols, dt_start, and na_data: a list of strings, a datetime, and a numpy array.
This also makes naming much easier and more mechanical. For example, if you have a numpy array of returns and convert it to a dataframe, you will have
na_returns before and
df_returns after. There is no need to think of two different names for exactly the same data. Likewise, when iterating over a list, simply remove the list prefix, e.g.
for s_sym in ls_sym:
Again, most programmers already do this:
for symbol in symbols:
for symbol in all_symbol:
but in a non-standardized way.
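As a minimal sketch of the prefixes discussed so far (the symbols and the date are made-up values, and ls_lower is a hypothetical name), the loop variable simply inherits the element prefix of the list it iterates over:

```python
import datetime as dt

# ls_ = list of strings, dt_ = datetime (values are illustrative only).
ls_symbols = ["AAPL", "GOOG", "MSFT"]
dt_start = dt.datetime(2012, 3, 20)

# Dropping the list prefix names the loop variable: ls_ -> s_.
ls_lower = []
for s_sym in ls_symbols:
    ls_lower.append(s_sym.lower())
```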
One downside is that you cannot reuse variable names, e.g. you have to change from na_returns to
df_returns when you do the conversion. We do not view this as much of a downside, since we strongly discourage variable name reuse anyway.
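The na_ to df_ conversion mentioned above might look like the sketch below. It assumes numpy and pandas are installed; the symbols and return values are invented for illustration.

```python
import numpy as np
import pandas as pd

ls_symbols = ["AAPL", "GOOG"]

# na_ = numpy array: two days of returns for two symbols (made-up numbers).
na_returns = np.array([[0.01, -0.02],
                       [0.03, 0.00]])

# Same data in a new container, so only the prefix changes: na_ -> df_.
df_returns = pd.DataFrame(na_returns, columns=ls_symbols)
```

Both names stay in scope after the conversion, which is consistent with the rule against reusing a variable name for a different type.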
The biggest downside arises when writing functions that support multiple types, one of the things that makes Python extremely powerful. For these cases you can use o_ and c_ as generic object and class prefixes.
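One way the generic prefixes could be used is sketched below. The describe function is a hypothetical helper, not part of QSTK; it accepts any sized container, so its argument gets the o_ prefix, and the class object it looks up gets c_.

```python
def describe(o_data):
    """Return a short description of any sized container (hypothetical helper)."""
    # c_ holds a class object; o_ holds an instance whose type is not fixed.
    c_type = type(o_data)
    return "%s with %d items" % (c_type.__name__, len(o_data))


s_list_info = describe(["AAPL", "GOOG"])
s_set_info = describe(set(["AAPL"]))
```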
Please use the Hungarian notation below where it is already in place; if you are writing something from scratch, we strongly encourage you to try it. If you encounter a commonly used type or class that does not have a standard prefix, add it to the table. Whether you like or dislike this practice, please report your thoughts to John Cornwell. Please see qstkfeat/classes.py for a good example of this notation.
|s<>_*||Set of <>||ss_stringset|
|l<>_*||List of <>||lf_float_list|
|ll<>_*||List of list of <>||llf_list_of_float_lists|
|fr_*||File for reading||fr_data|
|fw_*||File for writing||fw_data|
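The container and file prefixes in the rows above could be applied as in the following sketch. The values and the temporary file are made up, and the element prefixes s_ (string) and f_ (float) are inferred from ls_ and lf_ rather than taken from the table.

```python
import os
import tempfile

ss_symbols = set(["AAPL", "GOOG"])      # s<>_  : set of strings
lf_prices = [101.5, 99.25]              # l<>_  : list of floats
llf_rows = [[1.0, 2.0], [3.0, 4.0]]     # ll<>_ : list of lists of floats

s_path = os.path.join(tempfile.mkdtemp(), "prices.txt")

# fw_ marks a file opened for writing.
with open(s_path, "w") as fw_data:
    for f_price in lf_prices:
        fw_data.write("%s\n" % f_price)

# fr_ marks a file opened for reading.
with open(s_path, "r") as fr_data:
    lf_read = [float(s_line) for s_line in fr_data]
```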
3rd Party Types
We use spaces only (no tabs), with 4 spaces per indent level, and Unix line endings. No exceptions here.
Please follow the PEP 8 guidelines on whitespace; pylint will enforce them.
When adding modules to QSTK, we ask that they be covered by unit tests as much as possible. Please follow the basic example here. Each package should have a sub-package named tests, with test files named test_<what_this_tests>.py. See quicksim/tests for a working example. If you download nose and run nosetests in the root directory of QSTK, your tests should run automatically. If they do not, you most likely have something named incorrectly.
Additionally, you should install the coverage module so that you can run nosetests --with-coverage to see how much of the code your test cases actually cover.
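A test file following this naming scheme might look like the sketch below, saved for example as tests/test_cumulative_return.py. The function under test, cumulative_return, is hypothetical and is defined inline only to keep the example self-contained; in a real package it would be imported from the module being tested.

```python
import unittest


def cumulative_return(lf_prices):
    """Hypothetical function under test: total return over a price list."""
    return lf_prices[-1] / lf_prices[0] - 1.0


class TestCumulativeReturn(unittest.TestCase):
    """nose collects unittest.TestCase subclasses found in test_*.py files."""

    def test_gain(self):
        self.assertAlmostEqual(cumulative_return([100.0, 110.0]), 0.10)

    def test_flat(self):
        self.assertAlmostEqual(cumulative_return([50.0, 50.0]), 0.0)
```

Using the standard unittest module keeps the tests runnable without nose as well.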
Please have the following header at the top of each file you create:
'''
(c) 2011, 2012 Georgia Tech Research Corporation
This source code is released under the New BSD license. Please see
http://wiki.quantsoftware.org/index.php?title=QSTK_License
for license details.

Created on Mar 20, 2012

@author: <author>
@contact: <contact>
@summary: <summary>
'''

# Python imports

# 3rd party imports

# QSTK imports
In addition, all of your public methods must have a docstring. We use Epydoc; a sample method docstring is shown below. The required fields are summary and param; use all others at your own discretion. A full list of fields can be found in the Epydoc documentation.
def quickSim(alloc, historic, start_cash):
    """
    @summary: Quickly back tests an allocation for certain historical data,
              using a starting fund value.
    @param alloc: DataMatrix containing timestamps to test as indices and
                  Symbols to test as columns, with _CASH symbol as the last column.
    @param historic: Historic dataframe of equity prices.
    @param start_cash: Integer specifying initial fund value.
    @return: TimeSeries with fund values for each day in the back test.
    @rtype: TimeSeries
    @see: See note
    @note: Make sure to check warning
    @warning: Disregard
    """
If you are going to create a module or package, please read Python Modules before structuring your package. We are going to attempt to move away from syntax like the following:
import qstkfeat.featutil
import qstkfeat.features
import qstkfeat.classes

qstkfeat.features.featFunc()
qstkfeat.featutil.utilFunc()
If you have related module functions or classes that will most likely be used together, simply include them all in __init__.py. That is why the file exists. This gives the following syntax, which is cleaner and still makes it easy to track down function namespaces.
import qstkfeat as qsfeat

qsfeat.featFunc()
qsfeat.utilFunc()
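To see the effect without touching QSTK itself, the sketch below builds a throwaway package in a temporary directory and imports it. The package name demofeat and its two functions are hypothetical stand-ins for qstkfeat, featFunc, and utilFunc.

```python
import importlib
import os
import sys
import tempfile

# Build a throwaway package on disk; "demofeat" is a hypothetical name.
s_root = tempfile.mkdtemp()
s_pkg = os.path.join(s_root, "demofeat")
os.mkdir(s_pkg)

# Related functions live directly in __init__.py.
with open(os.path.join(s_pkg, "__init__.py"), "w") as fw_init:
    fw_init.write("def feat_func():\n    return 'feature'\n\n")
    fw_init.write("def util_func():\n    return 'utility'\n")

# Make the temporary directory importable, then load the package.
sys.path.insert(0, s_root)
demofeat = importlib.import_module("demofeat")

# One import, one namespace for both functions.
s_feat = demofeat.feat_func()
s_util = demofeat.util_func()
```

Because both functions are defined in __init__.py, a single import of the package exposes them, and callers never need to know which submodule a function would otherwise have lived in.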