Build Custom Algorithms Using the Base Algorithm Class
Algorithms automatically provide the following member variables to any custom algorithm that derives the analysis_engine.algo.BaseAlgo.process method. By overriding the process() member method in an inherited class, you can quickly build algorithms that determine buy and sell conditions from any of the datasets automatically extracted from the redis pipeline (a minimal derived-class sketch follows the list below):
self.df_daily
self.df_minute
self.df_calls
self.df_puts
self.df_quote
self.df_pricing
self.df_stats
self.df_peers
self.df_iex_news
self.df_financials
self.df_earnings
self.df_dividends
self.df_company
self.df_yahoo_news
self.df_tdcalls
self.df_tdputs
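Here is that minimal sketch. It assumes the daily dataset was extracted into self.df_daily, and uses a simple moving-average crossover purely for illustration; the member variables and the create_buy_order/create_sell_order helpers are documented on this page, while the crossover windows are made up:

import analysis_engine.algo as base_algo

class MinimalCrossoverAlgo(base_algo.BaseAlgo):
    """hypothetical sketch - trade off a moving-average crossover"""

    def process(self, algo_id, ticker, dataset):
        # self.df_daily is loaded automatically before process() runs
        if self.df_daily.empty or len(self.df_daily.index) < 20:
            return
        closes = self.df_daily['close']
        fast_ma = closes.rolling(5).mean().iloc[-1]
        slow_ma = closes.rolling(20).mean().iloc[-1]
        if fast_ma > slow_ma and self.num_owned == 0:
            self.create_buy_order(
                ticker=ticker,
                row=self.df_daily.iloc[-1],
                reason='fast ma crossed above slow ma')
        elif fast_ma < slow_ma and self.num_owned > 0:
            self.create_sell_order(
                ticker=ticker,
                row=self.df_daily.iloc[-1],
                reason='fast ma crossed below slow ma')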
Recent Pricing Information
self.latest_close
self.latest_high
self.latest_open
self.latest_low
self.latest_volume
self.today_close
self.today_high
self.today_open
self.today_low
self.today_volume
self.ask
self.bid
Latest Backtest Date and Intraday Minute
self.latest_min
self.backtest_date
Note
self.latest_min - Latest minute row in self.df_minute
Note
self.backtest_date - Latest dataset date, which is considered the backtest date for historical testing with the data pipeline structure (it is the date key at the root level of the dataset node)
Trading Strategy
self.trade_strategy = 'count'
- if the number of indicators saying buy or sell exceeds the buy/sell rules min_indicators value, the algorithm will trigger a buy or sell (see the config sketch after this list)
self.buy_reason
- derived algorithms can attach custom buy reasons as a string to each trade order
self.sell_reason
- derived algorithms can attach custom sell reasons as a string to each trade order
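For reference, here is a hedged sketch of how these pieces fit together in an algorithm config; the key names come from this page, and the numeric values are illustrative:

# a sketch - 'count' triggers a trade once enough indicators agree;
# the min_indicators values here are illustrative
config_dict = {
    'trade_strategy': 'count',
    'buy_rules': {
        'min_indicators': 3   # need 3+ indicators saying buy
    },
    'sell_rules': {
        'min_indicators': 3   # need 3+ indicators saying sell
    }
}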
Timeseries
self.timeseries
- use an algorithm config to set day or minute to process daily or intraday minute-by-minute datasets. Indicators will still have access to all datasets; this just makes it easier to use the helper within an indicator to quickly get the correct dataset:

df_status, use_df = self.get_subscribed_dataset(
    dataset=dataset)
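For example, switching the same algorithm between daily and intraday processing is a one-value change; this sketch uses only constructor arguments shown in the class signature below:

import analysis_engine.algo as base_algo

# a sketch - run the algorithm minute-by-minute instead of daily
algo = base_algo.BaseAlgo(
    ticker='SPY',
    balance=1000.00,
    timeseries='minute',  # or 'day' for daily datasets
    name='intraday-demo')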
Balance Information
self.balance
- current algorithm account balance
self.prev_bal
- previous balance
self.net_value
- total value the algorithm has left remaining since starting trading; this includes the value of the self.num_owned shares at the self.latest_close price
self.net_gain
- amount the algorithm has made since starting trading, including owned shares at the self.latest_close price
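The bookkeeping described above reduces to simple arithmetic; this is a sketch of the math, not the engine's exact code:

# sketch of the net_value / net_gain math described above
starting_balance = 1000.00
balance = 700.00        # self.balance - cash remaining
num_owned = 10          # self.num_owned - shares held
latest_close = 272.02   # self.latest_close

net_value = balance + (num_owned * latest_close)  # 3420.20
net_gain = net_value - starting_balance           # 2420.20
print(f'net_value={net_value:.2f} net_gain={net_gain:.2f}')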
Note
If a key is not in the dataset, the algorithm's member variable will be an empty pandas DataFrame created with pandas.DataFrame([]), except self.pricing, which is just a dictionary. Please ensure the engine successfully fetched and cached the dataset in redis using a tool like redis-cli with a query of keys * (or keys <TICKER>_* on large deployments).
Indicator Information
self.buy_rules
- optional - custom dictionary for passing buy-side business rules to a custom algorithm
self.sell_rules
- optional - custom dictionary for passing sell-side business rules to a custom algorithm
self.min_buy_indicators
- if self.buy_rules has a value for buying, the minimum number of indicators that must detect a value within a buy condition before triggering a buy
self.min_sell_indicators
- if self.sell_rules has a value for selling, the minimum number of indicators that must detect a value within a sell condition before triggering a sell
self.latest_ind_report
- latest dictionary of values from the IndicatorProcessor.process()
self.latest_buys
- latest indicators saying buy
self.latest_sells
- latest indicators saying sell
self.num_latest_buys
- latest number of indicators saying buy
self.num_latest_sells
- latest number of indicators saying sell
self.iproc
- member variable for the IndicatorProcessor that holds all of the custom algorithm indicators
Indicator buy and sell records in self.latest_buys and self.latest_sells have the dictionary structure:

{
    'name': indicator_name,
    'id': indicator_id,
    'report': indicator_report_dict,
    'cell': indicator_cell_number
}
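As a hedged sketch, a derived process() could inspect these records before trading on them; this assumes the indicator processor has already populated self.latest_buys for the current dataset node:

import analysis_engine.algo as base_algo

class VoteLoggingAlgo(base_algo.BaseAlgo):
    """sketch - log indicator buy votes before acting on them"""

    def process(self, algo_id, ticker, dataset):
        # each record uses the dictionary structure shown above
        for rec in self.latest_buys:
            print(
                f'{ticker} buy vote from {rec["name"]} '
                f'id={rec["id"]} cell={rec["cell"]}')
        if self.num_latest_buys >= self.min_buy_indicators:
            print(f'{ticker} has enough buy votes to trade')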
Supported environment variables
# To show debug and trace logging, export SHARED_LOG_CFG
# pointing at a debug logger json file. To turn on debugging for
# this library, you can export this variable to the repo's
# included file with the command:
export SHARED_LOG_CFG=/opt/sa/analysis_engine/log/debug-logging.json
class analysis_engine.algo.BaseAlgo(ticker=None, balance=5000.0, commission=6.0, tickers=None, name=None, use_key=None, auto_fill=True, version=1, config_file=None, config_dict=None, output_dir=None, publish_to_slack=False, publish_to_s3=False, publish_to_redis=False, publish_input=True, publish_history=True, publish_report=True, load_from_s3_bucket=None, load_from_s3_key=None, load_from_redis_key=None, load_from_file=None, load_compress=False, load_publish=True, load_config=None, report_redis_key=None, report_s3_bucket=None, report_s3_key=None, report_file=None, report_compress=False, report_publish=True, report_config=None, history_redis_key=None, history_s3_bucket=None, history_s3_key=None, history_file=None, history_compress=False, history_publish=True, history_config=None, extract_redis_key=None, extract_s3_bucket=None, extract_s3_key=None, extract_file=None, extract_save_dir=None, extract_compress=False, extract_publish=True, extract_config=None, dataset_type=20000, serialize_datasets=['daily', 'minute', 'quote', 'stats', 'peers', 'news1', 'financials', 'earnings', 'dividends', 'company', 'news', 'calls', 'puts', 'pricing', 'tdcalls', 'tdputs'], timeseries=None, trade_strategy=None, verbose=False, verbose_processor=False, verbose_indicators=False, verbose_trading=False, verbose_load=False, verbose_extract=False, verbose_history=False, verbose_report=False, inspect_datasets=False, raise_on_err=True, **kwargs)
Run an algorithm against multiple tickers at once through the redis dataframe pipeline provided by analysis_engine.extract.extract.
Data Pipeline Structure
This algorithm can handle an extracted dictionary with structure:
import pandas as pd
from analysis_engine.algo import BaseAlgo

ticker = 'SPY'
demo_algo = BaseAlgo(
    ticker=ticker,
    balance=1000.00,
    commission=6.00,
    name=f'test-{ticker}')
date = '2018-11-05'
dataset_id = f'{ticker}_{date}'

# mock the data pipeline in redis:
data = {
    ticker: [
        {
            'id': dataset_id,
            'date': date,
            'data': {
                'daily': pd.DataFrame([
                    {
                        'high': 280.01,
                        'low': 270.01,
                        'open': 275.01,
                        'close': 272.02,
                        'volume': 123,
                        'date': '2018-11-01 15:59:59'
                    },
                    {
                        'high': 281.01,
                        'low': 271.01,
                        'open': 276.01,
                        'close': 273.02,
                        'volume': 124,
                        'date': '2018-11-02 15:59:59'
                    },
                    {
                        'high': 282.01,
                        'low': 272.01,
                        'open': 277.01,
                        'close': 274.02,
                        'volume': 121,
                        'date': '2018-11-05 15:59:59'
                    }
                ]),
                'calls': pd.DataFrame([]),
                'puts': pd.DataFrame([]),
                'minute': pd.DataFrame([]),
                'pricing': pd.DataFrame([]),
                'quote': pd.DataFrame([]),
                'news': pd.DataFrame([]),
                'news1': pd.DataFrame([]),
                'dividends': pd.DataFrame([]),
                'earnings': pd.DataFrame([]),
                'financials': pd.DataFrame([]),
                'stats': pd.DataFrame([]),
                'peers': pd.DataFrame([]),
                'company': pd.DataFrame([])
            }
        }
    ]
}

# run the algorithm
demo_algo.handle_data(data=data)

# get the algorithm results
results = demo_algo.get_result()
print(results)
build_progress_label(progress, total)
create a progress label string for the logs
Parameters: - progress – progress counter
- total – total number of counts
build_ticker_history(ticker, ignore_keys)
For all records in self.order_history, compile a filtered list of history records per ticker while pruning any keys that are in the list of ignore_keys
Parameters: - ticker – string ticker symbol
- ignore_keys – list of keys to not include in the history report
create_algorithm_ready_dataset()
Create the Algorithm-Ready dataset during the self.publish_input_dataset() member method. Inherited algorithm classes can derive how they build a custom Algorithm-Ready dataset before publishing by implementing this method in the derived class.
create_buy_order(ticker, row, minute=None, shares=None, reason=None, orient='records', date_format='iso', is_live_trading=False)
create a buy order at the close or ask price
Note
setting the minute is required to build a minute-by-minute Trading History
Parameters: - ticker – string ticker
- shares – optional - integer number of shares to buy; if None, buy the max number of shares at the close with the available balance amount
- row – dictionary or pandas.DataFrame row record that will be converted to a json-serialized string
- minute – optional - string datetime for the minute the order was placed. For day timeseries this is the close of trading (16:00:00 for the day) and for minute timeseries the value will be the latest minute from the self.df_minute pandas.DataFrame. Normally this value should be set to self.use_minute, and the format is ae_consts.COMMON_TICK_DATE_FORMAT
- reason – optional - reason for creating the order, which is useful for troubleshooting order histories
- orient – optional - pandas orient for row.to_json()
- date_format – optional - pandas date_format parameter for row.to_json()
- is_live_trading – optional - bool for filling trades for live trading or for backtest tuning (default False, which is backtest mode)
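A hedged usage sketch from inside a derived process() on a minute timeseries; the share cap and reason string are illustrative:

import analysis_engine.algo as base_algo

class BuyThresholdAlgo(base_algo.BaseAlgo):
    """sketch - buy a capped number of shares off indicator votes"""

    def process(self, algo_id, ticker, dataset):
        if self.num_latest_buys >= self.min_buy_indicators:
            self.create_buy_order(
                ticker=ticker,
                row=self.df_minute.iloc[-1],
                minute=self.use_minute,
                shares=10,  # illustrative cap
                reason='indicator count reached the buy threshold')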
create_history_dataset()
Create the Trading History dataset during the self.publish_trade_history_dataset() member method. Inherited algorithm classes can derive how they build a custom Trading History dataset before publishing by implementing this method in the derived class.
create_report_dataset()
Create the Trading Performance Report dataset during the self.publish_report_dataset() member method. Inherited algorithm classes can derive how they build a custom Trading Performance Report dataset before publishing by implementing this method in the derived class.
create_sell_order(ticker, row, minute=None, shares=None, reason=None, orient='records', date_format='iso', is_live_trading=False)
create a sell order at the close or ask price
Note
setting the minute is required to build a minute-by-minute Trading History
Parameters: - ticker – string ticker
- shares – optional - integer number of shares to sell; if None, sell all owned shares at the close
- row – pandas.DataFrame row record that will be converted to a json-serialized string
- minute – optional - string datetime for the minute the order was placed. For day timeseries this is the close of trading (16:00:00 for the day) and for minute timeseries the value will be the latest minute from the self.df_minute pandas.DataFrame. Normally this value should be set to self.use_minute, and the format is ae_consts.COMMON_TICK_DATE_FORMAT
- reason – optional - reason for creating the order, which is useful for troubleshooting order histories
- orient – optional - pandas orient for row.to_json()
- date_format – optional - pandas date_format parameter for row.to_json()
- is_live_trading – optional - bool for filling trades for live trading or for backtest tuning (default False, which is backtest mode)
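And the matching hedged sell sketch; per the parameter notes above, shares=None sells all owned shares:

import analysis_engine.algo as base_algo

class SellThresholdAlgo(base_algo.BaseAlgo):
    """sketch - exit the full position off indicator votes"""

    def process(self, algo_id, ticker, dataset):
        if (self.num_latest_sells >= self.min_sell_indicators and
                self.num_owned > 0):
            self.create_sell_order(
                ticker=ticker,
                row=self.df_minute.iloc[-1],
                minute=self.use_minute,
                reason='indicator count reached the sell threshold')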
determine_indicator_datasets()
Indicators are coupled to a dataset in the algorithm config file. This allows for identifying the exact datasets to pull from redis to speed up backtesting.
get_indicator_process_last_indicator()
Used to pull the indicator object back up to any created analysis_engine.algo.BaseAlgo objects
Tip
this is for debugging data and code issues inside an indicator
get_indicator_processor(existing_processor=None)
singleton for getting the indicator processor
Parameters: existing_processor – allow derived algos to build their own indicator processor and pass it to the base
get_supported_tickers_in_data(data)
For all updates found in data, compare against the supported list of self.tickers to make sure the updates are relevant for this algorithm.
Parameters: data – new data stream to process in this algo
get_ticker_positions(ticker)
get the current positions for a ticker, returned as a tuple:
num_owned (integer), buys (list), sells (list)

num_owned, buys, sells = self.get_ticker_positions(
    ticker=ticker)
Parameters: ticker – ticker to lookup
get_trade_history_node()
Helper for quickly building a history node on a derived algorithm. Whatever member variables are in the base class analysis_engine.algo.BaseAlgo will be added automatically into the returned historical transaction dictionary
Tip
if you get a None back, it means there could be a bug in how you are using the member variables (likely an invalid math calculation), or it could be a bug in the helper: build_trade_history_entry
handle_daily_dataset(algo_id, ticker, node)
handle running the algorithm with daily values
This method will call BaseAlgo.process() once per day, which also utilizes the daily caching strategy
Parameters: - algo_id – string - algo identifier label for debugging datasets during specific dates
- ticker – string - ticker
- node – dataset to process
handle_data(data)
process new data for the algorithm using a multi-ticker mapping structure
Parameters: data – dictionary of extracted data from the redis pipeline with the structure:
import pandas as pd

ticker = 'SPY'
# string usually: YYYY-MM-DD
date = '2018-11-05'
# redis cache key for the dataset format: <ticker>_<date>
dataset_id = f'{ticker}_{date}'
dataset = {
    ticker: [
        {
            'id': dataset_id,
            'date': date,
            'data': {
                'daily': pd.DataFrame([]),
                'minute': pd.DataFrame([]),
                'quote': pd.DataFrame([]),
                'stats': pd.DataFrame([]),
                'peers': pd.DataFrame([]),
                'news1': pd.DataFrame([]),
                'financials': pd.DataFrame([]),
                'earnings': pd.DataFrame([]),
                'dividends': pd.DataFrame([]),
                'calls': pd.DataFrame([]),
                'puts': pd.DataFrame([]),
                'pricing': pd.DataFrame([]),
                'news': pd.DataFrame([])
            }
        }
    ]
}
handle_minute_dataset(algo_id, ticker, node, start_row=0)
handle running the algorithm with intraday minute values
This method will call BaseAlgo.process() for the minute rows in the dataset, starting at start_row
Parameters: - algo_id – string - algo identifier label for debugging datasets during specific dates
- ticker – string - ticker
- node – dataset to process
- start_row – start row, default is 0
inspect_dataset(algo_id, ticker, dataset)
Use this method inside an algorithm's process() method to view the available datasets in the redis cache
Parameters: - algo_id – string - algo identifier label for debugging datasets during specific dates
- ticker – string - ticker
- dataset – a dictionary of identifiers (for debugging) and multiple pandas.DataFrame objects
load_custom_datasets()
Handler for loading custom datasets for indicators
Tip
Custom datasets allow indicators to analyze more than the default pricing data provided by IEX Cloud and Tradier. This is helpful for building indicators to analyze and train AI from a previous algorithm's Trading History.
load_from_config(config_dict)
support for replaying algorithms from a trading history
Parameters: config_dict – algorithm configuration values, usually from a previous trading history or for quickly testing dataset theories in a development environment
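A hedged sketch of building an algorithm from a saved configuration dictionary; the key names are ones documented on this page, and the values are illustrative:

import analysis_engine.algo as base_algo

# sketch - rebuild an algorithm from a previous run's configuration
config_dict = {
    'timeseries': 'day',
    'trade_strategy': 'count',
    'buy_rules': {'min_indicators': 3},   # illustrative values
    'sell_rules': {'min_indicators': 3}
}
algo = base_algo.BaseAlgo(
    ticker='SPY',
    config_dict=config_dict)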
load_from_dataset(ds_data)
Load the member variables from the extracted ds_data dataset.
Algorithms automatically provide the following member variables to myalgo.process() for quickly building algorithms:
for quickly building algorithms:self.df_daily
self.df_minute
self.df_calls
self.df_puts
self.df_quote
self.df_pricing
self.df_stats
self.df_peers
self.df_iex_news
self.df_financials
self.df_earnings
self.df_dividends
self.df_company
self.df_yahoo_news
self.df_tdcalls
self.df_tdputs
Note
If a key is not in the dataset, the algorithm's member variable will be an empty pandas.DataFrame([]). Please ensure the engine cached the dataset in redis using a tool like redis-cli to verify the values are in memory.
Parameters: ds_data – extracted, structured dataset from redis
load_from_external_source(path_to_file=None, s3_bucket=None, s3_key=None, redis_key=None)
Load an algorithm-ready dataset for handle_data backtesting and trade performance analysis from:
- Local file
- S3
- Redis
Parameters: - path_to_file – optional - path to local file
- s3_bucket – optional - s3 bucket
- s3_key – optional - s3 key
- redis_key – optional - redis key
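A hedged usage sketch; the bucket and key names are made up:

import analysis_engine.algo as base_algo

# sketch - load a previously published algorithm-ready dataset
algo = base_algo.BaseAlgo(ticker='SPY')
algo.load_from_external_source(
    s3_bucket='algoready',      # made-up bucket name
    s3_key='SPY_2018-11-05')    # made-up key name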
plot_trading_history_with_balance(algo_id, ticker, node)
This will live-plot the trading history after each day is done
Parameters: - algo_id – string - algo identifier label for debugging datasets during specific dates
- ticker – string - ticker
- node – dataset to process
populate_intraday_events_dict(start_min, end_min)
For tracking intraday buy/sell/news events with indicators, use this method to build a dictionary where the keys are the minutes between start_min and end_min. If both are None, the minute range is taken from the self.df_minute dataset.
Parameters: - start_min – start datetime for building the self.intraday_events dictionary keys
- end_min – end datetime for building the self.intraday_events dictionary keys
prepare_for_new_indicator_run()
Call this for non-daily datasets, specifically if the algorithm is using a minute timeseries
process(algo_id, ticker, dataset)
Derive custom algorithm buy and sell conditions before placing orders. Just implement your own process method.
Parameters: - algo_id – string - algo identifier label for debugging datasets during specific dates
- ticker – string - ticker
- dataset – a dictionary of identifiers (for debugging) and multiple pandas.DataFrame objects. Dictionary where keys represent a label from one of the data sources (IEX Cloud or Tradier). Here is the supported dataset structure for the process method:

Note
There are no required keys for data; the list below is not hard-enforced by default. This is just a reference for what is available with the v1 engine.

dataset = {
    'id': <string TICKER_DATE - redis cache key>,
    'date': <string DATE>,
    'data': {
        'daily': pd.DataFrame([]),
        'minute': pd.DataFrame([]),
        'quote': pd.DataFrame([]),
        'stats': pd.DataFrame([]),
        'peers': pd.DataFrame([]),
        'news1': pd.DataFrame([]),
        'financials': pd.DataFrame([]),
        'earnings': pd.DataFrame([]),
        'dividends': pd.DataFrame([]),
        'calls': pd.DataFrame([]),
        'puts': pd.DataFrame([]),
        'pricing': pd.DataFrame([]),
        'news': pd.DataFrame([])
    }
}
example:
dataset = {
    'id': 'SPY_2018-11-02',
    'date': '2018-11-02',
    'data': {
        'daily': pd.DataFrame,
        'minute': pd.DataFrame,
        'calls': pd.DataFrame,
        'puts': pd.DataFrame,
        'news': pd.DataFrame
    }
}
publish_input_dataset(**kwargs)
publish input datasets to caches (redis), archives (minio s3), a local file (output_file) and slack
Parameters: kwargs – keyword argument dictionary
Returns: tuple: status, output_file
publish_report_dataset(**kwargs)
publish trading performance report datasets to caches (redis), archives (minio s3), a local file (output_file) and slack
Parameters: kwargs – keyword argument dictionary
Returns: tuple: status, output_file
publish_trade_history_dataset(**kwargs)
publish trade history datasets to caches (redis), archives (minio s3), a local file (output_file) and slack
Parameters: kwargs – keyword argument dictionary
Returns: tuple: status, output_file
record_trade_history_for_dataset(node)
Build a daily or minute-by-minute trading history
To run an algorithm minute-by-minute, set the configuration to use:
'timeseries': 'minute'
Parameters: node – cached dataset dictionary node
reset_for_next_run()
work in progress - clean up all internal member variables for another run
Note
random or probabilistic predictions may not create the same trading history
sell_reason = None
if this is in a Jupyter notebook, this will show the plots at the end of each day... please avoid this on the command line, as the plot's window will block the algorithm until the window is closed
trade_off_indicator_buy_and_sell_signals(ticker, algo_id, reason_for_buy=None, reason_for_sell=None)
Check if the minimum number of indicators for a buy or a sell were found. If they were, then commit the trade.
if self.trade_off_num_indicators:
    if self.num_latest_buys >= self.min_buy_indicators:
        self.should_buy = True
    elif self.num_latest_sells >= self.min_sell_indicators:
        self.should_sell = True
Parameters: - ticker – ticker symbol
- algo_id – string algo for tracking internal progress for debugging
- reason_for_buy – optional - string for tracking why the algo bought
- reason_for_sell – optional - string for tracking why the algo sold
view_date_dataset_records(algo_id, ticker, node)
View the dataset contents for a single node - use it with the algo config_dict by setting:
"run_this_date": <string date YYYY-MM-DD>
Parameters: - algo_id – string - algo identifier label for debugging datasets during specific dates
- ticker – string - ticker
- node – dataset to process