Extract - Stock Datasets
Extract provides a data pipeline for analyzing stock data straight from the Redis cache.
Extraction API Examples
Extract All Data for a Ticker
import analysis_engine.extract as ae_extract
print(ae_extract.extract('SPY'))
Extract Latest Minute Pricing for Stocks and Options
import analysis_engine.extract as ae_extract
print(ae_extract.extract(
'SPY',
datasets=['minute', 'tdcalls', 'tdputs']))
Extract Historical Data
Extract historical data with the date argument formatted YYYY-MM-DD:
import analysis_engine.extract as ae_extract
print(ae_extract.extract(
'AAPL',
datasets=['minute', 'daily', 'financials', 'earnings', 'dividends'],
date='2019-02-15'))
Additional Extraction APIs
IEX Cloud Extraction API Reference
Tradier Extraction API Reference
analysis_engine.extract.extract(ticker=None, datasets=None, tickers=None, use_key=None, extract_mode='all', iex_datasets=None, date=None, redis_enabled=True, redis_address=None, redis_db=None, redis_password=None, redis_expire=None, s3_enabled=True, s3_address=None, s3_bucket=None, s3_access_key=None, s3_secret_key=None, s3_region_name=None, s3_secure=False, celery_disabled=True, broker_url=None, result_backend=None, label=None, verbose=False)
Extract all cached datasets for a stock ticker or a list of tickers and return a dictionary. Please make sure the datasets are already cached in Redis before running this method. If not, please refer to the analysis_engine.fetch.fetch function to prepare the datasets in your environment.
Python example:
from analysis_engine.extract import extract
d = extract(ticker='NFLX')
print(d)
for k in d['NFLX']:
    print(f'dataset key: {k}')
Extract Intraday Stock and Options Minute Pricing Data
This works by using the date and datasets arguments as filters:
import analysis_engine.extract as ae_extract
print(ae_extract.extract(
ticker='SPY',
datasets=['minute', 'tdcalls', 'tdputs']))
This was created to reduce the amount of typing in Jupyter notebooks. With the optional arguments, it can also be set up for use with a distributed engine, depending on your connectivity requirements.
Note
Please ensure Redis and Minio are running before trying to extract tickers.
Stock tickers to extract
Parameters:
- ticker – single stock ticker/symbol/ETF to extract
- tickers – optional - list of tickers to extract
- use_key – optional - extract historical key from Redis, usually formatted <TICKER>_<date formatted YYYY-MM-DD>
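The key format described for use_key can be sketched in a few lines. This is only an illustration: build_history_key is a hypothetical helper, not part of analysis_engine, and upper-casing the ticker is an assumption based on the <TICKER> placeholder.

```python
from datetime import date


def build_history_key(ticker: str, day: date) -> str:
    """Hypothetical helper: format a key as <TICKER>_<YYYY-MM-DD>."""
    # Upper-casing is an assumption; the docs only show the <TICKER> placeholder.
    return f"{ticker.upper()}_{day.strftime('%Y-%m-%d')}"


key = build_history_key("SPY", date(2019, 2, 15))
print(key)
```

The resulting string could then be passed as the use_key argument, assuming the key was cached under that name.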
(Optional) Data sources, datafeeds and datasets to gather
Parameters:
- iex_datasets – list of strings for gathering specific IEX datasets, which are set as consts: analysis_engine.iex.consts.FETCH_*
- date – optional - string date formatted YYYY-MM-DD - if not set, use the last close date
- datasets – list of strings for indicator dataset extraction - preferred method (defaults to BACKUP_DATASETS)
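Because the date argument is a plain string, it may help to check its format before calling extract. The sketch below uses only the standard library; is_valid_extract_date is a hypothetical helper, and whether extract performs its own validation is not stated here.

```python
from datetime import datetime


def is_valid_extract_date(value: str) -> bool:
    """Hypothetical check that a string is a real date written as YYYY-MM-DD."""
    try:
        parsed = datetime.strptime(value, "%Y-%m-%d")
    except ValueError:
        # Wrong layout or an impossible calendar date
        return False
    # strptime tolerates missing zero padding, so compare the round trip
    return parsed.strftime("%Y-%m-%d") == value


for candidate in ["2019-02-15", "2019-2-15", "2019-02-30"]:
    print(candidate, is_valid_extract_date(candidate))
```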
(Optional) Redis connectivity arguments
Parameters:
- redis_enabled – bool - toggle for auto-caching all datasets in Redis (default is True)
- redis_address – Redis connection string format: host:port (default is localhost:6379)
- redis_db – Redis db to use (default is 0)
- redis_password – optional - Redis password (default is None)
- redis_expire – optional - Redis expire value (default is None)
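The host:port format used by redis_address can be split into its parts as sketched below. split_address is a hypothetical helper written for illustration; how the library itself parses the string is not shown in this reference.

```python
from typing import Tuple


def split_address(address: str, default_port: int = 6379) -> Tuple[str, int]:
    """Hypothetical helper: split 'host:port' into (host, port)."""
    host, sep, port = address.rpartition(":")
    if not sep:
        # No colon found: treat the whole string as the host
        return address, default_port
    return host, int(port)


print(split_address("localhost:6379"))
```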
(Optional) Minio (S3) connectivity arguments
Parameters:
- s3_enabled – bool - toggle for auto-archiving on Minio (S3) (default is True)
- s3_address – Minio S3 connection string format: host:port (default is localhost:9000)
- s3_bucket – S3 Bucket for storing the artifacts (default is dev) which should be viewable on a browser: http://localhost:9000/minio/dev/
- s3_access_key – S3 Access key (default is trexaccesskey)
- s3_secret_key – S3 Secret key (default is trex123321)
- s3_region_name – S3 region name (default is us-east-1)
- s3_secure – Transmit using TLS encryption (default is False)
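The browser URL mentioned for the default bucket follows from s3_address and s3_bucket. A minimal sketch, assuming the same /minio/<bucket>/ path applies to other buckets and that s3_secure=True implies https (both are assumptions); minio_browser_url is a hypothetical helper.

```python
def minio_browser_url(
    s3_address: str = "localhost:9000",
    s3_bucket: str = "dev",
    s3_secure: bool = False,
) -> str:
    """Hypothetical helper: build the Minio browser URL for a bucket."""
    # Assumption: TLS-enabled deployments are reached over https
    scheme = "https" if s3_secure else "http"
    return f"{scheme}://{s3_address}/minio/{s3_bucket}/"


print(minio_browser_url())
```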
(Optional) Celery worker broker connectivity arguments
Parameters:
- celery_disabled – bool - toggle synchronous mode or publish to an engine connected to the Celery broker and backend (default is True - synchronous mode without an engine or need for a broker or backend for Celery)
- broker_url – Celery broker url (default is redis://0.0.0.0:6379/13)
- result_backend – Celery backend url (default is redis://0.0.0.0:6379/14)
- label – tracking log label
(Optional) Debugging
Parameters:
- verbose – bool - show extract warnings and other debug logging (default is False)
Supported environment variables
export REDIS_ADDRESS="localhost:6379"
export REDIS_DB="0"
export S3_ADDRESS="localhost:9000"
export S3_BUCKET="dev"
export AWS_ACCESS_KEY_ID="trexaccesskey"
export AWS_SECRET_ACCESS_KEY="trex123321"
export AWS_DEFAULT_REGION="us-east-1"
export S3_SECURE="0"
export WORKER_BROKER_URL="redis://0.0.0.0:6379/13"
export WORKER_BACKEND_URL="redis://0.0.0.0:6379/14"
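The environment variables above can also be read from Python and turned into keyword arguments. The sketch below is an assumption-heavy illustration: load_extract_kwargs is a hypothetical helper, the mapping from each variable to an extract argument is inferred from the matching defaults, and treating S3_SECURE="1" as True is a guess.

```python
import os

# Defaults copied from the export lines above
ENV_DEFAULTS = {
    "REDIS_ADDRESS": "localhost:6379",
    "REDIS_DB": "0",
    "S3_ADDRESS": "localhost:9000",
    "S3_BUCKET": "dev",
    "AWS_ACCESS_KEY_ID": "trexaccesskey",
    "AWS_SECRET_ACCESS_KEY": "trex123321",
    "AWS_DEFAULT_REGION": "us-east-1",
    "S3_SECURE": "0",
    "WORKER_BROKER_URL": "redis://0.0.0.0:6379/13",
    "WORKER_BACKEND_URL": "redis://0.0.0.0:6379/14",
}


def load_extract_kwargs(environ=None):
    """Hypothetical helper: map environment variables to extract() arguments."""
    env = os.environ if environ is None else environ
    raw = {name: env.get(name, default) for name, default in ENV_DEFAULTS.items()}
    # The variable-to-argument mapping below is an assumption
    return {
        "redis_address": raw["REDIS_ADDRESS"],
        "redis_db": int(raw["REDIS_DB"]),
        "s3_address": raw["S3_ADDRESS"],
        "s3_bucket": raw["S3_BUCKET"],
        "s3_access_key": raw["AWS_ACCESS_KEY_ID"],
        "s3_secret_key": raw["AWS_SECRET_ACCESS_KEY"],
        "s3_region_name": raw["AWS_DEFAULT_REGION"],
        "s3_secure": raw["S3_SECURE"] == "1",
        "broker_url": raw["WORKER_BROKER_URL"],
        "result_backend": raw["WORKER_BACKEND_URL"],
    }


# An empty mapping shows the documented defaults; call with no argument to read os.environ
defaults = load_extract_kwargs({})
print(defaults["redis_address"], defaults["redis_db"])
```

The returned dictionary could be unpacked into a call such as extract(ticker='SPY', **kwargs), assuming the mapping holds for your installation.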