Fetch - Stock Datasets
Fetch populates Redis caches as a stock data pipeline. Cached data can be pulled back out at any time using: analysis_engine.extract.extract
Dataset Fetch API

analysis_engine.fetch.fetch(ticker=None, tickers=None, fetch_mode=None, iex_datasets=None, redis_enabled=True, redis_address=None, redis_db=None, redis_password=None, redis_expire=None, s3_enabled=True, s3_address=None, s3_bucket=None, s3_access_key=None, s3_secret_key=None, s3_region_name=None, s3_secure=False, celery_disabled=True, broker_url=None, result_backend=None, label=None, verbose=False)

Fetch all supported datasets for a stock ticker or a list of tickers and return a dictionary. Once run, the datasets will all be cached in Redis and archived in Minio (S3) by default.

Python example:
from analysis_engine.fetch import fetch

d = fetch(ticker='NFLX')
print(d)
for k in d['NFLX']:
    print(f'dataset key: {k}')
By default, it synchronously automates:
- fetching all datasets
- caching all datasets in Redis
- archiving all datasets in Minio (S3)
- returning all datasets in a single dictionary
This was created to reduce the amount of typing in Jupyter notebooks. It can also be set up for use with a distributed engine via the optional arguments, depending on your connectivity requirements.
Note
Please ensure Redis and Minio are running before trying to fetch tickers.
Stock tickers to fetch

Parameters:
- ticker – single stock ticker/symbol/ETF to fetch
- tickers – optional - list of tickers to fetch
(Optional) Data sources, datafeeds and datasets to gather

Parameters:
- fetch_mode – data sources - default is all (both IEX and Yahoo), iex for only IEX, yahoo for only Yahoo
- iex_datasets – list of strings for gathering specific IEX datasets, which are set as consts: analysis_engine.iex.consts.FETCH_*
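To show how these two parameters combine, here is a minimal sketch of keyword arguments for an IEX-only fetch. The dataset names in the list are illustrative assumptions; the real names come from the FETCH_* consts in analysis_engine.iex.consts.

```python
# Sketch only: keyword arguments restricting fetch to IEX data sources.
iex_fetch_kwargs = {
    'ticker': 'NFLX',
    'fetch_mode': 'iex',     # 'all' (default), 'iex', or 'yahoo'
    'iex_datasets': [
        'daily',             # hypothetical dataset name for illustration
        'quote',             # hypothetical dataset name for illustration
    ],
}
# The call would then be:
# d = fetch(**iex_fetch_kwargs)
```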
(Optional) Redis connectivity arguments

Parameters:
- redis_enabled – bool - toggle for auto-caching all datasets in Redis (default is True)
- redis_address – Redis connection string format: host:port (default is localhost:6379)
- redis_db – Redis db to use (default is 0)
- redis_password – optional - Redis password (default is None)
- redis_expire – optional - Redis expire value (default is None)
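As a sketch, here are the Redis connectivity arguments spelled out with their documented defaults, assuming the host:port string format described above:

```python
# Sketch, assuming the documented defaults: redis_address uses the
# host:port format and db 0 holds the cached datasets.
redis_kwargs = {
    'redis_enabled': True,              # auto-cache all datasets in Redis
    'redis_address': 'localhost:6379',  # host:port connection string
    'redis_db': 0,
    'redis_password': None,
    'redis_expire': None,               # None leaves cached keys without a TTL
}
# The host:port string splits cleanly into its two components:
redis_host, redis_port = redis_kwargs['redis_address'].split(':')
# d = fetch(ticker='NFLX', **redis_kwargs)
```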
(Optional) Minio (S3) connectivity arguments

Parameters:
- s3_enabled – bool - toggle for auto-archiving on Minio (S3) (default is True)
- s3_address – Minio S3 connection string format: host:port (default is localhost:9000)
- s3_bucket – S3 bucket for storing the artifacts (default is dev), which should be viewable in a browser: http://localhost:9000/minio/dev/
- s3_access_key – S3 access key (default is trexaccesskey)
- s3_secret_key – S3 secret key (default is trex123321)
- s3_region_name – S3 region name (default is us-east-1)
- s3_secure – transmit using TLS encryption (default is False)
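The S3 arguments above map onto a local Minio instance. Here is a sketch using the documented defaults, including how the bucket's browser URL is derived from the address and bucket name:

```python
# Sketch, assuming the documented local Minio defaults.
s3_kwargs = {
    's3_enabled': True,                # auto-archive all datasets in Minio (S3)
    's3_address': 'localhost:9000',    # host:port connection string
    's3_bucket': 'dev',
    's3_access_key': 'trexaccesskey',
    's3_secret_key': 'trex123321',
    's3_region_name': 'us-east-1',
    's3_secure': False,                # plain http for a local Minio
}
# The bucket should then be browsable at this URL:
browse_url = (
    f"http://{s3_kwargs['s3_address']}/minio/{s3_kwargs['s3_bucket']}/"
)
# d = fetch(ticker='NFLX', **s3_kwargs)
```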
(Optional) Celery worker broker connectivity arguments

Parameters:
- celery_disabled – bool - toggle synchronous mode or publish to an engine connected to the Celery broker and backend (default is True - synchronous mode without an engine or need for a broker or backend for Celery)
- broker_url – Celery broker url (default is redis://0.0.0.0:6379/13)
- result_backend – Celery backend url (default is redis://0.0.0.0:6379/14)
- label – tracking log label
(Optional) Debugging

Parameters:
- verbose – bool - show fetch warnings and other debug logging (default is False)

Supported environment variables
export REDIS_ADDRESS="localhost:6379"
export REDIS_DB="0"
export S3_ADDRESS="localhost:9000"
export S3_BUCKET="dev"
export AWS_ACCESS_KEY_ID="trexaccesskey"
export AWS_SECRET_ACCESS_KEY="trex123321"
export AWS_DEFAULT_REGION="us-east-1"
export S3_SECURE="0"
export WORKER_BROKER_URL="redis://0.0.0.0:6379/13"
export WORKER_BACKEND_URL="redis://0.0.0.0:6379/14"
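A sketch of how code could read these environment variables, falling back to the same defaults the fetch API documents when a variable is unset (the fallback values here simply mirror the defaults listed above):

```python
import os

# Sketch: read the documented environment variables, using the
# documented defaults when a variable is not set.
redis_address = os.getenv('REDIS_ADDRESS', 'localhost:6379')
redis_db = int(os.getenv('REDIS_DB', '0'))
s3_address = os.getenv('S3_ADDRESS', 'localhost:9000')
s3_secure = os.getenv('S3_SECURE', '0') == '1'   # "0" means plain http
broker_url = os.getenv('WORKER_BROKER_URL', 'redis://0.0.0.0:6379/13')
backend_url = os.getenv('WORKER_BACKEND_URL', 'redis://0.0.0.0:6379/14')
```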