
forecasting_demos

Multi-model training, tuning, and serving are common tasks in machine learning. They require training and tuning multiple models on the same or different data segments. The data segments typically correspond to different locations, products, or groups of locations or products. Using distributed compute to train hundreds or thousands of models takes less time than running them one after another in a single Python process, because the data and the model training, tuning, and inference can be split into batches and run in parallel!

These notebooks demonstrate how to use Ray v2 for quick and easy distributed forecasting - a special case of multi-model training, tuning, inferencing, and prediction. You will learn how to convert existing code so it can run in parallel on multiple compute nodes. The compute can be cores on your laptop or clusters in the cloud.
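As a rough illustration of the "one model per data segment, run in parallel" idea, here is a minimal sketch. It is not the code from the notebooks: the data, the `train_segment` function, and the mean-based "model" are hypothetical stand-ins, and the standard library's `ThreadPoolExecutor` is used only so the sketch runs without a cluster. With Ray, the same pattern would decorate the function with `@ray.remote`, call `train_segment.remote(...)` once per segment, and gather results with `ray.get(...)`.

```python
from concurrent.futures import ThreadPoolExecutor
from statistics import mean

# Hypothetical toy data: ride counts per pickup location (the "data segments").
rides_by_location = {
    "loc_1": [12, 15, 14, 16],
    "loc_2": [3, 4, 2, 5],
    "loc_3": [40, 42, 41, 45],
}


def train_segment(location, history):
    """Stand-in for fitting one model per segment (e.g. one Prophet model).

    The "model" here is just the historical mean, used as the next forecast.
    """
    return location, mean(history)


# Sequential version: one segment after another.
sequential = dict(
    train_segment(loc, hist) for loc, hist in rides_by_location.items()
)

# Parallel version: the same function, submitted once per segment.
with ThreadPoolExecutor(max_workers=3) as pool:
    futures = [
        pool.submit(train_segment, loc, hist)
        for loc, hist in rides_by_location.items()
    ]
    parallel = dict(f.result() for f in futures)

print(parallel)
```

The training function itself does not change between the two versions; only the way it is invoked does. That is the property that makes an existing per-segment loop straightforward to distribute.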

Ray can be used with any AI/ML Python library! But, in these notebooks, we will demo:

Prophet

Data

These notebooks use the public NYC Taxi rides dataset.

Raw data original source: https://www1.nyc.gov/site/tlc/about/tlc-trip-record-data.page
Raw data hosted publicly on AWS: s3://anonymous@air-example-data/ursa-labs-taxi-data/by_year/
8 months of cleaned data in this repo under folder data/

👩 Setup Instructions for Anyscale

We recommend running Ray on Anyscale to take full advantage of developing on a personal laptop, then quickly spinning up resources in a cloud to run your same laptop code on bigger compute resources.

To configure an Anyscale cluster configuration, use the latest Ray (v2.2 at the time of writing) on a Python 3.8 ML Docker image, for example `anyscale/ray-ml:2.2.0-py38-gpu`. Don't worry: if you don't need an expensive GPU, you can remove it per cluster just before you spin one up. The 'ml' Docker image comes with standard ML libraries already installed, e.g. pandas and matplotlib. Python 3.8 is important, since at the time of writing Prophet still has this dependency.
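If you want a notebook to confirm it is running on the expected interpreter before importing Prophet, a small check like the following could go in the first cell. This is only a sketch: `REQUIRED_PYTHON` and `python_matches` are hypothetical names, and the 3.8 requirement reflects the note above, so adjust it if Prophet's requirements have changed.

```python
import sys

# Assumption taken from the setup notes above; not read from Prophet itself.
REQUIRED_PYTHON = (3, 8)


def python_matches(required, actual=None):
    """Return True if the (major, minor) version equals `required`."""
    if actual is None:
        actual = sys.version_info[:2]
    return tuple(actual[:2]) == tuple(required)


if not python_matches(REQUIRED_PYTHON):
    print(
        f"Warning: running Python {sys.version_info.major}."
        f"{sys.version_info.minor}, but these notes assume "
        f"{REQUIRED_PYTHON[0]}.{REQUIRED_PYTHON[1]}."
    )
```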

The first time you configure your cluster:

In your browser, open `console.anyscale.com`.
Click on `Configurations` > `Create a new environment`.
Give the configuration a name, for example `myname-forecasting`.
Select a base Docker image, for example `anyscale/ray-ml:2.2.0-py38-gpu`.
Specify `Pip packages` in this order:
For PyTorch Forecasting specify `Conda packages` in this order:
Put your GitHub repo in the `Post build commands` section:
- If you have a project name:
  - git clone your-git-repo-url ../your-project-name/
- Otherwise if you do not have a project:
  - git clone your-git-repo-url
Click `Create`.

The first time you spin up a cluster:

In your browser, open `console.anyscale.com`.
Click on `Clusters` > `Create`.
Give the cluster a name.
Select a project that the cluster belongs to.
Select the cluster environment that you just created, for example `myname-forecasting`, and its latest version.
Leave the default radio button on `Compute config` = `Create a one-off configuration`.
Select a default cloud config from your organization, e.g. AWS, region=us-west-2, zones=any.
Node types: this is where you can delete the GPU if you are not going to use it, for example remove `g4dn.4xlarge`. You can also specify the min/max number of worker nodes, memory, and the AWS spot instances option here.
Click `Start`.
Wait until the cluster is ready, then click the `Jupyter` button.

Anyscale by default will automatically shut down your cluster for you after 2 hours of inactivity. That way you don't have to worry about accidentally leaving it running over a weekend.

From now on, whenever you want to spin up a cluster, it will be quicker:

In your browser, open `console.anyscale.com`.
Click on `Clusters` > `Created by me`.
Click on the cluster.
Click `Start`.
Wait until the cluster is ready, then click the `Jupyter` button.

🎓 To further speed up your development process (especially convenient if you are contributing to open-source Ray), use Anyscale Workspaces to develop and save your code directly in the cloud instead of on your laptop!

Let's have fun 😜 and thank you 🙏.
