
[2/n] [Serve] poll outbound deployments into deployment state#58350

Merged
abrarsheikh merged 17 commits into master from SERVE-1425-abrar-controller
Nov 14, 2025

Conversation

@abrarsheikh

@abrarsheikh abrarsheikh commented Nov 1, 2025

Contributor

Fetch outbound deployments from all replicas at initialization and store them in deployment state.

Next PR -> #58355
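
To make the description concrete, here is a minimal sketch of collecting outbound deployments from every replica into one piece of state. It is not the code in this PR: `FakeReplica`, `FakeDeploymentState`, and `record_outbound_deployments` are hypothetical names, deployments are plain strings rather than `DeploymentID` objects, and the replica call is local rather than a remote actor call.

```python
from typing import Iterable, List, Set


class FakeReplica:
    """Hypothetical stand-in for a running replica actor."""

    def __init__(self, outbound: Iterable[str]):
        self._outbound = list(outbound)

    def list_outbound_deployments(self) -> List[str]:
        # In Serve this would be a remote actor call; here it is a local call.
        return list(self._outbound)


class FakeDeploymentState:
    """Hypothetical, simplified holder for one deployment's replicas."""

    def __init__(self, replicas: Iterable[FakeReplica]):
        self._replicas = list(replicas)
        self._outbound: Set[str] = set()

    def record_outbound_deployments(self) -> None:
        # Union the answers so one replica's view cannot hide another's.
        for replica in self._replicas:
            self._outbound.update(replica.list_outbound_deployments())

    def get_outbound_deployments(self) -> List[str]:
        return sorted(self._outbound)


state = FakeDeploymentState(
    [FakeReplica(["Model"]), FakeReplica(["Model", "Reranker"])]
)
state.record_outbound_deployments()
print(state.get_outbound_deployments())
```

Taking the union across replicas is an assumption of this sketch; it simply avoids depending on which replica answers first.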

Signed-off-by: abrar <abrar@anyscale.com>
Signed-off-by: abrar <abrar@anyscale.com>
@abrarsheikh abrarsheikh changed the base branch from master to dag-of-deployments November 1, 2025 01:23
@abrarsheikh abrarsheikh added the `go` label (add ONLY when ready to merge, run all tests) Nov 1, 2025
abrarsheikh added a commit that referenced this pull request Nov 6, 2025
## Summary
Adds a new method to expose all downstream deployments that a replica
calls into, enabling dependency graph construction.

## Motivation
Deployments call downstream deployments via handles in two ways:
1. **Stored handles**: Passed to `__init__()` and stored as attributes →
`self.model.func.remote()`
2. **Dynamic handles**: Obtained at runtime via
`serve.get_deployment_handle()` → `model.func.remote()`

Previously, there was no way to programmatically discover these
dependencies from a running replica.

## Implementation

### Core Changes
- **`ReplicaActor.list_outbound_deployments()`**: Returns
  `List[DeploymentID]` of all downstream deployments
  - Recursively inspects user callable attributes to find stored handles
    (including nested in dicts/lists)
  - Tracks dynamic handles created via `get_deployment_handle()` at
    runtime using a callback mechanism

- **Runtime tracking**: Modified `get_deployment_handle()` to register
handles when called from within a replica via
`ReplicaContext._handle_registration_callback`
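
As an illustration of the stored-handle case, the recursive inspection can be sketched in plain Python. This is an assumption-laden sketch, not the Serve implementation: `FakeHandle`, `FakeDeploymentID`, `find_stored_handles`, and `Ingress` are hypothetical stand-ins, and the traversal rules (dict values, lists, tuples, sets, and instance `__dict__`) are chosen for the example.

```python
from dataclasses import dataclass
from typing import Any, Optional, Set


@dataclass(frozen=True)
class FakeDeploymentID:
    """Hypothetical stand-in for Serve's DeploymentID."""

    name: str
    app_name: str = "default"


class FakeHandle:
    """Hypothetical stand-in for a deployment handle."""

    def __init__(self, deployment_id: FakeDeploymentID):
        self.deployment_id = deployment_id


def find_stored_handles(
    obj: Any, _seen: Optional[Set[int]] = None
) -> Set[FakeDeploymentID]:
    """Collect IDs of handles reachable via attributes, dicts, and sequences."""
    seen = set() if _seen is None else _seen
    if id(obj) in seen:
        # Already visited; this also guards against reference cycles.
        return set()
    seen.add(id(obj))

    if isinstance(obj, FakeHandle):
        return {obj.deployment_id}

    if isinstance(obj, dict):
        children = list(obj.values())
    elif isinstance(obj, (list, tuple, set)):
        children = list(obj)
    elif hasattr(obj, "__dict__"):
        children = list(vars(obj).values())
    else:
        children = []

    found: Set[FakeDeploymentID] = set()
    for child in children:
        found |= find_stored_handles(child, seen)
    return found


class Ingress:
    """Hypothetical user callable holding one direct and one nested handle."""

    def __init__(self, model: FakeHandle, extras: list):
        self.model = model
        self.routes = {"extras": extras}
        self.threshold = 0.5


ingress = Ingress(
    FakeHandle(FakeDeploymentID("Model")),
    [FakeHandle(FakeDeploymentID("Reranker"))],
)
outbound = sorted(d.name for d in find_stored_handles(ingress))
print(outbound)
```

The `seen` set keeps the walk finite when user objects reference each other.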


Next PR: #58350

---------

Signed-off-by: abrar <abrar@anyscale.com>
Base automatically changed from dag-of-deployments to master November 6, 2025 21:33
@abrarsheikh abrarsheikh marked this pull request as ready for review November 6, 2025 22:03
@abrarsheikh abrarsheikh requested a review from a team as a code owner November 6, 2025 22:03
Comment thread python/ray/serve/_private/deployment_state.py Outdated
Signed-off-by: abrar <abrar@anyscale.com>
Comment thread python/ray/serve/_private/deployment_state.py Outdated
@ray-gardener ray-gardener Bot added the `serve` label (Ray Serve Related Issue) Nov 7, 2025
YoussefEssDS pushed a commit to YoussefEssDS/ray that referenced this pull request Nov 8, 2025
…58345)


@zcin zcin left a comment

Contributor


General question: since the set of deployment handles stored on a replica rarely changes (in 99% of cases it will be exactly the set it started with), is constant polling the best way to support this?

Comment thread python/ray/serve/_private/deployment_state.py Outdated
Comment thread python/ray/serve/_private/deployment_state.py Outdated
Comment thread python/ray/serve/_private/constants.py Outdated
@abrarsheikh

Contributor Author

> General question: since the set of deployment handles stored on a replica rarely changes (in 99% of cases it will be exactly the set it started with), is constant polling the best way to support this?

You are right, polling only helps in the case where the user calls `get_deployment_handle` inside the endpoint. An extreme case is when the user has a `get_deployment_handle` call behind a conditional block that executes in only 1% of requests. Polling ensures that we capture it eventually.

But I do agree this is rare. I am okay with keeping this simple to begin with. wdyt?
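
The rare-branch case can be sketched as follows. This is a toy model under stated assumptions, not Serve code: `FakeReplicaContext`, `get_handle`, and `FakeReplica` are hypothetical, and `handle_registration_callback` only loosely mirrors the callback mechanism named in the earlier commit message.

```python
from typing import Callable, List, Optional, Set


class FakeReplicaContext:
    """Hypothetical stand-in for the per-replica context object."""

    def __init__(self) -> None:
        self.handle_registration_callback: Optional[Callable[[str], None]] = None


def get_handle(ctx: FakeReplicaContext, name: str) -> str:
    """Stand-in for a dynamic handle lookup; reports the name to the callback."""
    if ctx.handle_registration_callback is not None:
        ctx.handle_registration_callback(name)
    return name


class FakeReplica:
    def __init__(self) -> None:
        self.ctx = FakeReplicaContext()
        self._dynamic: Set[str] = set()
        # Every dynamic lookup made through this context is recorded here.
        self.ctx.handle_registration_callback = self._dynamic.add

    def handle_request(self, rare_path: bool = False) -> None:
        get_handle(self.ctx, "Model")
        if rare_path:
            # Only reached on a small fraction of requests.
            get_handle(self.ctx, "Fallback")

    def list_outbound_deployments(self) -> List[str]:
        return sorted(self._dynamic)


replica = FakeReplica()
at_init = replica.list_outbound_deployments()
replica.handle_request()
after_common = replica.list_outbound_deployments()
replica.handle_request(rare_path=True)
after_rare = replica.list_outbound_deployments()
print(at_init, after_common, after_rare)
```

In this model a single query at initialization sees no dynamic handles at all, and `Fallback` only appears in a query made after the rare branch has run, which is the trade-off being weighed in this thread.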

@zcin

zcin commented Nov 13, 2025

Contributor

Yeah, I agree with keeping it simple and adding support incrementally.

@abrarsheikh

Contributor Author

> Yeah, I agree with keeping it simple and adding support incrementally.

Changing it.

Comment thread python/ray/serve/_private/deployment_state.py
Comment thread python/ray/serve/_private/deployment_state.py Outdated
Signed-off-by: abrar <abrar@anyscale.com>
Signed-off-by: abrar <abrar@anyscale.com>
Signed-off-by: abrar <abrar@anyscale.com>
Comment thread python/ray/serve/_private/deployment_state.py
Signed-off-by: abrar <abrar@anyscale.com>
@abrarsheikh abrarsheikh merged commit 9433631 into master Nov 14, 2025
6 checks passed
@abrarsheikh abrarsheikh deleted the SERVE-1425-abrar-controller branch November 14, 2025 18:33
landscapepainter pushed a commit to landscapepainter/ray that referenced this pull request Nov 17, 2025
…58345)

SheldonTsen pushed a commit to SheldonTsen/ray that referenced this pull request Dec 1, 2025
…58345)

SheldonTsen pushed a commit to SheldonTsen/ray that referenced this pull request Dec 1, 2025
…oject#58350)

Future-Outlier pushed a commit to Future-Outlier/ray that referenced this pull request Dec 7, 2025
…58345)

Signed-off-by: Future-Outlier <eric901201@gmail.com>
Future-Outlier pushed a commit to Future-Outlier/ray that referenced this pull request Dec 7, 2025
…oject#58350)

Signed-off-by: Future-Outlier <eric901201@gmail.com>

Labels

`go` (add ONLY when ready to merge, run all tests), `serve` (Ray Serve Related Issue)

2 participants