use cached contexts for access logs in request path#55166
Conversation
Signed-off-by: abrar <abrar@anyscale.com>
Signed-off-by: abrar <abrar@anyscale.com>
There was a problem hiding this comment.
Summary of Changes
Hello @abrarsheikh, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!
This pull request primarily focuses on optimizing the logging mechanism within Ray Serve, particularly for access logs. It introduces a strategy to pre-compute and cache common logging context attributes at the proxy and replica levels, reducing the overhead of dynamically gathering this information for each log entry. This change also includes a performance improvement for JSON log serialization and refactors how core Ray context is added to logs.
Highlights
- Optimized Access Logging: I've implemented a strategy to pre-compute and cache common Ray core and Serve-specific context attributes (such as job ID, worker ID, component type, etc.) for access logs within Serve proxies and replicas. This significantly reduces the overhead of dynamically gathering this information for each log entry, as it avoids expensive runtime lookups.
- Logging Filter Bypass Mechanism: A new
skip_context_filtermechanism has been introduced. This allows log records that have already been pre-populated with context to bypass certain context-enriching logging filters, preventing redundant processing and improving efficiency. - Faster JSON Serialization: For JSON log formatting, I've switched the underlying serialization library from Python's built-in
jsonmodule toorjson.orjsonis known for its higher performance, which will lead to faster log processing. - Refactored Core Context Retrieval: The logic for retrieving Ray core logging context has been centralized into a reusable class method within
CoreContextFilter. This improves code organization and reusability across different logging components. - Streamlined Replica Logging: Redundant 'started executing request' log messages have been removed from Serve replicas. The new optimized access logging mechanism now provides sufficient detail, making these previous logs unnecessary.
Using Gemini Code Assist
The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.
Invoking Gemini
You can request assistance from Gemini at any point in your pull request via creating an issue comment (i.e. comment on the pull request page) using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands.
| Feature | Command | Description |
|---|---|---|
| Code Review | /gemini review |
Performs a code review for the current pull request in its current state. |
| Pull Request Summary | /gemini summary |
Provides a summary of the current pull request in its current state. |
| Comment | @gemini-code-assist | Responds in comments when explicitly tagged, both in issue comments and review comments. |
| Help | /gemini help |
Displays a list of available commands. |
Customization
To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.
Limitations & Feedback
Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments or fill out our survey to provide feedback.
You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.
Footnotes
-
Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution. ↩
There was a problem hiding this comment.
Code Review
This pull request introduces an optimization for access logging by caching the logging context. The changes are well-structured and the main idea is sound. However, I've identified a few issues:
- The implementation for non-JSON (i.e., TEXT) formatted logs is incorrect, which would lead to missing context information in access logs for both proxies and replicas.
- The
application_nameis missing from the proxy's access logs. - There's some repetitive code in the logging filters that could be refactored for better maintainability.
Signed-off-by: abrar <abrar@anyscale.com>
Signed-off-by: abrar <abrar@anyscale.com>
Signed-off-by: abrar <abrar@anyscale.com>
Signed-off-by: abrar <abrar@anyscale.com>
Signed-off-by: abrar <abrar@anyscale.com>
Signed-off-by: abrar <abrar@anyscale.com>
|
@can-anyscale could you help stamp as core codeowner? |
### Performance Comparison Table | Metric | GET / (Before) | GET / (After) | Aggregated (Before) | Aggregated (After) | |--------------------------|----------------|----------------|----------------------|---------------------| | # Requests | 27,390 | 284,097 | 27,390 | 284,097 | | # Fails | 0 | 0 | 0 | 0 | | Median (ms) | 450 | 400 | 450 | 400 | | 95%ile (ms) | 520 | 520 | 520 | 520 | | 99%ile (ms) | 540 | 570 | 540 | 570 | | Average (ms) | 451.13 | 400.84 | 451.13 | 400.84 | | Min (ms) | 226 | 160 | 226 | 160 | | Max (ms) | 728 | 777 | 728 | 777 | | Average size (bytes) | 13 | 13 | 13 | 13 | | Current RPS | 220.5 | 256.8 | 220.5 | 256.8 | | Current Failures/s | 0 | 0 | 0 | 0 | --------- Signed-off-by: abrar <abrar@anyscale.com> Signed-off-by: Andrew Grosser <dioptre@gmail.com>
### Performance Comparison Table | Metric | GET / (Before) | GET / (After) | Aggregated (Before) | Aggregated (After) | |--------------------------|----------------|----------------|----------------------|---------------------| | # Requests | 27,390 | 284,097 | 27,390 | 284,097 | | # Fails | 0 | 0 | 0 | 0 | | Median (ms) | 450 | 400 | 450 | 400 | | 95%ile (ms) | 520 | 520 | 520 | 520 | | 99%ile (ms) | 540 | 570 | 540 | 570 | | Average (ms) | 451.13 | 400.84 | 451.13 | 400.84 | | Min (ms) | 226 | 160 | 226 | 160 | | Max (ms) | 728 | 777 | 728 | 777 | | Average size (bytes) | 13 | 13 | 13 | 13 | | Current RPS | 220.5 | 256.8 | 220.5 | 256.8 | | Current Failures/s | 0 | 0 | 0 | 0 | --------- Signed-off-by: abrar <abrar@anyscale.com> Signed-off-by: jugalshah291 <shah.jugal291@gmail.com>
### Performance Comparison Table | Metric | GET / (Before) | GET / (After) | Aggregated (Before) | Aggregated (After) | |--------------------------|----------------|----------------|----------------------|---------------------| | # Requests | 27,390 | 284,097 | 27,390 | 284,097 | | # Fails | 0 | 0 | 0 | 0 | | Median (ms) | 450 | 400 | 450 | 400 | | 95%ile (ms) | 520 | 520 | 520 | 520 | | 99%ile (ms) | 540 | 570 | 540 | 570 | | Average (ms) | 451.13 | 400.84 | 451.13 | 400.84 | | Min (ms) | 226 | 160 | 226 | 160 | | Max (ms) | 728 | 777 | 728 | 777 | | Average size (bytes) | 13 | 13 | 13 | 13 | | Current RPS | 220.5 | 256.8 | 220.5 | 256.8 | | Current Failures/s | 0 | 0 | 0 | 0 | --------- Signed-off-by: abrar <abrar@anyscale.com> Signed-off-by: Douglas Strodtman <douglas@anyscale.com>

Performance Comparison Table