Data Processing Solutions

Data Processing

Data engines ready for AI.

Overview

New Data Demands

To transform your enterprise, AI agents need continuous access to your data, putting strain on data infrastructure not designed for agentic reasoning loops.

By accelerating unstructured and structured data processing with NVIDIA cuDF and NVIDIA cuVS, enterprises can meet the new volume and velocity of data demands from AI, while leveraging the data infrastructure they've invested in for years.

The world's most popular data engines run on the accelerated computing platform—helping agents access structured data living in tables and unstructured data living as PDFs, emails, images, and videos across the enterprise.

NVIDIA cuDF and cuVS Adopted by World's Leading Data Platforms

Learn how leading data platforms are using NVIDIA cuDF and cuVS to accelerate structured analytics and unstructured vector search for AI-ready data.

Benefits

Transform Your Data for AI

Massive Performance Gains

The accelerated computing platform delivers up to 20x speedup for data processing, enabling enterprises to take action faster with new use cases.

Significant Cost Savings

By running on the NVIDIA-optimized stack, organizations have cut costs by 80% or more, so their data infrastructure does more with less.

Easy to Adopt

The world’s most popular analytics and vector data engines have drop-in accelerators to make adoption straightforward, including Apache Spark, OpenSearch, and more.

AI-Ready Data

NVIDIA cuVS supplies context from the PDFs, messages, and emails that hold 90% of enterprise data, and NVIDIA cuDF supplies ground truth from terabytes of structured data processed in minutes, so your data is ready for agentic AI.

Products

CUDA-X for Data Processing

cuDF and cuVS are CUDA-X™ toolkits, built on highly optimized CUDA® primitives, to accelerate the data processing ecosystem.

cuDF for Structured Data

Accelerates analytics engines on NVIDIA GPUs
Includes drop-in accelerators for Apache Spark, Presto, Polars, and DuckDB
Cuts analytical query times from hours to minutes

cuVS for Unstructured Data

GPU-accelerated vector search and index building for RAG and AI pipelines
Integrates with OpenSearch, Elastic, Milvus, and more
Reduces vector index build times from hours to minutes
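
To make "vector search" concrete, here is a minimal CPU-only sketch of the underlying operation: given a query embedding, return the stored embeddings most similar to it. This is not the cuVS API. It is an exact brute-force scan in plain Python, and the function names and two-dimensional toy vectors are hypothetical. Index-based approaches exist so that a query does not have to compare against every stored vector this way.

```python
import heapq
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query, vectors, k):
    """Return the indices of the k vectors most similar to the query."""
    scored = ((cosine_similarity(query, v), i) for i, v in enumerate(vectors))
    return [i for _, i in heapq.nlargest(k, scored)]


# Hypothetical 2-D "embeddings"; real embeddings have hundreds of dimensions.
docs = [[1.0, 0.0], [0.0, 1.0], [0.9, 0.1], [-1.0, 0.0]]
nearest = top_k([1.0, 0.0], docs, k=2)
print(nearest)
```

Each query in this scan touches every stored vector, so its cost grows linearly with the size of the collection.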

Adopters

Data Processing Ecosystem

From analytical SQL queries to vector search, organizations are adopting NVIDIA's accelerated computing platform into their existing data platforms to accelerate AI-ready pipelines.

Data Processing on NVIDIA Vera

For enterprises running agentic AI workloads at scale, AI agents dramatically increase the number of concurrent, continuous, small-scale queries against structured enterprise data. NVIDIA Vera has 1.2 TB/s of memory bandwidth and a high-speed on-chip fabric, offering the per-core performance, high throughput, and predictability under load needed to support the increased volume and velocity of queries. For the Starburst analytics engine, NVIDIA Vera processed queries 3x faster than x86, reducing query execution from minutes to seconds. The Redpanda streaming engine saw a 6x improvement in p99 latency versus x86, enhancing the reliability of the data engine.

Coming soon.

CPU for the Age of AI

Resources

The Latest in Data Processing

Blogs
Sessions
Videos

NVIDIA cuDF and cuVS Adopted by World's Leading Data Platforms

NVIDIA's accelerated computing platform is fueling modern enterprise data processing. NVIDIA cuDF and cuVS are integrated with the world's most widely used open source data engines, which developers download over 200 million times a month, and are used across enterprise data platforms, databases, and data lakes.

How Snap Scaled A/B Testing With NVIDIA cuDF

Snap processes 10+ petabytes daily for A/B testing across 940M+ users. Accelerating Apache Spark with NVIDIA cuDF on Google Cloud delivered 4x faster runtimes and 76% cost savings.

Accelerating Large-Scale Analytics With Velox and NVIDIA cuDF

IBM and NVIDIA integrate cuDF with the Velox execution engine, enabling GPU-native query execution for Presto and Apache Spark—delivering up to 12x faster analytics than CPU-only systems.

Data Is the Ground Truth and Context for AI

Hear CEO Jensen Huang's thoughts on the role of the data processing ecosystem in the age of agentic AI.

Watch Keynote

IBM Reinvents Data Processing

Presto, the SQL analytics engine in IBM watsonx.data, is accelerated by cuDF for a 5x speedup and 83% cost savings.

Watch Demo

Processing 100 Million Rows of Data in Under 2 Seconds With Polars

The Polars GPU engine executes Polars code on GPUs for massive speedups.

Watch Demo

Next Steps

Ready to Learn More?

Get the latest on data processing news, content, and events.

Stay Informed

cuDF

Open source toolkit for structured data using GPU parallelism and memory bandwidth to accelerate data processing and analytics workflows.

Get Started With cuDF

cuVS

Open source library for unstructured vector search and data clustering that enables faster vector searches and index builds.

Get Started With cuVS

AWS

Dell AI Data Platform

Google Cloud

HPE

IBM watsonx.data

Oracle

Redpanda

Starburst