Build and Deploy a Remote MCP Server to GKE in 30 Minutes

Abdelfettah Sghiouar
Cloud Developer Advocate, Google Cloud
Integrating context from tools and data sources into LLMs can be challenging, which complicates the development of AI agents. To address this challenge, Anthropic introduced the Model Context Protocol (MCP), which standardizes how applications provide context to these models. Developers often want to build an MCP server for their APIs so that fellow developers can use those APIs as context in their own applications. Google Kubernetes Engine (GKE) provides a scalable, reliable, and secure environment to deploy these remote MCP servers.
This guide shows the straightforward process of setting up a secure remote MCP server on GKE.
MCP transports
The Model Context Protocol follows a client-server architecture. It initially only supported running the server locally using the stdio transport. The protocol has since evolved and now supports remote access transports, specifically Streamable HTTP.
With Streamable HTTP, the server operates as an independent process that can handle multiple client connections. This transport uses HTTP POST and GET requests. The server must provide a single HTTP endpoint path that supports both POST and GET methods, such as https://example.com/mcp. You can learn more about the different transports in the official documentation.
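To make the single-endpoint design concrete, the sketch below builds (but does not send) the kind of HTTP POST a client might issue to list a server's tools. The URL is the placeholder from the text and the helper name build_mcp_post is hypothetical. A real client would first complete the protocol's initialization handshake, which is omitted here.

```python
import json
import urllib.request


def build_mcp_post(endpoint: str, method: str, request_id: int = 1) -> urllib.request.Request:
    """Build a JSON-RPC 2.0 request for a Streamable HTTP MCP endpoint."""
    payload = {"jsonrpc": "2.0", "id": request_id, "method": method}
    return urllib.request.Request(
        endpoint,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            # The server may answer with plain JSON or with an SSE stream,
            # so the client advertises that it accepts both.
            "Accept": "application/json, text/event-stream",
        },
        method="POST",
    )


# Placeholder endpoint from the text; nothing is sent over the network here.
request = build_mcp_post("https://example.com/mcp", "tools/list")
print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))
```

The same path also answers GET requests, which a client can use to open a stream for server-initiated messages.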
Benefits of running an MCP server on GKE
Running an MCP server remotely on GKE provides several architecture benefits:
- Scalability: GKE Autopilot is built to handle highly variable traffic. Because a stateless MCP server keeps no per-client data between requests, GKE can scale it horizontally to absorb spikes in demand.
- Centralized access: Teams can share access to a centralized MCP server, allowing developers to connect from local machines, agents, or pipelines instead of running redundant local servers. Updates to the central server immediately benefit everyone.
- Enhanced security: The Kubernetes Gateway API combined with SSL certificates provides an easy way to enforce encrypted traffic. Clients can reach the MCP server only over HTTPS, which protects requests and responses from interception in transit.
Prerequisites
Before starting, ensure the following tools are installed:
- Python 3.10 or higher
- uv (for package and project management, see the installation documentation)
- Google Cloud SDK (gcloud)
- kubectl command-line tool
Installation
Prepare environment variables
Create a folder, mcp-on-gke, to store the code for the server and deployment.
Now configure the Google Cloud credentials and set the active project.
Start creating the GKE Autopilot cluster in the background. This process takes a few minutes, so starting it now lets the cluster provision while you complete the rest of the setup. Use an Autopilot version in which Cost-Optimized Compute (CCOP) is enabled, so that autoscaling is fast.
Use uv to create a project, which will generate a pyproject.toml file.
Next, create the additional files needed: server.py for the MCP server code, test_server.py for testing, and a Dockerfile for the container deployment.
Math MCP server
Large language models are excellent at non-deterministic tasks, such as generating text, summarizing ideas, and reasoning about concepts. However, they can be unreliable for deterministic tasks like math operations. To solve this, developers can expose deterministic functions as tools that the model calls instead of computing the result itself. Using FastMCP, a framework for building MCP servers in Python, it is possible to create a simple math server with two tools: add and subtract.
First, add FastMCP as a dependency.
Copy the following code into server.py to create the server.
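A minimal sketch of such a server, assuming the FastMCP 2.x API (the FastMCP class, its tool() decorator, and run() with the streamable-http transport). The server name, the port 8080, and the /mcp path are illustrative assumptions chosen to match the URLs used later in this guide, and the build_server and main helpers are hypothetical names.

```python
"""Sketch of server.py: a math MCP server with add and subtract tools."""


def add(a: int, b: int) -> int:
    """Add two numbers and return the sum."""
    return a + b


def subtract(a: int, b: int) -> int:
    """Subtract b from a and return the difference."""
    return a - b


def build_server():
    """Register the math functions as MCP tools on a FastMCP instance."""
    # Imported here so the plain functions stay importable without FastMCP.
    from fastmcp import FastMCP

    mcp = FastMCP("Math MCP Server")  # server name is an illustrative choice
    mcp.tool()(add)
    mcp.tool()(subtract)
    return mcp


def main() -> None:
    """Serve the tools over Streamable HTTP."""
    # Assumption: listen on all interfaces on port 8080 and serve at /mcp,
    # so that a container, a port-forward, or a load balancer can reach it.
    build_server().run(
        transport="streamable-http", host="0.0.0.0", port=8080, path="/mcp"
    )
```

Calling main() at the bottom of the file, typically under an `if __name__ == "__main__":` guard, starts the server. Keeping add and subtract as plain functions means they can be unit-tested without starting it.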
This example uses the streamable-http transport, which is recommended for remote servers. The script encapsulates the logic needed to run a scalable MCP endpoint.
Testing the MCP server locally
Create the test_server.py script to connect to the MCP server and exercise its tools. This is useful for verifying the server before deploying it to GKE.
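A test client along these lines might look like the following sketch, which assumes the FastMCP 2.x Client class and a server exposing add and subtract at http://localhost:8080/mcp. Reading the URL from an MCP_SERVER_URL environment variable is an assumption added here for convenience, and the helper names are hypothetical.

```python
import asyncio
import os

# Assumption: the local server listens on port 8080 and serves at /mcp.
DEFAULT_URL = "http://localhost:8080/mcp"


def server_url() -> str:
    """Return the MCP endpoint to test, defaulting to the local server."""
    return os.environ.get("MCP_SERVER_URL", DEFAULT_URL)


async def exercise_server(url: str) -> None:
    """List the server's tools, then call add and subtract once each."""
    from fastmcp import Client  # third-party dependency, imported lazily

    async with Client(url) as client:
        tools = await client.list_tools()
        print("Tools:", [tool.name for tool in tools])
        print("add(1, 2) ->", await client.call_tool("add", {"a": 1, "b": 2}))
        print("subtract(10, 3) ->", await client.call_tool("subtract", {"a": 10, "b": 3}))


def main() -> None:
    """Run the checks against the configured endpoint."""
    asyncio.run(exercise_server(server_url()))
```

Calling main() runs the checks. To point the script at a different endpoint, either edit DEFAULT_URL or set MCP_SERVER_URL before running it.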
Run the MCP server locally to test the connection:
Then execute the test script in a new terminal to verify the connection.
The output should list the available tools and the results of invoking the add and subtract tools, confirming that the MCP server is functional.
Building the container image
To speed up the deployment process, build the container image while the cluster is still being created.
First, prepare the Dockerfile:
Now, set up the Artifact Registry and build the container image.
Set up Artifact Registry
Build and push the image in parallel
Once the image build is complete, verify that the cluster is ready and retrieve the credentials. If the reported cluster status is not "RUNNING", wait until it is.
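Waiting for the cluster can be automated with a small polling loop. The sketch below shells out to `gcloud container clusters describe` and reads the status field. The function names, the polling interval, and the timeout are illustrative assumptions, and the cluster name and region in the usage comment are placeholders.

```python
import subprocess
import time
from typing import Callable


def gcloud_cluster_status(name: str, region: str) -> str:
    """Ask gcloud for the cluster's status field (for example RUNNING)."""
    result = subprocess.run(
        ["gcloud", "container", "clusters", "describe", name,
         "--region", region, "--format", "value(status)"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout.strip()


def wait_until_running(get_status: Callable[[], str],
                       interval_s: float = 15.0,
                       timeout_s: float = 900.0) -> bool:
    """Poll get_status until it returns RUNNING or the timeout elapses."""
    deadline = time.monotonic() + timeout_s
    while True:
        if get_status() == "RUNNING":
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval_s)


# Example usage (cluster name and region are placeholders):
# wait_until_running(lambda: gcloud_cluster_status("CLUSTER_NAME", "REGION"))
```

Passing the status lookup as a callable keeps the loop independent of gcloud, so it can be exercised with a stub.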
Deploying to GKE with Gateway API and SSL
The next step involves deploying the server workloads and exposing them securely using the Kubernetes Gateway API rather than the legacy Ingress. This guarantees secure, encrypted traffic via SSL certificates.
Create a deployment.yaml file to define the Kubernetes Deployment and Service. Replace the placeholders with your actual project ID and region.
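As an illustration of the fields such a file typically contains, the Python sketch below builds an equivalent Deployment and Service and prints them as JSON, which kubectl accepts in place of YAML. The resource names, the repository and image names, the replica count, and port 8080 are all assumptions. PROJECT_ID and REGION are the placeholders to replace.

```python
import json


def build_manifests(project_id: str, region: str, repo: str = "mcp-repo",
                    image: str = "mcp-server", tag: str = "latest",
                    port: int = 8080) -> list[dict]:
    """Return a Deployment and a ClusterIP Service for the MCP server."""
    labels = {"app": "mcp-server"}
    # Artifact Registry image paths follow
    # REGION-docker.pkg.dev/PROJECT_ID/REPOSITORY/IMAGE:TAG.
    image_ref = f"{region}-docker.pkg.dev/{project_id}/{repo}/{image}:{tag}"
    deployment = {
        "apiVersion": "apps/v1",
        "kind": "Deployment",
        "metadata": {"name": "mcp-server", "labels": labels},
        "spec": {
            "replicas": 2,  # illustrative; Autopilot provisions nodes as needed
            "selector": {"matchLabels": labels},
            "template": {
                "metadata": {"labels": labels},
                "spec": {"containers": [{
                    "name": "mcp-server",
                    "image": image_ref,
                    "ports": [{"containerPort": port}],
                }]},
            },
        },
    }
    service = {
        "apiVersion": "v1",
        "kind": "Service",
        "metadata": {"name": "mcp-server"},
        "spec": {
            "type": "ClusterIP",
            "selector": labels,  # must match the pod labels above
            "ports": [{"port": port, "targetPort": port}],
        },
    }
    return [deployment, service]


manifests = build_manifests("PROJECT_ID", "REGION")
# A v1 List lets kubectl apply both objects from a single JSON document.
print(json.dumps({"apiVersion": "v1", "kind": "List", "items": manifests}, indent=2))
```

The Service selector and the pod template labels share one dictionary, which avoids the common mistake of a Service that matches no pods.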
Apply this configuration to the cluster:
Check that the pods are up and running:
To confirm that the remote MCP server is accessible, try to reach it with a port-forward:
Run the test script to verify the connection. Make sure to edit the MCP server URL in the test script to http://localhost:8080/mcp.
Now let's secure the connection. To do so, we'll use a Google-managed SSL certificate and attach it to a Gateway API resource. First, reserve a static IP address for your load balancer:
Create a DNS A record for your domain, for example mcp.yourdomain.com, that points to $MCP_SERVER_IP.
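DNS changes can take a while to propagate, and a Google-managed certificate is generally provisioned only once the domain resolves to the load balancer's address. A quick check such as the sketch below can confirm that the record is live. The function name dns_points_to is hypothetical, and the injectable resolve parameter is a design choice made here so the check can be tested offline.

```python
import socket
from typing import Callable


def dns_points_to(hostname: str, expected_ip: str,
                  resolve: Callable[[str], tuple] = socket.gethostbyname_ex) -> bool:
    """Return True if hostname currently resolves to expected_ip (IPv4)."""
    try:
        _name, _aliases, addresses = resolve(hostname)
    except OSError:
        # Covers socket.gaierror, raised when the name does not resolve yet.
        return False
    return expected_ip in addresses


# Example usage (replace both values with your domain and reserved IP):
# dns_points_to("mcp.yourdomain.com", "RESERVED_IP_ADDRESS")
```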
Create a Google-Managed Certificate. Replace mcp.yourdomain.com with your actual domain.
Create a gateway.yaml file to provision the load balancer and configure Transport Layer Security (TLS) termination.
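The shape of such a configuration can be sketched in Python as a Gateway plus an HTTPRoute, printed as JSON that kubectl accepts. This assumes the GKE global external Gateway class and a certificate referenced through the networking.gke.io/pre-shared-certs listener option. The Gateway API version, the route name, the backend Service name, and port 8080 are assumptions, while mcp-gateway matches the name used in the status check in this guide.

```python
import json


def build_gateway(hostname: str, ip_name: str, cert_name: str,
                  service: str = "mcp-server", port: int = 8080) -> list[dict]:
    """Return a TLS-terminating Gateway and an HTTPRoute to the MCP Service."""
    gateway = {
        "apiVersion": "gateway.networking.k8s.io/v1",
        "kind": "Gateway",
        "metadata": {"name": "mcp-gateway"},
        "spec": {
            "gatewayClassName": "gke-l7-global-external-managed",
            "listeners": [{
                "name": "https",
                "protocol": "HTTPS",
                "port": 443,
                "tls": {
                    "mode": "Terminate",  # TLS ends at the load balancer
                    "options": {"networking.gke.io/pre-shared-certs": cert_name},
                },
            }],
            # Reference the reserved static IP by its resource name.
            "addresses": [{"type": "NamedAddress", "value": ip_name}],
        },
    }
    route = {
        "apiVersion": "gateway.networking.k8s.io/v1",
        "kind": "HTTPRoute",
        "metadata": {"name": "mcp-route"},
        "spec": {
            "parentRefs": [{"name": "mcp-gateway"}],
            "hostnames": [hostname],
            "rules": [{"backendRefs": [{"name": service, "port": port}]}],
        },
    }
    return [gateway, route]


# IP_ADDRESS_NAME and CERTIFICATE_NAME are placeholders for your own resources.
resources = build_gateway("mcp.yourdomain.com", "IP_ADDRESS_NAME", "CERTIFICATE_NAME")
print(json.dumps({"apiVersion": "v1", "kind": "List", "items": resources}, indent=2))
```

Because the listener only accepts HTTPS on port 443, plain HTTP requests have no listener to land on, which is what restricts clients to encrypted connections.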
Deploying this configuration creates the infrastructure required to route external traffic securely to the MCP server.
Wait a few minutes for the load balancer to become active and the certificate to provision. Developers can check the status using kubectl get gateway mcp-gateway.
Try to reach the remote MCP server: run the test script to verify the connection. Make sure to edit the MCP server URL in the test script to https://mcp.yourdomain.com/mcp.
Cleanup
Continue reading
Deploying Model Context Protocol servers to Kubernetes enables new use cases for integrated agents and AI workflows. To dive deeper into these capabilities, explore the following resources:


