Create an endpoint
To create a new Serverless endpoint through the Runpod web interface:
- Navigate to the Serverless section of the Runpod console.
- Click New Endpoint.
- On the Deploy a New Serverless Endpoint screen, choose your deployment source:
  - Import Git Repository (if GitHub is connected). See [Deploy from GitHub](/serverless/workers/github-integration) for details.
  - Import from Docker Registry. See Deploy from Docker for details.
  - Or select a preconfigured endpoint under Ready-to-Deploy Repos.
- Follow the UI steps to configure your selected source (Docker image, GitHub repo), then click Next.
- Configure your endpoint settings:
  - Set the Endpoint Name.
  - Choose your Endpoint Type: select Queue for traditional queue-based processing or Load balancer for direct HTTP access (see Load balancing endpoints for details).
  - Under GPU Configuration, select the appropriate GPU types and configure worker settings.
  - Set Environment Variables and other options as needed. For a full list of options, see Endpoint configurations.
- Click Create Endpoint to deploy.
You can optimize cost and availability by specifying GPU preferences in order of priority. Runpod attempts to allocate your first-choice GPU. If it is unavailable, Runpod automatically uses the next GPU in your priority list, so your workloads run on the best available resources. You can enable or disable particular GPU types using the Advanced > Enabled GPU Types section.
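The priority-ordered fallback described above can be pictured with a small sketch. This is only an illustration of the selection idea, not Runpod's actual allocator; the function name `pick_gpu` and the GPU labels are hypothetical.

```python
from typing import Iterable, Optional, Sequence


def pick_gpu(priority: Sequence[str], available: Iterable[str]) -> Optional[str]:
    """Return the first GPU type, in priority order, that is currently available.

    Illustrative only: `priority` is your ordered preference list and
    `available` is whatever GPU types happen to have capacity right now.
    """
    available_set = set(available)
    for gpu in priority:
        if gpu in available_set:
            return gpu
    # No preferred GPU type has capacity.
    return None


# Hypothetical GPU labels, listed from most to least preferred.
preferences = ["gpu-a", "gpu-b", "gpu-c"]
choice = pick_gpu(preferences, available=["gpu-c", "gpu-b"])
```

Here the first choice has no capacity, so the sketch falls through to the second entry in the list.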
After deployment, your endpoint has a unique API URL (https://api.runpod.ai/v2/{endpoint_id}/) that you can use to send requests. For information on how to interact with your endpoint, see Endpoint operations.
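As a rough sketch of what a request to that URL might look like, the snippet below builds (but does not send) an HTTP request using only the Python standard library. The `run` operation path, the bearer-token `Authorization` header, and the `{"input": ...}` payload wrapper are assumptions not stated on this page, so check Endpoint operations for the authoritative details. The helper name `build_request` and the sample values are hypothetical.

```python
import json
import urllib.request

BASE_URL = "https://api.runpod.ai/v2"


def build_request(endpoint_id: str, api_key: str, payload: dict,
                  operation: str = "run") -> urllib.request.Request:
    """Build a POST request for a Serverless endpoint without sending it.

    Assumptions (not confirmed by this page): the operation path, the
    bearer-token header, and the {"input": ...} body wrapper.
    """
    url = f"{BASE_URL}/{endpoint_id}/{operation}"
    body = json.dumps({"input": payload}).encode("utf-8")
    headers = {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }
    return urllib.request.Request(url, data=body, headers=headers, method="POST")


# Placeholder endpoint ID and API key; substitute your own values.
req = build_request("your_endpoint_id", "YOUR_API_KEY", {"prompt": "Hello"})
```

To actually send the request, pass `req` to `urllib.request.urlopen` and read the JSON response.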
Edit an endpoint
You can modify your endpoint’s configuration at any time:
- Navigate to the Serverless section in the Runpod console.
- Click the three dots in the bottom right corner of the endpoint you want to modify.
- Click Edit Endpoint.
- Update any configuration parameters as needed:
  - Endpoint name
  - Worker configuration
  - Docker configuration (container image or version)
  - Environment variables
  - Storage
- Click Save Endpoint to apply your changes.
To force an immediate configuration update, temporarily set Max Workers to 0, trigger a Release, then restore your desired worker count and update again.
Add a network volume
Attach persistent storage to share data across workers:
- Navigate to the Serverless section in the Runpod console.
- Click the three dots in the bottom right corner of the endpoint you want to modify.
- Click Edit Endpoint.
- Expand the Advanced section.
- Select a volume from the dropdown below Network Volume.
- Click Save Endpoint to attach the volume to your endpoint.
Delete an endpoint
When you no longer need an endpoint, you can remove it from your account:
- Navigate to the Serverless section in the Runpod console.
- Click the three dots in the bottom right corner of the endpoint you want to delete.
- Click Delete Endpoint.
- Type the name of the endpoint, then click Confirm.
Best practices for endpoint management
- Start small and scale: Begin with fewer workers and scale up as demand increases.
- Monitor usage: Regularly check your endpoint metrics to optimize worker count and GPU allocation.
- Use GPU prioritization: Set up fallback GPU options to balance cost and availability.
- Leverage network volumes for large models or datasets rather than embedding them in your container image.
- Set appropriate timeouts based on your workload’s processing requirements.