Introduction
CosmicAC provides managed compute for machine learning workloads. Infrastructure setup can delay execution and divert attention from model development. CosmicAC abstracts this setup, allowing jobs to run immediately and scale as needed without manual server reconfiguration.
Job Types
CosmicAC supports several job types for different ML workflows.
GPU Container
High-performance containers with direct GPU access for training, experimentation, and development.
GPU containers let you:
- Run on-demand GPU compute without managing infrastructure.
- Access GPU hardware directly through secure device plugins.
- Work in VM-level isolated environments for secure, dedicated compute.
- Maintain full control over your environment — install packages, run scripts, and configure as needed.
See CLI Commands: Job Management for the full reference on jobs init, jobs create, jobs list, and jobs shell.
```shell
npx cosmicac jobs init
npx cosmicac jobs create
npx cosmicac jobs list
npx cosmicac jobs shell <jobId> <containerId>
```

See Getting Started: Creating a GPU Container Job to create your first container.
Managed Inference
Run inference on open-source models like Qwen through a managed API.
Managed Inference lets you:
- Access open-source models without deploying or managing serving infrastructure.
See CLI Commands: Managed Inference for the full reference on inference init, inference list-models, and inference chat.
```shell
npx cosmicac inference init
npx cosmicac inference list-models
npx cosmicac inference chat --message "Explain quantum computing."
```

See Getting Started: Creating a Managed Inference Job to deploy your first model.
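For applications that call the Managed Inference API directly rather than through the CLI, a request looks roughly like the sketch below. The endpoint URL and payload schema here are assumptions (an OpenAI-style chat format), not CosmicAC's documented contract; consult the Managed Inference I/O Specs reference for the real request schema, and API Key Management for obtaining a key.

```python
import json

# Placeholder endpoint -- substitute the URL from your CosmicAC deployment.
INFERENCE_URL = "https://api.cosmicac.example/v1/chat/completions"

def build_chat_request(api_key: str, model: str, message: str) -> tuple[dict, bytes]:
    """Return (headers, body) for a single-turn chat request.

    The payload shape is a hypothetical OpenAI-style schema, shown only to
    illustrate API-key authentication and a JSON request body.
    """
    headers = {
        "Authorization": f"Bearer {api_key}",  # API key auth, per the docs
        "Content-Type": "application/json",
    }
    body = json.dumps({
        "model": model,
        "messages": [{"role": "user", "content": message}],
    }).encode("utf-8")
    return headers, body

headers, body = build_chat_request("sk-demo", "qwen", "Explain quantum computing.")
print(json.loads(body)["messages"][0]["role"])
```

Pairing the headers and body with any HTTP client (`urllib.request`, `requests`, etc.) then sends the request; only the schema above is assumed, the transport is standard HTTP.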
Continued Pre-training
Extend base models on your own data for domain-specific tasks.
Continued Pre-training lets you:
- Train on your own datasets.
- Save checkpoints at intervals during training.
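The interval-checkpointing behavior above can be sketched generically. This is an illustrative training loop, not CosmicAC's actual trainer; every name in it is hypothetical, and the empty marker file stands in for serializing real model and optimizer state.

```python
import os
import tempfile

def train(steps: int, checkpoint_every: int, ckpt_dir: str) -> list[str]:
    """Run a toy training loop, saving a checkpoint every N steps.

    Illustrative only: a real run would perform an optimization step on
    your dataset and call something like torch.save(state, path).
    """
    saved = []
    for step in range(1, steps + 1):
        # ... one optimization step on your own data would go here ...
        if step % checkpoint_every == 0:
            path = os.path.join(ckpt_dir, f"checkpoint-{step:06d}.pt")
            with open(path, "wb") as f:
                f.write(b"")  # stand-in for serialized model state
            saved.append(path)
    return saved

ckpt_dir = tempfile.mkdtemp()
saved = train(steps=10, checkpoint_every=4, ckpt_dir=ckpt_dir)
print([os.path.basename(p) for p in saved])
```

With `steps=10` and `checkpoint_every=4`, checkpoints land at steps 4 and 8; tuning the interval trades recovery granularity against storage and I/O cost.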
Why CosmicAC?
Minimal setup — Submit jobs via the CLI or web interface. CosmicAC provisions GPU resources and schedules your workload automatically, with no manual server requests or environment configuration.
Secure, isolated environments — Each workload runs inside a KubeVirt virtual machine, providing VM-level isolation while maintaining direct GPU access.
Fast provisioning — Start workloads in minutes, not days. CosmicAC replaces manual SLURM-based workflows with automated provisioning and scheduling.
Built-in inference serving — Deploy models instantly via the Managed Inference API. CosmicAC handles API key authentication, load balancing, and service discovery.
Real-time notifications — Receive email and push notifications when costs exceed thresholds or errors occur.
Who is CosmicAC for?
| Role | Use Case |
|---|---|
| ML Engineers | Train models, run experiments |
| Data Scientists | Deploy inference pipelines |
| Software Engineers | Integrate inference API into applications |
| DevOps Teams | Manage GPU infrastructure at scale |
Core Architecture
CosmicAC uses Kubernetes for orchestration and KubeVirt for secure workload isolation. Kubernetes schedules containers, allocates GPU resources, and manages job lifecycle. KubeVirt runs each workload in an isolated virtual machine without requiring privileged containers, applying standard Kubernetes security controls (RBAC, SELinux, network policies) while exposing GPU devices through secure device plugins.
Kubernetes Implementation
CosmicAC uses Kubernetes as its core orchestration layer, replacing manual SLURM-based workflows with automated provisioning and scheduling.
| Before (SLURM) | After (Kubernetes) |
|---|---|
| Request servers manually | Submit jobs via CosmicAC |
| Configure SLURM | Infrastructure provisioned automatically |
| Set up the environment | Containers scheduled automatically |
| Wait days for setup | Start workloads in minutes |
See System Components for detailed documentation of the architecture.
What's next?
Overview
- GPU Container — Learn about GPU container jobs and direct GPU access.
- Managed Inference — Run open-source models without managing serving infrastructure.
- Continued Pre-training — Extend base models on your own data for domain-specific tasks.
Getting Started
- Installation — Install and configure the CosmicAC CLI.
- GPU Container Job — Create your first GPU container job.
- Managed Inference Job — Create an API key and connect to a managed inference service.
GPU Container
- How To Create a GPU Container — Create a GPU container job using the CLI.
- How To Access a GPU Container — Connect to your container and open a shell session.
Managed Inference
- How to Connect to a Managed Inference API — Configure your API key and send inference requests.
References
- CLI Commands — Full command reference for the CosmicAC CLI.
- Authentication — Authenticate the CLI with your CosmicAC account.
- Job Management — Create, manage, and monitor jobs from the terminal.
- Managed Inference Commands — Configure and run inference from the terminal.
- GPU Types — Available GPU hardware configurations and vRAM options.
- API Key Management — Generate, store, and manage API keys.
- Managed Inference I/O Specs — Supported input modes, response formats, and request schemas.
- System Components — Core platform architecture and component interactions.