TorchForge RL pipelines run on Together AI Instant Clusters with OpenEnv and CodeSandbox integration
AI Impact Summary
TorchForge RL pipelines can now be deployed on Together AI Instant Clusters with distributed training and sandboxed environments, enabling CPU/GPU co-scheduling and RDMA-optimized communication for scalable RL workloads. The integration combines OpenEnv, Together CodeSandbox, Together Code Interpreter, TorchForge, Monarch, TorchStore, and a vLLM policy server into a cohesive workflow that supports tool-based environments and code execution inside RL loops. A practical demo trains a Qwen 1.5B model to play BlackJack using GRPO, with Kubernetes manifests and a getting-started guide providing an operational path via kubectl. This establishes a foundation for Together AI's next-generation RL service: a roadmap-aligned capability for customers who need cloud-native, multi-component RL pipelines with sandboxed tool use.
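The GRPO algorithm used in the BlackJack demo scores each sampled completion relative to its group rather than against a learned value critic. A minimal sketch of that group-relative normalization step, in plain Python — the function name and reward encoding here are illustrative assumptions, not TorchForge's actual API:

```python
# Conceptual sketch of GRPO's group-relative advantage computation.
# Function name and reward values are hypothetical, for illustration only.
from statistics import mean, pstdev

def group_relative_advantages(rewards, eps=1e-8):
    """Normalize each rollout's reward against its group's statistics.

    GRPO samples a group of completions per prompt and computes each
    advantage as (reward - group mean) / group std, avoiding a critic.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]

# Example: four BlackJack rollouts for one prompt, reward +1 win / -1 loss.
advantages = group_relative_advantages([1.0, -1.0, -1.0, 1.0])
```

Winning rollouts receive positive advantages and losing rollouts negative ones, which is what makes a sparse win/loss reward signal usable in the policy-gradient update.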
Affected Systems
- Date: not specified
- Change type: capability
- Severity: info