OpenAI's Foundry will let customers buy dedicated compute to run its AI models

News Summary

  • (The context window refers to the text that the model considers before generating additional text; longer context windows essentially let the model “remember” more text.)
  • (GPT-3.5 Turbo appears to refer to the ChatGPT Turbo model.) Expanded product brief and full source: — Travis Fischer (@transitive_bs), February 21, 2023. Eagle-eyed Twitter and Reddit users spotted that one of the text-generating models listed in the instance pricing chart has a 32k max context window.
  • In screenshots of documentation published to Twitter by users with early access, OpenAI describes the forthcoming offering, called Foundry, as “designed for cutting-edge customers running larger workloads.” “[Foundry allows] inference at scale with full control over the model configuration and performance profile,” the documentation reads.
  • GPT-3.5, OpenAI’s latest text-generating model, has a 4k max context window, suggesting that this mysterious new model could be the long-awaited GPT-4 — or a stepping stone toward it. OpenAI is under increasing pressure to turn a profit after a multibillion-dollar investment from Microsoft.
  • In addition, Foundry will provide some level of version control, letting customers decide whether or not to upgrade to newer model releases, as well as “more robust” fine-tuning for OpenAI’s latest models. Foundry will also offer service-level commitments for instance uptime and on-calendar engineering support.
  • Rentals will be based on dedicated compute units with three-month or one-year commitments; running an individual model instance will require a specific number of compute units (see the chart below). Instances won’t be cheap.
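To make the 4k-vs-32k distinction above concrete, here is a minimal, hypothetical sketch of what a context window limit means in practice: only the most recent tokens that fit in the window are visible to the model, and older text falls out. This is an illustration only (the function name and whitespace "tokenization" are stand-ins; real models use subword tokenizers and OpenAI's actual handling is not documented here).

```python
# Illustrative sketch, NOT OpenAI's implementation: a context window caps how
# many tokens the model can attend to. Tokens here are naively whitespace-split
# words; production systems use subword tokenizers (e.g. BPE).

def fit_to_context(history: list[str], max_tokens: int) -> list[str]:
    """Keep only the most recent tokens that fit inside the window."""
    tokens: list[str] = []
    for message in history:
        tokens.extend(message.split())
    # Anything beyond the window is silently dropped from the front.
    return tokens[-max_tokens:]

conversation = ["hello there", "summarize the product brief", "ok here it is"]
window_4k = fit_to_context(conversation, 4_096)    # GPT-3.5-sized window
window_32k = fit_to_context(conversation, 32_768)  # the rumored larger window
print(len(window_4k), len(window_32k))
```

An 8x larger window means roughly 8x more prior text survives this truncation, which is why a 32k model would be notable for long-document workloads.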
OpenAI is quietly launching a new developer platform that lets customers run the company's newer machine learning models, like GPT-3.5, on dedicated capacity. In screenshots of documentation publishe [+3219 chars]