OpenAI's Foundry will let customers buy dedicated compute to run its AI models

News Summary

  • (The context window refers to the text that the model considers before generating additional text; longer context windows essentially let the model “remember” more text.)
  • (GPT-3.5 Turbo appears to refer to the ChatGPT Turbo model.) Expanded product brief and full source: — Travis Fischer (@transitive_bs), February 21, 2023. Eagle-eyed Twitter and Reddit users spotted that one of the text-generating models listed in the instance pricing chart has a 32k max context window.
  • In screenshots of documentation published to Twitter by users with early access, OpenAI describes the forthcoming offering, called Foundry, as “designed for cutting-edge customers running larger workloads.” “[Foundry allows] inference at scale with full control over the model configuration and performance profile,” the documentation reads.
  • GPT-3.5, OpenAI’s latest text-generating model, has a 4k max context window, suggesting that this mysterious new model could be the long-awaited GPT-4 — or a stepping stone toward it. OpenAI is under increasing pressure to turn a profit after a multibillion-dollar investment from Microsoft.
  • In addition, Foundry will provide some level of version control, letting customers decide whether or not to upgrade to newer model releases, as well as “more robust” fine-tuning for OpenAI’s latest models. Foundry will also offer service-level commitments for instance uptime and on-calendar engineering support.
  • Rentals will be based on dedicated compute units with three-month or one-year commitments; running an individual model instance will require a specific number of compute units (see the chart below). Instances won’t be cheap.
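To make the 4k-vs-32k distinction above concrete, here is a minimal, hypothetical sketch of what a context window limit means in practice: only the most recent tokens that fit in the window are visible to the model, and older text falls out. This is an illustration only (the function name and whitespace "tokenization" are stand-ins; real models use subword tokenizers and OpenAI's actual handling is not documented here).

```python
# Illustrative sketch, NOT OpenAI's implementation: a context window caps how
# many tokens the model can attend to. Tokens here are naively whitespace-split
# words; production systems use subword tokenizers (e.g. BPE).

def fit_to_context(history: list[str], max_tokens: int) -> list[str]:
    """Keep only the most recent tokens that fit inside the window."""
    tokens: list[str] = []
    for message in history:
        tokens.extend(message.split())
    # Anything beyond the window is silently dropped from the front.
    return tokens[-max_tokens:]

conversation = ["hello there", "summarize the product brief", "ok here it is"]
window_4k = fit_to_context(conversation, 4_096)    # GPT-3.5-sized window
window_32k = fit_to_context(conversation, 32_768)  # the rumored larger window
print(len(window_4k), len(window_32k))
```

An 8x larger window means roughly 8x more prior text survives this truncation, which is why a 32k model would be notable for long-document workloads.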
OpenAI is quietly launching a new developer platform that lets customers run the company's newer machine learning models, like GPT-3.5, on dedicated capacity. In screenshots of documentation publishe [+3219 chars]