Apple/OpenELM: An Efficient Open-Source Family of Language Models

News Source: Huggingface.co

News Summary

  • We release both pretrained and instruction-tuned models with 270M, 450M, 1.1B, and 3B parameters (a minimal loading sketch appears at the end of this summary). Our pre-training dataset contains RefinedWeb, deduplicated PILE, a subset of RedPajama, and a subset of Dolma v1.6, totaling approximately 1.8 trillion tokens.
  • OpenELM uses a layer-wise scaling strategy to efficiently allocate parameters within each layer of the transformer model, leading to enhanced accuracy (see the sketch after this list).
  • Trained on publicly available datasets, these models are made available without any safety guarantees.
  • Consequently, there exists the possibility of these models producing outputs that are inaccurate, harmful, biased, or objectionable in response to user prompts.
  • Thus, it is imperative for users and developers to undertake thorough safety testing and implement appropriate filtering mechanisms tailored to their specific requirements.
  • We pretrained OpenELM models using the CoreNet library.
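
The sketch below illustrates the general idea behind layer-wise scaling: rather than giving every transformer block the same width, the number of attention heads and the FFN expansion ratio grow from the first layer to the last, shifting parameters toward deeper layers under a fixed budget. The specific interpolation ranges, rounding rules, and parameter names here are assumptions for illustration; OpenELM's actual configuration lives in the CoreNet library and the released model cards.

```python
def layerwise_scaling(num_layers, d_model, head_dim,
                      alpha_min=0.5, alpha_max=1.0,
                      beta_min=0.5, beta_max=4.0):
    """Assign each layer its own attention/FFN width via linear interpolation.

    alpha scales the number of attention heads, beta scales the FFN hidden
    dimension; both grow linearly from the first layer to the last. The
    exact ranges and rounding are illustrative assumptions, not OpenELM's
    published values.
    """
    configs = []
    for i in range(num_layers):
        t = i / max(num_layers - 1, 1)                    # 0.0 at first layer, 1.0 at last
        alpha = alpha_min + (alpha_max - alpha_min) * t   # attention width factor
        beta = beta_min + (beta_max - beta_min) * t       # FFN expansion factor
        num_heads = max(1, round(alpha * d_model / head_dim))
        ffn_dim = round(beta * d_model)
        configs.append({"layer": i, "num_heads": num_heads, "ffn_dim": ffn_dim})
    return configs

if __name__ == "__main__":
    # Small example: 4 layers, model width 1280, head size 64.
    for cfg in layerwise_scaling(num_layers=4, d_model=1280, head_dim=64):
        print(cfg)
```

Under this scheme, early layers stay narrow while later layers get more heads and wider FFNs, which is how the release describes allocating parameters more efficiently than a uniform-width stack of the same total size.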
Authors: Sachin Mehta, Mohammad Hossein Sekhavat, Qingqing Cao, Maxwell Horton, Yanzi Jin, Chenfan Sun, Iman Mirzadeh, Mahyar Najibi, Dmitry Belenko, Peter Zatloukal, Mohammad Rastegari

We introduce OpenELM, […]
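
As a minimal sketch of using one of the released checkpoints, the snippet below loads a model through the Hugging Face `transformers` library. The model ID follows the released naming scheme (270M/450M/1.1B/3B, with "-Instruct" variants), `trust_remote_code=True` is assumed to be required because the architecture ships as custom modeling code on the Hub, and the Llama-2 tokenizer is an assumption; defer to the model card for the exact tokenizer and access requirements.

```python
# pip install transformers torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Assumed model ID based on the released sizes; swap in another variant as needed.
model_id = "apple/OpenELM-270M"

# Custom OpenELM modeling code is loaded from the Hub, hence trust_remote_code.
model = AutoModelForCausalLM.from_pretrained(model_id, trust_remote_code=True)

# Assumption: the checkpoints reuse a Llama-style tokenizer rather than shipping
# their own; use whatever tokenizer the model card specifies.
tokenizer = AutoTokenizer.from_pretrained("meta-llama/Llama-2-7b-hf")

inputs = tokenizer("Large language models are", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```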
