jang 1.1.0

News Source : Pypi.org

News Summary

  • JANG (Jang Adaptive N-bit Grading) is an open-source quantization format and toolkit that makes large language models run on Apple Silicon at 2-bit precision while staying coherent.
  • JANG classifies tensors by sensitivity, giving critical layers (attention) more bits while aggressively compressing the bulk (MLP/experts). The result: a 122B model fits in 46 GB of GPU memory and answers questions correctly.
  • JANG is the GGUF equivalent for MLX: models stay quantized in GPU memory and run at full Metal speed.
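
The sensitivity-based allocation described above can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and is not taken from the JANG codebase: the tier names, the 8-bit and 2-bit widths, the name-based `classify` rule, and the toy tensor sizes are all hypothetical. The sketch only shows how giving a small set of tensors more bits changes the parameter-weighted average bit width and the resulting weight footprint.

```python
# Hypothetical tiers and bit widths; NOT taken from the JANG source.
BITS_BY_TIER = {"critical": 8, "bulk": 2}


def classify(name: str) -> str:
    """Treat attention tensors as critical and everything else as bulk.

    This name-based rule is an assumed stand-in for a real sensitivity measure.
    """
    return "critical" if "attn" in name else "bulk"


def average_bits(tensors: dict[str, int]) -> float:
    """Parameter-weighted mean bit width over {tensor_name: param_count}."""
    total_params = sum(tensors.values())
    total_bits = sum(BITS_BY_TIER[classify(n)] * c for n, c in tensors.items())
    return total_bits / total_params


def footprint_gb(param_count: float, bits: float) -> float:
    """Weight storage in GB (10**9 bytes), ignoring scales and metadata."""
    return param_count * bits / 8 / 1e9


# Toy layer: attention is a small share of the parameters (made-up sizes).
toy = {
    "layers.0.attn.q_proj": 10_000_000,
    "layers.0.attn.k_proj": 10_000_000,
    "layers.0.mlp.experts.up": 90_000_000,
    "layers.0.mlp.experts.down": 90_000_000,
}

avg = average_bits(toy)
print(f"average bits per weight: {avg:.2f}")
print(f"122B parameters at that average: {footprint_gb(122e9, avg):.1f} GB")
```

For comparison, the figures quoted in the summary (122B parameters in 46 GB) work out to roughly 3 bits per weight on average, assuming GB means 10^9 bytes and the whole 46 GB is weight storage. That is consistent with a mix of 2-bit bulk tensors and higher-precision critical tensors, though the actual split JANG uses is not stated here.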