Show HN: We made our own inference engine for Apple Silicon

News Source: GitHub.com

News Summary

  • uzu uses its own model format.
  • To export a specific model, use lalamo: first get the list of supported models, then export the specific one (a hedged shell sketch follows this list). Alternatively, you can download a prepared model using the sample script.
  • You can run uzu in CLI mode (see the example below).
  • To use uzu as a library, first add the uzu dependency to your Cargo.toml, then create an inference Session with a specific model and configuration (see the Rust sketch below).
  • The repository lists performance metrics for various models; note that all performance comparisons were done using bf16/f16 precision.
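The export flow, sketched as shell commands. The lalamo subcommand names and the model identifier are assumptions made for illustration; the excerpt does not confirm them, so check the lalamo README for the real invocation:

```bash
# List the models lalamo can convert into uzu's own format
# (subcommand names are illustrative assumptions)
lalamo list-models

# Export a specific supported model (the identifier is a placeholder)
lalamo convert meta-llama/Llama-3.2-1B-Instruct
```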
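Running the engine in CLI mode might look like the following; the invocation assumes a standard Cargo binary layout and is not confirmed by the excerpt:

```bash
# Build and run uzu's CLI against a previously exported model
# (the argument is a placeholder path)
cargo run --release -- path/to/exported-model
```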
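For library use, the summary describes adding the dependency and then creating an inference Session. Everything below is a minimal sketch assuming a git dependency and plausible type names (Session, SessionConfig, run); none of these are confirmed by the excerpt, so consult the uzu README for the real API:

```toml
# Cargo.toml: repository URL assumed from the project's GitHub home
[dependencies]
uzu = { git = "https://github.com/trymirai/uzu" }
```

```rust
// Hypothetical sketch of the high-level API described in the summary.
// `Session`, `SessionConfig`, and `run` are assumed names, not the
// crate's confirmed API.
use uzu::{Session, SessionConfig};

fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Path to a model previously exported with lalamo
    let model_path = "path/to/exported-model";

    // Create an inference session with a specific model and configuration
    let config = SessionConfig::default();
    let mut session = Session::new(model_path, config)?;

    // Generate a completion for a prompt
    let output = session.run("Explain Apple Silicon in one sentence.")?;
    println!("{output}");
    Ok(())
}
```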
A high-performance inference engine for AI models on Apple Silicon. Key features: a simple, high-level API; a hybrid architecture, where layers can be computed as GPU kernels or via MPSGraph [+2192 chars]
