This is an 8-bit quantized version of Google's Gemma 3 1B instruction-tuned model, converted to MLX format for efficient inference on Apple Silicon and other supported hardware. It is based on the original google/gemma-3-1b-it and keeps that model's instruction-following behavior, while quantization reduces memory usage and can speed up inference.
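To make the memory claim concrete, here is a rough back-of-the-envelope sketch of weight storage at 16-bit versus 8-bit precision. It rests on assumptions not stated above: a nominal count of 1 billion parameters, group-wise quantization with a group size of 64, and one 16-bit scale plus one 16-bit bias stored per group. The function name `weight_size_gib` is made up for illustration. Real checkpoints will differ somewhat, since the true parameter count is not exactly 1B and some tensors may be left unquantized.

```python
def weight_size_gib(n_params, bits, group_size=None, overhead_bits=32):
    """Approximate weight storage in GiB.

    overhead_bits is the assumed per-group metadata (16-bit scale + 16-bit bias).
    It is only applied when group_size is given, i.e. for quantized weights.
    """
    bits_per_weight = bits
    if group_size:
        bits_per_weight += overhead_bits / group_size
    return n_params * bits_per_weight / 8 / 2**30


N_PARAMS = 1_000_000_000  # assumption: nominal "1B", not the exact count

size_16bit = weight_size_gib(N_PARAMS, 16)               # unquantized 16-bit weights
size_8bit = weight_size_gib(N_PARAMS, 8, group_size=64)  # 8-bit with assumed group size 64

print(f"16-bit weights: ~{size_16bit:.2f} GiB")
print(f" 8-bit weights: ~{size_8bit:.2f} GiB")
print(f"ratio: {size_8bit / size_16bit:.3f}")
```

Under these assumptions the quantized weights take a little over half the space of the 16-bit weights, because the per-group scale and bias add about 0.5 bits per weight on top of the 8-bit values.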