This is an 8-bit quantized version of Google's Gemma 3 1B instruction-tuned model, converted to MLX format for efficient inference on Apple Silicon and other supported hardware. It is based on the original google/gemma-3-1b-it and retains its instruction-following capabilities while reducing memory usage and speeding up inference through quantization.
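A minimal sketch of running an MLX-format model with the mlx-lm package (`pip install mlx-lm`). The repo id below is an assumption for illustration; substitute the actual MLX repository or local path for this model. Running it requires Apple Silicon and downloads the model weights.

```python
# Load and run an MLX-quantized Gemma model with mlx-lm.
from mlx_lm import load, generate

# Hypothetical repo id -- replace with the real MLX model path.
model, tokenizer = load("mlx-community/gemma-3-1b-it-8bit")

# Gemma is instruction-tuned, so wrap the request in its chat template.
messages = [{"role": "user", "content": "Explain quantization in one sentence."}]
prompt = tokenizer.apply_chat_template(messages, add_generation_prompt=True)

# Generate a response from the quantized model.
text = generate(model, tokenizer, prompt=prompt, max_tokens=128)
print(text)
```

Because the 8-bit weights are roughly a quarter the size of the original float32 parameters, the 1B model fits comfortably in memory on consumer Apple Silicon machines.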
Available local models on Mirai:

| Name                 | Quantisation | Size |
|----------------------|--------------|------|
| gemma-3-1b-it        | uint8        | 1B   |
| gemma-3-27b-it       | uint8        | 27B  |
| gemma-3-4b-it        | uint8        | 4B   |
| gemma-3-1b-it-4bit   | uint8        | 1B   |
| gemma-3-1b-it-8bit   | uint8        | 1B   |
| gemma-3-27b-it-4bit  | uint8        | 27B  |
| gemma-3-27b-it-8bit  | uint8        | 27B  |
| gemma-3-4b-it-4bit   | uint8        | 4B   |
| gemma-3-4b-it-8bit   | uint8        | 4B   |