This is an MLX-format conversion of Google's Gemma 3 27B instruction-tuned model, quantized to 8-bit precision for efficient inference on Apple Silicon. Gemma 3 is multimodal: it accepts both images and text as input and generates text output, so it can describe, analyze, and answer questions about visual content. The 8-bit weights need roughly half the memory of the original 16-bit checkpoint, making on-device inference with MLX, Apple's machine learning framework, practical on high-memory Macs.
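Below is a minimal sketch of loading and querying the model with the community `mlx-vlm` package (`pip install mlx-vlm`), which handles multimodal MLX models. The repo path `mlx-community/gemma-3-27b-it-8bit` and the sample image URL are assumptions for illustration; substitute the actual repo path of this conversion, and note that the exact `mlx-vlm` API can shift between versions.

```python
# Sketch, assuming the mlx-vlm package and an assumed repo path.
from mlx_vlm import load, generate
from mlx_vlm.prompt_utils import apply_chat_template
from mlx_vlm.utils import load_config

# Assumed repo name; replace with the actual path of this conversion.
model_path = "mlx-community/gemma-3-27b-it-8bit"
model, processor = load(model_path)
config = load_config(model_path)

# One image plus a text prompt; the chat template interleaves them
# in the format the instruction-tuned model expects.
image = ["http://images.cocodataset.org/val2017/000000039769.jpg"]
prompt = "Describe this image."
formatted_prompt = apply_chat_template(processor, config, prompt, num_images=len(image))

# Run inference and print the generated text response.
output = generate(model, processor, formatted_prompt, image, verbose=False)
print(output)
```

For text-only use, the same flow works with an empty image list; the quantized weights are loaded lazily, so the first generation call includes model load time.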