Fine-tune
LoRA on top of any OptIQ quant, MLX-native, no PyTorch.
Sensitivity-aware rank scaling (default: by_bits) is built in, so layers kept at higher precision get more adapter capacity. A live train-loss sparkline streams from mlx-lm's TrainingCallback. When you're done, save the adapter and push it to Hugging Face (HF).
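To make the idea of bit-based rank scaling concrete, here is a minimal sketch in plain Python. It is not OptIQ's actual formula: the function name `scale_ranks_by_bits`, the parameters `base_rank` and `ref_bits`, the linear scaling rule, and the layer names are all assumptions made for illustration. The only thing taken from the text above is the direction of the effect: a layer quantized at more bits gets a larger LoRA rank.

```python
def scale_ranks_by_bits(layer_bits, base_rank=8, ref_bits=4):
    """Return a per-layer LoRA rank that grows with the layer's bit width.

    Hypothetical rule (an assumption, not OptIQ's implementation):
        rank = base_rank * layer_bits / ref_bits, rounded, with a floor of 1.
    """
    if base_rank < 1 or ref_bits < 1:
        raise ValueError("base_rank and ref_bits must be positive")
    return {
        name: max(1, round(base_rank * bits / ref_bits))
        for name, bits in layer_bits.items()
    }


# Hypothetical mixed-precision layout: layer name -> quantization bits.
layer_bits = {
    "layers.0.self_attn.q_proj": 8,  # kept at higher precision
    "layers.0.mlp.up_proj": 4,       # at the reference precision
    "layers.1.mlp.down_proj": 3,     # compressed more aggressively
}

ranks = scale_ranks_by_bits(layer_bits)
for name, rank in ranks.items():
    print(f"{name}: {layer_bits[name]}-bit -> rank {rank}")
```

With these inputs the 8-bit layer gets twice the rank of the 4-bit layer, and the 3-bit layer gets less than either. A real scheme may use a different curve or clamp the range; see the guide referenced below for the method OptIQ actually uses.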

See the LoRA fine-tuning guide for the underlying method and the CLI equivalent.