Few-Shot Fine-Tuning of Large Language Models: When Data Is Scarce
Susannah Greenwood

I'm a technical writer and AI content strategist based in Asheville, where I translate complex machine learning research into clear, useful stories for product teams and curious readers. I also consult on responsible AI guidelines and produce a weekly newsletter on practical AI workflows.

7 Comments

  1. Ian Cassidy
    March 12, 2026 AT 10:58 PM

    LoRA and QLoRA are game-changers. I’ve used QLoRA on a 13B model with an RTX 4090 - no cloud needed. Just 24GB of VRAM and you’re fine-tuning like it’s 2020 again. The real win? No more begging your boss for AWS credits. One guy at my shop cut his dev cycle from 3 weeks to 3 days. That’s not progress - that’s liberation.

    Also, 50 examples? Yeah, but make sure they’re not garbage. I once trained on 60 messy labels and the model started calling every patient ‘John Doe’. Learned the hard way.
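A quick sanity check on the VRAM math behind Ian's setup: at 4-bit precision, the frozen base weights of a 13B model fit comfortably on a 24GB card. A rough sketch (the function name is ours, and the overhead note is illustrative, not exact):

```python
# Rough VRAM estimate for the quantized base weights in QLoRA fine-tuning.
# Real usage also depends on batch size, sequence length, LoRA adapters,
# optimizer states, and activation checkpointing - treat this as a floor.

def qlora_weight_memory_gb(n_params_billion: float, bits: int = 4) -> float:
    """Memory for the frozen base weights alone, in GB."""
    bytes_per_param = bits / 8
    return n_params_billion * 1e9 * bytes_per_param / 1e9

base = qlora_weight_memory_gb(13)          # 13B params at 4 bits
full = qlora_weight_memory_gb(13, bits=16) # same model in fp16
print(f"4-bit: {base:.1f} GB vs fp16: {full:.1f} GB")
# 6.5 GB of weights leaves real headroom on a 24 GB RTX 4090;
# the fp16 version alone would already overflow the card.
```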

  2. Zach Beggs
    March 14, 2026 AT 12:15 PM

    Honestly, I’m impressed this even works. I thought you needed thousands of examples to make a model behave. But seeing 99.4% accuracy on math tasks with QLoRA? Wild. Makes you wonder if we’ve been over-engineering this whole time.

  3. Kenny Stockman
    March 16, 2026 AT 11:27 AM

    Big fan of this stuff. I’ve been using this in our clinic for summarizing notes. We only had 75 labeled examples - and it’s now doing 83% accuracy on discharge summaries. The docs love it. No more staying late to type up charts.

    Key thing? Don’t rush the data. One bad example can throw the whole thing off. Took us two weeks just to pick the right 75 notes. But once we did? Magic. Also, keep your learning rate low. 5e-5 saved us from total collapse.

    And yeah, hallucinations happen. We added negative examples like ‘this is NOT a diagnosis’ and it dropped from 18% to 5%. Simple fix. Don’t overcomplicate it.
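The negative-example trick Kenny describes can be sketched as a couple of extra training records in prompt/completion form (field names and note text here are hypothetical, not from his clinic's data):

```python
# Hypothetical fine-tuning records. Mixing explicit negatives into the
# training set teaches the model what NOT to assert - the same idea
# Kenny used to cut hallucinated diagnoses.
records = [
    {
        "prompt": "Note: pt c/o chest pain, troponin negative, d/c home.",
        "completion": "Summary: chest pain evaluated, troponin negative, "
                      "discharged home.",
    },
    # Negative example: discourage inventing a diagnosis.
    {
        "prompt": "Note: pt c/o cough, no imaging or labs performed.",
        "completion": "Summary: cough reported. This is NOT a diagnosis; "
                      "no imaging or labs were performed.",
    },
]

# Basic hygiene check before training - one bad record can poison the run.
for r in records:
    assert r["prompt"].strip() and r["completion"].strip()
print(f"{len(records)} records, including explicit negatives")
```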

  4. Antonio Hunter
    March 17, 2026 AT 11:56 AM

    It’s fascinating how the field has evolved from brute-force fine-tuning to these surgical interventions on model weights. The shift from updating billions of parameters to manipulating only a few thousand - through low-rank decomposition - represents not merely an optimization but a philosophical reorientation in how we approach machine learning. We are no longer training models; we are gently nudging them. This is akin to teaching a pianist a new piece not by retraining their entire nervous system, but by adding a subtle finger extension device.

    The elegance of this approach is profound. And QLoRA, by compressing the weight representations into 4-bit space, is not just an engineering feat - it’s a democratization of computational access. Suddenly, a single engineer with a consumer GPU can do what once required a data center.

    The implications for small clinics, regional law firms, and academic labs are not merely economic - they are epistemological. We are witnessing the decentralization of AI expertise, and it is happening faster than most policymakers realize.
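The low-rank arithmetic behind that "billions down to a few thousand" claim can be made concrete. LoRA replaces a full d×k weight update with two thin matrices B (d×r) and A (r×k), so only r·(d+k) parameters train instead of d·k. A quick sketch (dimensions are illustrative; the 4096×4096 layer is a typical attention projection size, not a specific model):

```python
# LoRA parameter accounting for a single weight matrix:
# full update:  delta_W is d x k            -> d * k params
# LoRA update:  delta_W = B @ A, rank r     -> r * (d + k) params

def lora_param_counts(d: int, k: int, r: int) -> tuple[int, int]:
    full = d * k          # trainable params in a full fine-tune of the layer
    lora = r * (d + k)    # trainable params in the rank-r adapter
    return full, lora

full, lora = lora_param_counts(d=4096, k=4096, r=8)
print(f"full: {full:,}  lora: {lora:,}  reduction: {full // lora}x")
# At rank 8, the adapter is 256x smaller than the full update for this layer.
```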

  5. Paritosh Bhagat
    March 18, 2026 AT 07:41 AM

    LOL you all act like this is some breakthrough. I’ve been doing this since 2021. You think you’re the first to use LoRA? Nah. Also, ‘50 examples’? That’s laughable. If your data isn’t clean, you’re just teaching the model to lie. And don’t get me started on QLoRA - 4-bit quantization? That’s just throwing away precision like it’s trash. You think you’re saving money? You’re just making models dumber and more prone to hallucinations. I’ve seen it. My cousin works at a hospital and their ‘AI assistant’ started diagnosing pneumonia from coughs. It was wrong 80% of the time. And they blamed ‘training data’. No. You used garbage. And now you’re all patting yourselves on the back like you invented fire. Wake up.

  6. Ben De Keersmaecker
    March 19, 2026 AT 08:06 AM

    Minor correction: the Hugging Face Transformers v4.38 release was in February 2024, not 2026. Also, the 63% failure rate from bad learning rates comes from a 2024 arXiv paper by Wu et al., not general consensus. And while QLoRA does reduce memory use, the 80–90% figure is only true for 70B+ models - for 13B, it’s more like 65%. Precision matters. Also, the RTX 4090 technically is a consumer-grade card, but at flagship pricing it’s not what most people mean by ‘gaming GPU’ - that implies something like an RTX 3060. Just saying. I’m not trying to be pedantic - I just want the record straight. This stuff is too important to get sloppy.

  7. Aaron Elliott
    March 20, 2026 AT 12:41 PM

    Let me ask you this: if we can achieve 99.4% accuracy with 50 examples, then why do we need models at all? Why not just write a rule-based system? After all, 50 examples is not learning - it’s memorization. And if the model is merely memorizing a pattern, then it is not intelligent. It is a parrot with a spreadsheet. The entire premise of fine-tuning is predicated on the illusion of generalization. But when the training set is so small, generalization becomes statistically impossible. We are not building AI. We are building a very sophisticated lookup table. And we are calling it ‘progress’. This is not innovation. This is a theological surrender to the cult of data efficiency. We have abandoned the pursuit of understanding and replaced it with a ritual of parameter manipulation. What have we become?
