Smorty [she/her]@lemmy.blahaj.zone to Free Open-Source Artificial Intelligence@lemmy.world · English · 21 days ago

What should I use: big model-small quant or small model-no quant?

For about half a year I stuck with 7B models at a strong 4-bit quantisation, because I had very bad experiences with an old qwen 0.5B model.

But recently I tried running _smaller_ models, like llama3.2 3B with an 8-bit quant and qwen2.5-1.5B-coder at full 16-bit floating point (no quantisation), and those performed really well too on my 6GB VRAM GPU (GTX 1060).

So now I am wondering: should I pull strong (low-bit) quants of big models, or light quants/raw 16-bit fp versions of smaller models?
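
A rough way to compare these options is to estimate the weight size as parameters × bits per weight / 8. Below is a minimal sketch, not a precise sizing tool. It assumes nominal parameter counts (7B, 3B, 1.5B) and nominal bit widths, and it ignores the KV cache, runtime overhead, and the extra bits real quant formats store for scales, so actual model files will be somewhat larger. The function name and the candidate list are made up for illustration.

```python
def weight_memory_gib(params_billions: float, bits_per_weight: float) -> float:
    """Approximate size of the model weights alone, in GiB.

    Assumption: every weight is stored at exactly `bits_per_weight` bits,
    which understates real quantised files.
    """
    total_bytes = params_billions * 1e9 * bits_per_weight / 8
    return total_bytes / 2**30


VRAM_GIB = 6.0  # the 6GB card mentioned in the post

# (label, nominal parameters in billions, nominal bits per weight)
candidates = [
    ("7B at 4-bit", 7.0, 4),
    ("3B at 8-bit", 3.0, 8),
    ("1.5B at 16-bit", 1.5, 16),
]

for label, params, bits in candidates:
    size = weight_memory_gib(params, bits)
    verdict = "fits" if size < VRAM_GIB else "does not fit"
    print(f"{label}: ~{size:.2f} GiB of weights ({verdict} in {VRAM_GIB:.0f} GiB)")
```

Under these assumptions all three options land at roughly 3 GiB of weights, so on a 6GB card the choice is less about what fits and more about which combination gives better answers.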

What are your experiences with strong quants? I saw a video by that technovangelist guy on YouTube, and he said that sometimes even 2-bit quants can be perfectly fine.

UPDATE: Woah, I just tried llama3.1 8B Q4 on ollama again, and what a WORLD of difference compared to llama3.2 3B at fp16!

The difference is super massive. The 3B and 1B llama3.2 models seem to be mostly good at summarizing text and maybe generating some JSON based on previous input. But the bigger 3.1 8B model can actually be used in a chat environment! It has a good response length (about 3 lines per message) and it doesn’t stretch out its answer. It seems like a really good model and I will now use it for more complex tasks.

Chat

Smorty [she/her]@lemmy.blahaj.zone (OP) · 28 days ago
Pulled a 7B Q4 model just now and woah, yeah, they really are a lot better. I guess the smaller models really are just for devices with less than 1 GB of RAM to spare… Like ma phone, which runs Llama3.2 3B just fine…
