Discovering Locally Run Language Models: Share Your Favorites/Not So Favorites!

dtlnx@beehaw.org · edited 2 years ago

planish@sh.itjust.works · edited 2 years ago

What do you even run a 65b model on?

Kerfuffle@sh.itjust.works · 2 years ago

With a quantized GGML version you can just run it on CPU if you have 64GB RAM. It is fairly slow though: I get about 800ms/token on a 5900X. Basically you start it generating something and come back in 30 minutes or so. Can’t really carry on a conversation.
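
For a rough sense of those numbers, here is a back-of-the-envelope sketch. The helper names are made up for illustration, and the 4 bits per weight is an assumption (the comment above doesn't say which quantization level was used); real GGML files also carry some overhead on top of the raw weights, so treat the size as a lower bound.

```python
# Rough arithmetic only; function names are hypothetical helpers.

def tokens_generated(ms_per_token: float, minutes: float) -> int:
    """How many tokens come out in `minutes` at a given per-token latency."""
    return int(minutes * 60 * 1000 / ms_per_token)


def quantized_size_gb(n_params: float, bits_per_weight: float) -> float:
    """Approximate weight size in GB (10^9 bytes), ignoring file overhead."""
    return n_params * bits_per_weight / 8 / 1e9


# 800 ms/token for 30 minutes, as reported above.
print(tokens_generated(800, 30))

# 65B parameters at an ASSUMED 4 bits per weight.
print(quantized_size_gb(65e9, 4))
```

So a 30 minute wait works out to a couple of thousand tokens, and under the 4-bit assumption the weights alone would take roughly half of a 64GB machine's RAM.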

planish@sh.itjust.works · 2 years ago

Is it smart enough to pick up the thread of what you're looking for without as much rerolling or handholding, so that it comes out better overall?