• AdrianTheFrog@lemmy.world
    5 months ago

    Yes, but 200 GB is probably already with 4-bit quantization; the weights in fp16 would be more like 800 GB. IDK if it's even possible to quantize further. If it is, you're probably better off going with a smaller model anyway.
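For anyone who wants to check the math, here's a rough back-of-the-envelope sketch. It assumes the 200 GB figure is exactly 4 bits per weight and ignores the extra overhead real quantized formats carry (scales, zero points, layers kept at higher precision), so treat the numbers as ballpark only. The helper name `weights_size_gb` is just something made up for illustration.

```python
def weights_size_gb(n_params: float, bits_per_weight: float) -> float:
    """Approximate weight storage in GB (1 GB = 1e9 bytes), weights only."""
    return n_params * bits_per_weight / 8 / 1e9


# Assumption: the 200 GB download is a pure 4-bit quantization.
# Work backwards to an implied parameter count.
quantized_gb = 200
n_params = quantized_gb * 1e9 * 8 / 4

# Same parameter count at a few different precisions.
for label, bits in [("fp16", 16), ("8-bit", 8), ("4-bit", 4)]:
    print(f"{label}: ~{weights_size_gb(n_params, bits):.0f} GB")
```

Going from 4 bits to 16 bits per weight is a 4x increase, which is where the ~800 GB estimate comes from.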