• 1 Post
  • 31 Comments
Joined 2 years ago
cake
Cake day: July 1st, 2023

help-circle



  • Ok, turned out to be as simple to run as downloading llama.cpp binaries, gguf of gemma3 and an mmproj file and running it all like this

    ./llama-server -m ~/LLM-models/gemma-3-4b-it-qat-IQ4_NL.gguf --mmproj ~/LLM-models/gemma-3-4b-it-qat-mmproj-F16.gguf --port 5002
    

    (Could be even easier if I’d let it download weights itself, and just used -hf option instead of -m and —mmproj).

    And now I can use it from my browser at localhost:5002, llama.cpp already provides an interface there that supports images!

    Tested high resolution images and it seems to either downscale or cut them into chunks or both, but the main thing is that 20 megapixels photos work fine, even on my laptop with no gpu, they just take a couple of minutes to get processed. And while 4b model is not very smart (especially quantized), it could still read and translate text for me.

    Need to test more with other models but just wanted to leave this here already in case someone stumbles upon this question and wants to do it themselves. It turned out to be much more accessible than expected.















  • Github is a platform to upload your code to using git (a source code versioning system that allows you to store different versions of your program code, history of all changes to it, etc., and to collaborate with other people to work on the same project with each person working on their own part and then merging the changes together). You can check other people’s projects, upload yours, leave comments, create issue reports, copy others’ work and make a “fork” of the software, and much more. Among other things you can download the latest releases of the software provided by the developers, usually installation instructions are provided on the project page, and the latest releases can be found under the Releases tab.