Some more details: Ollama runs on a local multi-GPU machine (Ollama distributes the workload nicely across multiple GPUs). On that machine, a proxy listens for chat coming from an LSL script (relay.lsl), talks to the Ollama API (localhost:11434), and bounces the answer back to LSL.
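For the curious, here is a minimal sketch of what such a proxy can look like, assuming relay.lsl POSTs each chat line as plain text via llHTTPRequest and reads the model's reply from the HTTP response body. The port, system prompt, and plain-text framing are my illustrative choices, not necessarily what the actual proxy does:

```python
import json
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

OLLAMA_URL = "http://localhost:11434/api/chat"
MODEL = "gemma3:27b"
SYSTEM = "You are an in-world character; stay in role and keep replies short."

class RelayHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        # The body is the raw chat line relayed by the LSL script.
        length = int(self.headers.get("Content-Length", 0))
        chat = self.rfile.read(length).decode("utf-8")

        payload = json.dumps({
            "model": MODEL,
            "stream": False,  # LSL wants one complete answer, not a stream
            "messages": [
                {"role": "system", "content": SYSTEM},
                {"role": "user", "content": chat},
            ],
        }).encode("utf-8")

        req = urllib.request.Request(
            OLLAMA_URL, data=payload,
            headers={"Content-Type": "application/json"},
        )
        with urllib.request.urlopen(req) as resp:
            answer = json.load(resp)["message"]["content"]

        # Bounce the answer back to the waiting llHTTPRequest.
        body = answer.encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

HTTPServer(("0.0.0.0", 8080), RelayHandler).serve_forever()
```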
It supports both api/generate and api/chat, with tool capability. We use gemma3:27b at the moment. The round trip is around 5 seconds for short answers. The system card emphasizes roleplaying and group chatting. Gemma may turn temperamental when bullied, which makes it a good comedian.
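To give an idea of the tool side: Ollama's /api/chat accepts OpenAI-style function schemas in a "tools" field, for models that support tool calls. The tool name and parameters below are made up for illustration, not something the actual setup exposes:

```python
# Illustrative only: the shape of an /api/chat request with a tool attached.
payload = {
    "model": "gemma3:27b",
    "stream": False,
    "messages": [{"role": "user", "content": "What time is it on the grid?"}],
    "tools": [{
        "type": "function",
        "function": {
            "name": "get_region_time",  # hypothetical tool
            "description": "Return the current in-world time for a region.",
            "parameters": {
                "type": "object",
                "properties": {
                    "region": {"type": "string", "description": "Region name"},
                },
                "required": ["region"],
            },
        },
    }],
}
# If the model decides to call the tool, the reply carries
# response["message"]["tool_calls"] instead of plain content.
```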
And yes, you can talk to it if you visit atlasgrid.fr:8002:Atlas while the experiment is running, which is most of the time.