Ponkotsu Lab

Ponkotsu LLM

● Live

A lightweight language model running on a Raspberry Pi. It's no good at hard problems, but for small talk and a bit of writing help, it does its honest best. Responses stream back token by token.

Endpoint: POST /api/v1/chat
I/O: text->text · streaming
Auth: None (open to all)

How to use

Send text, get text back. Responses stream one token at a time over SSE (Server-Sent Events).

Request

curl -N https://ponkotsu-lab.net/api/v1/chat \
  -H "Content-Type: application/json" \
  -d '{"message": "Hello"}'

Field       Type    Required  Description
message     string  yes       Input text for the model
max_tokens  number  no        Max tokens to generate (default: 256)
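
For callers who prefer a script over curl, the request fields can be assembled as in the Python sketch below, which uses only the standard library. The helper name `build_request` is hypothetical and not part of the service; the URL is copied from the curl example, and the idea that `max_tokens` may be left out rests on its documented default.

```python
import json
import urllib.request

API_URL = "https://ponkotsu-lab.net/api/v1/chat"  # taken from the curl example


def build_request(message, max_tokens=None):
    """Build (but do not send) a POST request for the chat endpoint.

    build_request is a hypothetical helper name used for illustration.
    """
    body = {"message": message}
    if max_tokens is not None:
        # Assumption: omitting max_tokens lets the server use its default (256).
        body["max_tokens"] = max_tokens
    return urllib.request.Request(
        API_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

Passing the returned object to `urllib.request.urlopen` would send it; building and sending are kept apart here so the body can be inspected first.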

Response (streaming)

data: {"delta": "Hel"}
data: {"delta": "lo"}
data: {"done": true}
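
To turn a stream like the one above back into text, a client can read it line by line, collect each `delta`, and stop at `done`. The Python sketch below shows one way to do that. The function name `iter_deltas` is hypothetical, and the sketch assumes every event fits on a single `data:` line, as in the sample.

```python
import json


def iter_deltas(lines):
    """Yield text fragments from an iterable of SSE lines (str or bytes).

    Assumption: each event is one `data:` line holding a JSON object,
    matching the sample response; multi-line events are not handled.
    """
    for raw in lines:
        line = raw.decode("utf-8") if isinstance(raw, bytes) else raw
        line = line.strip()
        if not line.startswith("data:"):
            continue  # skip blank separators and any other SSE fields
        event = json.loads(line[len("data:"):].strip())
        if event.get("done"):
            return  # end-of-stream marker
        if "delta" in event:
            yield event["delta"]


sample = [
    'data: {"delta": "Hel"}',
    'data: {"delta": "lo"}',
    'data: {"done": true}',
]
print("".join(iter_deltas(sample)))  # prints "Hello"
```

A response object returned by `urllib.request.urlopen` can be iterated line by line as bytes, so it can be passed to `iter_deltas` directly and each fragment printed as it arrives.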

Limitations (the ponkotsu bits)

  • The model is underpowered, so long text and complex reasoning are not its strengths.
  • Under load you may be rate-limited and put in a queue.
  • No auth required, but there is a per-IP usage cap.
  • Runs on a lightweight model (powered by Ollama / gemma).
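
Given the rate limiting and per-IP cap noted above, a polite client may want to wait and retry rather than hammer the endpoint. The Python sketch below shows one possible exponential backoff loop. Everything in it is an assumption: the page does not say how rate limiting is signalled, so HTTP 429 is a guess, and `call_with_backoff` and `send` are hypothetical names.

```python
import time


def call_with_backoff(send, max_attempts=4, base_delay=1.0, sleep=time.sleep):
    """Call send() until it returns a status other than 429.

    Assumptions: send is a caller-supplied function returning (status, body),
    and the service answers rate-limited requests with HTTP 429. Neither is
    documented on this page.
    """
    for attempt in range(max_attempts):
        status, body = send()
        if status != 429:
            return status, body
        if attempt < max_attempts - 1:
            # Wait base_delay, then twice that, then four times that, and so on.
            sleep(base_delay * 2 ** attempt)
    return status, body  # still rate-limited after the last attempt
```

The `sleep` parameter is injectable so the loop can be exercised without real delays.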