Ponkotsu LLM
● Live
A lightweight language model running on a Raspberry Pi. It's no good at hard problems, but for small talk and a bit of writing help, it does its honest best. Responses stream back token by token.
- Endpoint: POST /api/v1/chat
- I/O: text->text · streaming
- Auth: None (open to all)
How to use
Send text, get text back. Responses stream one token at a time over SSE (Server-Sent Events).
Request
curl -N https://ponkotsu-lab.net/api/v1/chat \
-H "Content-Type: application/json" \
-d '{"message": "Hello"}'
| Field | Type | Required | Description |
|---|---|---|---|
| message | string | ✔ | Input text for the model |
| max_tokens | number | | Max tokens to generate (default: 256) |
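The same request body can be built from Python. This is a minimal sketch using only the standard library; the function name `build_chat_request` and the non-empty check on `message` are assumptions for illustration, not part of the service. It builds the request but does not send it.

```python
import json
import urllib.request

# Endpoint taken from the curl example above.
API_URL = "https://ponkotsu-lab.net/api/v1/chat"


def build_chat_request(message, max_tokens=None):
    """Build (but do not send) a POST request for the chat endpoint.

    build_chat_request is a hypothetical helper name.
    """
    # Assumption: an empty message is rejected client-side; the page only
    # says the field is required.
    if not isinstance(message, str) or not message:
        raise ValueError("message is required and must be a non-empty string")
    payload = {"message": message}
    # max_tokens is optional; when omitted the server default (256) applies.
    if max_tokens is not None:
        payload["max_tokens"] = int(max_tokens)
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

Passing the returned object to `urllib.request.urlopen` would send it; leaving `max_tokens` out keeps the body identical to the curl example.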
Response (streaming)
data: {"delta": "Hel"}
data: {"delta": "lo"}
data: {"done": true}
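A client has to join the `delta` pieces itself and stop at the `done` event. The sketch below shows one way to do that, assuming every event arrives as a single `data:` line carrying JSON, as in the sample above; the function name `iter_deltas` is hypothetical.

```python
import json


def iter_deltas(lines):
    """Yield each "delta" string from SSE lines until a {"done": true} event."""
    for raw in lines:
        # Accept both bytes (from a socket) and str (from a test fixture).
        line = raw.decode("utf-8") if isinstance(raw, bytes) else raw
        line = line.strip()
        if not line.startswith("data:"):
            continue  # skip blank separators, comments, and other SSE fields
        event = json.loads(line[len("data:"):].strip())
        if event.get("done"):
            return
        if "delta" in event:
            yield event["delta"]


sample = [
    'data: {"delta": "Hel"}',
    "",
    'data: {"delta": "lo"}',
    "",
    'data: {"done": true}',
]
print("".join(iter_deltas(sample)))  # prints "Hello"
```

Because the response object returned by `urllib.request.urlopen` can be iterated line by line, it can be passed to `iter_deltas` directly to print tokens as they arrive.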
Limitations (the ponkotsu bits)
- It is underpowered, so long text and complex reasoning are not its strengths.
- Under load you may be rate-limited and put in a queue.
- No auth required, but there is a per-IP usage cap.
- Runs on a lightweight model (powered by Ollama / gemma).
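Given the rate limiting and per-IP cap, a polite client may want to wait and retry rather than hammer the Pi. The sketch below is one possible approach, not documented behavior: it assumes the service answers with HTTP 429 when a limit is hit, which this page does not state. `call_with_backoff` and its `send` callable are hypothetical names.

```python
import time


def call_with_backoff(send, max_attempts=4, base_delay=1.0, sleep=time.sleep):
    """Call send() until it returns a status other than 429, doubling the wait.

    send is a hypothetical zero-argument callable returning (status, body).
    """
    for attempt in range(max_attempts):
        status, body = send()
        # Assumption: 429 signals the rate limit or per-IP cap.
        if status != 429:
            return status, body
        # Wait 1s, 2s, 4s, ... but do not sleep after the final attempt.
        if attempt < max_attempts - 1:
            sleep(base_delay * 2 ** attempt)
    return status, body
```

The `sleep` parameter exists only so the waiting can be swapped out in tests; in normal use the default `time.sleep` applies.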