Jan-nano-128k: A 4B Model with a Super-Long Context Window


Jan-nano-128k is a model fine-tuned so that its performance improves when YaRN scaling is enabled (instead of degrading).
It requires an inference engine that supports YaRN scaling.
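
For anyone setting this up by hand, here is a rough sketch of how the YaRN scaling factor relates to the context length. The native length of 32768 tokens and the key names (taken from the Hugging Face style `rope_scaling` config entry) are my assumptions, not something stated in this post, and `yarn_rope_scaling` is just a hypothetical helper. Check your engine's docs for the actual option names.

```python
def yarn_rope_scaling(target_ctx: int, native_ctx: int) -> dict:
    """Build a YaRN rope-scaling entry for a target context length.

    Hypothetical helper: the key names follow the Hugging Face style
    `rope_scaling` config entry, which is an assumption here.
    """
    if native_ctx <= 0:
        raise ValueError("native_ctx must be positive")
    if target_ctx <= native_ctx:
        raise ValueError("YaRN scaling is only needed when target_ctx exceeds native_ctx")
    # The scaling factor is the ratio of the extended window to the native window.
    factor = target_ctx / native_ctx
    return {
        "rope_type": "yarn",
        "factor": factor,
        "original_max_position_embeddings": native_ctx,
    }


# Assumed values: 32768 native tokens, 131072 (128k) target tokens.
config = yarn_rope_scaling(131072, 32768)
print(config)
```

With those assumed numbers the factor works out to 4.0, which is the value you would hand to the engine's YaRN setting.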

It can use tools continuously and repeatedly.

It can perform deep research.

It is extremely persistent.

The GGUF files can be found at: https://huggingface.co/Menlo/Jan-nano-128k-gguf
