Software firm JetBrains has open-sourced “Mellum”, its home-grown LLM for code completion, under a permissive Apache 2.0 licence. 

The small, specialist model joins the staggering and eclectic 1.6 million models now available via the Hugging Face platform.

It has been trained for code completion tasks in C, C#, C++, CSS, Go, HTML, Java, JavaScript, Kotlin, PHP, Python, Ruby, Rust, and TypeScript.

The privately held company, which provides a range of Integrated Development Environments (IDEs) to over 11 million users, said the model was “trained from scratch to power cloud-based code completion in JetBrains IDEs” rather than being a fine-tuned version of another LLM.

It was trained on over four trillion tokens across multiple programming languages and has a context window of 8,192 tokens.

“The model follows a LLaMA-style architecture with 4 billion parameters, making it efficient for both cloud inference (e.g., via vLLM) and local deployment (e.g., using llama.cpp or Ollama),” JetBrains said today.
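For those who do want to experiment locally, the pieces JetBrains names (a LLaMA-style causal model, llama.cpp, Ollama, vLLM) map onto standard tooling. The sketch below shows one way raw code completion could be run with Hugging Face’s transformers library; the repository ID JetBrains/Mellum-4b-base and the generation settings are illustrative assumptions, not details taken from JetBrains’ announcement.

```python
# Minimal sketch: local code completion with Mellum via Hugging Face transformers.
# The repo ID "JetBrains/Mellum-4b-base" is an assumption based on the model name;
# check the Hugging Face hub for the exact identifier before running.
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "JetBrains/Mellum-4b-base"  # assumed repository ID

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForCausalLM.from_pretrained(MODEL_ID, device_map="auto")

# Mellum is a base completion model: feed it a code prefix and let it continue.
prefix = "def fibonacci(n: int) -> int:\n    "
inputs = tokenizer(prefix, return_tensors="pt").to(model.device)

outputs = model.generate(
    **inputs,
    max_new_tokens=64,   # keep completions short, as an IDE would
    do_sample=False,     # greedy decoding for deterministic completions
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

For the cloud-inference route the company alludes to, the same checkpoint could in principle be served behind an OpenAI-compatible endpoint with vLLM, though exact serving configuration will depend on the deployment.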

"Let's be real..."

“Let’s be real – the average developer probably won’t fine-tune or deploy Mellum,” the company added. “That’s okay. Instead, [it is] meant for:

  1. “AI/ML researchers: Especially those exploring AI’s role in software development, benchmarking, or model interpretability.
  2. “AI/ML engineers and educators: As a foundation for learning how to build, fine-tune, and adapt domain-specific language models, or to support educational programs focused on LLM architecture and specialization.

“Mellum isn’t a plug-and-play solution. By releasing it… we are offering researchers, educators, and advanced teams the opportunity to explore how a purpose-built model works under the hood.”

While there is no shortage of large companies wedded to close partnerships with the likes of OpenAI or Google and still exploring the capabilities of their models, others are looking to choose specific models for specific purposes: essentially farming workloads out to the best large or small model for the job, across multiple modalities, including via firms like Fireworks.ai.

Will Mellum have a place here? Perhaps. And, as JetBrains notes, as an open model for researchers to dismantle, prod, and poke to deepen their understanding of how LLMs work, it may prove invaluable; it has also been trained on some uniquely useful data from the IDE provider.
