← Retour au blog
tech 11 May 2026

AMÁLIA and the Future of European Portuguese LLMs

AMÁLIA, an ambitious project funded by the Portuguese government, aims to transform natural language processing in European Portuguese. Explore the challenges and opportunities of this open-source initiative.

Article inspired by the original source
AMÁLIA and the future of European Portuguese LLMs ↗ duarteocarmo.com

Introduction

In December 2024, the Portuguese government made waves by announcing a €5.5 million investment in AMÁLIA: a large-scale Language Model (LLM) for European Portuguese. This project could redefine the future of natural language processing (NLP) for this language. So, what is AMÁLIA, and why is it such a crucial project?

What is AMÁLIA?

AMÁLIA is a language model specifically designed to treat European Portuguese as a first-class citizen in the LLM space. The project emerged from a collaboration between several top-tier Portuguese universities and research labs, including NOVA, IST, IT, and FCT.

Contrary to expectations, AMÁLIA was not trained from scratch. It is a continuation of the pre-training phase of the EuroLLM model, with slight modifications in context length and RoPE scaling.

Focus on European Portuguese

The core of the AMÁLIA project lies in increasing the share of European Portuguese data at every training stage. During pre-training, Arquivo.pt data was used. During the supervised fine-tuning (SFT), Portuguese data was synthetically generated, and during preference training, some data from the SFT phase was subsampled.

Measuring Effectiveness

To assess the model's quality, the team created four new benchmarks specific to European Portuguese, the most notable being ALBA. These benchmarks are crucial for understanding whether the model can meet the expectations and specific needs of Lusophone users.

Open Source: Limited Transparency

While the goal is to have an open-source model, AMÁLIA is not yet fully open. To date, the model weights, data, training logs, and new benchmarks are not entirely publicly accessible. This stands in stark contrast to Olmo, a benchmark for openness.

Potential Impact and Challenges

The success of AMÁLIA could have significant implications for research and technological development in European Portuguese, offering more precise and tailored NLP solutions. However, the project must overcome challenges in openness and collaboration to maximize its impact.

Conclusion

AMÁLIA represents a significant advance for European Portuguese in the world of LLMs. Navigating between technical challenges and aspirations for openness, this project could redefine the standards of natural language processing for minority languages. Let's discuss your project in 15 minutes.

AMÁLIA European Portuguese LLM Open Source Natural Language Processing
Deepthix newsletter · 100% AI · every Monday 8am

An AI agent reads tech for you.

Our AI agent scans ~200 sources per week and ships the best articles to your inbox Monday 8am. Free. One click to unsubscribe.

Visit the newsletter page →

Want to automate your operations?

Let's talk about your project in 15 minutes.

Book a call