AI Act
Article 15
Article 15 of Regulation (EU) 2024/1689 (the AI Act) sets out, for high-risk AI systems, obligations of accuracy, robustness, and cybersecurity throughout the lifecycle, and operationalizes the concept of 'robustness' in paragraph 5 by reference to specific attacks: data poisoning (manipulation of the training dataset), model poisoning (manipulation of pre-trained components used in training), adversarial examples or model evasion (inputs designed to make the model err), confidentiality attacks, and flaws in the model itself. The definition is not abstract but operational: it anchors robustness in the system's capacity to resist known adversarial inputs.
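To make the attack categories of paragraph 5 concrete, the sketch below shows the simplest form of 'model evasion': a Fast Gradient Sign Method (FGSM) perturbation that shifts an input just far enough to push a classifier toward error. The model, data, and epsilon are toy placeholders chosen for illustration; Article 15 does not prescribe FGSM or any other particular technique.

```python
# Illustrative sketch of a "model evasion" input in the sense of Article 15(5):
# an FGSM perturbation that nudges an input in the direction that increases the
# model's loss. Model and data are toy placeholders, not a real high-risk system.
import torch
import torch.nn as nn

torch.manual_seed(0)

model = nn.Sequential(nn.Linear(4, 8), nn.ReLU(), nn.Linear(8, 2))  # toy classifier
x = torch.randn(1, 4, requires_grad=True)      # benign input
y = torch.tensor([0])                          # its (assumed) correct label
epsilon = 0.25                                 # perturbation budget

loss = nn.functional.cross_entropy(model(x), y)
loss.backward()

# FGSM: step in the direction that maximally increases the loss.
# On a trained model, a small epsilon is often enough to flip the prediction.
x_adv = (x + epsilon * x.grad.sign()).detach()

print("clean prediction:      ", model(x).argmax(dim=1).item())
print("adversarial prediction:", model(x_adv).argmax(dim=1).item())
```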
The article's relevance to this blog's thesis is direct. The AI Act builds the compliance regime for high-risk systems around adversarial evaluation: subjecting the model to inputs designed to make it fail, documenting the results, and using that documentation as evidence that the provider has met its due diligence obligations. The analytical weight of Article 15 depends entirely on what this evaluation can, or cannot, see. When empirical research shows that adversarial training can teach a model to better hide a behaviour rather than remove it, the article's epistemological presumption (that if an adversarial test does not detect the behaviour, the behaviour is not there) is called into question.
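As a rough illustration of that evidentiary loop at its most minimal: evaluate the model on clean and adversarially perturbed inputs, and record the result as a report the provider could retain. Everything in the sketch below (model, data, attack, report format) is an assumed placeholder; neither Article 15 nor the Act prescribes any of these choices.

```python
# Minimal sketch of adversarial evaluation as documented evidence: measure
# accuracy on clean and FGSM-perturbed inputs and write down the outcome.
# All names and values here are illustrative assumptions.
import json
import torch
import torch.nn as nn

def fgsm(model, x, y, epsilon):
    """Return FGSM-perturbed copies of x (same attack as in the sketch above)."""
    x = x.clone().requires_grad_(True)
    loss = nn.functional.cross_entropy(model(x), y)
    loss.backward()
    return (x + epsilon * x.grad.sign()).detach()

torch.manual_seed(0)
model = nn.Sequential(nn.Linear(4, 8), nn.ReLU(), nn.Linear(8, 2))  # toy classifier
x, y = torch.randn(64, 4), torch.randint(0, 2, (64,))               # toy eval set

clean_acc = (model(x).argmax(1) == y).float().mean().item()
adv_acc = (model(fgsm(model, x, y, 0.25)).argmax(1) == y).float().mean().item()

# The recorded report, not the raw numbers alone, is what would serve as evidence.
report = {"attack": "FGSM", "epsilon": 0.25,
          "clean_accuracy": clean_acc, "adversarial_accuracy": adv_acc}
print(json.dumps(report, indent=2))
```

What such an evaluation cannot show, by construction, is a behaviour that survives the attack undetected; that is precisely the gap the following paragraphs address.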
The article applies in full from 2 August 2026, subject to the deferral proposed in the Commission's Digital Omnibus on AI of November 2025. Sleeper Agents treats these implications in depth, arguing that three converging lines of empirical research from 2024-2025 squarely challenge that presumption. Emergent Goals addresses the handover from the product liability regime to Article 15, where the defect emerges from training rather than from specification.