EU AI Act checker exposes Big Tech compliance gaps

A new AI compliance tool developed by LatticeFlow AI has revealed that major models from Meta, OpenAI, and Alibaba may fall short of the European Union’s evolving AI Act standards. Early tests show that these models could face challenges in areas like cybersecurity and discriminatory output, with scores highlighting significant regulatory gaps.

Marina Mouka
October 16, 2024
4 minutes

A newly developed AI compliance tool has revealed that some of the leading artificial intelligence models created by Big Tech companies may struggle to meet the European Union’s upcoming regulatory standards. As the EU’s AI Act continues to evolve, early tests show that prominent AI models, including those from Meta, OpenAI, and Alibaba, are at risk of non-compliance in critical areas such as cybersecurity and discriminatory output.

The compliance checker, created by Swiss startup LatticeFlow AI alongside researchers from ETH Zurich and INSAIT in Bulgaria, is designed to evaluate AI models in line with the EU AI Act, which will gradually come into force over the next two years. The tool, known as the “Large Language Model (LLM) Checker,” assesses models across a range of categories including technical robustness, safety, and cybersecurity resilience, assigning scores between 0 and 1. Scores of less than 0.75 signal potential weaknesses in specific regulatory areas, which companies will need to address to avoid significant financial penalties.

A mixed bag of results for AI leaders

The LLM Checker’s results, published by LatticeFlow, showed that while some models scored well overall, notable deficiencies were identified in key areas. For instance, OpenAI’s widely used “GPT-3.5 Turbo” model received a concerning score of 0.46 for its performance in preventing discriminatory output—an issue that reflects ongoing challenges in mitigating bias within AI systems. Alibaba’s “Qwen1.5 72B Chat” model fared even worse, with a score of 0.37 in the same category.

Cybersecurity vulnerabilities were also flagged, with Meta’s “Llama 2 13B Chat” model receiving a score of just 0.42 for its ability to defend against prompt hijacking, a type of cyberattack where malicious actors disguise harmful prompts to extract sensitive data. French AI startup Mistral’s model “8x7B Instruct” performed similarly poorly, scoring 0.38 in the same area.

In contrast, Anthropic’s “Claude 3 Opus” model, backed by Google, emerged as the top performer with an impressive overall score of 0.89, indicating stronger compliance readiness. Nevertheless, the varying performance across models underscores the need for further fine-tuning to meet the stringent requirements of the forthcoming AI Act.

Regulatory compliance a growing priority for Big Tech

The EU’s AI Act represents one of the most comprehensive regulatory frameworks globally, aimed at curbing the risks posed by artificial intelligence technologies while promoting innovation. Companies that fail to comply with the Act’s provisions could face fines as high as €35 million or 7% of their global annual turnover.

LatticeFlow’s CEO, Petar Tsankov, emphasised that the tool offers companies a roadmap to adjust their models in line with evolving EU standards. “The EU is still working out all the compliance benchmarks, but we can already see some gaps in the models,” Tsankov noted. “With a greater focus on optimising for compliance, we believe model providers can be well-prepared to meet regulatory requirements.”

While the European Commission has not yet officially endorsed the LLM Checker, it has been closely monitoring its development. A spokesperson for the Commission described the tool as a “first step” towards translating the AI Act’s legal requirements into technical guidelines that companies can follow.

Looking ahead: A compliance challenge

As the EU moves forward with its AI Act, tech companies will need to prioritise compliance or face steep penalties. The mixed results from the LLM Checker provide early insights into the regulatory challenges ahead. While some AI developers are ahead of the curve, others must make significant improvements in critical areas like cybersecurity and bias mitigation to align with the forthcoming laws.

With the clock ticking towards the full enforcement of the AI Act, the need for compliance tools like LatticeFlow’s LLM Checker will only increase, offering Big Tech a clear path to regulatory adherence in an increasingly scrutinised AI landscape.

As Tsankov concluded, “This is an opportunity for AI developers to proactively address these issues, rather than reacting when the regulations are already in force.”

For AI companies, the next few years will be pivotal in shaping their compliance strategies and ensuring their technologies meet Europe’s stringent new standards.