Issue 06 - Tracing the family tree of AI models

Tracing the family tree of AI models

Biology inspires a method to map the ancestry of AI models — and what we inherit may matter more than we think.

Forging | Issue 06 | Aug 2025

At some point, many of us have wondered about the roots of who we are. Perhaps you’ve suspected a streak of athleticism from a grandparent, or wondered if quick wit runs in the family, or imagined that being a musical genius might just be your natural calling. Our DNA is our biological blueprint, revealing everything from ancestry and predisposition to disease, to subtle quirks of behaviour and aptitude. At the same time, genes are among the most important factors in identifying familial relationships and evolutionary pathways.

Artificial intelligence, built on silicon rather than cells, has its own way of passing down traits. Today’s AI systems are rarely created from scratch; more often they are adapted from existing models, fine-tuned on new data or repurposed for different tasks. Over time, this has created a vast web of interrelated models. But unlike in biology, there is as yet no central registry, no “genome project”, no family tree.

In a new study selected as an oral presentation at the 2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition, a distinction given to only 0.8% of over 30,000 submissions, Assistant Professor Wang Xinchao from the Department of Electrical and Computer Engineering, College of Design and Engineering, National University of Singapore, poses a strikingly simple question: if AI models inherit from each other, can we trace their ancestry?

With the paper’s first author Mr Yu Runpeng, Asst Prof Wang developed a system to do just that. “We call it ‘neural lineage detection.’ It traces the ‘parent’ model from which a given AI model was fine-tuned — and sometimes even its ‘grandparent’ or ‘great-grandparent’,” says Asst Prof Wang. “It’s part forensic tool, part diagnostic instrument, and wholly relevant in untangling today’s increasingly complex, interdependent web of AI systems.”

Getting to the roots

To bring neural lineage detection to life, the duo developed two complementary methods, each tackling the problem from a different angle. The first, called a learning-free approach, is a computational shortcut that approximates how an AI model changes when it’s fine-tuned from another. Rather than running a model through multiple simulations, it uses a mathematical trick, inspired by ideas from theoretical machine learning, to estimate how closely a child model aligns with each potential parent, based on their internal structures. It’s like scanning the fingerprints of a model’s architecture and comparing them against known prints, with a lens that takes inheritance into account.
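For readers who want a feel for what "comparing a child against each potential parent" can mean in practice, here is a deliberately naive sketch in Python. It is not the researchers' method: it skips the fine-tuning approximation described above and simply assumes that fine-tuning moves a model's weights only a little, so the true parent should be the nearest candidate in weight space. The function names, the toy "models" (plain lists of numbers) and the noise level are all illustrative assumptions.

```python
import math
import random


def l2_distance(a, b):
    """Euclidean distance between two equally sized flat weight vectors."""
    if len(a) != len(b):
        raise ValueError("weight vectors must have the same length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def rank_parents(child, candidates):
    """Sort candidate names from most to least likely parent (nearest first)."""
    scores = {name: l2_distance(child, weights) for name, weights in candidates.items()}
    return sorted(scores, key=scores.get)


def make_toy_models(seed=0, size=1000):
    """Build three random 'parents' and a child derived from parent_1.

    Assumption: fine-tuning is mimicked by adding small Gaussian noise
    to the parent's weights. Real fine-tuning is far more structured.
    """
    rng = random.Random(seed)
    candidates = {
        f"parent_{i}": [rng.gauss(0, 1) for _ in range(size)] for i in range(3)
    }
    child = [w + rng.gauss(0, 0.05) for w in candidates["parent_1"]]
    return child, candidates


child, candidates = make_toy_models()
ranking = rank_parents(child, candidates)
print("Most likely parent:", ranking[0])
```

A plain distance like this would likely struggle once a model has drifted far from its parent, which is one reason a measure that accounts for how fine-tuning reshapes a model is attractive.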

Assistant Professor Wang Xinchao and his team created a framework that traces the ancestry of AI models.


The second method is more data-driven. The researchers trained a dedicated system — an AI that learns to discern the telltale signs of fine-tuning — by studying hundreds of known parent-child model pairs. Apart from learning to match models by their current behaviour, the system also picks up on deeper signs of inheritance embedded in their structure and output. It’s like training a genealogist to read between the lines of a fragmented family history.
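The idea of learning from known parent-child pairs can be illustrated with a toy example. The sketch below is only a stand-in for the team's dedicated system, whose design the article does not detail: it trains a tiny logistic-regression classifier on two hand-picked features of a model pair (mean absolute weight difference and cosine similarity). The features, the synthetic data and every function name are assumptions made for illustration.

```python
import math
import random


def pair_features(parent, child):
    """Two illustrative features for a (candidate parent, child) pair."""
    diffs = [abs(p - c) for p, c in zip(parent, child)]
    dot = sum(p * c for p, c in zip(parent, child))
    norms = math.sqrt(sum(p * p for p in parent)) * math.sqrt(sum(c * c for c in child))
    return [sum(diffs) / len(diffs), dot / norms]


def make_pairs(rng, n_pairs, size=100):
    """Synthetic data: label 1 for true parent-child pairs, 0 for unrelated pairs."""
    data = []
    for _ in range(n_pairs):
        parent = [rng.gauss(0, 1) for _ in range(size)]
        noise = rng.uniform(0.05, 0.3)  # assumed strength of "fine-tuning"
        child = [w + rng.gauss(0, noise) for w in parent]
        stranger = [rng.gauss(0, 1) for _ in range(size)]
        data.append((pair_features(parent, child), 1))
        data.append((pair_features(stranger, child), 0))
    return data


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train(data, epochs=300, lr=1.0):
    """Full-batch gradient descent on the logistic loss."""
    w, b = [0.0, 0.0], 0.0
    n = len(data)
    for _ in range(epochs):
        grad_w, grad_b = [0.0, 0.0], 0.0
        for x, y in data:
            err = sigmoid(w[0] * x[0] + w[1] * x[1] + b) - y
            grad_w[0] += err * x[0]
            grad_w[1] += err * x[1]
            grad_b += err
        w = [w[0] - lr * grad_w[0] / n, w[1] - lr * grad_w[1] / n]
        b -= lr * grad_b / n
    return w, b


def predict(model, x):
    """True if the classifier judges the pair to be parent and child."""
    w, b = model
    return sigmoid(w[0] * x[0] + w[1] * x[1] + b) >= 0.5


rng = random.Random(0)
model = train(make_pairs(rng, 40))
test_pairs = make_pairs(rng, 20)
accuracy = sum(predict(model, x) == bool(y) for x, y in test_pairs) / len(test_pairs)
print("Held-out accuracy:", accuracy)
```

On this synthetic data the two classes are easy to separate. Real models are much harder to tell apart, which is presumably why a system that learns its own cues from many examples is useful.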

Both methods were remarkably effective across a range of tasks, from classifying images to detecting and labelling objects in complex scenes. In particular, the lineage detection methods performed well even when models had been fine-tuned through multiple generations or trained on scarce data. One demonstration involved a Frankenstein-like AI stitched together from fragments of nine image models. The team’s system traced each individual fragment back to its source, a feat akin to pinpointing not just your ancestors but which traits came from whom.

“In a world where AI models are shared, adapted and redeployed across multiple platforms, understanding their lineage can help with accountability, bias tracing and even IP protection,” adds Asst Prof Wang. “A model’s ancestry can reveal where its assumptions came from, what data it was likely exposed to and what vulnerabilities or blind spots it may have inherited.”

Rooted in security

Such capabilities could greatly support efforts in AI governance, offering a way to map out how different systems are connected, where their training histories overlap and how knowledge circulates across the model ecosystem. Just as genetic analysis can help identify inherited disorders or risk factors in humans, neural lineage detection might one day help ascertain where an AI system’s flaws come from — and how to remedy them.


Looking ahead, Asst Prof Wang plans to apply the framework to more challenging tasks, such as tracing lineage when a model’s internal structure has been altered, which makes direct comparison difficult, or when only its external behaviour is observable (as in a black-box scenario, where ancestry must be inferred purely from the model’s outputs, without access to its inner workings).

“Just like biology, understanding a system’s origins can illuminate not only how it functions but how it might evolve,” adds Asst Prof Wang.
