News Release

Researchers find the key to Artificial Intelligence’s learning power – an inbuilt, special kind of Occam’s razor

Peer-Reviewed Publication

University of Oxford

A new study from Oxford University has uncovered why the deep neural networks (DNNs) that power modern artificial intelligence are so effective at learning from data. The new findings demonstrate that DNNs have an inbuilt ‘Occam’s razor’, meaning that when presented with multiple solutions that fit training data, they tend to favour those that are simpler. What is special about this version of Occam’s razor is that the bias exactly cancels the exponential growth of the number of possible solutions with complexity. The study has been published today (14 Jan) in Nature Communications.

The researchers hypothesised that, in order to make good predictions on new, unseen data (even when a network has millions or even billions more parameters than training data points), DNNs would need a kind of ‘built-in guidance’ to help them choose the right patterns to focus on.

“Whilst we knew that the effectiveness of DNNs relies on some form of inductive bias towards simplicity – a kind of Occam’s razor – there are many versions of the razor. The precise nature of the razor used by DNNs remained elusive,” said theoretical physicist Professor Ard Louis (Department of Physics, Oxford University), who led the study.

To uncover the guiding principle of DNNs, the authors investigated how these networks learn Boolean functions – fundamental rules in computing where a result can take only one of two possible values: true or false. They discovered that even though DNNs can technically fit any function to the data, they have a built-in preference for simpler functions that are easier to describe. This means DNNs are naturally biased towards simple rules over complex ones.
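
One way to picture this kind of investigation (a schematic sketch under illustrative assumptions, not the authors’ actual code) is to draw small networks with random weights, read off the Boolean function each one implements over all possible inputs, and check how often simple, highly compressible functions turn up compared with complex ones. The architecture, the weight distribution and the compression-based complexity proxy below are all assumptions made for illustration.

```python
# Schematic sketch (not the authors' code): sample randomly initialised
# networks, record the Boolean function each one computes on all 2**n inputs,
# and compare how often a function appears with a crude complexity proxy.
import zlib
from collections import Counter

import numpy as np

n_inputs = 7        # Boolean functions on {0,1}^7, i.e. 128 input points
hidden = 40         # width of the single hidden layer (arbitrary choice)
n_samples = 20000   # number of random weight draws

# Enumerate every possible input once, as a (128, 7) array of 0/1 values.
X = np.array([[(i >> b) & 1 for b in range(n_inputs)]
              for i in range(2 ** n_inputs)], dtype=float)

rng = np.random.default_rng(0)
counts = Counter()

for _ in range(n_samples):
    # One random two-layer ReLU network with Gaussian weights.
    W1 = rng.normal(size=(n_inputs, hidden))
    b1 = rng.normal(size=hidden)
    W2 = rng.normal(size=hidden)
    h = np.maximum(X @ W1 + b1, 0.0)        # ReLU hidden layer
    f = (h @ W2 > 0).astype(np.uint8)       # thresholded output = truth table
    counts[f.tobytes()] += 1

def complexity(truth_table):
    """Crude complexity proxy: compressed length of the 128-bit truth table."""
    return len(zlib.compress(truth_table))

# The most frequently sampled functions typically have a low complexity proxy,
# illustrating the preference for simple rules described above.
for table, c in counts.most_common(5):
    print(f"sampled {c:6d} times, complexity proxy {complexity(table)}")
```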

Furthermore, the authors discovered that this inherent Occam’s razor has a unique property: it exactly counteracts the exponential increase in the number of complex functions as the system size grows. This allows DNNs to identify the rare, simple functions that generalise well (making accurate predictions on both the training data and unseen data), while avoiding the vast majority of complex functions that fit the training data but perform poorly on unseen data.
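
In rough terms, the cancellation can be sketched as follows (a schematic statement only; the paper’s precise bound and its measure of complexity differ in detail). Here K(f) stands for the descriptional complexity of a Boolean function f, and a and b are illustrative constants:

```latex
% Schematic counting argument; K(f), a and b are illustrative placeholders,
% not the paper's exact quantities.
\[
  \#\{\, f : K(f) \approx K \,\} \;\sim\; 2^{K},
  \qquad
  P(f) \;\lesssim\; 2^{-a\,K(f) - b} .
\]
% Summing the prior probability over all functions of complexity roughly K:
\[
  \sum_{f:\,K(f)\approx K} P(f) \;\lesssim\; 2^{K}\cdot 2^{-aK - b},
\]
% which is roughly independent of K when a is close to 1: the exponential
% suppression supplied by the bias cancels the exponential growth in the
% number of complex functions, so they do not swamp the rare simple ones.
```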

This emergent principle helps DNNs do well when the data follows simple patterns. However, when the data is more complex and does not fit simple patterns, DNNs do not perform as well, sometimes no better than random guessing. Fortunately, real-world data is often fairly simple and structured, which aligns with the DNNs' preference for simplicity. This helps DNNs avoid overfitting (where the model gets too ‘tuned’ to the training data) when working with simple, real-world data.

To delve deeper into the nature of this razor, the team investigated how the network’s performance changed when its learning process was altered by modifying certain mathematical functions, known as activation functions, that decide whether a neuron should ‘fire’ or not.

They found that although these modified DNNs still favour simple solutions, even slight adjustments to this preference significantly reduced their ability to generalise (make accurate predictions) on simple Boolean functions. The same problem occurred in other learning tasks, demonstrating that having the correct form of Occam’s razor is crucial for the network to learn effectively.
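
The flavour of such a comparison can be sketched as follows (an illustrative example only; the target function, architecture and activation functions are assumptions, and the paper’s experiments are more extensive). The same small network is trained on half of the truth table of a simple Boolean function with different activation functions, and accuracy is then measured on the unseen half:

```python
# Schematic comparison (not the paper's setup): train the same small network
# with different activation functions on part of a simple Boolean function's
# truth table, then score it on the held-out inputs.
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

n_inputs = 7
X = np.array([[(i >> b) & 1 for b in range(n_inputs)]
              for i in range(2 ** n_inputs)])

# A "simple" target: the output depends on only two of the seven input bits.
y = X[:, 0] & X[:, 1]

# Train on half of the 128 possible inputs, test on the other half.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, train_size=0.5, random_state=0)

for activation in ["relu", "tanh", "logistic"]:
    net = MLPClassifier(hidden_layer_sizes=(64, 64), activation=activation,
                        max_iter=5000, random_state=0)
    net.fit(X_train, y_train)
    print(f"{activation:8s} accuracy on unseen inputs: "
          f"{net.score(X_test, y_test):.2f}")
```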

The new findings help to ‘open the black box’ of how DNNs arrive at their conclusions; this opacity currently makes it difficult to explain or challenge decisions made by AI systems. However, while these findings apply to DNNs in general, they do not fully explain why some specific DNN models work better than others on certain types of data.

Christopher Mingard (Department of Physics, Oxford University), co-lead author of the study, said: “This suggests that we need to look beyond simplicity to identify additional inductive biases driving these performance differences.” 

According to the researchers, the findings suggest a strong parallel between artificial intelligence and fundamental principles of nature. Indeed, the remarkable success of DNNs on a broad range of scientific problems indicates that this exponential inductive bias must mirror something deep about the structure of the natural world.  

“Our findings open up exciting possibilities,” said Professor Louis. “The bias we observe in DNNs has the same functional form as the simplicity bias in evolutionary systems that helps explain, for example, the prevalence of symmetry in protein complexes. This points to an intriguing connection between learning and evolution, one that is ripe for further exploration.”

Notes to editors:

For media enquiries and interview requests, contact Professor Ard Louis: ard.louis@physics.ox.ac.uk

The paper ‘Deep neural networks have an inbuilt Occam’s razor’ will be published in Nature Communications at 10 AM GMT / 5 AM ET on Tuesday 14 January 2025 at https://www.nature.com/articles/s41467-024-54813-x. To view a copy of the study in advance under embargo, contact Professor Ard Louis: ard.louis@physics.ox.ac.uk

About the University of Oxford:

Oxford University has been placed number 1 in the Times Higher Education World University Rankings for the ninth year running, and number 3 in the QS World Rankings 2024. At the heart of this success are the twin pillars of our ground-breaking research and innovation and our distinctive educational offer. Oxford is world-famous for research and teaching excellence and home to some of the most talented people from across the globe. Our work helps improve the lives of millions, solving real-world problems through a huge network of partnerships and collaborations. The breadth and interdisciplinary nature of our research, alongside our personalised approach to teaching, sparks imaginative and inventive insights and solutions.

Through its research commercialisation arm, Oxford University Innovation, Oxford is the highest university patent filer in the UK and is ranked first in the UK for university spinouts, having created more than 300 new companies since 1988. Over a third of these companies have been created in the past five years. The university is a catalyst for prosperity in Oxfordshire and the United Kingdom, contributing £15.7 billion to the UK economy in 2018/19, and supports more than 28,000 full-time jobs.
