New Method for Comparing Neural Networks Reveals How AI Works

Pointers at a Glance

  • Researchers at Los Alamos National Laboratory have developed a new method for comparing neural networks and are exploring ways to improve their robustness.
  • Neural networks identify patterns in datasets; the findings suggest the AI research community may not need to spend as much time exploring new architectures.

The Los Alamos National Laboratory team has developed a new approach for comparing neural networks that looks within the “black box” of AI to help researchers understand neural network behavior. Neural networks identify patterns in datasets and are used throughout society, in applications such as facial recognition systems, virtual assistants, and self-driving cars.

Haydn Jones, a researcher in the Advanced Research in Cyber Systems group at Los Alamos, says the artificial intelligence research community doesn’t necessarily understand what neural networks are doing. The new method compares neural networks directly, an essential step toward better understanding the mathematics behind AI.

Jones is the lead author of the paper “If You’ve Trained One, You’ve Trained Them All: Inter-Architecture Similarity Increases With Robustness,” which was recently presented at the Conference on Uncertainty in Artificial Intelligence. In addition to studying network similarity, the paper is a significant step toward characterizing the behavior of robust neural networks.

Neural networks are high-performance but fragile. For instance, self-driving cars use neural networks to identify signs. When conditions are ideal, they do this very well, but even a small aberration in the input can cause a network to misidentify a sign.

Researchers are looking at ways to improve neural network robustness. One state-of-the-art approach involves attacking networks during their training process: researchers deliberately introduce aberrations and train the AI to ignore them. This process, known as adversarial training, makes the networks harder to fool.
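The "introduce aberrations" step can be illustrated with the fast gradient sign method (FGSM), a standard way to craft adversarial perturbations. The article does not say which attack the team used; the tiny logistic model below is a hypothetical stand-in for a real network (a minimal NumPy sketch, not the Los Alamos team's code):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def fgsm_perturb(x, y, w, eps):
    """One FGSM step: nudge the input x in the direction that
    increases the model's loss, by eps per feature."""
    # loss = -log(sigmoid(y * w.x)); analytic gradient w.r.t. x:
    grad = -y * sigmoid(-y * np.dot(w, x)) * w
    return x + eps * np.sign(grad)

rng = np.random.default_rng(0)
w = rng.normal(size=5)   # hypothetical trained weights
x = rng.normal(size=5)   # a clean input
y = 1.0                  # its label (+1 or -1)

def loss(x):
    return -np.log(sigmoid(y * np.dot(w, x)))

x_adv = fgsm_perturb(x, y, w, eps=0.1)
print(loss(x_adv) > loss(x))  # the perturbed input is harder for the model
```

Adversarial training then folds such perturbed inputs back into the training set, so the network learns to classify them correctly as well.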

Jones, Los Alamos collaborators Jacob Springer and Garrett Kenyon, and Jones’ mentor Juston Moore applied their new metric of network similarity to adversarially trained neural networks. They found that, as the magnitude of the attack increases, adversarial training causes neural networks in the computer vision domain to converge to very similar data representations, regardless of network architecture.
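The article does not describe the team's similarity metric itself. One widely used way to compare data representations across architectures is linear centered kernel alignment (CKA), sketched below purely as an illustration of what "representation similarity" means:

```python
import numpy as np

def linear_cka(X, Y):
    """Linear CKA between two representation matrices of shape
    (n_samples, n_features); 1.0 means identical up to rotation/scale."""
    X = X - X.mean(axis=0)  # center each feature
    Y = Y - Y.mean(axis=0)
    cross = np.linalg.norm(Y.T @ X, 'fro') ** 2
    norm_x = np.linalg.norm(X.T @ X, 'fro')
    norm_y = np.linalg.norm(Y.T @ Y, 'fro')
    return cross / (norm_x * norm_y)

rng = np.random.default_rng(1)
A = rng.normal(size=(50, 8))  # hypothetical activations from one network
B = rng.normal(size=(50, 8))  # unrelated activations from another

print(round(linear_cka(A, A), 6))  # a representation matches itself: 1.0
print(linear_cka(A, B) < 0.5)      # independent representations score low
```

A finding like the team's would show up here as the CKA score between different architectures rising toward 1.0 as adversarial training gets stronger.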

There has been an extensive effort in industry and academia to find the ideal architecture for neural networks, but the Los Alamos team’s findings indicate that the introduction of adversarial training narrows this search space substantially.

As a result, the AI research community may not need to spend as much time exploring new architectures, since adversarial training causes diverse architectures to converge to similar solutions.
