Computers can now drive cars, beat world champions at board games like chess and Go, and even write prose. The revolution in artificial intelligence stems in large part from the power of one particular kind of artificial neural network, whose design is inspired by the connected layers of neurons in the mammalian visual cortex. These “convolutional neural networks” (CNNs) have proved surprisingly adept at learning patterns in two-dimensional data—especially in computer vision tasks like recognizing handwritten words and objects in digital images.
But when applied to data sets without a built-in planar geometry—say, models of irregular shapes used in 3D computer animation, or the point clouds generated by self-driving cars to map their surroundings—this powerful machine learning architecture doesn’t work well. Around 2016, a new discipline called geometric deep learning emerged with the goal of lifting CNNs out of flatland.
Now, researchers have delivered, with a new theoretical framework for building neural networks that can learn patterns on any kind of geometric surface. These “gauge-equivariant convolutional neural networks,” or gauge CNNs, developed at the University of Amsterdam and Qualcomm AI Research by Taco Cohen, Maurice Weiler, Berkay Kicanaoglu and Max Welling, can detect patterns not only in 2D arrays of pixels, but also on spheres and asymmetrically curved objects. “This framework is a fairly definitive answer to this problem of deep learning on curved surfaces,” Welling said.
The researchers’ solution to getting deep learning to work beyond flatland also has deep connections to physics. Physical theories that describe the world, like Albert Einstein’s general theory of relativity and the Standard Model of particle physics, exhibit a property called “gauge equivariance.” This means that quantities in the world and their relationships don’t depend on arbitrary frames of reference (or “gauges”); they remain consistent whether an observer is moving or standing still, and no matter how far apart the numbers are on a ruler. Measurements made in those different gauges must be convertible into each other in a way that preserves the underlying relationships between things.
Physics and machine learning have a basic similarity. As Cohen put it, “Both fields are concerned with making observations and then building models to predict future observations.” Crucially, he noted, both fields seek models not of individual things — it’s no good having one description of hydrogen atoms and another of upside-down hydrogen atoms — but of general categories of things. “Physics, of course, has been quite successful at that.”
Meanwhile, gauge CNNs are gaining traction among physicists like Cranmer, who plans to put them to work on data from simulations of subatomic particle interactions. “We’re analyzing data related to the strong [nuclear] force, trying to understand what’s going on inside of a proton,” Cranmer said. The data is four-dimensional, he said, “so we have a perfect use case for neural networks that have this gauge equivariance.”
Risi Kondor, a former physicist who now studies equivariant neural networks, said the potential scientific applications of gauge CNNs may be more important than their uses in AI.
But while physicists’ math helped inspire gauge CNNs, and physicists may find ample use for them, Cohen noted that these neural networks won’t be discovering any new physics themselves. “We’re now able to design networks that can process very exotic kinds of data, but you have to know what the structure of that data is” in advance, he said. In other words, the reason physicists can use gauge CNNs is because Einstein already proved that space-time can be represented as a four-dimensional curved manifold. Cohen’s neural network wouldn’t be able to “see” that structure on its own. “Learning of symmetries is something we don’t do,” he said, though he hopes it will be possible in the future.
Cohen can’t help but delight in the interdisciplinary connections that he once intuited and has now demonstrated with mathematical rigor. “I have always had this sense that machine learning and physics are doing very similar things,” he said. “This is one of the things that I find really marvelous: We just started with this engineering problem, and as we started improving our systems, we gradually unraveled more and more connections.”
Original story source is Quanta Magazine, an editorially independent publication of the Simons Foundation whose mission is to enhance public understanding of science by covering research developments and trends in mathematics and the physical and life sciences.
Your email address will not be published. Required fields are marked *