Artificial Intelligence helps in the identification of astronomical objects

Tridimensional Universe map, made by the eBOSS collaboration at SDSS. (Credit: EPFL)

SHEEP is a new artificial intelligence software, developed by researchers at Instituto de Astrofísica e Ciências do Espaço, in Portugal, to help classify astronomical sources.

26th of May 2022 – Classifying celestial objects is a long-standing problem. With sources at nearly unimaginable distances, it is sometimes difficult for researchers to distinguish between, for example, stars, galaxies, quasars [1] or supernovae [2].

The problem of classifying celestial objects is very challenging, in terms of the numbers and the complexity of the Universe, and artificial intelligence is a very promising tool for this type of task.
Andrew Humphrey

Pedro Cunha and Andrew Humphrey, researchers at the Instituto de Astrofísica e Ciências do Espaço (IA [3]), set out to tackle this classic problem by creating SHEEP, a machine learning algorithm that determines the nature of astronomical sources.

The first author of the article [4], now published in the journal Astronomy & Astrophysics, is Pedro Cunha, a PhD student at IA and in the Department of Physics and Astronomy of the Science Faculty of the University of Porto. He explains: “This work was born as a side project from my MSc thesis. It combined the lessons learned during that time into a unique project.” Andrew Humphrey, Pedro Cunha’s MSc advisor and now PhD co-advisor, pointed out: “It was very cool to get such an interesting result, especially from a master’s thesis!”

SHEEP is a supervised machine learning pipeline that first estimates photometric [5] redshifts [6] and then uses this information when classifying each source as a galaxy, quasar or star. “The photometric information is the easiest to obtain and thus is very important to provide a first analysis about the nature of the observed sources,” says Pedro Cunha.

A novel step in our pipeline is that prior to performing the classification, SHEEP first estimates photometric redshifts, which are then placed into the data set as an additional feature for classification model training.
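
As an illustration of the two-stage idea described above, the sketch below first estimates a redshift from photometric colours and then appends that estimate as an extra feature before classifying. It is a minimal toy example, not the SHEEP code: it uses a simple k-nearest-neighbours rule in place of the ensemble learning named in the article's title, and the catalogue, colour values and function names are all invented for demonstration.

```python
import math
import random

def nearest(train_X, x, k):
    """Indices of the k training rows closest to x (Euclidean distance)."""
    order = sorted(range(len(train_X)), key=lambda i: math.dist(train_X[i], x))
    return order[:k]

def estimate_redshift(train_X, train_z, x, k=5):
    """Stage 1: photometric redshift as the mean redshift of the k neighbours."""
    idx = nearest(train_X, x, k)
    return sum(train_z[i] for i in idx) / len(idx)

def classify(train_X, train_y, x, k=5):
    """Stage 2: majority vote among the k nearest neighbours."""
    votes = {}
    for i in nearest(train_X, x, k):
        votes[train_y[i]] = votes.get(train_y[i], 0) + 1
    return max(votes, key=votes.get)

def toy_catalogue(n, seed):
    """Synthetic sources (NOT survey data): two colours, a redshift, a label."""
    # Hypothetical class centres: (colour 1, colour 2, typical redshift).
    centres = {"star": (0.6, 0.3, 0.0),
               "galaxy": (1.4, 0.8, 0.3),
               "quasar": (0.1, -0.1, 2.0)}
    rng = random.Random(seed)
    colours, redshifts, labels = [], [], []
    for _ in range(n):
        label = rng.choice(sorted(centres))
        c1, c2, z0 = centres[label]
        colours.append([c1 + rng.gauss(0, 0.1), c2 + rng.gauss(0, 0.1)])
        redshifts.append(max(0.0, z0 + rng.gauss(0, 0.05)))
        labels.append(label)
    return colours, redshifts, labels

def add_photo_z(reg_X, reg_z, X, k=5):
    """Append the estimated redshift to each row as an additional feature."""
    return [row + [estimate_redshift(reg_X, reg_z, row, k)] for row in X]

# Separate sets: one to fit the redshift estimator, one to fit the classifier,
# and one of "new" sources to classify.
reg_X, reg_z, _ = toy_catalogue(150, seed=1)
clf_X, _, clf_y = toy_catalogue(150, seed=2)
new_X, _, new_y = toy_catalogue(60, seed=3)

clf_aug = add_photo_z(reg_X, reg_z, clf_X)   # colours + estimated redshift
new_aug = add_photo_z(reg_X, reg_z, new_X)
predictions = [classify(clf_aug, clf_y, row) for row in new_aug]
accuracy = sum(p == t for p, t in zip(predictions, new_y)) / len(new_y)
print(f"toy accuracy: {accuracy:.2f}")
```

The redshift estimator and the classifier are fitted on different toy sets so that the classifier is trained on estimated redshifts, the same kind of feature it will see for new sources.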

The team found that including the redshift and the coordinates of the objects allows the AI to place them within a 3D map of the Universe and to use that, together with color information, to make better estimates of source properties. For example, the AI learnt that there is a higher chance of finding stars closer to the Milky Way plane than at the Galactic Poles. Andrew Humphrey (IA & University of Porto) added: “When we allowed the AI to have a 3D view of the Universe, this really improved its ability to make accurate decisions about what each celestial object was.”
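
To make the “3D view” concrete, the sketch below shows one possible way to turn sky coordinates and a redshift into a position in space, and how to compute a source's Galactic latitude (its angle from the Milky Way plane). This is an illustration only: the article does not say how SHEEP encodes positions, the redshift is used directly as a stand-in for distance, and the function names are hypothetical. The pole coordinates are the standard J2000 position of the North Galactic Pole.

```python
import math

# J2000 equatorial coordinates of the North Galactic Pole, in degrees.
NGP_RA_DEG = 192.85948
NGP_DEC_DEG = 27.12825

def sky_to_cartesian(ra_deg, dec_deg, radial):
    """Convert right ascension, declination and a radial value to (x, y, z).

    `radial` is any distance proxy; here we assume the redshift itself is used.
    """
    ra, dec = math.radians(ra_deg), math.radians(dec_deg)
    x = radial * math.cos(dec) * math.cos(ra)
    y = radial * math.cos(dec) * math.sin(ra)
    z = radial * math.sin(dec)
    return x, y, z

def galactic_latitude(ra_deg, dec_deg):
    """Angle (degrees) between a source and the Milky Way plane."""
    ra, dec = math.radians(ra_deg), math.radians(dec_deg)
    ra_p, dec_p = math.radians(NGP_RA_DEG), math.radians(NGP_DEC_DEG)
    sin_b = (math.sin(dec) * math.sin(dec_p)
             + math.cos(dec) * math.cos(dec_p) * math.cos(ra - ra_p))
    sin_b = max(-1.0, min(1.0, sin_b))  # guard against rounding just past 1
    return math.degrees(math.asin(sin_b))

# Example: a hypothetical source at RA 150 deg, Dec 2 deg, redshift 0.5.
position = sky_to_cartesian(150.0, 2.0, 0.5)
latitude = galactic_latitude(150.0, 2.0)
print(position, latitude)
```

A classifier given the Galactic latitude (or the 3D position) as a feature can pick up trends such as the one mentioned above, with stars being more common near the Milky Way plane.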

Artist’s impression of the Euclid spacecraft. (Credit: ESA/ATG medialab (spacecraft); NASA, ESA, CXC, C. Ma, H. Ebeling and E. Barrett (U. Hawaii/IfA), et al. and STScI (background))

Wide-area surveys, both ground- and space-based, like the Sloan Digital Sky Survey (SDSS), have yielded high volumes of data, revolutionizing the field of astronomy. Future surveys, carried out by the likes of the Vera C. Rubin Observatory, the Dark Energy Spectroscopic Instrument (DESI), the Euclid (ESA) space mission or the James Webb Space Telescope (NASA/ESA), will deliver even more, and more detailed, imaging. However, the sheer volume of data makes analysis with traditional methods very time-consuming. AI, and machine learning in particular, will be crucial for analyzing these new data and making the best scientific use of them.

One of the most exciting parts is seeing how machine learning is helping us to better understand the universe. Our methodology shows us one possible path, while new ones are created along the process. It is an exciting time for Astronomy!
Pedro Cunha

This work is part of the team’s effort towards exploiting the expected deluge of data to come from those surveys, by developing artificial intelligence systems that efficiently classify and characterize billions of sources.

Imaging and spectroscopic surveys are among the main resources for understanding the visible content of the Universe. The data from these surveys enable statistical studies of stars, quasars and galaxies, and the discovery of more peculiar objects.

According to Polychronis Papaderos, Principal Investigator of the research group “The assembly history of galaxies resolved in space and time” at IA: “The development of advanced Machine Learning algorithms, such as SHEEP, is an integral component of IA’s coherent strategy toward scientific exploitation of unprecedentedly large sets of photometric data for billions of galaxies with ESA’s Euclid space mission, scheduled for launch in 2023.”

Euclid will provide a detailed cartography of the Universe and shed light on the nature of the enigmatic dark matter and dark energy. The IA coordinates the Portuguese participation in Euclid and leads or co-leads, within the Euclid consortium, several work packages in the field of Cosmology and Extragalactic Astronomy.


  1. A Quasar is an extremely bright and very distant active galactic nucleus, formed by a supermassive black hole with an accretion disk around it. The disk material, accelerated almost to the speed of light, heats up by friction, becoming extremely bright. Because these objects are very distant, when they were discovered, they seemed to be stars, and were dubbed quasi-stellar objects, or quas-ars for short.
  2. A supernova is a stellar explosion. It might be the result of the death of a massive star (type II), or of the accretion of matter from a giant star by a white dwarf star, until it reaches critical mass (type Ia). These explosions are so intense that, for a few weeks, their brightness is equivalent to the brightness of the whole host galaxy.
  3. The Instituto de Astrofísica e Ciências do Espaço (Institute of Astrophysics and Space Sciences – IA) is the reference Portuguese research unit in this field, integrating researchers from the University of Lisbon, the University of Coimbra and the University of Porto, and encompasses most of the field’s national scientific output. It was evaluated as “Excellent” in the last evaluation of research and development units undertaken by Fundação para a Ciência e Tecnologia (FCT). IA’s activity is funded by national and international funds, including FCT/MCES (UIDB/04434/2020 and UIDP/04434/2020).
  4. The article “Photometric redshift-aided classification using ensemble learning” was published online in Astronomy & Astrophysics (DOI: 10.1051/0004-6361/202243135).
  5. Photometry is a technique to measure the intensity of light emitted by astronomical objects.
  6. Redshift is the increase in the wavelength of light, which results from an object moving away from us. This wavelength shift, in the visible part of the electromagnetic spectrum, translates into a shift of color towards the red part of the spectrum, or “red-shift”. The redshift of an astronomical object can be used to determine the distance of the object.
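
As a worked illustration of note 6, redshift is conventionally written as the fractional change in wavelength, where \lambda_{emit} is the wavelength at emission and \lambda_{obs} the wavelength observed. The numbers below are an example chosen for illustration (the hydrogen H-alpha line, rest wavelength 656.3 nm, assumed to be observed at 721.9 nm), not a measurement from the article.

```latex
z = \frac{\lambda_{\mathrm{obs}} - \lambda_{\mathrm{emit}}}{\lambda_{\mathrm{emit}}},
\qquad
z = \frac{721.9\,\mathrm{nm} - 656.3\,\mathrm{nm}}{656.3\,\mathrm{nm}} \approx 0.10
```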

Pedro Cunha; Andrew Humphrey

Science Communication Group
Ricardo Cardoso Reis; Sérgio Pereira; Filipe Pires (coordination, Porto); João Retrê (coordination, Lisbon)