News Release

AI reshapes how we observe the stars

Researchers use deep learning and large language models to classify stars with high accuracy

Peer-Reviewed Publication

Intelligent Computing

Image: How the StarWhisper LightCurve series works

A large-language-model-based system for classifying stellar light curves, integrating specialized components for processing text, image, and audio data.

Credit: Yu-Yang Li et al.

AI tools are transforming how we observe the world around us, and even the stars beyond. Recently, an international team showed that deep learning techniques and large language models can help astronomers classify stars accurately and efficiently. Their study, “Deep Learning and Methods Based on Large Language Models Applied to Stellar Light Curve Classification,” was published Feb. 26 in Intelligent Computing, a Science Partner Journal.

The team introduced the StarWhisper LightCurve series, a trio of AI models, and evaluated their performance alongside other state-of-the-art approaches. All models were trained with automated deep learning to classify variable stars from their light curves. This approach automatically optimizes key factors such as learning rate, batch size, and model complexity, minimizing the need for manual tuning.
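
To make the idea of automated tuning concrete, here is a minimal sketch of one common strategy, random search over hyperparameters. The release does not say which optimization method the team used, so the search space, the `toy_validation_score` function, and all names below are illustrative assumptions rather than the study's setup.

```python
import math
import random

# Hypothetical search space: the release names these factors but not their ranges.
SEARCH_SPACE = {
    "learning_rate": (1e-5, 1e-1),    # sampled log-uniformly
    "batch_size": [16, 32, 64, 128],
    "num_layers": [1, 2, 3, 4],       # stand-in for "model complexity"
}


def sample_config(rng):
    """Draw one random configuration from the search space."""
    low, high = SEARCH_SPACE["learning_rate"]
    return {
        "learning_rate": 10 ** rng.uniform(math.log10(low), math.log10(high)),
        "batch_size": rng.choice(SEARCH_SPACE["batch_size"]),
        "num_layers": rng.choice(SEARCH_SPACE["num_layers"]),
    }


def toy_validation_score(config):
    """Placeholder for "train a model and return its validation accuracy".

    This made-up function peaks at learning_rate=1e-3, batch_size=64,
    num_layers=3 so the search has something to find.
    """
    lr_penalty = abs(math.log10(config["learning_rate"]) + 3) * 0.05
    batch_penalty = abs(math.log2(config["batch_size"]) - 6) * 0.02
    depth_penalty = abs(config["num_layers"] - 3) * 0.03
    return 1.0 - lr_penalty - batch_penalty - depth_penalty


def random_search(n_trials, seed=0):
    """Try n_trials random configurations and keep the best-scoring one."""
    rng = random.Random(seed)
    best_config, best_score = None, float("-inf")
    for _ in range(n_trials):
        config = sample_config(rng)
        score = toy_validation_score(config)
        if score > best_score:
            best_config, best_score = config, score
    return best_config, best_score


best_config, best_score = random_search(n_trials=50)
print(best_config, round(best_score, 3))
```

In a real pipeline the placeholder score would be replaced by an actual training and validation run, which is where nearly all of the computing cost lies.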

The team sourced training data from NASA’s Kepler and K2 missions, focusing on five major types of variable stars. A small number of rare variable stars were also included to improve model generalization.

The comprehensive evaluation shows high classification accuracy across different AI architectures for major variable star types. Among the top-performing models, the Conv1D + BiLSTM model achieved 94% accuracy. This hybrid deep learning approach combines convolutional layers for feature extraction with recurrent layers for temporal patterns. The Swin Transformer model, a computer-vision variant of the transformer architecture that was originally developed for natural language processing, achieved 99% accuracy.

Notably, the Swin Transformer demonstrated 83% accuracy in identifying Type II Cepheid stars, a rare class of pulsating stars that make up just 0.02% of the dataset.
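
The following sketch illustrates why accuracy on a rare class is reported separately from overall accuracy. The labels are synthetic and the counts are chosen only to mirror a 0.02% class share; they are not the study's data, and both classifiers are hypothetical.

```python
from collections import Counter


def overall_accuracy(y_true, y_pred):
    """Fraction of all examples that were labeled correctly."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def per_class_recall(y_true, y_pred):
    """For each true class, the fraction of its examples labeled correctly."""
    totals = Counter(y_true)
    hits = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    return {label: hits[label] / totals[label] for label in totals}


# Synthetic labels (an assumption for illustration): 6 rare stars in 30,000.
total = 30_000
n_rare = 6  # 0.02% of the synthetic dataset
y_true = ["common"] * (total - n_rare) + ["rare"] * n_rare

# Hypothetical classifier A ignores the rare class entirely.
always_common = ["common"] * total
# Hypothetical classifier B finds 5 of the 6 rare stars.
careful = ["common"] * (total - n_rare) + ["rare"] * 5 + ["common"]

for name, y_pred in [("always_common", always_common), ("careful", careful)]:
    print(name, overall_accuracy(y_true, y_pred), per_class_recall(y_true, y_pred))
```

Both synthetic classifiers score above 99.9% overall, yet only the second one recognizes any rare stars. This is why a per-class figure carries information that the headline accuracy cannot.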

Although the Swin Transformer delivers impressive accuracy, it requires extra preprocessing to convert light curve data into images. In contrast, StarWhisper LightCurve achieved nearly 90% accuracy with minimal manual intervention, reducing the need for explicit feature engineering. This efficiency not only streamlines data processing but also paves the way for parallel data analysis and the advancement of multi-modal AI applications in astronomy.
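
As a rough illustration of the kind of preprocessing that image-based models need, the sketch below rasterizes a light curve into a small pixel grid. The release does not describe the team's actual conversion, so the function `light_curve_to_image`, the grid size, and the synthetic sinusoidal star are all assumptions.

```python
import math


def light_curve_to_image(times, fluxes, width=32, height=16):
    """Rasterize (time, flux) samples into a height x width grid of 0/1 pixels."""
    t_min, t_max = min(times), max(times)
    f_min, f_max = min(fluxes), max(fluxes)
    t_span = (t_max - t_min) or 1.0   # avoid division by zero for constant input
    f_span = (f_max - f_min) or 1.0
    image = [[0] * width for _ in range(height)]
    for t, f in zip(times, fluxes):
        col = min(int((t - t_min) / t_span * width), width - 1)
        row = min(int((f - f_min) / f_span * height), height - 1)
        image[height - 1 - row][col] = 1   # row 0 is the top, so flip the flux axis
    return image


# Synthetic "variable star": a 2.5-day sinusoidal period sampled over 10 days.
times = [i * 0.05 for i in range(200)]
fluxes = [1.0 + 0.1 * math.sin(2 * math.pi * t / 2.5) for t in times]

image = light_curve_to_image(times, fluxes)
for row in image:
    print("".join("#" if pixel else "." for pixel in row))
```

Even this toy version requires choices about resolution and axis scaling, which is the sort of manual step the text-based approach avoids.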

The StarWhisper LightCurve series consists of three specialized large language models, each fine-tuned for a different astronomical data format:

  • A large language model, built on Gemma 7B, for classifying light curves represented as structured time-series text.
  • A multimodal large language model, built on DeepSeek-VL-7B-Chat, for processing image-based light curve representations.
  • A large audio language model, built on Qwen-Audio, for processing light curves that have been converted into sound waves.
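
The text and audio formats listed above can be sketched in a few lines. The release does not specify how the team encodes light curves for each model, so the `time:flux` text layout, the 8 kHz sample rate, and the function names here are hypothetical choices made for illustration.

```python
import io
import math
import struct
import wave


def light_curve_to_text(times, fluxes, precision=3):
    """Serialize samples as space-separated "time:flux" tokens for a text model."""
    return " ".join(
        f"{t:.{precision}f}:{f:.{precision}f}" for t, f in zip(times, fluxes)
    )


def light_curve_to_wav_bytes(fluxes, sample_rate=8000, repeats=100):
    """Encode flux values as 16-bit mono PCM audio and return the WAV file bytes."""
    f_min, f_max = min(fluxes), max(fluxes)
    span = (f_max - f_min) or 1.0   # avoid division by zero for constant input
    # Rescale flux to signed 16-bit amplitudes in [-32767, 32767].
    samples = [int(round(((f - f_min) / span * 2 - 1) * 32767)) for f in fluxes]
    # Loop the curve so the clip is long enough to hear.
    frames = struct.pack(f"<{len(samples)}h", *samples) * repeats
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)
        wav.setframerate(sample_rate)
        wav.writeframes(frames)
    return buffer.getvalue()


# Synthetic light curve: 80 samples covering two 2-day sinusoidal cycles.
times = [i * 0.05 for i in range(80)]
fluxes = [1.0 + 0.1 * math.sin(2 * math.pi * t / 2.0) for t in times]

text_prompt = light_curve_to_text(times[:4], fluxes[:4])
wav_bytes = light_curve_to_wav_bytes(fluxes)
print(text_prompt)
print(len(wav_bytes), "bytes of audio")
```

The same flux values feed both encodings, so one observation can be presented to several models in parallel.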

The StarWhisper LightCurve series is part of the broader StarWhisper project, a large language model designed for astronomy with strong reasoning and instruction-following capabilities. More details can be found at: https://github.com/Yu-Yang-Li/StarWhisper.

