Stellar Cartography

Exoplanet Host Star Classification: Multi-Objective Optimisation Of Incomplete Stellar Abundance Data

By Keith Cowing
Status Report
RAS Techniques and Instruments
June 25, 2024
This infographic compares the characteristics of three classes of stars in our galaxy: Sunlike stars are classified as G-type stars; stars less massive and cooler than our Sun are K dwarfs; and even fainter and cooler stars are the reddish M dwarfs. The habitable zones, potentially capable of hosting life-bearing planets, are wider for hotter stars. The lifetime of red dwarf M stars can exceed 100 billion years. K dwarf lifetimes can range from 15 to 45 billion years. Meanwhile, our Sun lasts only about 10 billion years. The relative amount of radiation harmful to life as we know it can be 80 to 500 times more intense for M dwarfs than for our Sun, but only 5 to 25 times more intense for the orange K dwarfs. Red dwarfs make up the bulk of the Milky Way’s population, about 73%. Sunlike stars are merely 6% of the population, and K dwarfs are at 13%. Credit: NASA, ESA and Z. Levy (STScI)

The presence of a planetary companion has been repeatedly linked with the properties of its host star, which affect the likelihood of sub-stellar object formation and stability in the protoplanetary disc. Establishing these links is a key challenge in exoplanet science.

Furthermore, abundance and stellar parameter datasets tend to be incomplete, which limits the ability to infer distributional characteristics from the entire dataset. This work aims to develop a methodology that uses machine learning and multi-objective optimisation to provide reliable imputation for subsequent comparison tests and host star recommendation. It integrates fuzzy clustering for imputation and ML classification of hosts and comparison stars into an evolutionary multi-objective optimisation algorithm.
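To make the imputation idea concrete, here is a minimal sketch of how fuzzy cluster memberships could be used to fill in a missing abundance. It is an illustration under stated assumptions, not the paper's implementation: the function names, the use of `None` for missing values, the default fuzzifier `m = 2` and the partial-distance rule (distances computed over observed features only) are all choices made for this example. The membership formula is the standard fuzzy c-means one.

```python
import math

def memberships(row, centres, m=2.0):
    """Fuzzy c-means style memberships of one star in each cluster.

    Assumption for this sketch: distances use only the features that
    are observed (not None) for this star.
    """
    observed = [j for j, v in enumerate(row) if v is not None]
    dists = [
        math.sqrt(sum((row[j] - c[j]) ** 2 for j in observed))
        for c in centres
    ]
    # A star sitting exactly on one or more centres is shared among them.
    if any(d == 0.0 for d in dists):
        hits = [1.0 if d == 0.0 else 0.0 for d in dists]
        total = sum(hits)
        return [h / total for h in hits]
    # Standard FCM weighting: u_k is proportional to d_k^(-2 / (m - 1)).
    power = 2.0 / (m - 1.0)
    inv = [d ** (-power) for d in dists]
    total = sum(inv)
    return [w / total for w in inv]

def impute(row, centres, m=2.0):
    """Replace each missing value with the membership-weighted centre value."""
    u = memberships(row, centres, m)
    filled = []
    for j, v in enumerate(row):
        if v is None:
            filled.append(sum(u_k * c[j] for u_k, c in zip(u, centres)))
        else:
            filled.append(v)
    return filled

# Two hypothetical cluster centres in a two-feature abundance space.
centres = [[0.0, 0.0], [1.0, 1.0]]
star = [0.25, None]  # second abundance is missing
print(impute(star, centres))
```

With these two centres, the star observed at 0.25 on the first feature receives memberships of about 0.9 and 0.1, so its missing second feature is imputed as about 0.1. In the scheme the abstract describes, the centres themselves would be proposed by the optimiser rather than fixed by hand.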

We test several candidates for the classification model, starting with a binary classification for giant planet hosts. Upon confirmation that the XGBoost algorithm provides the best performance, we interpret the performance of both the imputation and classification modules for binary classification. The model is extended to handle multi-label classification for low-mass planets and planet multiplicity. Constraints on the model’s use and feature/sample selection are given, outlining strengths and limitations.
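The step from binary to multi-label classification can be pictured as replacing a single host/non-host flag with one indicator per label, so that a star may carry several labels at once. The sketch below shows one common way to encode such targets. The label names and the function are hypothetical, chosen only for illustration; the abstract does not say how the authors encode their labels.

```python
# Hypothetical label names, chosen for this example only.
LABELS = ("giant_host", "low_mass_host", "multi_planet")

def encode_labels(star_labels, labels=LABELS):
    """Turn a set of label names into a fixed-order 0/1 indicator vector."""
    unknown = set(star_labels) - set(labels)
    if unknown:
        raise ValueError(f"unknown labels: {sorted(unknown)}")
    return [1 if name in star_labels else 0 for name in labels]

# A comparison star carries no labels; a host can carry several.
print(encode_labels(set()))
print(encode_labels({"low_mass_host", "multi_planet"}))
```

A classifier trained on vectors like these predicts each indicator separately or jointly, whereas the binary case corresponds to keeping only the first column.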

We conclude that the careful use of this technique for host star recommendation will be an asset to future missions and the compilation of necessary target lists.

Schematic for the chromosome encoding within the GA design. The specific configurations of both the imputation and classification models are represented as a string of genes within the chromosome. The set of imputation genes consists of the clustering hyperparameters and the coordinates of all designated cluster centres. The classification genes are the values used to build and define the classification model, and therefore depend on whichever model is being used in that particular run. Within the classification module, solid directional lines represent paths that are present in both the binary and multi-label variations of the design, while the dashed line represents those present only in the multi-label modification. Credit: RAS Techniques and Instruments
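The encoding in the caption can be sketched as a flat list of genes that is unpacked into an imputation part and a classification part. The layout below is an assumption made for illustration (cluster count first, then a fuzzifier, then centre coordinates, then classifier values); the actual gene order and the hyperparameters used in the paper are not given here. The names `max_depth` and `learning_rate` are real XGBoost hyperparameters, but whether the authors encode them is not stated.

```python
def decode_chromosome(genes, n_features, classifier_gene_names):
    """Split a flat gene list into imputation and classification settings.

    Assumed layout (hypothetical):
      [n_clusters, fuzzifier, centre coordinates..., classifier genes...]
    """
    n_clusters = int(genes[0])
    fuzzifier = float(genes[1])
    n_coords = n_clusters * n_features
    expected = 2 + n_coords + len(classifier_gene_names)
    if len(genes) != expected:
        raise ValueError(f"expected {expected} genes, got {len(genes)}")
    # Centre coordinates are stored cluster by cluster.
    flat = genes[2:2 + n_coords]
    centres = [
        list(flat[k * n_features:(k + 1) * n_features])
        for k in range(n_clusters)
    ]
    # Remaining genes are paired with the classifier's parameter names.
    classifier = dict(zip(classifier_gene_names, genes[2 + n_coords:]))
    return {
        "n_clusters": n_clusters,
        "fuzzifier": fuzzifier,
        "centres": centres,
        "classifier": classifier,
    }

genes = [2, 2.0, 0.0, 0.1, 1.0, 1.1, 4, 0.3]
config = decode_chromosome(genes, n_features=2,
                           classifier_gene_names=("max_depth", "learning_rate"))
print(config["centres"])
print(config["classifier"])
```

Because the classifier genes are passed in by name, the same decoder works for whichever candidate model is used in a given run, which mirrors the caption's point that the classification genes depend on the model. In this simplified layout the chromosome length changes with the number of clusters; a real genetic algorithm would need crossover and mutation operators that account for that.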

Exoplanet host star classification: Multi-Objective Optimisation of incomplete stellar abundance data, RAS Techniques and Instruments (open access)

Astrobiology

Explorers Club Fellow, ex-NASA Space Station Payload manager/space biologist, Away Teams, Journalist, Lapsed climber, Synaesthete, Na’Vi-Jedi-Freman-Buddhist-mix, ASL, Devon Island and Everest Base Camp veteran, (he/him) 🖖🏻