PhD Defense by Zoe Ansari

Abstract: Machine learning methods are a promising approach to obtaining precise estimates from large astronomical data sets. These methods allow us to explore a large parameter space that is often not fully explored with traditional methods due to, for example, insufficient data coverage and/or quality, as well as computational limitations. The aim of this Ph.D. work is to design, apply and evaluate different machine learning algorithms for two scientific cases: 1) estimating photometric redshifts and 2) inferring dust quantities in supernovae.

Determining photometric redshifts (photo-zs) of extragalactic sources to high accuracy is paramount for measuring distances in wide-field cosmological experiments. With only photometric information at hand, photo-zs are prone to systematic uncertainties in the intervening extinction and the unknown underlying spectral energy distribution of different astrophysical sources. These lead to degeneracies in modern machine learning algorithms that limit the accuracy of photo-z estimates. Here, I will present an infinite Gaussian mixture model as a semi-supervised machine learning method to resolve these model degeneracies and to separate the intrinsic physical properties of astrophysical sources from extrinsic systematics. Furthermore, I will present a mixture density network that I have applied to estimate full photo-z probability distributions, together with their uncertainties, from the probabilistic classification produced by the infinite Gaussian mixture model. The photo-zs estimated with this method are competitive with state-of-the-art techniques and comparable with the photo-zs estimated by SDSS.

The origin of dust in galaxies in the universe remains unclear, but there is growing evidence that core-collapse supernovae (CCSNe) are efficient dust producers. However, determining the properties of dust formed in and around supernovae from observations remains challenging. This may be due to incomplete coverage of the data in wavelength or time, but also to the often inconspicuous signatures of dust in the observed data. To address this challenge, I used a large set of dusty supernova spectral energy distributions simulated with MOCASSIN. I will present a neural network that I developed to determine the amount, temperature and composition of the dust. Furthermore, I will present a feature importance analysis that I conducted to find the minimum set of JWST bandpass filters required to predict dust quantities with an accuracy comparable to that achieved with standard models in the literature. For the most realistic scenario adopted in this work, the most reliable predicted dust masses and temperatures achieve a root-mean-square error of ~0.12 dex and ~38 K, respectively. Finally, I tested the trained neural network on observational data of SN 1987A at 615 days after the explosion. The neural network predicts a dust mass of 0.0008 solar masses and a dust temperature of 359 K, with uncertainties of 0.0005 solar masses and 45 K, respectively. The results are comparable to estimates in the literature and thus show a promising way forward for determining dust properties in supernovae. However, more detailed simulated models are required to improve the inference of dust properties by allowing the neural network to learn all combinations of assumptions (e.g. various dust species, optical depths and grain sizes).

Supervisor: Christa Gall

Assessment committee: Viviana Acquaviva; Mikako Matsuura; Steen Harle Hansen