Nabeel Seedat
Hi! I am a PhD student in Machine Learning at the University of Cambridge, supervised by Prof. Mihaela van der Schaar. My research interests are data-modality agnostic, focussing on: Data-Centric AI, Uncertainty Quantification, Synthetic Data & Trustworthy ML.
Since data is the fuel for ML, my research seeks to develop systematic data-centric approaches applicable across different data modalities, including tabular, image & text, with the goal of making ML systems more reliable & trustworthy, whilst also improving both model performance & training efficiency. Most recently, my focus has been on LLMs!
I hold a Master's degree from Cornell University, where I worked on Bayesian Deep Learning, as well as a Master's from the University of the Witwatersrand (South Africa), where I worked on Signal Processing & ML for Parkinson’s Disease. I also hold a dual bachelor's degree in Information Engineering & Biomedical Engineering from the University of the Witwatersrand (South Africa). Before starting my PhD, I spent time working on production ML systems as a Data Scientist focussed on Computer Vision at Shutterstock (USA) and as an ML Engineer focussed on NLP at Multichoice (Africa’s largest multimedia company).
To reach out, please send an email to: ns741@cam.ac.uk
News
Jan 2024 – Three papers accepted! One paper at AISTATS2024 and two papers at ICLR2024, a first :) Looking forward to presenting these with my co-authors!
Dec 2023 – DC-Check accepted to IEEE Transactions on AI! Interested in Data-Centric AI? Then check out our paper, Navigating Data-Centric Artificial Intelligence with DC-Check: Advances, Challenges, and Opportunities.
Nov 2023 – Gave a talk at Microsoft Research Cambridge on Data-Centric AI. Thanks for hosting me!
Oct 2023 – Data-Centric AI Tutorial accepted to NeurIPS2023, together w/ Mihaela van der Schaar and Isabelle Guyon (Google Research). See you in New Orleans!
Oct 2023 – Four papers accepted to NeurIPS2023! Three papers on the main track and one on the D&B track. Camera-ready versions of our papers coming soon!
Sep 2023 – On September 11, I gave a talk on Data-Centric AI at the AI and Machine Learning in Healthcare Summer School organised by the Cambridge Centre for AI in Medicine (CCAIM). Have a look at the fantastic program here: https://ccaim.cam.ac.uk/program/.
Aug 2023 – Presented a tutorial on Data-Centric AI at IJCAI2023, together w/ Mihaela van der Schaar. It was a fantastic experience to engage with the community about this important research area!
July 2023 – Selected by the Mail & Guardian as one of the Top 200 Young South Africans for 2023!
June 2023 – Awarded the best research poster presentation at the Future of Data-Centric AI conference.
May 2023 – Paper accepted to ICML2023 on transportable structure learning [paper].
March 2023 – Our Data-Centric AI checklist, DC-Check, was featured by MarkTechPost and the Montreal AI Ethics Institute (see the DC-Check paper).
Jan 2023 – New paper accepted to AISTATS2023 on improving conformal prediction w/ self-supervised learning [paper].
Oct 2022 – Excited to be giving talks on Data-Centric AI at AstraZeneca, Queen Mary University of London and the University of Cape Town!
Sept 2022 – New paper accepted to NeurIPS2022 on data-centric AI to audit training datasets for tabular, images and text [paper]. Looking forward to presenting together with my co-authors!
May 2022 – Two papers accepted at ICML22! One on data-centric AI for reliable deployment [paper] and one on treatment effect estimation in continuous time [paper].
Oct 2021 – I have officially started a PhD in Machine Learning at the University of Cambridge under the supervision of Mihaela van der Schaar!
Publications
Please find some of my publications below (a more up-to-date list can be found on Google Scholar).
“*” denotes equal contribution.
Top ML/AI conferences
- N.Seedat, F.Imrie, M. van der Schaar. ``Dissecting sample hardness: A Fine-Grained Analysis of Hardness Characterization Methods for Data-Centric AI.’’ ICLR 2024 [paper]
- T.Liu, N.Astorga, N.Seedat, M. van der Schaar. ``Large Language Models to Enhance Bayesian Optimization.’’ ICLR 2024 [paper]
- N.Huynh, J.Berrevoets, N.Seedat, J.Crabbe, Z.Qian, M. van der Schaar. ``DAGnosis: Localized Identification of Data Inconsistencies using Structures.’’ AISTATS 2024 [paper]
- N.Seedat, J.Crabbe, Z.Qian, M. van der Schaar. ``TRIAGE: Characterizing and auditing training data for improved regression.’’ NeurIPS 2023 [paper]
- N.Seedat*, B.van Breugel*, F.Imrie, M. van der Schaar. ``Can you rely on your model evaluation? Improving model evaluation with synthetic test data.’’ NeurIPS 2023 [paper]
- L.Hansen*, N.Seedat*, M. van der Schaar, A.Petrovic. ``Reimagining Synthetic Data Generation through Data-Centric AI: A Comprehensive Benchmark.’’ NeurIPS 2023 (D&B) [paper]
- H.Sun, B.van Breugel, J.Crabbé, N.Seedat, M. van der Schaar. ``What is Flagged in Uncertainty Quantification? Latent Density Models for Uncertainty Categorization.’’ NeurIPS 2023 [paper]
- N.Seedat*, A.Jeffares*, F.Imrie, M. van der Schaar. ``Improving adaptive conformal prediction using self-supervised learning.’’ AISTATS 2023 [paper]
- J.Berrevoets, N.Seedat, F.Imrie, M. van der Schaar. ``Differentiable and transportable structure learning.’’ ICML 2023 [paper]
- N.Seedat, J.Crabbe, I.Bica, M. van der Schaar. ``Data-IQ: Characterizing subgroups with heterogeneous outcomes in tabular data.’’ NeurIPS 2022 [paper]
- N.Seedat, J.Crabbe, M. van der Schaar. ``Data-SUITE: Data-Centric identification of in-distribution incongruous examples.’’ ICML 2022 (Spotlight) [paper]
- N.Seedat*, F.Imrie*, A.Bellot, Z.Qian, M. van der Schaar. ``Continuous-time modeling of counterfactual outcomes using neural controlled differential equations.’’ ICML 2022 [paper]
- N.Seedat. ``MCU-Net: A framework towards uncertainty representations for decision support system patient referrals in healthcare contexts.’’ KDD 2020, Spotlight Presentation: Workshop on Applied Data Science for Healthcare & ICML 2020: Uncertainty & Robustness in Deep Learning Workshop. [paper]
- N.Seedat and C.Kanan. ``Towards calibrated and scalable uncertainty representations for neural networks.’’ NeurIPS 2019 - 4th Workshop on Bayesian Deep Learning. [paper]
- N.Seedat and V.Aharonson. ``Machine learning discrimination of Parkinson’s Disease stages from walker-mounted sensors data.’’ AAAI 2020 - International Workshop on Health Intelligence and Studies in Computational Intelligence (Springer), 2020. [paper]
Journals, other conferences, pre-prints
- N.Seedat, F.Imrie, M. van der Schaar. ``Navigating Data-Centric Artificial Intelligence with DC-Check: Advances, Challenges, and Opportunities.’’ IEEE Transactions on Artificial Intelligence, 2024. [paper]
- E.Heremans, N.Seedat, B.Buyse, D.Testelmans, M. van der Schaar, & M. De Vos. ``U-PASS: an Uncertainty-guided deep learning Pipeline for Automated Sleep Staging.’’ Computers in Biology and Medicine, Vol 171, 2024. [paper]
- H.Liu*, N.Seedat* and J.Ive. ``Modelling Disagreement in Automatic Data Labelling for Semi-Supervised Learning in Clinical Natural Language Processing.’’ arXiv preprint, arXiv:2205.14761, 2022. [paper]
- N.Seedat and V.Aharonson. ``Automated Machine Vision Enabled Detection of Movement Disorders from hand drawn spirals.’’ IEEE International Conference on Health Informatics (IEEE ICHI), 2020. [paper]
- N.Seedat, V.Aharonson and Y.Hamzany. ``Automated and interpretable m-health discrimination of vocal cord pathology enabled by machine learning.’’ IEEE Conference on Computer Science and Data Engineering, 2020. [paper]
- V.Aharonson, N.Seedat, S.Korn, S.Baer, M.Postema, G.Yahalom. ``Automated stage discrimination of Parkinson’s Disease.’’ BIO Integration Journal, 2020. [paper]
- N.Seedat, N.Sen, N.Naicker, K.Sharma, A.Almeida, G.Kalyansundaram, B.Mkwanazi, M.Velayudan. ``PEMS: Custom Neural Machine Translation System-Making subtitling of Portuguese TV shows and movies on the African continent work.’’ IEEE ICECET, 2021. [paper]
- N.Seedat, D.Beder, V.Aharonson and S.Dubowsky. ``A comparison of footfall detection algorithms from walker mounted sensors data.’’ IEEE EBBT, 2018. [paper]
- V.Aharonson, N.Seedat, I.Schlesinger, A.McDonald, S.Dubowsky and A.Korczyn. ``Feasibility of an instrumented walker to quantify treatment effects on Parkinson’s patient gait.’’ IEEE EBBT, 2018. [paper]
- N.Seedat, I.Mohamed and AK.Mohamed. ``Custom Force Sensor and Sensory Feedback System to Enable Grip Control of a Robotic Prosthetic Hand.’’ IEEE BioRob, 2018. [paper]
- N.Seedat and A.van Wyk. ``Quadcopter Control using Intelligent Control.’’ Deep Learning Indaba. [paper]
Tutorials
N.Seedat, I.Guyon, M. van der Schaar. ``Data-Centric AI for reliable and responsible AI: from theory to practice.’’ NeurIPS 2023 Tutorial.
N.Seedat and M. van der Schaar. ``Data-Centric AI: Foundation, Frontiers and Applications.’’ IJCAI 2023 Tutorial.
Invited Talks
Apple (Topic: Data-Centric AI - Data characterization & Synthetic Data) (April 2024)
Microsoft Research Cambridge (Topic: Data-Centric AI) (November 2023)
Future of Data-Centric AI Conference Talk (Topic: Data-IQ) (June 2023)
Discovery Limited Invited Talk (Topic: Data-Centric AI) (Feb 2023)
AstraZeneca AI Journal Club Invited Talk (Topic: Data-Centric AI) (Nov 2022)
Queen Mary University London CogSci Invited Talk (Topic: Data-Centric AI) (Oct 2022)
University of Cape Town Invited Talk (Topic: Data-Centric AI) (Oct 2022)