Welcome to my webpage!
	
	
	I am a Senior Research Scientist at Owkin. Our goal is to understand complex biology through AI, bringing together domain expertise, traditional biostatistical methods and advanced machine learning techniques. Ultimately, we aim to develop cutting-edge precision medicine tools.
	
	
	Before that, I was a Postdoctoral Researcher at Telecom Paris, working in the S2A
	research team (Signal, Statistics and Learning), which is part of the LTCI Laboratory (Communication and Information Theory),
	inside the department IDS (Image, Data and Signal). 
 
	
	I worked on Machine Learning, Deep Learning and Transfer Learning, with a particular
	interest in Trustworthy AI: biases, fairness and robustness. In line with these subjects, I worked in collaboration with Idemia to reduce the demographic biases in face recognition.
	
	 I also co-founded a startup, althiqa, together with Victor Storchan. We are dedicated to AI Evaluation and to simplifying AI reporting for data scientists.
	Don't hesitate to reach out if you have any questions!
	
	 I completed a PhD in probability theory at Modal'X, under the supervision of Nathanaël Enriquez and Laurent Ménard. I worked on random matrix and random graph theory. Here is the manuscript.
	
	You can find a detailed CV here. 
	
	Contact
	Telecom Paris
	Bureau 5C14 
	19 place Marguerite Perey 
	91120 Palaiseau, France 
 
	 Email: noirynathan 'at' gmail.com 
	
	 Research 
	
		My research interests lie at the intersection between probability, statistics and machine learning.
		The topics I work on, or have worked on, include:
	
		-  Transfer Learning, Covariate Shift; 
-  Biases and Fair Learning in NLP and Computer Vision; 
-  Supervised and Self-supervised Contrastive Learning; 
-  Matching and Online Matching; 
-  Spectra of large random graphs; 
-  Eigenvalues and eigenvectors of large deformed random matrices; 
-  Asymptotic analysis of exploration algorithms on sparse random graphs. 
 Ongoing works / Under review 
		  A fast softmax-based adversarial attack detector 
		with Marine Picot, Pablo Piantanida and Pierre Colombo.
		
		  A Functional Perspective on Multi-Layer Out-of-Distribution Detection 
		with Eduardo D. C. Gomes, Pierre Colombo, Guillaume Staerman and Pablo Piantanida.
		
		  A simple unsupervised data depth-based method to detect adversarial images 
		with Marine Picot, Guillaume Staerman, Federica Granese, Francisco Messina, Pablo Piantanida and Pierre Colombo.
		
		  Toward Stronger Textual Attack Detectors 
		with Pierre Colombo, Marine Picot, Guillaume Staerman and Pablo Piantanida.
		
		  The Glass Ceiling of Automatic Evaluation in Natural Language Generation 
		with Pierre Colombo, Maxime Peyrard, Robert West and Pablo Piantanida.
		
		  A Novel Information Theoretic Objective to Disentangle Representations for Fair Classification 
		with Pierre Colombo, Guillaume Staerman and Pablo Piantanida.
		
	
	 Papers 
	
	
		-   What are the best systems? New perspectives on NLP Benchmarking 
		with Pierre Colombo, Ekhine Irurozki and Stéphan Clémençon.
		NeurIPS (2022). Links: journal.
		
-   Beyond Mahalanobis-Based Scores for Textual OOD Detection 
		with Pierre Colombo, Eduardo Gomes, Guillaume Staerman and Pablo Piantanida.
		NeurIPS (2022). Links: journal.
		
-   Mitigating Gender Bias in Face Recognition Using the von Mises-Fisher Mixture Model 
		with Jean-Rémy Conti, Vincent Despiegel, Stéphane Gentric and Stéphan Clémençon.
		ICML (2022). Links: journal.
		
-   Learning Disentangled Textual Representations via Statistical Measures of Similarity 
		with Pierre Colombo, Guillaume Staerman and Pablo Piantanida.
		ACL oral (2022). Links: journal.
		
-   Online Matching in Sparse Random Graphs: Non-Asymptotic Performances of Greedy Algorithm 
		with Vianney Perchet and Flore Sentenac.
		NeurIPS (2021). Links: pdf, arXiv, journal.
		
-   Learning to Rank Anomalies: Scalar Performance Criteria and Maximization of Two-Sample Rank Statistics
		with Myrto Limnios and Stéphan Clémençon.
		LIDTA (2021). Links: journal.
		
-   Learning from Biased Data: A Semi-Parametric Approach
		with Patrice Bertail, Stéphan Clémençon and Yannick Guyonvarch.
		ICML (2021). Links: pdf.
		
-   Large deviations for spectral measures of some spiked matrices  
		with Alain Rouault. 
		RMTA (2021). Links: pdf, arXiv.
		
-   A solvable class of renewal processes and its applications 
		with Nathanaël Enriquez.
		Electronic Communications in Probability, Volume 25 (2020). 
		Links: pdf, arXiv, journal.
		
-   Depth First Exploration of a Configuration Model 
		with Nathanaël Enriquez, Gabriel Faraud and Laurent Ménard.
		EJP (2022). Links: pdf, arXiv.
		
-   Spectral Measures of Spiked Random Matrices   
		Journal of Theoretical Probability (2020). Links: pdf, arXiv.
		
-   Spectral asymptotic expansion of Wishart matrices with exploding moments   
		ALEA Latin American Journal of Probability and Mathematical Statistics, Volume XV, Number 2 (2018), pp. 897-911. Links: pdf, journal.
		
 
	
	 Others 
	
	
		-  I gave a talk at the Dataiku seminar on OOD detection in the age of Transformers. You can find it here. 
		
-  For the International Conference on Machine Learning (ICML), I gave a short talk about the article Learning from Biased Data: A Semi-Parametric Approach. You can find it here.  
		You can also have a look at the poster. 
		
-  For the conference Random Matrices and Random Graphs, organized by the GDR MEGA (Matrices Et Graphes Aléatoires) at the CIRM (2019), I made a poster on spectral measures of spiked random matrices. You can find it here. 
		
-  I gave a talk at the 2018 edition of the conference Les probabilités de demain. It was a short introduction to the Wigner and Marchenko-Pastur laws and their generalizations, in connection with my work on Wishart matrices with exploding moments. You can find the recording (in French) here.
	
 Teaching 
	
	
		-  2020-2021: Supervision of a group of students in internship at Safran   
		            For the Masters Big Data and Artificial Intelligence, Télécom Paris. 
			        Subjects: Active Learning, Semantic Segmentation, Sampling...
		 
-  2021-2022: Machine Learning Practical Sessions with Python   
		            For the Masters Big Data and Artificial Intelligence, Télécom Paris. 
			        Among others: k-NN, LDA, Logistic Regression, SVM, Boosting, Random Forests, NMF...
		 
-  2020-2021: Supervision of a group of students in internship at Sicara   
		            For the Masters Big Data and Artificial Intelligence, Télécom Paris. 
			        Subjects: Computer Vision, Detection of edges, Detection of keypoints...
		 
-  2020-2021: Supervision of a group of students in internship at Air Liquide   
		            For the Masters Big Data and Artificial Intelligence, Télécom Paris. 
			        Subject: Causal Machine Learning and Counterfactual Analysis for medical applications.
		 
-  2020-2021: Machine Learning Practical Sessions with Python   
		            For the Masters Big Data and Artificial Intelligence, Télécom Paris. 
			        Among others: k-NN, LDA, Logistic Regression, SVM, Boosting, Random Forests, NMF...
		 
-  2019-2020: Statistical Methods Tutorials   
		            For bachelors in mathematics and economics, Paris Nanterre. 
			        Among others: descriptive statistics, hypothesis testing...
		 
-  2017-2019: Algebra, Analysis and Optimization Tutorials   
		            For bachelors in mathematics and economics, Paris Nanterre. 
			        Among others: algebraic structures, multivariate calculus, Lagrange’s method...
		 
 
	 Miscellaneous 
	Organization duties
	
	
		-  Co-organization of the meetups of the Data Science and Artificial Intelligence for Digitalized Industry and Services (DSAIDIS) chair, where academic researchers in Machine Learning present their work to the industrial partners of Telecom Paris.
		
-  With Laure Dumaz and Guillaume Barraquand, I co-organize the monthly seminar of the GDR MEGA. More precisely, I am responsible for the morning session, which consists of a mini-course (1h30) intended for PhD students. For more information about the seminar, you can go to the dedicated website. 
		
Writings