Course on Big Data and
Artificial Intelligence in Materials Sciences
Introduction to artificial intelligence and machine-learning methods
Speaker: Luca M. Ghiringhelli
This introductory lecture gives a general overview of artificial intelligence methods, including machine-learning and data-mining methods. There is no hands-on notebook specifically associated with this lecture, but the concepts introduced here help link together the various techniques presented in the following lectures.
Regularized regression and kernel methods
Speaker: Santiago Rigamonti
In this tutorial, we will explore the application of kernel ridge regression to the prediction of materials properties.
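As a rough illustration of what the notebook covers, the sketch below fits a kernel ridge regression model with a Gaussian (RBF) kernel to synthetic data standing in for a materials dataset; the features, targets, and hyperparameter grid are placeholders, not the ones used in the tutorial.

```python
import numpy as np
from sklearn.kernel_ridge import KernelRidge
from sklearn.model_selection import GridSearchCV, train_test_split

# Synthetic "descriptor -> property" data as a stand-in for a real materials dataset.
rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(200, 2))          # two primary features per material
y = np.sin(X[:, 0]) + 0.5 * X[:, 1] ** 2 + rng.normal(0, 0.1, size=200)

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Kernel ridge regression with an RBF kernel; the regularization strength alpha
# and the kernel width gamma are selected by cross-validation.
model = GridSearchCV(
    KernelRidge(kernel="rbf"),
    param_grid={"alpha": [1e-3, 1e-2, 1e-1, 1.0], "gamma": [0.01, 0.1, 1.0]},
    cv=5,
)
model.fit(X_train, y_train)
print("best parameters:", model.best_params_)
print("test R^2:", model.score(X_test, y_test))
```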
Decision trees and random forests
Speaker: Daniel Speckhard
In this tutorial, we introduce decision trees. We first go through a toy model that introduces the scikit-learn API, and then discuss the theoretical aspects of trees step by step. We then train a regression tree and a classification tree on different datasets related to materials science, and end the tutorial by covering random forests and bagging classifiers.
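The short sketch below, using synthetic data rather than the tutorial's materials-science datasets, shows the scikit-learn calls involved in fitting a single regression tree and a random forest and comparing their test scores.

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeRegressor

# Synthetic regression data as a stand-in for a materials-science dataset.
X, y = make_regression(n_samples=500, n_features=5, noise=10.0, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# A single regression tree: easy to interpret, but prone to overfitting.
tree = DecisionTreeRegressor(max_depth=4, random_state=0).fit(X_train, y_train)

# A random forest: an ensemble of trees grown on bootstrap samples (bagging)
# with random feature subsets, which usually reduces the variance of a single tree.
forest = RandomForestRegressor(n_estimators=200, random_state=0).fit(X_train, y_train)

print("single tree R^2:   ", tree.score(X_test, y_test))
print("random forest R^2: ", forest.score(X_test, y_test))
```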
Artificial neural networks and deep learning
Speaker: Angelo Ziletti
In these tutorials, we introduce the basics of deep learning, via traditional multilayer perceptrons and modern convolutional neural networks.
Launch the notebook 1 Launch the notebook 2
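As a minimal, framework-agnostic illustration of the first notebook's topic, the sketch below trains a small multilayer perceptron with scikit-learn on its digits dataset; the actual notebooks may rely on a different deep-learning framework and also cover convolutional networks.

```python
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.preprocessing import StandardScaler

# A small multilayer perceptron (two hidden layers) on a toy image dataset.
X, y = load_digits(return_X_y=True)
X = StandardScaler().fit_transform(X)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

mlp = MLPClassifier(hidden_layer_sizes=(64, 32), activation="relu",
                    max_iter=500, random_state=0)
mlp.fit(X_train, y_train)
print("test accuracy:", mlp.score(X_test, y_test))
```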
Unsupervised learning
Speaker: Luigi Sbailò
In this tutorial, we introduce the most popular clustering algorithms. We focus on partitioning, hierarchical, and density-based clustering algorithms. The methods are tested on synthetic datasets of increasing complexity.
Launch the notebook 1 Launch the notebook 2
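A minimal sketch of the three families of methods on a synthetic "two moons" dataset is given below; the datasets and parameter choices in the notebooks will differ.

```python
from sklearn.cluster import DBSCAN, AgglomerativeClustering, KMeans
from sklearn.datasets import make_moons

# "Two moons": a simple case where density-based clustering succeeds
# while partitioning methods such as k-means struggle.
X, _ = make_moons(n_samples=300, noise=0.05, random_state=0)

labels_kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X)  # partitioning
labels_agglo = AgglomerativeClustering(n_clusters=2).fit_predict(X)             # hierarchical
labels_dbscan = DBSCAN(eps=0.2, min_samples=5).fit_predict(X)                   # density-based

print("k-means labels (first 10):      ", labels_kmeans[:10])
print("agglomerative labels (first 10):", labels_agglo[:10])
print("DBSCAN clusters found:          ",
      len(set(labels_dbscan)) - (1 if -1 in labels_dbscan else 0))
```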
Compressed sensing meets symbolic regression: SISSO
Speaker: Luca M. Ghiringhelli
In this tutorial, we will show how to find descriptive parameters to predict materials properties using symbolic regression combined with compressed-sensing tools. The relative stability of the zincblende (ZB) versus rocksalt (RS) structure of binary materials is predicted and compared against a model trained with kernel ridge regression.
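To convey the idea only, the toy sketch below builds a pool of symbolically combined candidate features and applies an l1-regularized (LASSO) fit as a simple stand-in for the sure-independence-screening and l0-based sparsifying steps used by SISSO; the primary features, target formula, and operator set are hypothetical.

```python
import numpy as np
from sklearn.linear_model import Lasso
from sklearn.preprocessing import StandardScaler

# Hypothetical primary features A, B and a hidden "true" formula for the target.
rng = np.random.default_rng(0)
A, B = rng.uniform(1, 2, 200), rng.uniform(1, 2, 200)
y = A / B**2 + rng.normal(0, 0.01, 200)

# Candidate feature pool built by symbolically combining the primary features.
candidates = {
    "A": A, "B": B, "A*B": A * B, "A/B": A / B,
    "A**2": A**2, "B**2": B**2, "A/B**2": A / B**2, "B/A**2": B / A**2,
}
names = list(candidates)
Phi = StandardScaler().fit_transform(np.column_stack([candidates[n] for n in names]))

# Sparsifying fit: only a few candidate features get non-negligible coefficients.
lasso = Lasso(alpha=0.01).fit(Phi, y)
selected = [n for n, c in zip(names, lasso.coef_) if abs(c) > 1e-3]
print("selected descriptor components:", selected)
```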
Subgroup discovery, rare-phenomena challenge, and domain of applicability
Speaker: Matthias Scheffler
In these tutorials, we introduce the subgroup discovery (SGD) method, which identifies rules (Boolean statements involving selected features among the given candidates) that describe exceptional subsets of the data. SGD is an exploratory analysis tool that identifies local structure in the data, while most ML tools focus on global models. A notable application of SGD is locating the domain of applicability of ML models, i.e., identifying the subsets of data on which the model is expected to yield significantly lower errors than on the overall dataset. A notebook on this specific application is also linked.
Launch the notebook 1 Launch the notebook 2
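The toy sketch below illustrates only the domain-of-applicability idea: it evaluates a single hand-picked Boolean rule and compares the model error inside the resulting subgroup with the error on the full test set, whereas an actual subgroup discovery run searches over many candidate rules with a quality function. All feature names and data are placeholders.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split

# Synthetic data in which the target is much noisier for x1 < 0,
# so the model should perform best inside the region x1 >= 0.
rng = np.random.default_rng(0)
df = pd.DataFrame({"x1": rng.uniform(-1, 1, 1000), "x2": rng.uniform(-1, 1, 1000)})
noise = np.where(df["x1"] < 0, 0.5, 0.05)
df["y"] = np.sin(3 * df["x1"]) + df["x2"] + rng.normal(0, noise)

train, test = train_test_split(df, random_state=0)
model = RandomForestRegressor(random_state=0).fit(train[["x1", "x2"]], train["y"])

# Compare the error inside one candidate rule with the overall error.
errors = np.abs(model.predict(test[["x1", "x2"]]) - test["y"])
rule = (test["x1"] >= 0).to_numpy()   # candidate rule: "x1 >= 0"
print("overall MAE:        ", errors.mean())
print("MAE inside the rule:", errors[rule].mean())
```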
Fusion of experimental and computational data by AI
Speaker: Lucas Foppa
In this tutorial, we provide an example of AI techniques combined with experiments: the AI is trained on an initial set of data, and the predictions of the trained model are used to guide experiments in a similar but distinct class of materials.
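A minimal sketch of such a loop, with entirely synthetic placeholder data, is given below: a model trained on already measured materials ranks not-yet-measured candidates from a related class, so that the most promising ones can be proposed for the next experiments.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

# Placeholder descriptors and measured property values for an initial set of materials.
rng = np.random.default_rng(0)
X_measured = rng.uniform(0, 1, size=(100, 4))
y_measured = X_measured @ np.array([1.0, -0.5, 2.0, 0.0]) + rng.normal(0, 0.05, 100)

# Train on the measured data.
model = RandomForestRegressor(random_state=0).fit(X_measured, y_measured)

# Rank candidates from a related but distinct (not-yet-measured) class of materials.
X_candidates = rng.uniform(0, 1, size=(50, 4))
predicted = model.predict(X_candidates)
top = np.argsort(predicted)[::-1][:5]   # candidates with the highest predicted property
print("suggest measuring candidates:", top)
```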