Fault tolerance in dynamic distributed systems

Les Grandes Conférences du LIG - The LIG Keynote Speeches
LIG
Pierre SENS
Thursday, April 5, 2018
"Technical production: Antoine Orlandi | All rights reserved"

Pierre Sens obtained his Ph.D. in Computer Science in 1994 and the “Habilitation à diriger des recherches” in 2000 from Paris 6 University (UPMC), France. He is currently a full Professor at Sorbonne Université. His research interests include distributed systems and algorithms, large-scale data storage, fault tolerance, and cloud computing. Pierre Sens heads the Delys group (previously Regal), a joint research team between LIP6 and Inria Paris. He has been a member of the program committees of major conferences in the areas of distributed systems and parallelism (DISC, ICDCS, IPDPS, OPODIS, ICPP, Europar,…) and serves as General Chair of SBAC and EDCC. Overall, he has published over 140 papers in international journals and conferences and has advised 22 PhD theses.

 

Abstract:

Distributed systems are increasingly versatile: computing units can join, leave, or move within a global infrastructure. These features call for dynamic systems that cope autonomously with changes in their own structure. It therefore becomes necessary to define, develop, and validate distributed algorithms able to manage such dynamics at large scale.
Failure detection is a prerequisite to failure mitigation and a key component in building distributed algorithms that require resilience. We introduce the problem of failure detection in asynchronous networks, where transmission delays are not known. We show how distributed failure detector oracles can be used to address fundamental problems such as consensus, k-set agreement, and mutual exclusion. We then focus on recent advances and open issues in taking the dynamics of the infrastructure into account.
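
To make the idea of a failure detector oracle concrete, here is a minimal sketch of a heartbeat-based detector with adaptive timeouts, a common textbook construction. It is not taken from the talk. The class name, method names, and parameters (`initial_timeout`, `increment`) are hypothetical, and time is passed in explicitly so the behaviour is deterministic. Because transmission delays are unknown, a suspicion can be wrong; the sketch responds to each false suspicion by enlarging the timeout for that process.

```python
class HeartbeatFailureDetector:
    """Illustrative timeout-based failure detector (assumed design, not the talk's).

    A process is suspected when its last heartbeat is older than its
    current timeout. A heartbeat from a suspected process reveals a
    false suspicion, so that process's timeout is increased.
    """

    def __init__(self, processes, initial_timeout=1.0, increment=1.0, start=0.0):
        self.timeout = {p: initial_timeout for p in processes}
        self.last_heard = {p: start for p in processes}
        self.suspected = set()
        self.increment = increment

    def on_heartbeat(self, process, now):
        """Record a heartbeat received from `process` at time `now`."""
        self.last_heard[process] = now
        if process in self.suspected:
            # The earlier suspicion was a mistake: trust the process again
            # and wait longer for it next time.
            self.suspected.discard(process)
            self.timeout[process] += self.increment

    def check(self, now):
        """Suspect every process whose heartbeat is overdue; return the suspects."""
        for process, heard in self.last_heard.items():
            if now - heard > self.timeout[process]:
                self.suspected.add(process)
        return set(self.suspected)
```

A caller would invoke `on_heartbeat` whenever a heartbeat message arrives and `check` periodically. If delays eventually stay below some bound, the timeouts stop growing and correct processes stop being suspected, while a crashed process sends no further heartbeats and remains suspected.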