Università degli Studi di Roma “Tor Vergata”
Department of Civil Engineering and Computer Science Engineering

Projects
Distributed Systems and Cloud Computing course (Sistemi Distribuiti e Cloud Computing, SDCC)
Academic year 2014/15
Valeria Cardellini

Project choice and validity
• Communicate the following information by June 21, 2015:
  – group members (names and emails)
  – chosen project
• There is a maximum number of reservations for each project type
• Promptly communicate and agree on any changes to the group
• The project is valid only for academic year 2014/15

Valeria Cardellini - SDCC 2014/15

Project submission and discussion
• Submission: about ten days before the project discussion
  – When to submit? During the academic year; there are no predefined dates
  – What to submit? A CD with the code, the report, and any experimental results; the report (no code) also in hard copy
  – The report should preferably be written as a scientific paper (max 12 pages in ACM or IEEE format)
• Discussion: all group members on the same date
  – What to present? Each group member presents one part of the project with slides (max 10 minutes per student)
  – Questions and testing of the system/service during the discussion

Common requirements for all projects
• You can choose the programming language
• You can use either Amazon Web Services (AWS) through the AWS Educate program or IBM Bluemix
• You can use support libraries and tools to develop your project (of course, they should not overlap with the project goals!)
  – Be careful: their use must be properly mentioned in the project report
• System/service with configurable parameters (no hard-coded values!)
  – Through a configuration file/service
• You must test all the functionalities of your developed system/service, and present and discuss the testing results in the project report

Common requirements for all projects (2)
• System/service supports multiple, autonomous entities contending for shared resources
• System/service supports real-time updates to some form of shared state
• System/service state should be distributed across multiple client or server nodes
  – The only allowed centralized service is one that supports users logging on, adding or removing clients or servers, and other housekeeping tasks
• System/service scalability and elasticity
• System/service fault tolerance: in particular, the system/service continues operation even if one of the participant nodes crashes (optionally, it recovers the state of a crashed node so that the node can resume operation)

Overview of projects
• Project 1: DEBS 2015 Grand Challenge - 2/3 students per team
• Project 2: Log Analysis - 2/3 students per team
• Project 3: Elastic File System in the Cloud - 3 students per team
• Project 4: Key-Value Store Service for Mobile Cloud and IoT - 2/3 students per team
• Project 5: Your Own Distributed System using Consensus Protocols - 2/3 students per team
• Project 6: Cloud-based application/service using IBM Bluemix - 2/3 students per team

Project 1
• DEBS 2015 Grand Challenge
  http://www.debs2015.org/call-grand-challenge.html
  For data visualization see: https://vimeo.com/31298658
• Solve two queries:
  – Query 1: Frequent routes
  – Query 2: Profitable areas

Project 1: platform
• Use Apache Storm or some other distributed stream processing (DSP) system
  – The DSP system can be managed on AWS EC2 instances or IBM Bluemix VMs
  – Amazon Kinesis might not be included in the AWS Educate grant

Project 2
• Log file service
• Design and realize an application for log file acquisition and analysis (e.g., Web logs) using both:
  – Batch processing (e.g., MapReduce)
  – Data stream processing
• Which kind of analysis? It is your choice!
  – Some basic requirements will be provided in the detailed project description

Project 2: platform
• For batch processing: use the Apache Hadoop framework
  – Also available as an Amazon Web Service (i.e., AWS Elastic MapReduce, EMR)
• For data stream processing: use Apache Storm or some other distributed stream processing (DSP) system
  – The DSP system can be installed and managed on AWS EC2 instances or IBM Bluemix VMs
  – Amazon Kinesis might not be included in the AWS Educate grant

Project 2: details
• Propose your logging service by June 21
  – In 1 page, briefly describe your project’s idea
  – The number of students in the team depends on the chosen service

Project 3
• Elastic file system in the Cloud
• Idea: a distributed file system that supports application scale-up and scale-down
  – Applications tag files with different levels of consistency and persistence (only certain files and directories are to be kept consistent across VMs)
  – The file system leverages the elastic nature of VM instances
  – Scale-down can be challenging because stateful services need state reintegration (if a file is divided into chunks, the chunks have to be properly moved/replicated before the corresponding VM is retired)

Project 4
• Key-Value Store Service for Mobile Cloud and IoT
  – Idea: provide a simple key-value store for mobile applications and low-energy sensors
  – With this service, an application can store and retrieve application-wide state information as text by using key-value pairs

Project 5
• Design, implement, and test a distributed system by implementing your own application, such as a multi-player game or a transaction system
• Requirements: those already listed, plus you must use a distributed consensus protocol (e.g., Paxos or 2PC)

Project 5: details
• Which kind of system? It is your choice!
• Propose your distributed system by June 21
  – In 1 page, briefly describe your project’s idea
  – The number of students in the team depends on the chosen system

Project 6
• Design, implement, and test a distributed application/service using IBM Bluemix

Project 6: details
• Which kind of application/service? It is your choice!
• Propose your distributed system by June 21
  – In 1 page, briefly describe your project’s idea
  – The number of students in the team depends on the chosen application/service
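
Illustrative sketch (not part of the original slides)
Project 5 asks for a distributed consensus protocol and names 2PC as one option. The following is a minimal, single-process sketch of the two-phase commit idea, assuming in-memory participants instead of real networked nodes. The names `Participant`, `Vote`, `two_phase_commit`, and the `can_commit` flag are hypothetical and chosen only for illustration; timeouts, durable logging, and crash recovery are deliberately omitted.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Dict, List, Optional


class Vote(Enum):
    COMMIT = "commit"
    ABORT = "abort"


@dataclass
class Participant:
    """Hypothetical in-memory participant; a real one would be a remote node."""
    name: str
    can_commit: bool = True  # assumption: simulates the local validation result
    state: Dict[str, int] = field(default_factory=dict)
    staged: Optional[Dict[str, int]] = None

    def prepare(self, update: Dict[str, int]) -> Vote:
        # Phase 1: stage the update and vote; nothing becomes visible yet.
        if not self.can_commit:
            return Vote.ABORT
        self.staged = dict(update)
        return Vote.COMMIT

    def commit(self) -> None:
        # Phase 2 (commit): apply the staged update to the visible state.
        if self.staged is not None:
            self.state.update(self.staged)
            self.staged = None

    def abort(self) -> None:
        # Phase 2 (abort): discard whatever was staged.
        self.staged = None


def two_phase_commit(participants: List[Participant], update: Dict[str, int]) -> bool:
    """Coordinator logic: commit only if every participant votes COMMIT."""
    votes = [p.prepare(update) for p in participants]
    if all(v is Vote.COMMIT for v in votes):
        for p in participants:
            p.commit()
        return True
    for p in participants:
        p.abort()
    return False


# Usage: one participant refuses, so no node applies the update.
nodes = [Participant("a"), Participant("b"), Participant("c", can_commit=False)]
refused = two_phase_commit(nodes, {"score": 10})

# Usage: all participants agree, so every node applies the update.
healthy = [Participant("a"), Participant("b")]
accepted = two_phase_commit(healthy, {"score": 10})
```

In this sketch the coordinator is a single function call, so it cannot fail between the two phases. A project implementation would have to handle that case (and participant crashes) to meet the fault-tolerance requirement listed among the common requirements.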