About Me

I am currently pursuing a PhD in Machine Learning Methods for Big Data, sponsored by the University of Portsmouth (UK). I obtained a Bachelor's degree in Computer Science from the University of Salerno (Italy) in summer 2014. During the Fall 2013 semester I was awarded a full scholarship by the Italian Ministry of Education, University and Research (MIUR), which allowed me to spend the Spring 2014 semester at Penn State University. In spring 2015 I took part in the Erasmus+ programme for research at the University of Portsmouth. In summer 2016 I joined Expedia Inc. as a Data Scientist Intern, working on the cold start problem for Recommender Systems (ranking new items added to the catalog), engineering new features that do not rely on historical data, and building machine learning pipelines over 200+ GB of data every day.

Contact Details

Alessio Petrozziello
School of Computing, Room 1.04
University of Portsmouth,
Portsmouth, PO1 3HE, UK.
Skype: alessiounisa


Research Interests

Missing Data Imputation - Dealing with missing data is an important step in dataset pre-processing, since most statistical analysis techniques, data reduction tools, machine learning methods and recommender systems require complete datasets. Many techniques can be used to deal with missingness, but the common aim is to make the most of the available data by minimizing the loss of statistical power and the bias inevitably introduced by inferring values for the missing data. A more interesting problem arises when the data grow so large that state-of-the-art techniques are no longer applicable because of the computational time and memory they require. Here, new distributed and online techniques need to be studied and developed to face this challenge.

Machine Learning for Big Data - Machine learning is well placed to exploit the opportunities hidden in big data. It can extract value from large and disparate data sources with far less reliance on human direction, since it is data-driven and runs at machine scale. It is well suited to the complexity of dealing with disparate data sources and with the huge variety of variables and amounts of data involved. Unlike traditional analysis, machine learning thrives on growing datasets: the more data fed into a machine learning system, the more it can learn and the higher the quality of the resulting insights. Freed from the limitations of human-scale thinking and analysis, machine learning is able to discover and display the patterns buried in the data.

Software Development Effort Estimation - Development effort is considered the dominant cost of software projects, so effort estimation is a critical activity for planning and monitoring software project development and for delivering the product on time and within budget. Significant over- or under-estimates can be very expensive, and the competitiveness of a software company depends heavily on the ability of its project managers to accurately predict in advance the effort required to develop software systems. Several methods have been proposed in the literature to estimate software development effort. Among them, widely employed estimation methods, e.g., Linear and Stepwise Regression, Regression Trees, and Case-Based Reasoning, try to explain the effort to develop a software system in terms of some relevant factors (named cost drivers). These methods exploit data from past projects, consisting of both the factor values that are related to effort and the actual effort spent to develop the projects, in order to estimate the effort for a new project under development. The main research topics related to software development effort estimation concern the definition and empirical evaluation of search-based approaches for building novel estimation models, and the definition and empirical evaluation of functional metrics for sizing software products.


PhD in Computational Intelligence: ongoing University of Portsmouth - Started in October 2015

Master's degree in Computational Intelligence: ongoing University of Salerno - Started in September 2014

Bachelor Degree in Computer Science: 1st with Honours University of Salerno - July 2014

Courses and Schools attended

International School on Mathematics “Guido Stampacchia” Workshop “Graph Theory, Algorithms and Application” (3rd Edition) Erice - September 8-16, 2014

International Summer School on Software Engineering (11th edition) University of Salerno - July 2014

Data Mining of satellite data for the study of natural hazards University of Salerno - September 2013

Work Experience


Data Scientist Intern Expedia Inc. - June 2016 to October 2016


Erasmus+ Traineeship at the University of Portsmouth (School of Computing) under the supervision of Dr. Ivan Jordanov. University of Portsmouth - March 2015 to July 2015

Research Assistant at Penn State University (Department of GeoInformatics) under the supervision of Dr. Guido Cervone. Penn State University - March 2014 to May 2014


International Conferences

  1. A. Petrozziello and I. Jordanov, "Column-wise Guided Data Imputation", 17th International Conference on Computational Science (ICCS 2017). To appear.

  2. A. Petrozziello and I. Jordanov, "Data Analytics for Online Traveling Recommendation System: A Case Study", IASTED's 36th International Conference on Modelling, Identification and Control (MIC 2017).

  3. I. Jordanov, N. Petrov and A. Petrozziello, "Supervised Radar Signal Classification", IEEE International Joint Conference on Neural Networks (IJCNN 2016).

  4. F. Sarro, A. Petrozziello, M. Harman, "Multi-Objective Effort Estimation", ACM 38th International Conference on Software Engineering (ICSE 2016).

International Journals

  1. A. Petrozziello and I. Jordanov, "Online Travel Recommendation System: Tackling the 'Long Tail' as a Multi-Objective Problem". (under review)

  2. A. Petrozziello and I. Jordanov, "Classifiers accuracy improvement based on missing data imputation", Journal of Artificial Intelligence and Soft Computing Research (2017)

  3. A. Petrozziello, G. Cervone, P. Franzese, S.E. Haupt, R. Cerulli, "Source Reconstruction of Atmospheric Releases with Limited Meteorological Observations Using Genetic Algorithms", Applied Artificial Intelligence Journal (2017)

The copyright of the papers is owned by the respective publishers. Personal use of the electronic versions here provided is permitted. However, permission to reprint/republish this material for advertising or promotional purposes or for creating new collective works for resale or redistribution to servers or lists, or to reuse any copyrighted component of this work in other works must be obtained from the publishers.



  1. 3 years full time PhD Bursary (£42000) University of Portsmouth - September 2015

  2. Erasmus+ Traineeship Scholarship (2400€) University of Salerno - February 2015

  3. Scholarship "Messaggeri della Conoscenza" (12000€) MIUR (Italian Ministry of Education, University and Research) - October 2013


  1. Azure Machine Learning Award ($20000) Microsoft - May 2016

  2. Azure Machine Learning Award ($20000) Microsoft - May 2015

  3. ISSSE2014 Travel Grant University College London (UK) - June 2014


  1. Intern Prize Award - Summer Hackathon 2016 on Smart Searches Hotels.com - September 2016

  2. Bronze Medal - 13th Annual "Humies" Awards for Human-Competitive Results Produced by Genetic and Evolutionary Computation ($2000) Genetic and Evolutionary Computation Conference (GECCO16) - July 2016

  3. Accenture Talent Digital Competition - Finalist Accenture - June 2015

  4. Young Scientist Award ($1000) The Italian Cultural Society of Washington D.C., Inc. - June 2014


Teaching Assistant

Academic Year 2016/2017

  1. Second term: Artificial Neural Networks and Genetic Algorithms (NENGA), BSc year 3, Dr. Ivan Jordanov, School of Computing, University of Portsmouth, Portsmouth (UK).

  2. First term: Advanced Programming Concepts (ADPROC), BSc year 2, Dr. Ivan Jordanov, School of Computing, University of Portsmouth, Portsmouth (UK).

Academic Year 2015/2016

  1. Second term: Artificial Neural Networks and Genetic Algorithms (NENGA), BSc year 3, Dr. Ivan Jordanov, School of Computing, University of Portsmouth, Portsmouth (UK).

  2. First term: Advanced Programming Concepts (ADPROC), BSc year 2, Dr. Ivan Jordanov, School of Computing, University of Portsmouth, Portsmouth (UK).



  1. Research Student University of Portsmouth - October 2015 / Now

  2. Co-Founding member of the “Geoinformatics and Earth Observation Laboratory” Penn State University - May 2014 / Now


  • "Random processes play an important role in evolution and, to some extent, in all things."

    I. Asimov