Meet the ESRs: Daria Morozova

Buongiorno da Roma (good morning from Rome)!

My name is Daria Morozova and I am currently the ESR hired by Pangea Formazione within the «INSIGHTS» Innovative Training Network. Without a doubt, the Network is a great opportunity for a young researcher to contribute to science and society. I would be really glad to share with you all the details of this amazing journey and keep you updated on the highlights of every step of the program.

The research project I am involved in is carried out in Rome. It focuses on applying the latest Machine Learning and Deep Learning techniques to image and sound recognition. In particular, my goal is twofold: on the one hand, to estimate traffic through crossroads and to identify special-class vehicles (e.g. police cars and ambulances) in order to prioritize them; on the other hand, to develop a tool to coordinate and synchronize a drone swarm for emergency services, especially in search-and-rescue scenarios. This will be done using audio and video data streams collected by sensors on Unmanned Aerial Vehicles (or «UAVs») in order to detect other UAVs in the surroundings for collision avoidance (even during loss of ground communication!), and to detect search-and-rescue targets.

About me: I was born and raised in Moscow, which is the northernmost and coldest megacity on Earth. I completed a 5-year Specialist’s Degree program in Applied Mathematics and Information Theory at Lomonosov Moscow State University and a Master’s Degree in Economics at the National Research University «Higher School of Economics». I also had the chance to study abroad: a 5-month stay at the Catholic University of the Sacred Heart in Milan, Italy, gave me the opportunity to improve my linguistic and intercultural skills and facilitated my relocation to Rome. 🙂

(picture from: versus.com)

In the following blog posts I am going to present current events and a step-by-step account of how to carry out an exciting project: stay tuned!

See you soon!

(written by Daria Morozova)


Data-driven decision making

In my first post about Pangea Formazione (PangeaF in the following), I mentioned a few times that our company has made it its mission to help other companies make good use of the data they own, in order to move towards a data-driven decision process.

Is this really something useful and/or needed? In fact, it is. 

Since the late 70s, plenty of studies have revealed the huge impact that biases and heuristics can have on our quantitative decisions, not because of a lack of expertise or plain ignorance, but because of the way the human brain has evolved. A typical example is the so-called “framing effect”, studied by Kahneman and Tversky in the early 80s [1].

Daniel Kahneman (picture from: wikipedia.org)

Two separate groups of participants were each presented with a different framing of the same scenario: the outbreak of an Asian epidemic that would affect six thousand people. Participants were asked to choose between two possible courses of action, based on their rational preferences. The first group was presented with the following choices:

  • with plan A, 2000 people will be saved;
  • with plan B, there is a 1/3 probability of saving 6000 people (everybody), and a 2/3 probability that nobody is saved.

The second group was presented with the following choices:

  • with plan C, 4000 people will die;
  • with plan D, there is a 1/3 probability that nobody dies, and a 2/3 probability that 6000 people (everybody) die.

PLAN A: 2000 saved.
PLAN B: a 33% chance of saving all 6000 people, a 66% chance of saving no one.

PLAN C: 4000 dead.
PLAN D: a 33% chance that no people will die, a 66% chance that all 6000 will die.

What has been observed both in the original experiment and in many replications is that in the first case around 70% of the participants prefer plan A, while in the second case almost 80% of the participants prefer plan D. But plan A is the same as plan C, and plan B is the same as plan D! The only change is in the frame used to present the decision-making problem, which affects the choice much more than any rational decision-making theory would allow. [*]
The problem is that the description of the experiment in the two settings triggers different areas of our brain: when the choice is presented in terms of gains (first group), mechanisms of risk-aversion take precedence, while when the choice is presented in terms of losses (second group), we are much more inclined to choose a risky option because of loss-aversion.

Other examples can be found in Kahneman’s book “Thinking, Fast and Slow” [2], which the famous psychologist and 2002 Nobel laureate in Economic Sciences wrote to present the results of decades of experiments on the psychology of judgment and decision-making, as well as behavioral economics.

And this is not just an example taken from some psychological study to “push our agenda”, with no true impact on the business world: it is something that is continuously seen in action. A study that monitored public, private and non-profit companies throughout the USA, Europe and Canada for more than 20 years [3] has shown that typically 50% of business decisions end in failure, 33% of all decisions made are never implemented, and half of the decisions which do get implemented are discontinued after 2 years. One of the causes of this (depressing) trend is the fact that in two cases out of three, choices are made based either on failure-prone methods or on fads that are popular but not backed by actual evidence.
In several cases it has also been shown that failure-prone methods are still followed because of the difficulty of dealing correctly with the uncertainties that are intrinsic to decision-making processes in strategic and business contexts.

There are several types of uncertainty which can affect a decision-making process: factors that there is no time or money to monitor effectively, factors that are outside our control, like competitors’ moves or other stakeholders’ decisions, and factors that are truly random and unexpected and that can lead the same decision to very different results. Uncertainty assessment is a critical element in such a scenario, and we always find it surprising to see how often it is underestimated: typically, it is only considered when assessing the global risk level of a production process, or “a posteriori” when a decision has undesired outcomes.

The difficulties described above in evaluating uncertainties quantitatively are entirely in line with the psychological research we mentioned, but there also seems to be an additional inertia against adopting software-based tools that could provide more coherent and consistent probability evaluations in different scenarios.

What can be done to address such problems? How can we improve our skills in dealing with uncertainties? We will provide a possible answer in the next post, which shall complete the overview of the main points of the approach followed by PangeaF when implementing software solutions to support decision making processes.

Stay tuned!

[*] On a side note, you may notice that the expected value of each plan is always the same (2000 people saved on average), so that, assuming human choices follow a model based on perfect information, and defining rationality along the lines of von Neumann & Morgenstern’s game theory, we would conclude that any “rational” decision maker is indifferent among the four possible plans.
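
The arithmetic behind this side note can be checked in a few lines. The following is a minimal Python sketch, not part of the original study: the dictionary layout and the function name expected_saved are illustrative choices, and plans C and D are converted from deaths to people saved out of the 6000 affected.

```python
from fractions import Fraction

POPULATION = 6000  # number of people affected in the scenario described above

# Each plan is a list of (probability, number of people saved) outcomes.
# Plans C and D are stated in terms of deaths, so saved = POPULATION - deaths.
plans = {
    "A": [(Fraction(1), 2000)],
    "B": [(Fraction(1, 3), 6000), (Fraction(2, 3), 0)],
    "C": [(Fraction(1), POPULATION - 4000)],
    "D": [(Fraction(1, 3), POPULATION - 0), (Fraction(2, 3), POPULATION - 6000)],
}


def expected_saved(outcomes):
    """Expected number of people saved: sum of probability * outcome."""
    return sum(p * saved for p, saved in outcomes)


# Hypothetical helper mapping: plan name -> expected number of people saved.
expected = {name: expected_saved(outcomes) for name, outcomes in plans.items()}

for name, value in expected.items():
    print(f"Plan {name}: expected number of people saved = {value}")
```

Exact fractions are used instead of floating point so that 1/3 and 2/3 do not introduce rounding noise into the comparison between plans.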

Bibliography

[1] A. Tversky & D. Kahneman, The Framing of Decisions and the Psychology of Choice, Science 211 (4481), 453-458 (1981). doi:10.1126/science.7455683.

[2] D. Kahneman. Thinking, Fast and Slow. Farrar, Straus and Giroux, New York, 2011. ISBN: 0374533555

[3] P. C. Nutt. Why Decisions Fail. Berrett-Koehler Publishers, Oakland, California, 2002. ISBN: 1576751503

Meet the ESRs: Nathan Simpson

Hey, I’m Nathan. I’m studying for a PhD in particle physics at Lund University (Sweden), specialising in statistics and machine learning. It’s awesome.

(:

Facts about me:

  • I’m the self-appointed videographer of the INSIGHTS network. I’ll be vlogging our training events to show you how cool it is to be part of a training network, and to showcase my wonderful colleagues ^^
  • My hair color is a non-linear function of time.
  • I’m British. Love me a nice chips and gravy.
  • If I could sum myself up in one GIF, I would use this one:
Bongo cat + GameCube = Nathan + c

On an academic level, I’m interested in Bayesian statistical methods, e.g. nested sampling, and applying them to everything physics. By Bayesian, I mean methods that update your prior beliefs about a thing in the light of some data on that thing. This is in contrast with frequentist methods, which try to take a purely ‘data-driven’ approach, telling you about the expected outcome of an experiment in the limit of many identical experiments.
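
To make the contrast concrete, here is a minimal sketch in Python of a Bayesian update. Everything in it is an illustrative assumption and not taken from an actual analysis: the coin-flip style data (7 successes in 10 trials), the uniform Beta(1, 1) prior, and the helper names update_beta and beta_mean. It uses the standard Beta-Binomial conjugate update, where observed successes and failures are simply added to the prior parameters.

```python
def update_beta(prior_alpha, prior_beta, successes, failures):
    """Conjugate Beta-Binomial update: add the observed counts to the prior."""
    return prior_alpha + successes, prior_beta + failures


def beta_mean(alpha, beta):
    """Mean of a Beta(alpha, beta) distribution."""
    return alpha / (alpha + beta)


# Assumed prior belief about a success probability: uniform, i.e. Beta(1, 1).
prior = (1.0, 1.0)

# Hypothetical data: 7 successes and 3 failures in 10 trials.
posterior = update_beta(*prior, successes=7, failures=3)

prior_mean = beta_mean(*prior)
posterior_mean = beta_mean(*posterior)

# For comparison, the usual frequentist point estimate is the observed fraction.
frequentist_estimate = 7 / 10

print(f"Prior mean:           {prior_mean:.3f}")
print(f"Posterior mean:       {posterior_mean:.3f}")
print(f"Frequentist estimate: {frequentist_estimate:.3f}")
```

With so little data the prior still pulls the posterior mean slightly away from the observed fraction; with more trials the two numbers would move closer together.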

When I ask people in particle physics whether they are Bayesian or frequentist, people often reply along the lines of ‘I use whichever one yields the best result’. I would argue that the two schools of statistics answer fundamentally different questions, so it’s worth sitting down and deciding on a philosophical level the questions you want to ask about your data. More on this to follow in future posts :]

Please enjoy this picture of me dressed as a Christmas tree (left), courtesy of the departmental secret Santa.

They called me lil’ treezus at school. At least I like to think so.

When I’m not doing any of this stuff, I make music. I’m releasing one song a week through a project called riverbubble if you want to pass time on a rainy day.

I look forward to making content for you in the future :3

Pangea Formazione INSIGHTS @ Rome

Hi all, 
here is a first short overview of the only private beneficiary of the INSIGHTS ITN: Pangea Formazione, an SME based in Rome (PangeaF in the following).

Pangea Formazione S.r.l. (Rome, Italy)
founded in 2009
innovative SME    
research institute recognized by  
certified UNI EN ISO 9001:2015 (development of software tools for predictive models)
team of around 15 people, mostly with a background and a Ph.D. in quantitative sciences (Physics, Mathematics, Engineering, etc.)

PangeaF was created in 2009, on the initiative of Paolo Agnoli and Francesco Piccolo, its founding members, with the aim of promoting the importance of exploiting available data to support business decision making.

At the beginning, training courses were PangeaF’s main trademark, to encourage the use of appropriate statistical tools to deal with the uncertainties that are naturally embedded in business decisions. To this end, PangeaF created a network of connections and collaborations with several researchers at universities in Rome, Milan, Venice, Naples and Pavia. During this period, managers from private companies and public institutions who attended these courses asked PangeaF to apply the techniques presented to their own real business problems. Those studies soon evolved into algorithm modeling and software development activities, which nowadays constitute the core business of the company.

As of 2019, PangeaF has several active projects of software development and management consulting with different companies such as TIM, Poste Italiane, DHL, MBDA, ENEL, etc., while retaining important training activities on both data-driven decision making (higher level courses for top and middle managers) and machine learning techniques through open source languages like R, Python and Scala (technical courses for data scientists and IT personnel). Helping other companies to use their own data, as well as open data sources, to improve their decision making processes is still our main mission and both software development and training activity are the means to reach it effectively.

A third way to accomplish our mission is to devote effort to developing brand new algorithmic and software solutions for problems which we believe will soon be relevant for business processes. This led our company to spend time and resources on advanced R&D activities, including the development of optimal control strategies for swarms of drones with a common objective and of deep learning techniques to analyze video data sources to detect and classify objects of interest. While performing this research, we created new connections with researchers from the academic world where similar problems are studied, and we realized there were plenty of opportunities for training young researchers with an interest in applied data analysis techniques on such topics. This was the link that brought PangeaF into the orbit of INSIGHTS, while it was still a preliminary proposal for an EC training network, and made our company the natural choice to lead the work package on “Statistics for Society”.

Of course, we are very happy about the opportunity to join the ITN: thanks to INSIGHTS, we now have a new member of our team, Daria Morozova. Our ESR moved from Moscow to Rome in September 2018 to work on her INSIGHTS project: developing a common framework to integrate video and sound data analysis techniques towards different goals. We foresee that such a framework will be useful for applications like smart mobility and coordinated drone control, but there are plenty of other fields that could benefit from a similar tool. Once all the building blocks are in place, and it becomes easier to stack pre-trained deep learning networks together with customized ones and with Bayesian networks, we will be able to start playing with it in different contexts and see how everything plays out.

Daria Morozova
  • ESR
  • Specialist (5 yrs) in Applied Mathematics and Cybernetics at MSU (Moscow, Russia)
  • Master in Economics at HSE University (Moscow, Russia)
  • At PangeaF since 09/2018.
Fabio S. Priuli
  • Supervisor and PI
  • Ph.D. in mathematics at SISSA-ISAS (Trieste, Italy)
  • 8 years as post-doc in applied math
  • At PangeaF since 04/2015 as data scientist, project manager and head of training activities.
Sara Borroni
  • Co-Supervisor
  • Ph.D. in physics at University of Rome “Sapienza” (Rome, Italy)
  • 4 years as post-doc
  • 4 years as data scientist and project manager
  • At PangeaF since 03/2015.

For the moment, that’s all.
More posts will follow soon, presenting some of the exciting machine learning applications we developed for industrial process management, and some of the latest statistical techniques that we have found very useful in this context.

University of Edinburgh

Welcome to the home of Peter Higgs, Nobel prize winner and the physicist who predicted the Higgs boson.

Edinburgh is the capital of Scotland and a very nice place to live, especially if you prefer rugby over football. You will find the Scots friendly and welcoming, although sometimes you may struggle to understand them. The city has all the services you may need, including theatres, cinemas, independent restaurants and much more. Public transport will take you anywhere, including to the airport, which connects the city to many major cities in Europe and beyond. We even have a direct flight to Geneva!

The University is a major player in the city’s recent development, making a strong effort to create new connections between academia and industry. The recently opened Bayes Centre is a new building on the town campus that will enhance the University’s capability to develop multidisciplinary projects centred on the use of advanced statistical tools in a variety of fields. Why does this sound familiar? Ah, yes, that’s what we do!

The University has several campuses, as it could no longer be hosted in the city centre alone. The School of Physics and Astronomy is based on the King’s Buildings campus, which is located 20 minutes by bus from the city centre. The campus hosts many schools and departments; we are in the James Clerk Maxwell Building (JCMB). The school is made up of 3 institutes and 5 research centres, including the Higgs Centre for Theoretical Physics. We are part of the Institute for Particle and Nuclear Physics, which is itself divided into 3 research groups (Nuclear Physics, Particle Physics Experiment and Particle Physics Theory). Our ATLAS team is part of the Particle Physics Experiment group and consists of 3 academic staff (soon to be 5), 6 researchers and 7 PhD students.

The group, unsurprisingly, focuses its efforts on measurements of the Higgs boson, but we now also have activities in the Exotic group and in the Top group of the ATLAS experiment. Our technical contribution is mainly the development of the Simulation, with a focus on fast simulation, and Trigger activities.

Serena Palazzo, our ESR, will mainly work on differential cross-section measurements of top pair production. In line with the group activities, she will also work on fast simulation, but she will give a Machine Learning spin to it. We will use the ever more popular GANs to speed up the ATLAS simulation by several orders of magnitude. Serena will also have a significant period of secondments in companies working with financial data that she will use to improve fraud detection techniques.

Stay tuned for more updates from us!