Food bank analytics for the Trussell Trust

We’ve been working with the Trussell Trust over the last 15 months to develop analytics for food banks. The Trust released its annual report on Friday, 15th April and received a lot of press coverage, which also picks up on the role of the University of Hull project funded by NEMODE.

The project is reported in the Huffington Post, the Daily Mirror, the Yorkshire Post, ITV News, the Guardian, and the Independent.

We reported on the project in The Conversation – reproduced here:

Food bank use is at a record high – here’s what we know about the people using them

Richard Vidgen, UNSW Australia and Giles Hindle, University of Hull

The story of Jesus feeding 5,000 people with just five loaves of bread and two fish takes some believing, especially when read by a modern audience that is used to a society of waste and want.

But while “waste not, want not” may well be the choice phrase for millions of parents at mealtimes, food banks across the UK are performing their own small miracles every day in making sure there is enough food to go round.

UK food bank use is still at record levels. Over a million food packages – each with three days’ worth of food – were given to people in crisis by the Trussell Trust in the last year alone.

The figures from the charity, which operates a network of more than 420 food banks, underline the scale of the challenge for those tackling poverty and point to a problem with hunger that’s not going away.

 

One in five parents in the UK are struggling to feed their children.
Kena Siilike/shutterstock

For the first time, academics from the University of Hull, working with data scientists from Coppelia and consultants from AAM Associates, developed a prototype tool to map food bank data against geographical demand. As well as showing actual food bank usage, the prototype uses 2011 Census data to predict possible areas of food bank need.

Researchers took various census variables, for example levels of deprivation and unemployment at ward level, and found that many of these were highly correlated with food bank usage per head of population. Food bank use was shown to be higher in wards where there are more people who are unable to work due to long-term sickness or disability.

Higher food bank use was also shown to be associated with deprived wards or areas with higher levels of people in skilled manual work.
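As a rough illustration of the kind of ward-level analysis described above, the sketch below computes a Pearson correlation between a census variable and food bank usage per head. The ward figures are invented purely for illustration (they are not Trussell Trust or Census data), the function and variable names are assumptions, and the project's actual method may well differ.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length (at least 2)")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# Invented wards: (population, food bank referrals, % long-term sick or disabled)
wards = [
    (10000, 50, 3.0),
    (8000, 96, 5.0),
    (12000, 156, 7.0),
    (9000, 189, 9.0),
]

# Usage per head removes the effect of ward size before correlating
usage_per_head = [referrals / population for population, referrals, _ in wards]
pct_sick = [pct for _, _, pct in wards]

r = pearson(pct_sick, usage_per_head)
print(f"r = {r:.2f}")
```

A high value of r here says only that the two measures move together across wards; it does not show that one causes the other.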

Food bank use across England and Wales mapped to ward level

Looking at anonymous postcode data for people referred to Trussell Trust food banks against census data has also enabled the Trust to drill down to a micro-level and look at trends specific to a local area, as well as looking at the national picture.

Taking London as an example, the mapping shows high levels of food bank referrals due to benefit delays in certain wards in north and south-east London.

While the data alone can paint a vivid picture of food bank use in these areas, it requires more investigation to really get to the heart of the issue, and to find out if crisis provision is failing in these places, or if it is simply the case that local authorities are working more closely with food banks.

Practical applications

Finding out where food banks are used, and by whom, is interesting in itself, but beyond the nitty-gritty of data metrics there is now the opportunity for this tool to be used on a wider scale and really help to make a difference to people’s lives.

Adding in the referral agencies that provide access to food banks will help to provide another dimension for analysis. The Trussell Trust runs the majority of food banks, but future initiatives to incorporate data from non-trust food banks will also allow us to provide full coverage of UK emergency food provision for the first time.

And in time we will also add more open and external data: for example, to see if and how weather data impacts on food bank use.

On top of this, sharing data with other charities involved in poverty alleviation – for example homelessness charities – will provide a richer picture of food poverty and deprivation across the country.

‘Without the food bank I don’t think I would be here today’.

With a joined-up approach to data, and insights from other charities and food aid providers, this data could be used by local projects to work out where to target their efforts and which additional services would best help tackle the biggest local issues. And it is hoped this will lead to better-informed interventions and greater influence on policy.

Data is a big opportunity for charities and third sector organisations and one that may have an impact that we are only beginning to understand. We hope this early analytics tool will provide a basis for food banks and other front-line agencies to create powerful real-world data applications.

Richard Vidgen, Professor of Business Analytics, UNSW Australia and Giles Hindle, Senior Lecturer at Hull University Business School, University of Hull

This article was originally published on The Conversation. Read the original article.

Posted in Uncategorized | 1 Comment

P-hacking

The use of p-values has long been subject to criticism, not least because they can be ‘hacked’. P-hacking is when a researcher tries lots of analyses and data treatments until they get the result they want (i.e., p<.05). Examples include fishing for p-values in a dataset, excluding outliers, transforming the data, and analysing many measures but only reporting those with p<.05; each is a selection decision made by the researcher. As Coase said, if you torture the data long enough it will confess. On 7 March 2016 the American Statistical Association (ASA) published a statement on the use of p-values (see Nature and the Oxford Internet Institute blog for background and commentary on the p-value problem). At least one journal has introduced a ban on the reporting of p-values.
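To see why analysing many measures and reporting only the significant ones is a problem, here is a small simulation sketch. It assumes 20 independent measures per study, two groups of 30, normally distributed data with no real effect at all, and a simple z-test with known variance; every name and parameter is illustrative rather than taken from any particular study.

```python
import random
from statistics import NormalDist, mean

def two_sided_p(sample_a, sample_b, sigma=1.0):
    """Two-sided z-test p-value for a difference in means (sigma assumed known)."""
    n_a, n_b = len(sample_a), len(sample_b)
    standard_error = sigma * (1 / n_a + 1 / n_b) ** 0.5
    z = (mean(sample_a) - mean(sample_b)) / standard_error
    return 2 * (1 - NormalDist().cdf(abs(z)))

def study_finds_something(rng, n_measures=20, n_per_group=30, alpha=0.05):
    """Simulate one study with NO real effect; True if any measure gives p < alpha."""
    for _ in range(n_measures):
        group_a = [rng.gauss(0, 1) for _ in range(n_per_group)]
        group_b = [rng.gauss(0, 1) for _ in range(n_per_group)]
        if two_sided_p(group_a, group_b) < alpha:
            return True
    return False

rng = random.Random(2016)  # fixed seed so the run is repeatable
n_studies = 1000
hits = sum(study_finds_something(rng) for _ in range(n_studies))
simulated_rate = hits / n_studies

# Chance of at least one false positive across 20 independent tests at alpha = .05
expected_rate = 1 - 0.95 ** 20

print(f"simulated: {simulated_rate:.2f}, expected: {expected_rate:.2f}")
```

Under these assumptions roughly two studies in three can report a ‘significant’ finding even though nothing is there, which is why the number of analyses tried matters as much as the p-value reported.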

The outcome of over-reliance, misinterpretation, and misuse of p-values is that much reported social science research is not reproducible – anywhere between 50% and 80% (also see John Ioannidis’ pioneering article, “Why Most Published Research Findings are False”).

For further detail on reproducibility of research see the Ioannidis video.

To find out how to p-hack (and how to prevent it) see the video by Neuroskeptic.

Watch the “dance of the p-values” video to see how unreliable p-values can be.

OR and Data Science: a match made in heaven?

On 23 February 2016 Giles Hindle (University of Hull) and I gave a presentation to the York and Humber OR Group (YHORG) at the Circle in Sheffield on our thoughts about OR practitioners and data scientists: where they overlap and where they might differ. In coming to some tentative conclusions we reflected on our experiences on an analytics project for food banks that we have recently completed. It is only fair to say that we grossly over-simplified and set up stereotypes (caricatures?) of OR practitioners and data scientists in arriving at the following points for discussion:

  1. Technical skills: OR practitioners need IT skills, top of the list being Python and R. They also need to know where to start and where to stop with IT work (e.g., when should it be handed over to an IT professional who knows how to deploy an operational system?).
  2. Heterogeneous data: OR practitioners need to work with different data types, e.g., text mining and video analysis, rather than only seeing the world in terms of quantitative data.
  3. Out of the comfort zone: OR practitioners need to get out of their comfort zone and engage with the business and business users in an agile way, rather than being in a specialist departmental niche with a traditional engineering mindset in which they provide solutions to the business (e.g., using small-scale data in simulations).
  4. Embedded analytics: Analytics will become embedded in successful organisations with a greater emphasis on prescriptive (action-based) applications where action is then subject to an evidence base (e.g., randomized controlled trials).
  5. Transformation: Analytics is about organisational transformation – culture change is needed throughout the organisation if it is to become data-driven and embrace evidence-based management.

The last two points relate to the business analytics methodology that we are developing, BAM, which uses value mapping and soft systems to develop business questions that can be tackled through analytics. The full presentation is here:

 

2015 in review

The WordPress.com stats helper monkeys prepared a 2015 annual report for this blog.

Here’s an excerpt:

A San Francisco cable car holds 60 people. This blog was viewed about 990 times in 2015. If it were a cable car, it would take about 17 trips to carry that many people.

Click here to see the complete report.

scholarNET: analysing Google Scholar data

As part of a research agenda in scholarly impact I wrote an R program to analyse Google Scholar data for sets of scholars, e.g., all the researchers in a department or a university. The R package scholar does the work of scraping the Google data for individual scholars; scholarNET takes individual Google Scholar IDs, retrieves the data for each scholar, and produces a ranking based on h-index.
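For readers unfamiliar with the h-index, the ranking rests on a simple definition: a scholar has index h if h of their papers have at least h citations each. The sketch below shows that calculation and a ranking built on it. It is written in Python for illustration and is not the scholarNET R code; the scholar names and citation counts are hypothetical.

```python
def h_index(citations):
    """Largest h such that h papers each have at least h citations."""
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, cites in enumerate(ranked, start=1):
        if cites >= rank:
            h = rank
        else:
            break  # citations only fall from here on, so h cannot grow
    return h

# Hypothetical scholars and per-paper citation counts, for illustration only
scholars = {
    "scholar_a": [25, 8, 5, 3, 3, 1],
    "scholar_b": [10, 9, 2],
    "scholar_c": [4, 4, 4, 4, 4],
}

# Rank scholars from highest to lowest h-index
ranking = sorted(scholars, key=lambda name: h_index(scholars[name]), reverse=True)

for name in ranking:
    print(name, h_index(scholars[name]))
```

Note that scholar_a has by far the most-cited paper but does not top the ranking: the h-index rewards a body of consistently cited work rather than a single hit.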

The program also retrieves the publications for each scholar and then matches them on title to detect coauthorship relationships, which are then represented in a social network graph.

The social network shows that Shaw and Grant have coauthored two papers and Creazza and Colicchia have coauthored 14 papers. There is not a lot of evidence of coauthorship activity in this particular network. The network is also written out in GML (graph modelling language) format for visualisation in Gephi.
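The title-matching step can be sketched as follows. This is a Python illustration of the idea, not the scholarNET R code: the normalisation rule, the function names, and the publication lists are all assumptions made for the example.

```python
from collections import Counter
from itertools import combinations

def normalise(title):
    """Lower-case a title and reduce it to letters, digits and single spaces."""
    cleaned = "".join(ch if ch.isalnum() else " " for ch in title.lower())
    return " ".join(cleaned.split())

def coauthor_edges(publications):
    """Count shared (normalised) titles for every pair of scholars."""
    title_sets = {
        scholar: {normalise(title) for title in titles}
        for scholar, titles in publications.items()
    }
    edges = Counter()
    for a, b in combinations(sorted(title_sets), 2):
        shared = len(title_sets[a] & title_sets[b])
        if shared:
            edges[(a, b)] = shared  # edge weight = number of co-authored papers
    return edges

# Hypothetical publication lists, for illustration only
publications = {
    "scholar_a": ["Supply Chain Risk: A Review", "Port Logistics"],
    "scholar_b": ["Supply chain risk - a review", "Green Logistics"],
    "scholar_c": ["An Unrelated Paper"],
}

edges = coauthor_edges(publications)
for (a, b), weight in edges.items():
    print(a, b, weight)
```

Matching on title alone is a heuristic: small differences in punctuation are absorbed by the normalisation, but two different papers that happen to share a title would be counted as a collaboration.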

In setting up scholarNET the hard work is collating the Google Scholar IDs for individual researchers, which is a manual task involving cutting the Google Scholar ID from the URL string for each academic’s profile. Of course, the approach also relies on academics having established a Google Scholar profile. However, more and more academics are setting themselves up on Google Scholar as they seek to demonstrate impact (e.g., at job interviews, for promotion cases) and to understand how their work is being used by others.

Download the R code for scholarNET and the list of scholar IDs to try out the analysis. This was the first program I wrote in R to do something useful and the code is not pretty! If there is interest in it I will rewrite it.

Calm down – it’s only big data

On 20th May 2015 Peter Andrews of the Marketing group at Hull University Business School and I presented on Big Data in a marketing context at a Chartered Institute of Marketing evening event (the slides are here). There is growing interest in the field of marketing analytics and for many organisations this is the logical place to take first steps in business analytics – they have data, they have motivation, and crucially, they are willing to spend money.

Hull Big Data Analytics Forum 2015

The second Hull Big Data Analytics Forum took place on Wednesday 1 July with five speakers. The full set of presentations can be found here. Throughout the morning we tweeted to #hullanalytics and produced a wordcloud and sentiment analysis in real time using R.

Ben Latham of Summit talked on ‘Forecasting Future Profits with Big Data’ and gave many detailed examples of how in-depth analysis and modelling of weather, seasonality, and TV viewing data can be used to fine-tune, and get more value from, the purchasing of media and keywords for advertising.

Michael Mortenson reported on his research at Loughborough University, ‘The Analytics Jigsaw: identifying the skills needed for the analytics age’, and gave us the Type I and Type II analytics professional. Type I are from a computer science background and focus on programming, machine learning, and visualisation; Type II are from business schools and focus on statistics, decision-making, consulting skills, and domain knowledge.

Matthew Robinson from IBM presented ‘From Business Intelligence to Cognitive Analytics’ and introduced us to the Watson Analytics initiative and how this technology is automating some parts of the data science process for the end user. This is the same technology that was used to ‘destroy the human competition’ in the Jeopardy TV quiz. Further details of Watson Analytics, including a free trial, are here.

The Information Commissioner’s Office (ICO) was represented by Carl Wiper who talked on ‘Privacy in a Big Data World’. Carl pointed us to the ICO report ‘Big Data and Data Protection’, and highlighted the potential dangers in repurposing and recombining data. Big Data is not only about compliance – there is an ethical dimension to consider (see the Aimia report).

Simon Raper of Coppelia ran an active workshop with a human simulation of MapReduce. The audience got to play the role of mappers, sorters, and reducers to produce a word analysis of text taken from Proust. An excellent and unforgettable way to understand how MapReduce can ‘reduce your data processing time from hours to minutes’.
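The three roles the audience played can be sketched in a few lines of Python. This is a single-process illustration of a word count, with function names chosen to mirror the workshop roles; the sample text is a placeholder rather than actual Proust, and a real MapReduce system would run many mappers and reducers in parallel on separate machines.

```python
from collections import defaultdict

def mapper(chunk):
    """Map step: emit a (word, 1) pair for every word in one chunk of text."""
    for word in chunk.lower().split():
        word = word.strip(".,;:!?\"'()")
        if word:
            yield (word, 1)

def sorter(pairs):
    """Shuffle/sort step: group all emitted values by their key (the word)."""
    groups = defaultdict(list)
    for word, count in pairs:
        groups[word].append(count)
    return dict(sorted(groups.items()))

def reducer(word, counts):
    """Reduce step: collapse one word's list of counts into a single total."""
    return word, sum(counts)

def word_count(chunks):
    """Run map, sort and reduce in sequence over a list of text chunks."""
    pairs = [pair for chunk in chunks for pair in mapper(chunk)]
    grouped = sorter(pairs)
    return dict(reducer(word, counts) for word, counts in grouped.items())

# Placeholder text, split into chunks as if handed to separate mappers
chunks = ["the past is the past", "The memory of the past"]
totals = word_count(chunks)
print(totals)
```

The speed-up comes from the fact that each mapper needs only its own chunk and each reducer needs only its own word, so both steps can be spread across as many workers as are available.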

Hull Analytics Forum 2016 – next year’s event will be on Wednesday, 6th July 2016.

Euro 2015 – it is the place to present your work and meet with researchers, academics, practitioners, and students interested in business analytics. The conference runs 12 – 15 July and is held in Strathclyde. Further details here.
