A Democratic Licence to Operate
uploaded every hour and its users click a ‘Like’ button or leave a comment nearly three
billion times a day.41
1.36
The term ‘big data’ has come to refer to the very large data sets produced in today’s
digital environment. Data sets are described as ‘big’ based on a subjective judgement of
their volume (the number of fields in the data set), velocity (speed of change of the data
set) and variety (types of data in the data set). Whereas traditionally a ‘big’ data set may
have referred to the electoral roll or national telephone directory, today big data refers
to large data sets which feature a large number of fields and which evolve rapidly.42
1.37
Within this information lie many potentially profitable insights regarding customer
behaviour, market trends and supply-chain processes.43 The rate at which data is produced
makes it difficult to analyse using traditional methods, which rely on human analysts
to pick out the most useful information. Over the last decade or so, mathematical
tools for analysing large quantities of data have been developed, which
allow computer programs to run algorithms against the data (at an extremely rapid
rate) in order to find correlations.
1.38
In addition to the speed of the analysis, there are numerous advantages to big-data
analytics. A large volume and variety of both structured data (for instance, logs of
smartphone use within a geographic area) and unstructured data (for instance, sentiment
expressed in Twitter feeds) can be analysed simultaneously (perhaps to predict the likely
scale and location of riots). Rather than using statistically representative or random
sampling, big-data analytics collects and analyses all the data that is available, resulting
in a greater degree of accuracy in results.
1.39
Once correlations have been identified, a new algorithm can be created and applied to
particular cases. The more correlations that are identified, the more readily certain kinds of
behaviour can be predicted – such as the volume of cars likely to use a new road, the
particular consumer goods likely to be purchased by a specific demographic, or even the
propensity of an individual to engage in criminal activity.44
Communications Data
1.40
As we described earlier, the term ‘communications data’ refers to information about
an item of communication. According to the Home Office Code of Practice, it refers
to ‘the “who”, “when” and “where” of a communication, but not the content’.45 For
41. Viktor Mayer-Schönberger and Kenneth Cukier, Big Data: A Revolution That Will Transform
the Way We Live, Work and Think (New York, NY: Houghton, 2013).
42. Information Commissioner’s Office (ICO), ‘Big Data and Data Protection’, 2014.
43. Centre for Economics and Business Research, ‘Data Equity – Ireland: Unlocking the Value
of Big Data’, 2013.
44. ICO, ‘Big Data and Data Protection’.
45. Home Office, Acquisition and Disclosure of Communications Data: Code of Practice
(London: The Stationery Office, 2015), p. 13.