ethics

Unconference: ethics #opened17

I’m in Anaheim, CA, land of forest fires and Disney, for this year’s Open Education conference.  This year, for the first time, there is time allocated for an unconference session.  This seemed like a good opportunity to collect information about ethical issues delegates might be experiencing ahead of the PILSNER seminar I’m delivering on Monday, so I offered to run a session on ethics.  However, this couldn’t go ahead because I got sidetracked into something else.  But here’s a blog post with some of the ideas I would have shared, for anyone who is interested.

I produced a summary of some of my thoughts around ethics and openness for a Towards Openness session at OER17.

This is a basic summary of my paper, A Framework for the Ethics of Open Education (2016).  In Monday’s presentation I intend to outline the approach taken and highlight some specific instances of ethical issues to foster dialogue.  Here are some of the areas under consideration:

  • A colleague at a community college in the USA explained to me that there is some disturbance within his institution as a result of conflicting priorities around openness and cost.  They have a threshold of $40 per student per class for materials to be considered ‘low cost’.  Some commercial providers are using OER to put together offerings that are presented as ‘open’ when they are in fact run for profit.  Openness and cost are often conflated in this way, and this provokes a challenge: should we be purist about openness and advocate only for what we consider to be the most open/moral approach to expanding provision, or should we be entirely pragmatic about these kinds of issues and concentrate only on whether students have access to the materials they need?
  • Guerrilla research: it’s increasingly possible for researchers to work independently of institutions, making use of publicly available data.  This is perhaps the most extreme example of ‘open’ research in that it happens entirely outside of the institutional structures and processes that are supposed to ensure or promote ethical behaviours.  It’s also possible to create a relatively high profile for this work through dissemination on social media.  How do we ensure that ethical standards are being met?
  • Equity and inclusion were real buzzwords at this conference, but I sometimes find that these terms have a platitudinous quality when we don’t acknowledge that they are more complex than they first appear.  For instance, should we take ‘equity’ to refer to ‘equity of opportunity’ (who gets to take part in education and under what conditions) or to ‘equity of outcome’ (where the goal is to equalise educational outcomes)?  Should it refer to both?  If so, which is prioritised?
  • Another area that generated interest at the conference was the idea of educational bias and how it might be minimised.  This gave rise to questions about making the canon or corpus of a particular subject less “white, Christian and male” – philosophy came in for particular criticism – but there were also suggestions about teaching in such a way as to minimise this kind of bias.  I confess that I am not that clear on what this looks like.  Freire was mentioned as an inspiration.  But it seems to boil down to a combination of criticality, inclusiveness, contextual awareness and deconstruction of the status quo.
  • The place of social media in research is somewhat vexed.  At one end of the scale we have things like Facebook’s emotional contagion study which would have struggled to pass an institutional review because of its willingness to cause harm to participants, but since Facebook is not governed in the same way as educational institutions people have few options in the way of redress.  This case gets to the heart of the difference between our expectations about how our “open” data will be used and what happens in practice.
  • Another area I think is worth considering is the role of social justice in all this.  For a lot of open education advocates this is at the core of the movement.  But rarely do we hear about the concept of social justice being unpacked in the context of open education.  There are competing visions of social justice.
    • For Plato, social justice is a kind of harmony between individuals and the state which enables people to find the roles to which they are best suited (nb., not necessarily the ones they want).
    • Aristotle advocated a kind of redistributive justice where goods and wealth were assigned to people according to their merit – though this favoured – and perhaps only included – aristocratic males.
    • In the Scholastic tradition the idea of social justice becomes more intimately connected with religious service and religious harmony. Aquinas connected this with the Christian idea of caring for the needs of the poorest and most disadvantaged.
    • During the Enlightenment we see the emergence of the idea that a civilised society should provide (equity of) opportunity to its populace; this is distinct from the idea that the poor should somehow be looked after (equity of outcome).  J. S. Mill argued that virtue should be consistently rewarded; a kind of meritocracy (equity of opportunity).
    • Another approach is the ‘social contract’, which sets out rights and expectations in a consistent way.
    • In a modern context we have visions of social justice like that of Rawls, who argued that we must in some sense provide a hypothetical consent for the organisation of society and this is done through compliance with the tenets of a liberal conception of justice: freedom of thought; political liberty, rights, and so on.
  • If we are focused on equity of opportunity our guiding light is something like a principle of fairness; if equity of outcome then the principle is something like equality.

#opened16 live blog: College Affordability and Social Justice

Preston Davis (aka @LazyPhilosopher) invites us to think about the early days of Western civilisation where philosophers like Plato and Aristotle formed educational institutions on the basis of their own privilege.  This kind of system persisted into Roman times, where males with the ability to pay could attend organised schools where they would learn to become educated citizens of the empire.

Education was further formalised in the Middle Ages, but mostly organised according to the strategic aims of the church.  Formalised educational systems in the USA widened curriculum and admitted women, but still remain ‘exclusive’ in many ways.

Rawlsian theories of social justice are reflective of conversations that are starting to take place in OER around stepping back from personal bias when making decisions.  If we disregard the considerations of race, gender, class and so on, we can support a more democratic and equally distributed educational system.

The remark is made that aspects of the USA educational system are exclusive rather than inclusive.  Much of the OER movement was organised around saving money on textbook costs, but this overlooks wider patterns of disenfranchisement.  The Sanders run for USA president foregrounded the idea of access to higher education as a matter of social justice.  Should education be ‘free’?

From the discussion:

  • Class divides are reinforced by higher education.  Some scholarships are set aside for students from disadvantaged backgrounds, but does this really change structural patterns of disenfranchisement?
  • If public education was made free, would this lead to a loss of resources through inefficiencies?
  • Can we really act as if we are ‘difference-blind’?
  • Is the difference between the student who goes on to higher education and the one who doesn’t a matter of money?  Disenfranchisement has other elements, e.g. confidence, role models, self-interpretation.  Many of these are the kinds of ‘differences’ stripped out of the Rawlsian model.
  • How can social justice be understood from the perspective of what is essentially privilege?
  • Low cost vs. free?

Ethical principles of learning analytics – mini critique

This is just a short blog post to capture some thoughts on the ethical principles of learning analytics as set out in official documentation provided by The Open University.  I have attended various briefings at the OU around this subject, mainly because there is a lot of complexity here with regard to the ethical significance of these technologies.  I was also a member of the advisory panel for the JISC Code of Practice for Learning Analytics.

Here are the ‘ethical principles’ with my own brief annotations.  (This is just an internal critique of these principles as they are set out here, not of the wider project of learning analytics.)


The principles have been categorised by the university; you can see the original list at http://www.open.ac.uk/students/charter/sites/www.open.ac.uk.students.charter/files/files/ecms/web-content/using-information-to-support-student-learning.pdf.

In essence, the points I would make about these principles are as follows:

  • Point 1.  It is asserted that learning analytics is an ethical practice, but this has yet to be established.  Arguably we should state that it should be thought of as an ethical practice, but this is quite different in terms of ethical principle.  ‘Ought’ statements are much harder to justify.
  • Point 2. There is a confusing mix of deontological and consequentialist-utilitarian considerations here.  Unpicking it, I interpret it to mean that the university considers itself to have a responsibility to maximise the utility of the data about students that it owns.  The important points here are that a.) stakeholders are not clearly defined and could include, for instance, privately owned data brokers; b.) there is no acknowledgment of the possible tension between different forms of self-interest; c.) no criteria are given for ‘feasibility’.
  • Point 2. It’s difficult to see how feasibility should be a criterion for whether something is ethical.  After all, ethics is something that regulates the realm of the feasible, the possible, the actual.  This would be a much stronger principle if this word was replaced with ‘ethical’, or ‘justified’.
  • Point 3 implies that students should be at least partly defined by their data and the university’s interpretation of it.  This may not be that contentious to most people, though without clear parameters for the other criteria that are considered it could be taken to mean ‘mostly’ defined by the data held by the university.  It’s not clear what this means in practice beyond some wording intended to ward off concerns about treating students as nothing more than a set of data points.
  • Point 4 seems right in setting out a principle of transparency in the process, purpose and use of student data.  But it doesn’t make a commitment to full transparency for all.  Why not?
  • This is brought into sharper relief in Point 5, which sets out a commitment to full transparency for data collection. Taken in conjunction with Point 4, it seems that transparency is endorsed for collection, but not use.
  • Point 6 is on the theme of student autonomy and co-operation in these processes.  These are good things, though claims that informed consent has been given are potentially undermined by the possible lack of transparency about use in Point 4.
  • A further possible undermining of student autonomy here is the lack of clarity about whether students can entirely opt out of these processes.  If not, how can they be considered ‘active agents’?
  • I’m not an expert in big data but I know a little bit about predictive modelling.  In Point 7, the idea is that modelling ‘should be’ free from bias.  Well, all modelling should be free from bias, but these effects cannot be truly eradicated.  It would make more sense as a principle to speak of ‘minimising’ bias (a rough sketch of how one kind of bias might actually be measured follows this list).
  • Point 8 endorses adoption of learning analytics into the institutional culture, and vice versa.  It asserts that there are values and benefits to the approach, though these are largely hypothetical.  It basically states that the institutional culture of the university must change, and that this should be ‘broadly accepted’ (whatever that might mean).
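To make the point about Point 7 concrete, here is a minimal sketch of how one kind of bias might be measured rather than assumed away.  The data, the ‘at risk’ prediction column and the group labels are all hypothetical; the aim is only to show that bias is something you can quantify and minimise, not something you can eliminate by declaration.

```python
# Minimal sketch: quantifying one kind of bias in a predictive model's output.
# The data, column names and group labels are hypothetical.
import pandas as pd

def demographic_parity_gap(df: pd.DataFrame, pred_col: str, group_col: str) -> float:
    """Difference between the highest and lowest rate of positive predictions
    across groups: 0.0 would mean parity, larger values mean the model flags
    some groups more often than others."""
    rates = df.groupby(group_col)[pred_col].mean()
    return float(rates.max() - rates.min())

# Toy predictions from some 'at risk of withdrawal' model.
students = pd.DataFrame({
    "group":        ["A", "A", "A", "B", "B", "B", "B"],
    "at_risk_pred": [1,    0,   0,   1,   1,   0,   1],
})

gap = demographic_parity_gap(students, "at_risk_pred", "group")
print(f"Demographic parity gap: {gap:.2f}")  # ~0.42 here: group B is flagged far more often
```

A principle of ‘minimising bias’ could then be operationalised as keeping a gap like this below an agreed threshold and reporting it openly.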

The final point I’d make about this is that, for me, these are not generally worded as principles: rather as vision statements or something intended to guide internal decision making.  But when it comes to ethics, we really need clear principles if we are to understand whether they are being applied consistently, sensitively, and systematically.

 

Ethics of FutureLearn

This is just a mini blog post, maybe even more of a bookmark.  FutureLearn is the MOOC provider owned by The Open University but with many operational partners.  The first courses launched in September 2013 and since then, 2,683,556 people have joined FutureLearn (with varying levels of commitment, as with all MOOCs).

This week a colleague forwarded me their ethical practice policies, which are available at https://about.futurelearn.com/terms/research-ethics-for-futurelearn/.  I think they are an interesting case since FutureLearn is not committed to being an open project (though enrolments are entirely open).  I haven’t had time to go through them yet but I have an idea that I could write a paper analysing them…

 

Research ethics

Research Ethics for FutureLearn

The purpose of this document is to provide a framework and guidelines for ethical and productive research practices associated with FutureLearn.

1. FutureLearn welcomes research by partners into all aspects of massive online learning including, but not restricted to, learning design, course evaluation, data analysis, educational technologies, and policy related to FutureLearn. Research is essential to understanding and improving the FutureLearn offering, and online learning in general.

2. We recognise that partners will have diverse needs, interests, philosophies, methods and forms of collaboration in research, including collaboration with external partners. The research may include comparative studies of learning on FutureLearn and other platforms.

3. All research should be conducted with an ethic of respect for: the people involved, diversity of cultures and interests, the quality of research, academic freedom and responsibility, the educational and commercial interests of FutureLearn and its partners.

4. This document provides guidelines to institutions and researchers in relation to research undertaken with data provided by FutureLearn, or in relation to FutureLearn courses or technologies. It is intended as a consensual document, to set down an agreed approach to research ethics. The sections that follow are based on the ‘Ethical Guidelines for Educational Research’ by BERA.[1] Some parts of that document are quoted verbatim.
Responsibilities to Participants

1. Individuals taking part in FutureLearn courses must be treated fairly and sensitively, recognising that they are engaging voluntarily with the courses with the intention of learning. They come from a wide variety of social and cultural backgrounds, with differing attitudes to research and to intrusion into their online activities.

2. Research into participation in FutureLearn courses presents particular challenges with regard to obtaining consent. Participants must be clearly informed that their participation and interactions may be monitored and analysed for research.

3. By taking part in a free open online course, where they are informed that activities may be monitored for research purposes, participants can be assumed to have given consent for participation in research conducted according to these guidelines, so opt-in consent from each participant is not required. It follows that learners can opt out from further participation only by unregistering from FutureLearn.

4. Although the FutureLearn platform is open to registration from anyone with internet access and learner names, profiles, and general comments and replies are made available for viewing by other users, it does not imply that learners engaging in FutureLearn discussions have forfeited rights to anonymity. The contributions were made in the context of an ongoing course discussion. It would normally be expected that research into learner contributions should use anonymised data.

5. Ownership of data created by learners is a further challenge. Learners own content they create on the FutureLearn platform, which they license to FutureLearn and partners forever and irrevocably. It should be recognised that participants in courses have a moral right of identity with materials created in their name. Some materials submitted by learners, including texts, documents, images, photographs, video and computer code may have additional rights of ownership. Learner-created content is published on the FutureLearn platform under a Creative Commons Licence (Attribution-Non Commercial-NoDerivs; BY-NC-ND), which means that any learner comments quoted in research publications must be attributed to the author.

6. If a learner ends registration with FutureLearn, there will be no further contact from the company or partners with that learner in relation to FutureLearn, but anonymised data from that learner’s previous interactions may continue to be used for research.
Responsibilities of Partners

1. All research associated with FutureLearn should be based on the principles of high standards, honesty, openness, accountability, integrity, inclusion and safety.

2. Partners are expected to gain appropriate approval from their institutional ethics panel for all research conducted in relation to FutureLearn.

3. Partners and FutureLearn together should be sensitive to the problem of inundating learners with surveys, particularly where learners may be engaging with many courses. Due regard should be given to the length of a survey, its complexity, and the intrusion into a learner’s private life.

4. There is increasing awareness that the mandatory conditions required by ethics review panels may not be sufficient to illuminate the complexities of research in online environments. Researchers are expected to reflect on their practices, and are encouraged to seek peer review of research proposals, particularly if they involve new or unusual methods.

5. Researchers should normally only work with anonymised data. A clear justification would need to be provided to analyse or present non-anonymised data, such as discussion postings or learner profiles with real names.

6. All non-anonymised data received by researchers should be kept secure, and in compliance with the partner’s research data management policies. This should involve, for example, securing the user account with a good password, encrypting the computer hard drive, encrypting any backups of data, and restricting access only to those essential to process the non-anonymised data.

7. Participants should be given opportunities to access the outcomes of research in which they have participated. This might, for example, be done by mailing those who participated in the course with a link to the research findings.

8. Researchers should not bring research into disrepute by, for example: falsifying evidence or findings, ‘sensationalizing’ findings to gain public exposure, distorting findings by selectively publishing some aspects and not others, criticizing other researchers in a defamatory or unprofessional manner, undertaking work where they are perceived to have a conflict of interest, or where self-interest or commercial gain might be perceived to compromise the objectivity of the research.

9. Researchers and partner institutions should recognize that research using data provided by FutureLearn is conducted in partnership with the company. FutureLearn would expect acknowledgement in research publications. It is also appropriate to provide FutureLearn with a copy of research findings and papers in advance of publication, particularly if these offer any new insight or issues.
Responsibilities of FutureLearn

1. FutureLearn should do all it can to enable researchers to publish the findings of their research in full and under their own names.

2. FutureLearn should not seek to prevent publication of research findings, nor to criticise researchers. The company may wish to respond in public to research findings, for example to promote favourable results, or rebut unfavourable ones.

3. FutureLearn should assist research wherever it is appropriate within its resources, for example by providing partners with data on their courses in forms that enable in-depth and comparative analyses.

4. This document will be made public and available for download from the FutureLearn website. It may also be distributed freely by partners.
24th February, 2014

References

[1] BERA (2011). Ethical Guidelines for Educational Research. British Educational Research Association. Available online at http://www.bera.ac.uk/system/files/BERA%20Ethical%20Guidelines%2020

Workshop Notes: #Ethics and #LearningAnalytics

This morning I’m attending a talk given by Sharon Slade about the ethical dimensions of learning analytics (LA), part of a larger workshop devoted to LA at The Open University’s library on the Walton Hall campus.

I was a bit late from a previous meeting but Sharon’s slides are pretty clear so I’m just going to crack on with trying to capture the essence of the talk.  Here are the guidelines currently influencing thinking in this area (with my comments in parentheses).

  1. LA as a moral practice (I guess people need to be reminded of this!)
  2. OU has a responsibility to use data for student benefit
  3. Students are not wholly defined by their data (Ergo partially defined by data?)
  4. Purpose and boundaries should be well defined and visible (transparency)
  5. Students should have the facility to update their own data
  6. Students as active agents
  7. Modelling approaches and interventions should be free from bias (Is this possible? What kind of bias should be avoided?)
  8. Adoption of LA requires broad acceptance of the values and benefits, and the development of appropriate skills (Not sure I fully grasped this one)

Sharon was mainly outlining the results of some qualitative research done with OU staff and students. The most emotive discussion was around whether or not this use of student data was appropriate at all – many students expressed dismay that their data was being looked at, much less used to potentially determine their service provision and educational future (progress, funding, etc.). Many felt that LA itself is a rather intrusive approach which may not be justified by the benevolent intention to improve student support.

While there are clear policies in place around data protection (like most universities) there were concerns about the use of raw data and information derived from data patterns. There was lots of concern about the ability of the analysts to adequately understand the data they were looking at and treat it responsibly.

Students want to have a 1:1 relationship with tutors, and feel that LA can undermine this; although at the OU there are particular challenges around distance education at scale.

The most dominant issue surrounded the idea of being able to opt out of having their data collected without this having an impact on their future studies or how they are treated by the university. The default position is one of ‘informed consent’, where students are currently expected to opt out if they wish. The policy will be explained to students at the point of registration, as well as through case studies and guidance for staff and students.

Another round of consultation is expected around the issue of whether students should have an opt-out or opt-in model.

There is an underlying paternalistic attitude here – the university believes that it knows best with regard to the interests of the students – though it seems to me that this potentially runs against the idea of a student centred approach.

Some further thoughts/comments:

  • Someone like Simon Buckingham-Shum will argue that the LA *is* the pedagogy – this is not the view being taken by the OU but we can perhaps identify a potential ‘mission creep’
  • Can we be sure that the analyses we create through LA are reliable?  How?
  • The more data we collect and the more open it is then the more effective LA can be – and the greater the ethical complexity
  • New legislation requires that everyone will have the right to opt-out but it’s not clear that this will necessarily apply to education
  • Commercialisation of data has already taken place in some initiatives

Doug Clow then took the floor and spoke about other LA initiatives.  He noted that the drivers behind interest in LA are very diverse (research, retention, support, business intelligence, etc).  Some projects of note include:

Many projects are attempting to produce the correct kind of ‘dashboard’ for LA.  Another theme is around the extent to which LA initiatives can be scaled up to form a larger infrastructure.  There is a risk that with LA we focus only on the data we have access to and everything follows from there – Doug used the metaphor of darkness/illumination/blinding light. Doug also noted that machine learning stands to benefit greatly from LA data, and LA generally should be understood within the context of trends towards informal and blended learning as well as MOOC provision.

Overall, though, it seems that evidence for the effectiveness of LA is still pretty thin with very few rigorous evaluations. This could reflect the age of the field (a lot of work has yet to be published) or alternatively the idea that LA isn’t really as effective as some hope.  For instance, it could be that any intervention is effective regardless of whether it has some foundation in data that has been collected (nb. ‘Hawthorne effect‘).

Ethics, Openness and the Future of Education #opened14

By popular demand, here are my slides from today’s presentation at Open Education 2014.  All feedback welcome and if this subject is of interest to you then consider checking out the OERRH Ethics Manual and the section on ethics (week 2) of our Open Research course.

Ethical Use of New Technology in Education

Today Beck Pitt and I travelled up to Birmingham in the midlands of the UK to attend a BERA/Wiley workshop on technologies and ethics in educational research.  I’m mainly here to focus on the redraft of the Ethics Manual for OER Research Hub and to give some time over to thinking about the ethical challenges that can be raised by openness.  The first draft of the ethics manual was primarily to guide us at the start of the project but now we need to redraft it to reflect some of the issues we have encountered in practice.

Things kicked off with an outline of what BERA does and the suggestion that consciousness about new technologies in education often doesn’t filter down to practitioners.  The rationale behind the seminar seems to be to raise awareness in light of the fact that these issues are especially prevalent at the moment.

This blog post may be in direct contravention of the Chatham convention


We were first told that these meetings would be taken under the ‘Chatham House Rule’, which suggests that participants are free to use information received but without identifying speakers or their affiliation… this takes us straight into the meat of some of the issues provoked by openness: I’m in the middle of live-blogging this as the suggestion is made.  (The session is being filmed but apparently they will edit out anything ‘contentious’.)

Anyway, on to the first speaker:


Jill Jameson, Prof. of Education and Co-Chair of the University of Greenwich
‘Ethical Leadership of Educational Technologies Research:  Primum non nocere’

The Latin part of the title of this presentation means ‘do no harm’ and is a recognised ethical principle that goes back to antiquity.  Jameson wants to suggest that this is a sound principle for ethical leadership in educational technology.

After outlining a case from medical care Jameson identified a number of features of good practice for involving patients in their own therapy and feeding the whole process back into training and pedagogy.

  • No harm
  • Informed consent
  • Data-informed consultation on treatment
  • Anonymity, confidentiality
  • Sensitivity re: privacy
  • No coercion
  • ‘Worthwhileness’
  • Research-linked: treatment & PG teaching

This was contrasted with a problematic case from the NHS concerning the public release of patient data.  Arguably very few people have given informed consent to this procedure.  But at the same time the potential benefits of aggregating data are being impeded by concerns about sharing of identifiable information and the commercial use of such information.

In educational technology the prevalence of ‘big data’ has raised new possibilities in the field of learning analytics.  This raises the possibility of data-driven decision making and evidence-based practice.  It may also lead to more homogenous forms of data collection as we seek to aggregate data sets over time.

The global expansion of web-enabled data presents many opportunities for innovation in educational technology research.  But there are also concerns and threats:

  • Privacy vs surveillance
  • Commercialisation of research data
  • Techno-centrism
  • Limits of big data
  • Learning analytics acts as a push against anonymity in education
  • Predictive modelling could become deterministic
  • Transparency of performance replaces ‘learning’
  • Audit culture
  • Learning analytics as models, not reality
  • Datasets ≠ information: they stand in need of analysis and interpretation

Simon Buckingham-Shum has put this in terms of a utopian/dystopian vision of big data.

Leadership is thus needed in ethical research regarding the use of new technologies to develop and refine urgently needed digital research ethics principles and codes of practice.  Students entrust institutions with their data and institutions need to act as caretakers.

I made the point that the principle of ‘do no harm’ is fundamentally incompatible with any leap into the unknown as far as practices are concerned.  Any consistent application of the principle leads to a risk-averse application of the precautionary principle with respect to innovation.  How can this be made compatible with experimental work on learning analytics and sharing of personal data?  Must we reconfigure the principle of ‘do no harm’ so that it becomes ‘minimise harm’?  It seems that way from this presentation… but it is worth noting that this is significantly different to the original maxim with which we were presented… different enough to undermine the basic position?


Ralf Klamma, Technical University Aachen
‘Do Mechanical Turks Dream of Big Data?’

Klamma started in earnest by showing us some slides:  Einstein sticking his tongue out; stills from Dr. Strangelove; Alan Turing; a knowledge network (citation) visualization which could be interpreted as a ‘citation cartel’.  The Cold War image of scientists working in isolation behind geopolitical boundaries has been superseded by building of new communities.  This process can be demonstrated through data mining, networking and visualization.
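As a rough illustration of the kind of network analysis being described (this is not Klamma’s own tooling, and the citation links are invented), a few lines of networkx are enough to surface groups of papers that all cite one another, the sort of structure that might, uncharitably, be read as a ‘citation cartel’.

```python
# Minimal sketch of citation-network analysis; all papers and links are invented.
import networkx as nx

citations = [  # (citing paper, cited paper)
    ("p1", "p2"), ("p2", "p1"), ("p1", "p3"), ("p3", "p1"),
    ("p2", "p3"), ("p3", "p2"),   # p1-p3 cite each other heavily
    ("p4", "p5"), ("p5", "p6"),   # a looser chain elsewhere
]
g = nx.DiGraph(citations)

# Keep only mutual citations, then look for tightly inter-citing groups.
mutual = nx.Graph((u, v) for u, v in g.edges if g.has_edge(v, u))
for clique in nx.find_cliques(mutual):
    if len(clique) >= 3:
        print("Densely inter-citing group:", sorted(clique))

# Centrality is what turns authors and papers into 'nodes on a network diagram'.
print(nx.in_degree_centrality(g))
```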

Historical figures of the likes of Einstein and Turing are now more like nodes on a network diagram – at least, this is an increasingly natural perspective.  The ‘iron curtain’ around research communities has dropped:

  • Research communities have long tails
  • Many research communities are under public scrutiny (e.g. climate science)
  • Funding cuts may exacerbate the problem
  • Open access threatens the integrity of the academy (?!)

Klamma argues that social network analysis and machine learning can support big data research in education.  He highlights the US Department of Homeland Security, Science and Technology, Cyber Security Division publication The Menlo Report: Ethical Principles Guiding Information and Communication Technology Research as a useful resource for the ethical debates in computer science.  In the case of learning analytics there have been many examples of data leaks.

One way to approach the issue of leaks comes from the TellNET project.  By encouraging students to learn about network data and network visualisations they can be put in better control of their own (transparent) data.  Other solutions used in this project:

  • Protection of data platform: fragmentation prevents ‘leaks’
  • Non-identification of participants at workshops
  • Only teachers had access to learning analytics tools
  • Acknowledgement that no systems are 100% secure

In conclusion we were introduced to the concept of ‘datability‘ as the ethical use of big data:

  • Clear risk assessment before data collection
  • Ethical guidelines and sharing of best practice
  • Transparency and accountability without loss of privacy
  • Academic freedom

Fiona Murphy, Earth and Environmental Science (Wiley Publishing)
‘Getting to grips with research data: a publisher perspective’

From a publisher perspective, there is much interest in the ways that research data is shared.  They are moving towards a model with greater transparency.  There are some services under development that will use DOIs to link datasets and archives to improve the findability of research data.  For instance, the Geoscience Data Journal includes bi-directional linking to original data sets.  Ethical issues from a publisher point of view include how to record citations and accreditation, how to manage peer review, and how to maintain security protocols.

Data sharing models may be open, restricted (e.g. dependent on permissions set by data owner) or linked (where the original data is not released but access can be managed centrally).

[Discussion of open licensing was conspicuously absent from this though this is perhaps to be expected from commercial publishers.]


Luciano Floridi, Prof. of Philosophy & Ethics of Information at The University of Oxford
‘Big Data, Small Patterns, and Huge Ethical Issues’

Data can be defined by three Vs: variety, velocity, and volume. (Options for a fourth have been suggested.)  Data has seen a massive explosion since 2009 and the cost of storage is consistently falling.  The only limits to this process are thermodynamics, intelligence and memory.

This process is to some extent restricted by legal and ethical issues.

Epistemological Problems with Big Data: ‘big data’ has been with us for a while and generally should be seen as a set of possibilities (prediction, simulation, decision-making, tailoring, deciding) rather than a problem per se.  The problem is rather that data sets have become so large and complex that they are difficult to process by hand or with standard software.

Ethical Problems with Big Data: the challenge is actually to understand the small patterns that exist within data sets.  This means that many data points are needed as ways into a particular data set so that meaning can become emergent.  Small patterns may be insignificant so working out which patterns have significance is half the battle.  Sometimes significance emerges through the combining of smaller patterns.

Thus small patterns may become significant when correlated.  To further complicate things:  small patterns may be significant through their absence (e.g. the curious incident of the dog in the night-time in Sherlock Holmes).
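A toy illustration of that point, with entirely invented data: two ‘small patterns’ that each look like noise on their own can be perfectly informative once combined.

```python
# Minimal sketch: small patterns become significant when correlated/combined.
# All data is randomly generated for illustration.
import numpy as np

rng = np.random.default_rng(0)
a = rng.integers(0, 2, 1000)   # small pattern 1
b = rng.integers(0, 2, 1000)   # small pattern 2
outcome = a ^ b                # the outcome depends only on their combination

print("corr(a, outcome):      ", round(float(np.corrcoef(a, outcome)[0, 1]), 3))
print("corr(b, outcome):      ", round(float(np.corrcoef(b, outcome)[0, 1]), 3))
print("corr(a XOR b, outcome):", round(float(np.corrcoef(a ^ b, outcome)[0, 1]), 3))
# Individually each correlation is near zero; the combined pattern is perfect.
```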

A specific ethical problem with big data: looking for these small patterns can require thorough and invasive exploration of large data sets.  These procedures may not respect the sensitivity of the subjects of that data.  The ethical problem with big data is sensitive patterns: this includes traditional data-related problems such as privacy, ownership and usability but now also includes the extraction and handling of these ‘patterns’.  The new issues that arise include:

  • Re-purposing of data and consent
  • Treating people not only as means, resources, types, targets, consumers, etc. (deontological)

It isn’t possible for a computer to calculate every variable around the education of an individual, so we must use proxies: indicators of type and frequency in which the uniqueness of the individual is lost in order to make sense of the data.  However, this results in the following:

  1. The profile becomes the profiled
  2. The profile becomes predictable
  3. The predictable becomes exploitable
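As a rough sketch of what this proxy-based profiling looks like in practice (the features and numbers are hypothetical, not any particular institution’s model), a handful of frequency indicators is enough to collapse learners into a small set of profiles that then stand in for the individuals.

```python
# Minimal sketch of proxy-based profiling: learners reduced to a few
# type/frequency indicators and clustered into profiles. All numbers invented.
import numpy as np
from sklearn.cluster import KMeans

# Proxy features per learner: [logins per week, forum posts, quiz attempts]
learners = np.array([
    [1, 0, 1], [2, 1, 1], [1, 0, 0],     # low-activity pattern
    [9, 5, 4], [10, 6, 5], [8, 4, 4],    # high-activity pattern
    [5, 1, 3], [4, 2, 2],                # somewhere in between
], dtype=float)

profiles = KMeans(n_clusters=3, n_init=10, random_state=0).fit(learners)

# From here on, decisions key off the profile label rather than the person:
# "the profile becomes the profiled".
for i, label in enumerate(profiles.labels_):
    print(f"learner {i} -> profile {label}")
```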

Floridi advances the claim that the ethical value of data should not be set higher than the ethical value of the entity the data describes, but should demand at most the same degree of respect.

Putting all this together:  how can privacy be protected while taking advantage of the potential of ‘big data’?  This is an ethical tension between competing principles or ethical demands: the duties to be reconciled are 1) safeguarding individual rights and 2) improving human welfare.

  • This can be understood as a result of polarisation of a moral framework – we focus on the two duties to the individual and society and miss the privacy of groups in the middle
  • Ironically, it is the ‘social group’ level that is served by technology

Five related problems:

  • Can groups hold rights? (it seems so – e.g. national self-determination)
  • If yes, can groups hold a right to privacy?
  • When might a group qualify as a privacy holder? (corporate agency is often like this, isn’t it?)
  • How does group privacy relate to individual privacy?
  • Does respect for individual privacy require respect for the privacy of the group to which the individual belongs? (big data tends to address groups (‘types’) rather than individuals (‘tokens’))

The risks of releasing anonymised large data sets might need some unpacking:  the example given was that during the civil war in Cote d’Ivoire (2010-2011) Orange released a large metadata set which gave away strategic information about the position of groups involved in the conflict even though no individuals were identifiable.  There is a risk of overlooking group interests by focusing on the privacy of the individual.
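A minimal sketch of how group-level information can survive anonymisation, using invented call records that contain no individual identifiers at all:

```python
# Minimal sketch: anonymised call metadata still leaks group-level information.
# The records are invented; there are no individual identifiers anywhere.
from collections import Counter

# Each record is just (cell_tower, hour_of_day) - already 'anonymised'.
call_records = [
    ("tower_north", 22), ("tower_north", 23), ("tower_north", 22),
    ("tower_north", 23), ("tower_north", 21),
    ("tower_south", 10), ("tower_east", 14),
]

activity_by_tower = Counter(tower for tower, _ in call_records)
print(activity_by_tower.most_common())
# A night-time concentration of traffic at one tower reveals where a group is
# gathered, even though no single person can be identified in the data.
```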

There are legal or technological instruments which can be employed to mitigate the possibility of the misuse of big data, but there is no one clear solution at present.  Most of the discussion centred upon collective identity and the rights that might be afforded an individual according to groups they have autonomously chosen and those within which they have been categorised.  What happens, for example, if a group can take a legal action but one has to prove membership of that group in order to qualify?  The risk here is that we move into terra incognita when it comes to the preservation of privacy.


Summary of Discussion

Generally speaking, it’s not enough to simply get institutional ethical approval at the start of a project.  Institutional approvals typically focus on protection of individuals rather than groups and research activities can change significantly over the course of a project.

In addition to anonymising data there is a case for making it difficult to reconstruct the entire data set so as to stop others from misusing it.  Increasingly we don’t even know who learners are (e.g. in MOOCs) so it’s hard to reasonably predict the potential outcomes of an intervention.

The BERA guidelines for ethical research are up for review by the sounds of it – and a working group is going to be formed to look at this ahead of a possible meeting at the BERA annual conference.