[Image: an open padlock overlaid with data]

Do I need a new license? Creative Commons, Cambridge Analytica and Ethics

With thanks to Dan Lynds for suggestions and edits

Earlier this year Robin DeRosa and Rajiv Jhangiani launched the Open Pedagogy Notebook, a resource for educators to use open resources and learn more about the underpinning ideas of open. I first met Robin in August 2016 at DigPedPEI. We had lots of conversations, and in one particular breakout session Robin DeRosa, Daniel Lynds, Scott Robison, and I sat around in some comfy chairs and started talking about open. Eventually we got to talking about analytics: the data that is generated by students (and staff), and the tools that can take that data and, using a variety of algorithms, add some context, thereby either giving the student a representation of what they have done or predicting what they are likely to do.

Most of the current LMS/VLE offerings have some sort of data collection, and I imagine that even if the institution doesn’t use it, the vendor does have access to it, if only so that they can understand and refine things like usability. But does the student have access to that data? Does the student even know what data is collected and how it is used?

First, let me state that I believe analytics can be a great force for helping students succeed: for catching those at risk and supporting them earlier in their learning. At the same time, I definitely agree with the question Amy Collier asks: “How might colleges and universities shape, rather than simply adopt, the ways that companies treat data?”

Back in 2016 the four of us looked around and asked “What would open analytics look like?” We brainstormed many possibilities of what an environment where students had open access to their data would look like. Not just the data, but also the system itself, the algorithms it used to nudge student behaviour, and what pitfalls any of this would have.

In the last month the Cambridge Analytica story put the issue of data and analytics across all the major news sites. The data they had stolen (“harvested”) had probably been used unethically: they were nudging the behaviour of users on Facebook. Underpinning this issue is our data: how and where it is collected, how it is accessed, and who can access it. In the case of Cambridge Analytica and Facebook, a simple app and quiz pretty much gave them access to everything you posted and read, and to your friends’ data as well.

So what has this got to do with Creative Commons? My posts on this blog and my images on Flickr and elsewhere are all licensed under a Creative Commons license that allows you to:

  • Share — copy and redistribute the material in any medium or format.
  • Adapt — remix, transform, and build upon the material for any purpose, even commercially.

This is the license needed to make my work “open”, as currently defined.

I believe in the open movement, and I think open textbooks and open educational resources are excellent initiatives, and I am currently working on a chapter for an open textbook.

Data and algorithms are an issue. They can be good, but I believe they should be open and transparent, or else opt-in, with all of the caveats and warnings explained to students (and staff) as part of their learning about data literacy.

If I create an open object, whether it is an educational resource, a blog post, or a photo, I want to know that if it goes into a VLE/LMS or other educational tool, it will only appear in tools that have open and transparent analytics and algorithms, or opt-in data collection that students and staff can also access. But by current definitions, that would mean my work is no longer open. Creative Commons is also something I believe in, and I want my work to be seen, used, and adapted, but I want it done in a way that ethically aligns with my values. I do not mind if it is used commercially, but I do want to hold people to an ethical standard. Do I want an ethics rider for my Creative Commons license?

Currently the license states that my moral rights in the work are not affected. But that is not clear language. Preserving the integrity of the work allows the author to object to alteration, distortion, or mutilation of the work that is “prejudicial to the author’s honor or reputation”. Arguably, the work could still be used for “evil” if the people adapting it make it clear that they are the ones changing the context and that I had nothing to do with it.

I’m looking for an answer, and probably there isn’t one. It’s possible there isn’t even a problem. If there is one good thing that has come out of the Cambridge Analytica and Facebook story, it is that we are talking about these issues, and that people are realising that data and algorithms are not neutral: they have political bias, whether unconscious or deliberately placed there. I do believe that the Open Movement needs to look at analytics and algorithms and decide how open objects can be used in these closed systems, and what the implications are.


  1. It’s the data generated by the systems and the reports created by the data analysis that need oversight, not the content that is already CC licensed. Institutions need to take ownership and stewardship of that data and those reports, and not allow a third party to control them. We may need to go back to self-hosting LMSs if third parties won’t ensure institutional security of the data and the reports based on it.

  2. I’ll admit to not thinking this through properly, but it’s interesting how Facebook is responding to the CA mess by making promises of further openness and personal control of data. It’s all a bit without substance and doesn’t really deal with the issue that it’s the whole premise of Facebook that’s broken, not some feature bug. It’s working exactly as it was designed to – they just didn’t anticipate the outcome (or did but don’t care). Any changes offered could be rolled back when the fuss has died down anyway. I think Zuck sees it as a PR problem and is handling it accordingly.
    What I haven’t got sorted in my head yet is whether there is something similar going on with LMSs. The model of a command-and-control (I know that’s an overly pejorative label!!) system is a natural fit for the opaque harvesting and analysis of data. Yes, it can be used for good as well as evil, but I’d be interested in whether people feel that the current-generation LMS model actively works against what Lawrie’s talking about here. If we’d designed an LMS from the assumption that it should be transparent and controlled by the student, how different would it look? Would it even be recognisable as an LMS?

      1. That would be interesting to explore. Still assumes a single system even if the power relationship is different. Is it too tin-foil-hat to say that centralised systems are both a reflection of and a preparation for a particular sort of digital society? Have felt this about the fact that schools are using biometrics for lunch payments and putting CCTV in toilets (https://www.theguardian.com/education/2017/nov/02/secondary-school-cctv-pupil-toilet-areas-summerhill-surveillance).

        How could we design systems that show that other models are possible?
