For the second year in a row, I attended LAK and OER conferences one shortly after the other. This year, the Learning Analytics and Knowledge conference came first in Vancouver in mid-March followed by the OER17 Conference in London in early April.
Going into LAK, I made wrote Deciding Better, Learning Better: Different Kinds of Stories and made an active attempt to gather input from those who I deeply respect in the Open community who have also been (fiercely?) critical of the use of data. The exercise was successful in gathering several insightful comments which I then shared as part of the introduction to the LAK17 Hackathon. I went back to the Learning Analytics pre-conference MOOC from 2010-2011. Many of the topics hadn’t changed, but the folks involved have. (I uneasily looked at the names of folks involved wondering how I wound up the last one of them in this space…)
HackatLAK
To my surprise, the presentation was well received. Moreover of the challenges I put forward the most practical of them – the development of operational dashboards for use by learning technologists were adopted as a project. It’s hard for me to assess if we simply pitched the event in such a way to attract the most practical minded folks or if there is a change afoot in the Learning Analytics space. Either way, I walked away with some practical queries and dashboards that I’m hoping we can implement this year. Basic information on what parts of the system are being used and which ones aren’t. Yay!
The challenge
The outcome
Another challenge that I brought forward, submitted by Brian Lamb at a pre-Hackathon event within my department was to develop a digital literacy playground. That challenge was not explicitly adopted at the Hackathon, but it did offer some inspiration to some interesting work by Adam Cooper. Jeremy Knox’s LAK presentation presented a different approach but based on common themes.
SARPThe concept of the Digital Literacy Playground, and its potential importance, continued to grow throughout the remainder of the conference, especially when the conversation shifted to “mutli-modal data” (aka: wearables and/or the Internet of Things) and predictive analytics.This is the stuff that I wish didn’t exist, but it does and despite my desire to hide under a tin foil hat, that is no solution.
Discussions surrounding the new EU privacy legislation and the changing notion of ‘informed consent’ came up both at LAK and with Annie Marie Scott on a later trip to Edinburgh and offered yet another use case for such a space. In any conversations related to student-facing data there was a general consensus that students don’t understand it and that this is a problem. This stuff is happening to our students all the time outside of class, we can’t protect them, we need to educate them. A digital literacy playground might help.
About the same time, I ran into Amy Collier’s Digital Sanctuary work and got some early interest from others via Twitter. Before OER17 a talk by Audrey Watters at Coventry University reinforced some of the worrying themes from LAK17 themes that carried directly into OER17. Data protection, ethics, privacy, algorithms, surveillance. This stuff is happening to our students all the time outside of class, we can’t protect them, we need to educate them. A digital literacy playground might help.
See Jim Groom’s post, giving kudos to Kate Green, Christian Friedrich, and Markus Deimann on their workshop “Towards Openness – Safety in Open Online Learning?” In a comment on that post, Kate Green pointed to the Databox project which in some respects more closely approximates a vision proposed by Kirsty Kitto at the LAK Hackathon – Why not let students play with their own data?Full circle.
(My answer was then (and is now) that while ideal, the logistics of such a system are beyond the scope of what I want to tackle. I do hope though that these efforts move forward and maybe someday render the digital literacy playground obsolete.)
So from all of those conversations, here is what has emerged to date.
Digital literacy playground
A synthetic dataset modelled on a series of existing data sources currently gathered by the university, college or other institution they are attending. The dataset should include at a minimum data related to student information (demographics, grades and courses), payment information, location information (Eduroam and/or library data) and LMS/VLE data (Moodle, Blackboard, etc.) and an example of a third party system.
This dataset and associated tools will enable students to freely view, manipulate and use data to gain a deeper understanding of both the risks and benefits of sharing data. For each data source, the Data Literacy Playground will also include documentation relating to the current approved uses, laws and processes safeguarding this data.
Students will be encouraged to build models, tools and apps using the data and to share these with other students. Faculty use of the Data literacy playground will also be actively encouraged.
Proposed approach – Very drafty
Define data types required
Assemble an interdisciplinary group to gather and review related existing literature from a variety of fields and develop a series of use cases for the digital literacy playground. Using those use cases, identify the types of data required and associated laws and policies related to each data type. Attempts will be made to consider applicable laws across several jurisdictions, the scope of which will be determined based on the project participants.
Identify potential data sources
For each data type, specific relevant sources of data will be reviewed. In cases where similar data is gathered in multiple systems, some redundancy will be considered but the goal will be usability so the number of data sources may be simplified in order to better achieve that goal. The plan will include mapping source systems at a number of institutions with the hopes of creating a framework and examples that can be more easily applied at multiple institutions.
Create a synthetic version of the data
At this point, it is unknown what the willingness will be of institutions to allow their actual data to be used to create a synthetic version. It is expected that this process will be easier at some institutions that others. The goal will be to identify at least one but hopefully three institutions that will allow an actual data set to be used to create a synthetic version. A process for creating a synthetic version will be developed, implemented and tested to ensure that the characteristics of the data are sufficiently preserved while stripping away personal information. During this process, data types that cannot be well synthesized will also be identified and either removed, or dealt with in another agreed upon manner.
Processes to maintain, clean are refresh the datsets will be defined and implemented.
Testing and initial tools development
The initial prototyping of some tools and dashboards will take place and help with testing to ensure the system is reflecting appropriate information. Efforts will be made to overlay existing tools on the dataset in order to offer some early views for students to begin viewing and manipulating the data in various ways aligned with the use cases developed earlier in the project.
Implementation
As part of the testing phase, there will need to be active efforts to engage both students and faculty in the use of the digital literacy playground. Those efforts will need to continue after it is fully implemented and a mechanisms for sharing uses, findings and tools should be developed.
Next steps
Please share your thoughts, comments or suggestions as a comment, tweet or email. As smarter brains than mine suggest things, I’ll continue to update the plan.
Beyond that, I’m hoping to pull together interested folks to maybe make this a reality (or at least to say that we tried). I’m hoping to have an open source Slack-like channel set up next week and if you have so much as nodded as I described this project and I can find your email address, you can expect an invite. If there are others interested in participating please comment, tweet, email or sent a carrier pigeon in my direction. 🙂
Note: I wrote this as a proposal that I didn’t end up submitting. As a result, it sounds far more “defined” than this idea really is, so *please* offer any suggestions, thoughts or feedback 🙂
Hi Tanya – thanks for writing this up so fully. As you know I’m very interested in what we might be able to do with this in terms of helping out students become more data-savvy and also make really informed choices about consent when it comes to their data. A few thoughts that have come to me since reading this – take or leave as you will:
1. Presentations of data – textual presentation of data is really important. Graphs are variously crap/hard/scary/misleading. See Jeremy’s feedback in the LARC project.
2. Would personas offer us anything useful in this space? If we are using synthetic samples of data, then coming up with a few variations on types of students that could be used as short-hand for presentations of this data might offer some easy routes into exploration (part-time / full-time; online / blended; working full time + doing CPD; working hard but struggling; working hard, doing okay but still stressing). This would combine / overlap with use cases I suspect.
3. What do you think about collecting information about typical consequences to sit alongside this. E.g. what would be the consequences of sharing or not sharing data. What sort of interventions might the data drive?
Thanks again for collecting all of this together and putting so much thinking into it.
A-M.
This is awesome work Tanya. I have this sort of pipe dream of trying to get students access to their own data and then letting them use that for analysis. I think that your approach of building a synthetic data set is more likely and possibly more powerful as it could have simulated data from a wider context. Used for data and digital literacy I would hope that it would start to raise questions for students about data ownership and access. This is great work – can’t wait to see where you take it next!
Hi Autumn,
We have the same pipe dream. I hope this idea is an intermediary step towards students having access and rights over their own data. Having said that something from the Learning Analytics conference that is ringing in my ear (though I need to confirm that I’ve understood it correctly) is that when we show students representations of their own data – even if they are told they are made up – they take them as truth. If I’ve understood that statement correctly, in my mind it’s all the more reason to *start* with synthetic data in which there is not even fictitious or anonymized representation of themselves.
No one is more interested than me in where this might go next, but the positive feedback is certainly encouragement to continue to move it forward at whatever pace and whichever partners are willing and able to participate. One. Step. At. A. Time. 🙂
This is really exciting stuff Tanya and thinking it over a bit more since we talked I have a few further thoughts. Most of them about how to get students to engage with the tool when they will come with widely varying degrees of knowledge and qualitative skills.
1. Data to text representations would be worth attempting – as per Jeremy’s work.
2. Could personas be combined with use cases to show how different identities can be created? I’m thinking maybe this could change as more or less data sets are used too.
3. Have you thought about including info on consequences? E.g. Typical interventions that might be driven from the data. We talked about including legal information but this could be useful too.
Thanks so much for writing this up and sharing it. It’s give us lots of food for thought. Would be great to see if we can find a way to turn at least some of it into reality!
Hi all,
I was at the LAK17 hack with Tanya and am part of the current Mozilla Open Leaders cohort.
i will be focusing the n this project as I also thought it was a fantastic idea. I have set-up a GitHub repository (quirksahern/DataLiteracyPlayground) and will shortly be setting up a slack channel.
Unfortunately I won’t be at LAK18 but it would be great if we can get this project kickstartedf for then.
Happy to be emailed and contacted on twitter – @2standandstare