Blog - Degree Analytics

Higher Ed's Big Data Privacy Problem

Written by Aaron Benz | Mar 1, 2022 2:34:03 PM

Part II - Purpose of Data Collection

In the first blog post of this series, we highlighted many of the systems that currently collect and use student data. That post showed a disconnect between how institutions use student data and what students think of their institution's data usage. But as it turns out, concern about data collection and its uses is not new - the problem was well documented from the beginning...

From the moment computers came into use, there was skepticism and concern about how data might be used, because collecting and storing data is what computers do. This excerpt is taken from “Records, Computers, and the Rights of Citizens,” which was produced by the Department of Health, Education, and Welfare in 1973, the same year that Ethernet was invented by Bob Metcalfe:

The use of automated data systems containing information about individuals is growing in both the public and private sectors... The Department itself uses many such systems, and in addition, a substantial number are used by other organizations, both public and private, with financial or other support from the Department. At the same time, there is a growing concern that automated personal data systems present a serious potential for harmful consequences, including infringement of basic liberties. This has led to the belief that special safeguards should be developed to protect against potentially harmful consequences for privacy and due process.

But still, those were the early days of computing - privacy was a lot simpler. Proper privacy protocols would stipulate that data should only be used within the context of its original purpose. For example, card swipe data should strictly be used for card swipe purposes, such as managing and understanding account balances so that a student can purchase a meal.

And then... it all changed...

In the new world of "Big Data" and advanced analytics, the new ways data is being shared, combined, or processed have created a lot of additional value that otherwise wouldn’t exist. For example, a student success coordinator may now be interested in looking at the last time a student swiped their card at a dining facility because it might lead to the discovery of a food-insecure student.

That is, the "new" use of this card swipe data, which requires bare-minimum analytics capabilities, presents at least a partial solution to one of the most serious growing issues in higher education today - food insecurity. According to a 2020 study, "a third (34%) of students say they know someone who has dropped out of college due to difficulties affording food." From a traditional privacy perspective, using student meal swipe data to identify these potential issues would constitute a breach. Nonetheless, are we willing to overlook the 27% of students who have suffered from food insecurity to uphold the privacy norms of the 20th century? Or perhaps, should we create a framework where an alert can be created on this institution-owned data, resulting in an email to students whose balances are empty and who haven't used the cafeteria in 10 days?
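An alert like the one described above could be sketched roughly as follows. This is only an illustration under stated assumptions, not a description of any real system: the `MealAccount` record, its field names, and the `students_to_contact` function are all hypothetical, and the 10-day threshold is simply the one used in the example above.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import List, Optional


@dataclass
class MealAccount:
    # Hypothetical record; the field names are assumptions for illustration.
    student_email: str
    balance: float
    last_swipe: Optional[date]  # None if the student has never swiped


def students_to_contact(
    accounts: List[MealAccount], today: date, inactive_days: int = 10
) -> List[str]:
    """Return emails of students with an empty balance and no recent swipe."""
    cutoff = today - timedelta(days=inactive_days)
    flagged = []
    for account in accounts:
        empty = account.balance <= 0
        # "Inactive" means the last swipe was at least `inactive_days` ago,
        # or the student has never swiped at all.
        inactive = account.last_swipe is None or account.last_swipe <= cutoff
        if empty and inactive:
            flagged.append(account.student_email)
    return flagged
```

Note that the function only returns a list of addresses to contact. Who sees that list, and what the email says, is exactly the kind of governance question this series is about.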

Some other examples of extending data use cases:

  • Using WiFi and videoconferencing data to infer class attendance, the leading predictor of student persistence (i.e., students staying in school). This might create an alert if a student stopped attending class for a week, so that an advisor could reach out to the student
  • Using online login and participation data to understand activity and inactivity (i.e., who stopped participating). Perhaps even combining it with demographic data to better understand the learning rates and efficiencies of different students and student groups
  • Understanding building utilization and student traffic flow to decide where to build the next cafeteria or the next parking lot, or even to inform course scheduling
  • Using digital signatures (card swipe, videoconferencing, LMS, WiFi, etc.) to generate more accurate “Last Day of Attendance” records that prove students were participating in academic activities, often reducing repayments of Title IV grants like Pell (by the school and the student)
  • Using demographic and academic records to identify the “most vulnerable students” coming into a school year, in order to suggest additional remedial classes that may raise their success rates. Likewise, tracking progress and participation after the remedial classes to see what actual effect those courses had on students and their progress
  • DEI analytics - measuring the inclusion of different demographic groups on campus by evaluating which services and resources students use (card swipe, LMS, WiFi, etc.), resulting in an analysis that details which student groups underutilize which resources
  • Measuring the impact of student interventions - did that email, tutoring session, or phone call re-engage a student back into class? Out of all the services offered to students, are some more effective than others? Are some not working as we’d hoped?

By combining and sharing traditional data resources that were previously segmented, the fundamentals of historical privacy are disrupted in the pursuit of bettering the University and helping the student. 

So end privacy? No - this is not a call for the abolition of privacy, but for a solution that both acknowledges privacy as a right and advances us toward the future.

How do we embrace capabilities created by data sharing and analytics while providing a structure that protects data and individual liberty? How do we enable new applications that provide insight that otherwise would not exist, but provide adequate governance to prohibit abuse?

These are not trivial questions...

The Products of Higher Ed

To properly establish what a new picture may look like, let's first establish the frame. In principle, data should be used only insofar as that use aligns with the product, purpose, or reason that a student decides to attend school. While there may be others, here are the two core purposes/products of Higher Education in terms of Student Success:

1) The Experience - the memories, events, milestones, and achievements, both in and out of class

2) Continued Learning - most notably by progression towards a degree

No matter the size, cost, type or mission of the institution - these two products are the reason students enrolled and are the reasons they are paying tuition. There may be additional missions of an institution or limitations that prohibit students from attending some (cost, location, etc...), but Experience and Continued Learning are the ultimate products of every institution of learning today.

At Degree Analytics, we believe that as a first principle, educational institutions ought to use their data to accomplish these outcomes. Data being used outside of this context is extraneous and out of bounds for the purposes of student success. 

This framing immediately eliminates many possible outside use cases - marketing purposes, the reselling of data to 3rd parties, the sharing of data with outside organizations, etc. However, there is still additional clarity needed within that frame - namely, are there applications that elevate student success but also create unnecessary risk or exposure of student data? And of course, how are these policies and frameworks both enforced and communicated?
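As a rough illustration of this first principle, the frame can be thought of as an allowlist of purposes that every proposed data use is checked against. The sketch below is hypothetical: the `Purpose` names and the `is_in_bounds` function are assumptions made for illustration, not part of any Degree Analytics product or policy.

```python
from enum import Enum


class Purpose(Enum):
    # The two core products of Higher Ed named above.
    EXPERIENCE = "experience"
    CONTINUED_LEARNING = "continued_learning"
    # Examples of uses that the framing rules out.
    MARKETING = "marketing"
    RESALE = "resale"
    THIRD_PARTY_SHARING = "third_party_sharing"


# Hypothetical allowlist: only uses that serve the two core products pass.
ALLOWED_PURPOSES = frozenset({Purpose.EXPERIENCE, Purpose.CONTINUED_LEARNING})


def is_in_bounds(purpose: Purpose) -> bool:
    """Return True if a proposed data use falls inside the frame."""
    return purpose in ALLOWED_PURPOSES
```

Passing this check is necessary but not sufficient. A use that serves student success can still create unnecessary risk, which is the additional clarity the questions above ask for.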

Part III will begin answering some of these questions, as well as provide more clarity and actual examples from our work at Degree Analytics on how privacy and progress can work together.

This is Part II in a multi-part series on Data Governance

  1. The Problem
  2. Purpose of Data Collection
  3. Legal and Background Information
  4. The Solution?

An Open Invitation

At Degree Analytics, we recognize how much of a gap exists, and continues to grow, between privacy and the pursuit of a better campus and student experience.

We believe that data can, and ought to, be used for the purpose of student success. We also recognize that protections and governance are needed to advance these technologies and their use cases, and to create trust with students, faculty, and staff.

We are starting an initiative to create an open-source bundle of materials to help jumpstart data governance at institutions of any size. Simply put, the goal is to create best practices that will allow institutions to put this necessary governance in place. However, we cannot do that alone.

This is an open invitation to those in Higher Education and those who are experts in privacy to help establish generalized and repeatable processes.

If you are interested in participating in any manner, please fill out this contact form, and we’ll reach out to you: degreeanalytics.com/privacy-and-data-governance