Matters of Privacy

Tuesday, January 5, 2021

In November, UC Santa Barbara joined six other UC campuses in the California COVID Notify pilot project to track COVID-19 infections. The Notify technology, which works through smartphones, can tell users if they may have been in close proximity to someone who has tested positive for the coronavirus. When the app is activated on iPhones or downloaded on Androids, the phones start broadcasting randomly generated, anonymous codes that change every ten to twenty minutes. When another phone using the app is nearby, both phones record each other’s codes and the amount of time the phones were near each other, but not the users’ identities or locations. If a user tests positive for the virus and enters that information into the app, the system matches the codes that phone has exchanged over the past fourteen days and notifies the users of the other phones of their potential exposure.
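The exchange described above can be sketched in a few lines of Python. This is a simplified illustration, not the actual COVID Notify implementation: the `Phone` class, the `meet` helper, and the 15-minute rotation interval are assumptions made for the example, and a real system would broadcast over Bluetooth and generate its codes with a more elaborate cryptographic scheme.

```python
import secrets
from dataclasses import dataclass, field

ROTATION_MINUTES = 15        # assumed; the article says codes change every 10 to 20 minutes
RETENTION_DAYS = 14          # contacts older than this are ignored
MINUTES_PER_DAY = 24 * 60


@dataclass
class Phone:
    """Hypothetical phone that stores only random codes, never identities or locations."""
    sent: dict = field(default_factory=dict)   # rotation slot -> code this phone broadcast
    heard: dict = field(default_factory=dict)  # code heard nearby -> [last minute heard, total minutes]

    def current_code(self, minute: int) -> str:
        """Return the code broadcast at `minute`, creating a fresh one per rotation slot."""
        slot = minute // ROTATION_MINUTES
        if slot not in self.sent:
            self.sent[slot] = secrets.token_hex(16)
        return self.sent[slot]

    def hear(self, code: str, minute: int, duration: int = 1) -> None:
        """Remember a nearby code and how long it was in range."""
        entry = self.heard.setdefault(code, [minute, 0])
        entry[0] = minute
        entry[1] += duration

    def report_positive(self, now: int) -> set:
        """Codes this phone broadcast during the last RETENTION_DAYS, to be published."""
        cutoff = now - RETENTION_DAYS * MINUTES_PER_DAY
        return {code for slot, code in self.sent.items()
                if (slot + 1) * ROTATION_MINUTES > cutoff}

    def exposure_minutes(self, published: set, now: int) -> int:
        """Minutes spent near any published code within the retention window."""
        cutoff = now - RETENTION_DAYS * MINUTES_PER_DAY
        return sum(total for code, (last, total) in self.heard.items()
                   if code in published and last >= cutoff)


def meet(a: Phone, b: Phone, start: int, duration: int) -> None:
    """Simulate two phones staying near each other for `duration` minutes."""
    for minute in range(start, start + duration):
        b.hear(a.current_code(minute), minute)
        a.hear(b.current_code(minute), minute)


if __name__ == "__main__":
    alice, bob, carol = Phone(), Phone(), Phone()
    meet(alice, bob, start=0, duration=30)
    published = alice.report_positive(now=40)           # Alice tests positive
    print(bob.exposure_minutes(published, now=40))      # Bob was nearby
    print(carol.exposure_minutes(published, now=40))    # Carol never met Alice
```

Because the matching runs on each phone against a published list of random codes, no phone in this sketch ever learns who the other person was or where the encounter took place.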

Great hope lies in the development of COVID Notify and other technologies that collect such person-generated health data (PGHD). Digital self-tracking technologies, such as wearables, apps, and sensors, present ample opportunities for individuals to increase their health-related self-knowledge and for scientists to use the data in health-related studies.

To help process and interpret the high volumes of complex information provided by PGHD, researchers have turned to machine learning (ML), using algorithms to make predictions and automate decision-making for policies and guidelines. While machine learning has a tremendous upside, it also raises questions about the type and amount of data being tracked and about the ethical and legal issues related to data collection, access, sharing, and usage.

“Important societal factors must be considered when building algorithms to ensure accuracy and trustworthiness, and to eliminate bias,” said William Wang, an assistant professor of computer science and the Mellichamp Chair in Artificial Intelligence and Designs at UCSB. “Artificial intelligence and machine learning are changing many different industries like healthcare, which is why it is so essential to better understand and regulate the responsibilities of the technologies that shape our world.”

Wang, the director of UCSB’s Center for Responsible Machine Learning (CRML), works to better understand the mutual influences that society and machine learning have on each other. He examines the important societal factors and impacts, such as fairness, transparency, privacy, and accountability, that should be considered when building algorithms.

His research efforts have drawn support from Evidation Health, a California-based company that will join the CRML as a founding corporate sponsor. Evidation’s app and research platform, Achievement, helps individuals and organizations better understand and measure health in everyday life, outside the clinic walls. The Achievement network comprises nearly four million individuals and is the largest, most connected, and most demographically and geographically diverse cohort in the U.S., representing all fifty states and nine out of every ten ZIP Codes nationwide. Achievement prioritizes user privacy and control and follows a consent-per-use model, in which participants provide consent for each program they join and receive compensation for their contribution to research. Achievement has been the basis for pioneering studies across diverse health topics, from COVID-19 and Alzheimer’s disease to chronic pain and physical activity.

“We’re really excited to be a founding member of the center,” said Evidation co-founder and chief data scientist, Luca Foschini, who earned his PhD in computer science from UCSB in 2012. “As one of the few health-tech companies in Santa Barbara, and because health data is the most sensitive data we can have as individuals, I thought it was important to support the effort of a center whose first objective is to develop methodologies for the responsible use of ML tools.”

Foschini says that one of Evidation’s first projects with the center will be to develop efficient differential-privacy algorithms for PGHD. Differential privacy is a system that allows information about a dataset to be publicly shared while withholding information about individuals to protect their identities. He describes PGHD as a “unique fingerprint,” because it is often highly identifying.
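As a rough illustration of the idea, the sketch below releases a noisy count using the Laplace mechanism, a standard textbook approach to differential privacy. The article does not say which algorithms Evidation and the CRML will develop, so the mechanism, the function names, and the `epsilon` parameter here are assumptions chosen for the example.

```python
import random


def laplace_noise(scale: float, rng: random.Random) -> float:
    """Draw Laplace(0, scale) noise as the difference of two exponential draws."""
    return rng.expovariate(1 / scale) - rng.expovariate(1 / scale)


def private_count(values, predicate, epsilon: float, rng: random.Random) -> float:
    """Release how many values satisfy `predicate`, with epsilon-differential privacy.

    Adding or removing one person changes a count by at most 1 (sensitivity 1),
    so Laplace noise with scale 1 / epsilon is enough to mask any individual.
    """
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    true_count = sum(1 for v in values if predicate(v))
    return true_count + laplace_noise(1 / epsilon, rng)


if __name__ == "__main__":
    # Made-up daily step counts for eight hypothetical participants.
    daily_steps = [3200, 8100, 12050, 4400, 9900, 15200, 700, 10400]
    rng = random.Random()
    noisy = private_count(daily_steps, lambda s: s >= 10_000, epsilon=0.5, rng=rng)
    print(f"Participants with 10,000+ steps (noisy): {noisy:.1f}")
```

A smaller `epsilon` adds more noise and gives stronger privacy; a larger one gives a more accurate count but reveals more about any single participant. The published figure stays useful in aggregate while no individual’s record can be confidently inferred from it.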

“This characteristic is great to enable personalized health applications, but it hinders the ability to share data between researchers who want to develop and validate ML models on it, which is important for medical applications,” said Foschini, who was a guest speaker at a virtual summit, hosted by Wang earlier this fall, at which participants discussed how ML is being used in COVID-19 research. “Differential privacy could enable data sharing in a way that preserves privacy and breaks down silos between researchers who want to share data for the greater good.”

Wang sees the new partnership as a great opportunity to ensure the privacy and trustworthiness of PGHD, strengthen ML-based research and development efforts at the local level, and attract talented students, engineers, and scientists to Santa Barbara.

“Working with local industry is an essential part of the center. Evidation Health and CRML share the same vision and excitement for developing future responsible machine-learning technologies,” said Wang. “It’s an honor to continue the long tradition established by UCSB, the College of Engineering, and the Computer Science Department of partnering with industry to conduct innovative and collaborative research that positively impacts society.”