Addressing Privacy Concerns in the CHFC Data Explorer

The Carnegie Hero Fund Commission (CHFC) maintains an extensive database with thousands of entries covering heroes, victims, nominees, relatives, friends, artifacts, acts, and more. Because humanizing the data is a central theme of the project, protecting the privacy of these individuals is a top priority in designing a comprehensive research platform. This article is the second in a series that will explore the layers of the Data Explorer prior to its planned public release this fall.

Potential Security Risks Within Data

The CHFC's data contains fields that fall into two categories: direct identifiers and quasi-identifiers.

Direct Identifiers    Quasi-Identifiers
First Name            Location
Last Name             Date
Email                 Generalized Addresses
Headshot              Various Tags
                      Family
                      Occupation
                      Roles
                      Award Status

Direct identifiers are attributes that can be looked up directly to find information about a specific person. Quasi-identifiers do not expose an individual's identity on their own, but when several of them are combined with publicly available data such as censuses or reports, individuals can be identified. If identification is a concern, especially for victims and nominees whose information has not traditionally been made publicly available by the CHFC, some quasi-identifiers would have to be generalized further before the data is safe to release.
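
One common way to put a number on this risk is k-anonymity: count how many records share each combination of quasi-identifier values, and treat the smallest group as the weak point. The article does not say the CHFC uses this measure, so the following is only a minimal sketch of the idea. The field names (`state`, `year`, `occupation`) and the records are hypothetical.

```python
from collections import Counter

# Hypothetical records; field names and values are illustrative only.
records = [
    {"state": "PA", "year": 1931, "occupation": "miner"},
    {"state": "PA", "year": 1934, "occupation": "miner"},
    {"state": "PA", "year": 1938, "occupation": "miner"},
    {"state": "OH", "year": 1932, "occupation": "farmer"},
    {"state": "OH", "year": 1937, "occupation": "farmer"},
]

QUASI_IDENTIFIERS = ["state", "year", "occupation"]

def smallest_group(rows, quasi_identifiers):
    """Return the size of the rarest quasi-identifier combination (the k in k-anonymity)."""
    groups = Counter(tuple(row[q] for q in quasi_identifiers) for row in rows)
    return min(groups.values())

def generalize_year(row):
    """Coarsen an exact year to its decade, e.g. 1931 -> 1930."""
    return {**row, "year": row["year"] // 10 * 10}

# With exact years every record is unique, so k is 1.
k_exact = smallest_group(records, QUASI_IDENTIFIERS)

# After generalizing years to decades, the rarest combination covers two records.
k_coarse = smallest_group([generalize_year(r) for r in records], QUASI_IDENTIFIERS)
```

In this toy data, generalizing one field raises the smallest group from one record to two, which illustrates why coarser quasi-identifiers make individuals harder to single out.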

The interconnections among the layers of the application could also present a risk. Each person is connected to other people, acts, and locations through an explicitly defined relationship. This means that the CHFC data can function as a social network graph, revealing connections between individuals that would not otherwise be obvious.
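
To illustrate how explicit relationships can expose an indirect link, here is a small sketch that treats people, acts, and locations as nodes and searches for a path between two people who share no direct edge. The node names, relationship labels, and graph layout are assumptions made for illustration, not the Data Explorer's actual schema.

```python
from collections import deque

# Hypothetical edges; names and relationship labels are illustrative only.
edges = [
    ("person:A", "act:1", "hero_of"),
    ("person:B", "act:1", "victim_of"),
    ("person:B", "person:C", "relative_of"),
    ("act:1", "location:X", "occurred_at"),
]

def build_graph(edge_list):
    """Build an undirected adjacency map from (source, target, label) triples."""
    graph = {}
    for source, target, _label in edge_list:
        graph.setdefault(source, set()).add(target)
        graph.setdefault(target, set()).add(source)
    return graph

def shortest_path(graph, start, goal):
    """Breadth-first search; returns the list of nodes from start to goal, or None."""
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        if path[-1] == goal:
            return path
        for neighbor in sorted(graph.get(path[-1], ())):
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(path + [neighbor])
    return None

# person:A and person:C have no direct edge, yet the graph links them.
path = shortest_path(build_graph(edges), "person:A", "person:C")
```

Here the hero `person:A` is tied to `person:C` through a shared act and a victim's family relationship, the kind of inferred connection the paragraph above warns about.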

Tiered Access Model

The CHFC's main approach to addressing these privacy concerns is a tiered access system.

Public Tier

  • Victim and nominee information is largely hidden
  • Person-to-person connections should be limited to heroes
  • People associated with an organization or location are not listed individually; a count is shown instead for visualization purposes
  • Locations are heavily generalized
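
Two of the public-tier rules above can be sketched in a few lines: dropping person-to-person edges that involve anyone other than a hero, and replacing lists of associated people with counts. This is a rough illustration under stated assumptions. The role labels, the data layout, and the reading that both endpoints of an edge must be heroes are hypothetical.

```python
# Hypothetical data; roles, identifiers, and structure are illustrative only.
roles = {"A": "hero", "B": "victim", "C": "hero", "D": "nominee", "E": "hero"}
person_edges = [("A", "B"), ("A", "C"), ("C", "D")]
people_by_location = {"Region 1": ["A", "B", "C"], "Region 2": ["D", "E"]}

def public_edges(edge_list, role_of):
    """Keep only connections where both endpoints are heroes (an assumed reading of the rule)."""
    return [(a, b) for a, b in edge_list
            if role_of[a] == "hero" and role_of[b] == "hero"]

def public_counts(groups):
    """Replace each list of people with its size so no individual is listed."""
    return {name: len(members) for name, members in groups.items()}

visible_edges = public_edges(person_edges, roles)
visible_counts = public_counts(people_by_location)
```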

Research Tier — access granted by the CHFC

  • Increased allowance for data exporting
  • All connecting edges are accessible
  • Locations specified as zip codes or location blocks
  • Some access to witnesses and victims
  • Authentication required

Administrative Tier — for CHFC and direct partners

  • Can add data to the explorer
  • All exact locations and specific information allowed
  • Full export capabilities
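
The three tiers can be thought of as progressively wider views of the same record. The sketch below shows one possible way to express that. The `Tier` enum, the field names, and the exact fields shown at each level are assumptions for illustration and are not taken from the CHFC's implementation.

```python
from enum import IntEnum

class Tier(IntEnum):
    PUBLIC = 1
    RESEARCH = 2
    ADMIN = 3

# Hypothetical record; field names and values are illustrative only.
record = {
    "role": "victim",
    "name": "Jane Doe",
    "state": "PA",
    "zip_code": "15222",
    "street_address": "1 Example St",
}

def view_for(rec, tier):
    """Return the fields of rec visible at the given tier."""
    if tier >= Tier.ADMIN:
        # Administrative tier: exact locations and specific information.
        return dict(rec)
    if tier == Tier.RESEARCH:
        # Research tier: location no finer than zip code (assumed cut-off).
        return {k: v for k, v in rec.items() if k != "street_address"}
    # Public tier: very general location; names shown only for heroes.
    visible = {"role": rec["role"], "state": rec["state"]}
    if rec["role"] == "hero":
        visible["name"] = rec["name"]
    return visible

public_view = view_for(record, Tier.PUBLIC)
research_view = view_for(record, Tier.RESEARCH)
admin_view = view_for(record, Tier.ADMIN)
```

Filtering in one place, before any data leaves the server, keeps the tier rules out of individual visualizations, though the real system may be organized differently.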

Written by Shanker Pillai