2

We have a dataset of highly privacy-sensitive personal data and want to build a database from it. Our company's data protection department does not want data scientists to be able to see any data specific to an individual person (even if anonymized). We can't pre-aggregate the data in the database because there are hundreds of possible aggregations that could be of interest.

Is there software or a DBMS that can ensure users can only query aggregated results in which every group contains at least N people?

How else would you solve this problem technically?

user86825
  • "aggregated results that contain at least groups of N people" would not make the data anonymized at all. – Valentas Dec 12 '19 at 12:47

1 Answer

0

Two possible options:

  1. Have the database administrator restrict SQL queries so that only aggregated result tables in which every group contains at least N people are returned.

  2. Apply differential privacy, under which a query result cannot be used to infer much about any single individual. This is typically achieved by adding calibrated random noise to each result.
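
As a rough illustration of both options, here is a minimal Python sketch. Everything in it is an assumption made for the example, not part of any particular DBMS: the `people` table, the `department` column, the threshold `MIN_GROUP_SIZE`, and the helper names are all hypothetical. Option 1 is shown as a `HAVING COUNT(*) >= N` filter behind a function that users would call instead of writing raw SQL, and option 2 as the Laplace mechanism applied to a count (a count has sensitivity 1, so the noise scale is 1/epsilon).

```python
import random
import sqlite3

MIN_GROUP_SIZE = 5  # hypothetical threshold N
ALLOWED_GROUP_COLUMNS = {"department"}  # hypothetical allow-list of groupable columns


def build_demo_db():
    """Create an in-memory demo table (made-up data, for illustration only)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE people (id INTEGER PRIMARY KEY, department TEXT)")
    rows = [("A",)] * 6 + [("B",)] * 2
    conn.executemany("INSERT INTO people (department) VALUES (?)", rows)
    return conn


def grouped_counts(conn, group_column, min_group_size=MIN_GROUP_SIZE):
    """Option 1: per-group counts, with groups smaller than N suppressed."""
    # The column name is interpolated into SQL, so it must be allow-listed.
    if group_column not in ALLOWED_GROUP_COLUMNS:
        raise ValueError(f"grouping by {group_column!r} is not permitted")
    sql = (
        f"SELECT {group_column}, COUNT(*) FROM people "
        f"GROUP BY {group_column} HAVING COUNT(*) >= ?"
    )
    return dict(conn.execute(sql, (min_group_size,)).fetchall())


def noisy_count(true_count, epsilon, rng=random):
    """Option 2: Laplace mechanism for a counting query (sensitivity 1)."""
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    # The difference of two i.i.d. exponentials with rate epsilon is
    # Laplace-distributed with location 0 and scale 1/epsilon.
    return true_count + rng.expovariate(epsilon) - rng.expovariate(epsilon)


if __name__ == "__main__":
    conn = build_demo_db()
    counts = grouped_counts(conn, "department")
    print(counts)  # department B has only 2 rows and is suppressed
    print({dept: noisy_count(n, epsilon=1.0) for dept, n in counts.items()})
```

Note that a size threshold on its own can still leak information: two permitted queries whose groups differ by a single person can be subtracted from each other. Combining the threshold with noise, and limiting how many queries each user may run, would be one way to reduce that risk.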

Brian Spiering