I work with a trajectory dataset that holds records from people using a certain ticketing app (for public transportation). Each trajectory describes the route (i.e. an array of stations) of a bus, train, etc.
However, only a small fraction of passengers use this app (most buy regular tickets), so many trajectories are unique within the dataset. As a result, most records violate even k-anonymity with k = 3.
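To make the k-anonymity check concrete, here is a minimal sketch of what I mean. It assumes the quasi-identifier is the full, exact station sequence, so a record counts as a violation if fewer than k records share the same trajectory. The function name `k_anonymity_violations` and the station labels are made up for illustration and are not from the real dataset.

```python
from collections import Counter


def k_anonymity_violations(trajectories, k):
    """Return indices of records whose exact trajectory occurs fewer than k times.

    Assumption: the whole station sequence is treated as the quasi-identifier.
    """
    # Lists are not hashable, so convert each trajectory to a tuple for counting.
    counts = Counter(tuple(t) for t in trajectories)
    return [i for i, t in enumerate(trajectories) if counts[tuple(t)] < k]


# Toy data with hypothetical station labels.
trajectories = [
    ["A", "B", "C"],
    ["A", "B", "C"],
    ["A", "B", "C"],
    ["A", "D"],
    ["B", "C", "E"],
]

violations = k_anonymity_violations(trajectories, k=3)
print(violations)  # indices of records that break 3-anonymity
```

In this toy example the last two records are unique and therefore violate k = 3. In my real data this is the case for most records, because this check only counts app users and not the passengers who travelled the same route with a regular ticket.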
How should one account for the fact that only ~1% of passengers use this tracking app when anonymizing this dataset?