In today’s digital world, our personal data is exposed in many ways. From the websites we visit to the apps we use, we regularly hand over sensitive information about ourselves. That data can be used to track our movements, build marketing profiles, and even commit fraud, so there is a growing need for ways to protect our identity online. One solution is synthetic data: artificial data generated by algorithms that mimic the statistical properties of real-world data sets. It can be used to train machine learning models without exposing the sensitive records it was modeled on. Synthetic data therefore offers a safer way to put data to work while reducing the risk that our personal information is exposed or stolen. What about social data?
It just keeps getting harder to believe! No matter what you do, be sure to keep your social data within your four walls and make sure your defenses are turned up to 11. Then, and only then, be comfortable providing synthetic data to internal and external audiences, that is, data that has been obfuscated so thoroughly that nothing can be learned about any one individual. The data remains useful for aggregate analysis but is useless at the individual record level.
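To make the idea of "useful in aggregate, useless per record" concrete, here is a minimal Python sketch of one naive approach: summarize each column of a real data set on its own, then draw new rows from those summaries. The field names (`age`, `city`), the sample records, and the helper functions `fit_marginals` and `sample_synthetic` are all hypothetical, chosen only for illustration. Sampling columns independently like this keeps averages and category proportions roughly intact but discards relationships between columns, and it does not by itself guarantee that no individual can be re-identified.

```python
import random
import statistics


def fit_marginals(records, numeric_fields, categorical_fields):
    """Summarize each column independently; no real row is kept in the model."""
    model = {"numeric": {}, "categorical": {}}
    for field in numeric_fields:
        values = [r[field] for r in records]
        # Store only the mean and population standard deviation.
        model["numeric"][field] = (statistics.mean(values), statistics.pstdev(values))
    for field in categorical_fields:
        counts = {}
        for r in records:
            counts[r[field]] = counts.get(r[field], 0) + 1
        # Store only how often each category occurs.
        model["categorical"][field] = counts
    return model


def sample_synthetic(model, n, seed=None):
    """Draw n artificial rows from the per-column summaries."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        row = {}
        for field, (mean, stdev) in model["numeric"].items():
            # Numeric values come from a normal distribution, so they are floats.
            row[field] = rng.gauss(mean, stdev)
        for field, counts in model["categorical"].items():
            labels = list(counts)
            weights = list(counts.values())
            row[field] = rng.choices(labels, weights=weights, k=1)[0]
        rows.append(row)
    return rows


# Hypothetical "real" records that would stay inside your four walls.
real = [
    {"age": 34, "city": "Boston"},
    {"age": 29, "city": "Austin"},
    {"age": 51, "city": "Boston"},
    {"age": 42, "city": "Denver"},
    {"age": 38, "city": "Austin"},
]

model = fit_marginals(real, numeric_fields=["age"], categorical_fields=["city"])
synthetic = sample_synthetic(model, n=1000, seed=42)

real_mean = statistics.mean(r["age"] for r in real)
synthetic_mean = statistics.mean(r["age"] for r in synthetic)
print(f"real mean age: {real_mean:.1f}, synthetic mean age: {synthetic_mean:.1f}")
```

The synthetic rows can be shared for aggregate questions such as average age or the share of records per city, while no row corresponds to a real person. A production system would need a far more careful generator and a privacy review before anything leaves the building.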
“For well over a decade, identity thieves, phishers, and other online scammers have created a black market of stolen and aggregated consumer data that they used to break into people’s accounts, steal their money, or impersonate them. In October, dark web researcher Vinny Troia found one such trove sitting exposed and easily accessible on an unsecured server, comprising 4 terabytes of personal information—about 1.2 billion records in all.
While the collection is impressive for its sheer volume, the data doesn’t include sensitive information like passwords, credit card numbers, or Social Security numbers. It does, though, contain profiles of hundreds of millions of people that include home and cell phone numbers, associated social media profiles like Facebook, Twitter, LinkedIn, and Github, work histories seemingly scraped from LinkedIn, almost 50 million unique phone numbers, and 622 million unique email addresses.
“It’s bad that someone had this whole thing wide open,” Troia says. “This is the first time I’ve seen all these social media profiles collected and merged with user profile information into a single database on this scale. From the perspective of an attacker, if the goal is to impersonate people or hijack their accounts, you have names, phone numbers, and associated account URLs. That’s a lot of information in one place to get you started.””
Overview by Tim Sloane, VP, Payments Innovation at Mercator Advisory Group