Mudge’s testimony highlights a problem for Twitter, Facebook: Where is your data?

By Daniel G. Barnes On Sep 15, 2022

During a congressional hearing on Tuesday, Twitter whistleblower Peiter Zatko was repeatedly asked whether Twitter knew how its user data is accessed and stored.

On several occasions, he gave an awkward answer: the company doesn’t know.

The problem, however, extends far beyond Twitter, according to a range of engineers and experts in Silicon Valley. In a recent court hearing, for example, a senior Meta engineer also struggled to provide answers to questions about how Facebook pulls together all the information it collects about its billions of users.

“I would be surprised if there was even one person who could conclusively answer this narrow question,” the engineer said in court testimony that was first reported by the Intercept. Facebook provided the court with a list of 55 systems and databases where user data could be stored.

Tech giants like Google, Facebook, and Twitter were founded more than 15 years ago and developed freewheeling cultures where engineers and teams could create databases, algorithms, and other software independently of each other. Speed was favored over security measures that might slow things down. That was before years of privacy lawsuits and legislation pushed companies to tighten their data practices.

But experts said companies are still struggling to pay off years of technical debt as regulators and consumers demand more from tech companies, such as the ability to delete data or know exactly what is being collected about a person. And some of those speed-first practices haven’t changed.

“Many Twitter engineers took the position that security measures were making life difficult for them and slowing people down,” said Edwin Chen, who has held engineering positions at Twitter, Google and Facebook and is now CEO of the content moderation start-up Surge AI. “And that’s certainly a bigger issue than just Twitter.”

Some of these systems are black boxes even to those who built them, said Katie Harbath, a former Facebook policy director and CEO of the consultancy Anchor Change. (Facebook changed its name to Meta last year.) Even if the right policies are in place, they can be difficult to implement when the underlying databases were not designed to answer questions such as where a person’s location or profile information might have been stored.

“It’s hard to start from scratch, especially as you grow,” she said. “The way these systems were originally set up, each team had enormous autonomy.”

In the Meta court case, a Northern California class action lawsuit over the Cambridge Analytica privacy scandal that the company settled last month, plaintiffs demanded that the company show them all the information it collects and stores about them. This could include people’s precise locations throughout the day, health conditions they researched or groups they joined, and inferences such as the likelihood that a person was married.

Facebook initially offered data from the company’s “Download Your Information” tool, but a judge found in 2020 that the information provided by Facebook was too limited. Yet Facebook’s response, recorded in a deposition this summer, was essentially that even the company’s own engineers didn’t know where all the data was.

Dina El-Kassaby, a spokeswoman for Meta, Facebook’s parent company, said the deposition did not mean the company was failing on security or data access. “Our systems are sophisticated and it should come as no surprise that no engineer in the company can answer every question about where every piece of user information is stored,” she said. “We have one of the most comprehensive privacy programs in place to oversee data usage across our operations and to carefully manage and protect people’s data. We have made, and continue to make, significant investments to meet our privacy commitments and obligations, including extensive data controls.”

During Tuesday’s Senate hearing with Zatko, the whistleblower and former security chief made similar comments about Twitter. He noted that in a recent data breach, Twitter accidentally leaked the personal information of 50 million employees (Zatko’s attorney later issued a corrective statement saying Zatko meant 20,000).

Zatko noted during the hearing that Twitter has nowhere near that many employees (the current number is 7,000) and pointed out that Twitter keeps too much information about former employees and contractors that it fails to delete.

He repeatedly claimed that the company had up to 4,000 engineers, more than half of all company employees, with wide access to internal systems and few ways to officially track who accessed what. It was a dangerous situation, he said, because an individual employee could take control of a Twitter account and impersonate its owner.

If that employee were secretly working for a foreign government, the risks of giving employees wide latitude to access user data would be much greater. Zatko alleged that Twitter knowingly had employees who were working for the Indian and Chinese governments, but did not provide evidence to support those claims.

And in a separate report on the company’s ability to combat misinformation, which was part of the trove of documents Zatko provided to Congress, an independent auditor noted that Twitter lacks a formal system to track cases of users who violated company rules.

Twitter has repeatedly pushed back against Zatko’s arguments. A spokeswoman, Rebecca Hahn, previously told The Washington Post that Twitter had tightened security extensively since 2020, that its security practices are in line with industry standards, and that it had specific rules about who can access company systems. In response to Tuesday’s hearing, Hahn reiterated that Zatko’s arguments were “riddled with inconsistencies and inaccuracies,” but declined to elaborate on specifics.

David Thiel, technical director of the Stanford Internet Observatory at Stanford University and a former Facebook security engineer, said that after reading Zatko’s disclosures, he felt that Twitter’s security processes seemed to be years behind those of Facebook. He noted that Facebook had significantly tightened access in response to various controversies over the years, including the allegation that Facebook allowed the company Cambridge Analytica to access user data, to the point that if an engineer accessed a system he did not have permission to access, “someone will come after you and you will be fired.”

But he said it’s still common in Silicon Valley to give engineers wide access so they can “build great products quickly.”

“The focus,” he said, “is always on speed and access.”

He said sometimes companies, including Facebook, really can’t know everything that’s inside their systems.

For example, machine learning systems and software algorithms are built from tens of thousands of data points, often calculated instantaneously. Although it is possible to put data points into the system, one cannot then go back and retrieve the original inputs. He drew a food analogy, noting that it would be impossible to turn a soup back into its original ingredients.

But other data, he said, is simply complex, and companies are resisting the extensive work it would take to track it all down — and would likely only do so if compelled by new laws or court rulings.

It’s not “so complicated that it’s not doable,” he said.
