Corporation data collection conversation

From das_wiki
Revision as of 15:05, 14 October 2017 by David (Talk | contribs)


Questions are bolded, responses are prepended with the name of the respondent, and comments are italicized.

Moderated by Laura Brandimarte. Panelists: Richard Kosinski (sorry, only link is to LinkedIn), Deirdre Mulligan, Ashkan Soltani


How has data use evolved as the internet has emerged?

Kosinski: Data available with the rise of the internet drove new businesses.
Soltani: Data is like radiation.


What about data collection by multinational companies, which may be subject to different regulations?

Mulligan: Europe has omnibus protection, with Data Protection Authorities. U.S. does not; it has "piecemeal" laws at the state and federal levels. Most are on data use, but there are some for collection as well. The FCC has new rules limiting data use by companies.
Mulligan: U.S. protections against government access to data stored by companies are stronger than those in E.U. countries.
Brandimarte: There used to be Safe Harbor allowing U.S. companies to serve customers in E.U., with privacy rights guaranteed.


What are the privacy issues involved in advertising based on the emotional state of consumers?

Kosinski: Mediabricks works to identify the emotional state of the consumer, so that other companies can choose which states the consumer should be in when their ads are served, in mobile apps. E.U. consumers are not served at all, due to privacy regulations.
Mulligan: Manipulation of a consumer by a marketer brings up consumer protection issues broadly. Some populations are vulnerable, and this kind of marketing might be exploitative. The built environment might be "like an experiment" in that exceptional kinds of information might be collected and used to sell. This might be "creepy."
Kosinski: Advertising is self-regulated to avoid being "creepy": the Interactive Advertising Bureau (IAB) and the Digital Advertising Alliance (DAA) are two such bodies. The company "can build anything" but has to "have a conscience."


  • Kosinski works for Mediabricks.

How should we be anonymous online?

Soltani: There is an arms race between the tracking technologies used by companies who profit from targeted advertising and the privacy-enhancing technologies used by privacy-conscious users. Quantcast set supercookies on users who tried to delete their cookies to avoid being tracked. The DAA and the IAB (mentioned above) have incentives to overcome privacy protections.
Soltani: Consumers are now at an informational disadvantage relative to the companies they engage with when choosing a product. Companies know many things about consumers that consumers do not know are known. An example is offline enhancement, where offline data is combined with online behavioral data.
Soltani: "Sucker lists" were investigated by the FTC because the people on them are vulnerable. When are consumers made aware that the ads shown by Mediabricks are shown because they are in a particular emotional state (like a "winning" or "rescued" moment)?
Soltani: More broadly, targeted advertising raises a question about what a pen-tester might call "vulnerability analysis:" much like software might be analyzed for vulnerabilities, are humans being targeted for vulnerabilities to sell them things?
Kosinski: The economics of apps given away for free demands targeted advertising. This is done in aggregate; it is never designed to "put someone at risk or in jeopardy."


  • Consumers are susceptible to manipulation due to the information imbalance when ad targeting is not transparent. When people are not aware of why they are being targeted (for example, because they just finished a jog and are feeling a "winning" moment), they can make an emotional decision to buy. By contrast, telling people why they are targeted makes their decision rational, which might block the "impulse buy." Moreover, transparency involves revealing how much is known about the target, which may be "too much" or may be a matter of competitive advantage that an advertiser wishes to keep secret.

The Facebook emotional contagion study

Mulligan: Policymakers, FB users, and researchers were concerned because the research was conducted outside of the usual protections for research participants. An Institutional Review Board oversees the suitability of manipulating research participants, to prevent harm. Something like A/B testing is different from being part of a basic science research project.


  • Jane Bambauer resists the response that the emotional contagion study on Facebook justifies further regulation.

Google's privacy policy was changed so that people's real names could be linked to the behavioral advertising data Google collects. What happened, and why does it matter?

Soltani: FB has a real-name policy; as it moved to advertising across the web, it showed that having a "strong identifier" was useful for targeting. Google hadn't used strong identifiers until recently, but now it does, like FB. This means there is a bridge between your offline activities and your online activities. One consequence is that your ability to inhabit different social roles, and to keep them separated, is degraded. Who you are at work is no longer as separable from who you are at home as it had been, for example.
Mulligan: It had been opt-in for Google (partly thanks to the FTC). For new accounts, it is now opt-out. Back to why this matters: DoubleClick was going to buy Abacus a while ago. Abacus had marketing data about offline purchasing behavior, attached to people's names. The deal was blocked, and the reasons highlight why the Google case is problematic: all your behavior is linked back to a single account, which is also tied to your activity offline. This matters because people forget that there's someone behind the screen. Horvitz does research on search logs and sees the sensitivity of that data: financial information, sex lives, purchases, etc. A search bar is treated like a thing for our own use only; people forget that what they type into a search bar is communicated to many third parties.
Kosinski: Aggregate information enables lots of cool/important things. Since identifiers are not available from cookies on a mobile device in the same way as they are on desktop, you need emails as identifiers.
Mulligan: People don't realize what the information they disclose to Google or similar companies says about them. It is unintuitive how well Google can combine what it already knows with seemingly innocuous information to infer things about you.
Soltani: This is especially hard as technology changes. Now there is the ability to do passive collection in a way that links when you've visited a store to your online activity. For example, Datalogix allows marketers to learn that a person who bought a razor from Walgreens (which is known from the use of a loyalty card) also saw a particular Facebook ad.
Kosinski: The business justification for this is like the "Nielsen family" arrangement from way back when. Even in the 2000s, Nielsen data would be crossed with the Yahoo advertising database.


  • Why is the kind of information foraging we do on the web worthy of protection? This seems to me like the question, "why is our activity at a library protected?" We want people to be able to learn about what they want to learn about without self-censoring out of concern for someone observing their behavior.
  • Kosinski's point about aggregate information is hard to understand here.

Facial Recognition is related to tying people's identity online and offline. How do you see this relation?

Kosinski: For an advertiser, facial recognition is a cheap way to judge sentiment, at a coarse grain.
Soltani: There are hidden costs of facial recognition. It is identification at a distance, without your knowledge, and it's a strong identifier (it now works across age ranges as well). You can't opt out of facial recognition, and you can't control (very well) what can be inferred from your face, which is largely involuntary. The government has used information gathered by commercial entities; for example, a database of faces from Yahoo's recorded videos.