Saving Face(book): three lessons from the Cambridge Analytica scandal


The Cambridge Analytica scandal on the Facebook platform is still making the rounds in marketing circles, and for very good reason. In many ways, and across virtually every category, calls will be made for heads to roll in data and analytics departments nationwide, just as they were (initially) for the head of Mark Zuckerberg. “How could this happen?” the world seemed to ask. More accurately, the throngs pleaded, “How could YOU LET this happen?”

The harsh – and probably less titillating – reality, however, is that neither Zuckerberg nor Facebook is guilty of so much as a misdemeanor as far as this story goes. The folks at Cambridge Analytica were up to some very underhanded activities, and OF COURSE they did it out of sight of Facebook’s developer guidelines.

A quick review of what transpired: Cambridge Analytica (through a developer company called GSR) created an app called “thisisyourdigitallife” and then convinced 270,000 people to download it. Users shared profile data and answered questions about themselves in exchange for a payment. That part is totally legal and fine.

What’s not legal, and very much not fine, is that the app those users agreed to give access to their profile data was also accessing data from their extended networks through Facebook. Unknowingly, and without consent, friends and associates of those initial 270,000 had their profile data accessed too. Some estimates put the digital swipe at about 50 million profiles (roughly 185 times the original group of users); a new report issued last week raises the estimate to 87 million. The algorithm GSR built used that data to create (according to some reporting) 30 million unique “profiles” that then helped in the design of highly targeted political ads.

There are numerous ways to unpack this. But for the sake of the practitioner who may be leveraging data (that’s everyone), or thinking about it, let’s look at the basic but extremely important lessons this offers us.

Lesson 1: It’s NOT Facebook’s fault.
Let’s leave Facebook out of it (mostly) in terms of blame. Facebook was neither complicit in nor even aware of the underhanded swiping of data or the duping of unwitting consumers into handing over information. They have clear policies, and those were blatantly violated by a business on the prowl. [To be clear, “data-scraping” tactics were allowed at one point for academic purposes, but have since been altogether forbidden on the platform.]

Facebook has the odd misfortune of being the central place where more than two billion people go to share information. That Cambridge Analytica stole from those people is the issue, but so many of the news stories focused on the idea that users had their data stolen ON FACEBOOK. That’s not fair, and it’s certainly not indicative of the platform’s policies and guidelines regarding third-party developers.

Even if (and this is fiction) there were some way for Facebook to oversee or even closely monitor every interaction that every third-party developer has with any user on the platform, a developer with dubious intentions would simply write an evasive script first to keep its real intentions hidden. That’s Hacker 101.

Lesson 2:  This doesn’t make ALL data collection “bad.”
One story, even an egregious one like this, is not evidence of a trend or a sign of where the digital marketplace is headed. So let’s not jump to conclusions about the use or misuse of data in marketing. Although it seems like the reflexive idea du jour, now is not the time to “re-evaluate every data collection activity, provider, or service” and start lobbying to pull data – or at least data collection – out of marketing. Data makes life infinitely better for the majority of consumers, whether they realize it or not.

Virtually every advance in marketing (from a digital point of view) has been made infinitely more appealing by the use of broad arrays of interoperable data sets. From programmatic advertising and retargeting to contextualized offers and recommendations that are algorithmically derived, the average online consumer is treated to a platter of timely propositions that make sense based on their online behaviors.

This is also a good time to remind everyone that maybe seeing your face squished as if in a funhouse mirror isn’t worth compromising the last seven years of your profile data. And that when you see that “you are now leaving Facebook” warning, it’s because You. Are. Now. Leaving. Facebook.

Lesson 3: Make it a teaching moment.  Evaluate your partners today.
This is an excellent opportunity for careful evaluation and timely introspection. Let’s take a good hard look at ALL our partners – data collection, data storage, data transfer, database, or otherwise – and give them a thorough once-over. Make sure their collection methods are sound. Make sure their statistics are sound. Make sure their conclusions are rooted in strong discipline and rigor. Make sure they’re collecting information that YOUR BRAND can actually use for YOUR objectives. (They’re not mining your customer data pool for information they can sell to, say, your competitors, eh?)

As a paying customer, you have the right to ask what sample sizes your data and/or research partners require before drawing general conclusions, and so on. This way, when someone calls you on a “you are the company you keep” claim, you can be assured of (and even write policy around) your vetting methods. And here’s a handy little secret: you can brag about it to your clients, too.

 

What happens in the sky might be solved in the cloud.


At this point, nearly two weeks after the disappearance of Malaysia Airlines flight 370, the only certainty is that another conspiracy theory is on the way.  We’ve heard every theory from hijacking to pilot suicide to computer-hacking terrorism.  There’s no plane.  There’s no physical evidence.  There’s no group claiming responsibility.  And the worst part – there’s no concrete data to tell us where that plane was, where it was heading, or what might have gone wrong.  Some of the best thinking, not surprisingly, is being put forward on WIRED.com.

Why is that?

It turns out that, as astounding an engineering feat as it may be to get a few hundred tons of aluminum aloft and cruising at 500 mph, there really is not that much new “technology” in aviation.  Sure, there are on-board computers, there are advanced avionics systems, there’s radar and so on.  But in terms of how planes are tracked, the systems are still pretty crude.

In the United States, for instance, there can be upwards of 50,000 aircraft flying through the skies on any given day.  These are tracked through air route traffic control centers (ARTCCs) using basic radio frequencies.  A plane flying from New York to Los Angeles, for instance, is simply “handed off” from one ARTCC to the next, until it’s close enough to talk to the air traffic control tower (ATCT) at Los Angeles.  Along the way, the crew is instructed on basic parameters:  what altitude to fly at, what heading to take and so on.  And flights heading across oceans don’t even have real-time contact:  they’re given a heading and an altitude, and they simply “check in” via high-frequency radio with control centers that can sometimes be thousands of miles away.

When a plane crashes (a rare occurrence, in terms of probability) or disappears (even less likely), the investigation usually focuses on finding the “black box.”  The black box houses a flight data recorder and a cockpit voice recorder.  These record all kinds of information about the flight, including mechanical data and the conversations between the cockpit and the towers.

Why not modernize flight data and cockpit voice recording into a more technologically advanced system?  For instance, why doesn’t every commercial flight have a real-time data stream to the cloud?  From the time a plane is at the gate, through takeoff and climb, flight routing, approach and landing, EVERYTHING can be uploaded in real time to the cloud.
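
To make that concrete, here is a minimal sketch of what the aircraft-side half of such a stream might look like: a loop that snapshots a handful of flight parameters once a second and pushes them to a cloud ingest endpoint as JSON. The endpoint URL, the field names, the sample values and the one-second cadence are all assumptions for illustration – none of this reflects a real avionics interface or airline API.

    import json
    import time
    import urllib.request

    INGEST_URL = "https://example.com/flight-telemetry"  # hypothetical cloud endpoint

    def read_sensors() -> dict:
        # Stand-in for the avionics data bus; real values would come from the aircraft.
        return {
            "flight": "MH370",            # illustrative identifier
            "timestamp": time.time(),
            "lat": 2.74, "lon": 101.71,   # placeholder position
            "altitude_ft": 35000,
            "ground_speed_kt": 480,
            "heading_deg": 25,
            "fuel_remaining_kg": 37800,
        }

    def stream_once() -> None:
        # Package the current snapshot as JSON and upload it.
        payload = json.dumps(read_sensors()).encode("utf-8")
        req = urllib.request.Request(
            INGEST_URL, data=payload, headers={"Content-Type": "application/json"}
        )
        urllib.request.urlopen(req, timeout=5)

    if __name__ == "__main__":
        while True:        # from pushback at the gate to shutdown at arrival
            stream_once()
            time.sleep(1)  # assumed one-second reporting interval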

This would be big data indeed.  On the receiving end, interested parties (from airplane manufacturers to airline executives to airports) can monitor that data for all kinds of information BEFORE anything happens.  Think of the advances that might be realized:

  • A real-time data stream can tell the pilots and the airline about on-the-ground conditions, such as tire pressures, tire wear (heck, even your basic automobile can do that), hydraulic systems, power systems, computer systems and more.
  • In-flight data streams can inform on other conditions like rate of fuel burn, weather-related data (triangulated with the aircraft’s current heading and velocity), best altitudes for certain legs, engine efficiency and diagnostics, and can even act as a precursor to ATC at arriving airports for more streamlined traffic management.  Every interested party could tap into segments of the data set for relevant and actionable information.
  • Imagine – if the real-time data recording detects any glitch whatsoever, the awaiting airport can have the appropriate crews ready to remedy the problem and get the plane back in the air sooner rather than later.  That’s good for the airline, and for impatient passengers.  (A rough sketch of this kind of monitoring follows this list.)
  • With big data providing in-air information, manufacturers like Boeing and Airbus can have access to a wealth of information about their aircraft – a post-sale, ongoing flight test that yields longer-term observations and, in turn, feeds their engineering teams an up-to-the-moment feedback loop.
  • With big data, we could probably streamline airport efficiency as well. (Yay!)
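
Here, equally rough, is the ground-side counterpart referenced in the third bullet above: a monitor that scans each incoming telemetry snapshot against acceptable ranges and raises an alert the moment something drifts out of bounds, so the arrival airport can have the right crew waiting. The parameter names and limits are invented for illustration, not real maintenance thresholds.

    from typing import Iterable

    # Hypothetical acceptable ranges for a few monitored parameters.
    LIMITS = {
        "tire_pressure_psi": (180, 220),
        "hydraulic_pressure_psi": (2800, 3100),
        "fuel_burn_kg_per_hr": (5500, 7500),
    }

    def find_glitches(snapshot: dict) -> list:
        # Return a human-readable note for every reading outside its limits.
        notes = []
        for field, (low, high) in LIMITS.items():
            value = snapshot.get(field)
            if value is not None and not (low <= value <= high):
                notes.append(f"{field}={value} outside [{low}, {high}]")
        return notes

    def monitor(stream: Iterable) -> None:
        for snapshot in stream:
            for note in find_glitches(snapshot):
                # In practice this might page maintenance at the arrival airport.
                print(f"[{snapshot.get('flight', '?')}] ALERT: {note}")

    if __name__ == "__main__":
        demo = [{"flight": "MH370", "tire_pressure_psi": 162,
                 "hydraulic_pressure_psi": 2950, "fuel_burn_kg_per_hr": 6900}]
        monitor(demo)  # prints an alert for the low tire pressure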

But mostly, the benefits of big data center on safety.  Big data is, at worst, informative.  And at best, it’s predictive.  If we could predict when issues might arise (even at the probability level), we could keep pilots, crews and passengers safe, and probably avert any more, um, disappearing aircraft.

But why isn’t this done on a global scale?  There are drawbacks to such a proposal, to be sure.

  • Any system that can be built is eventually at risk of being hacked.  Duly noted.  So we build in the world’s most sophisticated security (like every government/defense/space program has), and find ways to packetize, encrypt and protect.
  • There’s the sheer heft.  We’re talking storage running into the exabytes and a data center the size of Topeka.
  • And this most likely hasn’t been done because it would be prohibitively expensive – to the tune of tens or even hundreds of billions of dollars to craft, build, deploy and maintain a data system of this magnitude.  And then there’s the storage/archiving issue.  (A back-of-envelope estimate of the scale follows this list.)
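
For what it’s worth, here is that back-of-envelope estimate, with every input loudly assumed (flights per day, average flight length and streaming rate are guesses for illustration, not industry figures). Even with generous numbers, the raw stream lands at roughly a petabyte a day and a few tenths of an exabyte a year – enormous, but not beyond what large cloud operations already handle.

    # All three inputs are assumptions for illustration, not industry figures.
    FLIGHTS_PER_DAY = 100_000       # assumed worldwide commercial flights
    AVG_FLIGHT_HOURS = 2.5          # assumed average flight duration
    STREAM_RATE_MB_PER_SEC = 1.0    # assumed full-fidelity telemetry plus audio

    bytes_per_flight = AVG_FLIGHT_HOURS * 3600 * STREAM_RATE_MB_PER_SEC * 1e6
    bytes_per_day = bytes_per_flight * FLIGHTS_PER_DAY
    bytes_per_year = bytes_per_day * 365

    print(f"per flight: {bytes_per_flight / 1e9:.0f} GB")   # ~9 GB
    print(f"per day:    {bytes_per_day / 1e15:.1f} PB")     # ~0.9 PB
    print(f"per year:   {bytes_per_year / 1e18:.2f} EB")    # ~0.33 EB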

But think about it:  that cost could be amortized across every airline, manufacturer and aviation association in the world, and if the system carries with it the promise of improved safety, greater efficiency, and predictive analytics, who wouldn’t be in favor of that?

We have entered the age of the Internet of Things.  Our homes are warmed by “smart” thermostats that are remotely controllable.  We have “smart” TVs and “smart” dishwashers and “smart” refrigerators to enhance our entertainment choices and the temperature of our water. So why not a smarter aviation infrastructure?

But who could build such a vast and predictive data center?  I don’t know for sure, but it might rhyme with Froogle.