The Subjective Nature of Data: When Context is Outside of the Data
In my last blog post, I talked about the dangers of narrow queries that attempt to prove a conclusion.
In this post, I would like to address another discipline that can expose embedded subjectivity in data:
Subjective weakness: Assuming all pertinent context exists in the data.
Data discipline: Look outside the data for context that can give the data a different narrative.
As I did in my last post, I will explain with an example.
This story took place a few years after the Canadian payment card market had completed its migration to EMV. As part of that migration, contactless payments (tap) had also been introduced in Canada.
As you may know, contactless can work at a “magstripe level” as well as at an “EMV level”. As the contactless rollout had progressed more rapidly than the EMV migration, a fair amount of “magstripe-only” contactless infrastructure remained in Canada, even after the EMV migration had completed.
A global EMV security team at the payment network I worked for queried the data to determine the extent of the magstripe contactless infrastructure remaining in Canada. Their analysis caused an uproar, as it showed that over 60% of Canadian contactless transactions were at a “magstripe” level in a market that should have been a mature EMV market.
The uproar landed on my desk.
Their conclusion was that this 60% magstripe tap had to be caused by a deficiency on the merchant terminal side. Therefore, they argued, a merchant terminal upgrade project had to be initiated urgently.
Why is it always the merchants at fault?
This team could have used every bit of data in the company’s data warehouse, and they would have arrived at the same conclusion – because the key contextual information was not in the data.
There is a really geeky explanation as to why. Here is a paragraph of Geek History. . .
During the EMV migration and contactless rollout, there had been a problem. Contactless readers, in those days, were separate from the POS terminal. The reader manufacturers had shipped EMV-capable readers to Canada. When these readers were connected to magstripe-only terminals, EMV-capable tap cards would cause the terminal to fail.
To combat this issue, most Canadian issuers limited the contactless interface on their EMV cards to “magstripe-only” contactless.
Though it was now several years after the terminal issue had been fixed, these issuers had kept their cards with this same configuration – EMV on contact, magstripe on contactless.
I knew about this because I was part of the Canada EMV migration team. This critically important contextual information was not in the data.
I also knew which issuers came to market after the terminal issue had been fixed. These issuers’ cards were EMV-capable contact and contactless. So, I ran the same query the Global EMV team had run. Then I filtered the data by issuers whose cards supported EMV contactless.
This filter showed that over 95% of the merchant terminals were EMV contactless enabled. The problem was the 60% of cards in the market that were not EMV tap capable.
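The filtering step can be sketched roughly as follows. This is a minimal illustration in Python, not the original query: the record layout, the issuer names, and the counts are all invented for the example, and the only thing it borrows from the story is the idea of applying an issuer list that comes from outside the data.

```python
from collections import Counter

# Hypothetical toy records: (issuer, tap_mode). Invented for illustration,
# not real transaction data.
TRANSACTIONS = (
    [("legacy_bank", "magstripe")] * 6
    + [("new_bank", "emv")] * 3
    + [("new_bank", "magstripe")] * 1
)

# Context from OUTSIDE the data: issuers known to have launched with
# EMV-capable contactless cards. The names are assumptions.
EMV_TAP_ISSUERS = {"new_bank"}


def magstripe_share(transactions):
    """Return the fraction of taps processed at magstripe level."""
    counts = Counter(mode for _, mode in transactions)
    total = sum(counts.values())
    return counts["magstripe"] / total if total else 0.0


def filter_by_issuers(transactions, issuers):
    """Keep only taps made with cards from the given issuers."""
    return [t for t in transactions if t[0] in issuers]


overall = magstripe_share(TRANSACTIONS)
filtered = magstripe_share(filter_by_issuers(TRANSACTIONS, EMV_TAP_ISSUERS))

print(f"magstripe share, all cards: {overall:.0%}")
print(f"magstripe share, EMV-tap issuers only: {filtered:.0%}")
```

With cards that can tap at an EMV level, any remaining magstripe taps point at the terminal. Without the filter, the card limitation and the terminal limitation are indistinguishable in the data.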
I sent my queries and reports to the Global EMV team, explaining the context that lived outside of the data. They were able to effectively use that data and external contextual information to run a very successful issuer EMV card update project.
Often, anecdotal information carries no significance. Sometimes, it carries critically important details that completely change your data’s story. Always look outside your data for additional anecdotal context. Usually, it will confirm that your query is OK. Just often enough, it will save you from a very wrong conclusion.