Daily Tech Digest - March 15, 2018

How Valuable Is Your Company's Data?

"As best as I can tell, there's no manual on how to value data but there are indirect methods. For example, if you're doing deep learning and you need labeled training data, you might go to a company like CrowdFlower and they'd create the labeled dataset and then you'd get some idea of how much that type of data is worth," said Ben Lorica, chief data officer at O'Reilly Media. "The other thing to look at is the valuation of startups that are valued highly because of their data." Observation can be especially misleading for those who fail to consider the differences between their organization and the organizations they're observing. The business models may differ, the audiences may differ, and the amount of data the organization has and the usefulness of that data may differ. Yet, a common mistake is to assume that because Facebook or Amazon did something, what they did is a generally-applicable template for success. However, there's no one magic formula for valuing data because not all data is equally valuable, usable or available.



Let Your Data Scientists Be Human


This does not mean that humans are obsolete. Humans are stronger at communication and engagement, context and general knowledge, creativity, and empathy. When I have a frustrating problem, I want to talk to a human, someone who will understand my exasperation, listen to my experience, and make me feel valued as a customer while also solving my problem for me. Humans are better at common sense than computers, instantly recognizing when a decision doesn’t make sense. And humans can be creative. I recently heard music composed by a computer, and I’m sure that song won’t make it into the Top 40! Traditionally, businesses have hired data scientists who manually designed and built algorithms. The data scientists spent much of their time writing code and applying mathematics and statistics. Data scientists had no time to be human.


Yet Another Hospital Gets Extorted by Cybercriminals

Unlike the vast majority of ransomware attacks, the Hancock attack was not the byproduct of a successful phishing campaign. “The [hacking] group obtained the login credentials of a vendor that provides hardware for one of the critical information systems used by the hospital,” explains Hancock Health President and CEO Steve Long. “Utilizing these compromised credentials, the hackers targeted a server located in the emergency IT backup facility utilized by the hospital...” Since they’d made a practice of regularly backing up all of their critical files, Hancock administrators initially believed that they would be able to purge the compromised files and replace them with clean backup versions. Unfortunately, it turned out that the “electronic tunnel” between the backup site and the hospital had been intentionally blocked. Several days later, administrators discovered that “the core components of the backup files from many other systems had been purposefully and permanently corrupted by the hackers.”


Is this the dawn of the robot CEO as artificial intelligence progresses?

A human CEO can be corrupted by outside influence, but generally they have the freedom to make up their own minds and will face life-changing consequences should their impropriety be discovered. Robot CEOs, on the other hand, could be completely ‘brain-washed’ by cybercriminals. For all of their incisive decision making and their unfaltering commitment to the company’s balance sheets, board and shareholders, a robot CEO could effectively ruin a company in seconds, or – if obfuscation is the game – quietly skim the company of profits in a ‘death by a thousand cuts’ approach. Kaspersky Lab researchers think the idea of robot CEOs is intriguing, but have some very real concerns about a future where robots are given too much responsibility. Cybercriminals go where the money is. That means if the robot stands between them and the possibility of substantial financial gain, they’ll find a way to exploit it. It’s always a cat and mouse game in cybersecurity. We come up with a defence; they find a way around it. It would be no different for a robot CEO.


Can The CIO Be The CDO?


The SAP Digital Transformation Executive Study indicates that successful companies must combine the best of these modes, resulting in what is effectively a “bimodal” approach to driving innovation. Our findings suggest that 72% of digital transformation leaders see a bimodal architecture as key to maintaining their core processes while quickly implementing next-generation technology. For better or worse, CIOs are traditionally associated with mode 1 – keeping the company running efficiently and effectively, at the lowest cost and least disruption. (No wonder they reigned during the era of deploying ERP systems.) It’s mission-critical, but it’s also the less glamorous side of IT today. In contrast, CDOs are all about disruption and digital transformation – the “mode 2” initiatives: driving new sources of revenue generation and using data to improve the customer experience. According to the SAP study, mode 2 initiatives fall into the category of “core business goals” for 96% of the Top 100 leaders in digital transformation, compared with 61% of laggards.


Voice-Operated Devices, Enterprise Security & the 'Big Truck' Attack

Biometric authentication, for one, doesn't solve the problem. In theory, Alexa could learn to identify authorized people's voices and listen only to the commands they give. But while this seems like a possible solution, the opposite is actually true. To begin with, there is an inherent trade-off between usability and security. Implementing such a system means that users would have to go through an onboarding process to teach Alexa or any other voice-enabled device how they sound. Compared to the status quo, where Alexa works out of the box, we are talking about a serious degradation in user convenience. Biometric identification also means false negatives: if your voice sounds different because you are sick, sleepy, or eating, Alexa will probably not accept you as an authorized user. And this is not all: there are systems available (Adobe VoCo is one example) that, by using a sample of a person's voice saying one thing, can generate a new sample of that person's voice saying another thing.


With auditability, deep learning could revolutionise insurance industry


Given the low level of risk involved, said Natusch, “correlation is sufficient to drive action”. For applications such as Google’s photo identification, neural network-based algorithms are sufficient, he said. But in a risk-averse use case, such as decision-making in healthcare, people need to understand why a given decision was made, he said. This requires a causation-based approach, more suited to probabilistic graphical models, he added. Speaking of his experiences at Prudential, Natusch said: “We need two models – one to understand historical data, and something for handwriting recognition.” Discussing how handwriting recognition could streamline claims processing at the insurer, Natusch said that once a paper claims form is scanned, it ends up as a grayscale image. This is effectively a set of numbers that can be analysed using a neural network.
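To make the "set of numbers" point concrete, here is a minimal sketch of how a grayscale scan could be turned into input for a neural network layer. The 2x2 image, the weights, and the function names `flatten_and_scale` and `dense_relu` are all made up for illustration; nothing here reflects Prudential's actual system or a trained model.

```python
# Illustrative only: a toy 2x2 "scan" and made-up weights, not a trained model.

def flatten_and_scale(image):
    """Turn a grid of 0-255 grayscale pixels into a flat list of floats in [0, 1]."""
    return [pixel / 255.0 for row in image for pixel in row]


def dense_relu(inputs, weights, biases):
    """One fully connected layer with ReLU: max(0, w . x + b) for each output unit."""
    outputs = []
    for unit_weights, bias in zip(weights, biases):
        activation = sum(w * x for w, x in zip(unit_weights, inputs)) + bias
        outputs.append(max(0.0, activation))
    return outputs


# Hypothetical scanned patch: 0 is black, 255 is white.
scan = [[0, 255],
        [255, 0]]

features = flatten_and_scale(scan)

# Hypothetical weights for two hidden units over the four pixels.
hidden = dense_relu(features, [[1.0] * 4, [-1.0] * 4], [0.0, 0.0])
```

A real handwriting recognizer would use a much larger image and learned weights, but the input is the same kind of object: a list of pixel intensities.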


Innovation in Retail Banking 2017

The level of investment in both digitalization and innovation has increased in lockstep as a result. The report illustrates the varying priorities of organizations of different sizes and the challenges and opportunities in the marketplace. More than ever, it is clear that having a defined innovation business model, with the application of data and advanced insights, is an imperative for success. It is also clear that being a ‘fast follower’ is not a viable strategy. We would like to thank Efma and Infosys Finacle for their partnership and their sponsorship of the 9th annual Innovation in Retail Banking research report. Their partnership has enabled us to create the most robust benchmarking of digitalization and innovation in banking, and to better understand the impact across all components of the financial services ecosystem.


How to train and deploy deep learning at scale

There was no deep learning in Spark MLlib at the time. We were trying to figure out how to perform distributed training of deep learning in Spark. Before getting our hands dirty and actually implementing anything, we wanted to do some back-of-the-envelope calculations to see what speed-ups you could hope to get. ... The two main ingredients here are just computation and communication. ... We wanted to understand this landscape of distributed training, and, using Paleo, we've been able to get a good sense of this landscape without actually running experiments. The intuition is simple. The idea is that if we're very careful in our bookkeeping, we can write down the full set of computational operations that are required for a particular neural network architecture when it's performing training.
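The bookkeeping idea can be sketched as a back-of-the-envelope estimate of computation time plus communication time. This is not Paleo's actual model: the `DenseLayer` class, the factor of three for forward plus backward passes, and the "send gradients, receive parameters once per step" communication pattern are simplifying assumptions chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class DenseLayer:
    """Hypothetical fully connected layer, described only by its shape."""
    inputs: int
    outputs: int

    def params(self) -> int:
        # Weight matrix plus one bias per output unit.
        return self.inputs * self.outputs + self.outputs

    def flops_per_example(self) -> int:
        # Forward pass: one multiply and one add per weight.
        forward = 2 * self.inputs * self.outputs
        # Assumption: the backward pass costs about twice the forward pass.
        return 3 * forward


def estimate_step_seconds(layers, batch_size, workers,
                          device_flops, bandwidth_bytes, bytes_per_param=4):
    """Estimate one training step as computation time plus communication time."""
    total_flops = sum(layer.flops_per_example() for layer in layers) * batch_size
    # Assumption: the batch splits evenly and every device runs at peak speed.
    compute = total_flops / (workers * device_flops)

    if workers == 1:
        communicate = 0.0
    else:
        model_bytes = sum(layer.params() for layer in layers) * bytes_per_param
        # Assumption: each worker sends gradients and receives parameters once.
        communicate = 2 * model_bytes / bandwidth_bytes
    return compute + communicate


layers = [DenseLayer(1000, 1000)]
one_worker = estimate_step_seconds(layers, 100, 1, 1e9, 1e9)
two_workers = estimate_step_seconds(layers, 100, 2, 1e9, 1e9)
speedup = one_worker / two_workers
```

Even this crude estimate shows the trade-off the speakers describe: adding workers divides the computation term but adds a communication term, so the speed-up from two workers comes out below 2x.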


Outbrain Outgrows Initial Big Data Infrastructure, Migrates

"If a researcher or algorithm developer wants to introduce something new, we want to say 'let's try it, let's go for it,' " Yaron said. But it had become too difficult to do that with the 5-year-old Hadoop cluster which was now 330 nodes. "It resulted in bad performance for our algorithms when the Hadoop cluster wasn't stable enough," Yaron said. "At some point it became a source of frustration." Yaron's team decided to rebuild its Hadoop infrastructure with new physical servers and standardizing on a MapR implementation of Hadoop. "We are a company that runs a lot of open source, and we try to contribute to the open source community as well," Yaron told me. "However, there are cases where we feel there is value to enterprise technologies." The new physical servers changed the ratio between disk space, RAM, and CPU, Yaron said, and hardware and software upgrade enabled Outbrain to reduce the footprint of Hadoop servers in the data center to one-third of what it had been before.



Quote for the day:


"More people would learn from their mistakes if they weren't so busy denying them." -- Harold J. Smith