Daily Tech Digest - August 25, 2020

When Your Heartbeat Becomes Data: Benefits and Risk of Biometrics

We haven’t even discussed the ability to detect your walking patterns (already being used by some police agencies), monitor scents, track microbial cells or identify you from your body shape. More and more organizations are looking for contactless methods to authenticate, which is especially relevant today. What all these biometric technologies have in common is that they use some combination of physiological and behavioral methods to make sure you are you. There are certain things people just can’t fake. You can’t fake a heartbeat, which is as unique as a retinal scan or fingerprint. You can’t easily fake how you walk. Even your typing and writing styles give off a distinct and unique signature. ... Some of the best innovators are threat actors. They may not be able to replicate your heartbeat today, but what about tomorrow? The not-too-distant future could include a “Mission: Impossible” scenario in which 3D printers generate a ‘body suit’ (think wetsuit) with a simulated heartbeat uploaded into it. This all may sound like science fiction right now, but not too long ago it would have seemed just as silly to think that your heartbeat could be identified through clothing by a laser from over 200 yards away.


What skills should modern IT professionals prioritise?

Though technical skills, like those tied to cyber security and emerging tech, are a focus, IT professionals are coming to realise that non-technical skills are a critical element of their career development and IT management. When asked which of these were most important, IT pros listed project management (69%), interpersonal communication (57%), and people management (53%). According to the LinkedIn 2020 Emerging Jobs Report, demand for soft skills like communication, collaboration, and creativity will continue to rise across the SaaS industry. Despite the budget and skills issues IT professionals report, 53% of those surveyed said they’re comfortable communicating with business leadership when requesting technology purchases, investing time and budget in team training, and the like. Though developing tech skills is often informed by current areas of expertise, the 2020 IT Trends Report reveals that strong IT performance is about more than IT skills. Interpersonal skills are commonly referred to as “soft skills”, which is misleading: they rank highly in overall importance, meaning soft skills aren’t optional. They’re human skills — everyone needs to relate to other people and speak in a way they can understand. My advice in this area would be to find a mentor, someone on your team who can help you learn. Practice your communication skills and try your hand at new specialties like project management.


Predictive analytics vs. AI: Why the difference matters

Fast forward to today. Within the information governance space, two terms have been used quite frequently in recent years: analytics and AI. Often they are used interchangeably, as if they were practically synonymous. Organizations, as well as the software vendors that supply their needs, have largely tapped analytics to provide deeper information beyond basic indexed searching, which typically involves applying Boolean logic to keywords, date ranges, and data types. Search concepts have expanded to filter out application-specific metadata (e.g., parsing mail distribution lists, application login time, login/logout/idle times in chat and collaborative rooms, etc.). Today's search also includes advanced capabilities such as stemming and lemmatization (methods for matching queries with different forms of words) and proximity search, allowing searchers to find the elusive needle in the haystack. The latest whiz-bang features that are all the buzz within the information governance space are analytics (or predictive analytics) and AI (artificial intelligence/machine learning). These are here to stay, and we are just beginning to scratch the surface of their many uses.
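The stemming and lemmatization mentioned above are easy to see in action. Here is a minimal sketch using the open-source NLTK library; the article names no particular tool, so the library choice and example words are purely illustrative:

```python
# Stemming crudely chops word endings; lemmatization maps each form to a
# dictionary headword. Either way, "searching", "searched" and "searches"
# all match a query for "search".
import nltk
from nltk.stem import PorterStemmer, WordNetLemmatizer

nltk.download("wordnet", quiet=True)   # one-time corpus downloads
nltk.download("omw-1.4", quiet=True)

stemmer = PorterStemmer()
lemmatizer = WordNetLemmatizer()

for word in ["searching", "searched", "searches"]:
    print(word, "->", stemmer.stem(word), "/", lemmatizer.lemmatize(word, pos="v"))
```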


Too many AI researchers think real-world problems are not relevant

New machine-learning models are measured against large, curated data sets that lack noise and have well-defined, explicitly labeled categories (cat, dog, bird). Deep learning does well for these problems because it assumes a largely stable world. But in the real world, these categories are constantly changing over time or according to geographic and cultural context. Unfortunately, the response has not been to develop new methods that address the difficulties of real-world data; rather, there’s been a push for applications researchers to create their own benchmark data sets. The goal of these efforts is essentially to squeeze real-world problems into the paradigm that other machine-learning researchers use to measure performance. But the domain-specific data sets are likely to be no better than existing versions at representing real-world scenarios. The results could do more harm than good. People who might have been helped by these researchers’ work will become disillusioned by technologies that perform poorly when it matters most. Because of the field’s misguided priorities, people who are trying to solve the world’s biggest challenges are not benefiting as much as they could from AI’s very real promise.


Foundations of Deep Learning!!!

One may ask what the difference is between an ANN and DL. The name Artificial Neural Network is inspired by a rough comparison of its architecture with the human brain. Although some of the central concepts in ANNs were developed in part by drawing inspiration from our understanding of the brain, ANN models are not models of the brain. In reality, there is no great similarity between an ANN, with its method of operation, and the human brain, with its neurons, synapses and modus operandi. However, since an ANN is a consolidation of one or more layers of neurons that help in solving perceptual problems, which are grounded in human intuition, the name fits well. An ANN essentially is a structure consisting of multiple layers of processing units (i.e. neurons) that take input data and process it through successive layers to derive meaningful representations. The word deep in Deep Learning stands for this idea of successive layers of representation, and the number of layers contributing to a model of the data is called the depth of the model. The sketch below contrasts a simple ANN with only one hidden layer against a deep neural network (DNN) with multiple hidden layers.
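As a stand-in for the diagram, here is a minimal sketch in Keras contrasting the two structures; the layer widths and the eight-feature input are arbitrary assumptions for illustration:

```python
# A shallow ANN (one hidden layer) versus a deep network (several hidden
# layers). "Depth" is simply how many successive layers transform the input.
from tensorflow import keras
from tensorflow.keras import layers

shallow_ann = keras.Sequential([
    keras.Input(shape=(8,)),
    layers.Dense(16, activation="relu"),   # the single hidden layer
    layers.Dense(1, activation="sigmoid"),
])

deep_nn = keras.Sequential([
    keras.Input(shape=(8,)),
    layers.Dense(64, activation="relu"),   # hidden layer 1
    layers.Dense(32, activation="relu"),   # hidden layer 2
    layers.Dense(16, activation="relu"),   # hidden layer 3
    layers.Dense(1, activation="sigmoid"),
])

print("shallow depth:", len(shallow_ann.layers), "| deep depth:", len(deep_nn.layers))
```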


COVID-19 Data Compromised in 'BlueLeaks' Incident

The Department of Homeland Security on June 29 issued an alert about the "BlueLeaks" hacking of Netsential, saying a criminal hacker group called Distributed Denial of Secrets - also known as "DDS" and "DDoSecrets" - on June 19 "conducted a hack-and-leak operation targeting federal, state, and local law enforcement databases, probably in support of or in response to nationwide protests stemming from the death of George Floyd." The hacking group leaked 10 years of data from 200 police departments, fusion centers and other law enforcement training and support resources around the globe, the DHS alert noted. The 269 GB data dump was posted on June 19 to DDoSecrets' site, the hacking group said in a tweet that has since been removed. The data came from a wide variety of law enforcement sources and included personally identifiable information and data concerning ongoing cases, DDoSecrets claimed in a tweet. Several days after DDoSecrets revealed the law enforcement information through its Twitter account in June, the social media platform permanently removed the DDoSecrets account, citing Twitter rules concerning posting stolen data.


Top exploits used by ransomware gangs are VPN bugs, but RDP still reigns supreme

At the top of this list, we have the Remote Desktop Protocol (RDP). Reports from Coveware, Emsisoft, and Recorded Future clearly put RDP as the most popular intrusion vector and the source of most ransomware incidents in 2020. "Today, RDP is regarded as the single biggest attack vector for ransomware," cyber-security firm Emsisoft said last month, as part of a guide on securing RDP endpoints against ransomware gangs. Statistics from Coveware, a company that provides ransomware incident response and ransom negotiation services, also support this assessment, with the company firmly ranking RDP as the most popular entry point for the ransomware incidents it investigated this year. ... RDP has been the top intrusion vector for ransomware gangs since last year, when ransomware gangs stopped targeting home consumers and moved en masse toward targeting companies instead. RDP is today's top technology for connecting to remote systems, and there are millions of computers with RDP ports exposed online, which makes RDP a huge attack vector for all sorts of cyber-criminals, not just ransomware gangs.


Shoring Up the 2020 Election: Secure Vote Tallies Aren’t the Problem

“When looking at the ecosystem of election security, political campaigns can be soft targets for cyberattacks due to the inability to dedicate resources to sophisticated cybersecurity protections,” Woolbright said. “Campaigns are typically short-term, cash-strapped operations that do not have the IT staff or budget necessary to promote long-term security strategies.” For state and local governments, constituents are accessing online information about voting processes and polling stations in noticeably larger numbers of late – Cloudflare said that it has seen increases in traffic ranging from two to three times the normal volume of requests since April. So perhaps it’s no coincidence that the firm found that government election-related sites are experiencing more attempts to exploit security vulnerabilities, with 122,475 such threats coming in per day (including an average of 199 SQL injection attempts per day bent on harvesting information from site visitors). “We believe there are a wide range of factors for traffic spikes including, but not limited to, states expanding vote-by-mail initiatives and voter registration deadlines due to emergency orders by 53 states and territories throughout the United States,” Woolbright said.


iRobot launches robot intelligence platform, new app, aims for quarterly updates

"We were focused on the idea that autonomous was the same as intelligence," said Angle. "We were told that wasn't intelligent and customers wanted collaboration." The COVID-19 pandemic pushed the collaboration theme with customers and robots because there was no choice. People are home more than ever so more cleaning coordination is needed. Meanwhile, iRobot found customers were home more yet had less time to clean. More time at home also meant more messes. Indeed, iRobot has seen strong demand during the COVID-19 pandemic. The company saw premium robot sales jump 43% in the second quarter with strong performance across its international business. Roomba i7 Series, s9 Series, and Braava jet m6 also performed well. For the second quarter, iRobot delivered revenue of $279.9 million, up 8% from a year ago. First-half revenue for 2020 was $472.4 million. iRobot reported second quarter earnings of $2.07 a share. Julie Zeiler, CFO of iRobot, said that Roomba was 90% of the product mix in the second quarter and the company's e-commerce business performed well.


Google Engineers 'Mutate' AI to Make It Evolve Systems Faster Than We Can Code Them

Using a simple three-step process - setup, predict and learn - it can be thought of as machine learning from scratch. The system starts off with a selection of 100 algorithms made by randomly combining simple mathematical operations. A sophisticated trial-and-error process then identifies the best performers, which are retained - with some tweaks - for another round of trials. In other words, the neural network is mutating as it goes. When new code is produced, it's tested on AI tasks - like spotting the difference between a picture of a truck and a picture of a dog - and the best-performing algorithms are then kept for future iteration. Like survival of the fittest. And it's fast too: the researchers reckon up to 10,000 possible algorithms can be searched through per second per processor (the more computer processors available for the task, the quicker it can work). Eventually, this should see artificial intelligence systems become more widely used, and easier to access for programmers with no AI expertise. It might even help us eradicate human bias from AI, because humans are barely involved. Work to improve AutoML-Zero continues, with the hope that it'll eventually be able to spit out algorithms that mere human programmers would never have thought of.
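The mutate-evaluate-select loop described above can be captured in a few lines. The following toy sketch is not the actual AutoML-Zero code; the operations, fitness task and population sizes are all made-up assumptions to show the shape of the search:

```python
# Evolve tiny "programs" (sequences of math ops) toward computing x * x:
# score a random population, keep the fittest, mutate them, repeat.
import random

OPS = [lambda a, b: a + b, lambda a, b: a - b, lambda a, b: a * b]

def random_program(length=3):
    return [random.choice(OPS) for _ in range(length)]

def run(program, x):
    acc = x
    for op in program:
        acc = op(acc, x)
    return acc

def fitness(program):
    # Negative squared error against the target f(x) = x * x.
    return -sum((run(program, x) - x * x) ** 2 for x in range(1, 6))

population = [random_program() for _ in range(100)]
for generation in range(20):
    population.sort(key=fitness, reverse=True)
    survivors = population[:20]            # best performers are retained...
    children = []
    for parent in survivors:
        child = list(parent)               # ...with some tweaks
        child[random.randrange(len(child))] = random.choice(OPS)
        children.append(child)
    population = survivors + children + [random_program() for _ in range(60)]

best = max(population, key=fitness)
print("best program error:", -fitness(best))
```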



Quote for the day:

"Luck is what happens when preparation meets opportunity." -- Darrell Royal

Daily Tech Digest - August 24, 2020

What’s New In Gartner’s Hype Cycle For Emerging Technologies, 2020

Gartner believes that Composite AI will be an enabling technology for organizations that don’t have access to large historical data sets or AI expertise in-house to complete complex analyses. Second, Gartner believes that Composite AI will help expand the scope and quality of AI applications. Early leaders in this area include ACTICO, Beyond Limits, BlackSwan Technologies, Cognite, Exponential AI, FICO, IBM, Indico, Petuum and ReactiveCore. ... The goal of Responsible AI is to streamline how organizations put responsible practices in place to ensure positive AI development and use. One of the most urgent use cases of Responsible AI is identifying and stopping “deep fakes” production globally. Gartner defines the category with use cases that involve improving business and societal value, reducing risk, increasing trust and transparency, and mitigating bias with AI. Of the new AI-based additions to the Hype Cycle this year, this is the one that leads all others in its potential to use AI for good. Gartner believes responsible AI also needs to increase the explainability, accountability, safety, privacy and regulatory compliance of organizations.


How to ensure CIO and CMO alignment when making technology investment decisions

Often, total cost of ownership (TCO) for handling the complexity, maintenance and technical debt of new platforms can turn out to be a real burden for organisations. In fact, according to Gartner, more than three-quarters of organisations found the technology buying process complex or difficult. But is this really surprising? Implementing the right technology solution for the business is often challenging due to the different priorities that CMOs and CIOs have. While for the CMO the priority is to adopt the latest innovations as soon as possible in order to stay ahead of the competition, this need has to fit the CIO’s focus on TCO for the long term. The weight of these options is what drives a wedge between those key decision makers, creating a need to find common ground sooner. Being aligned is essential so that they can choose the right options which will allow marketing to execute on strategy and hit company targets on the one hand, and meet operational requirements for maintenance, governance and risk avoidance on the other, which are top of mind for the CIO. To ensure that the best options are selected for the business, the CMO’s priorities need to meet those of the CIO and vice versa.


Save-to-transform as a catalyst for embracing digital disruption

In this approach, businesses evolve through infrastructure investments in digital technologies. In turn, these technologies can deliver dramatic improvements in competitiveness, performance and operating efficiency. In response to the pandemic, the survey shows that organizations are evolving into a “Save-to-Thrive” mindset, in which they are accelerating strategic transformation actions specifically in response to challenges posed by COVID-19, making shifts to their operating models, products and services, and customer engagement capabilities. “The Save-to-Thrive framework will be essential to success in the next normal as companies rely on technology and digital enablement — with a renewed emphasis on talent — to improve their plans for strategic cost transformation and overall enterprise performance improvement,” said Omar Aguilar, principal and global strategic cost transformation leader, Deloitte Consulting. “Companies that react quickly and invest in technology and digital capabilities as they pursue the strategic levers of cost, growth, liquidity and talent will be best-positioned to succeed.”


How big data is solving future health challenges

Unlike many other data warehousing projects, Stringer said, the focus is not on collecting and using data only if it meets a specific quality level. Instead, when data is added to LifeCourse, its quality level is noted so researchers can decide for themselves whether the data should be used in their research. The GenV initiative relies on different technologies, but the two core pieces are the Informatica big data management platform and Zetaris. Informatica is used where traditional extract, transform and load (ETL) processes are needed, because of its strong focus on usability. Stringer said this criterion was heavily weighted in the product selection process; usability, he said, is a strong analogue for productivity. But with a dependence on external data sources and a need to integrate more data sources over the coming decades, there needed to be a way to use new datasets wherever they reside. That was why Zetaris was chosen. Rather than relying on ETL processes, Stringer said, the Zetaris platform lets GenV integrate data from sources where ETL is not viable.
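A classic ETL step with the quality-tagging approach described above might look like the following sketch; the column names, quality rule and file target are illustrative assumptions, not details of the actual GenV pipeline:

```python
# Extract raw records, tag each one with a quality level instead of
# rejecting it, then load the annotated data for researchers to filter.
import pandas as pd

# Extract: a hypothetical raw feed with missing and implausible values.
raw = pd.DataFrame({
    "participant_id": [1, 2, 3],
    "birth_weight_g": [3400, None, 150],
})

# Transform: annotate quality so researchers can decide later.
def quality_level(weight):
    if pd.isna(weight):
        return "missing"
    return "plausible" if 300 <= weight <= 6000 else "implausible"

raw["quality"] = raw["birth_weight_g"].apply(quality_level)

# Load: write the annotated records to the warehouse (stubbed as a CSV).
raw.to_csv("lifecourse_birth_weights.csv", index=False)
print(raw)
```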


5 Key Capabilities of a Next-Gen Enterprise Architecture

Many enterprise architects look to rationalize and centralize emerging technologies, processes, and best practices, making them available to all business units in a self-service mode to accelerate digital transformation and modernization initiatives across the enterprise. By defining enterprise-wide technology standards and tools, enterprise architects strive to plan for reusability, reducing costs and future-proofing the architecture as technology changes. They also enforce data governance and privacy policies to democratize data, so that trusted data travels securely throughout the enterprise in a frictionless, self-serve fashion. Traditional data management solutions supporting next-gen architectures are expensive and manual, and require time-consuming processes, while newer emerging niche vendor solutions are fragmented. As such, they require extensive integration to stitch together end-to-end workstreams, leaving data consumers to wait months to get useful data. Therefore, a next-gen enterprise architecture must support the entire data pipeline, which includes the ability to ingest, stream, integrate, and cleanse data.


India's National Digital Health Mission: A New Model To Enhance Health Outcomes

The NDHM digital health platform is guided by an architectural blueprint, the National Digital Health Blueprint (NDHB), developed a few months earlier. The NDHB puts structure around the thinking and approach: it establishes the vision and principles, architecture requirements and specifications, applicable standards and regulations, high-priority services, and institutional mechanisms needed to realize the mission of digital health. The NDHB is crafted to unlock enormous benefits for citizens; create new opportunities along with financial, productivity, and transparency gains; and make a positive contribution to growth, innovation, and knowledge sharing. A digital platform with a national footprint evokes immediate pushback, as it is generally seen to steer the narrative towards centralization. The architecture deliberately and explicitly addresses this ‘concern’ to ensure that India’s overall federated structure of governance is reflected in the architecture as well. In a large country like India, where there are multiple layers of government – national (central), state, local (urban), and local (rural) – the responsibilities are distributed, and this is guaranteed by the constitution.


Data Governance Should Not Threaten Work Culture

The discipline of data governance must focus on knowing who these people are, helping them to make more actionable decisions, and empowering them to become better stewards. People who define data must know what it means to define data better, and that includes providing meaningful business definitions for data and managing how often data is replicated across the organization. People who produce the data must know what quality data looks like, and they must be evaluated on the quality of the data they produce. And the no-brainer: people in the organization who use the data must understand how to use it and follow the rules associated with using it appropriately. That means data consumers must follow the protection and privacy rules and the business rules, and use the data in the ethical manner spelled out by the organization. While people already define, produce, and use data, data governance requires that these people consistently follow the rules and standards for the actions they take with that data. The rules and the standards are important metadata, data about the data, that must be recorded and made available to the people across the organization to assist in the discipline of data governance.


Defining a Data Governor

Without oversight, employees will misinterpret data, sensitive data may be shared inappropriately, employees will lack access to necessary data, and employees’ analysis will often be incorrect. A Data Governor will maintain and improve the quality of data and ensure your company is compliant with any regulations. It is a vital role for any informed company. With the exploding volume of data within companies, it has become extremely difficult for a small technical team to govern an entire organization’s data. As this trend continues, these Data Scientists and Analysts should transition themselves from their traditional reporting responsibilities to those of Data Governors. In a traditional reporting role, their day was filled with answering questions for various business groups around their needed metrics. The shift to Data Governors finds them instead creating cleaned, documented data products for those end business groups to explore themselves. This is called Democratized Data Governance, where the technical team (traditionally the data gatekeepers) handles the technical aspects of governance and shares the responsibilities of analytics with the end business groups.


Blockchain for Applications ~ A Multi-Industry Solution

The workings of blockchain are somewhat common knowledge now: a decentralized network of interconnected nodes that shares all data among its peers, keeping a chronological log of each transaction. Simply put: “Everything that happens in the blockchain network is shared by all members of the network and everyone has a record of it on their individual device.” Hence, the blocks form a binding chain with each other, and this decentralized model of information storage removes the risks and inefficiencies of having all data stored in only one place. ... DApps, or decentralized applications, function without any central server mediating between the two parties. Blockchain users operate on mini-servers that work simultaneously to verify and exchange data. There are two kinds of blockchains, segregated on the basis of access and permissions: “permissionless blockchain” and “permissioned blockchain”. A permissionless network grants full transparency and allows each member to verify transaction details and interact with others while staying completely anonymous. Bitcoin works on a permissionless blockchain.
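The hash-linked, chronological log at the heart of this model fits in a few lines of code. The sketch below is a toy illustration only; real blockchains add peer-to-peer networking, consensus and digital signatures on top of this structure:

```python
# Each block records the hash of its predecessor, so altering any earlier
# entry breaks every later link, and every peer can verify this locally.
import hashlib
import json
import time

def make_block(data, prev_hash):
    block = {"timestamp": time.time(), "data": data, "prev_hash": prev_hash}
    block["hash"] = hashlib.sha256(
        json.dumps(block, sort_keys=True).encode()
    ).hexdigest()
    return block

chain = [make_block("genesis", "0" * 64)]
for tx in ["alice pays bob 5", "bob pays carol 2"]:
    chain.append(make_block(tx, chain[-1]["hash"]))

# Any member of the network can check the links independently.
for prev, curr in zip(chain, chain[1:]):
    assert curr["prev_hash"] == prev["hash"]
print("chain of", len(chain), "blocks verified")
```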


How to manage your edge infrastructure and devices

Another aspect to consider when managing edge infrastructure and devices is to invest in discovery processes. “Edge by nature creates a distributed approach – accelerated by the current global pandemic – that needs a more flexible style of management,” said David Shepherd, area vice-president, pre-sales EMEA at Ivanti. “But ultimately, if we don’t know what we are managing then it becomes difficult to even start managing in a comprehensive manner. Effective discovery processes allow an organisation to apply the right management policies at the right time. As more devices start to appear at the edge, the context of the device plays a crucial role. This includes the type of device and the interaction it has with the infrastructure, plus its location (often remote). Understanding what a device is and how it interacts is again crucial to applying a comprehensive management approach. ... Zero-touch provisioning, for example, enables easier onboarding of IoT devices onto an IoT cloud platform, e.g. AWS, as it enables automatic provisioning and configuration. This prevents developer error during the provisioning and configuration process, as well as providing a more secure interaction between the device and platform, as the security framework has already been established on both ends during the pre-production stage.”



Quote for the day:

"The hard part isn't making the decision. It's living with it." -- Jonas Cantrell

Daily Tech Digest - August 23, 2020

What we've lost in the push to agile software development, and how to get it back

Team members and business partners should not have to ask questions such as "what does that arrow mean?", "is that a Java application?" or "is that a monolithic application or a set of microservices?", he says. Rather, discussions should focus on the functions and services being delivered to the business. "The thing nobody talks about is you have to do design to get version 1," Brown says. "You have to put some foundations in place to give you a sufficient starting point to iterate, and evolve on top of. And that's what we're missing." Many software design teams keep upfront design to a minimum, assuming details will be fleshed out in an agile process as things move along. Brown says this is misplaced thinking, and design teams should incorporate more information into their upfront designs, including the type of technology and languages being proposed. "During my travels, I have been given every excuse you can possibly imagine for why teams should not do upfront design," he says. Some of his favorite excuses even include the question, "are we allowed to do upfront design?" Other responses include "we don't do upfront design because we do XP [extreme programming]," and "we're agile. It's not expected in agile."


Those who innovate, lead: the new normal for digital transformation

Even in normal times, IT departments struggled to meet their digital transformation goals as quickly as required. According to research, 59% of IT directors reported that they were unable to deliver all of their projects last year. Much of this is due to IT complexity and the challenges inherent in trying to integrate various data sources, applications and systems in an agile way that supports the goals of transformation. All too often, organizations rely on linking capabilities together with point-to-point integrations, which are inflexible and unsuited to the dynamism of modern IT environments. As a result, they find it hard to quickly launch innovative, customer-centric products and services, as they can’t bring together the capabilities that drive them in a cost and time-effective manner. At the same time, it’s often the case that digital transformation is left largely to the IT department. IT teams – already stretched by their day-to-day maintenance responsibilities – are increasingly tasked with driving the entire organization forward, with limited support from other teams in the business. Understandably, this has led to a widening ‘delivery gap’ between what the business expects, and what IT is able to achieve.


Fileless worm builds cryptomining, backdoor-planting P2P botnet

A fileless worm dubbed FritzFrog has been found roping Linux-based devices – corporate servers, routers and IoT devices – with SSH servers into a P2P botnet whose apparent goal is to mine cryptocurrency. Simultaneously, though, the malware creates a backdoor on the infected machines, allowing attackers to access them at a later date even if the SSH password has been changed in the meantime. “When looking at the amount of code dedicated to the miner, compared with the P2P and the worm (‘cracker’) modules – we can confidently say that the attackers are much more interested in obtaining access to breached servers than making profit through Monero,” Guardicore Labs lead researcher Ophir Harpaz told Help Net Security. “This access and control over SSH servers can be worth much more money than spreading a cryptominer. Additionally, it is possible that FritzFrog is a P2P-infrastructure-as-a-service; since it is robust enough to run any executable file or script on victim machines, this botnet can potentially be sold on the darknet and be the genie of its operators, fulfilling any of its malicious wishes.”


Post-Pandemic Digitalization: Building a Human-Centric Cybersecurity Strategy

As leaders of a global business task force responsible for advising and providing recommendations on the future of digitalization to G20 Leaders, we are doubling down on our efforts to build cyber resilience, and we urge leaders to recognize the importance of cybersecurity resilience as a vital building block of our global economy. And we must be thoughtful in our future cyber approach. A human-centric, education-first strategy will protect organizations where they are most vulnerable and get us closer to the point where cybersecurity is ingrained in our daily life rather than an afterthought. Action through collaboration, one of our guiding principles as the voice of the private sector to the G20, is the only viable option. A public-private partnership built on cooperation among large corporations, MSMEs, academic institutions, and international governments is the cornerstone of a modern and resilient cybersecurity system. A few simple but powerful actions ingrained in a global cybersecurity strategy will bring our users into the new age of digital transformation and embed a security mindset into our day-to-day, making breach attempts significantly less successful.



Event Stream Processing: How Banks Can Overcome SQL and NoSQL Related Obstacles with Apache Kafka

Traditional relational databases that support SQL, as well as NoSQL databases, present obstacles to the real-time data flows needed in financial services, but ultimately still remain useful to banks. Jackson says that databases are good at recording the current state and allow banks to join and query that data. “However, they’re not really designed for storing the events that got you there. This is where Kafka comes in. If you want to move, create, join, process and reprocess events you really need event streaming technology. This is becoming critical in the financial services sector where context is everything – to customers, this can be anything from sharing alerts to let you know you’ve been paid or instantly sorting transactions into categories.” He continues to say that Nationwide are starting to build applications around events, but in the meantime, technologies such as change data capture (CDC) and Kafka Connect, a tool that reliably streams data between Apache Kafka and other data systems, are helping to bridge older database technologies into the realm of events. Data caching technology can also play an important role in providing real-time data access for performance-critical, distributed applications in financial services, as it is a well-known and tested approach to dealing with spiky, unpredictable loads in a cost-effective and resilient way.
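A minimal sketch of publishing a bank transaction as an event might look like this, using the kafka-python client; the broker address, topic name and payload fields are assumptions for illustration, and in practice Kafka Connect or CDC would feed such a stream from existing databases:

```python
# Each transaction becomes an immutable event; downstream consumers can
# sort it into categories or trigger a "you've been paid" alert in real time.
import json
from kafka import KafkaProducer

producer = KafkaProducer(
    bootstrap_servers="localhost:9092",   # assumes a broker running locally
    value_serializer=lambda v: json.dumps(v).encode("utf-8"),
)

event = {"account": "12345678", "amount": 250.00, "type": "salary_credit"}
producer.send("transactions", value=event)
producer.flush()
```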


What is semantic interoperability in IoT and why is it important?

Semantic interoperability can today be enabled by declarative models and logic statements (semantic models) encoded in a formal vocabulary of some sort. The fundamental idea is that by providing these structured semantic models about a subsystem, other subsystems can with the same mechanisms get an unambiguous understanding of the subsystem. This unambiguous understanding is the cornerstone for other subsystems to confidently interact with (in other words, understand information from, as well as send commands to) the given subsystem to achieve some desired effect. It's important to note that interoperability goes beyond data exchange formats or even explicit translation of information models between a producer and a consumer. It's about the mechanisms that enable this to happen automatically, without specific programming. There should be no need for an integrator to review thick manuals in order to understand what is really meant by a particular piece of data. It should be fully machine-processable. Today, industry standards exist that greatly improve interoperability with significantly reduced effort. They do so by standardizing vocabularies and concepts.
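Here is a tiny example of such a machine-processable semantic model, sketched with the rdflib library; the vocabulary URIs are made up for illustration, whereas a real deployment would reuse a standardized ontology so that other subsystems need no custom code:

```python
# Declarative statements: "sensor42 is a temperature sensor in room 7,
# reporting in degrees Celsius." Any consumer can query this generically.
from rdflib import Graph, Literal, Namespace
from rdflib.namespace import RDF

EX = Namespace("http://example.org/building#")
g = Graph()

g.add((EX.sensor42, RDF.type, EX.TemperatureSensor))
g.add((EX.sensor42, EX.locatedIn, EX.room7))
g.add((EX.sensor42, EX.unit, Literal("Celsius")))

# A subsystem that has never seen this device can still discover and
# interpret it from the model alone, without specific programming.
for sensor in g.subjects(RDF.type, EX.TemperatureSensor):
    print(sensor, "reports in", g.value(sensor, EX.unit))
```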


GPT-3, Bloviator: OpenAI’s language generator has no idea what it’s talking about

At first glance, GPT-3 seems to have an impressive ability to produce human-like text. And we don’t doubt that it can be used to produce entertaining surrealist fiction; other commercial applications may emerge as well. But accuracy is not its strong point. If you dig deeper, you discover that something’s amiss: although its output is grammatical, and even impressively idiomatic, its comprehension of the world is often seriously off, which means you can never really trust what it says. Below are some illustrations of its lack of comprehension—all, as we will see later, prefigured in an earlier critique that one of us wrote about GPT-3’s predecessor. Before proceeding, it’s also worth noting that OpenAI has thus far not allowed us research access to GPT-3, despite both the company’s name and the nonprofit status of its oversight organization. Instead, OpenAI put us off indefinitely despite repeated requests—even as it made access widely available to the media. Fortunately, our colleague Douglas Summers-Stay, who had access, generously offered to run the experiments for us. OpenAI’s striking lack of openness seems to us to be a serious breach of scientific ethics, and a distortion of the goals of the associated nonprofit.


A Google Drive 'Feature' Could Let Attackers Trick You Into Installing Malware

An unpatched security weakness in Google Drive could be exploited by malware attackers to distribute malicious files disguised as legitimate documents or images, enabling bad actors to perform spear-phishing attacks with a comparatively high success rate. The latest security issue—of which Google is aware but, unfortunately, has left unpatched—resides in the "manage versions" functionality offered by Google Drive, which allows users to upload and manage different versions of a file, as well as in the way its interface provides a new version of the files to the users. ... According to A. Nikoci, a system administrator by profession who reported the flaw to Google and later disclosed it to The Hacker News, the affected functionality allows users to upload a new version with any file extension for any existing file on the cloud storage, even a malicious executable. As shown in the demo videos—which Nikoci shared exclusively with The Hacker News—in doing so, a legitimate version of the file that's already been shared among a group of users can be replaced by a malicious file, which when previewed online doesn't indicate newly made changes or raise any alarm, but when downloaded can be employed to infect targeted systems.


What is Microsoft's MeTAOS?

MeTAOS/Taos is not an OS in the way we currently think of Windows or Linux. It's more of a layer that Microsoft wants to evolve to harness the user data in the substrate to make user experiences and user-facing apps smarter and more proactive.  A job description for a Principal Engineering Manager for Taos mentions the foundational layer: "We aspire to create a platform on top of that foundation - one oriented around people and the work they want to do rather than our devices, apps, and technologies. This vision has the potential to define the future of Microsoft 365 and make a dramatic impact on the entire industry." A related SharePoint/MeTA job description adds some additional context: "We are excited about transforming our customers into 'AI natives,' where technology augments their ability to achieve more with the files, web pages, news, and other content that people need to get their task done efficiently by providing them timely and actionable notifications that understands their intents, context and adapts to their work habits." In short, MeTAOS/Taos could be the next step along the Office 365 substrate path. Microsoft officials haven't said a lot publicly about the substrate, but it's basically a set of storage and other services at the heart of Office 365. 


What Organizations Need to Know About IoT Supply Chain Risk

When it comes to IoT, IT, and OT devices, there is no software bill of materials (SBOM), though there have been some industry calls for one. That means the manufacturer has no obligation to disclose to you what components make up a device. When a typical device or software vulnerability is disclosed, an organization can fairly easily use tools such as device visibility and asset management to find and patch vulnerable devices on its network. However, without a standard requirement to disclose what components are under the hood, it can be extremely difficult to even identify which manufacturers or devices may be affected by a supply chain vulnerability like Ripple20 unless the vendor confirms it. For organizations, this challenge means pressing manufacturers for information on components when making purchasing decisions. While it is not realistic to base every purchasing decision solely on security, the nature of these supply chain challenges demands at least gathering the information needed to make the best risk calculus. What makes supply chain risk unique is that one vulnerability can affect many types of devices.



Quote for the day:

"Learning is a lifetime process, but there comes a time when we must stop adding and start updating." -- Robert Braul

Daily Tech Digest - August 22, 2020

There is a crisis of face recognition and policing in the US

When Jennifer Strong and I started reporting on the use of face recognition technology by police for our new podcast, “In Machines We Trust,” we knew these AI-powered systems were being adopted by cops all over the US and in other countries. But we had no idea how much was going on out of the public eye.  For starters, we don’t know how often police departments in the US use facial recognition for the simple reason that in most jurisdictions, they don’t have to report when they use it to identify a suspect in a crime. The most recent numbers are speculative and from 2016, but they suggest that at the time, at least half of Americans had photos in a face recognition system. One county in Florida ran 8,000 searches each month. We also don’t know which police departments have facial recognition technology, because it’s common for police to obscure their procurement process. There is evidence, for example, that many departments buy their technology using federal grants or nonprofit gifts, which are exempt from certain disclosure laws. In other cases, companies offer police trial periods for their software that allow officers to use systems without any official approval or oversight.


Outlook “mail issues” phishing – don’t fall for this scam!

Only if you were to dig into the email headers would it be obvious that this message actually arrived from outside and was not generated automatically by your own email system at all. The clickable link is perfectly believable, because the part we’ve redacted above (between the text https://portal and the trailing /owa, short for Outlook Web App) will be your company’s own domain name. But even though the blue text of the link itself looks like a URL, it isn’t actually the URL that you will visit if you click it. Remember that a link in a web page consists of two parts: first, the text that is highlighted, usually in blue, which is clickable; second, the destination, or HREF (short for hypertext reference), where you actually go if you click the blue text. ... One tricky problem for phishing crooks is what to do at the end, so you don't belatedly realise it's a scam and rush off to change your password (or cancel your credit card, or whatever it might be). In theory, they could try using the credentials you just typed in to login for you and then dump you into your real account, but there's a lot that could go wrong. The crooks almost certainly will test out your newly-phished password pretty soon, but probably not right away while you are paying attention and might spot any anomalies that their attempted login might cause.
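The two-part anatomy of a link described above is easy to demonstrate. Here is a small sketch using Python's standard html.parser; the URLs are made-up examples of the mismatch a phish relies on:

```python
# The clickable blue text and the real destination (HREF) are independent
# parts of an anchor tag, which is exactly what this scam exploits.
from html.parser import HTMLParser

phishing_link = (
    '<a href="https://evil.example.net/steal">'
    "https://portal.yourcompany.example/owa</a>"
)

class LinkChecker(HTMLParser):
    href = None
    text = None

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self.href = dict(attrs).get("href")

    def handle_data(self, data):
        self.text = data

checker = LinkChecker()
checker.feed(phishing_link)
print("looks like:      ", checker.text)   # what the victim sees
print("actually goes to:", checker.href)   # where the click really lands
```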


Taking on the perfect storm in cybersecurity

The future of cybersecurity depends on a platform approach. This will allow your cybersecurity teams to focus on security rather than continue to integrate solutions from many different vendors. It allows you to keep up with digital transformation and, along the way, battle the perfect storm. Our network perimeters are typically well-protected, and organizations have the tools and technologies in place to identify threats and react to them in real-time within their network environments. The cloud, however, is a completely different story. There is no established model for cloud security. The good news is that there is no big deployment of legacy security solutions in the cloud. This means organizations have a chance to get it right this time. We can also fix how to access the cloud and manage security operations centers (SOCs) to maximize ML and AI for prevention, detection, response and recovery. Cloud security, cloud access and next-generation SOCs are interrelated. Individually and together, they present an opportunity to modernize cybersecurity. If we build the right foundation today, we can break the pattern of too many disparate tools and create a path to consuming cybersecurity innovations and solutions more easily in the future.


FBI and CISA warn of major wave of vishing attacks targeting teleworkers

Collected information included: name, home address, personal cell/phone number, position at the company, and duration at the company, according to the two agencies. The attackers then called employees using random Voice-over-IP (VoIP) phone numbers or by spoofing the phone numbers of other company employees. "The actors used social engineering techniques and, in some cases, posed as members of the victim company's IT help desk, using their knowledge of the employee's personally identifiable information—including name, position, duration at company, and home address—to gain the trust of the targeted employee," the joint alert reads. "The actors then convinced the targeted employee that a new VPN link would be sent and required their login, including any 2FA or OTP." When the victim accessed the link to the phishing site the hackers had created, the cybercriminals logged the credentials and used them in real time to gain access to the corporate account, even bypassing 2FA/OTP limits with the help of the employee. "The actors then used the employee access to conduct further research on victims, and/or to fraudulently obtain funds using varying methods dependent on the platform being accessed," the FBI and CISA said.


Why you need to revisit your IT policies

Part of that proactive planning should be adjustments to your IT policies. These documents are often forgotten until they're most needed, and the recent rushed transition from office work to remote work likely highlighted this condition. In the rushed transition, imagine how helpful it would have been to have some basic policy guidance on what equipment is supported for remote work, what items are reimbursable and where they can be sourced, and which software was recommended. If nothing else, some simple policies and guidance around these topics probably would have saved your already-stretched support staff dozens of phone calls and emails. ... At their best, policies provide guidance based on organizational priorities and experience, and at their worst, they are an extensive list of "Thou Shalt Nots" that assume your colleagues are nefarious scallywags one step away from destroying the organization should you not be there to preempt each of their misguided notions. Many employees dislike policy documents since they bias toward the latter, and unsurprisingly when you treat your colleagues like children and scoundrels, they'll rise to the occasion.


Styles, protocols and methods of microservices communication

For those who choose to stick with asynchronous protocols, consider exploring the advanced message queuing protocol (AMQP). This widely available and mature protocol provides a standard method for microservices communication and should be a priority for those developing truly composite microservices apps. Asynchronous protocols like AMQP use a lightweight service bus similar to a service-oriented architecture (SOA) bus, though much less complex. Unlike HTTP, this bus provides a message broker that acts as an intermediary between the individual microservices, thus avoiding the problems associated with a brokerless approach. Keep in mind, however, that a message broker will introduce extra steps that can add latency. The individual services still contain their functional and operational logic, and will need time to process that logic. The bus simply helps standardize and throttle those communications. Major cloud platforms, such as Azure, provide their own proprietary service bus for message brokering. However, there are also third-party options such as RabbitMQ, an open source message broker written in the Erlang programming language.
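To make the brokered pattern concrete, here is a minimal sketch using the pika client against a RabbitMQ broker assumed to be running locally; the queue name and payload are illustrative:

```python
# The producing service publishes to the broker rather than to a peer, so
# the consuming service can be down or busy without the message being lost.
import pika

connection = pika.BlockingConnection(pika.ConnectionParameters("localhost"))
channel = connection.channel()
channel.queue_declare(queue="orders")

# Producer side.
channel.basic_publish(exchange="", routing_key="orders",
                      body=b'{"order_id": 42}')

# Consumer side: another service registers a callback with the same broker.
def handle_order(ch, method, properties, body):
    print("received:", body)

channel.basic_consume(queue="orders", on_message_callback=handle_order,
                      auto_ack=True)
channel.start_consuming()   # blocks; press Ctrl+C to stop this demo
```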


Edge computing: 4 problems it helps solve for enterprises

Enterprises in the construction, manufacturing, mining, and oil and gas industries, for example, are embracing the edge, which enables them to run the core elements of any solution locally by empowering local devices to save their state, interact with each other, and send important alerts and notifications. “This means that even if the internet goes down at the factory, warehouse, construction site, mine, or field, edge processing continues to work full steam ahead,” Allsbrook says. ... Edge computing can minimize the network and bandwidth issues associated with moving large amounts of data to or from IoT devices and reduce reliance on the network. Companies look to edge solutions that can process data at the source and provide summary information on what’s going on. This eliminates the need for expensive SIM cards, data plans, and other network costs that would be incurred if the data had to be transported from the device to a network. “Edges can use simple ‘if-then’ logic or advanced AI algorithms to understand and build those summary reports,” explains Allsbrook of ClearBlade.
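The simple "if-then" edge logic Allsbrook describes can be as small as the following sketch; the readings, threshold and send_to_cloud stub are illustrative assumptions:

```python
# Process readings locally and ship only a compact summary (plus alerts),
# so the site keeps working and bandwidth stays low even if the uplink drops.
import statistics

readings = [71.2, 70.8, 71.5, 98.4, 71.0]   # local sensor temperatures

def send_to_cloud(message):
    # Stub: a real device would buffer and retry when connectivity returns.
    print("queued for upload:", message)

summary = {
    "mean": round(statistics.mean(readings), 1),
    "max": max(readings),
    "count": len(readings),
}
send_to_cloud(summary)

# A simple if-then rule evaluated at the edge, with no round-trip needed.
if summary["max"] > 90:
    send_to_cloud({"alert": "temperature spike", "value": summary["max"]})
```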


The Great Reset requires FinTechs – and FinTechs require a common approach to cybersecurity

Established financial services providers have a number of frameworks, standards and industry-driven initiatives available to test the security of FinTechs and other third parties. However, the volume of industry initiatives – driven by the pace of technological change and the multiplication of regulations – is now creating “noise”. This makes it difficult for FinTechs to direct their resources in a way that allows for security while also facilitating commercial partnerships. Requirements placed on FinTechs sow confusion, increase costs and may incentivise “security through obscurity”, in which less well-resourced firms play a game of chance, betting that they’re too small to be targeted by attackers and setting themselves up for problems in the future. ... The sector needs a mutually understood and widely accepted base level of cybersecurity controls. Clarity at the base level of security will support effective protection of business and client assets across the wider supply chain. This can accelerate the speed at which FinTechs can come to market and create commercial partnerships – and, in turn, incentivise good cyber hygiene.


IBM Finds Flaw in Millions of Thales Wireless IoT Modules

The modules, which IBM describes as mini circuit boards, enable 3G or 4G connectivity, but also store secrets such as passwords, credentials and code, according to Adam Laurie, X-Force Red's lead hardware hacker, and Grzegorz Wypych, senior security consultant, who wrote a blog post. "This vulnerability could enable attackers to compromise millions of devices and access the networks or VPNs supporting those devices by pivoting onto the provider's backend network," Laurie and Wypych write. "In turn, intellectual property, credentials, passwords and encryption keys could all be readily available to an attacker." In a statement, Thales says "it takes the security of its products very seriously and therefore has, after communicating and discussing this issue with affected customers, delivered software fixes in Q1/2020." The modules run microprocessors with an embedded Java ME interpreter and use flash storage. Also, there are Java "midlets" that allow for customization. One of those midlets copies custom Java code added by an OEM to a secure part of the flash memory, which should only be in write mode so that code can be written there but not read back.


How to manage unstructured data using an ECM system

Structured data is information governed by a database structure, organized into defined fields, usually within the context of a relational database. The database structure requires that data in the fields follow a prescribed format. For example, a date must have the format of a date and a name must be limited in length. The most common place that people encounter structured data is in the cells of a spreadsheet. Structured data has many applications within businesses and is easy to search. It is found in finance, customer relationship management, supply chain and other applications where compliance to structures is keyed to business tasks. Unstructured data, on the other hand, is data without rules and is not as searchable. Users who create unstructured data are writing free-form, rather than complying with structured data fields. There is minimal enforcement of any rules on the length of content, the format of the content or what content goes where. Despite the lack of formal structure, unstructured information -- which users create in word processing programs, spreadsheets, presentation files, PDFs, social media feeds, and audio and video files -- forms the bulk of the data created in an organization.
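A small sketch of this distinction: structured records must satisfy field rules before they are accepted, while unstructured text carries no such constraints. The specific rules below (date format, name length) are illustrative assumptions:

```python
# Structured: every field must conform to a prescribed format.
from datetime import datetime

record = {"name": "A. Customer", "invoice_date": "2020-08-22"}

def validate(rec):
    datetime.strptime(rec["invoice_date"], "%Y-%m-%d")  # a date must be a date
    assert len(rec["name"]) <= 50                       # a name is limited in length
    return True

print("structured record valid:", validate(record))

# Unstructured: free-form content with no enforced fields, harder to search.
memo = "Spoke with the customer on the 22nd; invoice to follow sometime soon."
```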



Quote for the day:

"When you expect the best from people, you will often see more in them than they see in themselves." -- Mark Miller

Daily Tech Digest - August 21, 2020

How healthcare IT can be kept smart

As with many industries, the healthcare sector has seen a rapid phase of digitalisation, with new connected medical devices intertwining patient treatment with IT infrastructure that was traditionally separate from day-to-day healthcare practice. There can be no doubt this has boosted efficiencies and had a positive impact on patient care. However, digitalisation comes with a catch. With so many new connected devices, today’s hospital IT networks have more potential points of failure than ever before. As with any information system, the storage and transfer of data is at the heart of all healthcare IT systems. Most if not all medical IoT devices rely on data and information being readily available through various points in the hospital network. For example, a radiologist will routinely require access to patient imaging records in order to review scans that have been automatically uploaded to the system by an MRI machine. To facilitate this degree of connectivity, most hospitals have what is called an integration engine. This is a central IT communications hub that securely stores and distributes information and data where and when it is needed. Think of the integration engine as the hospital’s central nervous system, facilitating all communications across the network.


Why Innovation Takes More Than Genius

It’s easy to look at someone like Steve Jobs or Elon Musk and imagine that their success was inevitable. Their accomplishments are so out of the ordinary that it just seems impossible that they could have ever been anything other than successful. You get the sense that whatever obstacles they encountered, they would overcome. Yet it isn’t that hard to imagine a different path. If, for example, Jobs had remained in Homs, Syria, where he was conceived, it’s hard to see how he would have ever been able to become a technology entrepreneur at all, much less a global icon. If Apartheid never ended, Musk’s path to Silicon Valley would be much less likely as well. The truth is that genius can be exceptionally fragile. Making a breakthrough takes more than talent. It requires a mixture of talent, luck and an ecosystem of support to mold an idea into something transformative. In fact, in my research of great innovators what’s amazed me the most is how often they almost drifted into obscurity. Who knows how many we have lost? On a January morning in 1913, the eminent mathematician G.H. Hardy opened his mail to find a letter written in almost indecipherable scrawl from a destitute young man in India named Srinivasa Ramanujan.


Systems integrators are evolving from tech experts to business strategists

Nigel Fenwick, vice president and principal analyst at Forrester, said that systems integrators (SIs) have been investing in emerging technologies and developing software to accelerate time to value for clients.  "There's demand in IT transformations for SIs and service providers to help clients architect their technology so that the business can evolve with new technologies even faster," he said. "Modern system architectures make it easier for services firms to connect systems through APIs and microservices than it used to be." Adya shared a project Infosys completed with a large retailer as one example of this orchestration approach. The client wanted to solve an employee experience problem focused on accessing personal data such as salary information, leave time, and bonus information. Each type of information lived in its own silo, requiring multiple log-ins and creating an unpleasant experience. Infosys combined multiple data sets into a single interface that employees and temp workers access by typing in an employee number.  "This solved an experience problem that involved integrating the back end and the front end and building a platform," he said.


A Robust Cybersecurity Policy is Need of the Hour: Experts

“There has been a recent surge in cyberattacks on Indian digitalscape that are only increasing in scope and sophistication, targeting sensitive personal and business data and critical information infrastructure, with an impact on national economy and security. ... And while formulation and adoption of policies might still take time, this is a clarion call to the Indian internet users to pay attention to the threats, on creating robust ‘firewalls’, and conducting regular cybersecurity and data protection audits.” – Nikhil Korgaonkar, regional director, India and SAARC, Arcserve “With cyberattacks increasingly becoming sophisticated, cybersecurity and digitization cannot and should not exist in silos. What we need now is a robust cybersecurity roadmap that will address the gaps and provide us a strong cyber-armor. Covid-19 situation has only accelerated the pace of digitization, potentially amplifying these security concerns. It is time for businesses to take advantage of approaches like micro-segmentation, encryption and dynamic isolation, enhanced by the power of emerging technologies like AI and ML to up their cybersecurity game.” – Sumed Marwaha, regional services vice president and managing director, Unisys India


3 Huge Ways Companies Are Delighting Customers With Artificial-Intelligence-Driven Services

Driven by the likes of Netflix, this notion of customization and personalization is a major business trend. If your customers don’t already expect a more intelligent, personalized service offering, they soon will do. If you aren’t able to offer such a service, rest assured your competitors will. (And, increasingly, that competition may come from the tech sector itself. Consider the rise of personal finance apps that are seriously challenging traditional banking service providers.) We tend to think of retail as a product-based industry, but in fact, it perfectly illustrates this move towards more personalized services. Amazon was an early pioneer of data-driven, personalized shopping recommendations, but now a wave of new services has sprung up to offer a similarly tailored approach for consumers. Stitch Fix, which delivers hand-picked clothing to your door, is a great example. With Stitch Fix, you detail your size, style preferences, and lifestyle in a questionnaire. Then, using AI, the system pre-selects clothes that will fit and suit you, and a (human) personal stylist chooses the best options from that pre-selected list. And voila, the perfect clothes for you arrive at your door every month. 


Easy Interpretation of a Logistic Regression Model with Delta-p Statistics

Imagine a situation where a customer applies for credit: the bank collects data about the customer (demographics, existing funds, and so on) and predicts the creditworthiness of the customer with a machine learning model. The customer’s credit application is rejected, but the banker doesn’t know exactly why. Or a bank wants to advertise its credit products, and the target group should be those who could eventually be approved. But who are they? In these kinds of situations, we would prefer a model that is easy to interpret, such as the logistic regression model. The Delta-p statistic makes the interpretation of the coefficients even easier. With Delta-p statistics at hand, the banker doesn’t need a data scientist to be able to inform the customer, for example, that the credit application was rejected because applicants who apply for credit for education purposes have a very low chance of approval. The decision is justified, the customer is not personally hurt, and he or she might come back in a few years to apply for a mortgage.
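
For readers who want the mechanics, here is a minimal sketch of one common Delta-p formulation: start from a baseline predicted probability, add one coefficient to the baseline log-odds, and measure how far the probability moves. The coefficient value and baseline probability below are made up for illustration.

```python
# A minimal Delta-p sketch: how far a one-unit change in one predictor moves
# the predicted probability from a chosen baseline. All numbers are made up.
import math

def delta_p(coef: float, baseline_p: float) -> float:
    """Delta-p for a single logistic-regression coefficient, where
    baseline_p is the predicted probability for the baseline case
    (e.g., evaluated at the sample means of all predictors)."""
    base_logit = math.log(baseline_p / (1 - baseline_p))
    new_p = 1 / (1 + math.exp(-(base_logit + coef)))
    return new_p - baseline_p

# Hypothetical coefficient of -1.2 for "loan purpose = education",
# with a 40% baseline approval probability:
print(f"{delta_p(-1.2, 0.40):+.3f}")  # about -0.233
```

A Delta-p of about -0.23 reads directly as "applying for an education loan lowers the predicted approval probability by roughly 23 percentage points", which is exactly the kind of plain statement a banker can give a customer without a data scientist in the room.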


Shifting Left: The Evolving Role of Automation in DevOps Tools

Advanced automation tools eliminate manual, time-consuming per-project configuration within DevOps, thereby removing the friction between developers and DevOps teams when scanning steps need to be added to the jobs of every CI pipeline. Adding jobs or steps to scan code is challenging under the traditional CI-scan model. Advanced automation tools ultimately break down barriers between teams, allow them to play better together, and achieve true DevSecOps integration. At the end of the day, shifting left and automating your CI/CD pipeline will dramatically improve the integration of security within the SDLC. Organizations can instantly onboard their development, security, and operations teams and simplify the governance of their security policies and DevSecOps processes. Traditional application security testing (AST) solution providers are leaving developers behind: without the ability to scan source code directly in your environment, you’re left processing scans manually, which leaves plenty of room for error and adds significant time to your delivery date. If I can leave you with one thing, it’s that integration is key to automation, and the tools you use should enable the furthest shift-left approach possible, where automation occurs within the SDLC itself, changing the way AST solutions are embedded in all DevOps environments.
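
As a rough illustration of what "adding a scanning step" can look like when it is scripted rather than hand-configured per project, the sketch below runs a scan early in the pipeline and blocks everything downstream on failure. "sast-scan" is a placeholder command standing in for whatever scanner your pipeline actually uses, not a specific product's CLI.

```python
# A shift-left gate sketch: run a source scan as an early CI step and fail
# the pipeline on findings. "sast-scan" is a placeholder command standing in
# for whatever scanner your pipeline uses, not a specific product's CLI.
import subprocess
import sys

def run_scan(source_dir: str) -> int:
    """Invoke the scanner on the source tree and return its exit code,
    so the CI job fails fast, before build and deploy stages ever run."""
    result = subprocess.run(["sast-scan", source_dir])
    return result.returncode

if __name__ == "__main__":
    exit_code = run_scan("./src")
    if exit_code != 0:
        print("Security scan failed -- blocking the pipeline.", file=sys.stderr)
    sys.exit(exit_code)
```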


GPT-3 Is an Amazing Research Tool. But OpenAI Isn’t Sharing the Code.

At its heart, GPT-3 is an incredibly powerful tool for writing in the English language. The most important thing about GPT-3 is its size. GPT-3 learned to produce writing by analyzing 45 terabytes of data, and that training process reportedly cost millions of dollars in cloud computing. It has seen human writing in billions of combinations. This is a key part of OpenAI’s long-term strategy. The firm has been saying for years that when it comes to deep learning algorithms, the bigger the better. More data and more computing power make a more capable algorithm. For instance, when OpenAI crushed professional esports players at Dota 2, it was due to its ability to train algorithms on hundreds of GPUs at the same time. It’s something OpenAI leaders have told me previously: Jack Clark, policy director for OpenAI, said that the bigger the algorithm, the “more coherent, more creative, and more reliable” it is. When talking about the amount of training the Dota 2 bots needed, CTO Greg Brockman said, “We just kept waiting for the magic to run out. We kept waiting to hit a wall, and we never seemed to hit a wall.” A similar approach was taken for GPT-3. 


Indian leaders say upskilling key cybersecurity challenge: Microsoft

The pandemic had direct implications for cybersecurity budgets and staffing, with 33 per cent of business leaders in India reporting a 25 per cent budget increase for security. More than half (54 per cent) of the leaders in the country said that they would hire additional security professionals for their security teams. “A vast majority (70 per cent) of leaders in India stated that they plan to speed up deployment of Zero Trust capabilities to reduce risk exposure,” the findings showed. Globally, 90 per cent of businesses have been impacted by phishing attacks, with 28 per cent admitting to being successfully phished. Notably, successful phishing attacks were reported in significantly higher numbers by organizations that described their resources as mostly on-premise (36 per cent) as opposed to more cloud-based. In response to Covid-19, more than 80 per cent of companies added security jobs. While 58 per cent of companies reported an increase in security budgets globally, 65 per cent reported an increase in compliance budgets. “The shift to remote work is fundamentally changing security architecture,” the survey noted.


How to manage your edge infrastructure and devices

“Firstly, a lack of external network connectivity to a device makes it necessary to process data at the edge; typically, this has been due to difficult environments or security requirements. Secondly, there is a need for speed that prevents sending data through a network: latency means moving the data costs more in time than having the processing power of a data centre or the cloud available. This is absolutely true for certain use cases. On the factory floor, for example, there is a desire to prevent network connectivity from being able to bring an entire plant down. In fact, in many factories, the level of bandwidth currently available can often be too low to have all equipment sending data back to the data centre. In this case, it is critical to place analytics tools at the edge with no disruption, sitting the algorithm next to the hardware. However, for businesses with non-critical use cases, this is changing. Over time, the drivers behind the need for edge analytics have changed as network speed and connectivity become faster and more prevalent. As such, the round trip of data across the network, which is getting faster every day, will not hinder digital progress, and thus businesses are increasingly happy to manage infrastructure and devices in this way.”
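
A minimal sketch of the factory-floor pattern in the quote: the analytics runs next to the hardware, and only a small summary, not the raw readings, ever crosses the network. The threshold, window, and payload shape below are invented for illustration.

```python
# Edge-analytics sketch: aggregate raw readings next to the hardware and
# forward only a compact summary when it matters, sparing factory bandwidth.
# The threshold, window, and payload shape are invented for illustration.
from statistics import mean
from typing import Optional

THRESHOLD = 80.0  # hypothetical alarm level for a sensor reading

def process_window(readings) -> Optional[dict]:
    """Summarize one window of readings locally; return a small payload
    only when the data centre actually needs to hear about it."""
    avg = mean(readings)
    if avg > THRESHOLD:
        return {"avg": round(avg, 2), "n": len(readings), "alarm": True}
    return None  # raw readings never leave the edge

summary = process_window([78.2, 81.5, 84.9, 83.1])
if summary:
    print("forward to data centre:", summary)  # only the summary crosses the network
```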


Quote for the day:

“I’m convinced that about half of what separates successful entrepreneurs from non-successful ones is pure perseverance.” -- Steve Jobs