Here's how it works: The Core Infrastructure Initiative (CII) Best Practices badge shows that a project follows security best practices. The badges let others quickly assess which projects are following best practices and are therefore more likely to produce higher-quality, secure software. Over 3,000 projects are taking part in the badging project. There are three badge levels: passing, silver, and gold. Each level requires the OSS project to meet a set of criteria; for silver and gold, that includes meeting the previous level. The passing level captures what well-run OSS projects typically already do; it requires the project to meet 66 criteria in six categories. For example, the passing level requires that the project publicly state how to report vulnerabilities, that tests are added as functionality is added, and that static analysis is used to look for potential problems. As of June 14, 2020, there were 3,195 participating projects, and 443 had earned a passing badge. The silver and gold badges are intentionally more demanding; the silver badge is designed to be harder than passing but still achievable for one-person projects.
It didn’t take long for the AI research community to realize that this massive parallelization also makes GPUs great for deep learning. Like graphics rendering, deep learning involves simple mathematical calculations performed hundreds of thousands of times. In 2011, in a collaboration with chipmaker Nvidia, Google found that a computer vision model it had trained on 2,000 CPUs to distinguish cats from people could achieve the same performance when trained on only 12 GPUs. GPUs became the de facto chip for model training and inferencing—the computational process that happens when a trained model is used for the tasks it was trained for. But GPUs aren’t perfect for deep learning either. For one thing, they cannot function as standalone chips. Because they are limited in the types of operations they can perform, they must be attached to CPUs that handle everything else. GPUs also have a limited amount of cache memory, the data storage area nearest a chip’s processors. This means the bulk of the data is stored off-chip and must be retrieved when it is time for processing. That back-and-forth data flow ends up being a bottleneck for computation, capping the speed at which GPUs can run deep-learning algorithms.
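The memory bottleneck can be made concrete with a back-of-the-envelope arithmetic-intensity calculation. The sketch below is illustrative only: the 10 TFLOP/s and 500 GB/s figures are assumed round numbers for a hypothetical GPU, not measurements of any real chip.

```python
# Back-of-the-envelope: why off-chip data movement caps GPU throughput.
# Arithmetic intensity = FLOPs performed per byte moved to/from memory.

def matmul_intensity(n: int, bytes_per_elem: int = 4) -> float:
    """FLOPs per byte for an n x n matrix multiply: 2*n^3 FLOPs,
    with three n x n operands each read or written once."""
    flops = 2 * n ** 3
    bytes_moved = 3 * n ** 2 * bytes_per_elem
    return flops / bytes_moved

def elementwise_intensity(bytes_per_elem: int = 4) -> float:
    """FLOPs per byte for y = x + b: one FLOP per element,
    two reads and one write per element."""
    return 1 / (3 * bytes_per_elem)

# Assumed balance point: a GPU with 10 TFLOP/s of compute and
# 500 GB/s of memory bandwidth needs ~20 FLOPs per byte of traffic
# to keep its processors busy.
balance = 10e12 / 500e9

print(f"matmul n=1024: {matmul_intensity(1024):.1f} FLOPs/byte")
print(f"elementwise:   {elementwise_intensity():.3f} FLOPs/byte")
print(f"balance point: {balance:.0f} FLOPs/byte")
```

Operations above the balance point are compute-bound; operations below it are memory-bound, spending most of their time waiting on the off-chip data flow described above.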
Beyond governance of Big Data and AI, there’s a second bottleneck, and that’s talent. The well-worn phrase is true: every business is a technology company now; soon, though, most will also be AI companies. So when it comes to hiring good data scientists and AI experts, these businesses will have to compete not only with their peers but also with tech giants like Facebook, Amazon, and Google. Instead of attempting to raid the physics and mathematics departments of their local universities for talent, I therefore recommend that companies look elsewhere for AI experts: on their own payroll. Most businesses have incredible talent in-house. All they have to do is provide their staff with the necessary training and support, which can be done with the help of technology partners, provided these are platform-agnostic so that they can support a wide range of technologies and use cases. Training will have to be delivered on two levels. The first is AI enablement: training staff to program and handle the technical aspects of AI and machine learning; they need to understand how to use bots, deploy robotic process automation, and use machine learning to harness big data.
As we exit the immediate crisis here, the health crisis, and move into a period of economic recovery, we're certainly going to see tremendous amounts of job loss, transitions in needed skills, and our labor force is going to be dramatically affected around the world by what's happening now. We do have an opportunity to think about re-skilling in a new way. Can we provide certain swaths of the economy with educational resources that will help them participate in the technology economy in ways that were not permissible or possible before? Can we think through an infrastructure build that will enable schools, for example, in rural areas or in parts of the world that haven't traditionally had access to technology, to train their students in these kinds of skills? I think there is an opportunity to think systemically about changes that are needed, that have been needed for a long time, quite frankly, and to use this recovery period as an opportunity to bridge that divide and to ensure that we're providing opportunities for everyone.
A few projects are also exploring the potential of blockchain-based federated learning to improve AI outcomes. Federated learning makes it possible for AI algorithms to amass experience from a wide range of siloed data. Instead of moving the data to the computation venue, the computation happens where the data lives, which allows data providers to retain control over their data. However, privacy risks still lurk whenever federated learning is employed. Blockchain can alleviate this risk thanks to its superior traceability and transparency. Also, a smart contract could be used to discourage malicious players by requiring a security deposit that is refundable only if the algorithm doesn’t violate the network’s privacy standards. Ocean Protocol and GNY are two projects exploring blockchain-based federated learning. Ocean recently launched a product, called Compute-to-Data, which allows data providers and data consumers to securely buy and sell data on the blockchain. The Singapore-based startup already counts enterprise names among its users, including Roche Diagnostics, the diagnostic division of multinational healthcare company F. Hoffmann-La Roche AG.
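The "computation moves to the data" idea at the heart of federated learning can be sketched in a few lines. This is a minimal, hypothetical illustration of plain federated averaging on a one-parameter linear model; it does not use any real Ocean Protocol or GNY API, and the silo data is synthetic.

```python
import random

# Minimal federated-averaging sketch for a 1-D linear model y = w * x.
# Each silo computes a gradient locally on its own data; only the
# gradient (never the raw data) is sent back to be averaged.

def local_gradient(w, data):
    """Mean-squared-error gradient, computed where the data lives."""
    return sum(2 * (w * x - y) * x for x, y in data) / len(data)

def federated_step(w, silos, lr=0.05):
    grads = [local_gradient(w, silo) for silo in silos]  # runs at each silo
    return w - lr * sum(grads) / len(grads)              # server averages

random.seed(0)
true_w = 3.0
silos = [[(x, true_w * x) for x in (random.uniform(-1, 1) for _ in range(50))]
         for _ in range(4)]  # four data providers; data is never pooled

w = 0.0
for _ in range(500):
    w = federated_step(w, silos)
print(f"learned w = {w:.2f}")  # approaches the true value 3.0
```

The providers end up with a shared model without any of them revealing a single raw data point; the privacy risk the paragraph mentions is that the exchanged gradients themselves can still leak information, which is what the blockchain-based auditing and deposit schemes aim to police.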
At one end of the spectrum is data, and the ingestion of data into data warehouses and data lakes. AI systems, and in particular ML, run on large volumes of structured and unstructured data — it is the material from which organizations can generate insights, decisions, and outcomes. In its raw form, it is easy to democratize, enabling people to perform basic analyses. Already, a number of technology providers have created data explorers to help users search and visualize openly available data sets. Next along the spectrum come the algorithms into which the data is fed. Here the value and complexity increase, as the data is put to work. At this point, democratization is still relatively easy to achieve, and algorithms are widely accessible; open source code repositories such as GitHub (purchased by Microsoft in 2018) have been growing significantly over the past decade. But understanding algorithms requires a basic grasp of computer science and a mathematics or statistics background. As we continue to move along the spectrum to storage and computing platforms, the complexity increases. During the past five years, the technology platform for AI has moved to the cloud with three major AI/ML providers: Amazon Web Services (AWS), Microsoft Azure, and Google Compute Engine.
Mostly, though, Memory Bots became routine and part of the social fabric of the future as controversies faded, laws and regulations were refined to curb abuses and maximize safe usage, and people became intrigued and distracted by the latest new gadget that was going to wow them, then scare them, and then become routine. In the old Shlain Goldberg house in Marin County, you could still find Ken, or the essence and memories of Ken, captured inside an eight-inch-tall black cylindrical tube on the kitchen counter that looked remarkably like an ancient Alexa. (Sadly, Ken, as well as Tiffany, had just missed the advent of longevity tech that allowed their daughter to live thousands of years and counting.) Except that Ken-Alexa had a swivel head that was constantly recording everything, with the positive-negative filter still set right where Ken had left it, in the middle of the dial. Even when Odessa was centuries old but still looked the same as she did when she was 25, she could talk to her dad, and ask him questions, and hear him laugh.
We needed to learn to think in monitoring terms, learn more about monitoring tooling, and work out how best to monitor. Most monitoring systems are set up for platform and operations monitoring; using them for application monitoring means taking them somewhere new. Early on, we got some weirdness out of our monitoring: the system was telling us we had issues when we didn’t. It sounds silly now, but reading and re-reading the monitoring system documentation until we really got it helped. Digging deeper into how different types of metrics and monitors were designed to be used allowed us to build a more stable monitoring system. We also found that there were things we wanted to do that we couldn’t do with out-of-the-box monitoring. Our early application monitoring was noisy and misfired; too frequently it told us we had problems we didn’t have. We kept iterating. We ended up building more of the monitoring in code than we expected, but it was well worth the time. We got the bare bones of a monitoring system early, and by using it in the real world, we worked out what we really needed.
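One common way to cut down that kind of misfiring, sketched below, is to alert only on sustained breaches rather than single bad samples. The class name, threshold, and window size here are invented for illustration, not taken from the team's actual code.

```python
from collections import deque

class SustainedThresholdMonitor:
    """Fires only when a metric breaches its threshold for `window`
    consecutive samples, filtering out the one-off spikes that make
    naive threshold checks noisy."""

    def __init__(self, threshold: float, window: int = 3):
        self.threshold = threshold
        self.samples = deque(maxlen=window)  # keeps only the last `window` values

    def record(self, value: float) -> bool:
        """Record a sample; return True if the alert should fire."""
        self.samples.append(value)
        return (len(self.samples) == self.samples.maxlen
                and all(v > self.threshold for v in self.samples))

# Example: alert when p95 latency stays above 500 ms for 3 samples.
monitor = SustainedThresholdMonitor(threshold=500, window=3)
readings = [120, 900, 130, 610, 720, 680, 300]
fired = [monitor.record(r) for r in readings]
print(fired)  # the lone 900 ms spike never fires; the sustained run does
```

This is the sort of small behavioral tweak that is awkward to express in out-of-the-box alert rules but trivial once the monitoring lives in code.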
The machine vision systems in cars today are excellent at recognizing obstacles like other vehicles and pedestrians. Anticipating how they’ll act is another issue entirely. People behave irrationally by running red lights or jaywalking, and that kind of behavior is hard for an AI to anticipate or react to. These AI systems will get better with more training data, but collecting that data can be complicated. Right now, putting an autonomous car on the road can be dangerous, but cars need to be out there to gather data. As a result, the process of getting all the necessary training may be a long one. Autonomous cars may not be ready to disrupt the industry, but implementation is still possible. Public transportation is an ideal application for today’s self-driving vehicles because it’s a more predictable form of driving. By driving pre-defined routes at slower speeds, autonomous public transports can start to gather that all-important training data. Some companies have already started taking advantage of this area. A business called May Mobility has been running self-driving shuttles to train stops since May 2019.
Including a section in apps that provides transparency on how they use data can help ease security concerns. Zoom, which has been in the news due to its increased use amid COVID-19 and related security concerns, recently brought in leaders in the security space and made a new acquisition to help. Having a strong opt-in strategy is also important. Apple and Google have a good approach with their work on contact tracing. But opting in is not going to give you all – or even enough – of the data. ... The CDO should set strategy for managing all of an organization's data – both from a defensive standpoint (addressing compliance regulations, data privacy, good data hygiene, etc.) and from an offensive one (making data more easily consumable for those who want and need it). Some key agencies do plan to have specialist CDOs. The Department of Defense has been working to recruit candidates for its CDO position. And at the end of March, the Centers for Disease Control and Prevention (CDC) published the official job post for its CDO opening. ... Consumers are grappling with data collection, something they've struggled with for a while. People are trying to become more educated about application data collection and personal data privacy and security.
Quote for the day: