Daily Tech Digest - June 13, 2022

The Increasingly Graphic Nature Of Intel Datacenter Compute

What customers are no doubt telling Intel and AMD is that they want highly tuned pieces of hardware co-designed with very precise workloads, and that they will want them at much lower volumes for each multi-motor configuration than chip makers and system builders are used to. Therefore, these compute engine complexes we call servers will carry higher unit costs than chip makers and system builders are used to, but not necessarily higher profits. In fact, quite possibly lower profits, if you can believe it. This is why Intel is taking a third whack at discrete GPUs with its Xe architecture, and specifically with the “Ponte Vecchio” Xe HPC GPU accelerator at the heart of the “Aurora” supercomputer at Argonne National Laboratory. And this time the architecture of the GPUs is a superset of the integrated GPUs in its laptops and desktops, not some Frankenstein X86 architecture that is not really tuned for graphics, even if it could be used as a massively parallel compute engine in the way that GPUs from Nvidia and AMD have been transformed.


Under the hood: Meta’s cloud gaming infrastructure

Our goal within each edge computing site is to have a unified hosting environment so we can run as many games as possible, as smoothly as possible. Today’s games are designed for GPUs, so we partnered with NVIDIA to build a hosting environment on top of NVIDIA Ampere architecture-based GPUs. As games continue to become more graphically intensive and complex, GPUs will provide us with the high fidelity and low latency we need for loading, running, and streaming games. To run the games themselves, we use Twine, our cluster management system, on top of our edge computing operating system. We built orchestration services to manage the streaming signals and use Twine to coordinate the game servers on the edge. We built container technologies for both Windows and Android games, with a different hosting solution for each; the Windows hosting solution came with our integration of PlayGiga. We’ve built a consolidated orchestration system to manage and run the games for both operating systems.
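Twine and Meta's other internal systems are not public, so as a purely hypothetical illustration of the flow the excerpt describes (pick an edge site, launch a per-platform game container, hand the session to a streaming orchestrator), here is a toy Python sketch; every name, data structure, and placement rule below is invented:

```python
import random
from dataclasses import dataclass

@dataclass
class GameSession:
    user_id: str
    platform: str      # "windows" or "android"
    edge_site: str
    container_id: str

# Hypothetical stand-in for cluster state; a real system would query
# its cluster manager (Twine, in Meta's case) for capacity.
EDGE_SITES = {"edge-west": 3, "edge-east": 5}  # site -> free GPU slots

def pick_edge_site() -> str:
    """Naive placement: the site with the most free GPU slots. Real
    placement would also weigh network latency to the player."""
    return max(EDGE_SITES, key=EDGE_SITES.get)

def launch_container(platform: str, site: str) -> str:
    """Pretend call into the cluster manager to start a per-platform
    game container on a GPU host at the chosen edge site."""
    EDGE_SITES[site] -= 1
    return f"{platform}-game-{random.randrange(10_000)}"

def start_session(user_id: str, platform: str) -> GameSession:
    site = pick_edge_site()
    container = launch_container(platform, site)
    # A separate orchestration service would now wire up the stream
    # between this container and the player's client.
    return GameSession(user_id, platform, site, container)

print(start_session("u123", "android"))
```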


Google AI Introduces ‘LIMoE’

A typical Transformer comprises several “blocks,” each containing several distinct layers. One of these layers is a feed-forward network (FFN). In LIMoE and the works described above, this single FFN is replaced by an expert layer containing multiple parallel FFNs, each of which is an expert. Given a sequence of tokens to process, a simple router predicts which experts should handle which tokens. ... If only one expert is activated per token, the model’s cost is comparable to the standard dense Transformer. LIMoE does exactly that, activating one expert per example and thereby matching the dense baselines’ compute cost. The LIMoE router, however, may see tokens of either image or text data. MoE models fail in a distinctive way when they try to send all tokens to the same expert, so auxiliary losses, or additional training objectives, are commonly used to encourage balanced expert utilization. The Google AI team discovered that combining multiple modalities with sparsity produced novel failure modes that conventional auxiliary losses could not solve, so they designed new auxiliary losses to address this.
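To make the routing idea concrete, here is a minimal PyTorch sketch of a sparse expert layer with top-1 routing, the general pattern the excerpt describes. It is an illustration of the MoE technique, not LIMoE's actual implementation, and it omits the auxiliary load-balancing losses discussed above:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class Top1MoELayer(nn.Module):
    """A single FFN replaced by parallel FFN "experts" plus a router.

    With top-1 routing, each token activates exactly one expert, so
    compute per token stays comparable to a dense FFN.
    """
    def __init__(self, d_model: int, d_hidden: int, num_experts: int):
        super().__init__()
        self.router = nn.Linear(d_model, num_experts)  # one logit per expert
        self.experts = nn.ModuleList([
            nn.Sequential(
                nn.Linear(d_model, d_hidden),
                nn.GELU(),
                nn.Linear(d_hidden, d_model),
            )
            for _ in range(num_experts)
        ])

    def forward(self, tokens: torch.Tensor) -> torch.Tensor:
        # tokens: (num_tokens, d_model); in a multimodal setting, image
        # and text tokens would all pass through this same router.
        probs = F.softmax(self.router(tokens), dim=-1)
        gate, expert_idx = probs.max(dim=-1)           # top-1 expert per token
        out = torch.zeros_like(tokens)
        for e, expert in enumerate(self.experts):
            mask = expert_idx == e
            if mask.any():
                # Scale by the gate value so routing stays differentiable.
                out[mask] = gate[mask].unsqueeze(-1) * expert(tokens[mask])
        return out

layer = Top1MoELayer(d_model=8, d_hidden=32, num_experts=4)
mixed_tokens = torch.randn(10, 8)
print(layer(mixed_tokens).shape)  # torch.Size([10, 8])
```

The failure mode the excerpt mentions is visible here: nothing stops the router from sending every token to the same expert, which is exactly what the auxiliary balancing losses are designed to discourage.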


Stop Splitting Yourself in Half: Seek Out Work-Life Boundaries, Not Balance

What makes boundaries different from balance? Balance implies two things that aren't equal that you're constantly trying to make equal. It creates the expectation of a clear-cut division. A work-life balance fails to acknowledge that you are a whole person, and sometimes things can be out of balance without anything being wrong. Sometimes you'll spend days, weeks and even whole seasons of life choosing to lean more into one part of your life than the other. Boundaries ask you to think about what's important to you, what drives you, and what authenticity looks like for you. Boundaries require self-awareness and self-reflection, along with a willingness and ability to prioritize. Those qualities help you to be more aware and more capable of making decisions at a given moment. By establishing boundaries grounded in your priorities, you're more equipped to make choices. Boundaries empower you to say, "This is what I'm choosing right now. I need to be fully here until this is done." Boundaries aren't static, either. 


Why it’s time for 'data-centric artificial intelligence'

AI systems need both code and data, and “all that progress in algorithms means it's actually time to spend more time on the data,” Ng said at the recent EmTech Digital conference hosted by MIT Technology Review. Focusing on high-quality, consistently labeled data would unlock the value of AI for sectors such as health care, government technology, and manufacturing, Ng said. “If I go see a health care system or manufacturing organization, frankly, I don't see widespread AI adoption anywhere.” This is due in part to the ad hoc way data has been engineered, which often relies on the luck or skills of individual data scientists, said Ng, who is also the founder and CEO of Landing AI. Data-centric AI is a new idea that is still being discussed, Ng said, including at a data-centric AI workshop he convened last December. ... Data-centric AI is a key part of the solution, Ng said, as it could give people the tools to engineer data and build the custom AI systems they need. “That seems to me, the only recipe I'm aware of, that could unlock a lot of this value of AI in other industries,” he said.


How Do We Utilize Chaos Engineering to Become Better Cloud-Native Engineers?

The main goal of Chaos Engineering is captured in the canonical definition: “Chaos Engineering is the discipline of experimenting on a system in order to build confidence in the system’s capability to withstand turbulent conditions in production.” The idea of Chaos Engineering is to identify weaknesses and reduce uncertainty when building a distributed system. As I mentioned above, building distributed systems at scale is challenging, and since such systems tend to be composed of many moving parts, Chaos Engineering practices have proven to be a great way to surface those weaknesses and reduce the blast radius of failures. We also leverage Chaos Engineering principles to achieve things beyond its main objective. The “On-call like a king” workshops aim to achieve two goals in parallel: (1) train engineers on production failures that we had recently; and (2) train engineers on cloud-native practices and tooling, and on how to become better cloud-native engineers!
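As a toy illustration of the experimenting-on-a-system idea (not any particular chaos tool such as Chaos Monkey or Litmus), the Python sketch below wraps a service call with injected latency and failures so that a caller's resilience logic, here a simple retry loop, can be exercised:

```python
import random
import time

def chaotic_call(service_fn, failure_rate=0.3, max_delay_s=1.0):
    """Fault-injection wrapper: randomly raise an error or add latency
    around a service call, simulating turbulent conditions."""
    if random.random() < failure_rate:
        raise ConnectionError("chaos: injected dependency failure")
    time.sleep(random.uniform(0, max_delay_s))  # chaos: injected latency
    return service_fn()

def fetch_user():
    # Stand-in for a real downstream dependency.
    return {"id": 42}

# The experiment: does the caller's retry logic survive injected faults?
for attempt in range(3):
    try:
        print(chaotic_call(fetch_user))
        break
    except ConnectionError as err:
        print(f"attempt {attempt + 1} failed: {err}")
```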


The 3 Phases of Infrastructure Automation

Manually provisioning and updating infrastructure multiple times a day from different sources, in various clouds or on-premises data centers, using numerous workflows is a recipe for chaos. Teams will have difficulty collaborating or even sharing a view of the organization’s infrastructure. To solve this problem, organizations must adopt an infrastructure provisioning workflow that stays consistent across any cloud, service or private data center. The workflow also needs extensibility via APIs, to connect infrastructure and developer tools within that workflow, and the visibility to view and search infrastructure across multiple providers. ... The old-school, ticket-based approach to infrastructure provisioning turns IT into a gatekeeper: they act as governors of the infrastructure, but they also create bottlenecks and limit developer productivity. Yet allowing anyone to provision infrastructure without checks or tracking can leave the organization vulnerable to security risks, non-compliance and expensive operational inefficiencies.
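One way to picture the middle ground between ticket-based gatekeeping and ungoverned self-service is a policy check that auto-approves provisioning requests when they pass guardrails. The sketch below is hypothetical Python pseudologic rather than any particular product's API; the provider names, cost figures, and rules are invented for illustration:

```python
from dataclasses import dataclass

@dataclass
class ProvisionRequest:
    requester: str
    provider: str       # e.g. "aws", "gcp", "on-prem"
    instance_type: str
    tags: dict

# Hypothetical organization-wide guardrails.
ALLOWED_PROVIDERS = {"aws", "gcp", "on-prem"}
MAX_COST_PER_HOUR = 2.0
COST_TABLE = {"t3.small": 0.02, "m5.4xlarge": 0.77, "p4d.24xlarge": 32.77}

def policy_check(req: ProvisionRequest) -> list:
    """Guardrails that replace the ticket queue: a request is approved
    automatically when it violates no policy, so IT governs without
    becoming a bottleneck."""
    violations = []
    if req.provider not in ALLOWED_PROVIDERS:
        violations.append(f"provider {req.provider!r} not approved")
    if COST_TABLE.get(req.instance_type, float("inf")) > MAX_COST_PER_HOUR:
        violations.append(f"{req.instance_type} exceeds cost threshold")
    if "owner" not in req.tags:
        violations.append("missing 'owner' tag for cost tracking")
    return violations

req = ProvisionRequest("dev-team-a", "aws", "t3.small", {"owner": "dev-team-a"})
issues = policy_check(req)
print("approved" if not issues else f"rejected: {issues}")
```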


Questioning the ethics of computer chips that use lab-grown human neurons

While silicon computers transformed society, they are still outmatched by the brains of most animals. For example, a cat’s brain contains 1,000 times more data storage than an average iPad and can use this information a million times faster. The human brain, with its trillion neural connections, is capable of 15 quintillion operations per second. This can only be matched today by massive supercomputers using vast amounts of energy. The human brain, by contrast, uses only about 20 watts of energy, roughly what it takes to power a lightbulb. It would take 34 coal-powered plants generating 500 megawatts each to store the same amount of data contained in one human brain in modern data storage centres. Companies do not need brain tissue samples from donors, but can simply grow the neurons they need in the lab from ordinary skin cells using stem cell technologies. Scientists can engineer cells from blood samples or skin biopsies into a type of stem cell that can then become any cell type in the human body.


How Digital Twins & Data Analytics Power Sustainability

Seeding technology innovation across an enterprise requires broader and deeper communication and collaboration than in the past, says Aapo Markkanen, an analyst in the technology and service providers research unit at Gartner. “There’s a need to innovate and iterate faster, and in a more dynamic way. Technology must enable processes such as improved materials science and informatics and simulations.” Digital twins are typically at the center of the equation, says Mark Borao, a partner at PwC. Various groups, such as R&D and operations, must have systems in place that allow teams to analyze diverse raw materials, manufacturing processes, and recycling and disposal options -- and understand how different factors are likely to play out over time -- before an organization “commits time, money and other resources to a project,” he says. These systems “bring together data and intelligence at a massive scale to create virtual mirrored worlds of products and processes,” Podder adds. In fact, they deliver visibility beyond Scope 1 and Scope 2 emissions, and into Scope 3 emissions.


API security warrants its own specific solution

If the API doesn’t apply sufficient internal rate limiting on parameters such as response timeouts, memory, payload size, number of processes, records and requests, attackers can send multiple API requests to create a denial of service (DoS) attack. This overwhelms back-end systems, crashing the application or driving resource costs up. Prevention requires API resource consumption limits to be set. This means setting thresholds for the number of API calls and client notifications such as resets and lockouts. Server-side, validate the size of the response in terms of the number of records and resource consumption tolerances. Finally, define and enforce the maximum size of data the API will support on all incoming parameters and payloads, using metrics such as the length of strings and number of array elements. ... Effectively a different spin on BOLA (broken object level authorization), broken function level authorization sees the attacker able to send requests to functions that they are not permitted to access. It is effectively an escalation of privilege: because access permissions are not enforced or segregated, the attacker can impersonate an admin, helpdesk agent, or superuser and carry out commands or access sensitive functions, paving the way for data exfiltration.
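For the resource-consumption point above, a token bucket is one common way to implement per-client request limits. The following Python sketch is illustrative only, not drawn from the article; real deployments usually enforce these limits at an API gateway or middleware layer rather than in handler code:

```python
import time

class TokenBucket:
    """Minimal token-bucket rate limiter: each request spends a token,
    and tokens refill at a fixed rate up to a burst capacity."""
    def __init__(self, rate_per_s: float, burst: int):
        self.rate = rate_per_s
        self.capacity = burst
        self.tokens = float(burst)
        self.last = time.monotonic()

    def allow(self) -> bool:
        now = time.monotonic()
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False  # caller should return HTTP 429 with Retry-After

# Payload-size ceilings complement rate limits (value is illustrative).
MAX_PAYLOAD_BYTES = 64 * 1024

limiter = TokenBucket(rate_per_s=5, burst=10)
for i in range(12):
    print(i, "allowed" if limiter.allow() else "throttled (429)")
```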



Quote for the day:

"To make a decision, all you need is authority. To make a good decision, you also need knowledge, experience, and insight." -- Denise Moreland
