Daily Tech Digest - June 02, 2023

A Data Scientist’s Essential Guide to Exploratory Data Analysis

Analyzing the individual characteristics of each feature is crucial as it will help us decide on their relevance for the analysis and the type of data preparation they may require to achieve optimal results. For instance, we may find values that are extremely out of range and may refer to inconsistencies or outliers. We may need to standardize numerical data or perform a one-hot encoding of categorical features, depending on the number of existing categories. Or we may have to perform additional data preparation to handle numeric features that are shifted or skewed, if the machine learning algorithm we intend to use expects a particular distribution. ... For Multivariate Analysis, best practices focus mainly on two strategies: analyzing the interactions between features, and analyzing their correlations. ... Interactions let us visually explore how each pair of features behaves, i.e., how the values of one feature relate to the values of the other. 


Resilient data backup and recovery is critical to enterprise success

So, what must IT leaders consider? The first step is to establish data protection policies that include encryption and least privilege access permissions. Businesses should then ensure they have three copies of their data – the production copy already exists and is effectively the first copy. The second copy should be stored on a different media type, not necessarily in a different physical location (the logic behind it is to not store your production and backup data in the same storage device). The third copy could or should be an offsite copy that is also offline, air-gapped, or immutable (Amazon S3 with Object Lock is one example). Organizations also need to make sure they have a centralized view of data protection across all environments for greater management, monitoring and governance, and they need orchestration tools to help automate data recovery. Finally, organizations should conduct frequent backup and recovery testing to make sure that everything works as it should.


Data Warehouse Architecture Types

Different architectural approaches offer unique advantages and cater to varying business requirements. In this comprehensive guide, we will explore different data warehouse architecture types, shedding light on their characteristics, benefits, and considerations. Whether you are building a new data warehouse or evaluating your existing architecture, understanding these options will empower you to make informed decisions that align with your organization’s goals. ... Selecting the right data warehouse architecture is a critical decision that directly impacts an organization’s ability to leverage its data assets effectively. Each architecture type has its own strengths and considerations, and there is no one-size-fits-all solution. By understanding the characteristics, benefits, and challenges of different data warehouse architecture types, businesses can align their architecture with their unique requirements and strategic goals. Whether it’s a traditional data warehouse, hub-and-spoke model, federated approach, data lake architecture, or a hybrid solution, the key is to choose an architecture that empowers data-driven insights, scalability, agility, and flexibility.


What is federated Identity? How it works and its importance to enterprise security

FIM has many benefits, including reducing the number of passwords a user needs to remember, improving their user experience and improving security infrastructure. On the downside, federated identity does introduce complexity into application architecture. This complexity can also introduce new attack surfaces, but on balance, properly implemented federated identity is a net improvement to application security. In general, we can see federated identity as improving convenience and security at the cost of complexity. ... Federated single sign-on allows for sharing credentials across enterprise boundaries. As such, it usually relies on a large, well-established entity with widespread security credibility, organizations such as Google, Microsoft, and Amazon, for example. In this case, applications are usually gaining not just a simplified login experience for their users, but the impression and actual reliance on high-level security infrastructure. Put another way, even a small application can add “Sign in with Google” to its login flow relatively easily, giving users a simple login option, which keeps sensitive information in the hands of the big organization.


Millions of PC Motherboards Were Sold With a Firmware Backdoor

Given the millions of potentially affected devices, Eclypsium’s discovery is “troubling,” says Rich Smith, who is the chief security officer of supply-chain-focused cybersecurity startup Crash Override. Smith has published research on firmware vulnerabilities and reviewed Eclypsium’s findings. He compares the situation to the Sony rootkit scandal of the mid-2000s. Sony had hidden digital-rights-management code on CDs that invisibly installed itself on users’ computers and in doing so created a vulnerability that hackers used to hide their malware. “You can use techniques that have traditionally been used by malicious actors, but that wasn’t acceptable, it crossed the line,” Smith says. “I can’t speak to why Gigabyte chose this method to deliver their software. But for me, this feels like it crosses a similar line in the firmware space.” Smith acknowledges that Gigabyte probably had no malicious or deceptive intent in its hidden firmware tool. But by leaving security vulnerabilities in the invisible code that lies beneath the operating system of so many computers, it nonetheless erodes a fundamental layer of trust users have in their machines. 


Minimising the Impact of Machine Learning on our Climate

There are several things we can do to mitigate the negative impact of software on our climate. They will be different depending on your specific scenario. But what they all have in common is that they should strive to be energy-efficient, hardware-efficient and carbon-aware. GSF is gathering patterns for different types of software systems; these have all been reviewed by experts and agreed on by all member organisations before being published. In this section we will cover some of the patterns for machine learning as well as some good practices which are not (yet?) patterns. If we divide the actions after the ML life cycle, or at least a simplified version of it, we get four categories: Project Planning, Data Collection, Design and Training of ML model and finally, Deployment and Maintenance. The project planning phase is the time to start asking the difficult questions, think about what the carbon impact of your project will be and how you plan to measure it. This is also the time to think about your SLA; overcommitting to strict latency or performance metrics that you actually don’t need can quickly become a source of emission you can avoid.


5 ways AI can transform compliance

Compliance is all about controls. Data must be classified according to multiple rules, and the movement and access to that data recorded. It’s the perfect task for AI. Ville Somppi, vice president of industry solutions at M-Files, says: “Thanks to AI, organisations can automatically classify information and apply pre-defined compliance rules. In the case of choosing the right document category from a compliance perspective, the AI can be trained quickly with a small sample set categorised by people. This is convenient, especially when people can still correct wrong suggestions in the beginning of the learning process. ... Data pools are too big for humans to comb through. AI is the only way. In some sectors, adoption of AI has been delayed owing to regulatory issues. However, full deployment ought now to be possible. Gabriel Hopkins chief product officer at Ripjar, says: “Banks and financial services companies face complex responsibilities when it comes to compliance activities, especially with regard to combatting the financing of terrorism and preventing laundering or criminal proceeds.


Former Uber CSO Sullivan on Engaging the Security Community

CISO is a lonely role. There's a really amazing camaraderie between security executives that I'm not sure exists in any other kind of leadership role. The CISO role is pretty new compared to the other leadership roles. It's far from settled what kind of background is ideal for the role. It's far from settled where the person in the role should report. It’s far from settled what kind of a budget you're going to get. It's far from settled in terms of what type of decision-making power you're going to have. So, as a result, I think security leaders often feel lonely and on an island. They have an executive team above them that expects them to know all the answers about security, and then they have a team underneath them that expects them to know all the answers about security. So, they can't betray ignorance to anybody without undermining their role. And so, the security leader community often turns to each other for support, for guidance. There are a good number of Slack channels and conferences that are just CISOs talking through the role and asking for best practices and advice on how to deal with hard situations.


Google Drive Deficiency Allows Attackers to Exfiltrate Workspace Data Without a Trace

Mitiga reached out to Google about the issue, but the researchers said they have not yet received a response, adding that Google's security team typically doesn't recognize forensics deficiencies as a security problem. This highlights a concern when working with software-as-a-service (SaaS) and cloud providers, in that organizations that use their services "are solely dependent on them regarding what forensic data you can have," Aspir notes. "When it comes to SaaS and cloud providers, we’re talking about a shared responsibility regarding security because you can't add additional safeguards within what is given." ... Fortunately, there are steps that organizations using Google Workspace can take to ensure that the issue outlined by Mitiga isn't exploited, the researchers said. This includes keeping an eye out for certain actions in their Admin Log Events feature, such as events about license assignments and revocations, they said.


How defense contractors can move from cybersecurity to cyber resilience

We’re thinking way too small about a coordinated cyberattack’s capacity for creating major disruption to our daily lives. One recent, vivid illustration of that fact happened in 2022, when the Russia-linked cybercrime group Conti launched a series of prolonged attacks on the core infrastructure of the country of Costa Rica, plunging the country into chaos for months. Over a period of two weeks, Conti tried to breach different government organizations nearly every day, targeting a total of 27 agencies. Soon after that, the group launched a separate attack on the country’s health care system, causing tens of thousands of appointments to be canceled and patients to experience delays in getting treatment. The country declared a national emergency and eventually, with the help of allies around the world including the United States and Microsoft, regained control of its systems. The US federal government’s strict compliance standards often impede businesses from excelling beyond the most basic requirements. 



Quote for the day:

"Uncertainty is not an indication of poor leadership; it underscores the need for leadership." -- Andy Stanley

No comments:

Post a Comment