A Data Scientist’s Essential Guide to Exploratory Data Analysis
Analyzing the individual characteristics of each feature is crucial as it will
help us decide on their relevance for the analysis and the type of data
preparation they may require to achieve optimal results. For instance, we may
find values that are extremely out of range and may indicate inconsistencies or
outliers. We may need to standardize numerical data or perform a one-hot
encoding of categorical features, depending on the number of existing
categories. Or we may have to perform additional data preparation to handle
numeric features that are shifted or skewed, if the machine learning algorithm
we intend to use expects a particular distribution. ... For Multivariate
Analysis, best practices focus mainly on two strategies: analyzing the
interactions between features, and analyzing their correlations. ...
Interactions let us visually explore how each pair of features behaves, i.e.,
how the values of one feature relate to the values of the other.
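
As a rough illustration of the checks described in this excerpt, here is a
minimal sketch using only the Python standard library. The toy data and the
helper names (standardize, skewness, one_hot, pearson) are assumptions made
for this example and do not come from the article; in practice a library such
as pandas would normally do this work.

```python
import math
import statistics

def standardize(values):
    """Rescale values to zero mean and unit (population) standard deviation."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    return [(v - mean) / std for v in values]

def skewness(values):
    """Population skewness: the mean of the cubed z-scores."""
    return statistics.fmean([z ** 3 for z in standardize(values)])

def one_hot(categories):
    """Map each category to a 0/1 indicator row, one column per distinct value."""
    levels = sorted(set(categories))
    return levels, [[int(c == level) for level in levels] for c in categories]

def pearson(xs, ys):
    """Pearson correlation: the mean product of paired z-scores."""
    pairs = zip(standardize(xs), standardize(ys))
    return statistics.fmean([a * b for a, b in pairs])

# Hypothetical toy data, for illustration only.
income = [20, 22, 25, 27, 30, 200]   # one extreme value makes this right-skewed
age = [23, 25, 31, 35, 40, 52]
city = ["Lisbon", "Porto", "Lisbon", "Faro", "Porto", "Lisbon"]

# Univariate: a large positive skewness flags a long right tail.
print("income skewness:", skewness(income))
log_income = [math.log(v) for v in income]
print("log-income skewness:", skewness(log_income))

# Categorical: with only a few distinct values, one-hot encoding stays small.
levels, encoded = one_hot(city)
print(levels, encoded[0])

# Multivariate: correlation between a pair of numeric features.
print("corr(age, income):", pearson(age, income))
```

The log transform is just one common way to reduce right skew; whether it is
appropriate depends on the feature and on the algorithm that will consume it.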
Resilient data backup and recovery is critical to enterprise success
So, what must IT leaders consider? The first step is to establish data
protection policies that include encryption and least privilege access
permissions. Businesses should then ensure they have three copies of their
data. The production copy already exists and is effectively the first copy. The
second copy should be stored on a different media type, though not necessarily
in a different physical location (the point is to avoid keeping production and
backup data on the same storage device). The third copy should ideally be
an offsite copy that is also offline, air-gapped, or immutable (Amazon S3 with
Object Lock is one example). Organizations also need to make sure they have a
centralized view of data protection across all environments for greater
management, monitoring and governance, and they need orchestration tools to help
automate data recovery. Finally, organizations should conduct frequent backup
and recovery testing to make sure that everything works as it should.
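
The three-copy guidance in this excerpt can be expressed as a simple policy
check. The sketch below is only one possible reading of it: the BackupCopy
fields, the media labels, and the policy_gaps helper are hypothetical names
invented for this example, not part of any backup product.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class BackupCopy:
    name: str
    media: str               # hypothetical labels, e.g. "san", "nas", "object-storage"
    offsite: bool = False
    protected: bool = False  # offline, air-gapped, or immutable

def policy_gaps(copies):
    """Return human-readable gaps against the three-copy guidance."""
    gaps = []
    if len(copies) < 3:
        gaps.append(f"only {len(copies)} copies; need at least 3")
    if len({c.media for c in copies}) < 2:
        gaps.append("all copies share one media type")
    if not any(c.offsite and c.protected for c in copies):
        gaps.append("no offsite copy that is offline, air-gapped, or immutable")
    return gaps

# Hypothetical inventory: production, a local backup, and an immutable offsite copy.
copies = [
    BackupCopy("production", media="san"),
    BackupCopy("local-backup", media="nas"),
    BackupCopy("cloud-archive", media="object-storage", offsite=True, protected=True),
]
print(policy_gaps(copies))      # → []
print(policy_gaps(copies[:2]))  # reports the missing third copy and offsite copy
```

A check like this only validates an inventory; it does not replace the regular
recovery testing the excerpt recommends.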
Data Warehouse Architecture Types
Different architectural approaches offer unique advantages and cater to
varying business requirements. In this comprehensive guide, we will explore
different data warehouse architecture types, shedding light on their
characteristics, benefits, and considerations. Whether you are building a new
data warehouse or evaluating your existing architecture, understanding these
options will empower you to make informed decisions that align with your
organization’s goals. ... Selecting the right data warehouse architecture is a
critical decision that directly impacts an organization’s ability to leverage
its data assets effectively. Each architecture type has its own strengths and
considerations, and there is no one-size-fits-all solution. By understanding
the characteristics, benefits, and challenges of different data warehouse
architecture types, businesses can align their architecture with their unique
requirements and strategic goals. Whether it’s a traditional data warehouse,
hub-and-spoke model, federated approach, data lake architecture, or a hybrid
solution, the key is to choose an architecture that empowers data-driven
insights, scalability, agility, and flexibility.
What is federated Identity? How it works and its importance to enterprise security
FIM has many benefits, including reducing the number of passwords a user needs
to remember, improving their user experience and improving security
infrastructure. On the downside, federated identity does introduce complexity
into application architecture. This complexity can also introduce new attack
surfaces, but on balance, properly implemented federated identity is a net
improvement to application security. In general, we can see federated identity
as improving convenience and security at the cost of complexity. ... Federated
single sign-on allows for sharing credentials across enterprise boundaries. As
such, it usually relies on a large, well-established entity with widespread
security credibility, such as Google, Microsoft, or Amazon. In this case,
applications usually gain not just a simplified login experience for their
users, but also the impression of, and actual reliance on, high-level
security infrastructure. Put another way, even a small application
can add “Sign in with Google” to its login flow relatively easily, giving
users a simple login option, which keeps sensitive information in the hands of
the big organization.
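
To make the "Sign in with Google" flow a little more concrete, here is a
minimal sketch of the first step of an OpenID Connect authorization-code
login: building the URL the application redirects the user to. The client ID
and redirect URI are placeholders, and build_sign_in_url is a hypothetical
helper written for this example. Later steps (exchanging the returned code for
tokens and validating the ID token) are not shown.

```python
import secrets
from urllib.parse import parse_qs, urlencode, urlsplit

# Google's OAuth 2.0 / OpenID Connect authorization endpoint.
GOOGLE_AUTH_ENDPOINT = "https://accounts.google.com/o/oauth2/v2/auth"

def build_sign_in_url(client_id, redirect_uri):
    """Return (url, state, nonce) for an authorization-code sign-in request."""
    state = secrets.token_urlsafe(16)  # echoed back by the provider; guards against CSRF
    nonce = secrets.token_urlsafe(16)  # bound into the ID token; guards against replay
    params = {
        "client_id": client_id,
        "redirect_uri": redirect_uri,
        "response_type": "code",
        "scope": "openid email",
        "state": state,
        "nonce": nonce,
    }
    return f"{GOOGLE_AUTH_ENDPOINT}?{urlencode(params)}", state, nonce

url, state, nonce = build_sign_in_url(
    client_id="YOUR_CLIENT_ID.apps.googleusercontent.com",  # placeholder
    redirect_uri="https://app.example.com/callback",        # placeholder
)
print(url)
print(parse_qs(urlsplit(url).query)["scope"])  # → ['openid email']
```

Note that the user's password never appears in this code: the user
authenticates on the identity provider's own pages, and the application only
receives a short-lived code in return.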
Millions of PC Motherboards Were Sold With a Firmware Backdoor
Given the millions of potentially affected devices, Eclypsium’s discovery is
“troubling,” says Rich Smith, who is the chief security officer of
supply-chain-focused cybersecurity startup Crash Override. Smith has published
research on firmware vulnerabilities and reviewed Eclypsium’s findings. He
compares the situation to the Sony rootkit scandal of the mid-2000s. Sony had
hidden digital-rights-management code on CDs that invisibly installed itself
on users’ computers and in doing so created a vulnerability that hackers used
to hide their malware. “You can use techniques that have traditionally been
used by malicious actors, but that wasn’t acceptable, it crossed the line,”
Smith says. “I can’t speak to why Gigabyte chose this method to deliver their
software. But for me, this feels like it crosses a similar line in the
firmware space.” Smith acknowledges that Gigabyte probably had no malicious or
deceptive intent in its hidden firmware tool. But by leaving security
vulnerabilities in the invisible code that lies beneath the operating system
of so many computers, it nonetheless erodes a fundamental layer of trust users
have in their machines.
Minimising the Impact of Machine Learning on our Climate
There are several things we can do to mitigate the negative impact of software
on our climate. They will be different depending on your specific scenario.
But what they all have in common is that they should strive to be
energy-efficient, hardware-efficient and carbon-aware. GSF is gathering
patterns for different types of software systems; these have all been reviewed
by experts and agreed on by all member organisations before being published.
In this section we will cover some of the patterns for machine learning as
well as some good practices which are not (yet?) patterns. If we group the
actions by stage of the ML life cycle, or at least a simplified version of it,
we get four categories: Project Planning, Data Collection, Design and Training
of the ML Model, and finally Deployment and Maintenance. The project planning phase
is the time to start asking the difficult questions, think about what the
carbon impact of your project will be and how you plan to measure it. This is
also the time to think about your SLA; overcommitting to strict latency or
performance targets that you don’t actually need can quickly become a source
of avoidable emissions.
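
One simple way to start measuring, sketched below, is to estimate operational
emissions as the energy consumed multiplied by the carbon intensity of the
electricity grid, and to compare candidate regions before committing to one.
All figures and region names here are hypothetical placeholders, not
measurements, and the estimate ignores embodied hardware emissions.

```python
def emissions_kg(energy_kwh, intensity_g_per_kwh):
    """Operational emissions in kg CO2e: energy used times grid carbon intensity."""
    return energy_kwh * intensity_g_per_kwh / 1000.0

# Hypothetical figures for illustration only.
training_energy_kwh = 500.0
grid_intensity = {  # g CO2e per kWh, by (made-up) region
    "region-a": 450.0,
    "region-b": 120.0,
    "region-c": 30.0,
}

estimates = {
    region: emissions_kg(training_energy_kwh, intensity)
    for region, intensity in grid_intensity.items()
}
greenest = min(estimates, key=estimates.get)
print(greenest, estimates[greenest])  # → region-c 15.0
```

The same comparison can be made across time windows instead of regions when
the workload can be scheduled flexibly.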
5 ways AI can transform compliance
Compliance is all about controls. Data must be classified according to
multiple rules, and the movement of and access to that data recorded. It’s the
perfect task for AI. Ville Somppi, vice president of industry solutions at
M-Files, says: “Thanks to AI, organisations can automatically classify
information and apply pre-defined compliance rules. In the case of choosing
the right document category from a compliance perspective, the AI can be
trained quickly with a small sample set categorised by people. This is
convenient, especially when people can still correct wrong suggestions in the
beginning of the learning process. ... Data pools are too big for humans to
comb through. AI is the only way. In some sectors, adoption of AI has been
delayed owing to regulatory issues. However, full deployment ought now to be
possible. Gabriel Hopkins, chief product officer at Ripjar, says: “Banks and
financial services companies face complex responsibilities when it comes to
compliance activities, especially with regard to combatting the financing of
terrorism and preventing laundering of criminal proceeds.
Former Uber CSO Sullivan on Engaging the Security Community
CISO is a lonely role. There's a really amazing camaraderie between security
executives that I'm not sure exists in any other kind of leadership role. The
CISO role is pretty new compared to the other leadership roles. It's far from
settled what kind of background is ideal for the role. It's far from settled
where the person in the role should report. It’s far from settled what kind of
a budget you're going to get. It's far from settled in terms of what type of
decision-making power you're going to have. So, as a result, I think security
leaders often feel lonely and on an island. They have an executive team above
them that expects them to know all the answers about security, and then they
have a team underneath them that expects them to know all the answers about
security. So, they can't betray ignorance to anybody without undermining their
role. And so, the security leader community often turns to each other for
support, for guidance. There are a good number of Slack channels and
conferences that are just CISOs talking through the role and asking for best
practices and advice on how to deal with hard situations.
Google Drive Deficiency Allows Attackers to Exfiltrate Workspace Data Without a Trace
Mitiga reached out to Google about the issue, but the researchers said they
have not yet received a response, adding that Google's security team typically
doesn't recognize forensics deficiencies as a security problem. This
highlights a concern when working with software-as-a-service (SaaS) and cloud
providers, in that organizations that use their services "are solely dependent
on them regarding what forensic data you can have," Aspir notes. "When it
comes to SaaS and cloud providers, we’re talking about a shared responsibility
regarding security because you can't add additional safeguards within what is
given." ... Fortunately, there are steps that organizations using Google
Workspace can take to ensure that the issue outlined by Mitiga isn't
exploited, the researchers said. These include keeping an eye out for certain
actions in their Admin Log Events feature, such as events about license
assignments and revocations.
How defense contractors can move from cybersecurity to cyber resilience
We’re thinking way too small about a coordinated cyberattack’s capacity for
creating major disruption to our daily lives. One recent, vivid illustration
of that fact happened in 2022, when the Russia-linked cybercrime group Conti
launched a series of prolonged attacks on the core infrastructure of the
country of Costa Rica, plunging the country into chaos for months. Over a
period of two weeks, Conti tried to breach different government organizations
nearly every day, targeting a total of 27 agencies. Soon after that, the group
launched a separate attack on the country’s health care system, causing tens
of thousands of appointments to be canceled and patients to experience delays
in getting treatment. The country declared a national emergency and
eventually, with the help of allies around the world including the United
States and Microsoft, regained control of its systems. The US federal
government’s strict compliance standards often impede businesses from
excelling beyond the most basic requirements.
Quote for the day:
"Uncertainty is not an indication of poor leadership; it underscores the need
for leadership." -- Andy Stanley