The Beautiful Lies of Machine Learning in Security
The biggest challenge in ML is availability of relevant, usable data to solve
your problem. For supervised ML, you need a large, correctly labeled dataset. To
build a model that identifies cat photos, for example, you train the model on
many photos of cats labeled "cat" and many photos of things that aren't cats
labeled "not cat." If you don’t have enough photos or they're poorly labeled,
your model won't work well. In security, a well-known supervised ML use case is
signatureless malware detection. Many endpoint protection platform (EPP) vendors
use ML to label huge quantities of malicious samples and benign samples,
training a model on "what malware looks like." These models can correctly
identify evasive mutating malware and other trickery where a file is altered
enough to dodge a signature but remains malicious. ML doesn't match the
signature. It predicts malice using another feature set and can often catch
malware that signature-based methods miss. However, because ML models are
probabilistic, there's a trade-off. ML can catch malware that signatures miss,
but it may also miss malware that signatures catch.
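To make that probabilistic trade-off concrete, here is a minimal sketch, assuming a tabular set of static file features; the feature names and synthetic data are hypothetical and this is not any EPP vendor's actual pipeline. It trains a scikit-learn classifier and applies a decision threshold to its probability scores:

```python
# Minimal sketch (hypothetical features, synthetic data): training a probabilistic
# classifier on labeled samples and choosing a decision threshold.
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n = 2000
# Hypothetical static features extracted from files, e.g. [entropy, size_kb, import_count]
X = rng.normal(size=(n, 3))
y = (X[:, 0] + 0.5 * X[:, 2] + rng.normal(scale=0.5, size=n) > 0).astype(int)  # 1 = malicious

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=0)
clf = GradientBoostingClassifier().fit(X_train, y_train)

# The model outputs a probability of maliciousness, not a yes/no signature match.
scores = clf.predict_proba(X_test)[:, 1]
threshold = 0.7  # raising the threshold trades missed malware for fewer false positives
verdicts = scores >= threshold
print(f"flagged {verdicts.sum()} of {len(verdicts)} samples as malicious")
```

Lowering the threshold catches more malware but raises false positives; raising it does the opposite, which is exactly the trade-off against signature-based detection described above.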
6 Machine Learning Algorithms to Know About When Learning Data Science
Decision trees are models with a tree-like structure containing decisions and possible outcomes. They consist of a root node, which forms the start of the tree; decision nodes, which split the data based on a condition; and leaf nodes, which form the terminal points of the tree and hold the final outcome. Once a decision tree has been built, we can use it to predict values for new data. ... Random Forest is a supervised ensemble machine learning algorithm that aggregates the results from multiple decision trees and can be applied to both classification and regression problems. Combining the results from multiple decision trees is a simple idea that reduces the overfitting and underfitting seen with a single decision tree. To create a Random Forest, we first randomly select a subset of samples and features from the main dataset, a process known as "bootstrapping," and use that data to build a decision tree. Bootstrapping keeps the individual trees from being highly correlated and improves model performance.
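As an illustration of the idea, the following sketch contrasts a single decision tree with a bootstrapped random forest using scikit-learn on a toy dataset; the hyperparameters are illustrative rather than recommended:

```python
# Sketch: a single decision tree vs. a bootstrapped ensemble of trees (random forest)
# on a toy dataset. Hyperparameters here are illustrative only.
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import RandomForestClassifier

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

# One deep tree: prone to overfitting the training split.
tree = DecisionTreeClassifier(random_state=42).fit(X_train, y_train)

# Random forest: each tree sees a bootstrap sample of rows and a random
# subset of features at each split, which decorrelates the trees.
forest = RandomForestClassifier(
    n_estimators=100, bootstrap=True, max_features="sqrt", random_state=42
).fit(X_train, y_train)

print("single tree accuracy:", tree.score(X_test, y_test))
print("random forest accuracy:", forest.score(X_test, y_test))
```

Because each tree sees a different bootstrap sample and a random subset of features at each split, the trees' errors are less correlated, and aggregating their votes typically generalises better than any single tree.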
Data science isn’t particularly sexy, but it’s more important than ever
Not only is data cleansing an essential part of data science, it’s actually
where data scientists spend as much as 80% of their time. It has ever been thus.
As Mike Driscoll described in 2009, such “data munging” is a “painful process of
cleaning, parsing and proofing one’s data.” Super sexy! Now add to that drudgery the very real likelihood that many enterprises, as excited as they are to jump into data science, lack “a suitable infrastructure in place to start getting value out of AI,” as Jonny Brooks has articulated: The data scientist likely
came in to write smart machine learning algorithms to drive insight but can’t do
this because their first job is to sort out the data infrastructure and/or
create analytic reports. In contrast, the company only wanted a chart that they
could present in their board meeting each day. The company then gets frustrated
because they don’t see value being driven quickly enough and all of this leads
to the data scientist being unhappy in their role. As I have written before:
“Data scientists join a company to change the world through data, but quit when
they realize they’re merely taking out the data garbage.”
Top 7 Skills Required to Become a Data Scientist
A deep understanding of machine learning and artificial intelligence is a must for applying their tools and techniques, such as decision trees and other model types. These skills enable a data scientist to work on and solve complex problems, particularly those aimed at prediction or at setting future goals, and those who possess them stand out as proficient professionals. With machine learning and AI concepts, an individual can work with different algorithms and data-driven models while also handling large datasets, for example cleaning data by removing redundancies. ... Establishing your career as a data science professional also requires the ability to handle complexity. You must be able to identify and develop both creative and effective solutions as and when they are required. Finding those solutions can be challenging; it calls for clarity in data science concepts and for breaking problems down into multiple parts and arranging them in a structured way.
The Psychology Of Courage: 7 Traits Of Courageous Leaders
Like so many complex psychological human characteristics, courage can be
difficult to nail down. On the surface, courage seems like one of those “I know
it when I see it” concepts. In my twenty years spent facilitating and coaching
innovation, creativity, strategy and leadership programs, and in partnership
with Dr. Glenn Geher of the Psychology Department of the State University of New
York at New Paltz, I’ve identified behavioral attributes that often correlate
with a person’s access to their courage. Each attribute has influential effects
on organizational culture at all levels. Fostering these attributes in your own
life (at work and beyond) and within your team can help you lead toward the
courageous future you’re striving to achieve. ... Courage requires taking
intentional risks. And the bigger the risk, the more courage it takes (and the
bigger the outcome can be). Those who understand the importance of facing fear
and being vulnerable, who accept that falling and getting up again is part of
the journey, tend to have quicker access to their courage.
There is a path to replace TCP in the datacenter
"The problem with TCP is that it doesn't let us take advantage of the power of
datacenter networks, the kind that make it possible to send really short
messages back and forth between machines at these fine time scales," John
Ousterhout, Professor of Computer Science at Stanford, told The Register. "With
TCP you can't do that, the protocol was designed in so many ways that make it
hard to do that." It's not like the realization of TCP's limitations is anything
new. There has been progress in busting through some of the biggest problems, including congestion control to handle many machines sending to the same target at the same time and causing a backup through the network. But these
are incremental tweaks to something that is inherently not suitable, especially
for the largest datacenter applications (think Google and others). "Every design
decision in TCP is wrong for the datacenter and the problem is, there's no one
thing you can do to make it better, it has to change in almost every way,
including the API, the very interface people use to send and receive data. It
all has to change," he opined.
Typemock Simplifies .NET, C++ Unit Testing
When testing legacy code, you need to test small parts of the logic one by one,
such as the behavior of a single function, method or class. To do that the logic
must be isolated from the legacy code, he explained. As Jennifer Riggins
explained in a previous post, unit testing differs from integration testing, which focuses on the interaction between those units or components; unit testing catches errors at the unit level earlier, so the cost of fixing them is dramatically reduced. ... Typemock uses special code that can intercept the flow of the software: instead of calling the real code, whether it is a real method or a virtual method, it can intercept the call so you can fake different things in the code, he said. Typemock has been around since 2004, when Lopian launched the company with Roy Osherove, a well-known figure in test-driven development. They first released Typemock Isolator in 2006, a tool for unit testing SharePoint, WCF and other .NET projects. Isolator provides an API that helps users write simple and human-readable tests that are completely isolated from
the production code.
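Typemock's own API isn't reproduced here, but the isolation idea it describes -- intercepting a call into real code and faking the result so one small unit of logic can be tested alone -- can be sketched in Python with the standard library's unittest.mock; the functions below are hypothetical stand-ins for legacy code:

```python
# Not Typemock's API -- a Python illustration of the same isolation idea:
# intercept a call into "legacy" code and replace it with a fake, so one
# unit of logic can be tested on its own.
from unittest import TestCase, main
from unittest.mock import patch

# Hypothetical legacy dependency we don't want the unit test to touch.
def fetch_exchange_rate(currency: str) -> float:
    raise RuntimeError("talks to a real service; not for unit tests")

def convert(amount: float, currency: str) -> float:
    """The small unit of logic under test."""
    return round(amount * fetch_exchange_rate(currency), 2)

class ConvertTest(TestCase):
    def test_convert_uses_rate(self):
        # Intercept the real call and return a canned value instead.
        with patch(f"{__name__}.fetch_exchange_rate", return_value=1.25):
            self.assertEqual(convert(10.0, "EUR"), 12.50)

if __name__ == "__main__":
    main()
```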
Why Web 3.0 Will Change the Current State of the Attention Economy Drastically
The attention economy requires improvements, and Web 3.0 is capable of making
them happen. In the foreseeable future, it will drastically change the interplay
between consumers, advertisers and social media platforms. Web 3.0 will give
power to the people. It may sound pompous, but it's true. How is that possible?
Firstly, Web 3.0 will grant users ownership of their data, so you'll be able to
treat your data like it's your property. Secondly, it will enable you to be paid
for the work you are doing when making posts and giving likes on social media.
Both options provide you with the opportunity to monetize the attention that you
give and receive. The appealing thing about Web 3.0 is that it's all about
honest ownership. If a piece of art can be an NFT with easily traceable
ownership, your data can be too. If you own your data, you can monetize or offer
it on your terms, knowing who is going to use it and how. For instance, there is
Permission, a tokenized Web 3.0 advertising platform that connects brands with
consumers, with the latter getting crypto rewards for their data and
engagement.
Serverless-first: implementing serverless architecture from the transformation outset
While a serverless-first mindset provides a range of benefits, some businesses
may be hesitant to make the transition due to concerns around cloud provider
security, vendor lock-in, sunk costs from other strategies and ongoing issues
with debugging and development environments. However, even among the most
serverless-averse, this mindset can provide benefits to a select part of an organisation. Take, for example, a bank’s operations. While the maintenance of a traditional network infrastructure is crucial for the uptime of the underlying database, a serverless approach gives the bank the freedom to apply an agile mindset to consumer-facing apps and technologies as demand grows. Agile and
serverless strategies typically go hand-in-hand, and both can encourage quick
development, modification and adaptation. In relation to concerns around vendor
lock-in, some organisations may look towards a cloud-agnostic strategy. However,
writing software for multiple clouds removes the ability to use features offered
by one specific cloud, meaning any competitive advantage of using a specific
vendor is then lost.
CISO in the Age of Convergence: Protecting OT and IT Networks
Pan Kamal, head of products at BluBracket, a provider of code security
solutions, says one of the first steps an organization can take is to create an
IT-OT convergence task force that maps out the asset inventory and then
determines where IT security policy needs to be applied within the OT domain.
“Review industry-specific cybersecurity regulations and prioritize
implementation of mandatory security controls where called for,” Kamal adds. “I
also recommend investing in a converged dashboard -- either off the shelf or
custom-built -- that can identify vulnerabilities and threats and
prioritize risk by criticality.” Then, organizations must examine the network
architecture to see if secure connections with one-way communications -- via
data diodes for example -- can eliminate the possibility of an intruder coming
in from the corporate network and pivoting to the OT network. Another key element
is conducting a review of security policies related to both the equipment and
the software supply chain, which can help identify secrets in code present in
git repositories and help remediate them prior to the software ever being
deployed.
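BluBracket's product isn't shown here, but as a rough illustration of that last idea -- scanning code in git repositories for secrets before the software is ever deployed -- a regex-based sketch might look like the following; the patterns are illustrative and far from exhaustive:

```python
# Minimal illustration (not BluBracket's product): scanning files in a checked-out
# git repository for patterns that look like hard-coded secrets.
import re
from pathlib import Path

SECRET_PATTERNS = {
    "AWS access key": re.compile(r"AKIA[0-9A-Z]{16}"),
    "Generic API key": re.compile(r"(?i)(api[_-]?key|secret)\s*[:=]\s*['\"][^'\"]{16,}['\"]"),
    "Private key header": re.compile(r"-----BEGIN (?:RSA |EC )?PRIVATE KEY-----"),
}

def scan_repo(root: str) -> list[tuple[str, int, str]]:
    """Walk a checked-out repository and report lines matching secret patterns."""
    findings = []
    for path in Path(root).rglob("*"):
        if not path.is_file() or ".git" in path.parts:
            continue
        try:
            text = path.read_text(errors="ignore")
        except OSError:
            continue
        for lineno, line in enumerate(text.splitlines(), start=1):
            for name, pattern in SECRET_PATTERNS.items():
                if pattern.search(line):
                    findings.append((str(path), lineno, name))
    return findings

if __name__ == "__main__":
    for file, lineno, kind in scan_repo("."):
        print(f"{file}:{lineno}: possible {kind}")
```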
Quote for the day:
"Inspired leaders move a business beyond
problems into opportunities." -- Dr. Abraham Zaleznik