The Critical Role of Data Cleaning
Data cleaning is a crucial step that eliminates irrelevant data, identifies
outliers and duplicates, and fixes missing values. It involves removing errors,
inconsistencies, and, sometimes, even biases from raw data to make it usable.
While buying pre-cleaned data can save resources, understanding the importance
of data cleaning is still essential. Inaccuracies can significantly impact
results. In many cases, until low-value data is removed, the rest is barely usable. Cleaning works as a filter, ensuring that the data passing through to the next step is more refined and relevant to your goals. ... At its
core, data cleaning is the backbone of robust and reliable AI applications. It
helps guard against inaccurate and biased data, ensuring AI models and their
findings are on point. Data scientists depend on data cleaning techniques to
transform raw data into a high-quality, trustworthy asset. ... Interestingly,
LLMs that have been properly trained on clean data can play a significant role
in the data cleaning process itself. Their advanced capabilities enable LLMs to
automate and enhance various data cleaning tasks, making the process more
efficient and effective.
What Is Paravirtualization?
Paravirtualization builds upon traditional virtualization by offering extra
services, improved capabilities or better performance to guest operating
systems. With traditional virtualization, organizations abstract the underlying
resources via virtual machines to the guest so they can run them as is, says
Greg Schulz, founder of the StorageIO Group, an IT industry analyst consultancy.
However, those virtual machines use all of the resources assigned to them,
meaning there is a great deal of idle time, even though it doesn’t appear so,
according to Kalvar. Paravirtualization uses software instructions to dynamically size and resize those resources, Kalvar says, turning VMs into bundles of resources. These bundles are managed by the hypervisor, the software component that runs multiple virtual machines on a single host. ... One of the biggest advantages of
paravirtualization is that it is typically more efficient than full
virtualization because the hypervisor can closely manage and optimize resources
between different operating systems. Users can manage the resources they consume
on a granular basis. “I’m not buying an hour of a server, I’m buying seconds of
resource time,” Kalvar says.
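Kalvar's "bundles of resources" idea can be sketched as a toy scheduler that resizes each VM's share to match measured demand, rather than letting a fixed allocation sit idle. The proportional policy and all numbers below are illustrative assumptions, not any real hypervisor's algorithm.

```python
# Toy model: treat each VM as a resizable bundle of resources instead
# of a fixed allocation. Demand figures are made up for illustration.
def resize(vms, total_cpus=16):
    """Give each VM CPUs proportional to its measured demand."""
    total_demand = sum(vm["demand"] for vm in vms.values())
    return {name: round(total_cpus * vm["demand"] / total_demand)
            for name, vm in vms.items()}

vms = {
    "web": {"demand": 6},    # busy front end
    "batch": {"demand": 2},  # mostly idle job runner
}
print(resize(vms))  # {'web': 12, 'batch': 4}
```

The point of the sketch is the contrast with full virtualization: nothing stays pinned to an idle VM, which is what makes "buying seconds of resource time" possible.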
Leaked Access Keys: The Silent Revolution in Cloud Security
The challenge with service accounts is that MFA does not work for them, and network-level protections (IP filtering, VPN tunneling, etc.) are not consistently applied, primarily due to complexity and cost. Thus, leaked service account keys often
enable hackers to access company resources. While phishing is unusual in the
context of service accounts, leakages are frequently the result of developers
posting them (unintentionally) online, often in combination with code fragments
that reveal the account to which they belong. ... Now, Google has changed the game
with its recent policy change. If an access key appears in a public GitHub
repository, GCP deactivates the key, even if applications crash as a result.
Google's announcement marks a shift in the risk and priority tango. Gone are the
days when patching vulnerabilities could take days or weeks. Welcome to the
fast-paced cloud era. Zero-second attacks after credential leakages demand
zero-second fixing. Preventing an external attack becomes more important than
avoiding crashing customer applications – that is at least Google's
opinion.
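Many of these leaks can be caught before code reaches a public repository with a simple secret scan. The sketch below is a minimal illustration; the two regex patterns are assumptions standing in for the much larger rule sets of real scanners such as gitleaks or truffleHog.

```python
import re

# Illustrative patterns only; real secret scanners ship far larger
# and more precise rule sets.
PATTERNS = {
    "aws_access_key": re.compile(r"AKIA[0-9A-Z]{16}"),
    "pem_private_key": re.compile(r"-----BEGIN PRIVATE KEY-----"),
}

def find_leaks(text: str) -> list[str]:
    """Return the names of credential patterns found in a code fragment."""
    return [name for name, rx in PATTERNS.items() if rx.search(text)]

snippet = 'aws_key = "AKIAABCDEFGHIJKLMNOP"'
print(find_leaks(snippet))  # ['aws_access_key']
```

Run as a pre-commit hook, a check like this shifts detection to zero seconds before the leak instead of after it, which is exactly the time scale the article says attacks now operate on.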
Juniper advances AI networking software with congestion control, load balancing
On the load balancing front, Juniper has added support for dynamic load
balancing (DLB) that selects the optimal network path and delivers lower
latency, better network utilization, and faster job completion times. From the
AI workload perspective, this results in better AI workload performance and
higher utilization of expensive GPUs, according to Sanyal. “Compared to
traditional static load balancing, DLB significantly enhances fabric bandwidth
utilization. But one of DLB’s limitations is that it only tracks the quality of
local links instead of understanding the whole path quality from ingress to
egress node,” Sanyal wrote. “Let’s say we have CLOS topology and server 1 and
server 2 are both trying to send data called flow-1 and flow-2, respectively. In
the case of DLB, leaf-1 only knows the local links utilization and makes
decisions based solely on the local switch quality table where local links may
be in perfect state. But if you use GLB, you can understand the whole path
quality where congestion issues are present within the spine-leaf level.”
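Sanyal's leaf-spine example can be reduced to a toy model: DLB ranks only the ingress leaf's local uplinks, while GLB scores the whole path and therefore sees congestion at the spine-to-egress hop. All switch names and quality values below are illustrative assumptions.

```python
# Toy two-tier CLOS: leaf-1 reaches the egress leaf via spine-1 or
# spine-2. "Quality" here is just free bandwidth (higher is better).
local_quality = {"spine-1": 90, "spine-2": 85}       # leaf-1's uplinks
downstream_quality = {"spine-1": 10, "spine-2": 80}  # spine -> egress leaf

def pick_dlb(local):
    # DLB: consult only the local switch's link-quality table.
    return max(local, key=local.get)

def pick_glb(local, downstream):
    # GLB: score the whole path, so the congested downstream hop
    # (the bottleneck link) is visible to the decision.
    return max(local, key=lambda s: min(local[s], downstream[s]))

print(pick_dlb(local_quality))                      # spine-1
print(pick_glb(local_quality, downstream_quality))  # spine-2
```

In the toy numbers, DLB picks spine-1 because its local link looks best, even though the spine-1-to-egress link is badly congested; GLB routes around it, which is the limitation Sanyal describes.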
Impact of AI Platforms on Enhancing Cloud Services and Customer Experience
AI platforms enable businesses to streamline operations and reduce costs by
automating routine tasks and optimizing resource allocation. Predictive
analytics, powered by AI, allows for proactive maintenance and issue
resolution, minimizing downtime and ensuring continuous service availability.
This is particularly beneficial for industries where uninterrupted access to
cloud services is critical, such as finance, healthcare, and e-commerce. ...
AI platforms are not only enhancing backend operations but are also
revolutionizing customer interactions. AI-driven customer service tools, such
as chatbots and virtual assistants, provide instant support, personalized
recommendations, and seamless user experiences. These tools can handle a wide
range of customer queries, from basic information requests to complex
problem-solving, thereby improving customer satisfaction and loyalty. The
efficiency and round-the-clock availability of AI-driven tools make them
invaluable for businesses. By the year 2025, it is expected that AI will
facilitate around 95% of customer interactions, demonstrating its growing
influence and effectiveness.
2 Essential Strategies for CDOs to Balance Visible and Invisible Data Work Under Pressure
Short-termism under pressure is a common mistake, resulting in an unbalanced
strategy. How can we, as data leaders, successfully navigate such a scenario?
“Working under pressure and with limited trust from senior management can
force first-time CDOs to commit to an unbalanced strategy, focusing on
short-term, highly visible projects – and ignoring the essential foundation.”
... The desire to invest in enabling topics stems from the balance between
driving and constraining forces. Senior management tends to ignore
enabling topics because they rarely directly contribute to the bottom line;
they can be a black box to a non-technical person and require multiple teams
to collaborate effectively. On the other hand, Anne knew that the same people
eagerly anticipated the impact of advanced analytics such as GenAI and were
worried about potential regulatory risks. With the knowledge of the key
enabling work packages and the motivating forces at play, Anne has everything
she needs to argue for and execute a balanced long-term data strategy that
does not ignore the “invisible” work required.
Gen AI Spending Slows as Businesses Exercise Caution
Generative AI has advanced rapidly over the past year, and organizations are
recognizing its potential across business functions. But businesses have now
taken a cautious stance regarding gen AI adoption due to steep implementation
costs and concerns related to hallucinations. ... This trend reflects a
broader shift away from the AI hype, and while businesses acknowledge the
potential of this technology, they are also wary of the associated risks and
costs, according to Michael Sinoway, CEO, Lucidworks. "The flattened spending
suggests a move toward more thoughtful planning. This approach ensures AI
adoption delivers real value, balancing competitiveness with cost management
and risk mitigation," he said. ... Concerns regarding implementation costs,
accuracy and data security have increased considerably in 2024. The number of business leaders concerned about implementation costs has increased 14-fold, and the number concerned about response accuracy has grown fivefold. While concerns about data security have increased only threefold, data security remains the biggest worry.
CIOs are stretched more than ever before — and that’s a good thing
“Many CIOs have built years of credibility and trust by blocking and tackling
the traditional responsibilities of the role,” she adds. “They’re now being
brought to the conversation as business leaders to help the organization think
through transformational priorities because they’re functional experts like
any other executive in the C-suite.” ... “Boards want technology to improve
the top and bottom line, which can be a tough balance, even if it’s one that
CIOs are getting used to managing,” says Nash Squared’s White. “On the one
hand, they’re being asked to promote innovation and help generate revenue, and
on the other, they’re often charged with governance and security, too.” The
importance of technology will only continue to grow.
Gen AI, for example, will make it possible to boost productivity while
reducing costs. CyberArk’s Grossman expects the central role of digital
leaders in exploiting these emerging technologies will mean high-level CIOs
will be even more important in the future.
What Is a Sovereign Cloud and Who Truly Benefits From It?
A sovereign cloud is a cloud computing environment designed to help
organizations comply with regulatory rules established by a particular
government. This often entails ensuring that data stored within the cloud
environment remains within a specific country. But it can also involve other
practices, as we explain below. ... One drawback is cost. In general, cloud
computing services on a sovereign cloud cost more than their equivalents on a
generic public cloud. The exact pricing can vary widely depending on a number
of factors, such as which cloud regions you select and which types of services
you use, but in general, expect to pay a premium of at least 15% to use a
sovereign cloud. A second challenge is that in some cases your organization must undergo a vetting process, because some sovereign cloud providers make their solutions available only to certain types of organizations — often government agencies or contractors that do business with them. This means you can't just create a sovereign cloud account
and start launching workloads in a matter of minutes, as you could in a
generic public cloud.
Securing datacenters may soon need sniffer dogs
So says Len Noe, tech evangelist at identity management vendor CyberArk. Noe
told The Register he has ten implants – passive devices that are observable
with a full body X-ray, but invisible to most security scanners. Noe explained
he's acquired swipe cards used to access controlled premises, cloned them in
his implants, and successfully entered buildings by just waving his hands over
card readers. ... Noe thinks hounds are therefore currently the only reliable
means of finding humans with implants that could be used to clone ID cards. He
thinks dogs should be considered because attackers who access datacenters
using implants would probably walk away scot-free. Noe told The Register that
datacenter staff would probably notice an implant-packing attacker before they
access sensitive areas, but would then struggle to find grounds for
prosecution because implants aren't easily detectable – and even if they were, the information they contain is considered medical data and is therefore
subject to privacy laws in many jurisdictions.
Quote for the day:
"Leadership is liberating people to do
what is required of them in the most effective and humane way possible." --
Max DePree