How LLMs made their way into the modern data stack in 2023

Beyond helping teams generate insights and answers from their data through text
inputs, LLMs are also handling traditionally manual data management and the data
efforts crucial to building a robust AI product. In May, Intelligent Data
Management Cloud (IDMC) provider Informatica debuted Claire GPT, a
multi-LLM-based conversational AI tool that allows users to discover, interact
with and manage their IDMC data assets with natural language inputs. It handles
multiple jobs within the IDMC platform, including data discovery, data pipeline
creation and editing, metadata exploration, data quality and relationships
exploration, and data quality rule generation. Then, to help teams build AI
offerings, California-based Refuel AI provides a purpose-built large language
model that helps with data labeling and enrichment tasks. A paper published in
October 2023 also shows that LLMs can do a good job of removing noise from
datasets, another crucial step in building robust AI. Other areas in data
engineering where LLMs can come into play are data integration and
orchestration.
Corporate governance in 2023: a year in review
2023 has seen a continuing trend of more responsibilities for directors. Often,
this responsibility comes from regulators; sometimes, it comes from investors or
other stakeholders. One thing is certain, though: directors are rapidly losing
any remaining wiggle room to be “rubber-stamp” individuals. Modern board roles
carry serious accountability; many directors are starting to appreciate that and
adhere to new standards. The trouble is that the new standards sometimes
overstretch directors, so much so that we now have concerns about overboarding,
exhaustion, and undue stress. How will that play out if the trend of more
responsibility continues? ... The board dismissed the evidently popular CEO Sam
Altman in a decision made behind closed doors with utmost secrecy. And as the
world’s attention predictably turned their way, they could give no answers.
Soon, Altman was rehired after around 70% of the company’s staff threatened to
resign and join Microsoft (a significant OpenAI investor). The board
subsequently agreed to undergo a major reshuffle for more accountability and
transparent decision-making.
Quantum Computing’s Hard, Cold Reality Check

The problem isn’t just one of timescales. In May, Matthias Troyer, a technical
fellow at Microsoft who leads the company’s quantum computing efforts,
co-authored a paper in Communications of the ACM suggesting that the number of
applications where quantum computers could provide a meaningful advantage was
more limited than some might have you believe. “We found out over the last 10
years that many things that people have proposed don’t work,” he says. “And then
we found some very simple reasons for that.” The main promise of quantum
computing is the ability to solve problems far faster than classical computers,
but exactly how much faster varies. There are two applications where quantum
algorithms appear to provide an exponential speedup, says Troyer. One is
factoring large numbers, which could make it possible to break the public key
encryption the internet is built on. The other is simulating quantum systems,
which could have applications in chemistry and materials science. Quantum
algorithms have been proposed for a range of other problems including
optimization, drug design, and fluid dynamics. 
Navigating the Data Landscape: The Crucial Role of Data Governance in Today’s Business Environment
Data quality management has become increasingly important as the volume of data
grows exponentially day by day. Organizations can protect their data with
policies and procedures, ensure that they follow all the rules and regulations,
and hire people who understand the data being collected and what it means to
the company, but if that data isn’t high quality, the organization may still
get the short end of the stick. Maybe you’re three weeks late for a TikTok
trend, or you miss out on a whole subset of customers because of a misstep in
your collection methods. Either way, the lost profit and the lost chance to
build on that data in the future could be pivotal. Ensuring that your
organization has processes to monitor and improve data quality on a continuous
basis will save time and money in the long run.
Despite its importance, implementing effective data governance comes with
challenges. Organizations often face resistance to change, cultural barriers,
and the complexity of managing diverse data sources.
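
As a rough illustration of what continuous data quality monitoring could look like in practice, the sketch below computes two simple metrics for a batch of records: the share of missing values per required field and the number of duplicate keys. Everything in it is an assumption made for illustration, including the function names `quality_report` and `failed_checks`, the sample records, and the 5% threshold; none of it comes from the article.

```python
def quality_report(rows, required_fields, key_field):
    """Return simple completeness and duplicate metrics for a batch of rows."""
    total = len(rows)
    missing = {field: 0 for field in required_fields}
    seen, duplicates = set(), 0
    for row in rows:
        for field in required_fields:
            # Treat absent keys, None, and empty strings as missing values.
            if row.get(field) in (None, ""):
                missing[field] += 1
        key = row.get(key_field)
        if key in seen:
            duplicates += 1
        seen.add(key)
    return {
        "rows": total,
        "missing_rate": {
            field: (count / total if total else 0.0)
            for field, count in missing.items()
        },
        "duplicate_keys": duplicates,
    }


def failed_checks(report, max_missing_rate=0.05):
    """List readable problems; the 5% default is an arbitrary example value."""
    problems = []
    for field, rate in report["missing_rate"].items():
        if rate > max_missing_rate:
            problems.append(f"{field}: {rate:.0%} of values are missing")
    if report["duplicate_keys"]:
        problems.append(f"{report['duplicate_keys']} duplicate key(s) found")
    return problems


# Hypothetical sample batch: two missing emails and one repeated id.
batch = [
    {"id": 1, "email": "a@example.com"},
    {"id": 2, "email": ""},
    {"id": 2, "email": "b@example.com"},
    {"id": 3, "email": None},
]
report = quality_report(batch, required_fields=["email"], key_field="id")
problems = failed_checks(report)
```

Running a check like this on every incoming batch, and alerting when `problems` is not empty, is one simple way to make monitoring continuous rather than a one-off audit.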
Choosing Between Message Queues and Event Streams

There are numerous distinctions between technologies that allow you to
  implement event streaming and those that you can use for message queueing. To
  highlight them, I will compare Apache Kafka and RabbitMQ. I’ve chosen Kafka
  and RabbitMQ specifically because they are popular, widely used solutions
  providing rich capabilities that have been extensively battle-tested in
  production environments. ... Message queueing and event streaming can both be
  used in scenarios requiring decoupled, asynchronous communication between
  different parts of a system. For instance, in microservices architectures,
  both can power low-latency messaging between various components. However,
  going beyond messaging, event streaming and message queueing have distinct
  strengths and are best suited to different use cases. ... Message queueing is
  a good choice for many messaging use cases. It’s also an appealing proposition
  if you’re early in your event-driven journey; that’s because message queueing
  technologies are generally easier to deploy and manage than event streaming
  solutions. 
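
To make the distinction more concrete, here is a deliberately simplified in-memory sketch of the two models. It does not use the Kafka or RabbitMQ client APIs; the `MessageQueue` and `EventLog` classes and the consumer names are hypothetical, and the sketch assumes the commonly described behavior that a queued message is removed once a consumer takes it, while a stream keeps events in an append-only log that each consumer reads at its own offset.

```python
from collections import deque


class MessageQueue:
    """Toy queue: each message goes to one consumer and is then gone."""

    def __init__(self):
        self._messages = deque()

    def publish(self, message):
        self._messages.append(message)

    def consume(self):
        # popleft removes the message, so no other consumer will see it.
        return self._messages.popleft() if self._messages else None


class EventLog:
    """Toy event stream: an append-only log with per-consumer offsets."""

    def __init__(self):
        self._events = []
        self._offsets = {}

    def append(self, event):
        self._events.append(event)

    def read(self, consumer):
        # Each consumer advances its own offset; events stay in the log.
        offset = self._offsets.get(consumer, 0)
        if offset >= len(self._events):
            return None
        self._offsets[consumer] = offset + 1
        return self._events[offset]

    def rewind(self, consumer, offset=0):
        # Replaying history is possible because nothing was deleted.
        self._offsets[consumer] = offset


queue = MessageQueue()
queue.publish("order-1")
first_worker = queue.consume()   # receives the message
second_worker = queue.consume()  # nothing left to receive

log = EventLog()
log.append("order-1")
billing_view = log.read("billing")      # reads the event
analytics_view = log.read("analytics")  # reads the same event independently
```

In this sketch, the second worker gets nothing from the queue because the first one already took the message, whereas both log readers see the same event and can call `rewind` to replay it later.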
5G and edge computing: What they are and why you should care

Instead of relying solely on large, high-powered cell towers (as 4G does), 5G
will run off both those towers and a ton of small cell sites that can be
clustered together. This is how 5G achieves its population density. 5G is also
supposed to be more energy efficient. As such, the communications component of
IoT devices won't drain as much power, resulting in longer battery life for
connected devices. There's also a ton of AI and machine learning in 5G
implementations. 5G nodes and interface devices are deployed on the edge, away from
central hubs. They utilize AI and machine learning to analyze communications
performance, and use AI to bandwidth-shape communications, to wring as much
performance out of the hardware as possible. You're familiar with the term
"cloud computing." We've all used cloud services, services that run on a server
someplace rather than on our desktop computers or mobile devices. The cloud, of
course, isn't really a cloud. Amazon, Google, Facebook, Microsoft, and others
operate massive data centers packed with thousands upon thousands of servers.
Soft and fluffy, the cloud is not.
Stolen Booking.com Credentials Fuel Social Engineering Scams

Social engineering expert Sharon Conheady said this type of trickery remains
  extremely difficult to repel, because of the customer-first nature of
  hospitality. Many public-facing people in such organizations, such as
  receptionists, are "trained to help people - that's their job," and of course
  they're going to bend over backwards to try to meet apparent customers'
  demands, Conheady said in an interview at this month's Black Hat Europe
  conference in London. Help desks remain another frequent target. "I had a
  client lately who asked me to call the help desk and obtain BitLocker keys,"
  she said, referring to a recent penetration test. "Every single one of the
  help desk agents gave us the BitLocker key." That prompted her to ask: Do
  these personnel even know what a BitLocker key is, and why they shouldn't
  share it? The client said they didn't know. While training people in
  customer-facing roles can help, Conheady said the only truly effective
  approach would be to put in place strong technical controls to outright
  prevent and block such attacks.
Significantly Improving Security Posture: A CMMI Case Study
“Phoenix Defense has led the way in adopting CMMI Security best practices for
  nearly two decades, and now included the Security best practices,” says Kris
  Puthucode, Certified CMMI High Maturity Lead Appraiser at Software Quality
  Center LLC. “This adoption has yielded quantifiable benefits, enhancing
  security posture across Mission, Personnel, Physical, Process, and
  Cybersecurity domains. Additionally, incorporating Virtual work best practices
  has standardized virtual meetings and events, boosting efficiency.” Phoenix
  Defense has been a CMMI Performance Solutions Organization since 2005, first
  achieving Maturity Level 5 in 2020. ... Before adopting CMMI Security and
  Managing Security Threats and Vulnerabilities Practice Areas in the model,
  Phoenix Defense had a closed network with no outward-facing applications and
  relied on a third-party vendor to monitor threats and spam. They did not
  fully, quantitatively track attacks against the networks or other data flows,
  and they required a more robust approach to properly ensure network
  security.
5 common data security pitfalls — and how to avoid them

While regulations like GDPR and SOX set standards for data security, they are
  merely starting points and should be considered table stakes for protecting
  data. Compliance should not be mistaken for complete data security, as robust
  security involves going beyond compliance checks. The fact is that many large
  data breaches have occurred in organizations that were fully compliant on
  paper. Moving beyond compliance requires actively identifying and mitigating
  risks rather than just ticking boxes during audits. ... Data is one of the
  most valuable assets for any organization. And yet, the question, “Who owns
  the data?” often leads to ambiguity within organizations. Clear delineation of
  data ownership and responsibility is crucial for effective data governance.
  Each team or employee must understand their role in protecting data to create
  a culture of security. ... Unpatched vulnerabilities are one of the easiest
  targets for cyber criminals. This means that organizations face significant
  risks when they can’t address public vulnerabilities quickly. Despite the
  availability of patches, many enterprises delay deployment for various
  reasons, which leaves sensitive data vulnerable.
Outmaneuvering AI: Cultivating Skills That Make Algorithms Scratch Their Head

Reasoning, the intellectual ninja of skills, is all about slicing through
  misinformation, assumptions, and biases to get to the heart of the matter.
  It’s not just drawing conclusions, but thinking about how we do that. This
  skill is the brain’s bouncer, keeping cognitive fallacies and hasty
  generalizations at bay. We humans, bless our hearts, are prone to jumping on
  the bandwagon or seeing patterns where there are none (like seeing a face on
  Mars or believing in hot streaks at Vegas). These mental shortcuts, or
  heuristics, can lead us astray, making reasoning not just useful but
  essential. AI is trained on our past reasoning reflected in old works. But it
  can’t reason on its own — at least not yet. Consider a business deciding
  whether to invest in a new technology. Without proper reasoning, they might
  follow the hype (everyone else is doing it!) or rely on gut feelings (it just
  feels right!). But with reasoning, they dissect the decision, weigh the
  evidence, consider alternatives, and make a choice that’s not just good on
  paper, but good in reality.
Quote for the day:
"Whether you think you can or you think you can’t, you’re right." -- Henry Ford
 
 