The Architect’s Guide to Open Table Formats and Object Storage
Data lakehouse architectures are purposefully designed to leverage the
scalability and cost-effectiveness of object storage systems, such as Amazon
Web Services (AWS) S3, Google Cloud Storage and Azure Blob Storage. This
integration enables the seamless management of diverse data types — structured,
semi-structured and unstructured — within a unified platform. ... The open
table formats also incorporate features designed to boost performance. These
also need to be configured properly and leveraged for a fully optimized stack.
One such feature is efficient metadata handling, where metadata is managed
separately from the data, which enables faster query planning and execution.
Data partitioning organizes data into subsets, improving query performance by
reducing the amount of data scanned during operations. Support for schema
evolution allows table formats to adapt to changes in data structure without
extensive data rewrites, ensuring flexibility while minimizing processing
overhead.
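To make these features concrete, here is a minimal sketch assuming Apache Iceberg tables queried through Spark SQL; the catalog, table, and column names are illustrative rather than drawn from the article, and the Spark session is assumed to already be configured with the Iceberg runtime.

```python
# Minimal sketch: partitioning, schema evolution, and metadata-driven pruning
# with an open table format (Apache Iceberg on Spark, configuration omitted).
# The "lake" catalog and the events table are illustrative assumptions.
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("lakehouse-sketch").getOrCreate()

# Partitioning: rows are organized into daily subsets, so queries filtering
# on event_ts scan only the matching partitions instead of the whole table.
spark.sql("""
    CREATE TABLE IF NOT EXISTS lake.events (
        event_id BIGINT,
        event_ts TIMESTAMP,
        payload  STRING
    )
    USING iceberg
    PARTITIONED BY (days(event_ts))
""")

# Schema evolution: adding a column is a metadata-only change; existing data
# files are not rewritten.
spark.sql("ALTER TABLE lake.events ADD COLUMNS (source STRING)")

# Efficient metadata handling: the planner prunes partitions and data files
# using table metadata (manifests and statistics) before reading any data.
spark.sql("""
    SELECT count(*)
    FROM lake.events
    WHERE event_ts >= TIMESTAMP '2025-01-01 00:00:00'
""").show()
```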
The future of open source will be messy
First, it’s important to point out that open source software is both pervasive
and foundational. Where would we be without Linux and the vast treasure trove
of other open source projects on which the internet is built? However, the
vast majority of software, written for use or sale, is not open source. This
has always been true. Developers do care about open source, and for good
reason, but it is not their top concern. As Redis CEO Rowan Trollope told me
in a recent interview, “If you’re the average developer, what you really care
about is capability: Does this [software] offer something unique and
differentiated that’s awesome that I need in my application?” ... Meanwhile,
Meta and the rest of the industry keep releasing new code, calling it open
source or open weights (Sam Johnston offers a great analysis), without much
concern for what the OSI or anyone else thinks. Johnston may be exaggerating
when he says, “The more [the word] open appears in an artificial intelligence
product’s branding, the less open it actually tends to be,” but it’s clear
that the term open gets used a lot, starting with category leader OpenAI,
which is not open in any discernible sense, without much concern for any
traditional definitions.
What’s next for generative AI in 2025?
“Data is the lifeblood of any AI initiative, and the success of these projects
hinges on the quality of the data that feeds the models,” said Andrew Joiner,
CEO of Hyperscience, which develops AI-based office work automation tools.
“Alarmingly, three out of five decision makers report their lack of
understanding of their own data inhibits their ability to utilize genAI to its
maximum potential. The true potential…lies in adopting tailored SLMs, which
can transform document processing and enhance operational efficiency.” Gartner
recommends that organizations customize SLMs to specific needs for better
accuracy, robustness, and efficiency. “Task specialization improves alignment,
while embedding static organizational knowledge reduces costs. Dynamic
information can still be provided as needed, making this hybrid approach both
effective and efficient,” the research firm said. ... While Agentic AI
architectures are a top emerging technology, they’re still two years away from
reaching the lofty automation expected of them, according to Forrester. While
companies are eager to push genAI into complex tasks through AI agents, the
technology remains challenging to develop because it mostly relies on
synergies between multiple models, customization through retrieval augmented
generation (RAG), and specialized expertise.
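Because that point leans on retrieval augmented generation without showing what the pattern involves, a toy, library-free sketch may help; the keyword-overlap scoring and the stubbed model call are stand-ins for embeddings and a real SLM, and none of the names come from Gartner or Forrester.

```python
# Toy sketch of retrieval augmented generation (RAG): retrieve the documents
# most relevant to a question, then assemble them into a grounded prompt.
# The keyword-overlap score and generate() stub are illustrative stand-ins;
# a real system would use embeddings and an actual SLM/LLM call.
from collections import Counter

DOCUMENTS = [
    "Invoices must be approved by two managers before payment.",
    "Expense reports are processed within five business days.",
    "Contracts over $50,000 require legal review.",
]

def score(question: str, doc: str) -> int:
    """Very rough relevance score: count of shared lowercase words."""
    q_words = Counter(question.lower().split())
    d_words = Counter(doc.lower().split())
    return sum((q_words & d_words).values())

def retrieve(question: str, k: int = 2) -> list[str]:
    """Return the k documents that best match the question."""
    return sorted(DOCUMENTS, key=lambda d: score(question, d), reverse=True)[:k]

def generate(question: str) -> str:
    """Assemble a grounded prompt; the model call itself is stubbed out."""
    context = "\n".join(retrieve(question))
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"

print(generate("How long do expense reports take to process?"))
```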
The Perils of Security Debt: Serious Pitfalls to Avoid
Security debt is caused by a failure to “build security in” to software from
the design to deployment as part of the SDLC. Security debt accumulates when a
development organization releases software with known issues, deferring
remediation of its weaknesses and vulnerabilities. Sometimes the organization
skips certain test cases or scenarios in pursuit of faster deployment, failing
to test the software thoroughly in the process. Sometimes the business
decides that the pressure to finish a project is so great that it makes more
sense to release now and fix issues later. Later is better than never, but
when “later” never arrives, existing security debt becomes worse. ... Great
leadership is the beacon that not only charts the course but also ensures your
crew – your IT team, support staff, and engineers – are well-prepared to face
the challenges ahead. It instills discipline, vigilance, and a culture of
security that can withstand the fiercest digital storms. The Board and
leadership must understand and champion the importance of security for the
organization. By setting the tone at the top, they can drive the cultural and
procedural changes needed to prevent the accumulation of security debt.
Periodic review and monitoring of security metrics, along with identifying and
tracking security debt as a risk, can help keep the organization accountable
and on track.
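One way to make identifying and tracking security debt operational is a simple debt register that flags deferred fixes once they are overdue; the sketch below uses entirely illustrative ticket IDs, severities, and dates.

```python
# Minimal sketch of a security-debt register: each entry records a known
# issue that was knowingly shipped, when the deferral decision was made, and
# the date the fix was promised. All fields and values are illustrative.
from dataclasses import dataclass
from datetime import date

@dataclass
class DebtItem:
    identifier: str    # e.g. an internal ticket ID
    severity: str      # "low", "medium", "high" or "critical"
    deferred_on: date  # when the team decided to ship anyway
    target_fix: date   # the "later" the business committed to

def overdue(items: list[DebtItem], today: date) -> list[DebtItem]:
    """Return items whose promised fix date has already passed."""
    return [i for i in items if i.target_fix < today]

register = [
    DebtItem("SEC-101", "high", date(2024, 6, 1), date(2024, 9, 1)),
    DebtItem("SEC-142", "medium", date(2024, 11, 15), date(2025, 6, 30)),
]

for item in overdue(register, today=date(2025, 1, 1)):
    print(f"{item.identifier} ({item.severity}), deferred on "
          f"{item.deferred_on}, is past its target fix date")
```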
The long-term impacts of AI on networking
Every enterprise that self-hosted AI told me the mission demanded more
bandwidth to support “horizontal” traffic than their normal applications
required, more than their current data center was equipped to support. Ten of
the group said that this meant they’d need the “cluster” of AI servers to have
faster Ethernet connections and higher-capacity switches. Everyone agreed that
a real production deployment of on-premises AI would need new network devices,
and fifteen said they bought new switches even for their large-scale trials.
The biggest data center network problem I heard from those with experience was
that they believed they had built up more of an AI cluster than they needed.
Running a popular LLM, they said, requires hundreds of GPUs and servers, but
small language models can run on a single system, and a third of current
self-hosting enterprises said they believed it is best to start small, with
small models, and build up only once you have experience and can demonstrate a
need. This same group also pointed out that control was needed to ensure that
only truly useful AI applications were run. “Applications otherwise build up,
exceed, and then increase the size of the AI cluster,” said users.
Bridging Skill Gaps in the Automotive Industry with AI-Led Immersive Simulations
This personnel shortfall is particularly acute in sectors like
autonomous driving and AI-driven manufacturing, where the required skillset
surpasses the capabilities of the current workforce. This alarming shortage of
specialised expertise poses a serious threat to the industry’s progress. It
could potentially lead to production halts at various facilities, delay the
launch of next-generation vehicles, and hinder the transition to self-driving
cars powered by sustainable energy. To address this issue, conventional
educational methods must be modernised to incorporate cutting-edge
technologies like AI and robotics. ... Unlike traditional training, which
often involves static lessons or expensive hands-on practice, immersive
simulations allow workers to practice in environments that would be too risky
or costly in real life. For example, with autonomous vehicles, workers can
practice fixing and calibrating vehicle systems in a virtual world without the
risk of damaging anything. These simulations can also create different road
conditions for workers to experience, helping them build critical
decision-making skills without real-world consequences.
AI agents might be the new workforce, but they still need a manager
AI agents need to be thoughtfully managed, just as human workers do, and much
remains to be done before an agentic AI-driven workforce can truly assume a
broad range of tasks. "While the promise of agentic AI is
evident, we are several years away from widespread agentic AI adoption at the
enterprise level," said Scott Beechuk, partner with Norwest Venture Partners.
"Agents must be trustworthy given their potential role in automating
mission-critical business processes." The traceability of AI agents' actions
is one issue. "Many tools have a hard time explaining how they arrived at
their responses from users' sensitive data and models struggle to generalize
beyond what they have learned," said Ananthakrishnan. ... Unpredictability is
a related challenge, as LLMs "operate like black boxes," said Beechuk. "It's
hard for users and engineers to know if the AI has successfully completed its
task and if it did so correctly." ... Human workers are also capable of
collaborating easily and regularly. For AI workers, it's a different
story. "Because agents will interact with multiple systems and data stores,
achieving comprehensive visibility is no easy task," said Ananthakrishnan.
It's important to have visibility to capture each action an agent takes.
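To illustrate what capturing each agent action could look like, here is a minimal sketch of an audit-trail wrapper around agent tool calls; the tool, its arguments, and the in-memory log are hypothetical, and a production system would persist the records and redact sensitive data.

```python
# Sketch of an audit trail for agent actions: every tool call an agent makes
# is recorded with its inputs, output, and timestamp so humans can trace how
# the agent arrived at a result. The in-memory log and the example tool are
# illustrative assumptions, not a specific vendor's API.
import json
from datetime import datetime, timezone
from typing import Any, Callable

AUDIT_LOG: list[dict[str, Any]] = []

def audited(tool_name: str, tool: Callable[..., Any]) -> Callable[..., Any]:
    """Wrap a tool so each invocation is appended to the audit log."""
    def wrapper(*args: Any, **kwargs: Any) -> Any:
        result = tool(*args, **kwargs)
        AUDIT_LOG.append({
            "tool": tool_name,
            "args": args,
            "kwargs": kwargs,
            "result": result,
            "at": datetime.now(timezone.utc).isoformat(),
        })
        return result
    return wrapper

# A hypothetical tool the agent might call during a workflow.
lookup_order = audited(
    "lookup_order", lambda order_id: {"id": order_id, "status": "shipped"}
)

lookup_order("A-1234")
print(json.dumps(AUDIT_LOG, indent=2, default=str))
```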
Change management: Achieve your goals with the right change model
You need a good leadership team of influential people who are all pulling in
the same direction. This is the only way to implement upcoming changes and
anchor them in the company. It is important to include people in the
leadership team who have a great deal of influence and/or are well respected
by the workforce. At the same time, these people must be fully committed to
the planned change. ... Communication comes before implementation. Those
affected must understand the change in order to become participants or
supporters. Initiating
measures without first explaining the context to those involved would
unnecessarily create unrest in the company. When communicating, it makes sense
to proceed in several steps: the change team first informs the clients and
gets a “go” from them. After that, the change team informs the managers so
that they can answer questions from employees during company-wide
communication. ... Quick wins must be realized and made visible to increase
motivation. Quick wins should therefore also be identified when defining
objectives, because success is important to ensure that the initial motivation
does not fizzle out. Initial successes should be related to the overarching
goal, because then they strengthen intrinsic motivation. Small successes can
thus have a big impact.
Forrester on cybersecurity budgeting: 2025 will be the year of CISO fiscal accountability
Forrester sees the increasing adoption of AI and generative AI (gen AI) as
driving the needed updates to infrastructure. “Any Gen AI project that we
discussed with customers ultimately becomes a data integration project,” says
Pascal Matska, vice president and research director at Forrester. “You have to
invest into specific capabilities and platforms that run specific AI workloads
in the most suitable infrastructure at the right price point, and also drive
investments into cloud-native technologies such as Kubernetes and containers
and modern data platforms that really are there to help you drive out some of
the frictions that exist within the different business silos,” Matska
continued. ... CISOs who drive gains in revenue advance their careers. “When
something touches as much revenue as cybersecurity does, it is a core
competency. And you can’t argue that it isn’t,” Jeff Pollard, VP and principal
analyst at Forrester, said during his keynote titled “Cybersecurity Drives
Revenue: How to Win Every Budget Battle” at the company’s Security and Risk
Forum in 2022. Budgeting to protect revenue needs to start with the weakest,
most at-risk areas. These include software supply chain security, API
security, human risk management, and IoT/OT threat detection.
Passkey technology is elegant, but it’s most definitely not usable security
"The problem with passkeys is that they're essentially a halfway house to a
password manager, but tied to a specific platform in ways that aren't obvious
to a user at all, and liable to easily leave them unable to access ... their
accounts," wrote the Danish software engineer and programmer, who created Ruby
on Rails and is the CTO of web-based software development firm 37signals.
"Much the same way that two-factor authentication can do, but worse, since
you're not even aware of it." ... The security benefits of passkeys at the
moment are also undermined by an undeniable truth. Of the hundreds of sites
supporting passkeys, there isn't one I know of that allows users to ditch
their password completely. The password is still mandatory. And with the
exception of Google's Advanced Protection Program, I know of no sites that
won't allow logins to fall back on passwords, often without any additional
factor. ... Under the FIDO2 spec, the passkey can never leave the security
key, except as an encrypted blob of bits when the passkey is being synced from
one device to another. The secret key can be unlocked only when the user
authenticates to the physical key using a PIN, password, or most commonly a
fingerprint or face scan. In the event the user authenticates with a
biometric, the biometric data never leaves the security key, just as it never leaves Android
and iOS phones and computers running macOS or Windows.
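The fallback problem described above can be reduced to a one-line policy check: a passkey only removes the password attack surface if the relying party stops accepting passwords for that account. The sketch below is a toy model of that policy, with illustrative account fields rather than anything taken from the FIDO2 spec or a real site's implementation.

```python
# Toy model of the password-fallback gap: a relying party that still accepts
# a bare password after a passkey is registered keeps the old attack surface.
# Account fields and the policy flag are illustrative assumptions.
from dataclasses import dataclass

@dataclass
class Account:
    username: str
    has_passkey: bool
    allow_password_fallback: bool  # the setting most sites effectively leave True

def may_login_with_password(account: Account) -> bool:
    """Password login is only removed when a passkey exists AND fallback is off."""
    return not (account.has_passkey and not account.allow_password_fallback)

alice = Account("alice", has_passkey=True, allow_password_fallback=True)
bob = Account("bob", has_passkey=True, allow_password_fallback=False)

print(may_login_with_password(alice))  # True: the passkey's benefit is undermined
print(may_login_with_password(bob))    # False: the password is no longer accepted
```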
Quote for the day:
"You are a true success when you help
others be successful." -- Jon Gordon