Daily Tech Digest by Kannan Subbiah: machine unlearning

Daily Tech Digest - July 02, 2026

Quote for the day:

"Winners are not afraid of losing. But losers are. Failure is part of the process of success. People who avoid failure also avoid success." -- Robert T. Kiyosaki

Shadow agents: How IT leaders must govern ‘headless’ AI before it breaks the enterprise

As businesses increasingly rely on autonomous artificial intelligence to handle complex tasks, technology leaders are facing a new security challenge. Invisible AI programs are operating in the background of enterprise networks, completing workflows without logging in or leaving standard audit trails. Driven by the high costs of cloud computing, organizations are shifting these automated tools to run locally on employee laptops. Because conventional security systems are designed to monitor human behavior, they cannot track these automated processes, leaving teams blind to what the software is accessing or deciding. To safely manage this shift, companies need to move away from traditional perimeter defenses and adopt strict containment strategies. By placing these programs in isolated environments, organizations can strictly control their permissions and limit their access to sensitive information. This transition also requires dedicated engineers focused on establishing behavioral rules, testing instructions, and securing data retrieval. Governing these automated systems at scale demands centralized oversight and clear policies. By establishing this accountability infrastructure now, technology leaders can confidently harness the power of autonomous software without compromising their security or losing visibility into their own networks.

The 20 Software Engineering Laws

The DZone article "The 20 Software Engineering Laws" by Dr. Milan Milanovic explores fundamental principles that dictate how software projects actually unfold, rather than how we hope they will. Instead of focusing on code syntax, these laws address the human, organizational, and structural realities that engineers face when working under pressure. The piece categorizes these principles into several practical themes, such as system building, speed, planning, and metrics. For instance, laws related to system building include Conway’s Law, which states that a system’s architecture inevitably mirrors a company's communication structure, and Gall’s Law, reminding us that successful complex systems must evolve from working simple ones. When exploring lost speed, the author highlights Brooks’s Law, explaining why adding more developers to a late project only delays it further. The article also tackles planning and metrics, citing Parkinson's Law, where work expands to fill available time, and Goodhart's Law, which warns that when a measure becomes a target, it stops being a good measure. By grounding these concepts in real-world examples like Instagram's pivot and Berlin's delayed airport, the article provides a practical framework to help engineers navigate common pitfalls with confidence and clarity.

Machine Unlearning with Minimal Gradient Dependence for High Unlearning Ratios

As machine learning systems process enormous volumes of information, the ability to make them forget specific private data is increasingly critical for security. A recent research paper introduces Mini-Unlearning, a method designed to tackle the difficulties of removing information when a large proportion of the original data must be forgotten. Traditional approaches to this problem usually require saving extensive records of past training updates, which demands heavy memory usage and becomes inefficient at scale. To resolve this, Mini-Unlearning operates on the mathematical insight that unlearned settings naturally correspond to retrained settings through a predictable geometric relationship. By taking advantage of this relationship, the new technique effectively calculates necessary adjustments using only a tiny subset of recent training updates. This approach completely bypasses the need for full historical records, greatly lowering the required computational power and memory. Testing shows that this lightweight method successfully deletes targeted personal information while maintaining overall system accuracy and effectively defending against targeted attempts to uncover hidden user data. Ultimately, this scalable solution allows organizations to reliably comply with strict privacy regulations without compromising the performance or efficiency of their broader systems.

Reliability Comes From the System, Not the Agent

When adopting artificial intelligence, many executives mistakenly judge an AI agent’s reliability in complete isolation. This perspective stems from traditional software development practices, where individual components are expected to function perfectly on their own. However, in complex or high-stakes environments—such as aviation or healthcare—reliability has never depended on the perfection of a single actor. Instead, it naturally emerges from a well-designed surrounding system that anticipates and catches inevitable human errors before they can escalate into a larger issue. The exact same principle applies directly to artificial intelligence agents. Rather than waiting around for a completely flawless model, organizations should focus their efforts on building robust workflows around these tools. A truly dependable system assumes occasional failures and uses practical safeguards like approval gates, continuous feedback loops, and risk-based reviews to ensure consistent outcomes. When an agent produces an error, it is not necessarily a sign that the technology is unready; rather, it highlights the pressing need for stronger operational structures. Ultimately, the competitive advantage in AI will not come from choosing the best model, but from designing resilient organizational workflows that gracefully handle imperfections and deliver predictable results over time.
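
The approval gates and risk-based reviews mentioned above could be sketched roughly as follows. This is a minimal illustration, not a design taken from the article: the `AgentAction` fields, the `gate` function, and every threshold value are hypothetical, and it assumes some upstream process already assigns each proposed action a risk score between 0 and 1.

```python
from dataclasses import dataclass
from enum import Enum


class Decision(Enum):
    AUTO_APPROVE = "auto_approve"
    HUMAN_REVIEW = "human_review"
    BLOCK = "block"


@dataclass(frozen=True)
class AgentAction:
    description: str
    risk: float        # hypothetical 0.0-1.0 score from an upstream risk assessment
    confidence: float  # hypothetical 0.0-1.0 confidence reported for the action


def gate(action: AgentAction,
         review_threshold: float = 0.3,
         block_threshold: float = 0.8,
         min_confidence: float = 0.7) -> Decision:
    """Route an agent action by risk instead of trusting the agent outright.

    All thresholds are illustrative assumptions, not recommended values.
    """
    if action.risk >= block_threshold:
        return Decision.BLOCK
    # Medium risk or low confidence goes to a person before anything executes.
    if action.risk >= review_threshold or action.confidence < min_confidence:
        return Decision.HUMAN_REVIEW
    return Decision.AUTO_APPROVE
```

The point of the sketch is that the surrounding workflow, not the agent, decides what runs unattended: a fallible agent can still be part of a dependable system if its riskier outputs are reviewed before they take effect.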

Detection engineering: A programmatic approach to identifying cyber threats

Detection engineering is rapidly becoming a key focus for cybersecurity teams as organizations look to defend against increasingly advanced digital threats. Instead of relying heavily on rigid, pre-built rules that often fail to catch modern attacks, detection engineering takes a highly tailored approach. It involves building customized systems designed to spot suspicious behaviors specific to an organization’s unique environment, effectively minimizing the flood of false alarms that commonly overwhelm security teams today. The growing interest in this practice is driven by the realization that traditional, signature-based security methods are no longer sufficient to stop modern tactics like fileless malware or complex attacks on cloud infrastructure. By carefully mapping out potential attack paths and analyzing real-world adversary behavior, companies can proactively spot threats rather than just reacting after a damaging incident has occurred. Recent surveys indicate that the vast majority of large enterprises are heavily investing in these active strategies, with many now establishing dedicated detection teams. Additionally, artificial intelligence and automation are playing crucial roles in helping these professionals fine-tune rules and process vast amounts of threat data. Ultimately, adopting detection engineering reduces the time attackers can hide within a network, greatly improving an organization's overall cyber resilience.

Compute Concentration: The Emerging Enterprise Risk Inside the AI Economy

As artificial intelligence transitions from testing to full-scale operations, a new, hidden challenge is emerging for modern businesses: compute concentration. This happens when companies quietly become overly reliant on a very small group of external providers for the core infrastructure needed to run their systems, such as cloud storage, data centers, and computer chips. Often, this dependency develops by accident. A company might start with one provider for ease of use and speed, eventually deeply intertwining all their critical functions within a single technology ecosystem. While working with large providers offers undeniable benefits like strong security and massive scale, heavy reliance creates significant vulnerabilities. If a primary provider experiences an outage, changes their pricing, or alters their policies, the affected business faces immediate disruptions, unexpected costs, and a loss of control over their own operations. It is not just about managing vendors; it is a fundamental issue of business continuity and strategic independence. True resilience does not mean avoiding large providers entirely, but rather fully understanding these deep dependencies. Organizations must ensure they have viable alternatives ready so they are not caught off guard if their primary technology foundation shifts.

Preventing agent-generated infrastructure bloat through spec-driven governance

Autonomous AI engineering agents can drastically improve software delivery speed, but they also risk creating massive infrastructure bloat if left unchecked. Because these agents often default to the inefficient patterns found in their training data, they frequently over-provision resources—such as requesting excessively large Kubernetes pods or pulling bloated container images. This inefficiency replicates rapidly across environments, wasting cloud space and increasing energy consumption. To prevent this, organizations must implement strict, spec-driven governance directly within their development pipelines. Instead of treating sustainability and efficiency as afterthoughts, engineering teams need to embed clear constraints into their infrastructure specifications. By defining rules for machine types, pod resource limits, and minimal base images before the agent generates any code, the agent is forced to execute within those boundaries. Organizations can enforce these constraints using static analysis tools and quality gates that block non-compliant deployments. Addressing this issue upstream ensures that agent-driven development yields efficient, cost-effective, and sustainable infrastructure by design, rather than creating a sprawling operational mess that becomes nearly impossible to fix later.
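
A quality gate of the kind described above might look something like the sketch below. It is an assumption-laden illustration rather than a specific tool: the `Spec` class, the `check_container` helper, the image name, and the limit values are all hypothetical. The container dictionary only loosely mirrors the shape of a Kubernetes container definition, and only the `m`, `Mi`, and `Gi` quantity suffixes are handled.

```python
from dataclasses import dataclass
from typing import FrozenSet, List


@dataclass(frozen=True)
class Spec:
    """Hypothetical constraints agreed on before the agent generates anything."""
    max_cpu_millicores: int
    max_memory_mib: int
    allowed_images: FrozenSet[str]


def parse_cpu(value: str) -> int:
    """Convert a CPU quantity such as "500m" or "2" to millicores."""
    if value.endswith("m"):
        return int(value[:-1])
    return int(float(value) * 1000)


def parse_memory(value: str) -> int:
    """Convert a memory quantity such as "512Mi" or "2Gi" to MiB."""
    if value.endswith("Gi"):
        return int(float(value[:-2]) * 1024)
    if value.endswith("Mi"):
        return int(float(value[:-2]))
    raise ValueError(f"unsupported memory quantity: {value}")


def check_container(container: dict, spec: Spec) -> List[str]:
    """Return a list of violations; an empty list means the gate passes."""
    violations: List[str] = []
    # Simplification: the full image reference must match an allowed entry exactly.
    if container.get("image", "") not in spec.allowed_images:
        violations.append("image not in allowed base images")
    limits = container.get("resources", {}).get("limits", {})
    if "cpu" not in limits or "memory" not in limits:
        violations.append("missing cpu or memory limit")
        return violations
    if parse_cpu(limits["cpu"]) > spec.max_cpu_millicores:
        violations.append("cpu limit exceeds spec")
    if parse_memory(limits["memory"]) > spec.max_memory_mib:
        violations.append("memory limit exceeds spec")
    return violations
```

In a pipeline, a non-empty result would fail the build, so an over-provisioned manifest is rejected before it is deployed rather than discovered later in a cloud bill.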

Agentic AI creates enterprise challenge beyond LLM boom

As businesses move beyond early experiments with artificial intelligence, they face a practical new challenge: managing and governing the automated software programs, or agents, that will soon work alongside human employees. While recent attention has focused on language models, the conversation is shifting toward the infrastructure needed to support these agents. Companies must figure out how to integrate them, control their access to company data, and manage the costs associated with running them. A primary issue is matching the right level of computing power to specific tasks to keep expenses predictable and responses consistent. Because current technology frameworks were built for human users, new standards are emerging to help these agents communicate securely with existing systems. Over time, managing the lifecycle of these digital assistants will become essential to prevent the lack of oversight that accompanied early cloud software adoption. As regulations develop unevenly across different regions, leaders are currently focused on learning how to build the right foundations. Soon, companies will shift from planning to execution, preparing for a future where each employee might collaborate with several automated assistants daily, requiring careful oversight and clear guidelines.

The rise of emotion as a trust signal

Digital identity systems are evolving beyond traditional passwords and basic biometrics by incorporating emotion as a new trust signal. Voice artificial intelligence is now being trained to analyze vocal cues—such as tone and pacing—to determine a speaker's underlying emotional state. By converting these real-time observations into structured data, companies hope to better understand customer intent, improve service routing, and identify potential signs of fraud or distress during live interactions. While this technology aims to close the gap between what people say and what they actually mean, it introduces significant privacy and ethical concerns. Inferring human emotion is inherently complex and can easily lead to bias or inaccurate risk profiling if used improperly. Consequently, industry experts caution that emotional data should merely provide helpful context rather than serve as definitive proof of identity or deception. As the market for this technology grows, organizations must implement it responsibly. This means ensuring clear user consent, strictly limiting data retention, and mandating human oversight so that unverified emotional inferences do not independently drive critical decisions regarding a person's access, credit, or employment.

The endpoint recovery gap many teams discover during an incident

Organizations often make a costly mistake by assuming that having data backups is the same as having a comprehensive recovery plan. According to Matthias Haas, CTO of IGEL, backups are essential for restoring information and applications, but they do not automatically grant users safe access back into their work environments. When a significant incident occurs and knocks thousands of devices offline, companies frequently realize they have planned for infrastructure recovery while completely ignoring endpoint recovery. This gap leads to enormous expenses tied to replacing hardware, reimaging devices, and coordinating manual repairs. A well-planned architecture must focus on restoring both the systems themselves and the trusted access to those systems. Rather than relying on technical heroics to fix thousands of individual devices during a crisis, businesses need pre-planned alternative paths, such as dual-boot options or secure browser resources. The true measure of resilience is not the number of threats a security team blocks, but the time it takes to safely restore trusted user access. By calculating the actual per-hour cost of interrupted workflows, security leaders can successfully justify investing in solid endpoint recovery before an incident even happens.

Daily Tech Digest - July 18, 2025

Quote for the day:

"It is during our darkest moments that we must focus to see the light." -- Aristotle Onassis

Machine unlearning gets a practical privacy upgrade

Machine unlearning, which refers to strategies for removing the influence of specific training data from a model, has emerged to fill the gap. But until now, most approaches have either been slow and costly or fast but lacking formal guarantees. A new framework called Efficient Unlearning with Privacy Guarantees (EUPG) tries to solve both problems at once. Developed by researchers at the Universitat Rovira i Virgili in Catalonia, EUPG offers a practical way to forget data in machine learning models with provable privacy protections and a lower computational cost. Rather than wait for a deletion request and then scramble to rework a model, EUPG starts by preparing the model for unlearning from the beginning. The idea is to first train on a version of the dataset that has been transformed using a formal privacy model, either k-anonymity or differential privacy. This “privacy-protected” model doesn’t memorize individual records, but still captures useful patterns. ... The researchers acknowledge that extending EUPG to large language models and other foundation models will require further work, especially given the scale of the data and the complexity of the architectures involved. They suggest that for such systems, it may be more practical to apply privacy models directly to the model parameters during training, rather than to the data beforehand.
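
To make the k-anonymity requirement concrete, the sketch below checks whether a dataset satisfies it. This is not EUPG itself, whose procedure the excerpt does not spell out. It only illustrates the standard definition: every combination of quasi-identifier values must be shared by at least k records. The function name and the field names in the usage example are hypothetical.

```python
from collections import Counter
from typing import Iterable, Mapping, Sequence


def is_k_anonymous(records: Iterable[Mapping[str, object]],
                   quasi_identifiers: Sequence[str],
                   k: int) -> bool:
    """Return True if every quasi-identifier combination occurs at least k times."""
    if k < 1:
        raise ValueError("k must be at least 1")
    # Count how many records share each combination of quasi-identifier values.
    groups = Counter(tuple(r[q] for q in quasi_identifiers) for r in records)
    # An empty dataset is treated as trivially k-anonymous.
    return all(count >= k for count in groups.values())
```

For example, four records split evenly across two generalized `(age_band, zip_prefix)` groups would pass with `k=2` and fail with `k=3`. A dataset that fails would need further generalization or suppression before being used for training.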

Emerging Cloaking-as-a-Service Offerings are Changing Phishing Landscape

Cloaking-as-a-service offerings – increasingly powered by AI – are “quietly reshaping how phishing and fraud infrastructure operates, even if it hasn’t yet hit mainstream headlines,” SlashNext’s Research Team wrote Thursday. “In recent years, threat actors have begun leveraging the same advanced traffic-filtering tools once used in shady online advertising, using artificial intelligence and clever scripting to hide their malicious payloads from security scanners and show them only to intended victims.” ... The newer cloaking services offer advanced detection evasion techniques, such as JavaScript fingerprinting, device and network profiling, machine learning analysis and dynamic content swapping, and put them into user-friendly platforms that hackers and anyone else can subscribe to, SlashNext researchers wrote. “Cybercriminals are effectively treating their web infrastructure with the same sophistication as their malware or phishing emails, investing in AI-driven traffic filtering to protect their scams,” they wrote. “It’s an arms race where cloaking services help attackers control who sees what online, masking malicious activity and tailoring content per visitor in real time. This increases the effectiveness of phishing sites, fraudulent downloads, affiliate fraud schemes and spam campaigns, which can stay live longer and snare more victims before being detected.”

You’re Not Imagining It: AI Is Already Taking Tech Jobs

It’s difficult to pinpoint the exact motivation behind job cuts at any given company. The overall economic environment could also be a factor, marked by uncertainties heightened by President Donald Trump’s erratic tariff plans. Many companies also became bloated during the pandemic, and recent layoffs could still be trying to correct for overhiring. According to one report released earlier this month by the executive coaching firm Challenger, Gray and Christmas, AI may be more of a scapegoat than a true culprit for layoffs: Of more than 286,000 planned layoffs this year, only 20,000 were related to automation, and of those, only 75 were explicitly attributed to artificial intelligence, the firm found. Plus, it’s challenging to measure productivity gains caused by AI, said Stanford’s Chen, because while not every employee may have AI tools officially at their disposal at work, they do have unauthorized consumer versions that they may be using for their jobs. While the technology is beginning to take a toll on developers in the tech industry, it’s actually “modestly” created more demand for engineers outside of tech, said Chen. That’s because other sectors, like manufacturing, finance, and healthcare, are adopting AI tools for the first time, so they are adding engineers to their ranks in larger numbers than before, according to her research.

The architecture of culture: People strategy in the hospitality industry

Rewards and recognitions are the visible tip of the iceberg, but culture sits below the surface. And if there’s one thing that I’ve learned over the years, it’s that culture only sticks when it’s felt, not just said. Not once a year, but every single day. Hilton’s consistent recognition as a Great Place to Work® globally and in India stems from our unwavering support and commitment to helping people thrive, both personally and professionally. ... What has sustained our culture through this growth is a focus on the everyday. It is not big initiatives alone that shape how people feel at work, but the smaller, consistent actions that build trust over time. Whether it is how a team huddle is run, how feedback is received, or how farewells are handled, we treat each moment as an opportunity to reinforce care and connection. ... Equally vital is cultivating culturally agile, people-first leaders. South Asia’s diversity, across language, faith, generation, and socio-economic background, demands leadership that is both empathetic and inclusive. We’re working to embed this cultural intelligence across the employee journey, from hiring and onboarding to ongoing development and performance conversations, so that every team member feels genuinely seen and supported.

Capturing carbon - Is DAC a perfect match for data centers?

The commercialization of DAC, however, faces several significant challenges. One primary obstacle is navigating different compliance requirements across jurisdictions. Certification standards vary significantly between regions like Canada, the UK, and Europe, necessitating differing approaches in each jurisdiction. However, Chadwick argues that these differences, while requiring adjustments, are not insurmountable and are merely part of the scaling process. Beyond regulatory and deployment concerns, achieving cost reductions is a significant challenge. DAC remains highly expensive, costing an average of $680 per ton to produce in 2024, according to Supercritical, a carbon removal marketplace. In comparison, biochar has an average price of $165 per ton, and enhanced rock weathering has an average price of $310 per ton. In addition, the complexity of DAC means up-front costs are much higher than those of alternative forms of carbon removal. An average DAC unit comprises air-intake manifolds, absorption and desorption towers, liquid-handling tanks, and bespoke site-specific engineering. DAC also requires significant amounts of power to operate. Recent studies have shown that the energy consumption of fans in DAC plants can range from 300 to 900 kWh per ton of CO2 captured, which represents between 20 and 40 percent of total DAC system energy usage.

Rethinking Risk: The Role of Selective Retrieval in Data Lake Strategies

Selective retrieval works because it bridges the gap between data engineering complexity and security usability. It gives teams options without asking them to reinvent the wheel. It also avoids the need to bring in external tools during a breach investigation, which can introduce latency, complexity, or worse, gaps in the chain of custody. What’s compelling about this approach is that it doesn’t require businesses to abandon existing tools or re-architect their infrastructure. ... This model is especially relevant for mid-size IT teams who want to cover their audit requirements, but don’t have a 24/7 security operations center. It’s also useful in regulated sectors such as healthcare, financial services, and manufacturing where data retention isn’t optional, but real-time analysis for everything isn’t practical. ... Data volumes are continuing to rise. As organizations face high costs and fatigue, those that thrive will be the ones that treat storage and retrieval as distinct functions. The ability to preserve signal without incurring ongoing noise costs will become a critical enabler for everything from insider threat detection to regulatory compliance. Selective retrieval isn’t just about saving money. It’s about regaining control over data sprawl, aligning IT resources with actual risk, and giving teams the tools they need to ask, and answer, better questions.

Manufactured Madness: How To Protect Yourself From Insane AIs

The core of the problem lies in a well-intentioned but flawed premise: that we can and should micromanage an AI’s output to prevent any undesirable outcomes. These “guardrails” are complex sets of rules and filters designed to stop the model from generating hateful, biased, dangerous, or factually incorrect information. In theory, this is a laudable goal. In practice, it has created a generation of AIs that prioritize avoiding offense over providing truth. ... Compounding the problem of forced outcomes is a crisis of quality. The data these models are trained on is becoming increasingly polluted. In the early days, models were trained on a vast, curated slice of the pre-AI internet. But now, as AI-generated content inundates every corner of the web, new models are being trained on the output of their predecessors. ... Given this landscape, the burden of intellectual safety now falls squarely on the user. We can no longer afford to treat AI-generated text with passive acceptance. We must become active, critical consumers of its output. Protecting yourself requires a new kind of digital literacy. First and foremost: Trust, but verify. Always. Never take a factual claim from an AI at face value. Whether it’s a historical date, a scientific fact, a legal citation, or a news summary, treat it as an unconfirmed rumor until you have checked it against a primary source.

6 Key Lessons for Businesses that Collect and Use Consumer Data

Ensure your privacy notice properly discloses consumer rights, including the right to access, correct, and delete personal data stored and collected by businesses, and the right to opt out of the sale of personal data and targeted advertising. Mechanisms for exercising those rights must work properly, with a process in place to ensure a timely response to consumer requests. ... Another issue that the Connecticut AG raised was that the privacy notice was “largely unreadable.” While privacy notices address legal rights and obligations, you should avoid using excessive legal jargon to the extent possible and use clear, simple language to notify consumers about their rights and the mechanisms for exercising those rights. In addition, be as succinct as possible to help consumers locate the information they need to understand and exercise applicable rights. ... The AG provided guidance that under the CTDPA, if a business uses cookie banners to permit a consumer to opt out of some data processing, such as targeted advertising, the consumer must be provided with a symmetrical choice. In other words, it has to be as clear and as easy for the consumer to opt out of such use of their personal data as it would be to opt in. This includes making the options to accept all cookies and to reject all cookies visible on the screen at the same time and in the same color, font, and size.

How agentic AI is reshaping execution across BFSI

Several BFSI firms are already deploying agentic models within targeted areas of their operations. The results are visible in micro-interventions that improve process flow and reduce manual load. Autonomous financial advisors, powered by agentic logic, are now capable of not just reacting to user input, but proactively monitoring markets, assessing customer portfolios, and recommending real-time changes. In parallel, agentic systems are transforming customer service by acting as intelligent finance assistants, guiding users through complex processes such as mortgage applications or claims filing. ... For agentic AI to succeed, it must be integrated into operational strategy. This begins by identifying workflows where progress depends on repetitive human actions that follow predictable logic. These are often approval chains, verifications, task handoffs, and follow-ups. Once identified, clear rules need to be defined. What conditions trigger an action? When is escalation required? What qualifies as a closed loop? The strength of an agentic system lies in its ability to act with precision, but that depends on well-designed logic and relevant signals. Data access is equally important. Agentic AI systems require context. That means drawing from activity history, behavioural cues, workflow states and timing patterns.

Open Source Is Too Important To Dilute

The unfortunate truth is that these criteria don’t apply in every use case. We’ve seen vendors build traction with a truly open project. Then, worried about monetization or competition, they relicense it under a “source-available” model with restrictions, like “no commercial use” or “only if you’re not a competitor.” But that’s not how open source works. Software today is deeply interconnected. Every project — no matter how small or isolated — relies on dependencies, which rely on other dependencies, all the way down the chain. A license that restricts one link in that chain can break the whole thing. ... Forks are how the OSS community defends itself. When HashiCorp relicensed Terraform under the Business Source License (BSL) — blocking competitors from building on the tooling — the community launched OpenTofu, a fork under an OSI-approved license, backed by major contributors and vendors. Redis’ transition away from Berkeley Software Distribution (BSD) to a proprietary license was a business decision. But it left a hole — and the community forked it. That fork became Valkey, a continuation of the project stewarded by the people and platforms who relied on it most. ... The open source brand took decades to build. It’s one of the most successful, trusted ideas in software history. But it’s only trustworthy because it means something.