Daily Tech Digest - July 26, 2025


Quote for the day:

"Small daily improvements over time lead to stunning results." -- Robin Sharma


Data Engineering in the Age of AI: Skills To Master Now

Streaming requires a new mindset. You must reason about event time versus processing time, manage watermarking and windowing, and guarantee exactly-once semantics even when things change midstream. These design patterns must be built into your pipelines from the beginning. ... Agentic AI stretches the typical data engineer’s streaming data skill set because it is no longer about a single model running in isolation. Today, we see networks of perception agents, reasoning agents and execution agents working together, each handling tasks and passing insights to the next in real time. If you know only how to schedule batch ETL jobs or deploy an inference server, you’re missing a core skill: how to build high-throughput, low-latency pipelines that keep these agents reliable and responsive in production. ... A single slow or broken stream can cause cascading failures in multiagent systems. Use schema registries, enforce data contracts and apply exactly-once semantics to maintain trust in your streaming infrastructure. ... Communication presents another challenge. Data scientists often discuss “precision” as a metric that data engineers must translate into reality. Implement evaluation scores like factual consistency checks, entity precision comparisons and human-in-the-loop review pipelines.
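
To make the event-time versus processing-time distinction concrete, here is a minimal, framework-agnostic Python sketch of tumbling event-time windows closed by a watermark. Every name in it (Event, TumblingEventTimeWindower, max_out_of_orderness_s) is illustrative rather than taken from any particular streaming engine; systems such as Flink, Beam or Kafka Streams provide this machinery, plus exactly-once sinks and late-data handling, out of the box.

    from collections import defaultdict
    from dataclasses import dataclass

    @dataclass
    class Event:
        key: str
        value: float
        event_time: float  # when the event actually happened, in epoch seconds

    class TumblingEventTimeWindower:
        """Toy event-time windowing driven by a watermark (illustrative only)."""

        def __init__(self, window_size_s: int = 60, max_out_of_orderness_s: int = 10):
            self.window_size_s = window_size_s
            self.max_out_of_orderness_s = max_out_of_orderness_s
            self.watermark = float("-inf")
            self.open_windows = defaultdict(list)  # (key, window_start) -> values

        def on_event(self, ev: Event) -> list:
            # The watermark trails the largest event time seen by the allowed lateness.
            self.watermark = max(self.watermark, ev.event_time - self.max_out_of_orderness_s)
            window_start = int(ev.event_time // self.window_size_s) * self.window_size_s
            if window_start + self.window_size_s <= self.watermark:
                # Too late: the window is already closed. In practice, route to a side output.
                return self._close_expired_windows()
            self.open_windows[(ev.key, window_start)].append(ev.value)
            return self._close_expired_windows()

        def _close_expired_windows(self) -> list:
            # A window is finalized only once the watermark has passed its end.
            closed = []
            for (key, start) in list(self.open_windows):
                if start + self.window_size_s <= self.watermark:
                    values = self.open_windows.pop((key, start))
                    closed.append((key, start, sum(values)))
            return closed

The point of the sketch is the reasoning, not the code: results are keyed by when events happened, not by when the machine processed them, and a window only emits once the watermark says no more sufficiently-early events are expected.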


Anthropic unveils ‘auditing agents’ to test for AI misalignment

Anthropic said the first environment it developed tests an agent’s ability to complete an alignment audit for an intentionally misaligned model. For this test, the researchers brought the investigator agent to the environment and equipped it with tools similar to those human teams use. The researchers instructed the agent to identify the root cause of the misalignment and determine the specific behaviors that were misaligned. According to the researchers, the investigator agent successfully found the root cause of the issues 10-13% of the time. Still, this performance improved to 42% “when using a super-agent approach that aggregates findings across multiple investigations.” ... The next step in most alignment audits is typically a behavioral evaluation, which determines whether the model consistently misbehaves in various settings. To test its evaluation agent, the researchers designed 10 models “given system prompt instructions to exhibit a specific behavior, such as excessive deference to the user.” They ran the agent five times per model and saw that it correctly found and flagged at least one quirk of each model. However, the agent sometimes failed to identify specific quirks consistently. It had trouble evaluating subtle quirks, such as self-promotion and research-sandbagging, as well as quirks that are difficult to elicit, like the Hardcode Test Cases quirk.


The agentic experience: Is MCP the right tool for your AI future?

As enterprises race to operationalize AI, the challenge isn't only about building and deploying large language models (LLMs); it's also about integrating them seamlessly into existing API ecosystems while maintaining enterprise-level security, governance, and compliance. Apigee is committed to leading you on this journey. Apigee streamlines the integration of gen AI agents into applications by bolstering their security, scalability, and governance. While the Model Context Protocol (MCP) has emerged as a de facto method of integrating discrete APIs as tools, the journey of turning your APIs into these agentic tools is broader than a single protocol. This post highlights the critical role of your existing API programs in this evolution and how ... Leveraging MCP services across a network requires specific security constraints. Perhaps you would like to add authentication to your MCP server itself. Once you’ve authenticated calls to the MCP server, you may want to authorize access to certain tools depending on the consuming application. You may want to provide first-class observability information to track which tools are being used and by whom. Finally, you may want to ensure that whatever downstream APIs your MCP server is supplying tools for also have the minimum security guarantees already outlined above.
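
To illustrate the layering described above (authenticate calls to the MCP server, authorize individual tools per consuming application, and record which tools are used and by whom), here is a plain-Python sketch. It is not the MCP SDK or an Apigee feature: the token map, tool allowlist and tool names are invented for illustration, and a real deployment would validate OAuth/JWT credentials in front of the server.

    import logging
    from typing import Callable, Dict

    logging.basicConfig(level=logging.INFO)
    log = logging.getLogger("mcp-gateway")

    # Illustrative policy: which consuming application may call which tools.
    TOOL_ALLOWLIST: Dict[str, set] = {
        "support-chatbot": {"search_orders", "get_order_status"},
        "finance-agent": {"get_invoice", "search_orders"},
    }

    # Illustrative token -> application mapping; a real gateway would verify
    # signed credentials rather than look up static tokens.
    API_TOKENS = {"token-abc": "support-chatbot", "token-xyz": "finance-agent"}

    TOOLS: Dict[str, Callable[..., str]] = {
        "search_orders": lambda q: f"orders matching {q!r}",
        "get_order_status": lambda order_id: f"status of {order_id}",
        "get_invoice": lambda order_id: f"invoice for {order_id}",
    }

    def call_tool(token: str, tool_name: str, **kwargs) -> str:
        app = API_TOKENS.get(token)
        if app is None:
            raise PermissionError("unauthenticated caller")            # authentication
        if tool_name not in TOOL_ALLOWLIST.get(app, set()):
            raise PermissionError(f"{app} may not call {tool_name}")   # authorization
        log.info("tool=%s caller=%s", tool_name, app)                  # observability
        return TOOLS[tool_name](**kwargs)

    # Example: call_tool("token-abc", "get_order_status", order_id="42")

The same three concerns then repeat one layer down, where the MCP server calls the downstream APIs it wraps.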


AI Innovation: 4 Steps For Enterprises To Gain Competitive Advantage

A skill is a single ability, such as the ability to write a message or analyze a spreadsheet and trigger actions from that analysis. An agent independently handles complex, multi-step processes to produce a measurable outcome. We recently announced an expanded network of Joule Agents to help foster autonomous collaboration across systems and lines of business. This includes out-of-the-box agents for HR, finance, supply chain, and other functions that companies can deploy quickly to help automate critical workflows. AI front-runners, such as Ericsson, Team Liquid, and Cirque du Soleil, also create customized agents that can tackle specific opportunities for process improvement. Now you can build them with Joule Studio, which provides a low-code workspace to help design, orchestrate, and manage custom agents using pre-defined skills, models, and data connections. This can give you the power to extend and tailor your agent network to your exact needs and business context. ... Another way to become an AI front-runner is to tackle fragmented tools and solutions by putting in place an open, interoperable ecosystem. After all, what good is an innovative AI tool if it runs into blockers when it encounters your other first- and third-party solutions? 


Hard lessons from a chaotic transformation

The most difficult part of this transformation wasn’t the technology but getting people to collaborate in new ways, which required a greater focus on stakeholder alignment and change management. So my colleague first established a strong governance structure. A steering committee with leaders from key functions like IT, operations, finance, and merchandising met biweekly to review progress and resolve conflicts. This wasn’t a token committee, but a body with authority. If there were any issues with data exchange between marketing and supply chain, they were addressed and resolved during the meetings. By bringing all stakeholders together, we were also able to identify discrepancies early on. For example, when we discovered a new feature in the inventory system could slow down employee workflows, the operations manager reported it, and we immediately adjusted the rollout plan. Previously, such issues might not have been identified until after the full rollout and subsequent finger-pointing between IT and business departments. The next step was to focus on communication and culture. From previous failed projects, we knew that sending a few emails wasn’t enough, so we tried a more personal approach. We identified influential employees in each department and recruited them as change champions.


Benchmarks for AI in Software Engineering

HumanEval and SWE-bench have taken hold in the ML community, and yet, as indicated above, neither is necessarily reflective of LLMs’ competence in everyday software engineering tasks. I conjecture that one of the reasons is the difference in points of view between the two communities! The ML community prefers large-scale, automatically scored benchmarks, as long as there is a “hill climbing” signal to improve LLMs. The business imperative for LLM makers to compete on popular leaderboards can relegate the broader user experience to a secondary concern. On the other hand, the software engineering community needs benchmarks that capture specific product experiences closely. Because curation is expensive, the scale of these benchmarks is sufficient only to get a reasonable offline signal for the decision at hand (A/B testing is always carried out before a launch). Such benchmarks may also require a complex setup to run, and sometimes are not automated in scoring; but these shortcomings can be acceptable given their smaller scale. For exactly these reasons, such benchmarks are not useful to the ML community. Much is lost due to these different points of view. It is an interesting question how these communities could collaborate to bridge the gap between scale and meaningfulness and create evals that work well for both communities.


Scientists Use Cryptography To Unlock Secrets of Quantum Advantage

When a quantum computer successfully handles a task that would be practically impossible for current computers, this achievement is referred to as quantum advantage. However, this advantage does not apply to all types of problems, which has led scientists to explore the precise conditions under which it can actually be achieved. While earlier research has outlined several conditions that might allow for quantum advantage, it has remained unclear whether those conditions are truly essential. To help clarify this, researchers at Kyoto University launched a study aimed at identifying both the necessary and sufficient conditions for achieving quantum advantage. Their method draws on tools from both quantum computing and cryptography, creating a bridge between two fields that are often viewed separately. ... “We were able to identify the necessary and sufficient conditions for quantum advantage by proving an equivalence between the existence of quantum advantage and the security of certain quantum cryptographic primitives,” says corresponding author Yuki Shirakawa. The results imply that when quantum advantage does not exist, then the security of almost all cryptographic primitives — previously believed to be secure — is broken. Importantly, these primitives are not limited to quantum cryptography but also include widely-used conventional cryptographic primitives as well as post-quantum ones that are rapidly evolving.


It’s time to stop letting our carbon fear kill tech progress

With increasing social and regulatory pressure, reluctance by a company to reveal emissions is ill-received. For example, in Europe, the Corporate Sustainability Reporting Directive (CSRD) currently requires large businesses to publish their emissions and other sustainability datapoints. Opaque sustainability reporting undermines environmental commitments and distorts the reference points necessary for net zero progress. How can organisations work toward a low-carbon future when their measurement tools are incomplete or unreliable? The issue is particularly acute regarding Scope 3 emissions. Scope 3 emissions often account for the largest share of a company’s carbon footprint and are those generated indirectly along the supply chain by a company’s vendors, including emissions from technology infrastructure like data centres. ... It sounds grim, but there is some cause for optimism. Most companies are in a better position than they were five years ago and acknowledge that their measurement capabilities have improved. We need to accelerate the momentum of this progress to ensure real action. Earth Overshoot Day is a reminder that climate reporting for the sake of accountability and compliance only covers the basics. The next step is to use emissions data as benchmarks for real-world progress.


Why Supply Chain Resilience Starts with a Common Data Language

Building resilience isn’t just about buying more tech, it’s about making data more trustworthy, shareable, and actionable. That’s where global data standards play a critical role. The most agile supply chains are built on a shared framework for identifying, capturing, and sharing data. When organizations use consistent product and location identifiers, such as GTINs (Global Trade Item Numbers) and GLNs (Global Location Numbers) respectively, they reduce ambiguity, improve traceability, and eliminate the need for manual data reconciliation. With a common data language in place, businesses can cut through the noise of siloed systems and make faster, more confident decisions. ... Companies further along in their digital transformation can also explore advanced data-sharing standards like EPCIS (Electronic Product Code Information Services) or RFID (radio frequency identification) tagging, particularly in high-volume or high-risk environments. These technologies offer even greater visibility at the item level, enhancing traceability and automation. And the benefits of this kind of visibility extend far beyond trade compliance. Companies that adopt global data standards are significantly more agile. In fact, 58% of companies with full standards adoption say they manage supply chain agility “very well” compared to just 14% among those with no plans to adopt standards, studies show.
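
As a small illustration of how a shared identifier scheme enables automated validation rather than manual reconciliation, here is a sketch of the GS1 check-digit algorithm that GTINs and GLNs share. The function names are my own, and the sample number is a standard GTIN-13 example; it is a minimal data-quality check, not a full standards implementation.

    def gs1_check_digit(body: str) -> int:
        """Compute the GS1 check digit for an identifier body (all digits except the last).
        GTIN-8/12/13/14 and 13-digit GLNs all share this algorithm."""
        total = 0
        # Weights alternate 3, 1, 3, 1, ... starting from the rightmost body digit.
        for i, ch in enumerate(reversed(body)):
            total += int(ch) * (3 if i % 2 == 0 else 1)
        return (10 - total % 10) % 10

    def is_valid_gs1_key(identifier: str) -> bool:
        """Validate a GTIN or GLN by recomputing its check digit."""
        if not identifier.isdigit() or len(identifier) not in (8, 12, 13, 14):
            return False
        return gs1_check_digit(identifier[:-1]) == int(identifier[-1])

    # Example: is_valid_gs1_key("4006381333931") returns True; changing any single
    # digit makes it False, which catches common keying and transcription errors.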


Opinion: The AI bias problem hasn’t gone away you know

When we build autonomous systems and allow them to make decisions for us, we enter a strange world of ethical limbo. A self-driving car forced to make a similar decision to protect the driver or a pedestrian in a case of a potentially fatal crash will have much more time than a human to make its choice. But what factors influence that choice? ... It’s not just the AI systems shaping the narrative, raising some voices while quieting others. Organisations made up of ordinary flesh-and-blood people are doing it too. Irish cognitive scientist Abeba Birhane, a highly-regarded researcher of human behaviour, social systems and responsible and ethical artificial intelligence was asked to give a keynote recently for the AI for Good Global Summit. According to her own reports on Bluesky, a meeting was requested just hours before presenting her keynote: “I went through an intense negotiation with the organisers (for over an hour) where we went through my slides and had to remove anything that mentions ‘Palestine’ ‘Israel’ and replace ‘genocide’ with ‘war crimes’…and a slide that explains illegal data torrenting by Meta, I also had to remove. In the end, it was either remove everything that names names (Big Tech particularly) and remove logos, or cancel my talk.” 
