Daily Tech Digest by Kannan Subbiah: Daily Tech Digest

How generative AI will benefit physical industries

To make generative AI’s potential a reality for a physical business, two crucial elements come into play: people and data. Investing in a highly skilled team is a given precondition for success with any business. Also critical is having a diversity of expertise, as well as a diversity of experiences, cultural touch points, and background. Drawing on this expertise and experience to inform how generative AI is developed allows more context to be built-in, and the models can be expanded to serve a global audience versus a regional or national one. Data quality in both edge computing and generative AI models is crucial. This is what has driven Motive to invest in a truly world-class annotations team. Because accuracy is so critical for the safety and optimization of our customers, this team ensures that the processes behind our use of generative AI are strong and consistent. These processes include ensuring the highest quality data and labels to train our models, and thus our products and services. At the same time, generative AI in the physical economy will only be as useful as the insights and capabilities it creates.

Do you need a larger project team?

There is plenty of anecdotal evidence in the industry where GCs have taken on data center projects in EU regions and have not fully understood the local resourcing requirements and supply chain logistics. In addition, they have incorrectly assumed that a UK labor force will be as effective as normal, when they are on rotational-based attendance in a regional project office. Instead, the solution may lie in developing smaller, fully supported, highly competent, highly motivated, and well-compensated teams capable of delivering increased outputs to realize your competitive potential – a theme also adopted by the World Quality Week in 2023. To meet the strong imperative for quick time-to-market in the industry within the context of an acute skills shortage, we argue that the solution lies in focusing on training people and empowering them with the capabilities of AI. Streamlined, lean teams with mature AI tools have a better chance of efficiently delivering on larger projects. Investment in training is crucial across the industry, particularly innovative approaches that enable smaller teams to achieve more thanks to AI assistance and other technological advancements.

Data Observability in the Cloud: Three Things to Look for in Holistic Data Platform Governance

To be truly meaningful in addressing the pain associated with data and AI pipelines, data observability tools must expand into FinOps. It’s no longer enough to know where a pipeline stalls or breaks -- data teams need to know how much the pipelines cost. In the cloud, inefficient performance drives up computing costs, which in turn drives up total costs. Tools must encompass FinOps to provide observability into costs pertaining to both infrastructure and computing resources, broken down by job, user, and project. They must also include advanced analytics to provide guidance on how to make individual pipelines cost-efficient. This will free up data teams to focus on strategic decision-making rather than spending their time reconfiguring pipelines for cost. ... To meet these demands, data observability solution vendors must offer custom products that allow customers to see on a platform-specific level such things as detailed cost visibility, efficient management of storage costs, chargeback/showback, and where the expensive projects, queries, and users lie.
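
As a rough illustration of the per-job, per-user, and per-project breakdown described above, here is a minimal Python sketch. The record layout, the field names (`job`, `user`, `project`, `cost_usd`), and the sample figures are all assumptions made for this example; they do not come from any particular observability product.

```python
from collections import defaultdict

# Hypothetical pipeline-run records; field names and costs are invented for illustration.
RUNS = [
    {"job": "ingest", "user": "ana", "project": "sales", "cost_usd": 12.5},
    {"job": "transform", "user": "ben", "project": "sales", "cost_usd": 7.5},
    {"job": "ingest", "user": "ana", "project": "marketing", "cost_usd": 30.0},
    {"job": "report", "user": "ben", "project": "marketing", "cost_usd": 2.25},
]


def cost_by(records, key):
    """Sum cost_usd over records, grouped by the given field (job, user, or project)."""
    totals = defaultdict(float)
    for record in records:
        totals[record[key]] += record["cost_usd"]
    return dict(totals)


def most_expensive(records, key):
    """Return the group with the highest total cost for the given field."""
    totals = cost_by(records, key)
    return max(totals, key=totals.get)


if __name__ == "__main__":
    for dimension in ("job", "user", "project"):
        print(dimension, cost_by(RUNS, dimension))
    print("most expensive project:", most_expensive(RUNS, "project"))
```

The same grouping could feed a chargeback or showback report, with each team billed or shown the total for its own project.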

Fundamentals of Functions and Relations for Software Quality Engineering

Effective testing is not just about covering every line of code. It's about understanding the underlying relationships. How do we effectively test the complex relationships in our software code? Understanding functions and relations proves an invaluable asset in this endeavor. ... It's worth noting that while all programs can be viewed as functions in a broad sense, not all are "pure" functions. Pure functions have no side effects, meaning they solely rely on their inputs to produce outputs without altering any external state. In contrast, many practical programs involve side effects, complicating their pure function interpretation. ... While functions provide clear input-output connections, not all relationships in software are so straightforward. Imagine tracking dependencies between tasks in a project management tool. Here, multiple tasks might relate to each other, forming a more complex network. ... Relations can sometimes group elements into equivalence classes, where elements within a class behave similarly. Testers can leverage this by testing one element in each class, assuming similar behavior for others, saving time and resources.
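
To make the pure-function and equivalence-class ideas above concrete, here is a small Python sketch. The `shipping_fee` function, its weight bands, and the fee values are hypothetical, chosen only for illustration. The sketch tests one representative input per class and contrasts the pure function with a variant that has a side effect.

```python
def shipping_fee(weight_kg: float) -> float:
    """Pure function: the result depends only on the input, with no external state."""
    if weight_kg <= 0:
        raise ValueError("weight must be positive")
    if weight_kg <= 1:
        return 5.0   # "light" class
    if weight_kg <= 10:
        return 12.0  # "medium" class
    return 30.0      # "heavy" class


# Impure variant: it appends to a module-level list, which is a side effect.
call_log = []


def shipping_fee_logged(weight_kg: float) -> float:
    call_log.append(weight_kg)
    return shipping_fee(weight_kg)


# One representative input and its expected fee per equivalence class.
# This assumes every weight inside a class behaves like its representative.
REPRESENTATIVES = {
    "light": (0.5, 5.0),
    "medium": (4.0, 12.0),
    "heavy": (25.0, 30.0),
}


def run_partition_tests():
    """Check one representative per class; return a pass/fail flag for each class."""
    return {
        name: shipping_fee(weight) == expected
        for name, (weight, expected) in REPRESENTATIVES.items()
    }


if __name__ == "__main__":
    print(run_partition_tests())
```

Three test inputs stand in for the whole input range here. In practice, testers often add boundary values (such as exactly 1 kg and 10 kg) because defects tend to sit at the edges between classes.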

Your AI Girlfriend Is Cheating On You, Warns Mozilla

Mozilla said it could find only one chatbot that met its minimum security standards, with a worrying lack of transparency over how the intensely personal information that might be shared in such apps is protected. Almost two thirds of the apps didn’t reveal whether the data they collect is encrypted. Just under half of them permitted the use of weak passwords, with some even accepting a password as flimsy as “1”. More than half of the apps tested also failed to let users delete their personal data. One even claimed that “communication via the chatbot belongs to the software.” Mozilla also found that the use of trackers (tiny pieces of code that gather information about your device and what you do on it) was widespread among the romantic chatbots. ... The main tip is not to say anything to the chatbot that you wouldn’t want friends or colleagues to discover, as the privacy of these services cannot be guaranteed. Also use a strong password, request that personal data be deleted once you’ve finished using the chatbot, opt out of having your data used to train AI models, and don’t accept phone permissions that give the chatbot access to your location, camera, microphone or files on your device.

A Balanced Look at the Potential and Challenges of Popular LLMs

A beautiful symphony requires more than just individual talent. Ethical considerations like potential biases and misinformation risks demand attention. We must ensure responsible development so that these LLMs don’t become instruments of discord but rather powerful tools for good. The potential for collaboration is even more exciting. Imagine Bard fact-checking Claude’s poems, or Qwen providing real-time data for GPT-3.5-Turbo-0613’s code generation. Such collaborations could lead to groundbreaking innovations, a true ensemble performance exceeding the capabilities of any single LLM. This is just the opening act of a much grander performance. As the music evolves, LLMs hold immense potential. Advancements in natural language understanding could enable nuanced conversations, personalized education could become a reality, and creative collaboration could reach unprecedented heights. This orchestra is just beginning its performance, and the future holds a symphony of possibilities waiting to be composed. In short, the key lies in understanding their technical nuances, recognizing their individual strengths, and fostering responsible development.

Your fingerprints can be recreated from the sounds made when you swipe on a touchscreen

Without contact prints or finger detail photos, how can an attacker hope to get any fingerprint data to enhance MasterPrint and DeepMasterPrint dictionary attack results on user fingerprints? One answer is as follows: the PrintListener paper says that “finger-swiping friction sounds can be captured by attackers online with a high possibility.” The source of the finger-swiping sounds can be popular apps like Discord, Skype, WeChat, FaceTime, etc. Any chatty app where users carelessly perform swiping actions on the screen while the device mic is live. Hence the side-channel attack name – PrintListener. ... To prove the theory, the scientists practically developed their attack research as PrintListener. In brief, PrintListener uses a series of algorithms for pre-processing the raw audio signals which are then used to generate targeted synthetics for PatternMasterPrint. Importantly, PrintListener went through extensive experiments “in real-world scenarios,” and, as mentioned in the intro, can facilitate successful partial fingerprint attacks in better than one in four cases, and complete fingerprint attacks in nearly one in ten cases.

ClickHouse: Scaling Log Management with Managed Services

A viable solution emerges in the merging of the advantages of open-source tools with the efficiency of managed services. This combination effectively addresses scalability and cost concerns, while upholding the operational efficiency required. Striking this balance between functionality, cost, and effort is particularly critical for teams constrained by budget and limited engineering resources. To illustrate this approach, consider specific log management strategies, such as the one implemented by DoubleCloud, which embody these principles. DoubleCloud, for instance, employs services like ClickHouse for data transfer and visualization, effectively managing substantial log volumes within a modest budget. ClickHouse is renowned for its efficient data compression techniques, serving as a prime example of how open source tools, when properly managed, can significantly enhance log management processes. This scenario provides a practical demonstration of how the integration of open source benefits with managed services can offer optimal solutions to the challenges previously discussed.

4 hidden risks of your enterprise cloud strategy

Cloud vendors themselves can encounter any number of business-related issues that can challenge their ability to provide service to the standard enterprise CIOs committed to when the contract was signed, including the introduction of new risks. ... Many enterprise IT executives see the cloud as delivering near-infinite scalability — something that is not mathematically true. This is not helped by cloud marketing, which strongly implies — if not outright promises — unlimited scalability. Most of the time, the cloud’s elasticity affords great levels of scalability for its tenants. When emergency strikes, however, all bets are off, says Charles Blauner, operating partner and CISO in residence at cybersecurity investment firm Team8, and former CISO for Citigroup, Deutsche Bank, and JP Morgan Chase. ... “CIOs believe that by using multiple cloud providers, they think that it is improving availability, but it’s not. All it’s doing is increasing complexity, and complexity has always been the enemy of security,” Winckless says. “It is far more cost-effective to use the cloud provider’s zones.” Enterprises also often fall short on the financial and efficiency benefits promised by the cloud because they are unwilling to trust the cloud environment’s mechanisms sufficiently — or so argues Rich Isenberg, a partner at consulting firm McKinsey who oversees their cybersecurity strategy practice.

Data Governance in the Era of Generative AI

GenAI accelerates trends already evident with traditional AI: the importance of data quality and privacy, growing focus on responsible and ethical AI, and the emergence of AI regulations. This will create both new challenges and opportunities for DG. ... Traditional DG processes provide a well-trodden path for proper management and usage of data across organizations: discover and classify data to identify critical/sensitive data; map the data to policies and other business context; manage data access and security; manage privacy and compliance; and monitor and report on effectiveness. Similarly, as DG frameworks expand to support AI governance, they have an important role to play across the GenAI/LLM value chain. ... Traditional AI/ML will continue to be critical for automating and scaling various DG processes. These include data classification; associating policy and business context with data; and detecting anomalies/issues and creating and applying data quality rules to fix them. Building on these capabilities, GenAI has the potential to turbocharge data democratization and drive dramatic gains in productivity for data teams.

Quote for the day:

"You may be good. You may even be better than everyone esle. But without a coach you will never be as good as you could be." -- Andy Stanley


Daily Tech Digest - February 20, 2024