Daily Tech Digest - September 16, 2022

The AI-First Future of Open Source Data

If we take it one step further from the GPL for data, we begin to see the value equation of data, or “the data-in-to-data-out ratio” as Augustin calls it. He uses the example of why people are so willing to give up parts of their data and privacy to websites because the small amount of data they’re handing over returns greater value back to them. Augustin sees the data-in-to-data-out ratio as a tipping point in open source data. Calling it one of his application principles, Augustin suggests that data engineers should focus on providing users with more value but take less and less information from them. He also wants to figure out a way never to ask your users for anything. You’re only providing them an advantage. For example, new app users will always be asked for information. But how can we skip that step and collect data directly in exchange for providing value? “Most people are willing to [give up data] because they get a lot of utility back. Think about the ratio of how much you put in versus how much you get back. You get back an awful lot. People are willing to give up so much of their personal information because they get a lot back,” he says.


How User Interface Testing Can Fit into the CI/CD Pipeline

Reliance on manual testing is why organizations can’t successfully implement CI/CD. If CI/CD involves manual processes, it cannot be sustained, because they slow down the entire delivery cycle. Testing is no longer the sole responsibility of developers or testers, and it takes investment and integration in infrastructure. Developer teams need to focus on building the coverage that is essential. To be more efficient, they should focus on testing workflows rather than features. Additionally, manual testers who are not developers can still be part of the process, provided that they use a testing tool that gives them the required automation capabilities in a low-code environment. For example, with Telerik Test Studio, a manual tester can create an automated test by interacting with the application’s UI in a browser. That test can be presented without code, and as a result testers can learn how the code behaves. Another best practice in making UI testing efficient is to be selective with what is included in the pipeline.


Want to change a dysfunctional culture? Intel’s Israel Development Center shows how

Intel’s secret weapon, one that until recently it did not talk about much, is its Israel Development Center. It is the largest employer in Israel, a nation surrounded by hostile countries, and women and men are treated more equally than in most other countries I’ve studied. They are highly supportive of each other, making it an incredibly supportive country for women in a wide variety of industries. The facility itself is impressively large and well-built and eclipses Intel’s corporate office in both size and security. The work done there really defines Intel’s historic success in both product performance and quality, making it an example of how a company should be run. Surprisingly, the collaborative and supportive country culture overrode the hostile and self-destructive corporate culture that has defined the US tech industry. What Gelsinger has done is showcase the development center as a template for the rest of Intel, as a firm more tolerant of failure, more supportive of women and focused like a laser on product quality, performance and caring for Intel’s customers.


Uber security breach 'looks bad', potentially compromising all systems

While it was unclear what data the ride-sharing company retained, he noted that whatever it had most likely could be accessed by the hacker, including trip history and addresses. Given that everything had been compromised, he added that there also was no way for Uber to confirm if data had been accessed or altered since the hackers had access to logging systems. This meant they could delete or alter access logs, he said. In the 2016 breach, hackers infiltrated a private GitHub repository used by Uber software engineers and gained access to an AWS account that managed tasks handled by the ride-sharing service. It compromised data of 57 million Uber accounts worldwide, with hackers gaining access to names, email addresses, and phone numbers. Some 7 million drivers also were affected, including details of more than 600,000 driver licenses. Uber later was found to have concealed the breach for more than a year, even resorting to paying off hackers to delete the information and keep details of the breach quiet.


What Is GPS L5, and How Does It Improve GPS Accuracy?

L5 is the most advanced GPS signal available for civilian use. Although it’s primarily meant for life-critical and high-performance applications, such as helping aircraft navigate, it’s available for everyone, like the L1 signal. So the manufacturers of mass-market consumer devices such as smartphones, fitness trackers, in-car navigation systems, and smartwatches are integrating it into their devices to offer the best possible GPS experience. One of the key advantages that the L5 signal possesses is that it uses the 1176.45MHz radio frequency, which is reserved for aeronautical navigation worldwide. As such, it doesn’t have to worry about interference from any other radio wave traffic in this frequency, such as television broadcasts, radars, and any ground-based navigation aids. With L5 data, your device can access more advanced methods to determine which signals have less error and effectively pinpoint the location. It’s particularly helpful in areas where the GPS signal can be received but is severely degraded.


Digital transformation: How to get buy-in

Today’s IT leader has to be much more than tech-savvy; they have to be business-savvy. IT leaders of today are expected to identify and build support for transformational growth, even when it’s not popular. At Clarios, I added “Challenge the Status Quo, Be a Respectful Activist” to our IT guiding principles, knowing that around any CEO or general manager’s table they need one or two disruptors, and IT leaders should be one. However, once that activist IT leader sells their vision to the boss, they then have to drive change among their peers and across the entire organization, without formal authority. ... Our IT leaders can gain buy-in on new ideas by actively listening to our business partners. Our focus is to understand, from their perspective, the challenges impeding their work by rounding in our hospital locations to see the issues first-hand. So when we propose solutions, it is from their perspective. Utilizing these practices, we can bring forth the vision of Marshfield Clinic Health System because we can implement technology that bridges human interaction between our patients and care teams, which is at the heart of healthcare.


How to Prepare for New PCI DSS 4.0 Requirements

There are several impactful changes to the requirements associated with DSS v4.0 compliance, ranging from policy development (all changes will require some level of policy changes) to Public Key Infrastructure (PKI), as there will be multiple changes related to how keys and certificates are managed. Carroll points out there will also be remote access issues, including defined changes to how systems may be accessed remotely, and risk assessments, which are now required as multiple, regular “targeted risk assessments” that capture risk in a format specified by the PCI DSS. Dan Stocker, director at Coalfire, a provider of cybersecurity advisory services, points out fintech is growing rapidly, with innovative uses for credit card data. “Entities should realistically evaluate their obligations under PCI,” he says. “Use of descoping techniques, such as tokenization, can reduce total cost of compliance, but also limit product development choices.” He explains modern enterprises have multiple compliance obligations across diverse topics, such as financial reporting, privacy, and in the case of service providers, many more.


Building Large-Scale Real-Time JSON Applications

A critical part of building large-scale JSON applications is to ensure the JSON objects are organized efficiently in the database for optimal storage and access. Documents may be organized in the database in one or more dedicated sets (tables), over one or more namespaces (databases) to reflect ingest, access and removal patterns. Multiple documents may be grouped and stored in one record either in separate bins (columns) or as sub-documents in a container group document. Record keys are constructed as a combination of the collection-id and the group-id to provide fast logical access as well as group-oriented enumeration of documents. For example, the ticker data for a stock can be organized in multiple records with keys consisting of the stock symbol (collection-id) + date (group-id). Multiple documents can be accessed using either a scan with a filter expression (predicate), a query on a secondary index, or both. A filter expression consists of the values and properties of the elements in JSON. For example, a filter can test whether an array is larger than a certain size or whether a value is present in a sub-tree. A secondary index defined on a basic or collection type provides fast value-based queries described below.
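The key scheme and the filtered scan described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the in-memory `store` dictionary stands in for a database set, and `make_key`, `put_document`, `get_group`, `scan`, the `:` separator, and the sample ticker fields are all hypothetical names chosen for the example, not the API of any particular database.

```python
from typing import Any, Callable, Dict, List

Document = Dict[str, Any]

# Hypothetical in-memory stand-in for one set (table):
# record key -> {document id -> JSON document}.
store: Dict[str, Dict[str, Document]] = {}


def make_key(collection_id: str, group_id: str) -> str:
    """Compose a record key from the collection-id and the group-id."""
    return f"{collection_id}:{group_id}"


def put_document(collection_id: str, group_id: str, doc_id: str, doc: Document) -> None:
    """Store a document as a sub-document of its group record."""
    store.setdefault(make_key(collection_id, group_id), {})[doc_id] = doc


def get_group(collection_id: str, group_id: str) -> Dict[str, Document]:
    """Fetch every document in one group with a single key lookup."""
    return store.get(make_key(collection_id, group_id), {})


def scan(predicate: Callable[[Document], bool]) -> List[Document]:
    """Visit every document and keep those matching the filter expression."""
    return [
        doc
        for record in store.values()
        for doc in record.values()
        if predicate(doc)
    ]


# Ticker data keyed by stock symbol (collection-id) + date (group-id).
put_document("ACME", "2022-09-15", "t1", {"price": 10.5, "trades": [1, 2, 3]})
put_document("ACME", "2022-09-15", "t2", {"price": 10.7, "trades": [4]})
put_document("ACME", "2022-09-16", "t1", {"price": 11.0, "trades": [5, 6]})

day = get_group("ACME", "2022-09-15")        # group-oriented enumeration
busy = scan(lambda d: len(d["trades"]) > 1)  # filter: array larger than a given size
```

The group lookup touches a single record, while the scan has to visit every document, which is the trade-off a secondary index is meant to address.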


Digital self defense: Is privacy tech killing AI?

AI needs data. Lots of it. The more data you can feed a machine learning algorithm, the better it can spot patterns, make decisions, predict behaviours, personalise content, diagnose medical conditions, power smart everything, detect cyber threats and fraud; indeed, AI and data make for a happy partnership: “The algorithm without data is blind. Data without algorithms is dumb.” Even so, some digital self defense may be in order. But AI is at risk. Not everyone wants to share, at least, not under the current rules of digital engagement. Some individuals disengage entirely, becoming digital hermits. Others proceed with caution, using privacy-enhancing technologies (PETs) to plug the digital leak: a kind of karate chop, digital self defense. They don’t trust website privacy notices; they verify them with tools like DuckDuckGo’s Privacy Grade extension and, soon, machine-readable privacy notices. They don’t tell companies their preferences; they enforce them with dedicated tools, and search anonymously using AI-powered privacy-protective search engines and browsers like DuckDuckGo, Brave and Firefox.


Why Mutability Is Essential for Real-Time Data Analytics

At Facebook, we built an ML model that scanned all new calendar events as they were created and stored them in the event database. Then, in real time, an ML algorithm would inspect each event and decide whether it was spam. If it was categorized as spam, the ML model code would insert a new field into that existing event record to mark it as spam. Because so many events were flagged and immediately taken down, the data had to be mutable for efficiency and speed. Many modern ML-serving systems have emulated our example and chosen mutable databases. This level of performance would have been impossible with immutable data. A database using copy-on-write would quickly get bogged down by the number of flagged events it would have to update. If the database stored the original events in Partition A and appended flagged events to Partition B, this would require additional query logic and processing power, as every query would have to merge relevant records from both partitions. Both workarounds would have created an intolerable delay for our Facebook users, heightened the risk of data errors, and created more work for developers and/or data engineers.
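To make the contrast concrete, here is a minimal Python sketch of the two approaches: marking an event as spam in place, versus appending the flag to a second partition and merging at read time. It is a toy model under stated assumptions, not the system described above; the dictionaries, the `spam` field name, and the functions `flag_in_place`, `flag_append_only`, and `read_merged` are all hypothetical.

```python
from typing import Any, Dict, List

Event = Dict[str, Any]

# Mutable store: flagging an event updates the existing record in place.
mutable_events: Dict[int, Event] = {
    1: {"title": "Team sync"},
    2: {"title": "Win a prize"},
}


def flag_in_place(event_id: int) -> None:
    """Insert a spam field into the existing event record."""
    mutable_events[event_id]["spam"] = True


# Append-only alternative: originals stay untouched in partition A,
# and flags are appended as separate records to partition B.
partition_a: Dict[int, Event] = {
    1: {"title": "Team sync"},
    2: {"title": "Win a prize"},
}
partition_b: List[Event] = []


def flag_append_only(event_id: int) -> None:
    """Record the flag as a new entry instead of modifying the original."""
    partition_b.append({"event_id": event_id, "spam": True})


def read_merged(event_id: int) -> Event:
    """Every read has to merge the original with any later flag records."""
    merged = dict(partition_a[event_id])
    for update in partition_b:
        if update["event_id"] == event_id:
            merged.update({k: v for k, v in update.items() if k != "event_id"})
    return merged


flag_in_place(2)       # one write, and later reads need no extra work
flag_append_only(2)    # one append, but every later read pays for the merge
spam_event = read_merged(2)
```

In the mutable version a reader gets the flagged record with a single lookup; in the append-only version the merge logic in `read_merged` runs on every query, which is the extra query logic and processing the passage describes.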



Quote for the day:

"Leadership and learning are indispensable to each other." -- John F. Kennedy
