The AI-First Future of Open Source Data
If we take the GPL for data one step further, we begin to see the value
equation of data, or “the data-in-to-data-out ratio” as Augustin calls it. He
uses the example of people willingly giving up parts of their data and privacy
to websites because the small amount of data they hand over returns greater
value to them. Augustin sees the data-in-to-data-out ratio as a tipping point
in open source data. Calling it one of his application principles, Augustin
suggests that data engineers should focus on providing users with more value
while taking less and less information from them. He also wants to figure out
a way to never ask users for anything, only to provide them an advantage. For
example, new app users will always be asked for
information. But how can we skip that step and collect data directly in exchange
for providing value? “Most people are willing to [give up data] because they get
a lot of utility back. Think about the ratio of how much you put in versus how
much you get back. You get back an awful lot. People are willing to give up so
much of their personal information because they get a lot back,” he says.
How User Interface Testing Can Fit into the CI/CD Pipeline
Reliance on manual testing is why organizations can’t successfully implement
CI/CD: a pipeline that depends on manual processes cannot be sustained, because
those processes slow down the entire delivery cycle. Testing is no longer the
sole responsibility of developers or testers; it takes investment in, and
integration with, infrastructure. Development teams need to focus on building
the coverage that is essential, and to be more efficient they should test
workflows rather than individual features. Additionally, manual testers who
are not developers can still be part of the process, provided that they use a
testing tool that gives them the required automation capabilities in a
low-code environment. For example, with Telerik Test Studio, a manual tester
can create an automated test by interacting with the application’s UI in a
browser. That test can be presented without code, and as a result the tester
can learn how the code behaves. Another best
practice in making UI testing efficient is to be selective with what is included
in the pipeline.
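
To make that selectivity concrete, here is a minimal sketch, assuming a pytest-plus-Selenium stack rather than Test Studio itself; the marker name, URL, and element IDs are all illustrative. A custom marker tags the few workflow-level tests the pipeline runs, while the rest of the suite stays in slower scheduled runs.

```python
# Minimal sketch: tag workflow-level UI tests so the CI/CD pipeline runs
# only the essential subset. Assumes pytest + Selenium; the page URL and
# element IDs are hypothetical.
import pytest
from selenium import webdriver
from selenium.webdriver.common.by import By

@pytest.fixture
def browser():
    driver = webdriver.Chrome()
    yield driver
    driver.quit()

@pytest.mark.ui_smoke  # register this custom marker in pytest.ini
def test_checkout_workflow(browser):
    # Exercise the whole checkout workflow, not one feature at a time.
    browser.get("https://example.com/shop")
    browser.find_element(By.ID, "add-to-cart").click()
    browser.find_element(By.ID, "checkout").click()
    assert "Order confirmed" in browser.page_source
```

The pipeline stage would then invoke only `pytest -m ui_smoke`, keeping UI feedback fast while the full suite runs on a nightly schedule.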
Want to change a dysfunctional culture? Intel’s Israel Development Center shows how
Intel’s secret weapon, one that until recently it did not talk about much, is
its Israel Development Center, the largest employer in Israel. In that nation,
surrounded by hostile countries, women and men are treated more equally than
in most other countries I’ve studied, and people are highly supportive of each
other, making Israel a welcoming country for women in a wide variety of
industries. The facility itself is impressively large and well-built and
eclipses Intel’s corporate office in both size and security. The work done there
really defines Intel’s historic success in both product performance and quality,
making it an example of how a company should be run. Surprisingly, the
collaborative and supportive country culture overrode the hostile and
self-destructive corporate culture that has defined the US tech industry. What
Gelsinger has done is showcase the development center as a template for the rest
of Intel, as a firm more tolerant of failure, more supportive of women and
focused like a laser on product quality, performance and caring for Intel’s
customers.
Uber security breach 'looks bad', potentially compromising all systems
While it was unclear what data the ride-sharing company retained, he noted that
whatever it had most likely could be accessed by the hacker, including trip
history and addresses. Given that everything had been compromised, he added
that there was also no way for Uber to confirm whether data had been accessed
or altered, since the hackers had access to logging systems; this meant they
could delete or alter access logs, he said. In the 2016 breach, hackers
infiltrated a private
GitHub repository used by Uber software engineers and gained access to an AWS
account that managed tasks handled by the ride-sharing service. It compromised
data of 57 million Uber accounts worldwide, with hackers gaining access to
names, email addresses, and phone numbers. Some 7 million drivers were also
affected, including details of more than 600,000 driver’s licenses. Uber was
later found to have concealed the breach for more than a year, even resorting
to paying off the hackers to delete the information and keep details of the
breach quiet.
What Is GPS L5, and How Does It Improve GPS Accuracy?
L5 is the most advanced GPS signal available for civilian use. Although it’s
primarily meant for life-critical and high-performance applications, such as
helping aircraft navigate, it’s available for everyone, like the L1 signal. So
the manufacturers of mass-market consumer devices such as smartphones, fitness
trackers, in-car navigation systems, and smartwatches are integrating it into
their devices to offer the best possible GPS experience. One of the L5
signal’s key advantages is that it uses the 1176.45 MHz radio frequency, which
is reserved worldwide for aeronautical navigation. As such, it doesn’t suffer
interference from other radio traffic in that band, such as television
broadcasts, radar, and ground-based navigation aids. With L5 data, your device
can use more advanced methods to determine which signals carry less error and
pinpoint your location more effectively. This is particularly helpful in areas
where a GPS signal can be received but is severely degraded.
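
As one concrete illustration of those “more advanced methods,” here is a minimal sketch of the classic dual-frequency technique that tracking L5 alongside L1 makes possible: ionospheric delay scales with the inverse square of frequency, so a weighted combination of the two pseudoranges cancels most of that error. The measurements below are made-up numbers, not real data.

```python
# Minimal sketch: the dual-frequency "ionosphere-free" combination.
# Ionospheric delay scales with 1/f^2, so a weighted difference of the
# L1 and L5 pseudoranges cancels it to first order.
F_L1 = 1575.42e6  # Hz, legacy civilian signal
F_L5 = 1176.45e6  # Hz, protected aeronautical band

def iono_free_pseudorange(p_l1: float, p_l5: float) -> float:
    g = (F_L1 / F_L5) ** 2  # ratio of ionospheric delays on L5 vs L1
    return (g * p_l1 - p_l5) / (g - 1)

# Illustrative: L1 range inflated ~5 m by the ionosphere, L5 by ~9 m
# (the lower frequency is delayed more); the combination recovers
# roughly the true 20,000,000 m range.
print(iono_free_pseudorange(20_000_005.0, 20_000_008.97))
```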
Digital transformation: How to get buy-in
Today’s IT leader has to be much more than tech-savvy, they have to be
business-savvy. IT leaders of today are expected to identify and build support
for transformational growth, even when it’s not popular. At Clarios, I included
“Challenge the Status Quo, Be a Respectful Activist” to our IT guiding
principles, knowing that around any CEO or general manager’s table they need one
or two disruptors – IT leaders should be one. However, once that activist IT
leader sells their vision to the boss, now they have to drive change in their
peers and the entire organization, without formal authority. ... Our IT leaders
can gain buy-in on new ideas by actively listening to our business partners. Our
focus is to understand from their perspective the challenges impeding their work
by rounding in our hospital locations to see first-hand the issues. So when we
propose solutions, it is from their perspective. Utilizing these practices, we
can bring forth the vision of Marshfield Clinic Health System because we can
implement technology that bridges human interaction between our patients and
care teams, which is at the heart of healthcare.
How to Prepare for New PCI DSS 4.0 Requirements
There are several impactful changes to the requirements associated with PCI
DSS v4.0 compliance, ranging from policy development (all changes will require
some level of policy updates) to Public Key Infrastructure (PKI), with
multiple changes to how keys and certificates are managed. Carroll points out
there will also be remote-access changes, including newly defined rules for
how systems may be accessed remotely, and risk assessments are now required as
multiple, regular “targeted risk assessments” that capture risk in a format
specified by the PCI DSS. Dan Stocker, director at Coalfire, a provider of
cybersecurity advisory services, points out fintech is growing rapidly, with
innovative uses for credit card data. “Entities should realistically evaluate
their obligations under PCI," he says. “Use of descoping techniques, such as
tokenization, can reduce total cost of compliance, but also limit product
development choices.” He explains modern enterprises have multiple compliance
obligations across diverse topics, such as financial reporting, privacy, and in
the case of service providers, many more.
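
Tokenization, the descoping technique Stocker mentions, is easy to picture with a minimal sketch. This is illustrative only, assuming an in-process vault; a real deployment would use a hardened, audited token service, often backed by an HSM.

```python
# Minimal sketch of tokenization as a PCI descoping technique: the PAN
# stays inside the vault (which remains in scope), while downstream
# systems store and pass around only the meaningless token.
import secrets

class TokenVault:
    """Illustrative stand-in for a hardened token service."""
    def __init__(self) -> None:
        self._store: dict[str, str] = {}  # token -> PAN

    def tokenize(self, pan: str) -> str:
        token = "tok_" + secrets.token_hex(8)
        self._store[token] = pan
        return token

    def detokenize(self, token: str) -> str:
        return self._store[token]

vault = TokenVault()
token = vault.tokenize("4111111111111111")  # well-known test PAN
print(token)  # e.g. tok_3f9c1a2b4d5e6f70 -- safe to store in order systems
```

As the quote notes, the trade-off is that systems holding only tokens can no longer compute directly on card data, which limits product development choices.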
Building Large-Scale Real-Time JSON Applications
A critical part of building large-scale JSON applications is to ensure the
JSON objects are organized efficiently in the database for optimal storage and
access. Documents may be organized in the database in one or more dedicated
sets (tables), over one or more namespaces (databases) to reflect ingest,
access and removal patterns. Multiple documents may be grouped and stored in
one record either in separate bins (columns) or as sub-documents in a
container group document. Record keys are constructed as a combination of the
collection-id and the group-id to provide fast logical access as well as
group-oriented enumeration of documents. For example, the ticker data for a
stock can be organized in multiple records with keys consisting of the stock
symbol (collection-id) + date (group-id). Multiple documents can be accessed
using either a scan with a filter expression (predicate), a query on a
secondary index, or both. A filter expression is defined over the values and
properties of elements in the JSON document; for example, it can match records
where an array is larger than a certain size or a value is present in a
sub-tree. A secondary index defined on a basic or collection type provides
fast value-based queries.
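
A minimal sketch of that key scheme, with a plain dict standing in for the database and illustrative names:

```python
# Minimal sketch: record keys built from collection-id + group-id, with
# individual ticks stored as sub-documents inside the grouped record.
from collections import defaultdict

db: dict[str, dict[str, dict]] = defaultdict(dict)  # stand-in for the DB

def record_key(collection_id: str, group_id: str) -> str:
    # e.g. "AAPL:2022-09-16" -- fast logical access plus group-oriented
    # enumeration of a symbol's daily records
    return f"{collection_id}:{group_id}"

def store_tick(symbol: str, date: str, ts: str, tick: dict) -> None:
    db[record_key(symbol, date)][ts] = tick

store_tick("AAPL", "2022-09-16", "09:30:00", {"price": 150.10, "vol": 900})
store_tick("AAPL", "2022-09-16", "09:30:01", {"price": 150.22, "vol": 450})

# Filter-expression-style access: ticks in the day's record with vol > 500
day = db[record_key("AAPL", "2022-09-16")]
print([t for t in day.values() if t["vol"] > 500])
```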
Digital self defense: Is privacy tech killing AI?
AI needs data. Lots of it. The more data you can feed a machine learning
algorithm, the better it can spot patterns, make decisions, predict
behaviours, personalise content, diagnose medical conditions, power smart
everything, detect cyber threats and fraud; indeed, AI and data make for a
happy partnership: “The algorithm without data is blind. Data without
algorithms is dumb.” Even so, some digital self defense may be in order,
because AI is at risk: not everyone wants to share, at least not under the
current rules of digital engagement. Some individuals disengage entirely,
becoming digital hermits. Others proceed with caution, using
privacy-enhancing technologies (PETs) to plug the digital leak, a kind of
karate chop of digital self defense —
they don’t trust website privacy notices, they verify them with tools like
DuckDuckGo’s Privacy Grade extension and soon, machine-readable privacy
notices. They don’t tell companies their preferences; they enforce them with
dedicated tools, and search anonymously using AI-powered privacy-protective
search engines and browsers like DuckDuckGo, Brave and Firefox.
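
The article names PETs only in general terms; as one concrete example of the genre, here is a minimal sketch of differential privacy, a widely used PET that lets analysts (and AI models) learn aggregate patterns without any individual’s record being recoverable. The epsilon value and data are illustrative.

```python
# Minimal sketch of a differentially private count: Laplace noise with
# scale sensitivity/epsilon hides any single person's contribution.
import random

def dp_count(values: list[bool], epsilon: float = 0.5) -> float:
    true_count = sum(values)
    sensitivity = 1.0  # one person changes the count by at most 1
    # A Laplace(0, b) sample is a random-signed Exponential(mean b) sample.
    noise = random.choice([-1, 1]) * random.expovariate(epsilon / sensitivity)
    return true_count + noise

opted_in = [random.random() < 0.3 for _ in range(10_000)]
print(round(dp_count(opted_in)))  # close to ~3,000, never exact
```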
Why Mutability Is Essential for Real-Time Data Analytics
At Facebook, we built an ML model that scanned all new calendar events as they
were created and stored in the event database. In real time, the algorithm
would inspect each event and decide whether it was spam; if so, the model code
would insert a new field into the existing event record to mark it as spam.
Because so many events were flagged
and immediately taken down, the data had to be mutable for efficiency and
speed. Many modern ML-serving systems have emulated our example and chosen
mutable databases. This level of performance would have been impossible with
immutable data. A database using copy-on-write would quickly get bogged down
by the number of flagged events it would have to update. If the database
stored the original events in Partition A and appended flagged events to
Partition B, this would require additional query logic and processing power,
as every query would have to merge relevant records from both partitions. Both
workarounds would have created an intolerable delay for our Facebook users,
heightened the risk of data errors, and created more work for developers
and/or data engineers.
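
A minimal sketch of the contrast, with dicts standing in for database partitions and illustrative event data:

```python
# Mutable store: flag spam by updating the existing record in place.
events = {"evt1": {"title": "Free crypto!!", "spam": False}}
events["evt1"]["spam"] = True  # one in-place write, nothing to merge

# Immutable workaround: originals live in partition A, flags are appended
# to partition B, and every read pays for a merge of the two.
partition_a = {"evt1": {"title": "Free crypto!!"}}
partition_b = [{"id": "evt1", "spam": True}]

def read_event(event_id: str) -> dict:
    record = dict(partition_a[event_id])
    for flag in partition_b:  # extra query logic on every single read
        if flag["id"] == event_id:
            record["spam"] = flag["spam"]
    return record

print(read_event("evt1"))  # {'title': 'Free crypto!!', 'spam': True}
```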
Quote for the day:
"Leadership and learning are
indispensable to each other." -- John F. Kennedy