Daily Tech Digest - May 08, 2021

Gartner says composable data and analytics key to digital transformation

Gartner said business-facing data initiatives are key drivers of digital transformation in the enterprise. Its research showed that 72% of data and analytics leaders are leading, or are heavily involved in, their organizations’ digital transformation efforts. These data leaders now confront emerging trends on several fronts. XOps: The evolution of DataOps to support AI and machine learning workflows is now XOps. The X can stand for Data, ML, Model, or even Fin, as in DataOps, MLOps, ModelOps, and FinOps. This promises to bring flexibility and agility in coordinating infrastructure, data sources, and business needs in new ways. Engineering decision intelligence: Decision support is not new, but decision-making is now more complex. Engineering decision intelligence frames a wide range of techniques, from conventional analytics to AI, to align and tune decision models and make them more repeatable, understandable, and traceable. Data and analytics as the core business function: With the chaos of the pandemic and other disruptors, data and analytics are becoming more central to an organization’s success. Companies will have to prioritize data and analytics as core functions rather than as a secondary activity handled by IT.

Everything you need to know to land a job in data science

What does it take to get hired? Organizations are looking for job candidates with a bachelor's or master's degree in computer science, as well as experience with data modeling tools, XML, Python, Java, SQL, AWS and Hadoop. Many data scientist job descriptions also mention the ability to work with a distributed and fast-moving team. Interpreting data for colleagues in business units is increasingly important as well. Ryan Boyd, head of developer relations at Databricks, said that data science will soon be a commonplace skill outside engineering and IT departments as data becomes increasingly fundamental to businesses. "To stay competitive, data scientists need to be equally as obsessed with data storytelling as they are with the minutiae of data software and programs," said Boyd. "Tomorrow's best data scientists will be expected to translate their know-how into actionable insights and compelling stories for different stakeholders across the business, from C-suite executives to product managers." Whether you are looking for your first data science job or figuring out your next career move in the field, the following advice from hiring managers and data science professionals will help you plot a smart and successful course.

Observability and GitOps

Traditional monitoring methods have reached their limits with modern application architectures. Managing highly scalable and portable microservices requires tools adapted to make debugging and diagnosis possible at all times, which in turn requires observable systems. Monitoring and observability are often confused. Basically, the idea of a monitoring system is to get a state of the system based on a predefined set of metrics in order to detect a known set of issues. According to the SRE book by Google, a monitoring system needs to answer two simple questions: “What’s broken, and why?” Analyzing an application over the long term makes it possible to profile it, to better understand its behavior in response to external events and, thus, to be proactive in its management. Observability, on the other hand, measures how well a system's state can be understood from its multiple outputs. This means observability is a system capability, like reliability, scalability, or security, that must be designed and implemented during system design, coding, and testing.

Defending Against Web Scraping Attacks

Web scraping can easily lead to more significant attacks. At my company, we routinely use Web scraping as one of the initial steps in a red team or phishing engagement. By pulling the metadata from posted documents, we can find employee names and usernames and deduce username and email formats, which is particularly helpful when the username format would otherwise be difficult to guess. Mix this with scraping a list of current employees from sites like LinkedIn, and an adversary can perform targeted phishing and credential brute-force attacks. ... Scraping document metadata is also useful for detecting internal hostnames and software versions in use at the targeted company. This enables an attacker to customize the attack to exploit vulnerabilities specific to that company, and it is an important part of victim reconnaissance. Adversaries can also use scraping to collect gated information from a website if that information isn't properly protected. Take Facebook's password-reset page: Anyone can find privately listed people through a simple query with a phone number. While a password-reset page may be necessary, does it really need to confirm or, worse, return a user's private information?

From DevOps to MLOPS: Integrate Machine Learning Models using Jenkins and Docker

Continuous integration (CI) and continuous delivery (CD), together known as the CI/CD pipeline, embody a culture of agile operating principles and practices for DevOps teams. They allow software development teams to change code more frequently and reliably, and data scientists to continuously test models for accuracy. CI/CD is a way to focus on business requirements such as improved model accuracy, automated deployment steps, or code quality. Continuous integration is a set of practices that drive development teams to continuously implement small changes and check in code to version control repositories. Today, data scientists and IT ops have at their disposal different platforms (on premises, private and public cloud, multi-cloud …) and tools, which call for an automated integration and validation mechanism to build, package, and test applications with agility. Continuous delivery picks up where continuous integration ends, automating the delivery of applications to selected platforms.
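As a rough illustration of "continuously testing the models for accuracy", a CI step can compare a model's predictions against held-out labels and report failure when accuracy drops below a threshold. The sketch below is not taken from the article: the function names `accuracy` and `gate_exit_code` and the 0.9 threshold are assumptions made for the example.

```python
def accuracy(predictions, labels):
    """Fraction of predictions that match the expected labels."""
    if not labels or len(predictions) != len(labels):
        raise ValueError("predictions and labels must be non-empty and equal in length")
    correct = sum(p == y for p, y in zip(predictions, labels))
    return correct / len(labels)


def gate_exit_code(acc, threshold=0.9):
    """Return 0 (pass) or 1 (fail); the threshold is a hypothetical choice."""
    return 0 if acc >= threshold else 1


if __name__ == "__main__":
    # Toy data standing in for a real model's output on a validation set.
    preds = [1, 0, 1, 1]
    truth = [1, 0, 0, 1]
    acc = accuracy(preds, truth)
    print(f"accuracy={acc:.2f} exit_code={gate_exit_code(acc)}")
```

A pipeline script could pass the returned code to `sys.exit`, since CI servers such as Jenkins generally mark a shell step as failed when it exits with a non-zero status.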

Data Discovery for Business Intelligence

Any company that has had a BI tool for more than a year will deal with the dashboard clutter problem. Ad-hoc analysis, quarterly reports, and even core dashboards get outdated or change to a new version over time. The problem is, old dashboards usually don’t get deleted. No one wants to delete a dashboard in the shared folder because someone might be using it. This creates a long tail of clutter and inactive reports that people may poke around in, but they won’t be sure if the data is reliable or relevant. Navigating BI tools becomes its own tribal knowledge task, and it ends up being best to ask others to send you a specific link to open. Worse, someone may be relying on an outdated dashboard for their day-to-day operations. This often happens because dashboard metadata and its freshness aren’t tracked automatically. Connecting dashboard metadata with operational metrics such as the last successful report run, last edited time, and top users can give visibility into the health of the dashboard. By comparing usage data with operational metrics, outdated data models can easily be identified and cleaned out.
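The comparison of usage data with operational metrics described above can be sketched in a few lines. This is only an illustration under stated assumptions: the `DashboardStats` fields, the 90-day staleness window, and the three category names are hypothetical and do not come from any particular BI tool.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class DashboardStats:
    name: str
    last_successful_run: datetime  # operational metric
    last_edited: datetime          # operational metric
    views_last_30_days: int        # usage metric


def classify(d, now, max_age=timedelta(days=90)):
    """Label a dashboard as healthy, stale-in-use, or stale-unused."""
    # Treat a dashboard as fresh if it either ran or was edited recently.
    fresh = (now - d.last_successful_run) <= max_age or (now - d.last_edited) <= max_age
    if fresh:
        return "healthy"
    # Stale but still viewed: someone may be relying on outdated data.
    return "stale-in-use" if d.views_last_30_days > 0 else "stale-unused"


if __name__ == "__main__":
    now = datetime(2021, 5, 8)
    dashboards = [
        DashboardStats("weekly-sales", datetime(2021, 5, 7), datetime(2021, 3, 1), 40),
        DashboardStats("q3-2019-adhoc", datetime(2019, 10, 1), datetime(2019, 9, 15), 0),
        DashboardStats("ops-legacy", datetime(2020, 11, 1), datetime(2020, 6, 1), 12),
    ]
    for d in dashboards:
        print(d.name, classify(d, now))
```

Separating "stale-in-use" from "stale-unused" reflects the two cases in the text: clutter that can be cleaned out, and outdated dashboards that someone still depends on.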

Big data is the key to everything. Here are four ways to improve how you use it

While most companies want to focus on the exciting bits, it's the infrastructure that matters. "I think it's almost like a bamboo tree; unless your roots are strong, your tree won't shoot up 90 feet. So for me, the focus on roots is super important," he says. When the foundation is right, you can then start to explore some of the interesting elements of data. During the past 12 months, for example, KFC has strengthened its own digital channels in response to the coronavirus pandemic. Traffic to the web app increased significantly through 2020 as click-and-collect and curb-side pick-up became more popular. ... "When the grape is cut from the vineyard, you don't have much time to make the fermentation process because the grape is degrading in the truck. So we have to move fast," he says. With brands such as Casillero del Diablo and Don Melchor, Concha y Toro operates in over 140 countries, making it one of the biggest wine companies in the world. Data is especially important at harvest time, when the company brings trucks with grapes from different parts of Chile to its wineries.

Four Technologies Disrupting Banking

Blockchain, or distributed ledger technology, has the potential to radically change who has control over our personally identifiable information (PII) and make financial institutions — and online transactions — much more trustworthy. Blockchain can help prove a person’s identity, allowing consumers to create a verified, digital identity they can use with any online institution. By leveraging public key cryptography and referencing a person’s verified credentials on a trustworthy, shared log (the distributed ledger), blockchain can help give people control over their digital identity credentials. Consumers could keep their identity credentials safe and use them as cryptographic evidence whenever their bank or another online business needs to verify their identity. They could also revoke access at any time. A blockchain infrastructure across the internet would give consumers a portable identity to use in digital channels and true control over their PII disclosure. This can help stop fraudulent payment transactions. Currently, if a transaction is disputed as fraud, there are few ways for a business to prove it is legitimate, which results in billions of dollars in losses annually due to chargebacks.

Email security is a human issue

Humans will inevitably make mistakes when it comes to phishing emails, but it is possible to mitigate these risks by ensuring that cyber defense strategies are at the front and center of business processes, as well as integrated within company culture. This will ensure teams are made aware of potential threats before they run the risk of falling victim to them. IT teams are often expected to take sole responsibility for a company’s cybersecurity strategy, yet it is impossible for these experts to monitor the email activity of each employee. With human error cited as a contributing factor in 95% of breaches, it is important to remember that email security – alongside many other areas of cyber defense – is a human issue and each member of the team poses a significant risk. While IT professionals should take the lead by distributing relevant information about the latest phishing campaigns targeting their industry, it is also the responsibility of managerial staff to flag IT concerns in their team meetings and integrate cybersecurity issues into regular company updates. These discussions can be started by IT leaders, but the topic of cybersecurity must be discussed by each department in order to ensure phishing emails do not fly under the radar.

Key Metrics to Track and Drive Your Agile Devops Maturity

Agile software delivery is a complex process that can hide significant inefficiencies and bottlenecks. Fortunately, the process is easily measurable, as there is a rich digital footprint in the toolsets used across it: from pre-development through development, integration and deployment, and out into live software management. However, surfacing data from these myriad sources and synthesising meaningful metrics that compare ‘apples with apples’ across complex Agile delivery environments is very tricky. Hence software delivery metrics were much discussed but little used until the arrival of Value Stream Management and BI solutions, which enable the surfacing of accurate end-to-end software delivery metrics for the first time. ... Cycle Time is an ideal delivery metric for early-stage practitioners. It simply measures the time taken to develop an increment of software. Unlike the more comprehensive measure of Lead Time, Cycle Time is easier to measure, as it looks only at the time taken to pull a ticket from the backlog, then code and test it, in preparation for integration and deployment to live.
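As a minimal sketch of the Cycle Time definition above, the metric can be computed from two timestamps per ticket: when it was pulled from the backlog and when coding and testing finished. The function names and the sample timestamps are assumptions for illustration; a real setup would read these values from its own ticketing tool.

```python
from datetime import datetime
from statistics import median


def cycle_time_days(started, finished):
    """Days between pulling a ticket from the backlog and finishing its tests."""
    if finished < started:
        raise ValueError("finished must not precede started")
    return (finished - started).total_seconds() / 86400.0


def median_cycle_time(tickets):
    """Median cycle time in days over (started, finished) pairs."""
    return median(cycle_time_days(s, f) for s, f in tickets)


if __name__ == "__main__":
    # Hypothetical tickets: (pulled from backlog, coded and tested).
    tickets = [
        (datetime(2021, 5, 3, 9), datetime(2021, 5, 5, 9)),
        (datetime(2021, 5, 3, 9), datetime(2021, 5, 4, 21)),
        (datetime(2021, 5, 4, 9), datetime(2021, 5, 8, 9)),
    ]
    print(median_cycle_time(tickets))
```

The median is used here rather than the mean so that a single long-running ticket does not dominate the figure; that is a design choice of this sketch, not something the excerpt prescribes.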

Quote for the day:

"The litmus test for our success as Leaders is not how many people we are leading, but how many we are transforming into leaders" -- Kayode Fayemi
