ML systems need two components to be able to do that, Huyen notes. They need fast inference, i.e. models that can make predictions on the order of milliseconds. And they also need real-time pipelines, i.e. pipelines that can process data, feed it into models, and return a prediction in real time. To achieve faster inference, Huyen goes on to add, models can be made faster, they can be made smaller, or hardware can be made faster. The focus on inference, TinyML, and AI chips that we've been covering in this column aligns well with this, and naturally, these approaches are not mutually exclusive either. Huyen also embarked on an analysis of streaming fundamentals and frameworks, something that has also seen wide coverage in this column from early on. Many companies are switching from batch processing to stream processing, and from request-driven to event-driven architecture, a shift tied to the popularity of frameworks such as Apache Kafka and Apache Flink. This change is still slow in the US but much faster in China, Huyen notes.
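The event-driven pattern Huyen describes can be sketched in miniature: instead of a request/response cycle, a consumer loop scores events as they arrive on a stream. This is a toy, stdlib-only sketch — the queue stands in for a Kafka topic, and the "model" and event fields are hypothetical stand-ins, not any real system's API.

```python
# Toy event-driven inference loop: a queue stands in for an event stream
# (e.g. a Kafka topic); events are scored as they arrive, not on request.
import queue
import time

events = queue.Queue()

def model_predict(features):
    # Hypothetical model: a trivial weighted sum standing in for a
    # millisecond-latency model.
    return sum(f * w for f, w in zip(features, (0.5, 0.3, 0.2)))

def run_pipeline(stop_marker=None):
    """Consume events as they arrive, score them, and record latency."""
    predictions = []
    while True:
        event = events.get()          # blocks until an event arrives
        if event is stop_marker:
            break
        start = time.perf_counter()
        score = model_predict(event["features"])
        latency_ms = (time.perf_counter() - start) * 1000
        predictions.append((event["id"], score, latency_ms))
    return predictions

# Feed a few events, then a stop marker, and drain the stream.
for i in range(3):
    events.put({"id": i, "features": [1.0, 2.0, 3.0]})
events.put(None)
results = run_pipeline()
print(len(results))  # one prediction per event, produced as each arrived
```

In a real deployment the consumer would subscribe to a broker and the prediction would be published back onto another topic, but the shape of the loop is the same.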
While the industry frets over this counterrevolution of sorts, crypto insiders who report fraud and illegal activity to the government could see significant upside. Regulators, such as the SEC, the CFTC, the Financial Crimes Enforcement Network, and the Internal Revenue Service, need whistleblowers who can provide an inside look at the operations of a company or industry segment, helping regulators identify fraud and illegal activities well before wrongdoers irreparably injure investors, customers and the public. Information from insiders can also help regulators target their enforcement actions and rulemaking at the worst actors in the space, which can help prevent regulators from unnecessarily quashing innovative and valuable aspects of the cryptocurrency industry. In exchange for this information, whistleblowers can earn awards under various federal whistleblower rewards programs, provided the whistleblower properly filed a tip that contributed to a qualifying enforcement action. In the case of the SEC and CFTC programs, and now the newly enhanced AML whistleblower program, a whistleblower can receive an award of up to 30% of the monetary sanctions collected in an enforcement action of more than $1 million.
Everyone has data pipelines composed of lots of different systems. Some may even look very sophisticated on the surface, but the reality is there’s lots of complexity to them––and maybe unnecessarily so. Between the plumbing work to connect different components, the constant performance monitoring required, and the large team with unique expertise needed to run, debug and manage them, all these factors can add time-to-market delays and operational overhead for product teams. And that’s not all. The more systems you use, the more places you are duplicating your data, which increases the chances of data going out of sync or stale. Further, since components may be developed independently by different companies, upgrades or bug fixes might break your pipeline and data layer. ... Variables such as data format, schema and protocol add up to what’s called the “transformation overhead.” Other variables like performance, durability and scalability add up to what’s called the “pipeline overhead.” Put together, these classifications contribute to what’s known as the “impedance mismatch.”
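The "transformation overhead" and the staleness risk described above can be made concrete with a tiny sketch. The two "systems" here and their field names are hypothetical; the point is only that every boundary between systems needs glue code, and every copy of the data can drift from the original.

```python
# Minimal illustration of transformation overhead: the same logical record,
# held by two hypothetical systems that disagree on format and schema,
# needs conversion code at the boundary.
import json

# System A: a JSON document with camelCase fields.
record_a = json.dumps({"userId": 42, "eventTime": "2022-02-01T12:00:00Z"})

# System B expects a dict with snake_case keys.
def a_to_b(doc: str) -> dict:
    src = json.loads(doc)
    return {"user_id": src["userId"], "event_time": src["eventTime"]}

record_b = a_to_b(record_a)

# The duplication risk: the copy in B can silently drift from A.
record_b["event_time"] = "2022-02-01T12:05:00Z"  # B updated, A not
stale = json.loads(record_a)["eventTime"] != record_b["event_time"]
print(stale)  # the two copies are now out of sync
```

Multiply this by every pair of components in a pipeline and the "impedance mismatch" becomes a real maintenance cost, not just a metaphor.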
Under this new definition, decentralized exchanges such as Uniswap would be subject to SEC regulations and would therefore need to register with the SEC as securities brokers. As decentralized exchanges have no way of complying with the current demands placed on securities exchanges by the SEC, the new legislation would effectively kill decentralized exchanges operating within the United States. DeFi enthusiast Gabriel Shapiro highlighted the potentially devastating effects of the proposal in a blog post, noting that “because the proposal achieves this expansion by providing new restraints on ‘communication protocols,’ I believe it may also be unconstitutional as a restraint on free speech,” taking a strong stance against the proposed changes. He also suggested that under the new definition, the SEC could class block explorers, such as Etherscan, as securities exchanges because they allow users to interact with smart contracts to communicate trading interests. Shapiro is not the only prominent figure to come out against the SEC’s proposed legislation.
In many businesses, when an employee moves to a new job, all that’s left behind is a digital shadow. Their knowledge, expertise and experience disappear, and new hires and old colleagues alike struggle to fill the gaps. A trail of data breadcrumbs that leads nowhere — old messages, outdated docs and dusty email chains — is often all busy ex-teammates are left to rely on. As a result, business productivity suffers. Of course, this isn’t the fault of the person who has moved roles. Their expertise belongs to them, and too often, organizations undervalue that expertise, further fuelling resignations. It’s in the hands of businesses to do more to retain business-critical knowledge and smooth the transition for new teammates. Nobody should have to rely on guesswork from day one. And if they are, chances are they too won’t stick around for long. To overcome these challenges, we need to think innovatively and start optimizing our tech stacks to reduce knowledge drain and fast-track problem-solving. The solution isn’t more collaboration or communication apps.
The yearlong investigation by Bergman and Mazzetti also alleges that a group of Israeli computer engineers arrived at a New Jersey building used by the bureau in June 2019 and started testing their equipment. The report alleges that the FBI had bought a version of Pegasus, NSO’s premier spying tool. "For nearly a decade, the Israeli firm had been selling its surveillance software on a subscription basis to law-enforcement and intelligence agencies around the world, promising that it could do what no one else - not a private company, not even a state intelligence service - could do: consistently and reliably crack the encrypted communications of any iPhone or Android smartphone," says the NYT report. As part of their training on the tool, bureau employees bought new smartphones with SIM cards from other countries. The version of Pegasus that the FBI bought was zero-click, i.e. it did not require users to click on a malicious attachment or link - so the owners of the phones being monitored would see no evidence of an ongoing breach.
Keeping software updated is key to applying both these rules, and unfortunately that’s often a problem for enterprises. Desktop software, particularly with WFH, is always a challenge to update, but a combination of centralized software management and a scheduled review of software versions on home systems can help. For operations tools, don’t be tempted to skip version updates in open-source tools just because new releases arrive frequently. It’s smart to include a version review of critical operations software as part of your overall program of software management and take a close look at new versions at least every six months. Even with all of this, it’s unrealistic to assume that an enterprise can anticipate all the possible threats posed by all the possible bad actors. Preventing disease is best, but treating it once symptoms arise is essential, too. The most underused security principle is that preventing bad behavior means understanding good behavior. Whatever the source of a security problem, it almost always means that something is doing something it shouldn’t be. How can we know that? By watching for different patterns of behavior.
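The "understand good behavior to catch bad behavior" principle above is essentially baseline-based anomaly detection. A minimal sketch, with entirely hypothetical process names and events — a real system would baseline syscalls, network flows, or API calls, not hand-written tuples:

```python
# Sketch of behavior baselining: record what normal activity looks like,
# then flag anything that falls outside that profile.
baseline = {
    ("backup-agent", "read", "/var/data"),
    ("backup-agent", "write", "/backups"),
    ("web-server", "read", "/var/www"),
}

def audit(events):
    """Return events that deviate from the known-good baseline."""
    return [e for e in events if e not in baseline]

observed = [
    ("web-server", "read", "/var/www"),       # matches the profile
    ("web-server", "write", "/etc/passwd"),   # something it shouldn't do
]
anomalies = audit(observed)
print(anomalies)  # only the out-of-profile write is flagged
```

The design point is that this catches "something doing something it shouldn’t" regardless of which vulnerability or actor caused it, which is exactly why it complements, rather than replaces, patching.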
The ELT steps can seem simple enough on the surface, but with a lot of moving parts, an increasing number of sources and increasing ways to use the data, a lot can go wrong. Data engineers need to contend with complex scheduling requirements, creating dependencies between tasks, figuring out what can run in parallel and what needs to run in series, what makes for a successful task run, how to checkpoint tasks and handle failures and restarts, how to check data quality, how and whom to alert on failures -- all the stuff Airflow was designed to handle. The cloud only makes that process more complicated, with cloud buckets used to stage data from sources before loading that data into cloud-based distributed data management systems like Snowflake, Google Cloud Platform or Databricks. And here’s what I think is important: For many organizations, making the leap from exploratory data analysis [EDA] to formalizing what’s found into data pipelines has become increasingly valuable.
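The orchestration concerns listed above — series dependencies, retries, checkpointing, alerting on exhausted failures — can be sketched in a few dozen lines. This is a toy, stdlib-only illustration of those concerns, not Airflow's API, and the extract/transform/load tasks are hypothetical:

```python
# Toy orchestrator illustrating what a scheduler like Airflow handles:
# dependency order, retries on transient failure, checkpointing of
# completed tasks, and escalation when retries are exhausted.
completed = set()   # checkpoint: tasks that already succeeded
run_log = []        # audit trail of attempts

def run_task(name, fn, retries=2):
    if name in completed:            # checkpoint hit: skip on restart
        return
    for attempt in range(retries + 1):
        try:
            fn()
            completed.add(name)
            run_log.append((name, attempt, "ok"))
            return
        except Exception:
            run_log.append((name, attempt, "fail"))
    # All retries exhausted: this is where alerting would fire.
    raise RuntimeError(f"task {name} exhausted retries; alerting on-call")

# extract -> transform -> load must run in series.
flaky_calls = {"n": 0}
def extract():
    pass
def transform():
    flaky_calls["n"] += 1
    if flaky_calls["n"] == 1:        # fail once to exercise the retry path
        raise IOError("transient failure")
def load():
    pass

for name, fn in [("extract", extract), ("transform", transform), ("load", load)]:
    run_task(name, fn)

print(completed)  # all three tasks checkpointed despite a transient failure
```

Real schedulers add the parts deliberately omitted here — parallel branches, data-quality checks between steps, and persistent state so checkpoints survive a process restart — but the control flow is the same.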
Ironically, the decentralized markets selling NFTs are starting to centralize around one or two providers. One of the most popular, OpenSea, has a full takedown team dedicated to situations like York’s or Quinni’s. The company has taken off, reaching a stratospheric $13 billion valuation after a $300 million round in early January. The company is far and away the biggest player in the NFT market, with an estimated 1.26 million active users and over 80 million NFTs. According to DappRadar, the platform took in $3.27 billion in transactions in the last 30 days and managed 2.33 million transactions. Its nearest competitor, Rarible, saw $14.92 million in transactions in the same period. ... Interestingly, the company also seems to be cracking down on deepfakes or, as OpenSea calls them, non-consensual intimate imagery (NCII), a problem that hasn’t surfaced widely yet but could become pernicious for influencers and media stars. “We have a zero-tolerance policy for NCII,” they said. “NFTs using NCII or similar images (including images doctored to look like someone that they are not) are prohibited, and we move quickly to ban accounts that post this material.”
The benefits of a decentralized network are varied, but because transactions don’t have to go through a “trusted party,” nobody has to know or trust anyone else. Every person in the network has a copy of the distributed ledger which contains the exact same data. If a person’s ledger is altered or corrupted, it will be rejected by the other members of the network. One of the cons of a decentralized network is that the more members a network has, the slower it tends to be. In decentralized blockchain systems, unlike distributed systems, security is prioritized over performance. When a blockchain network scales up or out, the network becomes more secure, but performance slows down. This is because every member node has to validate all of the data that is being added to the ledger. “Most references place blockchain squarely in the realm of currencies or finances, but the applicability is far greater,” said Perella. “When the world wide web came about, most websites were maintained by individuals or groups hosting their own systems and data. This format would eventually become known as Web 1.0.
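Why an altered copy of the ledger gets rejected follows from how entries are chained: each one commits to the hash of the one before it, so tampering with any record breaks every link after it. A minimal sketch of that validation step (the record contents are hypothetical, and real blockchains add consensus, signatures, and proof-of-work on top):

```python
# Minimal hash-chained ledger: each entry commits to the previous entry's
# hash, so a corrupted copy fails validation and gets rejected by peers.
import hashlib

def entry_hash(prev_hash, data):
    return hashlib.sha256((prev_hash + data).encode()).hexdigest()

def build_ledger(records):
    ledger, prev = [], "genesis"
    for data in records:
        h = entry_hash(prev, data)
        ledger.append({"data": data, "prev": prev, "hash": h})
        prev = h
    return ledger

def validate(ledger):
    """What each member node does: re-check every link in the chain."""
    prev = "genesis"
    for entry in ledger:
        if entry["prev"] != prev or entry_hash(prev, entry["data"]) != entry["hash"]:
            return False  # corrupted copy: other nodes reject it
        prev = entry["hash"]
    return True

ledger = build_ledger(["alice pays bob 5", "bob pays carol 2"])
print(validate(ledger))       # honest copy passes

ledger[0]["data"] = "alice pays mallory 500"   # tamper with one record
print(validate(ledger))       # tampered copy fails and would be rejected
```

This full re-validation by every node is also the performance cost the excerpt describes: more members and more data mean more of this checking everywhere.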
Quote for the day:
Integrity is the soul of leadership! Trust is the engine of leadership! - Amine A. Ayad