November 14, 2015

How Cloud Computing Changes Storage Tiering

New challenges in controlling data traversing the data center and the cloud have emerged. How do we handle replication? How do we ensure data integrity? How do we optimally utilize storage space within our cloud model? The challenge is translating the storage-efficiency technology that’s already been created for the data center (things like deduplication, thin provisioning, and data tiering) to the cloud. ... cloud computing adds an extra tier. “Cloud introduces another storage tier which, for example, allows for moving data to an off-premise location for archival, backups, or the elimination of off-site infrastructure for disaster recovery,” he said. “This, when combined with a virtual DR data center, can create a very robust cloud-ready data tier.”

3 Tips for Managing a Boss You Don't Even Like

“You don’t have to love your boss, but you need to be able to work well with them. One of the main reasons employees leave their job is their boss. A troubled relationship with your boss can negatively affect your morale, your productivity, your happiness, and of course, your career. A positive relationship can improve your morale, productivity and happiness, which could lead to more career success in the form of promotions, raises and higher self-esteem.” Everyone can contribute to workplace happiness, which means that, all too often, no one does.

Big Data and Social Listening

The purposes behind social listening are simple: to extract unsolicited opinion, to gather real-world case studies, and to examine sentiment about products and services. For example, a company that makes smartwatches wants to know what its competition has done correctly, where the flaws are in its products, and how consumers feel that the products could be improved upon. The company wants to gather opinions, ratings, reviews, and sentiments about competitors’ smartwatches before investing time and money into a product that has no advantages over current offerings. To gather this information manually would put the company out of the proper release cycle and perhaps make the product obsolete by the time it debuts.

Service design thinking

This move towards service design thinking is not even necessarily an adjunct of the digital movement - better service design need not involve technology at all, but instead brings a long-absent focus on the users of public services and their needs in place of the internal imperatives of the providing organisations. Pioneering initiatives from "The Public Office" in 2007 to the joint seminar series on "Innovating Through Design in Public Services" hosted by the London School of Economics Public Policy Group and the Design Council in 2010-11 highlighted some of the art of the possible. At last, some of that thinking finally seems to be entering the mainstream.

You Don’t Have to Choose Between ‘Big Data’ and ‘IoT’

With the technologies that are at our disposal today, we can attempt way more than we were able to in the past. In fact, we now can uncover problems that we didn’t know we had. The key is to examine each process that makes a difference to the business and ask questions that challenge the status quo – it is important to imagine new ways of doing current tasks and to ponder the possibility of doing things we always wished we could do but didn’t have a way of doing. To those out there who are trying to decide where to spend their IT budget, don’t get trapped in the mindset that you have to consider a “big data” or an “IoT” initiative. Make the call solely on the merits of the problem at hand and make a commitment to using the most capable technology platform out there.

Agile, DevOps and Cloud, 1 + 1 + 1 can equal more than 3

Where agile breaks the barriers in development, DevOps (development and operations) integrates operations. It industrializes the process of creating software and gets it into production. Webopedia defines DevOps as a phrase in enterprise software development used to mean a type of agile relationship between development and IT operations. The goal of DevOps is to change and improve the relationship by advocating better communication and collaboration between the two business units. ... Cloud enables the developer to provision his development environment at the touch of a button. When he logs on, he’s provided with a number of development environment options. He chooses the one most applicable for the job and has it provisioned quickly. His advantage is that he doesn’t need to construct his development environment any further.

The seven people you need on your data team

You’ve been asked to start a data team to extract valuable customer insights from your product usage, improve your company’s marketing effectiveness, or make your boss look all “data-savvy” (hopefully not just the last one of these). And even better, you’ve been given carte blanche to go hire the best people! But now the panic sets in – who do you hire? Here’s a handy guide to the seven people you absolutely have to have on your data team. ... The one I have in mind is a team that takes raw data from various sources and turns it into valuable insights that can be shared broadly across the organization. This team needs to understand both the technologies used to manage data, and the meaning of the data – a pretty challenging remit, and one that needs a pretty well-balanced team to execute.

Building a Recommendation Engine with Spark ML on Amazon EMR using Zeppelin

Spark is commonly used for iterative machine learning algorithms at scale. Furthermore, Spark includes a library with common machine learning algorithms, MLlib, which can be easily leveraged in a Spark application. For an example, see the "Large-Scale Machine Learning with Spark on Amazon EMR" post on the AWS Big Data Blog. Spark succeeds where the traditional MapReduce approach fails, making it easy to develop and execute iterative algorithms. Many ML algorithms are based on iterative optimization, which makes Spark a great platform for implementing them. Other open-source alternatives for building ML models are either relatively slow, such as Mahout using Hadoop MapReduce, or limited in their scale, such as Weka or R.
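
To make "iterative optimization" concrete, here is a minimal sketch of a toy recommender trained by repeated passes over the same ratings. It is plain Python, not Spark or MLlib code, and it uses stochastic gradient descent on a small matrix factorization rather than whatever algorithm the linked post uses. The function names, the six sample ratings, and the hyperparameters are all made up for illustration.

```python
import random


def predict(user_vec, item_vec):
    """Predicted rating: dot product of a user vector and an item vector."""
    return sum(a * b for a, b in zip(user_vec, item_vec))


def rmse(ratings, users, items):
    """Root-mean-square error of the model on (user, item, rating) triples."""
    squared = [(r - predict(users[u], items[i])) ** 2 for u, i, r in ratings]
    return (sum(squared) / len(squared)) ** 0.5


def factorize(ratings, n_users, n_items, rank=2, epochs=200,
              lr=0.02, reg=0.02, seed=0):
    """Toy matrix factorization (hypothetical helper, not an MLlib API)."""
    rng = random.Random(seed)
    users = [[rng.uniform(0.1, 1.0) for _ in range(rank)] for _ in range(n_users)]
    items = [[rng.uniform(0.1, 1.0) for _ in range(rank)] for _ in range(n_items)]
    for _ in range(epochs):
        # Each epoch re-reads the same ratings: this is the iterative part.
        for u, i, r in ratings:
            err = r - predict(users[u], items[i])
            for k in range(rank):
                pu, qi = users[u][k], items[i][k]
                users[u][k] += lr * (err * qi - reg * pu)
                items[i][k] += lr * (err * pu - reg * qi)
    return users, items


# Made-up sample data: (user index, item index, rating).
ratings = [(0, 0, 5.0), (0, 1, 3.0), (1, 0, 4.0),
           (1, 2, 1.0), (2, 1, 2.0), (2, 2, 5.0)]
untrained = factorize(ratings, 3, 3, epochs=0)
trained = factorize(ratings, 3, 3, epochs=200)
print("RMSE before training:", rmse(ratings, *untrained))
print("RMSE after training: ", rmse(ratings, *trained))
```

Every epoch touches the full ratings list again. A framework that re-reads its input from disk on each pass pays that cost hundreds of times, which is the gap the excerpt says Spark closes.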

Microsoft open sources Distributed Machine Learning Toolkit

The toolkit, available now on GitHub, is designed for distributed machine learning -- using multiple computers in parallel to solve a complex problem. It contains a parameter server-based programming framework, which makes machine learning tasks on big data highly scalable, efficient and flexible. It also contains two distributed machine learning algorithms, which can be used to train the fastest and largest topic model and the largest word-embedding model in the world. The toolkit offers rich and easy-to-use APIs to lower the barrier to distributed machine learning, so researchers and developers can focus on core machine learning tasks like data, model and training.
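
For readers new to the term, a parameter server can be sketched roughly as follows: one component holds the shared model weights, and workers repeatedly pull the weights, compute a gradient on their own shard of the data, and push the update back. This is a single-process toy in plain Python and does not use the toolkit's actual API. The class, the function names, and the sample data are assumptions for illustration only.

```python
class ParameterServer:
    """Holds the shared weights (hypothetical class, not the toolkit's API)."""

    def __init__(self, dim, lr):
        self.weights = [0.0] * dim
        self.lr = lr

    def pull(self):
        # Workers receive a copy of the current weights.
        return list(self.weights)

    def push(self, grad):
        # Apply one gradient-descent step with a worker's gradient.
        for k, g in enumerate(grad):
            self.weights[k] -= self.lr * g


def worker_gradient(weights, shard):
    """Mean squared-error gradient for y ~ w0 + w1 * x on one data shard."""
    g0 = g1 = 0.0
    for x, y in shard:
        err = weights[0] + weights[1] * x - y
        g0 += 2 * err
        g1 += 2 * err * x
    n = len(shard)
    return [g0 / n, g1 / n]


def train(shards, rounds=500, lr=0.05):
    server = ParameterServer(dim=2, lr=lr)
    for _ in range(rounds):
        # In a real deployment each shard would live on a separate machine;
        # here the "workers" simply take turns in one process.
        for shard in shards:
            current = server.pull()
            server.push(worker_gradient(current, shard))
    return server.weights


# Made-up noise-free data generated from y = 1 + 2x, split across two workers.
data = [(x, 1.0 + 2.0 * x) for x in [0.0, 0.5, 1.0, 1.5, 2.0, 2.5]]
shards = [data[0::2], data[1::2]]
weights = train(shards)
print("learned weights:", weights)
```

The workers never talk to each other, only to the server, so adding machines means adding shards. A production system also has to handle asynchronous updates, network cost, and failures, none of which this sketch attempts.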

The Biggest Misconception About Information Security

Technologies are used to protect information and ensure its confidentiality, integrity and availability. But according to a recent survey by the Ponemon Institute on “Risk & Innovation in Cybersecurity Investments,” 90% of respondents said their organization invested in a technology that was ultimately discontinued or scrapped before or soon after deployment. In other words, these technology investments become “shelfware,” which means they sit on the shelf instead of being properly implemented or utilized. There are a variety of reasons for this shelfware phenomenon, but they predominantly boil down to people and process issues. Some organizations lack the resources to properly staff and support their technologies, a problem that many are able to solve through the use of Managed Security Services.

Quote for the day:

"Strength comes from overcoming adversity, not avoiding it." -- Gordon Tredgold

Tech Bytes - Daily Digest
