Machine Learning, Cognitive Computing, and the Storage Industry

Like my recent posts about object storage and software-defined storage, this topic interested me enough to do a bit of research, both on the subject in general and on how it relates to the industry I work in.  I discovered that there is a wealth of information on Machine Learning, Cognitive Computing, Artificial Intelligence, and Neural Networking, so much that writing a summary is difficult.  Well, here’s my attempt.

There is pressure in the enterprise software space to incorporate new technologies in order to keep up with the needs of modern businesses. As we move farther into 2017, I believe we are approaching another turning point in technology where many concepts that were previously limited to academic research or industry niches are now being considered for actual mainstream enterprise software applications.  I believe you’ll see Machine learning and cognitive systems becoming more and more visible in the coming years in the enterprise storage space. For the storage industry, this is very good news. As this technology takes off, it will result in the need to retain massive amounts of unstructured data in order to train the cognitive systems. Once machines can learn for themselves, they will collect and generate a huge amount of data to be stored, intelligently categorized and subsequently analyzed.

The standard joke about artificial intelligence (or machine learning in general) is that, like nuclear fusion, it has been the future for more than half a century now.  My goal in this post is to define the concepts, look at ways this technology has already been implemented, look at how it affects the storage industry, and investigate use cases for this technology.  I’m writing this paragraph before I start, so we’ll see how that goes. 🙂

What is Cognitive Computing?

Cognitive computing is the simulation of human thought processes using computerized modeling (the best-known example is probably IBM’s Watson). It incorporates self-learning systems that use data mining, pattern recognition and natural language processing to imitate the way our brains process thoughts. The goal of cognitive computing is to create automated IT systems that are capable of solving problems without requiring human assistance.

This sounds like the stuff of science fiction, right? HAL (from the movie “2001: A Space Odyssey”) came to the logical conclusion that his crew had to be eliminated. It’s my hope that intelligent storage arrays utilizing cognitive computing will come to the conclusion that 99.9 percent of stored data has no value and therefore should be deleted.  It would eliminate the need for me to build my case for archiving year after year. 🙂

Cognitive computing systems work by using machine learning algorithms; the two are inescapably linked. They continuously gather knowledge by mining the data fed into them for information. The systems progressively refine the methods they use to look for and process data until they become capable of anticipating new problems and modeling possible solutions.

Cognitive computing is a new field that is just beginning to emerge. It’s about making computers more user friendly with an interface that understands more of what the user wants. It takes signals about what the user is trying to do and provides an appropriate response. Siri, for example, can answer questions but also understands the context of the question. She can ascertain whether the user is in a car or at home, moving quickly and therefore driving, or moving more slowly while walking. This information contextualizes the potential range of responses, allowing for increased personalization.

What Is Machine Learning?

Machine Learning is a subset of the larger discipline of Artificial Intelligence, which involves the design and creation of systems that are able to learn based on the data they collect. A machine learning system learns by experience. After specific training, the system can generalize from the cases it has been exposed to and then act appropriately when new or unforeseen events occur. Amazon already uses this technology as part of its recommendation engine. It’s also commonly used by ad feed systems that provide ads based on web surfing history.

Machine learning is a tremendously powerful tool for extracting information from data, but it’s not a silver bullet for every problem. The questions must be framed and presented in a way that allows the learning algorithms to answer them. The data also needs to be set up in the appropriate way, which can add complexity. Sometimes the data needed to answer the questions may not be available. Once the results are available, they also need to be interpreted to be useful, and it’s essential to understand the context. A sales algorithm can tell a salesman what’s working the best, but he still needs to know how to best use that information to increase his profits.

What’s the difference?

Without cognition there cannot be good Artificial Intelligence, and without Artificial Intelligence cognition can never be expressed. Cognitive computing involves self-learning systems that use pattern recognition and natural language processing to mimic the way the human brain works. The goal of cognitive computing is to create automated systems that are capable of solving problems without requiring human assistance. Cognitive computing is used in A.I. applications, hence Cognitive Computing is also a subset of Artificial Intelligence.

If you think this is a slew of terms that all mean almost the same thing, you’d be right. Cognitive Computing and Machine Learning can both be considered subsets of Artificial Intelligence. What’s the difference between artificial intelligence and cognitive computing? Let’s use a medical example. In an artificial intelligence system, machine learning would tell the doctor which course of action to take based on its analysis. In cognitive computing, the system would provide information to help the doctor decide, quite possibly with a natural language response (like IBM’s Watson).

In general, Cognitive computing systems include the following ostensible characteristics:

  • Machine Learning
  • Natural Language Processing
  • Adaptive algorithms
  • Highly developed pattern recognition
  • Neural Networking
  • Semantic understanding
  • Deep learning (Advanced Machine Learning)

How is Machine Learning currently visible in our everyday lives?

Machine Learning has fundamentally changed the ways in which businesses relate to their customers. When you click “like” on a Facebook post, your feed is dynamically adjusted to contain more content like that in the future. When you buy a Sony PlayStation on Amazon, and it recommends that you also buy an extra controller and a top selling game for the console, that’s their recommendation engine at work. Both of those examples use machine learning technology, and both affect most people’s everyday lives. Machine learning technology delivers educated recommendations to people to help them make decisions in a world of almost endless choices.
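
To make the “bought this, so you might want that” idea concrete, here is a minimal Python sketch that simply counts which items show up together in past orders. It is only an illustration under stated assumptions: the order history and the function names (`build_co_purchase_counts`, `recommend`) are made up for this example, and the real recommendation engines at Amazon or Facebook are far more sophisticated than a co-occurrence count.

```python
from collections import Counter, defaultdict
from itertools import permutations


def build_co_purchase_counts(orders):
    """Count how often each pair of items appears in the same order."""
    counts = defaultdict(Counter)
    for order in orders:
        # set() so a duplicate item within one order is only counted once
        for item, other in permutations(set(order), 2):
            counts[item][other] += 1
    return counts


def recommend(counts, item, top_n=2):
    """Return up to top_n items most often bought together with `item`."""
    # Sort by count (descending), then by name so ties are deterministic
    ranked = sorted(counts[item].items(), key=lambda kv: (-kv[1], kv[0]))
    return [name for name, _ in ranked[:top_n]]


# Hypothetical order history, invented purely for illustration
orders = [
    ["console", "controller", "game"],
    ["console", "controller"],
    ["console", "game", "headset"],
    ["controller", "charging dock"],
    ["console", "controller", "headset"],
]

counts = build_co_purchase_counts(orders)
print(recommend(counts, "console"))
```

With this made-up history, the controller comes back first for the console because the two appear together in the most orders. The more order history the counts are built from, the better the suggestions tend to get, which is the "learning from data" part in miniature.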

Practical business applications of Cognitive Computing and Machine Learning

Now that we have a pretty good idea of what this all means, how is this technology actually being used today in the business world? Artificial Intelligence has been around for decades, but has been slow to develop due to the storage and compute requirements being too expensive to allow for practical applications. In many fields, machine learning is finally moving from science labs to commercial and business applications. With cloud computing and robust virtualized storage solutions providing the infrastructure and necessary computational power, machine learning developments are offering new capabilities that can greatly enhance enterprise business processes.

The major approaches today include using neural networks, case-based learning, genetic algorithms, rule induction, and analytic learning. The current uses of the technology combine all of these analytic methods, or a hybrid of them, to help guarantee effective, repeatable, and reliable results. Machine learning is a reality today and is being used very effectively and efficiently. Despite what many business people might assume, it’s no longer in its infancy. It’s used quite effectively across a wide array of industry applications and is going to be part of the next evolution of enterprise intelligence business offerings.

There are many other areas where machine learning can play an important role. This is most notable in systems with so much complexity that algorithms are difficult to design, when an application requires the software to adapt to an operational environment, or with applications that need to work with large and complex data sets. In those scenarios, machine learning methods play an increasing role in enterprise software applications, especially for those types of applications that need in-depth data analysis and adaptability like analytics, business intelligence, and big data.

Now that I’ve discussed some general business applications for the technology, I’ll dive in to how this technology is being used today, or is in development and will be in use in the very near future.

  1. Healthcare and Medicine. Computers will never completely replace doctors and nurses, but in many ways machine learning is transforming the healthcare industry. It’s improving patient outcomes and in general changing the way doctors think about how they provide quality care. Machine learning is being implemented in health care in many ways: Improving diagnostic capabilities, medicinal research (medicines are being developed that are genetically tailored to a person’s DNA), predictive analytics tools to provide accurate insights and predictions related to symptoms, diagnoses, procedures, and medications for individual patients or patient groups, and it’s just beginning to scratch the surface of personalized care. Healthcare and personal fitness devices connected via the Internet of Things (IoT) can also be used to collect data on human and machine behavior and interaction. Improving quality of life and people’s health is one of the most exciting use cases of Machine Learning technologies.
  2. Financial services. Machine Learning is being used for predicting credit card risk, managing an individual’s finances, flagging criminal activity like money laundering and fraud, as well as automating business processes like call centers and insurance claim processing with trained AI agents. Product recommendation systems for a financial advisor or broker must leverage current interests, trends, and market movements for long periods of time, and ML is well suited to that task.
  3. Automating business analysis, reporting, and work processes. Machine learning automation systems that use detailed statistical analysis to process, analyze, categorize, and report on their data exist today. Machine learning techniques can be used for data analysis and pattern discovery and can play an important role in the development of data mining applications. Machine learning is enabling companies to increase growth and optimize processes, increase customer satisfaction, and improve employee engagement. As one specific example, adaptive analytics can be used to help stop customers from abandoning a website by predicting the first signs that they might leave and causing live chat assistance windows to appear. Adaptive analytics are also good at upselling by showing customers the most relevant products based on their shopping behavior at that moment. A large portion of Amazon’s sales are based on their adaptive analytics; you’ll notice that you always see “Customers who purchased this item also viewed” when you view an item on their web site. Businesses are presently using machine learning to improve their operations in many other ways. Machine learning technology allows businesses to personalize customer service, for example with chatbots for customer relations. Customer loyalty and retention can be improved by mining customer actions and targeting their behavior. HR departments can improve their hiring processes by using ML to shortlist candidates. Security departments can use ML to assist with detecting fraud by building models based on historical transactions and social media. Logistics departments can improve their processes by allowing contextual analysis of their supply chain. The possibilities for applying this technology across many typical business challenges are truly exciting.
  4. Playing Games. Machine learning systems have been taught to play games, and I’m not just talking about video games. Computers have taken on board games like Go and chess, IBM’s Watson has competed on Jeopardy!, and systems have been trained on modern real time strategy video games, all with great success. Watson defeated Brad Rutter and Ken Jennings in the Jeopardy! Challenge of February 2011, showcasing its ability to learn, reason, and understand natural language with machine learning technology. In game development, machine learning has been used for gesture recognition in Kinect and camera based interfaces, and it has also been used in some fighting style games to analyze the human player’s moves and mimic them, such as the character ‘Mokujin’ in Tekken.
  5. Predicting the outcome of legal proceedings. A system developed by a team of British and American researchers was proven to be able to correctly predict a court’s decision with a high degree of accuracy. The study can be viewed here: https://peerj.com/articles/cs-93/. While computers are not likely to replace judges and lawyers, the technology could very effectively be used to assist the decision making process.
  6. Validating and Customizing News content. Machine learning can be used to create individually personalized news, and screening and filtering out “fake news” has been a more recent investigative priority, especially given today’s political landscape. Facebook’s director of AI research Yann LeCun was quoted saying that machine learning technology that could squash fake news “either exists or can be developed.” A challenge aptly named the “Fake News Challenge” was developed for technology professionals; you can view their site at http://www.fakenewschallenge.org/ for more information. Whether or not it actually works is still in doubt at the moment, but the application of it could have far reaching positive effects for democracy.
  7. Navigation of self-driving cars. Using sensors and onboard analytics, cars are learning to recognize obstacles and react to them appropriately using Machine Learning. Google’s experimental self-driving cars currently rely on a wide range of radar/lidar and other sensors to spot pedestrians and other objects. Eliminating some or all of that equipment would make the cars cheaper and easier to design and speed up mass adoption of the technology. Google has been developing its own video-based pedestrian detection system for years using machine learning algorithms. Back in 2015, its system was capable of accurately identifying pedestrians within 0.25 seconds, with 0.07-second identification being the benchmark needed for such a system to work in real-time. This is all good news for storage manufacturers. Typical luxury cars have up to around 200 GB of storage today, primarily for maps and other entertainment functionality. Self-driving cars will likely need terabytes of storage, and not just for the car to drive itself. Storage will be needed for intelligent assistants in the car, advanced voice and gesture recognition, caching software updates, and caching files to storage to reduce peak network bandwidth utilization.
  8. Retail Sales. Applications of ML are almost limitless when it comes to retail. Product pricing optimization, sales and customer service trending and forecasting, precise ad targeting with data mining, website content customization, and prospect segmentation are all great examples of how machine learning can boost sales and save money. The digital trail left by customers’ interactions with a business both online and offline can provide huge amounts of data to a retailer. All of that data is where machine learning comes in. Machine learning can look at history to determine which factors are most important, and to find the best way to predict what will occur based on a much larger set of variables. Systems must take into account market trends not only for the past year, but for what happened as recently as 1 hour ago in order to implement real-time personalization. Machine learning applications can discover which items are not selling and pull them from the shelves before a salesperson notices, and even keep overstock from showing up in the store at all with improved procurement processes. A good example of the machine learning personalized approach to customers can be found in the Jackets and Vests section of the North Face website. Click on “Shop with IBM Watson” and experience what is almost akin to a human sales associate helping you choose which jacket you need.
  9. Recoloring black and white images. Ted Turner’s dream come true. 🙂 Using computers to recognize objects and learn what they should look like to humans, color can be returned to both black and white pictures and video footage. Google’s DeepDream (https://research.googleblog.com/2015/07/deepdream-code-example-for-visualizing.html) is probably the best-known example. It has been trained by examining millions of images of just about everything. It analyzes images in black and white and then colors them the way it thinks they should be colored. The “colorize” project is also taking up the challenge; you can view their progress at http://tinyclouds.org/colorize/ and download the code. A good online example is at Algorithmia, which allows you to upload and convert an image online: http://demos.algorithmia.com/colorize-photos/
  10. Enterprise Security. Security and loss of data are major concerns for the modern enterprise. Some storage vendors are beginning to use artificial intelligence and machine learning to prevent data loss, increase availability and reduce downtime via smart data recovery and systematic backup strategies. Machine learning allows for smart security features to detect data and packet loss during transit and within data centers. Years ago it was common practice to spend a great deal of time reviewing security logs on a daily basis. You were expected to go through everything and manually determine the severity of any of the alerts or warnings as you combed through mountains of information. As time progresses it becomes more and more unrealistic for this process to remain manual. Machine learning technology is currently implemented and is very effective at filtering out what deviates from normal behavior, be it with live network traffic or mountains of system log files. While humans are also very good at finding patterns and noticing odd things, computers are really good at doing that repetitive work at a much larger scale, complementing what an analyst can do. Interested in looking at some real world examples of Machine Learning as it relates to security? There are many out there. Clearcut is one example of a tool that uses machine learning to help you focus on log entries that really need manual review. David Bianco created a relatively simple Python script that can learn to find malicious activity in HTTP proxy logs. You can download David’s script here: https://github.com/DavidJBianco/Clearcut. I also recommend taking a look at the Click Security project, which includes many code samples (http://clicksecurity.github.io/data_hacking/), as well as PatternEx, a SecOps tool that predicts cyber attacks (https://www.patternex.com/).
  11. Musical Instruments. Machine learning can also be used in more unexpected ways, even in creative outlets like making music. In the world of electronic music, new synthesizers and hardware are developed often, and the rise in machine learning is altering the landscape. Machine learning will give instruments the potential to be more expressive, complex and intuitive in ways previously experienced only through traditional acoustic instruments. A good example of a new instrument using machine learning is the Mogees instrument. This device has a contact microphone that picks up sound from everyday objects and attaches to your iPhone. Machine learning could make it possible to use a drum machine that adapts to your playing style, learning as much about the player as the player learns about the instrument. Simply awe inspiring.
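
The log-review idea from the Enterprise Security item above (learn what “normal” looks like, then surface only what deviates from it) can be sketched in a few lines of Python. This is a simple statistical baseline, not how Clearcut or any of the tools mentioned above actually work: the function name `find_anomalies`, the three-standard-deviation threshold, and the failed-login counts are all assumptions made up for this illustration.

```python
from statistics import mean, stdev


def find_anomalies(baseline, new_values, threshold=3.0):
    """Return (index, value) pairs in new_values that deviate from the baseline.

    A value is flagged when it lies more than `threshold` standard
    deviations away from the baseline mean (a z-score test).
    """
    mu = mean(baseline)
    sigma = stdev(baseline)
    if sigma == 0:
        # Baseline has no variation: anything different is a deviation
        return [(i, v) for i, v in enumerate(new_values) if v != mu]
    return [
        (i, v)
        for i, v in enumerate(new_values)
        if abs(v - mu) / sigma > threshold
    ]


# Hypothetical failed-login counts per hour, invented for illustration
baseline_counts = [4, 6, 5, 7, 5, 6, 4, 5, 6, 5, 7, 4]
todays_counts = [5, 6, 48, 4, 7]

# Only the hour with 48 failed logins stands out from the baseline
print(find_anomalies(baseline_counts, todays_counts))
```

The point is the division of labor: the machine does the repetitive comparison against the baseline across every log line, and the analyst only has to look at the handful of entries that get flagged.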

What does this mean for the storage industry?

As you might expect, this is all very good news for the storage industry and very well may lead to more and more disruptive changes. Machine learning has an almost insatiable appetite for data storage. It will consume huge quantities of capacity while at the same time require very high levels of throughput. As adoption of Cognitive Computing, Artificial Intelligence, and machine learning grows, it will attract a growing number of startups eager to solve the many issues that are bound to arise.

The rise of Machine learning is set to alter the storage industry in very much the same way that PCs helped reshape the business world in the 1980s. Just as PCs have advanced from personal productivity applications like Lotus 1-2-3 to large-scale Oracle databases, Machine learning is poised to evolve from consumer type functions like Apple’s Siri to full scale data driven programs that will drive global enterprises. So, in what specific ways is this technology set to alter and disrupt the storage industry? I’ll review my thoughts on that below.

  1. Improvements in Software-Defined Storage. I recently dove into Software defined storage in a blog post (https://thesanguy.com/2017/06/15/defining-software-defined-storage-benefits-strategy-use-cases-and-products/). As I described in that post, there are many use cases and a wide variety of software defined storage products in the market right now. Artificial Intelligence and machine learning will spark faster adoption of software-defined storage, especially as products are developed that use the technology to allow storage to be self-configurable. Once storage is all software-defined, algorithms can be integrated and far-reaching enough to process and solve complicated storage management problems because of the huge amount of data they can now access. This is a necessary step to build the monitoring, tuning, healing service abilities needed for self-driving software defined storage.
  2. Overall Costs will be reduced. Enterprises are moving towards cloud storage and fewer dedicated storage arrays. Dynamic software-defined storage that integrates machine learning could help organizations more efficiently utilize the capacity that they already own.
  3. Hybrid Storage Clouds. Public vs. private cloud has been a hot topic in the storage industry, and with the rise of machine learning and software-defined storage it’s becoming more and more of a moot point. Well-designed software-defined architectures should be able to transition data seamlessly from one type of cloud to another, and machine learning will be used to implement that concept without human intervention. Data will be analyzed and logic engines will automate data movement. The hybrid cloud is very likely to flourish as machine learning technologies are adopted into this space.
  4. Flash Everywhere. Yes, the concept of “flash first” has been promoted for years now, and machine learning simply furthers that simple truth. The vast amount of data that machine learning needs to process will further increase the demand for throughput and bandwidth, and flash storage vendors will be lining up to fill that need.
  5. Parallel File Systems. Storage systems will have to deliver performance and throughput at scale in order to support machine learning technologies. Parallel file systems can effectively reduce the problems of massive data storage and I/O bottlenecks. With their focus on high performance access to large data sets, parallel file systems combined with flash could be considered an entry point to full scale machine learning systems.
  6. Automation. Software-defined storage has had a large influence on the rise of machine learning in storage environments. Adding a heterogeneous software layer abstracted from the hardware allows the software to efficiently monitor many more tasks. The additional automation allows administrators like myself much more time for more strategic work.
  7. Neural Storage. Neural storage (“deep learning”) is designed to recognize and respond to problems and opportunities without any human intervention. It will drive the need for massive amounts of storage as it is utilized in modern businesses. It uses artificial neural networks, which are simplified computer simulations of how biological neurons behave, to extract rules and patterns from sets of data. Unsurprisingly (based on its name) the concept is inspired by the way biological nervous systems process information. In general, think of neural storage as many layers of processing on mountain-sized mounds of data. Data is fed through neural networks, which are logical constructions that ask a series of binary true/false questions of, or extract a numerical value from, every bit of data that passes through them, and then classify it according to the answers that were tallied up. Deep Learning work is focused on developing these networks, which is why they became what are known as Deep Neural Networks (logic networks of the complexity needed to deal with classifying enormous datasets, think Google-scale data). Using Google Images as an example, with datasets as massive and comprehensive as these and logical networks sophisticated enough to handle their classification, it becomes relatively trivial to take an image and state with a high probability of accuracy what it represents to humans.
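
The “layers of true/false questions, tallied up” description in the Neural Storage item above can be illustrated with a toy two-layer network in Python. This is a sketch under heavy assumptions: the weights below are hand-picked for the example so that the network computes XOR (“exactly one of the two inputs is set”), whereas a real deep neural network has many more layers and learns its weights from massive amounts of training data. The function names are made up for this illustration.

```python
def step(value):
    """Binary true/false answer: 1 if the weighted tally is positive, else 0."""
    return 1 if value > 0 else 0


def neuron(inputs, weights, bias):
    """Tally the weighted inputs and answer a yes/no question about them."""
    return step(sum(i * w for i, w in zip(inputs, weights)) + bias)


def classify(a, b):
    """Two-layer network with hand-picked weights that computes XOR."""
    # Layer 1: two simple questions about the raw inputs
    at_least_one = neuron([a, b], [1, 1], -0.5)  # is a OR b set?
    both = neuron([a, b], [1, 1], -1.5)          # are a AND b set?
    # Layer 2: combine the first-layer answers into the final class
    return neuron([at_least_one, both], [1, -1], -0.5)


for a in (0, 1):
    for b in (0, 1):
        print(a, b, "->", classify(a, b))
```

Neither first-layer question answers the problem on its own; it is the second layer, working on the answers from the first, that gets the classification right. Stacking many such layers is what puts the “deep” in deep learning.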

How does Machine Learning work?

At its core, Machine learning works by recognizing patterns (such as facial expressions or spoken words), extracting insight from those patterns, discovering anomalies in those patterns, and then making evaluations and predictions based on those discoveries.

The principle can be summed up with the following formula:

Machine Learning = Model Representation + Parameter Evaluation + Learning & Optimization

Model Representation: The system that makes predictions or identifications. Includes the use of an object element represented in a formal language that a computer can handle and interpret.

Parameter Evaluation: A function needed to distinguish or evaluate the good and bad objects, the factors used by the model to form its decisions.

Learning & Optimization: The method used to search among these classifiers within the language to find the highest scoring ones. This is the learning system that adjusts the parameters and looks at predictions vs. actual outcome.
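
To see how the three parts of that formula fit together, here is a minimal Python sketch that fits a straight line to a handful of points. It is one simple instance of the idea, not the only way to do it: the choice of a linear model, mean squared error as the evaluation function, gradient descent as the optimizer, the learning rate, the step count, and the training data are all assumptions picked for this illustration.

```python
def predict(weight, bias, x):
    """Model representation: a straight line y = weight * x + bias."""
    return weight * x + bias


def mean_squared_error(weight, bias, xs, ys):
    """Evaluation: average squared gap between predictions and actual outcomes."""
    return sum((predict(weight, bias, x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def train(xs, ys, learning_rate=0.05, steps=2000):
    """Learning and optimization: gradient descent on the squared error."""
    weight, bias = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        # Compare predictions with actual outcomes...
        errors = [predict(weight, bias, x) - y for x, y in zip(xs, ys)]
        # ...and nudge the parameters in the direction that reduces the error
        grad_w = 2 * sum(e * x for e, x in zip(errors, xs)) / n
        grad_b = 2 * sum(errors) / n
        weight -= learning_rate * grad_w
        bias -= learning_rate * grad_b
    return weight, bias


# Hypothetical training data that follows the pattern y = 2x + 1
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]

weight, bias = train(xs, ys)
print(round(weight, 3), round(bias, 3))
# Use the trained model on an input it has never seen
print(round(predict(weight, bias, 10.0), 2))
```

On this data the learned weight and bias settle very close to 2 and 1, so the model can make a sensible prediction for an input that was never in the training set. Swap in a different model, a different evaluation function, or a different optimizer and you have a different machine learning method, but the three-part structure stays the same.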

How do we apply machine learning to a problem? First and foremost, a pattern must exist in the input data that would allow a conclusion to be drawn; the machine learning algorithm must have a pattern to deduce information from. Next, there must be a sufficient amount of data. If there isn’t enough data to analyze, it will compromise the validity of the end result. Finally, machine learning is used to derive meaning from the data and perform structured learning to arrive at a mathematical approximation that describes the behavior of the problem. All of these conditions must be met for machine learning to be successful; if they aren’t, applying machine learning to the problem will be a waste of time.

Summary

Machines may not have reached the point where they can make full decisions without humans, but they have certainly progressed to the point where they can make educated, accurate recommendations to us so that we have an easier time making decisions. Current machine learning systems have delivered tremendous benefits by automating tabulation and harnessing computational processing and programming to improve both enterprise productivity and personal productivity.

Cognitive systems will learn and interact to provide expert assistance to scientists, engineers, lawyers, and other professionals in a fraction of the time it now takes. While they will likely never replace human thinking, cognitive systems will extend our cognition and free us to think more creatively and effectively, and be better problem solvers.
