Saturday, November 20, 2021

Can AI catalyze semiconductor chip development?


Abstract: The semiconductor industry is seeing new opportunities and investment in AI chip development and is racing against time to meet the demand. The complexity of these chips and their time-to-market are huge challenges, so the natural question is what companies are doing to address them. While AI chips are a huge business growth opportunity, the same AI technology can help the industry address the challenges of complex chip development (for AI chips as well as chips in other domains), and the investment in it can also improve margins. This blog looks at how AI can itself catalyse semiconductor chip development.

Market opportunity –

Technology inflections such as 5G wireless, artificial intelligence, the Internet of Things, cloud computing and machine learning are driving up long-term demand for the chip industry. The artificial intelligence chip market was valued at $6,638 million in 2018 and is projected to reach $91,185 million by 2025, registering a CAGR of 45.2% from 2019 to 2025. Research shows that AI/ML now contributes between $5 billion and $8 billion annually to earnings before interest and taxes at semiconductor companies. This is impressive, but it reflects only about 10 percent of AI/ML's full potential within the industry. Within the next two to three years, AI/ML could potentially generate between $35 billion and $40 billion in value annually. Over a longer time frame, for gains achieved four or more years in the future, this figure could rise to between $85 billion and $95 billion per year.

Challenges in front of Semiconductor chip industry –

Complexity – Cramming more and different kinds of processors and memories onto a die or into a package is causing the number of unknowns, and the complexity of those designs, to skyrocket. There are good reasons for combining all of these different devices into an SoC, such as the higher integration capacity available at advanced technology nodes and new methodologies and techniques for integrating more memories. Looking at AI chips themselves, training chips range from around 100 sq mm to 400+ sq mm in die size at advanced technology nodes (7nm and below). With this higher integration, design and development complexity multiplies many fold. The following picture from Cadence aptly explains this. It is clearly time to embrace new technologies and flows to aid development.


Time to Market – Time to market (TTM) is key for any product design. For products that have chips as integrated components, chip development time steers the whole product development schedule, and chip development itself is being challenged as increasing design complexity takes more time. Sometimes present-day IT infrastructure and tools create a bottleneck for development. So it is becoming apparent that new methodologies and technologies must be adopted to speed up and cut development time.

Cost of development – The cost of development is increasing and hence puts pressure on profit margins. The increase in complexity alone forces more skilled manpower, new tools, and increased compute resources. Simply adding development resources will not help as complexity grows; it may instead create more hurdles to manage. So we are reaching the point where development itself must be re-examined to take advantage of new technologies in the development cycle. These new technologies are themselves still evolving and require investment, so companies must have a long-term strategy to manage this, with clear focus and intended outcomes.

 

Does AI itself come as a catalyst for semiconductor chip development?

Surely yes. AI itself will play a major role in chip development, and this is slowly being proven in the industry. Let us look at the results of a market survey conducted by McKinsey in this area.


A comprehensive case study shows that AI and ML can help in every development phase of a semiconductor chip, with most of the benefits seen in the manufacturing and chip-design phases. The survey compares current gains with the near-term and long-term gains achievable if AI/ML solutions are deployed in development.

Let us look at just one development phase, chip design, and see how and where AI and ML can play a role. If this phase is deployed with AI/ML-enabled flows, companies can avoid time-consuming iterations, accelerate yield ramp-up, and decrease the costs required to maintain yield. They may also automate the time-consuming processes related to physical-layout design and verification.

Although we are not yet at the point where AI/ML acceleration can be applied to all designs and all stages of chip design, we do not see a fundamental reason why it cannot penetrate further over time. Above all, scaling AI/ML efforts must be a strategic priority for companies. The initial effort, which involves coordinating data, agreeing on priority use cases, and encouraging collaboration among the right business, data-science, and engineering talent, is too great to succeed as a bottom-up project; in other words, a meaningful R&D strategy with a clear road-map must be in place. Ideally, the AI/ML effort will be linked to clear business targets, giving business units and business functions a joint interest in making the transformation successful.

For example, companies could identify the development phases where cost savings matter most and set goals and expected results accordingly. The major time-consuming development phases, such as functional verification and physical design, could be the first use cases. If the human knowledge already acquired is used to build appropriate AI models, which are then tuned through ongoing experiments, the end results will follow. A simple illustration is sketched below.
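As a purely illustrative sketch (not any specific EDA vendor flow), the idea of encoding acquired regression knowledge in a model might look like the snippet below: a classifier trained on past regression results predicts which candidate tests are most likely to expose new coverage or failures, so those run first and iterations shrink. The feature names, data and thresholds are hypothetical.

```python
# Illustrative only: rank regression tests by predicted usefulness.
# Features, data and thresholds are hypothetical, not a real EDA flow.
from sklearn.ensemble import RandomForestClassifier

# Each row: [lines_of_RTL_changed, historical_failure_rate, touches_new_block]
past_runs = [
    [120, 0.30, 1],
    [ 10, 0.02, 0],
    [ 75, 0.15, 1],
    [  5, 0.01, 0],
]
outcomes = [1, 0, 1, 0]   # 1 = the run exposed a bug or new coverage

model = RandomForestClassifier(n_estimators=50, random_state=0)
model.fit(past_runs, outcomes)

# Score tomorrow's candidate tests; run the highest-scoring ones first.
candidates = [[90, 0.20, 1], [8, 0.03, 0]]
print(model.predict_proba(candidates)[:, 1])
```

The same pattern, a model trained on historical outcomes that ranks the next experiments, is what makes iteration-heavy steps like physical design natural candidates for AI/ML assistance.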

This is equally true for VLSI CAD tool vendors, who should look at how they can use AI and ML technology to upgrade their offerings and help the design and development community. If tool vendors and chip design companies collaborate, there are umpteen possibilities for new development flows and methodologies that can help chip development and, in turn, the bottom line each company is looking for. A CAD tool company can offer AI/ML-enabled features in its tools and command a better margin for them.

This is also a huge opportunity for universities and research institutes to work and collaborate with companies to develop AI/ML technology. AI/ML's potential is still being explored, and more innovation is needed to deploy the technology well across a wide variety of market segments. AI/ML is already widely deployed in e-commerce, IoT and automotive, and sectors such as healthcare, energy, education and defense are gaining traction. Every industry requires fine-tuning of the data model and training approach, and may require customization. This opens up opportunities for innovation and collaboration between universities and research institutes, industry, and government sectors.

Companies must allocate sufficient resources to their AI/ML initiatives and investigate supportive partnerships with third parties that have complementary skills, rather than trying to reinvent the wheel themselves. Some larger players may have the spending power required to develop most capabilities in-house, as well as sufficient data from their large installed tool fleet to train AI/ML models, allowing them to retain full control over all associated intellectual property. Given the required resources, smaller players might find it beneficial to leverage commercially available solutions where available, or to partner with others to develop or share algorithms, or to create joint data-sharing platforms that increase the amount of information available to train models. Examples of potential partners include other semiconductor device makers, companies involved in electronic design automation, hyperscale cloud providers, or equipment OEMs.

 

Conclusion –

The semiconductor industry is at a turning point, and companies that don't devote significant resources to AI/ML strategies could be left behind. Although semiconductor companies may take different approaches, depending on their business model, experience with AI/ML, and strategic priorities, the goal is the same: to take productivity and innovation to new levels. If they go about it in a planned way and achieve that goal, AI/ML may eventually reduce the current R&D cost base, and similar approaches can then be applied to other development steps to gain further cost benefits from AI/ML.

 

References

·       Research data from McKinsey & Company

·       AI/ML related blogs applicable to Semiconductor





 


Thursday, November 18, 2021

AI in the Eyes of Semiconductor

 


Abstract:

Semiconductor chips enable Artificial Intelligence products and systems. Advanced technology nodes provide higher integration capability and hence a road-map of improving products and systems over time. Over the next couple of decades, technological developments in storage and processing power will enable many more innovative products like the ones we know and love today, such as Netflix's recommendation engine or self-driving cars. In general, AI systems comprise two major functions, namely machine learning (training) and inference. Both functions need specific computation and storage capabilities to enable the target application effectively. This blog talks about machine learning and inference concepts, the scope of semiconductor chips in AI systems, and the types of chips used for different applications.

What is Machine Learning?

Machine learning is a major field within data science. In simple words, data science encompasses preparing data for analysis, including cleansing, aggregating, and manipulating it, in order to perform advanced analysis. The amount of data to be analyzed is growing by leaps and bounds in current applications, necessitating machine involvement in the analysis. To visualize the amount of data, take the example of how e-commerce and media platforms like Amazon, Flipkart or Netflix come up with items a user is likely to be interested in when they log in. This is done by tracking the items the user searches for and orders (learning) and then trying to infer which items they may be interested in (inference).
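To make that example concrete, here is a deliberately tiny sketch of the learn-then-infer loop: count which items were bought together in past orders (learning) and suggest the most frequent co-purchases for the item a user is looking at now (inference). Real recommendation engines are far more sophisticated; the data here is invented.

```python
# Toy recommendation sketch: learn item co-occurrence from past orders,
# then infer likely items of interest. Data is invented for illustration.
from collections import Counter, defaultdict
from itertools import combinations

past_orders = [
    {"phone", "phone case", "screen guard"},
    {"phone", "phone case"},
    {"laptop", "mouse"},
]

# "Learning": count how often each pair of items appears in the same order.
co_counts = defaultdict(Counter)
for order in past_orders:
    for a, b in combinations(order, 2):
        co_counts[a][b] += 1
        co_counts[b][a] += 1

# "Inference": recommend items most often bought with what the user views.
def recommend(item, k=2):
    return [name for name, _ in co_counts[item].most_common(k)]

print(recommend("phone"))  # e.g. ['phone case', 'screen guard']
```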

Machine learning can also be treated as a branch of artificial intelligence (AI) and computer science which focuses on the use of data and algorithms to imitate the way that humans learn, gradually improving its accuracy. Machine learning is an important component of the growing field of data science. Through the use of statistical methods, algorithms are trained to make classifications or predictions, uncovering key insights within data mining projects. These insights subsequently drive decision making within applications and businesses.

Let us briefly look at how machine learning works, or in other words, what basic functions are involved. The process can be classified into seven basic steps.

Each of the seven steps depicted in the adjacent picture follows logically from the previous one: collecting data for an end application, preparing the useful data from the raw data collected, feeding it into a model, analysing for the required result, and then refining the results with new data each time. This produces a model for the application which is then used for future predictions. If one looks deeply into each step, it involves analytical and mathematical knowledge combined with probability theory and statistics. Using this knowledge along with programming skills (Python etc.), one can develop software tools implementing the algorithms for each step.

The three major building blocks of a machine learning system are the model, the parameters, and the learner; a small sketch after the list below ties them together.

  • Model is the system which makes predictions
  • The parameters are the factors which are considered by the model to make predictions
  • The learner makes the adjustments in the parameters and the model to align the predictions with the actual results
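A minimal sketch of those three blocks, assuming nothing more than a one-parameter linear model and plain gradient descent acting as the learner (the numbers are invented for illustration):

```python
# Minimal sketch: model, parameters, learner on y ~ w * x.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.1, 3.9, 6.2, 8.1]   # roughly y = 2x

w = 0.0                     # the parameter the learner will adjust

def model(x):               # the model: makes predictions from the parameter
    return w * x

for _ in range(200):        # the learner: nudges the parameter to reduce error
    grad = sum(2 * (model(x) - y) * x for x, y in zip(xs, ys)) / len(xs)
    w -= 0.01 * grad

print(round(w, 2))          # converges to roughly 2.0
```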

To put the same thing another way, closer to how it is implemented:

An ML lifecycle can be broken up into two main, distinct parts. The first is the training phase, in which an ML model is created or “trained” by running a specified subset of data into the model. ML inference is the second phase, in which the model is put into action on live data to produce actionable output. The data processing by the ML model is often referred to as “scoring,” so one can say that the ML model scores the data, and the output is a score.
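A bare-bones illustration of the two phases might look like this: the model is fitted once on a historical subset of data (training) and then reused to score new, unseen records (inference). The data, features and threshold below are invented.

```python
# Sketch of the two ML lifecycle phases. Data is invented for illustration.
from sklearn.linear_model import LogisticRegression

# --- Training phase: build the model from a historical subset of data ---
X_train = [[25, 1], [62, 0], [33, 1], [58, 0]]   # e.g. [age, clicked_promo]
y_train = [1, 0, 1, 0]                            # e.g. bought / did not buy
model = LogisticRegression().fit(X_train, y_train)

# --- Inference phase: the deployed model scores live data ---
live_record = [[40, 1]]
score = model.predict_proba(live_record)[0][1]    # the "score"
print("score:", round(score, 2), "->", "target" if score > 0.5 else "skip")
```

In a real deployment the trained model would typically be serialized and handed to a separate serving system, which is what makes the training and inference phases cleanly separable.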

Training and Inference in AI:

Artificial intelligence is essentially the simulation of the human brain using artificial neural networks, which are meant to act as substitutes for the biological neural networks in our brains. A neural network is made up of a bunch of nodes which work together, and can be called upon to execute a model.

This is where AI chips come into play. They are particularly good at dealing with these artificial neural networks, and are designed to do two things with them: training and inference.

Chips designed for training essentially act as teachers for the network, which starts out like a kid in school: a raw neural network is initially under-developed and is taught, or trained, by feeding it masses of data. Training is very compute-intensive, so AI chips focused on training are designed to process this data quickly and efficiently. The more powerful the chip, the faster the network learns.

Once a network has been trained, it needs chips designed for inference in order to use the data in the real world, for things like facial recognition, gesture recognition, natural language processing, image searching, spam filtering, etc. Think of inference as the aspect of AI systems that you are most likely to see in action, unless you work in AI development on the training side.

You can think of training as building a dictionary, while inference is akin to looking up words and understanding how to use them. Both go hand in hand.

It's worth noting that chips designed for training can also perform inference, but inference chips cannot do training. Training and inference chips are the two eyes steering the future of mankind.

AI system at a glance:


The two pictures above are generic examples showing what ML systems look like and how they function.

The data sources are typically systems that capture live data from the mechanism that generates it. For example, a data source might be a server cluster (e.g. Apache Kafka) that stores data created by an Internet of Things (IoT) device, a web application log file, or a point-of-sale (POS) machine. Or a data source might simply be a web application that collects user clicks and sends the data to the system that hosts the ML model.

The host system for the ML model accepts data from the data sources and inputs it into the ML model. It is the host system that provides the infrastructure to turn the code in the ML model into a fully operational application. After an output is generated from the ML model, the host system sends that output to the data destinations. The host system can be, for example, a web application that accepts data input via a REST interface, or a stream-processing application that takes an incoming feed of data from a server cluster (e.g. Apache Kafka) and processes many data points per second.

The data destinations are where the host system should deliver the output score from the ML model for the final inference. A destination can be any type of data repository like Apache Kafka or a database, and from there, downstream applications take further action on the scores. For example, if the ML model calculates a fraud score on purchase data, then the applications associated with the data destinations might send an “approve” or “decline” message back to the purchase site.
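As a rough, self-contained sketch of that source-to-destination flow (with the Kafka or REST plumbing replaced by plain Python lists so it runs as-is), the host system applies the model's fraud score to each incoming purchase and sends an approve/decline decision to the destination. All names, rules and data here are hypothetical; a real deployment would call a trained model rather than a hand-written scoring rule.

```python
# Hypothetical sketch of source -> host (ML scoring) -> destination.
# In a real deployment the source/destination would be Kafka topics, a REST
# endpoint or a database; plain lists stand in so the sketch runs as-is.

def fraud_score(purchase):
    # Placeholder for the trained ML model's scoring function.
    score = 0.0
    if purchase["amount"] > 1000:
        score += 0.6
    if purchase["country"] != purchase["card_country"]:
        score += 0.3
    return score

incoming_purchases = [                                  # the "data source"
    {"id": 1, "amount": 1200, "country": "US", "card_country": "IN"},
    {"id": 2, "amount": 40,   "country": "IN", "card_country": "IN"},
]

decisions = []                                          # the "data destination"
for purchase in incoming_purchases:                     # the "host system"
    score = fraud_score(purchase)
    decisions.append({"id": purchase["id"],
                      "decision": "decline" if score > 0.5 else "approve"})

print(decisions)
```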


AI – Enabled by Semiconductor chips (Training & Inference chips, the two eyes of AI systems)


Looking at various ML systems and the requirements for building smart, efficient systems with present-day technology, the semiconductor industry plays a significant role by enabling them with well-defined chips. Given the mammoth computational requirements along with real-time processing, ML chip architectures have evolved as a pair: training chips and inference chips.

As the industry and technology evolve, the semiconductor chip ecosystem is also evolving to enable ML systems. This is clearly visible in how existing chips have been pressed into service for ML applications, and in how start-ups are emerging with custom-defined ASICs for efficient ML systems targeted at specific applications.

Let us look at how the GPU (Graphics Processing Unit) and the generic CPU (Central Processing Unit) fare in ML applications, and how ASICs and FPGAs are making their way into use.

GPUs and CPUs:

GPU chips were originally developed for rendering 3D graphics onscreen. Nevertheless, GPUs have proved optimal for specialized computational tasks due to their ability to perform parallel computation in a way that CPUs may not.

CPUs perform serial tasks very fast but with very little parallelism. A mid-range CPU may have a handful of cores, while a mid-range GPU will have several thousand. GPU cores are individually much slower and less powerful, but they run in parallel. The parallelism of GPUs is ideal for neural networks because of the kind of math being performed: large matrix multiplications.
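The core of that math is easy to show: one neural-network layer is essentially a large matrix multiplication followed by a simple non-linearity, and every output element is an independent dot product, which is exactly what thousands of GPU cores can compute in parallel. A minimal NumPy sketch (the sizes are arbitrary):

```python
# One neural-network layer as a matrix multiply: the operation GPUs parallelise.
import numpy as np

batch, in_features, out_features = 64, 1024, 512
x = np.random.rand(batch, in_features)          # a batch of input activations
W = np.random.rand(in_features, out_features)   # the layer's weights
b = np.random.rand(out_features)                # the layer's biases

# Every one of the 64 x 512 outputs is an independent dot product, so they
# can all be computed in parallel across many cores.
y = np.maximum(x @ W + b, 0.0)                  # matmul + ReLU
print(y.shape)                                  # (64, 512)
```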

GPUs were popularized in the ML community after discoveries in 2009 and 2012 during which researchers co-opted NVIDIA GPUs and an NVIDIA library called CUDA to train an image recognition model orders of magnitude faster than was previously possible.

For performance reasons, CPUs are not optimal for training models. That said, CPUs are often used to perform inference, where a GPU can be overkill for the task.

Custom ASIC/FPGAs :

While typically GPUs are better than CPUs when it comes to AI processing, they’re not perfect. The industry needs specialised processors to enable efficient processing of AI applications, modelling and inference. As a result, chip designers are now working to create processing units optimized for executing these algorithms. These come under many names, such as NPU, TPU, DPU, SPU etc., but a catchall term can be the AI processing unit (AI PU).

The AI PU was created to execute machine learning algorithms, typically by operating on predictive models such as artificial neural networks. They are usually classified as either training or inference as these processes are generally performed independently.

As a result, several purpose-built AI chips are currently under development by tech giants and start-ups alike:

·       FPGAs (field-programmable gate arrays) can be programmed for specific workloads yet remain generic enough to accommodate multiple types of tasks, from encryption to encoding. Example: Microsoft Brainwave

·       ASICs (application-specific integrated circuits) are typically designed for a single, specific task. Example: Google TPU.

Other examples include: Intel Nervana, Cerebras, Graphcore, SambaNova, Wave Computing, Groq, etc.


Examples of AI systems architecture:

Cloud + Training

The purpose of this pairing is to develop AI models used for inference. These models are eventually refined into AI applications specific to a use case. These chips are powerful and expensive to run, and are designed to train as quickly as possible.

Example systems include NVIDIA’s DGX-2 system, which totals 2 petaFLOPS of processing power. It is made up of 16 NVIDIA V100 Tensor Core GPUs. Another example is Intel Habana’s Gaudi chip.

Examples of applications that people interact with every day that require a lot of training include Facebook photos or Google translate.

As the complexity of these models increases every few months, the market for cloud training will continue to be needed and relevant.

Cloud + Inference

The purpose of this pairing is for times when inference needs significant processing power, to the point where it would not be possible to do this inference on-device. This is because the application utilizes bigger models and processes a significant amount of data.

Sample chips here include Qualcomm's Cloud AI 100, a large chip used for AI in massive cloud datacentres. Other examples are Alibaba's Hanguang 800 and Graphcore's Colossus MK2 GC200 IPU.

Where training chips were used to train Facebook’s photos or Google Translate, cloud inference chips are used to process the data you input using the models these companies created. Other examples include AI chatbots or most AI-powered services run by large technology companies.

Edge + Inference

Using on-device edge chips for inference removes any issues with network instability or latency, and is better for preserving privacy of data used, as well as security. There are no associated costs for using the bandwidth required to upload a lot of data, particularly visual data like images or video, so as long as cost and power-efficiency are balanced it can be cheaper and more efficient than cloud inference.

Examples here include the KL520 and KL720 chips from Kneron (AI-specific), which are lower-power, cost-efficient chips designed for on-device use. Other examples include Intel Movidius and Google's Coral TPU.

Use cases include facial recognition surveillance cameras, cameras used in vehicles for pedestrian and hazard detection or drive awareness detection, and natural language processing for voice assistants.

Summary:

All of these different types of chips and their different implementations, models, and use cases are essential for the development of the Artificial Intelligence of Things (AIoT) future. When supported by other nascent technologies like 5G, the possibilities only grow. AI is fast becoming a big part of our lives, both at home and at work, and development in the AI chip space will be rapid in order to accommodate our increasing reliance on the technology.


Challenges in AI implementation:

Even though artificial intelligence is developing and gaining popularity in both business and society, the field still faces significant hurdles. Many challenges must be overcome before AI implementations can achieve their maximum potential: compute performance, data privacy and security, the speed of communication, and finally bias and the acceptance of AI results. Each of these challenges is being treated as an opportunity, and a tremendous amount of research and development is happening in MNCs and start-ups. AI will ultimately prove to be cheaper, more efficient, and potentially more impartial in its actions than human beings.

 

To conclude …

Man has long feared the rise of the machine – his own creation becoming smarter and more intelligent than he is. But while artificial intelligence and machine learning are rapidly changing our world and powering the Fourth Industrial Revolution, humanity does not need to be afraid!



Resources & References

  • Blogs and research papers on machine learning
  • Information from industry web portals
  • Pictures - Sources from Internet









Wednesday, November 10, 2021

Artificial Intelligence – An Opportunity for the Semiconductor Industry

Abstract:

Over the last few decades one might have thought of, or perceived, a slowness in innovation in semiconductors. To some extent that is true, as software has been the star of high tech over the past few decades, and it's easy to understand why. With PCs and mobile phones, the game-changing innovations that defined this era, the architecture and software layers of the technology stack enabled several important advances. Semiconductor companies were in a difficult position: although their innovations in chip design and fabrication enabled next-generation devices, they received only a small share of the value coming from the technology stack. With new domains like Machine Learning (ML) and Artificial Intelligence (AI) gaining popularity in almost every application domain, new innovation in chip architecture and design has opened up. The story of the semiconductor industry is changing with the growth of AI, typically defined as the ability of a machine to perform cognitive functions associated with human minds, such as perceiving, reasoning, and learning. This blog captures some AI ideas and how AI is changing the landscape of the semiconductor industry.

 

A brief overview on AI:

Artificial Intelligence, or AI for short, is a branch of computer science that aims to simulate human intelligence in machines, or, put differently, to make machines think intelligently. AI is based on the study of how the human brain thinks, and how humans learn, decide, and work while trying to solve a problem. The great American computer scientist John McCarthy, also called the Father of AI, first coined the term in 1956. In today's world, the term encompasses everything from robotics to process automation.

The goal of AI is to implement human intelligence in machines and create smarter systems. Artificial intelligence is a science and technology based on disciplines such as computer science, biology, psychology, linguistics, mathematics, and engineering. A major thrust of AI is the improvement of computer functions associated with human intelligence, such as problem solving, learning and reasoning. AI is a multi-disciplinary domain in which every field has an equal opportunity to contribute.

AI techniques execute the complex programs they are equipped with at speeds normally not achievable by humans. Some of the applications and major advances in AI include machine learning, case-based reasoning, multi-agent planning and scheduling, gaming, natural language processing (understanding and translation), expert systems (for example flight-tracking and clinical systems), vision systems, speech and voice recognition, intelligent robots, data mining, virtual reality, etc.

The biggest challenge for AI is creativity, a fundamental trait of human intelligence. AI techniques can be used to spawn innovative ideas: by generating novel combinations of familiar ideas, by exploring the potential of conceptual spaces, and by making transformations that enable the generation of previously impossible ideas.

There is a large multitude of applications where AI serves people in their everyday lives, with or without their realizing it: washing machines, dishwashers, the cars we drive, automatic doors and smartphones, through to autonomous vehicles and space robots; the list is endless. AI is also playing a very advanced role in medical diagnosis, helping in the early detection of and warning about ailments such as heart attacks and paralytic strokes.

AI’s role in Semiconductor industry:


Once one understands AI's goal, it becomes apparent how it can open up business opportunities across various domains.

Diverse solutions, as well as other emerging AI applications, share one common feature: a reliance on hardware as a core enabler of innovation, especially for logic and memory functions. This leads to the following questions …

What will this development mean for semiconductor sales and revenues? And which chips will be most important to future innovations?

To answer these questions, it is important to review current AI solutions and the technology that enables them, and to examine opportunities for semiconductor companies across the entire technology stack. The outcome of such a study can be summarized as follows:

·       AI could allow semiconductor companies to capture 40 to 50 percent of total value from the technology stack, representing the best opportunity they’ve had in decades.

·       Storage will experience the highest growth, but semiconductor companies will capture most value in compute, memory, and networking.

·       To avoid mistakes that limited value capture in the past, semiconductor companies must undertake a new value-creation strategy that focuses on enabling customized, end-to-end solutions for specific industries, or “microverticals.”

·       Innovate and enable multi-disciplinary domains to come together to define end-to-end solutions.

By keeping these beliefs in mind, semiconductor leaders can create a new road map for winning in AI. We will look at opportunities to enable AI applications by taking an example below.

 

AI will drive a large portion of semiconductor revenues for data centers and the edge:



With hardware serving as a differentiator in AI, semiconductor companies will find greater demand for their existing chips, but they could also profit by developing novel technologies, such as workload-specific AI accelerators. The table below summarizes current hardware and the trend for AI in each domain:

Domain      | Current                               | Trend for AI
Compute     | GPUs and FPGAs                        | Workload-specific AI accelerators
Memory      | HBMs, on-chip SRAMs                   | New NVM (non-volatile memories)
Storage     | Data centers with increased capacity  | AI-optimized data centers enabled by NVM
Networking  | Infrastructure for data communication | Programmable switches with high-speed interconnects

Research revealed that AI-related semiconductors will see growth of about 18 percent annually over the next few years—five times greater than the rate for semiconductors used in non-AI applications. If this growth materializes as expected, semiconductor companies will be positioned to capture more value from the AI technology stack than they have obtained with previous innovations—about 40 to 50 percent of the total.

 

To conclude,

·       It's clear that opportunities are plenty, but success isn't guaranteed for semiconductor players. To capture the value they deserve, they'll need to focus on end-to-end solutions for specific industries (also called microvertical solutions), ecosystem development, and innovation that goes far beyond improving compute, memory, and networking technologies.

·       Semiconductor companies must define their AI strategy. With both major technology players and start-ups launching independent efforts in the AI hardware space now, the window of opportunity for staking a claim will rapidly shrink over the next few years. Companies should be very clear on Where, How and When to play to capture the AI opportunity.

·       Hardware can be the differentiator that determines whether leading-edge applications reach the market and grab attention. As AI advances, hardware requirements will shift for compute, memory, storage, and networking—and that will translate into different demand patterns. The best semiconductor companies will understand these trends and pursue innovations that help take AI hardware to a new level. In addition to benefitting their bottom line, they’ll also be a driving force behind the AI applications transforming our world.

 Reference:

  •      Various internet blogs on AI
  •      Research papers by McKinsey
