Read this to find out about:
- The reasons for the rapid adoption of machine learning technology in the embedded world
- The types of semiconductor hardware which can support artificial intelligence in embedded devices
- The development tools that semiconductor manufacturers provide to support machine learning applications
Technology companies have dreamed about the potential applications for machine learning in mainstream electronics for decades.
Wartime code-breaker Alan Turing was already thinking about the concept of machine learning in the 1940s. Practical progress was slow until the 1990s, however, when techniques for Optical Character Recognition (OCR) enabled machines to read handwriting for the first time. By the late 1990s the technology had been deployed commercially, for instance to read handwritten bank checks.
These limited breakthroughs in OCR did not, however, immediately herald a stream of new applications. The great leap forward came in 2012, when a team of researchers from Toronto won the annual ImageNet image-recognition competition, achieving an error rate of just 15%, some ten percentage points better than the next best competitor’s score.
The Toronto team’s success showed what had been holding back the adoption of machine learning: its hugely improved image-recognition capability stemmed from the decision to train its machine learning algorithm on a much larger data set than had ever been used before, and to run that training on now-affordable Graphics Processing Units (GPUs).
By 2012 it had become clear that the wider use of machine learning was going to depend on the availability of both data and cheap, high-speed processor chips.
These factors explain why, in 2019, machine learning has changed from being theoretically attractive to being practically viable for any manufacturer of industrial or consumer electronics products. Thanks to the huge stores of data generated by services such as YouTube, Facebook and the apps associated with smart watches and wristbands, training data sets are now large enough to support a wide range of applications such as speech recognition, image recognition, people counting and video analytics.
At the same time, today’s microcontrollers, applications processors and FPGAs can perform inferencing at the edge, at high speed and with good accuracy, without requiring a connection to an Artificial Intelligence (AI) service in the cloud. Silicon chip manufacturers have also introduced new tools to support the implementation of machine learning software on their devices.
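To make inferencing at the edge more concrete, the sketch below shows the kind of arithmetic involved: a single fully connected layer evaluated with 8-bit integer weights and inputs, which is the sort of multiply-accumulate work that suits a small MCU. It is an illustration only, written in Python for readability. The weights, scales and function names are invented for this example and do not come from any real model or vendor tool.

```python
# Illustrative integer-only inference for one fully connected layer.
# All weights, scales and inputs are made-up values (assumptions for this sketch).

def quantize(values, scale):
    """Map floats to the int8 range using a symmetric scale (value ~= q * scale)."""
    return [max(-128, min(127, round(v / scale))) for v in values]

def dense_int8(x_q, w_q, bias_q):
    """Integer multiply-accumulate: one wide accumulator per output neuron."""
    return [sum(w * x for w, x in zip(row, x_q)) + b
            for row, b in zip(w_q, bias_q)]

def argmax(values):
    """Index of the largest score, i.e. the predicted class."""
    return max(range(len(values)), key=lambda i: values[i])

X_SCALE = 0.02   # hypothetical input scale
W_SCALE = 0.01   # hypothetical weight scale

weights = [[0.50, -0.25, 0.10],
           [-0.30, 0.40, 0.20]]
bias = [0.0, 0.0]

# Quantize once, offline; only the integer tables would be stored on the device.
w_q = [quantize(row, W_SCALE) for row in weights]
b_q = quantize(bias, X_SCALE * W_SCALE)

def classify(sample):
    """Quantize a float input vector and return the winning class index."""
    x_q = quantize(sample, X_SCALE)
    return argmax(dense_int8(x_q, w_q, b_q))

print(classify([1.0, 0.2, 0.1]), classify([0.1, 1.0, 0.5]))
```

Because the class decision depends only on which integer score is largest, no floating-point operations are needed at run time in this toy case.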
An ecosystem for training machines has also emerged. Training frameworks such as Caffe™, TensorFlow Lite™ and PyTorch are available to any OEM to train neural networks on its data set. The machine learning models that they produce are packaged in standard data formats that most MCU, processor and FPGA manufacturers support.
So now machine learning is ready to be implemented on the hardware that embedded developers are already familiar with: devices such as Arm® Cortex®-M-based microcontrollers, Cortex®-A-based applications processors, or low-power FPGAs. But to what extent do the manufacturers of these hardware devices provide a bridge between the machine learning model produced by a framework such as Caffe or TensorFlow Lite, and the OEM’s chosen MCU, processor or FPGA?
Diverse Approaches to Machine Learning Enablement
In fact, different manufacturers have taken different approaches to the porting of machine learning models to a hardware target. Some provide extensions to existing development tools to enable the developer to compile a machine learning model to a specific hardware device. Others have created a complete Integrated Development Environment (IDE) for machine learning.
Microchip Technology has taken a third route: rather than providing a specific set of machine learning tools for users of its PolarFire® FPGAs, it has partnered with a third party, ASIC Design Services, to perform conversion of the OEM’s Neural Network (NN) model to an FPGA bitstream output. This output can then be compiled to a PolarFire FPGA target using Microchip’s familiar Libero® SoC design suite.
The advantage of this approach is that an engagement with ASIC Design Services provides for consultancy as well as model conversion, and avoids the need for the PolarFire user to learn how to perform model conversion.
Another FPGA manufacturer, Lattice Semiconductor, provides a comprehensive set of services and tools, sensAI, to enable users of its iCE40 and ECP FPGAs to do the porting themselves, as shown in Figure 1. The sensAI stack incorporates modular hardware platforms, example demonstrations, reference designs, neural network IP cores, software development tools and custom design services.
This comprehensive stack is particularly useful for streamlining the development of applications supported by the Lattice reference designs, such as people counting: Lattice supplies ready-made training scripts and sample data sets to support model training in a standard framework such as Caffe, as well as the tools for compiling the finished model to the chosen hardware target.
Fig. 1: Implementation flow on Lattice Semiconductor sensAI tools
A similarly comprehensive offering of tools, but for applications processors rather than FPGAs, is provided by NXP Semiconductors. Its eIQ™ ML Software Development Environment offers the key components required to deploy a wide range of machine learning algorithms. It includes inference engines, NN compilers, vision and sensor solutions, and hardware abstraction layers, as shown in Figure 2.
Fig. 2: NXP’s eIQ platform for hosting an inference engine on an embedded processor or MCU
At launch, eIQ supported the i.MX RT crossover microcontroller family and the i.MX family of applications processors, but NXP’s intention is to deploy the platform to its complete range of processors and MCUs over time. Implementation is particularly well supported for NXP’s eIQ application samples: face recognition, object detection, anomaly detection and voice recognition.
In addition, an NXP development kit for the i.MX 8M provides a useful demonstration of the potential for Alexa voice-recognition applications.
While NXP’s eIQ platform is a discrete set of tools for machine learning, STMicroelectronics has chosen to support machine learning development through its existing STM32CubeMX design environment for 32-bit MCUs. With STM32Cube.AI, developers can convert pre-trained neural networks into C code that runs on the company’s STM32 Arm Cortex-M-based MCUs.
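The general idea of turning a trained network into C source can be illustrated with a toy example. The sketch below is not STM32Cube.AI and does not reproduce its output; it simply renders a list of already-trained weights as a C constant array that could be compiled into firmware. The function name `to_c_array`, the array names and the weight values are all hypothetical.

```python
# Toy illustration of model-to-C conversion: emit trained weights as a C array.
# Not the output format of any vendor tool; names and values are assumptions.

def to_c_array(name, values, ctype="float"):
    """Render a flat list of numbers as a C constant array definition."""
    if ctype == "float":
        items = [f"{v:.6f}f" for v in values]
    else:
        # Integer types (e.g. int8_t) are written without a suffix.
        items = [str(int(v)) for v in values]
    body = ", ".join(items)
    return f"static const {ctype} {name}[{len(values)}] = {{ {body} }};"

print(to_c_array("layer0_weights", [0.5, -0.25]))
print(to_c_array("layer0_weights_q", [50, -25], ctype="int8_t"))
```

A real conversion tool also has to generate the layer code that consumes such tables and plan memory for intermediate results, which is where most of its value lies.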
The strength of the STM32Cube.AI package is that it is integrated with ST’s portfolio of environmental and motion sensors. Predictive maintenance, which depends on accurate recognition of patterns in parameters such as vibration and temperature, is expected to be the killer app for machine learning. STM32Cube.AI is supplied with ready-to-use software function packs which include example code for two applications:
- Human activity recognition
- Audio scene classification
These code examples are immediately usable with the ST SensorTile reference board and the ST BLE Sensor mobile app. The SensorTile board features an accelerometer/gyroscope module, pressure sensor, microphone and Bluetooth® radio transceiver.
Machine Learning Framework Built for the Embedded World
The tool offerings from most semiconductor suppliers assume that the OEM’s model will be trained in a standard third-party model-training framework such as Caffe or TensorFlow. These frameworks have their roots in the enterprise computing world, and their output, the inference engine they produce, is ideally suited to a cloud-based run-time environment featuring arrays of high-power processors, rather than to resource-constrained embedded hardware.
That’s why the machine learning platform provided by QuickLogic and its SensiML subsidiary is so interesting to embedded developers. SensiML’s Endpoint AI Toolkit, a complete model training tool suite, was built from the ground up for embedded hardware targets. Intended for use in typical embedded applications such as predictive maintenance, the SensiML toolkit does not assume that a neural network is necessarily the right type of inference engine for the intended application. SensiML maintains that:
- Slowly varying sensors, such as temperature or humidity sensors, can often use basic rules or threshold analysis
- Dynamic time-series data can nearly always be handled by machine learning classifiers
- Spatial (image) data typically requires neural networks
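The first case in the list above, basic rules or threshold analysis for a slowly varying sensor, needs very little code. The following is a minimal sketch, assuming a temperature alarm with hysteresis so that readings hovering near the limit do not make the output chatter. The class name and the 80/75 degree limits are invented for illustration and are not taken from SensiML.

```python
# Minimal threshold rule with hysteresis for a slowly varying sensor.
# Class name and limits are hypothetical, chosen only for this example.

class ThresholdAlarm:
    def __init__(self, on_above, off_below):
        if off_below > on_above:
            raise ValueError("off_below must not exceed on_above")
        self.on_above = on_above
        self.off_below = off_below
        self.active = False

    def update(self, reading):
        """Feed one reading; return True while the alarm is active."""
        if not self.active and reading > self.on_above:
            self.active = True
        elif self.active and reading < self.off_below:
            self.active = False
        return self.active

alarm = ThresholdAlarm(on_above=80.0, off_below=75.0)
states = [alarm.update(t) for t in [70, 81, 78, 76, 74, 79]]
print(states)
```

The alarm switches on above 80 and stays on until the reading drops below 75, so no training data or classifier is involved at all.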
Offering a growing library of feature transforms and classifier algorithms, SensiML’s toolkit automates the search for, and optimization of, the best approach for a given problem, rather than imposing a one-size-fits-all deep-learning approach, as shown in Figure 3. Benefits include the potential for a much smaller code footprint, and for model training with much less data than a neural network requires.
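The idea of searching for the best combination of feature transform and classifier can be sketched in a few lines. The example below is a deliberately small stand-in, not SensiML’s algorithm: it tries three hand-picked features on short vibration windows, fits a nearest-centroid classifier for each, and keeps the feature that scores best on held-out data. All data values, labels and function names are assumptions made for this illustration.

```python
# Toy automated search over candidate features for time-series windows.
# Features, classifier and data are illustrative assumptions only.
from statistics import mean, pstdev

FEATURES = {
    "mean": mean,
    "std": pstdev,
    "peak_to_peak": lambda w: max(w) - min(w),
}

def fit_centroids(values, labels):
    """Nearest-centroid training: average feature value per class."""
    return {c: mean([v for v, l in zip(values, labels) if l == c])
            for c in sorted(set(labels))}

def predict(centroids, value):
    """Pick the class whose centroid is closest to the feature value."""
    return min(centroids, key=lambda c: abs(centroids[c] - value))

def search(train, val):
    """Try each feature; keep the one with the best validation accuracy."""
    best = None
    for name, fn in FEATURES.items():
        centroids = fit_centroids([fn(w) for w, _ in train],
                                  [l for _, l in train])
        hits = sum(predict(centroids, fn(w)) == l for w, l in val)
        acc = hits / len(val)
        if best is None or acc > best[1]:
            best = (name, acc, centroids)
    return best

TRAIN = [([1.0, 1.1, 0.9, 1.0], "normal"), ([1.2, 1.3, 1.1, 1.2], "normal"),
         ([0.1, 1.7, 0.2, 1.6], "fault"), ([0.7, 2.3, 0.8, 2.2], "fault")]
VAL = [([1.3, 1.4, 1.2, 1.3], "normal"), ([0.2, 1.8, 0.3, 1.7], "fault")]

best_name, best_acc, _ = search(TRAIN, VAL)
print(best_name, best_acc)
```

In this made-up data the window mean says little about the fault, while the spread of the signal separates the classes cleanly, so the search settles on a spread-based feature. A single scalar feature and a handful of centroids also hint at why such models can have a small code footprint.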
Crucially, it enables developers to build intelligent IoT sensing devices in just days or weeks without the expertise in data science or embedded firmware that frameworks such as TensorFlow or Caffe typically require. And because SensiML is part of QuickLogic, its sensor algorithms are optimized for implementation on QuickLogic’s EOS™ S3 sensor processing system-on-chip, although they are also compatible with various other hardware targets including Arm Cortex-M-based MCUs.
Fig. 3: The SensiML inference engine development flow
Rapid Development in Tool Offerings
Manufacturers of MCUs, applications processors and FPGAs have already introduced a wide range of tools to support the needs of embedded developers who are implementing machine learning models. But this field is young, and intensive development continues. The performance of these tools is only going to get better, and their features more sophisticated.
For embedded developers, this means that there has never been a better time to start experimenting with machine learning in real-world applications.