By André Schwarz
Regional Technical Director (Central Europe North)
Read this article to find out about:
- The image-processing capabilities required to support machine vision in embedded systems
- Features of the new S32V SoC from NXP that support machine vision
- The NXP software and tools that enable rapid vision system development
How Might the Operation of Industrial Machinery be Transformed if it Could See the World Around it?
Until recently, this has been a hypothetical question for many types of industrial equipment. True, some applications for machine vision do exist: high-volume manufacturing or process plants, for instance, often rely on cameras to perform Automated Optical Inspection (AOI), to spot defects or classify goods more reliably, quickly and cheaply than human operators. Camera equipment also performs visual tasks such as optical character recognition, for instance to log car registration plates in car parks. Until now, such applications for machine vision have called for a system architecture based on a camera connected to a PC or server.
But a PC-based system design has considerable size, cost and power implications which put it beyond the reach of many mainstream industrial applications. As this article shows, however, a new generation of vision-enabled Systems-on-Chip (SoCs), backed by user-friendly tools for vision application development, is making it possible for the first time for almost any kind of embedded system to add intelligent vision to the functions it provides.
The Cumbersome Hardware Architecture of Today’s Machine Vision Systems
The difficulty of implementing intelligent machine vision arises from the size and character of the data to be processed. An industrial image sensor typically has between 1 million and 10 million pixels, and operates at frame rates of up to several hundred frames per second. When IT professionals talk about the potential of ‘big data’, they are referring in part to the huge amounts of image data generated by such embedded cameras.
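To put these figures in perspective, the raw output of a sensor can be estimated from its pixel count, frame rate and bit depth. The short sketch below is illustrative only: the 8 bits per pixel, the 30 frames per second at the low end and the 300 frames per second at the high end are assumptions chosen for the example, not figures for any particular sensor.

```python
def raw_data_rate_mb_per_s(pixels: int, fps: float, bits_per_pixel: int = 8) -> float:
    """Uncompressed sensor output in megabytes per second (1 MB = 10**6 bytes)."""
    return pixels * fps * bits_per_pixel / 8 / 1e6


# Assumed low end: 1 Mpixel sensor at 30 frames per second, 8 bits per pixel
low = raw_data_rate_mb_per_s(1_000_000, 30)

# Assumed high end: 10 Mpixel sensor at 300 frames per second, 8 bits per pixel
high = raw_data_rate_mb_per_s(10_000_000, 300)

print(f"Low end:  {low:.0f} MB/s")
print(f"High end: {high:.0f} MB/s")
```

Under these assumptions the uncompressed stream ranges from tens of megabytes to several gigabytes per second, before any processing has been applied.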
The data sets are not only large: they are also difficult to classify. A human brain finds it easy to look at a set of pictures of objects, and attach names to them such as ‘dog’, ‘bird’ or ‘bus’ with absolute certainty. For a machine, this involves a complex set of logical operations. Of course, some forms of image data are easier to process: today, the development of a camera system that can perform alphanumeric character recognition, level sensing or color detection is more straightforward.
But some applications require more complex forms of processing. Autonomous vehicle technology, for instance, has already been in development for several years, a sign of how difficult it is to teach machines to distinguish moving objects such as cars, buses, bicycles and pedestrians. And there are machine vision tasks even harder than this: technologists have begun experimenting with machines that can read emotions on human faces, a capability which might eventually be applied by the advertising industry or in crime prevention.
Machine vision, then, calls for the fast processing of vast amounts of data through complex algorithms. Today, this function is typically implemented by PC- or server-based architectures. They offer the twin advantages of:
- Massive data-processing capability provided by the latest ultra-high speed, multi-core processors or Graphics Processing Units (GPUs)
- The comprehensive, high-level applications programming resources and tools provided by the Windows® or Linux® operating environments
The problem for embedded system developers is that the PC is normally an unsuitable hardware platform for their applications. Typically, embedded systems operate under tight power, size and cost constraints. The PC is optimized for none of these factors.
Yet there is clearly huge scope to add value to embedded systems by equipping them with intelligent vision capability. Automatic doors, for instance, might today use infrared proximity sensors or in-floor pressure sensors to provide the trigger to open. Dumb sensors, however, can produce false results. In a storefront on a busy street, automatic doors might open for passers-by as well as for those who want to enter the store.
Doors that can see, and distinguish pixels in the shape of a human face, could distinguish people facing the door from those who are merely walking past it, and who present their profile rather than their face to the camera. This kind of intelligence would enable the door system to save energy by keeping heat inside the store and draughts out. It would also prolong service life and reduce maintenance costs by reducing the number of opening operations.
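The decision logic behind such a door can be very simple once a face detector is available. The following is a minimal sketch, assuming a hypothetical detector that reports, frame by frame, whether a face looking towards the door was found; the function name and the three-frame threshold are illustrative choices and are not taken from any NXP software. Requiring several consecutive detections filters out single-frame false positives.

```python
from typing import Iterable, Iterator


def door_open_signal(frontal_face_flags: Iterable[bool],
                     min_consecutive: int = 3) -> Iterator[bool]:
    """Yield True for each frame in which the door should be commanded open.

    frontal_face_flags: per-frame output of a (hypothetical) face detector,
    True when a face looking towards the door is found in that frame.
    min_consecutive: assumed number of consecutive detections required.
    """
    streak = 0
    for seen in frontal_face_flags:
        # Count consecutive detections; any miss resets the count
        streak = streak + 1 if seen else 0
        yield streak >= min_consecutive


# A passer-by glances at the door once; a customer then faces it for four frames
frames = [False, True, False, True, True, True, True, False]
signals = list(door_open_signal(frames))
print(signals)
```

In this example the single glance never triggers the door, while the sustained frontal detection does so from its third frame onwards.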
This is a typical example of an application that faces size and cost constraints, and for which a PC is therefore an unsuitable hardware platform. When embedding machine vision into such an application, systems designers will be looking for certain important features:
- Low power consumption
- Support for high-level programming at the functional level
- Connectivity options which enable the system to export its outputs to a host system or to the cloud
- A versatile front end for image sensor connections
- Basic hardware-accelerated image enhancement blocks which today are typically implemented in an FPGA
- Efficient image compression
- Interfaces to mass storage
- Support for a standard, robust operating system. In the embedded world, this normally means the Linux OS
Now a viable platform for embedded machine vision which offers these features is available in the form of a new vision-optimized SoC platform, the S32V, introduced by NXP Semiconductors.
Comprehensive Hardware and Software Ecosystem
The value of the new S32V SoC comes from its combination of both the hardware and software resources required by embedded system developers.
The hardware is an all-new SoC which includes dedicated vision functional blocks together with a high-performance general-purpose processing block, as shown in Figure 1. The CPU platform includes two or four Arm® Cortex®-A53 cores operating at a frequency of up to 1GHz. A 133MHz Arm Cortex-M4 core takes care of functional safety and security operations and other housekeeping tasks.
Fig. 1: Block diagram of the S32V SoC
The image processing functional block is centered on dual APEX-2 vision accelerator cores, which are enabled by the OpenCL™, APEX-CV and APEX graph tools that form part of the software enablement described below. The S32V also provides dual camera interfaces, enabling stereo vision applications in which algorithms can be made more robust in natural environments.
Embedded Image Sensor Processing (ISP) supports High Dynamic Range video, color conversion, tone mapping and other functions, making traditional FPGA-based approaches obsolete in many cases.
Multiple connectivity options provide for high-speed transfer of processed images in an edge computing architecture. They include a Gigabit Ethernet controller, dual CAN-FD, dual FlexRay™ and single-channel PCIe 5Gbits/s interfaces for proper system scalability.
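A rough link budget shows why on-chip processing and compression matter before images leave the device. The sketch below compares an uncompressed video stream with the nominal 1Gbit/s rate of Gigabit Ethernet. The 1920 x 1080 resolution, 16 bits per pixel and the two frame rates are assumptions made for illustration, and protocol overhead is ignored, so real requirements would be somewhat higher.

```python
GIGABIT_ETHERNET_BITS_PER_S = 1_000_000_000  # nominal line rate, overhead ignored


def required_compression_ratio(width: int, height: int, fps: float,
                               bits_per_pixel: int,
                               link_bits_per_s: float) -> float:
    """Ratio of raw stream bandwidth to link capacity.

    A value above 1.0 means the stream must be compressed (or reduced to
    processed results) by at least that factor to fit the link.
    """
    raw_bits_per_s = width * height * fps * bits_per_pixel
    return raw_bits_per_s / link_bits_per_s


# Assumed stream: 1920 x 1080 pixels, 16 bits per pixel
ratio_30fps = required_compression_ratio(1920, 1080, 30, 16,
                                         GIGABIT_ETHERNET_BITS_PER_S)
ratio_60fps = required_compression_ratio(1920, 1080, 60, 16,
                                         GIGABIT_ETHERNET_BITS_PER_S)

print(f"30 frames/s: {ratio_30fps:.2f}x of link capacity")
print(f"60 frames/s: {ratio_60fps:.2f}x of link capacity")
```

Under these assumptions a single raw stream at 30 frames per second already occupies almost the entire link, and doubling the frame rate exceeds it by a factor of about two.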
The optional graphics subsystem includes a 3D graphics processing unit that supports the OpenCL 1.2 EP 2.0, OpenGL ES 3.0 and OpenVG 1.1 graphics rendering interfaces. This enables the image processing required for sophisticated video outputs in use cases such as smart advertising, driver assistance, or security scenarios which involve human supervision.
Benefiting from the power efficiency of the Arm A-class processor family, the S32V typically consumes between 5 and 10W. It nevertheless provides sufficient raw processing capability to perform sophisticated functions such as face recognition or moving object detection, as shown in Figure 2. But if it is being used, as in the automatic door example, to add vision capability to a system for the first time, the developer might be unfamiliar with the process of writing software for sophisticated machine-vision applications. An important question for the developer to consider is therefore how well application software development is supported by the S32V ecosystem.
The basis for the S32V’s software enablement is NXP’s Vision Software Development Kit (SDK), which is part of the S32 Design Studio for Vision Integrated Development Environment (IDE). The Vision SDK is supported on S32V evaluation boards supplied by NXP, and is supplied with application examples for functions such as face detection, and homography for feature matching. It features an open-source code base: running on the Linux OS, it includes an open-source library (OpenCV) and open, standard languages and APIs, including OpenCL and OpenGL.
Fig. 2: A face detection algorithm is provided as example code in the S32V’s Vision SDK
Crucially, software development can be accomplished with the C programming language alone. The Vision SDK makes light work of task distribution between hardware accelerators and general-purpose cores, which in the past has been a distinctly difficult element of vision system design.
An additional computer vision library, APEX-CV, is also included in the Vision SDK. Alongside the Vision SDK, NXP provides a Model Based Design Toolbox for Vision, which enables designers to use MATLAB® software to develop their application on the S32V.
Fast Proof-of-Concept Development on S32V
The S32V is supplied pre-configured for operation with standard, off-the-shelf camera modules for quick prototyping, enabling developers to concentrate immediately on software development:
- The S32V-SONYCAM module from NXP
- The Omnivision OV10640CSP-S32V or MXOV10635-S32V, both MIPI cameras with high dynamic range
Integration of other sensors is possible of course, but low-level driver development will be required. The S32V image pipeline has been designed for use with image sensors featuring up to 2Mpixel resolution. This means that bandwidth limitations for higher-resolution systems have to be taken into consideration.
NXP also supplies various evaluation boards for the development of prototypes based on the S32V. These include the:
- SBC-S32V234 vision and sensor fusion evaluation board, as shown in Figure 3
- S32V234-EVB2, a vision and sensor fusion evaluation system and development platform
Fig. 3: The SBC-S32V234 evaluation board for the S32V vision processor
The combination of off-the-shelf development hardware with a comprehensive tool suite and ready-made application examples gives first-time developers of an embedded vision system the best possible platform for rapid system development. This means that, for the first time, embedded applications for which a PC-based vision architecture was unsuitable can now benefit from the addition of intelligent machine vision, with the potential to dramatically increase the value of many of today’s embedded system designs.