A Machine’s-Eye View
With AI, the program learns what to do from training
examples set up by its developer. Part of the developer's job
is to assemble sufficient instances of the diverse kinds of
data the program can expect to encounter, and to identify the
desired results for these different cases. The AI program
learns to perform correctly using one of several possible
learning algorithms.
An AI’s understanding of its environment relies on its
world model, a complex data structure that contains its
knowledge about things and relationships between things. For
example, an autonomous vehicle’s world model will know a
great deal about roads, signs, other cars, and everything that
it can reasonably expect to encounter on the road. Its world
model also contains information about relationships between
things, such as the distance between itself and other cars, as
well as how that distance is changing over time.
An AI builds its world model from a combination of
information generated by the developer and raw sensor data
that it collects, correlates and assembles into a coherent set of
things and relationships.
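As a rough illustration, a world model can be thought of as a store of entities plus derived relationships between them. The sketch below is a deliberately minimal, hypothetical representation (the class and field names are invented for this example, not taken from any real autonomous-driving stack):

```python
from dataclasses import dataclass, field

@dataclass
class Entity:
    """A thing the AI knows about (e.g. a car, a sign, a pedestrian)."""
    kind: str
    position: tuple  # (x, y) in metres, relative to the ego vehicle

@dataclass
class WorldModel:
    """Entities plus relationships (here, just distances) between them."""
    entities: dict = field(default_factory=dict)

    def add(self, name, entity):
        self.entities[name] = entity

    def distance(self, a, b):
        # A relationship derived on demand from the stored entities
        (x1, y1) = self.entities[a].position
        (x2, y2) = self.entities[b].position
        return ((x1 - x2) ** 2 + (y1 - y2) ** 2) ** 0.5

world = WorldModel()
world.add("ego", Entity("car", (0.0, 0.0)))
world.add("lead_car", Entity("car", (30.0, 0.0)))
```

Tracking how `world.distance("ego", "lead_car")` changes across successive sensor updates is what lets the AI reason about closing speed.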
In order to function effectively, the AI must use the data it
collects, along with analysis of its past performance, to
continually update and refine its world model. In real-world, real-time
applications, such as autonomous vehicles, the AI updates its
model based on inputs from multiple sensors, many of which
provide visual data from cameras, or image-like inputs (e.g.
LIDAR, RADAR ).
This task of assembling a unified "picture" from a
heterogeneous collection of sensors is known as sensor fusion,
a technique that will be discussed in greater depth later in this
article.
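One common building block of sensor fusion is combining independent, noisy estimates of the same quantity, trusting each sensor in proportion to its precision. The following is a minimal inverse-variance-weighted sketch; the sensor names and numbers are hypothetical:

```python
def fuse(estimates):
    """Inverse-variance weighted fusion of independent estimates.

    Each estimate is a (value, variance) pair; a lower variance
    means the sensor is trusted more. Returns the fused value and
    its (reduced) variance.
    """
    weights = [1.0 / var for _, var in estimates]
    total = sum(weights)
    value = sum(w * v for w, (v, _) in zip(weights, estimates)) / total
    return value, 1.0 / total

# Hypothetical range-to-lead-car readings, in metres
camera = (31.0, 4.0)   # noisier estimate
radar = (30.0, 1.0)    # more precise estimate
fused, fused_var = fuse([camera, radar])
```

Note that the fused value lands between the two readings but closer to the more precise radar, and the fused variance is smaller than either input's, which is the point of combining sensors.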
The Power of Inference
The types of AI/DL applications currently under development
will not only be required to learn responses to a wide range
of known conditions, but to also use inferences to synthesize
new responses to deal effectively with novel situations they have
never encountered. This involves applying a method (or multiple
methods) for identifying the situations that it has already
learned which are most relevant to the new situation, and then
interpolating between them.
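That identify-then-interpolate idea can be sketched as a distance-weighted nearest-neighbour lookup. This is one simple stand-in for the many possible methods the text alludes to, with invented feature vectors and responses:

```python
def infer(known, query, k=2):
    """Interpolate a response for a novel input from the k closest
    known situations, weighting each by inverse distance."""
    def dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

    # Identify the k learned situations most relevant to the query...
    nearest = sorted(known, key=lambda kv: dist(kv[0], query))[:k]
    # ...then interpolate between their learned responses
    weights = [1.0 / (dist(f, query) + 1e-9) for f, _ in nearest]
    total = sum(weights)
    return sum(w * r for w, (_, r) in zip(weights, nearest)) / total

# Hypothetical (feature vector -> learned response) pairs
known = [((0.0, 0.0), 0.0),
         ((1.0, 0.0), 0.5),
         ((0.0, 1.0), 1.0)]
response = infer(known, (0.5, 0.0))
```

A query halfway between two known situations gets a response halfway between their learned responses, which is the interpolation behaviour described above.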
Making reliable inferences is especially challenging because
it requires the AI to identify the situations it already knows about
that are the closest match to its current inputs using information
that is noisy or incomplete. In the case of an autonomous vehicle,
for example, a sudden change of pixel values between two
objects that it has previously identified as being parked cars
can have multiple interpretations and correspondingly different
responses. If it infers that the newly detected object is
a paper bag blown by the wind, its response will be very
different than if it infers the object is a child.
The algorithms used to extract inferences from a model
typically require a great deal of computation. Some of these
tasks may exceed the processing abilities of the AI’s on-board
resources and must be off-loaded to a more powerful system
(typically cloud-based) via a cellular connection or other
network link. Developing strategies for efficiently segmenting computing
tasks between local and remote resources presents several
challenges addressed later in the article.
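At its simplest, the local-versus-remote decision compares the estimated time to run a task on-board against the time to ship it over the network and run it remotely. The sketch below is a toy model with made-up throughput and network figures; real schedulers must also weigh energy, link reliability, and hard safety deadlines:

```python
def place_task(flops, local_flops_per_s, remote_flops_per_s,
               payload_bytes, uplink_bytes_per_s, rtt_s):
    """Choose where to run a task by comparing rough latency estimates."""
    local_s = flops / local_flops_per_s
    remote_s = (flops / remote_flops_per_s          # remote compute time
                + payload_bytes / uplink_bytes_per_s  # time to upload inputs
                + rtt_s)                              # round-trip latency
    return "local" if local_s <= remote_s else "remote"

# A heavy inference task: the cloud wins despite the network cost
heavy = place_task(flops=2e12, local_flops_per_s=1e11,
                   remote_flops_per_s=1e14,
                   payload_bytes=5e5, uplink_bytes_per_s=1e7, rtt_s=0.1)

# A light task: network overhead dominates, so it stays on-board
light = place_task(flops=1e9, local_flops_per_s=1e11,
                   remote_flops_per_s=1e14,
                   payload_bytes=5e5, uplink_bytes_per_s=1e7, rtt_s=0.1)
```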
GPUs – Not your Grandma’s CPUs
Artificial Intelligence’s evolution from an academic curiosity
to a commercially viable technology has been largely due to
the past decade’s advances in computing hardware. Arguably,
the most common AI-friendly computing architecture is the
GPU (graphics processing unit), originally developed as an
array processor, optimized for the pixel and vector manipulation
tasks associated with graphics and video acceleration (Figure
2). Those same capabilities are also essential for accelerating
neural networks, which are essentially vector computations.
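To see why, consider that a fully connected neural-network layer is essentially a matrix-vector product followed by a simple nonlinearity. Each output element is an independent dot product, so an array processor can compute them all in parallel. A plain-Python sketch (with arbitrary example weights) of that core operation:

```python
def matvec(W, x):
    """Matrix-vector product: each output is an independent dot
    product, which is exactly the bulk-parallel arithmetic that
    GPU array processors were built to accelerate."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]

def relu(v):
    """A common element-wise nonlinearity: max(0, value)."""
    return [max(0.0, e) for e in v]

# A tiny fully connected layer: 2 outputs from a 3-element input
W = [[0.5, -1.0, 2.0],
     [1.0, 0.0, -0.5]]
x = [1.0, 2.0, 3.0]
y = relu(matvec(W, x))
```

On a GPU the rows (and, in practice, whole batches of inputs) are processed simultaneously rather than in a Python loop, which is where the acceleration comes from.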
NVIDIA, in particular, has spearheaded the GPU’s evolution
with a series of processors, each tailored to different
processing environments and purposes. Among these is
Xavier, a system-on-a-chip (SoC) designed for autonomous
cars, which will be capable of running at 20 TOPS (trillion
operations per second), while consuming only 20 watts of
power. The Xavier (Figure 3) integrates the new NVIDIA GPU
architecture, called Volta, along with a custom 8-core CPU
and a new computer vision accelerator.
NVIDIA also pioneered the use of software tools like CUDA,
a parallel computing platform and application programming
interface (API) model that allows developers to create powerful
GPU-based applications without having to master the
intricacies of the GPU’s unique architecture and command set.
Autonomous Vehicles Drive Sensor Fusion
Autonomous ground-based and aerial vehicles, expected
to dominate transportation by 2030, continue to be the earliest
and largest markets for GPUs and AI technologies. All major
Figure 2. A block diagram of an early GPU (Radeon 9700). Image source: ScotXW,
courtesy of Wikipedia