
What does Intel's MMX mean for developers of scientific/industrial imaging systems?
by Stephen Albanese, Matrox Imaging
Original article featured in Advanced Imaging - July 1997
Intel MMX technology. You've heard about it for over a year now. Pentium processors started sporting it around January. More recently, the Pentium Pro had MMX grafted onto it and was reborn as the Pentium II. So what exactly does Intel's MMX technology mean to you if you are developing an all-digital medical imaging system or retrofitting a semiconductor inspection machine? Will MMX replace high-end, dedicated imaging hardware? No. Is it a boost for host-based image processing? Yes.
If you are involved in building or retrofitting machine vision, medical imaging, or image analysis systems, then read on to gain a little insight into what MMX is, what it can do for you, and what it can't.
Judging by its name and the press that Intel MMX technology has received thus far, you might be tempted to think that multimedia is all this technology is good for. That isn't the case. Scientific/industrial imaging stands to benefit the most from Intel's endeavors to bring more power to the desktop (and factory floor, medical facility, research lab). But you can only reap the rewards of increased performance using software carefully optimized for Intel MMX technology.
What is MMX?
As you may already know, Intel MMX technology is a set of 57 instructions added to the Pentium (P55C) and to the Pentium Pro core (which became the Pentium II). These instructions were specifically designed to manipulate and process video, audio and graphical data more efficiently. They are oriented to the highly parallel, repetitive sequences found in multimedia (and scientific/industrial) operations.
You may also know that MMX is based on the Single Instruction Multiple Data technique commonly referred to as SIMD. As its name implies, multiple data can be processed using one instruction, rather than using multiple instructions.
What can it do for you?
So what does all this mean to you? It can mean more performance for your imaging dollar. How much more? Well, as illustrated in the table, execution times improved dramatically after careful optimization of the Matrox Imaging Library (MIL) for MMX. Now, more than ever, the host can be used to solve demanding applications.
[Figure: execution times on a Pentium II processor @ 266 MHz and on a Pentium processor @ 200 MHz with MMX technology. As illustrated, optimization of MIL for MMX offers significant performance gains, especially on the Pentium with MMX platform.]

| Operation class | Operation (512 x 512 x 8-bit image) | Pentium II @ 266 MHz, MMX-enabled MIL 5.0 | Pentium II @ 266 MHz, non-MMX MIL 5.0 | Pentium with MMX @ 200 MHz, MMX-enabled MIL 5.0 | Pentium with MMX @ 200 MHz, non-MMX MIL 5.0 |
|---|---|---|---|---|---|
| Point-to-Point Operation | Add two images with saturation * | 5.6 ms | 11.2 ms | 4.4 ms | 18.3 ms |
| Point-to-Point Operation | Threshold * | 2.9 ms | 9.7 ms | 2.7 ms | 16.5 ms |
| Filtering Operations | Sharpen | 10.2 ms | 27.7 ms | 13.3 ms | 78.0 ms |
| Filtering Operations | General convolution 3 x 3 with saturation | 12.6 ms | 38.7 ms | 17.4 ms | 200.9 ms |
| Filtering Operations | Edge detection (Sobel) | 14.9 ms | 52.1 ms | 21.0 ms | 108.6 ms |
| Morphological Operations | Grayscale erosion/dilation | 5.7 ms | 23.2 ms | 6.8 ms | 40.6 ms |
| Pattern Matching | Find a 128 x 128 model | 7.2 ms | 11.9 ms | 11.7 ms | 28.6 ms |
Scientific and industrial imaging may prove to benefit the most from MMX, as benchmarks indicate (note: operations denoted with * are I/O bound).
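To put the benchmark figures in perspective, the short sketch below divides each non-MMX time by the corresponding MMX-enabled time. The timings are copied from the table; the dictionary and function names are only illustrative and are not part of MIL.

```python
# Execution times in milliseconds, copied from the benchmark table
# (512 x 512 x 8-bit image, MIL 5.0). Each entry is (MMX-enabled, non-MMX).
PENTIUM_II_266 = {
    "Add two images with saturation": (5.6, 11.2),
    "Threshold": (2.9, 9.7),
    "Sharpen": (10.2, 27.7),
    "General convolution 3 x 3 with saturation": (12.6, 38.7),
    "Edge detection (Sobel)": (14.9, 52.1),
    "Grayscale erosion/dilation": (5.7, 23.2),
    "Find a 128 x 128 model": (7.2, 11.9),
}

PENTIUM_MMX_200 = {
    "Add two images with saturation": (4.4, 18.3),
    "Threshold": (2.7, 16.5),
    "Sharpen": (13.3, 78.0),
    "General convolution 3 x 3 with saturation": (17.4, 200.9),
    "Edge detection (Sobel)": (21.0, 108.6),
    "Grayscale erosion/dilation": (6.8, 40.6),
    "Find a 128 x 128 model": (11.7, 28.6),
}


def speedups(timings):
    """Return {operation: non-MMX time / MMX-enabled time} for one platform."""
    return {op: non_mmx / mmx for op, (mmx, non_mmx) in timings.items()}


if __name__ == "__main__":
    platforms = (
        ("Pentium II @ 266 MHz", PENTIUM_II_266),
        ("Pentium with MMX @ 200 MHz", PENTIUM_MMX_200),
    )
    for name, timings in platforms:
        print(name)
        for op, factor in speedups(timings).items():
            print(f"  {op}: {factor:.1f}x")
```

On these figures the gain ranges from roughly 1.7x (pattern matching on the Pentium II) to roughly 11.5x (general 3 x 3 convolution on the Pentium with MMX).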

Why is MMX good for imaging?
While the areas that Intel has targeted with MMX technology include games, communications, and multimedia, scientific/industrial image processing stands to gain the most. MMX addresses the key characteristics of applications that must handle large amounts of image, video and graphics data. It was built to handle compute-intensive algorithms that perform repetitive operations on small data types, such as the 8-bit pixels commonly used in all areas of image processing. Data can also be manipulated in word (16-bit), double word (32-bit), and quad word (64-bit) format, in each case packed into 64-bit registers.
It's easy to see how MMX can speed up imaging operations. With 8-bit data, for example, a single MMX instruction can process up to eight pixels in parallel, where previously each pixel had to be processed serially.
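The idea of eight pixels per operation can be imitated in plain Python. The sketch below packs eight 8-bit pixels into one 64-bit integer and adds two such words with per-pixel saturation using a fixed handful of whole-word operations, which is the behavior of an MMX packed unsigned saturating byte add. It is only an emulation for illustration: it uses no MMX instructions, and all function names are made up for this example.

```python
LANES = 8                        # eight 8-bit pixels per 64-bit word
HIGH_BITS = 0x8080808080808080   # top bit of every byte lane
LOW_BITS = 0x7F7F7F7F7F7F7F7F    # low seven bits of every byte lane


def pack(pixels):
    """Pack eight 8-bit pixel values into one 64-bit integer."""
    if len(pixels) != LANES:
        raise ValueError("expected exactly eight pixels")
    return int.from_bytes(bytes(pixels), "little")


def unpack(word):
    """Split a 64-bit integer back into eight 8-bit pixel values."""
    return list(word.to_bytes(LANES, "little"))


def packed_add_saturate(a, b):
    """Add eight pixels at once, clamping each lane at 255."""
    # Add the low seven bits of each lane; no carry can cross a lane boundary.
    low = (a & LOW_BITS) + (b & LOW_BITS)
    # Restore the top bit of each lane: this is the wrapped (mod 256) sum.
    wrapped = low ^ ((a ^ b) & HIGH_BITS)
    # Carry out of each lane's top bit marks the lanes that overflowed.
    carry = ((a & b) | ((a | b) & ~wrapped)) & HIGH_BITS
    # Turn each overflow flag into a full 0xFF byte and force those lanes to 255.
    overflow_mask = (carry >> 7) * 0xFF
    return wrapped | overflow_mask


def scalar_add_saturate(xs, ys):
    """Reference version: one pixel at a time."""
    return [min(x + y, 255) for x, y in zip(xs, ys)]
```

For example, `unpack(packed_add_saturate(pack(xs), pack(ys)))` gives the same result as `scalar_add_saturate(xs, ys)` for any two lists of eight pixel values, but the packed version does a fixed amount of work per eight pixels rather than per pixel.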
MMX vs. Non-MMX
Intel MMX technology has the potential to speed up several classes of operations used in scientific/industrial image processing. The most noticeable improvements are for neighborhood operations like convolutions and morphology, point-to-point class operations like arithmetic, and to a lesser extent, higher-level algorithms like pattern matching.
MMX helps with the frequent multiply/accumulate operations that characterize algorithms like convolutions, frame averaging, and normalized grayscale correlation.
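To make the multiply/accumulate pattern concrete, here is a minimal pure-Python sketch of a 3 x 3 convolution with saturation. It is not how MIL implements the operation; the function name, the border policy (border pixels copied unchanged) and the unflipped kernel are assumptions made for illustration. The nine products summed into `acc` for every output pixel are the kind of work that MMX's packed multiply-add instruction is designed to speed up.

```python
def convolve3x3_saturate(image, kernel, divisor=1):
    """Apply a 3 x 3 kernel to an 8-bit image stored as a list of rows.

    The kernel is applied without flipping (identical to a true convolution
    for symmetric kernels). Border pixels are copied unchanged.
    """
    height, width = len(image), len(image[0])
    out = [row[:] for row in image]
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            acc = 0
            # Multiply/accumulate: nine products summed per output pixel.
            for ky in range(3):
                for kx in range(3):
                    acc += kernel[ky][kx] * image[y + ky - 1][x + kx - 1]
            # Saturate: clamp to the 8-bit range instead of wrapping around.
            out[y][x] = max(0, min(255, acc // divisor))
    return out


# A common sharpening kernel (its coefficients sum to 1).
SHARPEN = [
    [0, -1, 0],
    [-1, 5, -1],
    [0, -1, 0],
]
```

A 512 x 512 image needs about 2.3 million multiply/accumulate steps per 3 x 3 pass, which is why this class of operation is a natural target for packed arithmetic.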
Most image processing done for applications like motion angiography and ophthalmology involves some form of preprocessing of images for improved visualization. The ability to speed up the operations used for this, such as sharpening and other convolutions, is quite significant for these and similar applications.
Other tasks that benefit from MMX fall under the category of machine vision, such as parts handling, electronics inspection, or semiconductor alignment. The operations typical of these types of applications include thresholding, edge detection, and pattern matching used for feature extraction and analysis.
But the Intel architecture and its MMX extensions, like any other technology, have their limitations. You only get the increased performance with software that is optimized to take advantage of this new technology. And when it comes to certain particularly demanding real-time applications, like high-speed, on-line web inspection, dedicated hardware is still required.
What it can't do for you
At present, limitations on the hardware side include I/O bandwidth, a shared memory architecture, system management overhead, and the cost of mixing integer (MMX) and floating point operations. On the software side, the execution times of certain imaging algorithms cannot be improved.
Some operations like histogram and LUT mapping, both important operations, cannot be optimized, or rather, their algorithms are not suited to optimization using the MMX instruction set: each pixel value is used to index a table in memory, and that lookup still has to be done one pixel at a time. Therefore, no improvement in execution time is possible. So, if your application relies heavily on these operations, dedicated hardware, like a Matrox Genesis image processor, may be the way to go if the fastest Pentium isn't giving you the speed you need.
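The following sketch shows what these two operations look like in their simplest form (the function names are illustrative, not MIL calls). In both loops the pixel value itself decides which memory location is read or written, so there is no fixed arithmetic that could be applied to eight packed pixels at once. This is offered as one plausible reading of why they do not map well onto MMX, not as a statement about MIL's internals.

```python
def apply_lut(pixels, lut):
    """Map each 8-bit pixel through a 256-entry lookup table."""
    # One data-dependent table read per pixel.
    return [lut[p] for p in pixels]


def histogram(pixels):
    """Count how many times each 8-bit value occurs."""
    counts = [0] * 256
    for p in pixels:
        # One data-dependent read-modify-write per pixel.
        counts[p] += 1
    return counts


# Example table: photographic negative of an 8-bit image.
INVERT_LUT = [255 - value for value in range(256)]
```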
Additionally, while the 132 MB/second bandwidth available on the PCI bus provides a fast data path for passing along pixel data and other information, multiple data highways may be needed to maintain real-time processing for applications like high-speed web inspection.
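A quick back-of-the-envelope calculation shows what the 132 MB/second figure means for the 512 x 512 x 8-bit images used in the benchmarks. The sketch assumes 1 MB = 1,000,000 bytes and treats 132 MB/second as a peak rate with nothing else on the bus, so real systems will do worse; the function names are illustrative.

```python
# 132 MB/s from the article, taking 1 MB = 10**6 bytes (assumption).
PCI_BYTES_PER_SECOND = 132_000_000


def frame_bytes(width, height, bits_per_pixel):
    """Size of one uncompressed frame in bytes."""
    return width * height * bits_per_pixel // 8


def transfer_ms(num_bytes, bytes_per_second=PCI_BYTES_PER_SECOND):
    """Time in milliseconds to move num_bytes at the given rate."""
    return 1000.0 * num_bytes / bytes_per_second


def max_frames_per_second(num_bytes, bytes_per_second=PCI_BYTES_PER_SECOND):
    """Upper bound on frame rate if the bus carried nothing but frames."""
    return bytes_per_second / num_bytes


if __name__ == "__main__":
    size = frame_bytes(512, 512, 8)
    print(f"{size} bytes per frame")
    print(f"{transfer_ms(size):.2f} ms per frame")
    print(f"{max_frames_per_second(size):.0f} frames per second at most")
```

Under these assumptions one frame takes about 2 ms to cross the bus, giving a ceiling of roughly 500 frames per second for a single stream. Every additional camera, result image or display update divides that budget further.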
A single bank of memory for one or more CPUs may limit processing of image data when compared to dedicated hardware with private memory for each additional processing core.
While the host can be used for image processing, it also has "housekeeping chores". A screen may need updating, user input may need to be managed (e.g. keyboard/mouse), and I/O cards may be needed to control additional hardware. These can eat up CPU cycles and bus bandwidth needed to process image data.
As well, with Intel's current implementation of MMX technology, there is a performance penalty to be paid when there is a heavy mix of integer (MMX) and floating point operations, because the MMX registers are shared with the floating point unit and the processor must switch between the two uses. But this is rumored to change in future MMX implementations.
Even though certain classes of imaging operations are accelerated dramatically, others show little or no improvement. But by developing with a device-independent software library like MIL you can choose a host-based approach or use the same imaging code on a dedicated Matrox processor when required.
Coding your own - some considerations
For those of you in the "build your own" camp, here is a little insight into what it takes to try your hand at programming for MMX. The only way to benefit from MMX is by hand-optimized coding in assembly language for the critical portion of the algorithm, or the so-called "critical loop". Furthermore, you need an intimate understanding of the Intel architecture and of the software tools used to develop or fine-tune this code, such as Intel's VTune.
Another consideration is support. If you are going to be using non-MMX systems for some projects to benefit from the cost savings available today, and MMX systems for other projects, you will have to support two sets of code (MMX optimized and non-MMX code).
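One common way to live with two code paths is to hide them behind a single entry point and pick the implementation once at startup. The sketch below illustrates the idea only. All function names are hypothetical, the "MMX" routine is a stand-in that reuses the generic code so the example runs anywhere, and the caller is assumed to supply the feature flags the processor reports through CPUID (function 1, register EDX, where bit 23 indicates MMX support).

```python
MMX_FEATURE_BIT = 1 << 23  # CPUID function 1: EDX bit 23 set means MMX is present


def has_mmx(cpuid_edx):
    """Return True if the supplied CPUID feature flags report MMX support."""
    return bool(cpuid_edx & MMX_FEATURE_BIT)


def add_saturate_generic(xs, ys):
    """Portable path: add two pixel lists one pixel at a time, clamping at 255."""
    return [min(x + y, 255) for x, y in zip(xs, ys)]


def add_saturate_mmx(xs, ys):
    """Placeholder for a hand-optimized MMX routine (hypothetical).

    In a real system this would call assembly code. Here it reuses the
    generic path so that the sketch stays runnable on any machine.
    """
    return add_saturate_generic(xs, ys)


def select_add_saturate(cpuid_edx):
    """Choose the implementation once, based on processor capabilities."""
    return add_saturate_mmx if has_mmx(cpuid_edx) else add_saturate_generic
```

The application then calls whatever `select_add_saturate` returned and never needs to know which version it got. Both versions still have to be written, tested and maintained, which is the support burden described above.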
These issues must be carefully considered in today's competitive environment, where time-to-market is critical.
Final words
While MMX is good news for scientific/industrial imaging, you have to make sure the software you use is carefully designed or optimized to take advantage of it. If you are not prepared to make the investment in developing this software yourself, you can now rely on vision vendors like Matrox to do it for you. Support and upgrades are left to the supplier, while you benefit from the speed improvements of present MMX-enabled processors, and the faster ones to come (300 MHz Pentium IIs and beyond).
By early '98 Intel plans to build only MMX processors, so if you don't take advantage of it, you might be losing out on a new enabling technology. And even if MMX doesn't meet the needs of all your applications, it will certainly allow you to do a lot more using the host.