Area Correlation
The measure of similarity between images is known as their correlation.
The computation of the correlation is very similar to that involved
in convolution. Only the indexing scheme is different. We can think
of the window, that is the small image containing the feature to
be found, as playing a similar role to that of the mask in convolution.
The window is a small array of pixels that is positioned over the
main image at successive locations and takes part in a sum-of-products
calculation at each one. The resulting intrinsic image is a correlation map.
Its pixel values represent how well each small neighborhood of the
main image matches the window. If the correlation map has a single, strong
maximum value, that maximum indicates the location of the desired
feature (1).
It is usual to normalize the result of a correlation in order to
make the resulting peaks sharper and easier to identify. Consequently,
the full equation for a correlation is a little more complex than
that for a convolution (2).
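The difference between the two forms can be sketched in a few lines of Python. This is a minimal illustration, not the implementation behind the figures on this page: the text does not give the normalization formula, so the sketch assumes the common form in which the sum of products is divided by the square root of the product of the patch energy and the window energy. The names `correlation_map` and `peak` and the test data are hypothetical.

```python
import math

def correlation_map(image, window, normalize=True):
    """Slide `window` over `image` and return a 2-D list of match scores."""
    ih, iw = len(image), len(image[0])
    wh, ww = len(window), len(window[0])
    # Energy of the window; it is the same for every position.
    window_energy = sum(w * w for row in window for w in row)
    result = []
    for y in range(ih - wh + 1):
        out_row = []
        for x in range(iw - ww + 1):
            products = 0      # sum of products, as in convolution
            patch_energy = 0  # energy of the image patch under the window
            for j in range(wh):
                for i in range(ww):
                    p = image[y + j][x + i]
                    products += p * window[j][i]
                    patch_energy += p * p
            if normalize:
                # Assumed normalization: divide by sqrt(patch energy * window energy).
                denom = math.sqrt(patch_energy * window_energy)
                out_row.append(products / denom if denom else 0.0)
            else:
                out_row.append(float(products))
        result.append(out_row)
    return result

def peak(cmap):
    """Return (row, column) of the largest value in the correlation map."""
    positions = ((y, x) for y in range(len(cmap)) for x in range(len(cmap[0])))
    return max(positions, key=lambda pos: cmap[pos[0]][pos[1]])

# Hypothetical data: the feature sits at the top-left corner,
# and a uniformly bright block sits on the right.
window = [[1, 5],
          [5, 1]]
image = [[1, 5, 0, 9, 9],
         [5, 1, 0, 9, 9],
         [0, 0, 0, 9, 9],
         [0, 0, 0, 0, 0]]

raw_peak = peak(correlation_map(image, window, normalize=False))
norm_peak = peak(correlation_map(image, window, normalize=True))
```

With this data the unnormalized map peaks on the bright block, because large pixel values inflate the sum of products, while the normalized map peaks on the feature itself.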
|
Original image with the selected area of interest (red
square).
|
Coefficient of unnormalized area correlation (white
is highest)
|
Result.
This result illustrates that normalization is necessary
during the calculation.
|
|
Coefficient of normalized area correlation (white is
highest)
|
Result.
The result is correct.
|
|
SIMD - single instruction, multiple data.
SIMD technologies in CPUs:
Intel MMX;
AMD 3DNow!;
Intel SSE;
Intel SSE2;
Motorola AltiVec;
MIPS MIPS-3D.
The MMX technology uses the single instruction, multiple data (SIMD)
technique for performing arithmetic and logical operations on the
bytes, words, or doublewords packed into MMX registers. For example,
the PADDSW instruction adds 4 signed word integers from one source
operand to 4 signed word integers in a second source operand and
stores 4 word integer results in the destination operand. (Note
that the same MMX register is generally used for the second source
and the destination operand.) This SIMD technique speeds up software
performance by allowing the same operation to be carried out on
multiple data elements in parallel. The MMX technology supports
parallel operations on byte, word, and doubleword data elements
when contained in MMX registers. The SIMD execution model supported
in the MMX technology directly addresses the needs of modern media,
communications, and graphics applications, which often use sophisticated
algorithms that perform the same operations on a large number of
small data types (bytes, words, and doublewords). For example, most
audio data is represented in 16-bit (word) quantities. The MMX instructions
can operate on 4 words simultaneously with one instruction. Video
and graphics information is commonly represented as palettized 8-bit
(byte) quantities. Here, one MMX instruction can operate on 8 bytes
simultaneously.
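The behavior of PADDSW described above can be emulated in plain Python to show what "one instruction, four words" means. This is only a sketch of the semantics, not of how the hardware works: the helper names `pack_words`, `unpack_words`, and `paddsw` are hypothetical, and the lanes are assumed to be stored little-endian in the 64-bit value. PADDSW adds with signed saturation, so a lane that would overflow is clamped to the 16-bit limits instead of wrapping around.

```python
import struct

WORD_MIN, WORD_MAX = -2**15, 2**15 - 1  # signed 16-bit limits

def pack_words(words):
    """Pack four signed 16-bit integers into one 64-bit register value."""
    return struct.unpack("<Q", struct.pack("<4h", *words))[0]

def unpack_words(reg):
    """Split a 64-bit register value into four signed 16-bit integers."""
    return list(struct.unpack("<4h", struct.pack("<Q", reg)))

def paddsw(dst, src):
    """Emulate PADDSW: lane-wise add of four signed words with saturation."""
    sums = []
    for a, b in zip(unpack_words(dst), unpack_words(src)):
        # Clamp each lane to the signed 16-bit range instead of wrapping.
        sums.append(max(WORD_MIN, min(WORD_MAX, a + b)))
    return pack_words(sums)

mm0 = pack_words([1, -2, 32767, -32768])
mm1 = pack_words([2, -3, 1, -1])
result = unpack_words(paddsw(mm0, mm1))  # [3, -5, 32767, -32768]
```

The last two lanes show saturation: 32767 + 1 stays at 32767 and -32768 - 1 stays at -32768.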
|
|
Single Instruction, Multiple Data (SIMD)
Execution Model
|
Typical architecture of a processor with SIMD.
Diagram of the AMD K6-III, which contains ten execution pipelines: store,
load, integer X ALU, integer Y ALU, MMX ALU (X), MMX ALU (Y), MMX/3DNow!
multiplier, 3DNow! ALU, floating-point, and branch.
|
|
The X and Y modules can work simultaneously.
This is typical of the Intel Pentium MMX and above, the AMD K6-2 and above,
and the VIA C3.
|
|
Mapping the MMX registers onto the
floating-point stack enables backwards compatibility for the register
saving that must occur as a result of task switching.
|
MMX data
Packed byte
Eight 8-bit bytes packed into 64 bits
Signed integer range (-2^7 to 2^7 - 1)
Unsigned integer range (0 to 2^8 - 1)
Packed word
Four 16-bit words packed into 64 bits
Signed integer range (-2^15 to 2^15 - 1)
Unsigned integer range (0 to 2^16 - 1)
Packed doubleword
Two 32-bit doublewords packed into 64 bits
Signed integer range (-2^31 to 2^31 - 1)
Unsigned integer range (0 to 2^32 - 1)
Quadword
One 64-bit quadword
Signed integer range (-2^63 to 2^63 - 1)
Unsigned integer range (0 to 2^64 - 1)
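The ranges above follow directly from the lane width, and the same 64 bits can be read as any of the four packed types. The short Python sketch below illustrates both points; the function names `lane_ranges` and `views` are hypothetical, and little-endian lane order is assumed, as on x86.

```python
import struct

def lane_ranges(bits):
    """Return ((signed_min, signed_max), (unsigned_min, unsigned_max)) for a lane width."""
    signed = (-(2 ** (bits - 1)), 2 ** (bits - 1) - 1)
    unsigned = (0, 2 ** bits - 1)
    return signed, unsigned

def views(reg):
    """Interpret one 64-bit value as each of the MMX packed data types (unsigned lanes)."""
    raw = struct.pack("<Q", reg)
    return {
        "bytes": list(struct.unpack("<8B", raw)),        # eight 8-bit lanes
        "words": list(struct.unpack("<4H", raw)),        # four 16-bit lanes
        "doublewords": list(struct.unpack("<2I", raw)),  # two 32-bit lanes
        "quadword": list(struct.unpack("<Q", raw)),      # one 64-bit lane
    }

for bits in (8, 16, 32, 64):
    print(bits, lane_ranges(bits))

v = views(0x0807060504030201)
```

The register contents do not change between views; only the instruction decides how many lanes the 64 bits are split into.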
|
|
SSE/SSE2 instructions.
The Intel Pentium III and AMD Athlon XP provide eight 128-bit SSE registers.
SSE supports 32-bit floating-point operations.
SSE2, introduced with the Intel Pentium 4 processor, adds support for 8-, 16-,
32-, and 64-bit integer data in the XMM (SSE) registers.
|
SSE2
128-Bit Packed Double-Precision Floating-Point;
128-Bit Packed Byte Integers;
128-Bit Packed Word Integers;
128-Bit Packed Doubleword Integers;
128-Bit Packed Quadword Integers
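To make the list concrete, the sketch below derives how many elements of each SSE2 type fit into one 128-bit XMM register and emulates a packed add on eight word lanes. It is an illustration only: the names `lanes` and `paddw_xmm` are hypothetical, and the add is modeled on the wraparound behavior of PADDW, with lanes treated as unsigned 16-bit values.

```python
import struct

XMM_BYTES = 16  # one 128-bit XMM register

# struct format codes with the same element sizes as the SSE2 packed types.
SSE2_TYPES = {
    "packed double-precision floating-point": "d",
    "packed byte integers": "b",
    "packed word integers": "h",
    "packed doubleword integers": "i",
    "packed quadword integers": "q",
}

def lanes(fmt):
    """Number of elements of the given struct format that fit in one XMM register."""
    return XMM_BYTES // struct.calcsize("<" + fmt)

def paddw_xmm(a, b):
    """Add eight 16-bit lanes with wraparound (no saturation)."""
    if len(a) != 8 or len(b) != 8:
        raise ValueError("an XMM register holds exactly eight words")
    return [(x + y) & 0xFFFF for x, y in zip(a, b)]

for name, fmt in SSE2_TYPES.items():
    print(f"{name}: {lanes(fmt)} per register")

wrapped = paddw_xmm([0xFFFF] + [1] * 7, [1] * 8)
```

Compared with the 64-bit MMX registers, each integer type gets twice as many lanes per instruction.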
|
|
AMD Hammer (x86-64) architecture
will support:
*64-bit virtual addresses (implementations can have fewer).
*Register extensions through a new prefix (REX):
- Adds eight GPRs (R8-R15).
- Widens GPRs to 64 bits.
- Adds eight 128-bit streaming SIMD extension (SSE)
registers (XMM8-XMM15).
*64-bit instruction pointer (RIP).
|
The AltiVec technology extends the instruction set architecture (ISA) of
the PowerPC architecture. AltiVec technology is a short vector parallel
architecture. The AltiVec ISA is based on separate vector/SIMD-style
(single instruction stream, multiple data streams) execution units that
have high data parallelism.
Literature:
- IA-32 Intel Architecture Software Developer's Manual (three volumes):
Basic Architecture, Order Number 245470 - 24547004.pdf;
- AMD-K6 MMX Enhanced Processor. Multimedia Technology - 20726.pdf;
- AMD-K6-III Processor Data Sheet - 21918.pdf;
- AltiVec Technology. Programming Environments Manual - altivec_pem.pdf;
- AMD 64-Bit Technology. The AMD x86-64 Architecture Programmers
Overview - x86-64_overview.pdf.