February 4th, 2004
Intel Streaming SIMD Extensions 3 (SSE3) is a set of multimedia extension instructions included in the new Intel Pentium 4 processor with HT Technology (codename Prescott), which is manufactured on a 90 nm process. Compared with previous versions, software optimized for SSE3 achieves better processing performance and efficiency.
The new TMPGEnc 3.0 XPress, currently in development (first release planned for summer 2004), is optimized for the latest Intel Pentium 4 processor with HT Technology as well as for previous Pentium 4 processors.
"Intel welcomes the release of TMPGEnc 3.0 XPress which is tuned for new Intel Pentium 4 Processor with HT technology. Video encoding is a performance hungry task and new 90nm process based Intel Pentium 4 processor with HT technology provides great working environment." Kevin Sellers, Director of marketing, Intel K.K.
Processing speed increase from SSE3 in TMPGEnc 3.0 XPress
TMPGEnc 3.0 XPress uses the new memory access instructions of the latest Intel Pentium 4 processor with HT Technology and also optimizes the MPEG-1/2 encoder. SSE3 delivers especially large gains in the motion search routine. Moreover, code already optimized for the previous Intel Pentium 4 processor with HT Technology also benefits from the new processor's improved performance, so the program is faster overall.
- Faster memory access during motion estimation, an essential step when encoding to MPEG.
- Reduced memory access penalties across the entire program.
- Code optimization guided by targeted performance comparisons with earlier Intel Pentium 4 processor with HT Technology models.
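To make the memory access point concrete, here is a minimal sketch of block-matching motion estimation in plain C. It is not TMPGEnc's code: the names `block_sad`, `full_search`, and `MotionVector` are hypothetical, and an exhaustive search is used only because it is the simplest variant to read. This text does not say which SSE3 instructions the encoder uses. One plausible candidate, offered here as an assumption, is LDDQU, the SSE3 unaligned 128-bit load, since the candidate blocks in the reference frame start at arbitrary byte offsets and are rarely 16-byte aligned.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>
#include <stdlib.h>

#define BLOCK 16  /* macroblock edge in pixels, as in MPEG-1/2 */

/* Hypothetical result type: best offset found and its matching cost. */
typedef struct {
    int dx, dy;
    unsigned sad;
} MotionVector;

/* Sum of absolute differences (SAD) between two 16x16 luma blocks.
   A SIMD version would load 16 pixels per row at once; the reference
   row pointer is usually unaligned, which is where an unaligned load
   instruction matters. */
static unsigned block_sad(const uint8_t *cur, const uint8_t *ref, int stride)
{
    unsigned sad = 0;
    for (int y = 0; y < BLOCK; y++) {
        for (int x = 0; x < BLOCK; x++)
            sad += (unsigned)abs((int)cur[x] - (int)ref[x]);
        cur += stride;
        ref += stride;
    }
    return sad;
}

/* Exhaustive search of +/-range pixels around the block at (bx, by).
   Both frames are width x height, 8 bits per pixel, stride == width. */
static MotionVector full_search(const uint8_t *cur, const uint8_t *ref,
                                int width, int height,
                                int bx, int by, int range)
{
    assert(range >= 0);
    MotionVector best = { 0, 0, UINT_MAX };
    const uint8_t *cur_blk = cur + by * width + bx;

    for (int dy = -range; dy <= range; dy++) {
        for (int dx = -range; dx <= range; dx++) {
            int rx = bx + dx, ry = by + dy;
            /* Skip candidates that fall outside the reference frame. */
            if (rx < 0 || ry < 0 || rx + BLOCK > width || ry + BLOCK > height)
                continue;
            unsigned sad = block_sad(cur_blk, ref + ry * width + rx, width);
            if (sad < best.sad) {
                best.dx = dx;
                best.dy = dy;
                best.sad = sad;
            }
        }
    }
    return best;
}
```

Each candidate offset reads 256 reference pixels, and a search window of +/-16 pixels evaluates up to 1089 candidates per macroblock, so the cost of loading reference data dominates this loop. That is consistent with the emphasis above on memory access during motion estimation.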
TMPGEnc 3.0 XPress SSE3 optimization test results
- MPEG-2 CBR file: 1 min 15 s, 6 Mbps, 48.5 MB
- Re-encode the file above to MPEG-2 with TMPGEnc 3.0 XPress
- Video: DVD NTSC, CBR 6000 kbps, high-precision motion estimation. Audio: MP2, 384 kbps.
- CPU: all available instruction sets enabled. Multithread: all enabled. Cache settings: disabled.
- MPEG-2 file input and output on the same hard disk drive.
* Test conducted by Pegasys Inc.
** This benchmark is provided for information only. Results may vary depending on the hardware configuration.
The results of the benchmark test conducted by our company confirm that the SSE3 optimization speeds up processing in TMPGEnc 3.0 XPress.
We found that execution speed decreases when SSE2 instructions are enabled on the Opteron and Athlon 64 processors. The reason is that performance drops dramatically when SSE and SSE2 floating-point instructions and integer instructions are used in code where they depend on each other*. This is a characteristic of the Opteron and Athlon 64 processors, and a change in a future version may bring improvements.
Code that uses such instructions appears throughout TMPGEnc, especially in the DCT and the NR (noise reduction) filters.
* At the instruction level, such code takes 10 times the normal time, so these results are not an anomaly.
Pegasys continues to improve the optimization of TMPGEnc 3.0 XPress in pursuit of better product quality.
* Product specifications are subject to change without notice.
* Windows is a trademark of Microsoft Corporation in the United States and other countries.
* All product and company names cited in this document are the property of their respective owners. (TM) and (R) are omitted in this document.