intel_c++_intrinsics.pdf

Intel(R) C++ Intrinsic Reference
Disclaimer
Overview: Intrinsics Reference
Intrinsics for Intel(R) C++ Compilers
Availability of Intrinsics on Intel Processors
Details about Intrinsics
Registers
Data Types
New Data Types Available
__m64 Data Type
__m128 Data Types
Data Types Usage Guidelines
Accessing __m128i Data
Naming and Usage Syntax
References
Intrinsics for Use across All IA
Overview: Intrinsics for All IA
Integer Arithmetic Intrinsics
Floating-point Intrinsics
String and Block Copy Intrinsics
Miscellaneous Intrinsics
MMX(TM) Technology Intrinsics
Overview: MMX(TM) Technology Intrinsics
The EMMS Instruction: Why You Need It
Why You Need EMMS to Reset After an MMX(TM) Instruction
EMMS Usage Guidelines
MMX(TM) Technology General Support Intrinsics
MMX(TM) Technology Packed Arithmetic Intrinsics
MMX(TM) Technology Shift Intrinsics
MMX(TM) Technology Logical Intrinsics
MMX(TM) Technology Compare Intrinsics
MMX(TM) Technology Set Intrinsics
MMX(TM) Technology Intrinsics on IA-64 Architecture
Data Types
Streaming SIMD Extensions
Overview: Streaming SIMD Extensions
Floating-point Intrinsics for Streaming SIMD Extensions
Arithmetic Operations for Streaming SIMD Extensions
Logical Operations for Streaming SIMD Extensions
Comparisons for Streaming SIMD Extensions
Conversion Operations for Streaming SIMD Extensions
Load Operations for Streaming SIMD Extensions
Set Operations for Streaming SIMD Extensions
Store Operations for Streaming SIMD Extensions
Cacheability Support Using Streaming SIMD Extensions
Integer Intrinsics Using Streaming SIMD Extensions
Intrinsics to Read and Write Registers for Streaming SIMD Extensions
Miscellaneous Intrinsics Using Streaming SIMD Extensions
Using Streaming SIMD Extensions on IA-64 Architecture
Data Types
Compatibility versus Performance
Macro Functions
Macro Function for Shuffle Using Streaming SIMD Extensions
Shuffle Function Macro
View of Original and Result Words with Shuffle Function Macro
Macro Functions to Read and Write the Control Registers
Exception State Macros with _MM_EXCEPT_DIV_ZERO
Macro Function for Matrix Transposition
Matrix Transposition Using _MM_TRANSPOSE4_PS Macro
Streaming SIMD Extensions 2
Overview: Streaming SIMD Extensions 2
Floating-point Intrinsics
Floating-point Arithmetic Operations for Streaming SIMD Extensions 2
Floating-point Logical Operations for Streaming SIMD Extensions 2
Floating-point Comparison Operations for Streaming SIMD Extensions 2
Floating-point Conversion Operations for Streaming SIMD Extensions 2
Floating-point Load Operations for Streaming SIMD Extensions 2
Floating-point Set Operations for Streaming SIMD Extensions 2
Floating-point Store Operations for Streaming SIMD Extensions 2
Integer Intrinsics
Integer Arithmetic Operations for Streaming SIMD Extensions 2
Integer Logical Operations for Streaming SIMD Extensions 2
Integer Shift Operations for Streaming SIMD Extensions 2
Integer Comparison Operations for Streaming SIMD Extensions 2
Integer Conversion Operations for Streaming SIMD Extensions 2
Integer Move Operations for Streaming SIMD Extensions 2
Integer Load Operations for Streaming SIMD Extensions 2
Integer Set Operations for SSE2
Integer Store Operations for Streaming SIMD Extensions 2
Miscellaneous Functions and Intrinsics
Cacheability Support Operations for Streaming SIMD Extensions 2
Miscellaneous Operations for Streaming SIMD Extensions 2
Intrinsics for Casting Support
Pause Intrinsic for Streaming SIMD Extensions 2
Macro Function for Shuffle
Shuffle Function Macro
View of Original and Result Words with Shuffle Function Macro
Streaming SIMD Extensions 3
Overview: Streaming SIMD Extensions 3
Integer Vector Intrinsics for Streaming SIMD Extensions 3
Single-precision Floating-point Vector Intrinsics for Streaming SIMD Extensions 3
Double-precision Floating-point Vector Intrinsics for Streaming SIMD Extensions 3
Macro Functions for Streaming SIMD Extensions 3
Miscellaneous Intrinsics for Streaming SIMD Extensions 3
Supplemental Streaming SIMD Extensions 3
Overview: Supplemental Streaming SIMD Extensions 3
Addition Intrinsics
Subtraction Intrinsics
Multiplication Intrinsics
Absolute Value Intrinsics
Shuffle Intrinsics for Streaming SIMD Extensions 3
Concatenate Intrinsics
Negation Intrinsics
Streaming SIMD Extensions 4
Overview: Streaming SIMD Extensions 4
Streaming SIMD Extensions 4 Vectorizing Compiler and Media Accelerators
Overview: Streaming SIMD Extensions 4 Vectorizing Compiler and Media Accelerators
Packed Blending Intrinsics for Streaming SIMD Extensions 4
Floating Point Dot Product Intrinsics for Streaming SIMD Extensions 4
Packed Format Conversion Intrinsics for Streaming SIMD Extensions 4
Packed Integer Min/Max Intrinsics for Streaming SIMD Extensions 4
Floating Point Rounding Intrinsics for Streaming SIMD Extensions 4
DWORD Multiply Intrinsics for Streaming SIMD Extensions 4
Register Insertion/Extraction Intrinsics for Streaming SIMD Extensions 4
Test Intrinsics for Streaming SIMD Extensions 4
Packed DWORD to Unsigned WORD Intrinsic for Streaming SIMD Extensions 4
Packed Compare for Equal for Streaming SIMD Extensions 4
Cacheability Support Intrinsic for Streaming SIMD Extensions 4
Streaming SIMD Extensions 4 Efficient Accelerated String and Text Processing
Overview: Streaming SIMD Extensions 4 Efficient Accelerated String and Text Processing
Packed Comparison Intrinsics for Streaming SIMD Extensions 4
Application Targeted Accelerators Intrinsics
Intrinsics for IA-64 Instructions
Overview: Intrinsics for IA-64 Instructions
Native Intrinsics for IA-64 Instructions
Integer Operations
FSR Operations
Lock and Atomic Operation Related Intrinsics
Lock and Atomic Operation Related Intrinsics
Load and Store
Operating System Related Intrinsics
Conversion Intrinsics
Register Names for getReg() and setReg()
General Integer Registers
Application Registers
Control Registers
Indirect Registers for getIndReg() and setIndReg()
Multimedia Additions
Table 1. Values of n for m64_mux1 Operation
Synchronization Primitives
Atomic Fetch-and-op Operations
Atomic Op-and-fetch Operations
Atomic Compare-and-swap Operations
Atomic Synchronize Operation
Atomic Lock-test-and-set Operation
Atomic Lock-release Operation
Miscellaneous Intrinsics
Intrinsics for Dual-Core Intel(R) Itanium(R) 2 processor 9000 series
Examples
Microsoft-compatible Intrinsics for Dual-Core Intel® Itanium® 2 processor 9000 series
Data Alignment, Memory Allocation Intrinsics, and Inline Assembly
Overview: Data Alignment, Memory Allocation Intrinsics, and Inline Assembly
Alignment Support
Allocating and Freeing Aligned Memory Blocks
Inline Assembly
Microsoft Style Inline Assembly
GNU*-like Style Inline Assembly (IA-32 architecture and Intel(R) 64 architecture only)
Example
Example
Intrinsics Cross-processor Implementation
Overview: Intrinsics Cross-processor Implementation
Intrinsics For Implementation Across All IA
MMX(TM) Technology Intrinsics Implementation
Key to the table entries
Streaming SIMD Extensions Intrinsics Implementation
Key to the table entries
Streaming SIMD Extensions 2 Intrinsics Implementation
Index
Intel® C++ Intrinsic Reference
Document Number: 312482-003US

Disclaimer and Legal Information

INFORMATION IN THIS DOCUMENT IS PROVIDED IN CONNECTION WITH INTEL® PRODUCTS. NO LICENSE, EXPRESS OR IMPLIED, BY ESTOPPEL OR OTHERWISE, TO ANY INTELLECTUAL PROPERTY RIGHTS IS GRANTED BY THIS DOCUMENT. EXCEPT AS PROVIDED IN INTEL'S TERMS AND CONDITIONS OF SALE FOR SUCH PRODUCTS, INTEL ASSUMES NO LIABILITY WHATSOEVER, AND INTEL DISCLAIMS ANY EXPRESS OR IMPLIED WARRANTY, RELATING TO SALE AND/OR USE OF INTEL PRODUCTS INCLUDING LIABILITY OR WARRANTIES RELATING TO FITNESS FOR A PARTICULAR PURPOSE, MERCHANTABILITY, OR INFRINGEMENT OF ANY PATENT, COPYRIGHT OR OTHER INTELLECTUAL PROPERTY RIGHT. UNLESS OTHERWISE AGREED IN WRITING BY INTEL, THE INTEL PRODUCTS ARE NOT DESIGNED NOR INTENDED FOR ANY APPLICATION IN WHICH THE FAILURE OF THE INTEL PRODUCT COULD CREATE A SITUATION WHERE PERSONAL INJURY OR DEATH MAY OCCUR.

Intel may make changes to specifications and product descriptions at any time, without notice. Designers must not rely on the absence or characteristics of any features or instructions marked "reserved" or "undefined." Intel reserves these for future definition and shall have no responsibility whatsoever for conflicts or incompatibilities arising from future changes to them. The information here is subject to change without notice. Do not finalize a design with this information.

The products described in this document may contain design defects or errors known as errata which may cause the product to deviate from published specifications. Current characterized errata are available on request.

Contact your local Intel sales office or your distributor to obtain the latest specifications and before placing your product order.

Copies of documents which have an order number and are referenced in this document, or other Intel literature, may be obtained by calling 1-800-548-4725, or by visiting Intel's Web Site.

Intel processor numbers are not a measure of performance. Processor numbers differentiate features within each processor family, not across different processor families. See http://www.intel.com/products/processor_number for details.

BunnyPeople, Celeron, Celeron Inside, Centrino, Centrino logo, Core Inside, FlashFile, i960, InstantIP, Intel, Intel logo, Intel386, Intel486, Intel740, IntelDX2, IntelDX4, IntelSX2, Intel Core, Intel Inside, Intel Inside logo, Intel. Leap ahead., Intel. Leap ahead. logo, Intel NetBurst, Intel NetMerge, Intel NetStructure, Intel SingleDriver, Intel SpeedStep, Intel StrataFlash, Intel Viiv, Intel vPro, Intel XScale, IPLink, Itanium, Itanium Inside, MCS, MMX, Oplus, OverDrive, PDCharm, Pentium, Pentium Inside, skoool, Sound Mark, The Journey Inside, VTune, Xeon, and Xeon Inside are trademarks of Intel Corporation in the U.S. and other countries.

* Other names and brands may be claimed as the property of others.

Copyright (C) 1996–2007, Intel Corporation. All rights reserved. Portions Copyright (C) 2001, Hewlett-Packard Development Company, L.P.