SSE3 Example

In this example of complex multiplication, several SSE3 instructions are used to improve performance when your compilation is targeted for the Intel® Pentium® 4 processor with Streaming SIMD Extensions 3.

float _Complex a[100];
float _Complex b[100];

void doit(void)
{
  int i;

  for (i = 0; i < 100; i++)
  {
    a[i] *= b[i];
  }
}

 

SSE3 Instruction   Intrinsic Equivalent                        Description
addsubps           __m128 _mm_addsub_ps(__m128 a, __m128 b)    Packed Single FP Add/Subtract
movshdup           __m128 _mm_movehdup_ps(__m128 a)            Move Packed Single FP High and Duplicate
movsldup           __m128 _mm_moveldup_ps(__m128 a)            Move Packed Single FP Low and Duplicate
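
To illustrate how these three intrinsics can combine into a complex multiply, here is a minimal hand-written sketch. It is not necessarily the code the compiler generates for doit(). The helper name cmul_interleaved and its interface are assumptions made for illustration: each complex number is stored as an interleaved (real, imaginary) pair of floats, which matches the layout of float _Complex. The sketch uses the plain textbook formula and does not reproduce any special handling of infinities or NaNs. The SSE3 path is compiled only when the compiler defines __SSE3__ (for example, with -msse3 on GCC or Clang); otherwise the scalar loop does all the work.

```c
#include <stddef.h>
#if defined(__SSE3__)
#include <pmmintrin.h>  /* SSE3: _mm_addsub_ps, _mm_movehdup_ps, _mm_moveldup_ps */
#endif

/* Hypothetical helper (not part of the original example): computes
   a[k] *= b[k] for n complex numbers stored as interleaved
   (real, imaginary) float pairs, i.e. 2*n floats per array. */
void cmul_interleaved(float *a, const float *b, size_t n)
{
    size_t k = 0;

#if defined(__SSE3__)
    /* Two complex numbers (four floats) per iteration. */
    for (; k + 2 <= n; k += 2) {
        __m128 va   = _mm_loadu_ps(a + 2 * k);  /* [ar0, ai0, ar1, ai1] */
        __m128 vb   = _mm_loadu_ps(b + 2 * k);  /* [br0, bi0, br1, bi1] */
        __m128 b_re = _mm_moveldup_ps(vb);      /* [br0, br0, br1, br1] */
        __m128 b_im = _mm_movehdup_ps(vb);      /* [bi0, bi0, bi1, bi1] */
        /* Swap real and imaginary parts of a: [ai0, ar0, ai1, ar1] */
        __m128 a_sw = _mm_shuffle_ps(va, va, _MM_SHUFFLE(2, 3, 0, 1));
        __m128 t1   = _mm_mul_ps(va, b_re);     /* [ar*br, ai*br, ...] */
        __m128 t2   = _mm_mul_ps(a_sw, b_im);   /* [ai*bi, ar*bi, ...] */
        /* addsub subtracts in even lanes and adds in odd lanes:
           [ar*br - ai*bi, ai*br + ar*bi, ...] */
        _mm_storeu_ps(a + 2 * k, _mm_addsub_ps(t1, t2));
    }
#endif

    /* Scalar tail; this is the whole loop when SSE3 is not enabled. */
    for (; k < n; k++) {
        float ar = a[2 * k], ai = a[2 * k + 1];
        float br = b[2 * k], bi = b[2 * k + 1];
        a[2 * k]     = ar * br - ai * bi;
        a[2 * k + 1] = ar * bi + ai * br;
    }
}
```

The moveldup/movehdup pair broadcasts the real and imaginary parts of b across each complex lane, so that a single addsub produces both the real part (a subtraction) and the imaginary part (an addition) of every product. Unaligned loads and stores are used so the sketch makes no assumption about array alignment.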