CSE – homework 3 – loop transformation Solution

For exercises 1 – 3, use the following code for square matrix-matrix multiplication:

  for i = 0; i < N; i++
    for j = 0; j < N; j++
      for k = 0; k < N; k++
        C[ i ][ j ] += A[ i ][ k ] * B[ k ][ j ]

  1. [5 pts.] Let N = 256, C = 64K, sizeof(element) = 8 bytes, B = 128, S = 1, LRU eviction. Perform an inner loop analysis.

  2. [5 pts.] Let N = 256, C = 64K, sizeof(element) = 8 bytes, B = 64, E = 4, LRU eviction. Perform an inner loop analysis.

  3. [5 pts.] Let N = 256, C = 64K, sizeof(element) = 8 bytes, B = 32, E = 2, LRU eviction. Perform an inner loop analysis.
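Before starting an inner loop analysis it can help to work out the cache geometry implied by each parameter set. The sketch below is only an illustration and rests on assumptions not stated in the assignment: that C is the capacity in bytes with "64K" meaning 64 * 1024, that B is the block size in bytes, E the number of lines per set, and S the number of sets, so that C = S * E * B. The helper name `cache_geometry` is hypothetical.

```python
def cache_geometry(capacity, block, element, ways=None, sets=None):
    """Derive the missing one of (ways, sets) plus elements per block.

    Assumes capacity = sets * ways * block, with all sizes in bytes.
    """
    if (ways is None) == (sets is None):
        raise ValueError("give exactly one of ways or sets")
    if ways is None:
        ways = capacity // (block * sets)
    else:
        sets = capacity // (block * ways)
    return {"sets": sets, "ways": ways, "elems_per_block": block // element}


K = 1024  # assumption: "64K" means 64 * 1024 bytes

ex1 = cache_geometry(64 * K, block=128, element=8, sets=1)  # exercise 1
ex2 = cache_geometry(64 * K, block=64, element=8, ways=4)   # exercise 2
ex3 = cache_geometry(64 * K, block=32, element=8, ways=2)   # exercise 3

for name, geom in (("ex1", ex1), ("ex2", ex2), ("ex3", ex3)):
    print(name, geom)
```

Under these assumptions, exercise 1 describes a fully associative cache (a single set), while exercises 2 and 3 are set associative with fewer elements per block.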

For the following problems 4 – 7, refer to slides 18 – 19 of slide deck 4 – loop analysis:

  4. [5 pts.] Perform a total miss analysis for loop nest IKJ.

  5. [5 pts.] Perform a total miss analysis for loop nest KIJ.

  6. [5 pts.] Perform a total miss analysis for loop nest JKI.

  7. [5 pts.] Perform a total miss analysis for loop nest KJI.
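The referenced slides are not reproduced here, so the following is a sketch under an assumption: that a loop nest name such as IKJ lists the loop indices from outermost to innermost, and that the kernel is the same matrix multiplication used in exercises 1 to 3. The hypothetical `matmul` helper runs the kernel in any requested order, which makes it easy to confirm that reordering changes only the memory access pattern, not the result.

```python
from itertools import product


def matmul(a, b, order="ijk"):
    """Multiply square matrices a and b with the loops nested in `order`.

    `order` lists the loop indices from outermost to innermost, e.g. "ikj".
    """
    order = order.lower()
    if sorted(order) != ["i", "j", "k"]:
        raise ValueError("order must be a permutation of i, j, k")
    n = len(a)
    c = [[0] * n for _ in range(n)]
    # product() varies its last position fastest, like the innermost loop.
    for values in product(range(n), repeat=3):
        idx = dict(zip(order, values))
        i, j, k = idx["i"], idx["j"], idx["k"]
        c[i][j] += a[i][k] * b[k][j]
    return c


a = [[1, 2], [3, 4]]
b = [[5, 6], [7, 8]]
results = {o: matmul(a, b, o) for o in ("ikj", "kij", "jki", "kji")}
print(results["ikj"])
```

All four orderings produce the same product; what differs is which index varies fastest, and therefore the stride with which each array is walked in the innermost loop.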

For the following questions, assume contiguously allocated row major arrays.

8. [10 pts.] Given an LRU cache with parameters C = 2048, E = 2, B = 16 and assuming |element| = 4.

Let:

  @a[1024] = AAAA0000
  @b[1024] = AAAA8000
  @c[1024] = AAAB0010

using the code segment:

  for i = 0 to 1023
    for j = 0 to 1023
      for k = 0 to 1023
        sum_prod += a[ i ] * b[ j ] + c[ k ]

sum_prod, i, j and k are in registers.

a.) What is the access stride for each loop?

b.) What is the overall hit rate for each of a, b, and c?

c.) What are the cache contents after the completion of the loop nest?
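To reason about which arrays compete for the same cache set in problem 8, one possible first step is to split each base address into tag, set index, and block offset. This sketch assumes the base addresses are hexadecimal byte addresses, that C is the capacity in bytes, E the lines per set, and B the block size in bytes; the function name `split_address` is hypothetical.

```python
def split_address(addr, capacity, ways, block):
    """Split a byte address into (tag, set index, block offset)."""
    sets = capacity // (ways * block)
    offset = addr % block
    set_index = (addr // block) % sets
    tag = addr // (block * sets)
    return tag, set_index, offset


# Base addresses from problem 8, read as hexadecimal (an assumption).
bases = {"a": 0xAAAA0000, "b": 0xAAAA8000, "c": 0xAAAB0010}
fields = {
    name: split_address(addr, capacity=2048, ways=2, block=16)
    for name, addr in bases.items()
}

for name, (tag, set_index, offset) in fields.items():
    print(f"{name}: tag={tag:#x} set={set_index} offset={offset}")
```

With these parameters the cache has 64 sets, and the first elements of a and b map to the same set while c starts one set later.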

9. [5 pts.] Given an LRU cache with parameters C = 8k, E = 512, B = 16 and assuming |element| = 4.

Let:

  @A[512,512] = AAAA0000
  @B[512,512] = AAAA8000
  @C[512,512] = AAAB0010

using the code segment:

  for i = 0 to 511
    for j = 0 to 511
      for k = 0 to 511
        C[ i ][ j ] += A[ j ][ k ] - B[ k ][ i ]

i, j and k are in registers.

a.) Perform a total miss analysis.

b.) Is there a better loop ordering for this problem? If so, state the preferred ordering. If not, why?
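When comparing loop orderings for problem 9, a useful starting point is the stride of each array reference with respect to each loop index. The sketch below tabulates those strides in elements for 512 x 512 row major arrays; `row_major_strides` is a hypothetical helper, and the subscript pairs are taken from the code segment above.

```python
def row_major_strides(subscripts, n):
    """Stride, in elements, of X[row][col] with respect to each loop index.

    `subscripts` is (row_index, col_index); the array is n x n, row major.
    """
    row, col = subscripts
    strides = {"i": 0, "j": 0, "k": 0}
    strides[row] += n  # stepping to the next row skips n elements
    strides[col] += 1  # stepping to the next column skips 1 element
    return strides


N = 512
refs = {"C": ("i", "j"), "A": ("j", "k"), "B": ("k", "i")}
table = {name: row_major_strides(sub, N) for name, sub in refs.items()}

for name, strides in table.items():
    print(name, strides)
```

A stride of 0 means the reference is invariant in that loop, 1 means unit stride, and 512 means every access lands in a different row, which is the case to avoid in the innermost loop where possible.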

The Ohio State University
