HW6-1 (43 points) Suppose we wish to write a procedure that computes the inner product of two vectors u and v. An abstract version of the function has a CPE of 14-18 with x86-64 for different types of integer and floating-point data. By doing the same sort of transformations we did to transform the abstract program combine1 into the more efficient combine4, we get the following code:

    void inner4(vec_ptr u, vec_ptr v, data_t *dest) {
        long i;
        long length = vec_length(u);
        data_t *udata = get_vec_start(u);
        data_t *vdata = get_vec_start(v);
        data_t sum = (data_t) 0;

        for (i = 0; i < length; i++) {
            sum = sum + udata[i] * vdata[i];
        }
        *dest = sum;
    }

Our measurements show that this function has a CPE of 1.50 for integer data and 3.00 for floating-point data. For data type double, the x86-64 assembly code for the inner loop is as follows:

    # Inner loop of inner4. data_t = double. OP = *.
    # udata in %rbp, vdata in %rax, sum in %xmm0, i in %rcx, limit in %rbx.
    .L15:                                    # loop:
        vmovsd 0(%rbp,%rcx,8), %xmm1         # Get udata[i]
        vmulsd (%rax,%rcx,8), %xmm1, %xmm1   # Multiply by vdata[i]
        vaddsd %xmm1, %xmm0, %xmm0           # Add to sum
        addq   $1, %rcx                      # Increment i
        cmpq   %rbx, %rcx                    # Compare i:limit
        jl     .L15                          # If <, goto loop

Assume that the functional units have the latencies and issue times given in Figure 5.12 (and in the course notes).

A. Diagram how this instruction sequence would be decoded into operations, and show how the data dependencies between them would create a critical path of operations in the style of Figures 5.13 (figure: opt/dpb-sequential) and 5.14 (figure: opt/dpb-flow and figure: opt/dpb-flow-abstract). (25 points.)

B. For data type double, what lower bound on the CPE is determined by the critical path? Give a numerical value and an explanation. (6 points.)

C. Assuming similar instruction sequences for the integer code as well, what lower bound on the CPE is determined by the critical path for integer data? Give a numerical value and an explanation. (6 points.)

D. Explain how the floating-point version can have a CPE of 3.00 even though the multiplication operation requires 5 cycles. (6 points.)

HW6-2 (27 points) Write a version of the inner product procedure described in the previous problem that uses six-way loop unrolling (6×1; no parallelism). (11 points.)
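
As a rough illustration of the six-way unrolling requested in HW6-2, the sketch below applies 6×1 unrolling to inner4. It is one possible shape, not a reference solution. The function name inner6x1, the vec_rec struct, and the bodies of vec_length and get_vec_start are hypothetical stand-ins so the code compiles on its own, and the assert on the two lengths is an added sanity check that is not in the original.

```c
#include <assert.h>

typedef double data_t;            /* assumed element type; could be long, etc. */

/* Hypothetical minimal vector type standing in for the course's vec_ptr. */
typedef struct {
    long len;
    data_t *data;
} vec_rec, *vec_ptr;

static long vec_length(vec_ptr v) { return v->len; }
static data_t *get_vec_start(vec_ptr v) { return v->data; }

/* Inner product with 6x1 loop unrolling: six elements per iteration,
   all accumulated into the single variable sum (no parallelism). */
void inner6x1(vec_ptr u, vec_ptr v, data_t *dest)
{
    long i;
    long length = vec_length(u);
    long limit = length - 5;      /* last index where i..i+5 are all valid */
    data_t *udata = get_vec_start(u);
    data_t *vdata = get_vec_start(v);
    data_t sum = (data_t) 0;

    assert(vec_length(v) >= length);  /* added check, not in the original */

    /* Main loop: process six elements at a time, in order. */
    for (i = 0; i < limit; i += 6) {
        sum = sum + udata[i]     * vdata[i];
        sum = sum + udata[i + 1] * vdata[i + 1];
        sum = sum + udata[i + 2] * vdata[i + 2];
        sum = sum + udata[i + 3] * vdata[i + 3];
        sum = sum + udata[i + 4] * vdata[i + 4];
        sum = sum + udata[i + 5] * vdata[i + 5];
    }

    /* Cleanup loop: finish any remaining elements (at most five). */
    for (; i < length; i++) {
        sum = sum + udata[i] * vdata[i];
    }
    *dest = sum;
}
```

Each product is still added to the same accumulator in the original order, so the chain of additions through sum is exactly as long as in inner4. The unrolling only reduces the loop overhead (index increment, compare, branch) per element.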
