Problem Set 3: I/O cost & Query Optimization Solution

$30.00 $24.00

For the following questions, assume the following database schema: Users (UserID INTEGER, Name CHAR(30), Age INTEGER, ReviewCount INTEGER) Businesses (BusinessID INTEGER, BName CHAR(30), City CHAR(20), State CHAR(2 )) Checkins (BusinessID INTEGER, Weekdays INTEGER, Weekends INTEGER) Reviews (ReviewID INTEGER, UserID INTEGER, BusinessID INTEGER, Stars REAL) Reviews (UserID) is a foreign key referring to Users (UserID). Reviews…

5/5 – (2 votes)

You’ll get a: zip file solution

 

Description

5/5 – (2 votes)

For the following questions, assume the following database schema:

Users (UserID INTEGER, Name CHAR(30), Age INTEGER, ReviewCount INTEGER)

Businesses (BusinessID INTEGER, BName CHAR(30), City CHAR(20), State CHAR(2 ))

Checkins (BusinessID INTEGER, Weekdays INTEGER, Weekends INTEGER)

Reviews (ReviewID INTEGER, UserID INTEGER, BusinessID INTEGER, Stars REAL)

Reviews (UserID) is a foreign key referring to Users (UserID).

Reviews (BusinessID) is a foreign key referring to Businesses (BusinessID).

Checkins (BusinessID) is a foreign key referring to Businesses (BusinessID).

Question 1 16 pts

Consider the following SQL query:

SELECT *

FROM Users

WHERE (Name = “John” OR Name = “Mary”) AND (Age = 20 OR Age > 50) ;

Which of the following indexes on the table Users match the selection condition?

https://canvas.wisc.edu/courses/176732/quizzes/167910/take Page 1 of 5

Quiz: Problem Set 3: I/O cost & Query Optimization , 22:39

hash index on (Name)

B+ tree index on (UserID, Age)

B+ tree index on (Age)

hash index on (Age)

Question 2 15 pts

Consider the following SQL query:

SELECT *

FROM Users

WHERE Age >= 20 AND Age < 30 AND ReviewCount >= 500 ;

What is the selectivity of the above selection? Assume that the values of Age are integers uniformly distributed between 10 and 109, and the values of ReviewCount are integers uniformly distributed between 0 and 999.

1%

5%

10%

20%

Question 3 18 pts

We want to compute the natural join between Businesses and Checkins. Assume that Businesses has 20,000 pages, while Checkins has 40,000 pages. The buffer pool has size B = 10,000. Finally, assume that both relations are

https://canvas.wisc.edu/courses/176732/quizzes/167910/take Page 2 of 5

Quiz: Problem Set 3: I/O cost & Query Optimization , 22:39

sorted on the join attribute BusinnesID.

Compute the I/O cost of the following join algorithms:

Join Algorithm

I/O cost

Block Nested Loop Join

Sort Merge Join

Hash Join

Question 4 16 pts

Consider the Relational Algebra expression:

Which of the following RA expressions are equivalent?

Question 5 20 pts

Consider the following SQL query:

https://canvas.wisc.edu/courses/176732/quizzes/167910/take Page 3 of 5

Quiz: Problem Set 3: I/O cost & Query Optimization , 22:39

SELECT City, COUNT(BusinessID)

FROM Businesses

GROUP BY City;

Suppose we compute the above query using hash-based aggregation. Recall that in this algorithm, we construct an in-memory hash table, where each entry of the table is of the form <City, Count> (City is the key, and Count holds the running count). Assume that each entry of the hash table is 30 Bytes, each page holds 1200 Bytes, and the fudge factor of the hash table is f = 1.2.

If the number of distinct cities in the Businesses table is 10,000, what is the smallest possible buffer size such that the algorithm runs in a single pass (i.e., the hash table fits in the buffer)?

300

301

250

251

Question 6 15 pts

Suppose we want to compute the following query:

How many left-deep join plans exist that can be used to compute this query?

https://canvas.wisc.edu/courses/176732/quizzes/167910/take Page 4 of 5

Quiz: Problem Set 3: I/O cost & Query Optimization

, 22:39

Quiz saved at 10:39pm Submit Quiz

https://canvas.wisc.edu/courses/176732/quizzes/167910/take Page 5 of 5

Problem Set 3: I/O cost & Query Optimization Solution
$30.00 $24.00