Homework 2 Solution

$30.00 $24.00

Part 1: Text, Joins and Subqueries For some reason, your instructor has been scraping the Caltrans website every 15 minutes or so, since 2015, to get road conditions on all of the highways within California. The data is written to MySQL. Your version of the data is hourly, and only for 2017. A Caltrans highway…

5/5 – (2 votes)

You’ll get a: zip file solution

 

Categorys:
Tags:

Description

5/5 – (2 votes)

Part 1: Text, Joins and Subqueries

For some reason, your instructor has been scraping the Caltrans website every 15 minutes or so, since 2015, to get road conditions on all of the highways within California. The data is written to MySQL. Your version of the data is hourly, and only for 2017.

A Caltrans highway conditions report looks like the following and contains conditions for individual stretches of highway (\area”) typically representing a coarse area of the state: Northern, Southern, Central, Sierra Nevada etc.

SR 120

[IN THE CENTRAL CALIFORNIA AREA & SIERRA NEVADA]

IS CLOSED FROM CRANE FLAT TO 5 MI WEST OF THE JCT OF US 395 /TIOGA PASS/ (TUOLUMNE, MONO CO) – FOR THE WINTER – MOTORISTS ARE ADVISED TO USE AN ALTERNATE ROUTE

[YOSEMITE NAT’L PARK]

FOR YOSEMITE NAT’L PARK ROAD INFORMATION CALL 209-372-0200

The schema for the caltrans table looks like the following:

CREATE TABLE caltrans (

reported timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP,

highway varchar(6) NOT NULL,

area varchar(255),

condition text NOT NULL,

hash varchar(32) NOT NULL

);

1

reported is the time the data was scraped, highway is the highway the status pertains to pre xed by its type (i.e. US101, SR1, I405), area refers to a particular part of the state or highway, and condition is the update itself. Since we cannot use text as a primary key, a hash column was added.

US Highway (US) California State Route (SR) Interstate (I)

Exercises

  1. Write a query that returns a list of all the highway stretches in 2017 that were closed due to snow at any point of the year, or were closed for the winter. Order them by highway and area and give us the top 20 results, both columns in descending order. (Hint 1: You don’t need to do anything with dates to answer this question. Hint 2: Before writing a query, look at the data.)

  1. For each highway stretch found in part (a), compute the percentage of days out of the year that it was closed. If a highway stretch was closed for only a partial day, it counts as a full day. There are at least three ways to solve this problem. Try to use a method that involves a join, and a method that does not. Report the highway, area/stretch, the percentage of days it was closed in descending order by percentage, and only give us the 5 highest percentages and the highways and areas they belong to. You may hardcode the number of days in the year (see the note below).

Note that not all of your responses will be perfect in theory because there were times when the instructor’s script lost Internet connectivity, or the power went out. So, there may not be exactly 365 days in 2017. You will want to nd the number of days represented in this dataset.

Part 2: Join De nitions

Exercises

  1. Your instructor almost included the above Venn Diagram in his lecture slides to show how di erent types of joins are related, but he noticed that it was wrong in at least one way. Explain at least one thing that is wrong about the diagram.

2

Part 3: More Joins and Subqueries

In Homework 1, we did several things with the Bird Scooter use case, but we did not have any data to practice writing queries with. Suppose we now have trip data in the following two tables:

  1. trip starts;

  1. trip ends;

Exercises

  1. Write a query that computes the elapsed time of each trip. If something happened and a trip end was not recorded, the elapsed time shall be 24 hours, per Bird’s policy. Print your results as trip id, user id, and trip length. Only show the rst 5, without any special ordering.

  1. Write a query that computes the charge to the user for each trip. The charge is calculated as follows: $1 at rate per trip plus 15 cents per minute. All fractional minutes are rounded up to the next minute. Assume we did not store the results of the query from part (a). Print the rst 5 results (no ordering) as trip id, user id and trip charge.

  1. Putting it all together: Suppose we bill the user at the end of the month rather than at the end of each trip. Write a query that computes the monthly charge for trips in March 2018 for each user assuming we did not store the results from parts (a) or (b). The max daily charge is $100. Print your results: user id and monthly total for the rst ve users (no ordering). In particular, how much does user id = 2 owe?

  1. In the solution set for Homework 1, it was mentioned that another way we can record starts and ends of trips was to use one table, 2 rows per trip: one row representing the start and a second row representing the end of the trip. We would then have an enum or bit(1) that speci es whether the row refers to a start or an end. If we wanted to use this one single table as the basis to charge users, what type of join would we need to compute?

3

Homework 2 Solution
$30.00 $24.00