Description
Overview
This homework is worth 100 points (out of 1000) toward your final grade. It is due on Sunday, April 14, at 11:59 p.m. Late submissions will be penalized 20% during a 3-day grace period up until Wednesday, April 17, 11:59 p.m. After that time, no late work will be accepted. Your homework submission should be a document saved and submitted as a PDF file via the link found in the Homework section of the “Week 12 Moodle March 31 – April 6” — which is the same place where you found this file.
This homework will give you hands-on practice in working with a sample Data Warehouse. In this homework you will create a data warehouse (built according to the dimensional model/star schema) and populate it using scripts provided. You will use the data warehouse you create for running some analytical queries and answering a few questions.
Objectives
-
Become familiar with the structure of a dimensional model / star schema data warehouse
-
Understand the unique nature of the date dimension
-
Successfully run the scripts necessary to create the sample data warehouse consisting of 5 dimension tables and one fact table. Run a script to verify that your data warehouse is correctly built.
-
Run SQL against your data warehouse to answer the assigned problems.
Step One: Ensure that your MySQL environment is working.
This assignment follows on Homework # 3 in which you created a MySQL environment on your personal computer and executed queries against it. For Homework # 5, you must first ensure that MySQL is up and running on your device, and that you are able to run SQL queries against your database(s).
Step Two: Creating the Data Warehouse Tables
To get started, you need to download some scripts from Moodle and create your database and tables, and then load the tables with data.
Before you can create your database, you need to make sure that your instance of MySQL is running.
Then using your query editor, you must connect to the MySQL instance.
Creating the Data Warehouse:
Homework # 5 – Data Warehouse Lab
The SQL statements to create the tables for your Homework # 5 data warehouse can be found on the Moodle site “Week 12 Moodle March 31 – April 6” under the Homework assignments heading. There are 9 scripts for you to run.
Script # 1: Create the Database
Scripts # 2-6: Create and Load the Five Dimension Tables
Script # 7: Create and Load the Fact Table
Script # 8: Create the Foreign Key Constraints needed for the Fact Table
Script # 9: Run the “verify” script to ensure that the data warehouse is built properly
Your Sales_DW database consists of the following tables:
-
Dim_Product
-
Dim_Store
-
Dim_Customer
-
Dim_Date
-
Dim_SalesPerson
-
Fact_ProductSales
-
Dates, Numbers, Numbers_Small – Created and used by the script to create the dim_date dimension, but NOT used for anything else.
To complete this homework, you must open up each of these 9 script files, copy the SQL statements, paste the SQL statements into your query editor and execute the script.
Note: Most scripts begin with a command to DROP the table before it creates it. This allows you to run the script over and over as needed.
After creating the five dimension tables and one fact table, run the “verify” script. You should see the following tables and row counts for each.
CSCI 3287 Database Systems Page 2
Homework # 5 – Data Warehouse Lab
Preparing Your Homework Submission
Your results for this homework assignment should be captured in a document (such as a .txt file, MS Word or similar tool.) Please then save your final deliverable document as a PDF. Use the link found in the Homework Assignment section of “Week 12 Moodle March 31 – April 6” Moodle site to submit your work for grading. If you are doing PAIR PROGRAMMING on this assignment, please be sure to identify the name of your programming partner on your submission. You must EACH submit your own results document for this homework and both names must appear on all submitted documents.
Data Warehouse Problems
For this homework you must answer the questions below stating the results of your analysis of the data in the data warehouse. Each answer should be stated in a sentence or two providing the requested analysis. For these questions, you must create and execute one or more SQL Queries against the sales_dw data warehouse to answer the question. In addition to the answer to the question, you must turn in your SQL code AND your answer set from the query you used to determine the answer.
Question 1. Which salesperson produced the most total revenue for this organization during 2012? (Total Revenue = SalesPrice * Quantity)
Question 2. Which customer’s revenue increased the most from 2012 to 2013?
(Total Revenue = SalesPrice * Quantity)
Question 3. Rank all stores by their total revenue for each year represented in the data warehouse.
In other words, for a given year rank the three stores by revenue for that year.
Question 4. Which product yielded the highest profit in 2015?
(Profit = (sales price – actual cost) * quantity )
Question 5. In which calendar quarter in 2016 did the Boulder store show the highest revenue?
CSCI 3287 Database Systems Page 3