Description
Introduction
This assignment will have you implement some Mongo functionality in Java. It is loosely based on the existing MongoDB driver for Java, which is used to connect to an existing Mongo database from a Java application. Instead of connecting to an existing Mongo database, your implementation will also be in charge of managing the data: reading and writing JSON from disk.
To assist with handling the JSON, you are provided with a JSON parser called Gson. You can use this library to assist you with parsing the JSON contents of the database, and you will use objects from this library to assist you in implementing Mongo style queries.
Overview
Since this homework assignment is unrelated to the previous assignments we’ve completed in this class, it has been put into a separate repository. As you did with the previous repository, make a private fork of this repository, making sure to include your name (and your partner’s name) in the repository name.
Inside the repo you will find four files for you to complete: DB.java, DBCollection.java, DBCursor.java and Document.java. Below is a description of each of these files as well as some tips on how to implement the required functionality of this assignment.
Documents
For our documents we are using JsonObjects from the Gson library. These objects will contain all of the key-value pairs for a given document. You can then use this information to help you complete queries (using find()). Note that these JsonObjects are also used to encapsulate query parameters.
The Document class contains two methods: one that converts a String into a JsonObject and one that reverses this process. These methods will be used when accessing documents through a collection (see below). Turning a string into a JsonObject can be accomplished with the help of the JsonParser object from the Gson library.
For the purposes of this assignment, the data types that a document can contain will be restricted to the following:
* Strings
* Documents (embedded documents)
* Arrays
There should be no restrictions on how these can be used in a document. For example, your implementation should be able to handle an array of documents, an embedded document with an array, an array of arrays, etc.
The above constructs correspond to the JsonPrimitive, JsonObject, and JsonArray objects, respectively. You will need to become familiar with these objects and use them in your implementation. You will likely have to use the JsonElement object as well.
DB
This class is the entry point to any interaction with this Mongo implementation. It is loosely based on the same class provided by the Mongo Java driver.
You can use this class to set which Mongo database you are currently interacting with. A Mongo database is represented on disk by a directory of collections (more on collections below). Essentially, the database is keeping track of which directory you are currently working in.
All databases should be contained in the “testfiles” directory that has been provided to you in the Eclipse project. For example, if I wanted to access the “hw5” database, I would expect to see a directory called “testfiles/hw5” in my Eclipse project. I could access that database in code by doing: new DB("hw5");
If a database is requested that does not yet exist, it should be created (by creating the directory). If a database is dropped, the directory associated with it should be deleted.
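The create-on-request and delete-on-drop behavior can be sketched with java.nio.file alone. This is only an illustration under assumptions: the class name DbDirectorySketch, the methods open and drop, and the root parameter are hypothetical and are not the signatures required in DB.java. In the real class the root would be the “testfiles” directory.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Comparator;
import java.util.stream.Stream;

// Hypothetical helper; names are illustrative, not the DB.java API.
class DbDirectorySketch {

    // Returns the directory for the named database, creating it if it is missing.
    static Path open(Path root, String name) {
        try {
            return Files.createDirectories(root.resolve(name));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Deletes the database directory along with everything inside it.
    static void drop(Path dbDir) {
        if (!Files.exists(dbDir)) {
            return; // nothing to drop
        }
        try (Stream<Path> paths = Files.walk(dbDir)) {
            // Reverse order so that children are deleted before their parent.
            paths.sorted(Comparator.reverseOrder()).forEach(p -> {
                try {
                    Files.delete(p);
                } catch (IOException e) {
                    throw new UncheckedIOException(e);
                }
            });
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) throws IOException {
        // A temporary root keeps this demo away from the real "testfiles" directory.
        Path root = Files.createTempDirectory("testfiles");
        Path db = open(root, "hw5");
        System.out.println("created: " + Files.isDirectory(db));
        drop(db);
        System.out.println("exists after drop: " + Files.exists(db));
        Files.delete(root);
    }
}
```

A plain Files.delete fails on a non-empty directory, which is why the sketch walks the tree and removes the collection files first.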
DBCollection
This class is loosely based on the same class from the Mongo Java driver. Its primary role is to manage a single collection. This includes accessing documents, inserting/updating/deleting documents, and querying documents.
A collection is represented on disk as a file with JSON data, stored in plain text. These files will be stored in the appropriate database. So for example, if I wanted to access the “assignments” collection in the “hw5” database, I would expect to see a file called “assignments.json” in the “testfiles/hw5” directory. If a collection is requested that does not yet exist, this file should be created.
Inserting documents is as simple as adding information to this file. To make retrieving documents easier, I suggest that you include a blank line with a single tab (\t) character in between each document in the collection. This will make it easier for you to search the document using the getDocument() method provided in this class.
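The suggested file layout might look like the following sketch, which treats each document as already-serialized JSON text. The class CollectionFileSketch, the readAll helper, and the zero-based index parameter on getDocument are assumptions made for illustration. The provided DBCollection.java may declare these differently.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper; names and signatures are illustrative only.
class CollectionFileSketch {

    // A line holding only a tab marks the end of each document.
    static final String SEPARATOR = "\t";

    // Appends one document (JSON text) followed by a separator line.
    static void insert(Path file, String json) {
        String entry = json + System.lineSeparator() + SEPARATOR + System.lineSeparator();
        try {
            Files.write(file, entry.getBytes(StandardCharsets.UTF_8),
                    StandardOpenOption.CREATE, StandardOpenOption.APPEND);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Reads every document back as JSON text, in insertion order.
    static List<String> readAll(Path file) {
        List<String> docs = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        try {
            for (String line : Files.readAllLines(file, StandardCharsets.UTF_8)) {
                if (line.equals(SEPARATOR)) {
                    // Separator reached: the buffered lines form one document.
                    docs.add(current.toString());
                    current.setLength(0);
                } else {
                    if (current.length() > 0) {
                        current.append('\n');
                    }
                    current.append(line);
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return docs;
    }

    // Assumed signature: zero-based position of the document in the file.
    static String getDocument(Path file, int index) {
        return readAll(file).get(index);
    }
}
```

Because the separator is a whole line, a document may span several lines (for example, pretty-printed JSON) and still be read back as one unit.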
Removing documents from the collection will be tricky. You have a few options on how to implement this. You could include a “header” of sorts at the beginning of the file to indicate which documents have been deleted or not. Or you could include a flag with each document indicating whether it is still active or not. Or you could also consider making a new copy of the collection when a document is deleted, leaving the deleted document out of the collection. All of these approaches have their pros and cons, but all of them will work for the purposes of this assignment.
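As one illustration, the third option (copying the collection without the deleted document) could be sketched as follows. It assumes that documents are separated by a line containing a single tab, and that the document to remove is identified by its zero-based position. The class CopyOnDeleteSketch and its methods are hypothetical names.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper; names are illustrative, not the DBCollection.java API.
class CopyOnDeleteSketch {

    // Assumed layout: a line holding only a tab ends each document.
    static final String SEPARATOR = "\t";

    // Returns the file's lines with the index-th document (and its separator) left out.
    static List<String> withoutDocument(List<String> lines, int index) {
        List<String> kept = new ArrayList<>();
        int current = 0; // position of the document the current line belongs to
        for (String line : lines) {
            if (current != index) {
                kept.add(line);
            }
            if (line.equals(SEPARATOR)) {
                current++;
            }
        }
        return kept;
    }

    // Writes the surviving documents to a copy, then swaps the copy into place.
    static void remove(Path file, int index) {
        try {
            List<String> lines = Files.readAllLines(file, StandardCharsets.UTF_8);
            Path copy = file.resolveSibling(file.getFileName() + ".tmp");
            Files.write(copy, withoutDocument(lines, index), StandardCharsets.UTF_8);
            Files.move(copy, file, StandardCopyOption.REPLACE_EXISTING);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

The trade-off is that every delete rewrites the whole file, which is simple and keeps the file free of stale entries but becomes slow for large collections.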
Querying is the other main responsibility of the collection. Queries will return a DBCursor (see below), and you should consider letting the cursor do most of the work when it comes to querying.
For this assignment you need to be able to handle the following types of queries:
* Queries that include all documents from the collection
* Queries that request a single document from the collection
* Queries based on data in an embedded document or list
* Queries with comparison operators (you do not have to implement the other operator types)
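For the comparison operators, one possible shape is a small function that maps each Mongo operator name ($eq, $ne, $gt, $gte, $lt, $lte) to a check on the result of a comparison. The sketch below works on plain string values. Comparing numerically when both values parse as numbers, and lexicographically otherwise, is an assumption of this example and not something the assignment specifies. ComparisonSketch and its methods are hypothetical names.

```java
// Hypothetical helper; the comparison rule is an assumption, not a requirement.
class ComparisonSketch {

    // Returns true if docValue satisfies "operator queryValue".
    static boolean matches(String operator, String docValue, String queryValue) {
        int cmp = compare(docValue, queryValue);
        switch (operator) {
            case "$eq":  return cmp == 0;
            case "$ne":  return cmp != 0;
            case "$gt":  return cmp > 0;
            case "$gte": return cmp >= 0;
            case "$lt":  return cmp < 0;
            case "$lte": return cmp <= 0;
            default:
                throw new IllegalArgumentException("Unsupported operator: " + operator);
        }
    }

    // Assumption: numeric comparison when both sides parse as numbers,
    // otherwise ordinary String ordering.
    static int compare(String a, String b) {
        try {
            return Double.compare(Double.parseDouble(a), Double.parseDouble(b));
        } catch (NumberFormatException e) {
            return a.compareTo(b);
        }
    }

    public static void main(String[] args) {
        System.out.println(matches("$gt", "10", "9"));
        System.out.println(matches("$lt", "apple", "banana"));
    }
}
```

In the actual implementation the two values would come from JsonPrimitive objects in the document and in the query, so the comparison rule should follow whatever the instructor's tests expect.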
DBCursor
The DBCursor class represents the result of a query. It allows the user to navigate through the resulting documents one at a time.
It is recommended that the actual query processing take place in this class, perhaps when the cursor is constructed. Other implementations are possible, but this will lead to a clean design.
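A minimal sketch of that design, assuming the cursor is handed every document in the collection together with a query test, might look like this. The generic type parameter stands in for JsonObject so the example runs without Gson. The class CursorSketch and the methods hasNext, next, and count are illustrative names, not the required DBCursor interface.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;
import java.util.function.Predicate;

// Hypothetical cursor; T stands in for the document type (JsonObject in the assignment).
class CursorSketch<T> implements Iterator<T> {

    private final List<T> results = new ArrayList<>();
    private int position = 0;

    // The query is evaluated once, here, so navigation afterwards is trivial.
    CursorSketch(List<T> documents, Predicate<T> query) {
        for (T doc : documents) {
            if (query.test(doc)) {
                results.add(doc);
            }
        }
    }

    // True while there are matching documents that have not been returned yet.
    @Override
    public boolean hasNext() {
        return position < results.size();
    }

    // Returns the next matching document and advances the cursor.
    @Override
    public T next() {
        if (!hasNext()) {
            throw new NoSuchElementException();
        }
        return results.get(position++);
    }

    // Total number of documents that matched the query.
    public int count() {
        return results.size();
    }
}
```

For example, new CursorSketch<>(docs, d -> true) would correspond to a query that includes every document in the collection.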
Testing
We have spent a lot of time this semester discussing how to properly test your code. Unlike previous assignments, you will not be provided with a suite of tests. Rather, you will be expected to test your code yourself. For this assignment I want you to write at least 6 unit tests. You should submit these tests with the rest of your assignment, and they will be graded with the rest of your assignment. You are encouraged to write more tests than that to verify that your code is working properly, since you will have no provided unit tests to rely on while you are working on your implementation.
An additional set of tests written by the instructor will be used during grading to verify the functionality of your submission.
Grading
DB (15 points)
Document (15 points)
DBCollection: insert/update/delete (20 points), querying (10 points), everything else (10 points)
DBCursor (20 points)
Tests (10 points)
Total (100 points)