CS 4604 Project Assignment 3


Released on Mar 8, 2013. Hardcopy due at the start of class on Mar 22, 2013.
  1. (0 points) Modify your database schema to address all our comments on the solution you turned in for Project Assignment 2. Itemize all the changes you made. It is enough if you explain in words, e.g., "We created a table called Pubs to store the keys and titles of all the publications in the database. We added constraints to create foreign key references to this table from tables Articles, InProceedings, ...". We will not grade assignment 3 unless you list the changes made to your schema. If you did not make any changes, explain why.

  2. (10 points) Conferences are becoming increasingly important venues in computer science. Journals, on the other hand, are becoming less and less important. Nevertheless, scientists continue to first publish a paper in a conference and then submit a full version (with the same title) to a journal. Count the number of publications that first appeared in a conference (the type of such a publication is inproceedings) and later appeared with the same title in a journal (the type of such a publication is article).

  3. (15 points) Write a query to find the names of the 10 most prolific authors, i.e., the 10 authors who have written the most publications. In this query and the remaining queries, the rank of an author on the list of authors of a publication does not matter. Ignore editorships in this and the remaining queries as well, i.e., if an author has edited a publication, do not credit this publication in the author's count. Return the author name and the number of publications written by the author, for the 10 most prolific authors.

  4. (20 points) Some authors like to work alone, but can still be very prolific. Find the 10 authors who have written the largest number of single-author papers. Return both the author name and the number of single-author papers written.

  5. (25 points) Other authors are highly collaborative. A shining example is the Hungarian mathematician Paul Erdös, who has published the largest number of papers in mathematics and has worked with hundreds of collaborators. As a side note, "The Man Who Loved Only Numbers" is a fascinating biography of this great mathematician. For this query, you are required to find the 10 scientists with the highest number of collaborators. Two scientists have collaborated if they have written at least one publication together; the number of papers they have written together does not matter. The number of collaborators of a scientist is the number of other scientists he/she has written at least one publication with. Return the author name and the number of other authors he/she has collaborated with, for the 10 authors with the highest number of collaborators.

  6. (30 points) Keeping to the theme of Paul Erdös, mathematicians amuse themselves by computing their Erdös numbers. Briefly, if you have written a publication with Erdös, your Erdös number is 1. The number is defined inductively for other scientists. If your Erdös number is not k or less and if you have written a publication with a scientist whose Erdös number is k, then your Erdös number is k + 1. The DBLP database does not store publications by Erdös, so let us use the highly prolific computer scientist 'Philip S. Yu' in his stead, and define a Yu number analogous to the Erdös number. Write a query to find the number of authors whose Yu number is 2.

While grading your solutions, we will pay attention to the quality of your queries, e.g., whether they are correct, the number of tables they reference, and the running time. Please refrain from creating massive new tables to support answering these queries!
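
The definitions of "collaborator" (problem 5) and "Yu number" (problem 6) can be checked by hand on a tiny example before you write any SQL. The following Python sketch is only an illustration of those two definitions. It is not a solution, it does not use your schema, and the publications and author names other than 'Philip S. Yu' are invented for the example. The function names are hypothetical as well.

```python
from collections import deque
from itertools import combinations

# Hypothetical toy data: each publication is the list of its authors.
# These papers are invented for illustration; they are not DBLP records.
publications = [
    ["Philip S. Yu", "Alice"],
    ["Alice", "Bob", "Carol"],
    ["Bob", "Dave"],
    ["Alice", "Bob"],  # repeat collaboration: still counts only once
    ["Eve"],           # single-author paper: adds no collaborators
]


def collaborators(pubs):
    """Map each author to the set of distinct co-authors."""
    graph = {}
    for authors in pubs:
        for author in authors:
            graph.setdefault(author, set())
        # Every pair of authors on the same publication has collaborated.
        for a, b in combinations(set(authors), 2):
            graph[a].add(b)
            graph[b].add(a)
    return graph


def distances_from(graph, source):
    """Breadth-first search: shortest co-authorship distance from source."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        current = queue.popleft()
        for neighbor in graph[current]:
            if neighbor not in dist:  # first visit gives the smallest k + 1
                dist[neighbor] = dist[current] + 1
                queue.append(neighbor)
    return dist


graph = collaborators(publications)

# Problem 5 definition: number of distinct other authors per author.
collaborator_counts = {author: len(others) for author, others in graph.items()}

# Problem 6 definition: authors at distance exactly 2 from 'Philip S. Yu'.
yu_numbers = distances_from(graph, "Philip S. Yu")
yu_number_two = sorted(a for a, d in yu_numbers.items() if d == 2)

print(collaborator_counts)
print(yu_number_two)
```

In this toy data, Alice has Yu number 1, so Bob and Carol (who wrote with Alice but not with Yu) have Yu number 2, and Dave has Yu number 3. Eve has no co-authors, so she has no Yu number at all. Note that Alice and Bob wrote two papers together, yet each counts as a single collaborator of the other.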

What you should turn in:

A paper copy that details the following:
  1. The name of your project and the names of the students in your group.
  2. The changes you made to your database based on our suggestions for Assignment 2.
  3. A list of your defined SQL schemas; these schemas are to remind us about your design.
  4. For each of the problems listed above,


Last Updated: Sat, Mar 2, 3:30pm EDT, 2013