(100 points) Design an E/R diagram for the DBLP database. Here is
a complete description of what your E/R diagram should model. Read
this description carefully, since it differs in some details from the
description in Project Assignment 2. The
DBLP dataset contains information about approximately 1.4 million
publications in the computer science literature. Each publication has
a unique string called the dblp_key that identifies it. It
also has a title, a year of publication, and one or more authors. Some
types of publications do not have authors: they have editors (see
below). The order in which authors appear in a publication is
important and must be recorded. In each publication, each author
appears at most once. The rank of an author is unique within the
publication. Within a publication, ranks must start at 1 and be
consecutive. For some publications, the authors have not been
recorded. A publication may also have a URL and a Digital Object Identifier (DOI). Each
publication belongs to one of the following categories:
- article
- This type corresponds to a journal article. The publication will
have an associated journal name, a volume and a number specifying
the issue of the journal, page numbers, and a publisher of the journal.
- book
- As the name indicates, this type of publication is a book. It also
has a publisher and an ISBN number.
- incollection
- This type indicates a publication contained within a
collection. An example of a collection is a book that contains
different chapters written by different authors (note that not every book is a collection).
Each chapter in a
collection will have the type incollection. Each
chapter will have its own page numbers and authors. The entire
collection itself is considered a separate publication
and has its own title, a list of editors, and a publisher. It is
not possible for a person to be an author of a collection, i.e.,
collections only have editors. Within a single collection, an
editor appears at most once. Within a single collection, ranks of
editors are also unique and consecutively numbered starting at
1. A chapter in a collection has a cross reference to the
collection it was published in.
- inproceedings
- This type indicates a paper published in the proceedings of a
scientific conference. It is very similar to a publication of type
"incollection". The conference proceedings is itself a
separate publication with its own title, editors, and
publisher. Editors and their ranks for a "proceedings" have the
same function and constraints as for a "collection". A publication
of type "inproceedings" has a cross reference to the
proceedings it was published in.
- mastersthesis
- This publication is a Master's thesis, with a specific author,
department and/or university, and year.
- phdthesis
- This publication is a PhD thesis, with a specific author,
department and/or university, and year.
- www
- This type of "publication" is just a pointer to a web page,
possibly with a title and one or more authors. It must have a URL.
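The rank rules stated above for authors and editors (each person appears at most once, ranks are unique, start at 1, and are consecutive) cannot be drawn directly in an E/R diagram, so it can help to see them as an executable check. The following is a minimal sketch in Python, not part of the assignment itself; the function names and the (name, rank) pair representation are assumptions made purely for illustration.

```python
def ranks_are_valid(ranks):
    """Return True if ranks are exactly 1, 2, ..., n in some order.

    An empty list is accepted, which covers publications whose
    authors have not been recorded.
    """
    return sorted(ranks) == list(range(1, len(ranks) + 1))


def contributors_are_valid(contributors):
    """Check the author (or editor) list of a single publication.

    contributors: list of (person_name, rank) pairs. This layout is a
    hypothetical representation chosen for the example.
    """
    names = [name for name, _ in contributors]
    ranks = [rank for _, rank in contributors]
    # Each person appears at most once within the publication.
    no_duplicate_people = len(set(names)) == len(names)
    # Ranks are unique and consecutive, starting at 1.
    return no_duplicate_people and ranks_are_valid(ranks)
```

For example, `contributors_are_valid([("A", 1), ("B", 3)])` fails because rank 2 is missing, and `contributors_are_valid([("A", 1), ("A", 2)])` fails because the same person appears twice.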
Each publication can cite one or more publications (this is the
list of references that appear at the end of a typical
publication). In addition, each publication can be associated with
one or more topics. Topics are themselves arranged hierarchically,
e.g., see the Computing
Classification System. A topic can be a sub-topic of more than
one "parent" topic and itself have one or more specialised
topics as "children".
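Because a topic may have more than one parent, the topic hierarchy described above is a many-to-many relationship of topics with themselves rather than a tree. The sketch below illustrates that structure in Python; the class name, method names, and example topic names are hypothetical and are not taken from DBLP or from the Computing Classification System.

```python
from collections import defaultdict


class TopicHierarchy:
    """Topics linked by a many-to-many 'is a sub-topic of' relationship."""

    def __init__(self):
        # Maps each topic to the set of its direct parent topics.
        self.parents = defaultdict(set)

    def add_subtopic(self, child, parent):
        """Record that `child` is a sub-topic of `parent`."""
        self.parents[child].add(parent)

    def ancestors(self, topic):
        """Return every topic reachable by following parent links."""
        seen = set()
        stack = [topic]
        while stack:
            current = stack.pop()
            for parent in self.parents.get(current, ()):
                # The `seen` set stops a shared ancestor from being
                # visited more than once.
                if parent not in seen:
                    seen.add(parent)
                    stack.append(parent)
        return seen


# Hypothetical topic names, used only to show a topic with two parents.
hierarchy = TopicHierarchy()
hierarchy.add_subtopic("Query optimization", "Databases")
hierarchy.add_subtopic("Query optimization", "Algorithms")
hierarchy.add_subtopic("Databases", "Information systems")
```

Here `hierarchy.ancestors("Query optimization")` contains both direct parents as well as "Information systems", which a single parent pointer per topic could not represent.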