Using PostgreSQL (CS 4604, Spring 2016)

Table of Contents

  1. What is PostgreSQL? Why are we using it and not MySQL, Oracle, Microsoft SQL server ...?
  2. Can I install PostgreSQL on my own computer?
  3. Using the CS 4604 PostgreSQL server
  4. Using PostgreSQL and psql
  5. Special Instructions for the Database Server

1 What is PostgreSQL? Why are we using it and not MySQL, Oracle, Microsoft SQL server ...?

PostgreSQL is an advanced open-source RDBMS, increasingly used by a large number of high profile projects. We use it in CS 4604 because:

a) it is open-source and free ("free as in speech, not as in beer") software;

b) it is not really important which RDBMS you use to learn SQL;

c) among open-source RDBMS, PostgreSQL has some advantages over MySQL (e.g., MySQL does not enforce CHECK constraints).

Extensive documentation is available for PostgreSQL, which you should make for your leisure reading. The documentation is also available with user comments.

2 Can I install PostgreSQL on my own computer?

Yes! You are welcome to install it on your own computer. Install version 8.4 or later. PostgreSQL works on Linux and Windows. You can download versions for these operating systems. For Mac OS X, please use the fink system. We won't be able to provide tech-support for the installation though. If you decide to do this, you can read this link to learn how to download the data from our database server. Download it to your account on cs4604.cs.vt.edu. You can then transfer the data to your machine and re-load into your database.

Further, in Project Assignment 3, you will be asked to interface your database with an accessible web-server. We have a web-server on cs4604.cs.vt.edu. But again, you are welcome to install the web server on your own computer as well as long as we can evaluate the final step of your project. If you use your own web-server, please make sure you have installed your own PostgreSQL server on your machine too. You cannot install the web server separately and connect to OUR database server (due to security reasons, outside connections to our database server are blocked).

3 Using the CS 4604 PostgreSQL server

Follow the following steps to use the CS 4604 PostgreSQL server. Some of the steps below apply only the first time you log in. Note: Do NOT wait till the assignment due dates to test it out!

  1. ssh to cs4604.cs.vt.edu using your SLO account.

  2. Access the postgresql server by typing

    psql –U “user name” –d “database name” (the psql shell will prompt you for the password).

    Use your SLO username as the username. If this is the first time, use your SLO as also your database name. Otherwise you can use the name of any database you have access to. Email Sorour and Shamimul for the default password (also posted on Piazza now). If you provide the correct username and password, you will get a psql prompt where you can type in SQL commands and queries.

  3. If this is the first time you are accessing the PostgreSQL server, change your password by typing the following command at the psql prompt:

    alter user <username> password 'YourNewPassword';

  4. At the psql prompt, type \list to show a list of all the databases. Each student will have access to only two databases from this list. One of these is for your own project group. All the members of your project group have the right to access to this database. This database is named after your project title or group title. The second one is your personal database. The name of your personal database is the same as your username; only you can access this database (you used this database to access the server in Step B above). At the psql prompt, you can switch databases (which you are authorized access to) using \connect db_name.

    If you do not find the group database, please send email to Sorour and Shamimul to let them know which project group you belong to, the name of the project, and the members of your group.

  5. If you have any problems, please send email to Sorour and Shamimul (with a cc to Aditya). Be sure to include all the errors you get so that we can help you as quickly as we can.

4 Using PostgreSQL and psql

You will often use the psql command line client to talk to the CS 4604 server, especially for Homework 4, Project Assignment 2 and 3. Here are some potentially useful tips.

  1. To copy data from a file into a table or vice-versa, use the \copy command. Read the documentation for this command, which is about a third of the way down the web page.
  2. you can copy a table from a database to another database by using

    pg_dump -U “user name” -t “table name” “database name” | psql -d “database name”

  3. How to use pg_dump? Please read the following link, it describes how to use pg_dump. http://www.postgresql.org/docs/8.4/static/app-pgdump.html
  4. You can use temporary tables to stash results of queries. Temporary tables will disappear when you disconnect from the server. Take a look at the documentation for creating tables from queries.
  5. When you create primary keys or unique constraints, PostgreSQL will name the constraints even if you do not. When defining foreign keys, refer to the attributes and not to the names of the constraints.
  6. PostgreSQL does not allow subqueries in CHECK constraints. Nor does it allow the constraint to refer to variables other than columns of the current row. This sad fact is documented (search for the third occurrence of "CHECK (expression)").

5 Special Instructions

  1. The database server cs4604.cs.vt.edu is provided by the department. The server specification is 4 cores (QEMU Virtual CPU, 2.4GHz, 4MB cache) and 8 GB memory. Hence if multiple students run big queries at the same time, the server may slow down and not be as responsive. Although we have optimized the server, note the following things:
    1. First, always test your queries on smaller toy sample tables, before running it on the large tables. Also desist from designing queries with extremely large joins (sometimes this can not be helped, but pay attention if you are designing an unnecessarily slow query).

    2. If you run a query and the system does not respond for a long time (more than 5 minutes), you can kill this connection by using the following query in your private database:

      SELECT dbo.pg_kill_user_connection(ProcID);

      Here ProcID is the process id of the process. You'll need to use 'ps -u USERNAME -F' to see information about the processes run by you. This function will only kill processes owned by you. Let the TAs know if there is any issue using this.
    3. After finishing the step b, if the system is still running slow or does not respond, contact Sorour and Shamimul by email and they will try to solve it as soon as possible.

  2. Our server disk space is limited, so please make sure your database size does not exceed 5Gb. You can use the following commands to check the size of your database and tables:

    select pg_size_pretty(pg_relation_size('table_name'));

    select pg_size_pretty(pg_database_size('database_name'));

    If you find the size of your database is more than 5GB, try to drop some tables to decrease the size of your database.
Last modified: Thursday, March 10, 5:38pm EDT 2016