CS 2204: Homework #3

What to turn in: A legible paper copy giving your answers

Construct gawk commands to operate on the credit card charges of John Smith, Esq., from Blacksburg, VA, that will produce the results specified below. Write your awk commands below each question, or use a separate sheet if you need more space. If necessary, you can use multiple gawk scripts connected with pipes (|). Try not to, but if absolutely unavoidable, you can also use egrep. If more than one script is used, make sure to give the whole sequence in your answer. Note that not every dollar amount reported is money SPENT. Also note that not every line that has a $ sign should count: only lines that start with a date are what you need.


    Below is a set of possible solutions. Note the use of substr(); see your UNIX book (or any online awk manual) to understand how it works. These are by no means the only possibilities. For example, one can first use a "filter" awk script (based on gsub, described in class) to get rid of all $ signs. Another possibility is to redefine FS to separate fields by the $ sign. In this case, $NF will give you the value of the last field, which is the dollar amount.
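The two alternatives mentioned above might look roughly like the sketch below. It is only an illustration: the function names total_fs and total_gsub are made up here, the two sample lines are copied from the expected output of question 2, and plain awk is used (gawk should behave the same). Both variants keep only lines that end in a bare dollar amount, on the assumption that credits are written in some other form.

```shell
# Two sample lines copied from the expected output of question 2.
sample='Jan 26 Kroger #402 Sl9 Blacksburg Va $6.29
Jan 23 Kroger #402 Sl9 Blacksburg Va $9.44'

# Variant A (hypothetical name): split fields on the $ sign, so that $NF
# is the bare amount. [$] is a bracket expression matching a literal $.
total_fs() {
  awk -F'[$]' '/^[A-Z][a-z][a-z] / && /[$][0-9.]+$/ { total += $NF }
               END { print total }'
}

# Variant B (hypothetical name): a gsub "filter" strips every $ sign first,
# then a second script sums the last field of each dated line.
total_gsub() {
  awk '{ gsub(/[$]/, ""); print }' |
  awk '/^[A-Z][a-z][a-z] / && $NF ~ /^[0-9.]+$/ { total += $NF }
       END { print total }'
}

printf '%s\n' "$sample" | total_fs     # prints 15.73
printf '%s\n' "$sample" | total_gsub   # prints 15.73
```

Neither variant needs substr(), because the $ sign is gone before the amount is used as a number.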
  1. (3 points) Compute and print the total amount spent by Mr. Smith.
    lally@hc652aeae:lally$ cat hw3-1.gawk
    BEGIN {
      total = 0
    }
    $1 ~ /^[A-Z][a-z][a-z]$/ && $NF ~ /\$[0-9]/ {
      total += substr($NF, 2)
    }
    END {
      print total
    }
    lally@hc652aeae:lally$ gawk -f hw3-1.gawk transactions.txt
    6183.75
  2. (2 points) Now compute the total spending through Jan 26th inclusive; print the lines sorted by dollar amount.
    lally@hc652aeae:lally$ cat hw3-2a.gawk
    $1=="Jan" && $2 <= 26 && $NF ~ /\$[0-9]/{
      print substr($NF,2), $0
    }

    lally@hc652aeae:lally$ cat hw3-2b.gawk
    BEGIN {
      total = 0
    }
    {
      total += substr($NF,2)
      print $0
    }
    END {
      print "The total is:", total
    }

    lally@hc652aeae:lally$ gawk -f hw3-2a.gawk transactions.txt | sort -n -k 1 | cut -d ' ' -f 2- | gawk -f hw3-2b.gawk
    Jan 26 Kroger #402 Sl9 Blacksburg Va $6.29
    Jan 23 Kroger #402 Sl9 Blacksburg Va $9.44
    Jan 19 Shell Station 9 Ridgeway Va $13.14
    Jan 21 Martinis North Myrtle Sc $20.00
    Jan 05 Amoco Oil 06952956 Blacksburg Va $24.20
    Jan 24 Metropolitan Museum Art New York Ny $32.45
    Jan 07 Kroger #210 Sl9 Blacksburg Va $38.56
    Jan 22 Oasis World Market Blacksburg Va $53.89
    Jan 20 Harris Teeter Wilmington Nc $65.36
    Jan 21 Bennetts Seafood #2 Myrtle Beach Sc $101.17
    Jan 25 Charles E Harris Dds Blacksburg Va $363.00
    The total is: 727.5
  3. (1 point) How much did Mr. Smith spend on gas?
    lally@hc652aeae:lally$ cat hw3-3.gawk
    BEGIN {
      total = 0
    }
    /Amoco/ || /Shell/ || /Wilco/ || /Exxon/ {
      total += substr($NF,2)
    }
    END {
      print total
    }
    lally@hc652aeae:lally$ awk -f hw3-3.gawk transactions.txt
    105.3
  4. (3 points) Write a "fraud warning" script that prints a warning message for each transaction that a) is larger than 50% of the total spending AND b) happens outside of the US.
    lally@hc652aeae:lally$ cat hw3-4.awk
    BEGIN {
      total = 6183.75
      threshold = total/2.0
    }
    $1 ~ /^[A-Z][a-z][a-z]$/ && /FOREIGN CURRENCY/{
     # the +0 forces gawk to treat the result of substr() as a number;
     # substr() returns a string, so without it > would compare as strings.
     if ((substr($NF,2)+0) > threshold) {
       # note: NF is the number of fields on the line; use NR for the input line number
       printf "line %d is over threshold: %s\n", NF, $0
     }
    }

    lally@hc652aeae:lally$ awk -f hw3-4.awk transactions.txt
    line 8 is over threshold: Jan 28 WorldGlitter Diamonds Zanzibar FOREIGN CURRENCY* $5000.00
  5. (2 points) What percentage of Mr. Smith's total spending happened in his home state?
    lally@hc652aeae:lally$ cat hw3-5.awk
    BEGIN {
      total_spent = 0
      amt_in_home_va = 0
    }
    $1 ~ /^[A-Z][a-z][a-z]$/ {
      state_field = NF - 1
      if ($state_field == "Va") {
        amt_in_home_va += substr($NF,2)
      }
      total_spent += substr($NF,2)
    }
    END {
      print (100*amt_in_home_va)/total_spent
    }

    lally@hc652aeae:lally$ awk -f hw3-5.awk transactions.txt
    13.0165
  6. (3 points) Compute the total BALANCE.
    lally@hc652aeae:lally$ cat hw3-6.awk
    BEGIN {
      balance = 0
    }
    $1 ~ /^[A-Z][a-z][a-z]$/ {
      if ($NF ~ /\$[0-9]/) {
        balance += substr($NF,2)
      } else {
        balance -= substr($NF, 3, length($NF)-3)
      }
    }
    END {
      print balance
    }

    lally@hc652aeae:lally$ awk -f hw3-6.awk transactions.txt
    4894.2
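
The solution to question 4 hard-codes the total from question 1. One possible way to avoid that, sketched here under assumptions, is to let awk read the file twice: the first pass accumulates the total and the second pass flags large foreign charges. The function name fraud_warnings and the three-line sample file are illustrative only; the sample lines are copied from the outputs shown above and are not the real transactions.txt, so the line number printed differs from the one in question 4.

```shell
# Hypothetical sample file; its three lines are copied from the outputs above.
tmp=$(mktemp)
cat > "$tmp" <<'EOF'
Jan 26 Kroger #402 Sl9 Blacksburg Va $6.29
Jan 23 Kroger #402 Sl9 Blacksburg Va $9.44
Jan 28 WorldGlitter Diamonds Zanzibar FOREIGN CURRENCY* $5000.00
EOF

# fraud_warnings FILE (hypothetical name): awk is given FILE twice.
fraud_warnings() {
  awk '
    # skip anything that is not a dated line ending in a plain $ amount
    $1 !~ /^[A-Z][a-z][a-z]$/ || $NF !~ /^[$][0-9]/ { next }
    # pass 1 (NR == FNR only while reading the first copy): sum the total
    NR == FNR { total += substr($NF, 2); next }
    # pass 2: warn about foreign charges above half of the total
    /FOREIGN CURRENCY/ && substr($NF, 2) + 0 > total / 2 {
      printf "line %d is over threshold: %s\n", FNR, $0
    }
  ' "$1" "$1"
}

out=$(fraud_warnings "$tmp")
printf '%s\n' "$out"
rm -f "$tmp"
```

FNR restarts at 1 for each file argument, so it gives the line number within the file on the second pass, whereas NR keeps counting across both passes.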