Your name here (please print):
Your student ID number here:
Construct gawk commands that operate on a protein structure file and produce the results specified below. Write your gawk commands below each question. You can also use egrep if you think that gives a shorter solution. To begin, download the above file and examine its contents using vim. Read the "REMARK" section carefully. Note that awk is most likely aliased to gawk on your machine, so it does not matter whether you type gawk or awk.
gawk 'BEGIN{s=0} {if ($1 == "ATOM" ) s+= $9 } END {print s}' protein.pqr
The total charge is 2.
gawk 'BEGIN { total = 0 } $1 == "ATOM" && $4 == "LEU" { total += $9 } END { print total }' protein.pqr
The result, which is the total charge on all "LEU", is 0,
so the answer is "No".
grep -c "CA" protein.pqr
32 single-atom CA's
gawk '{ if ( $1 == "ATOM" ) print }' < protein.pqr | sort -n -k 9
gawk 'BEGIN{s=0} {if ($3 ~ /^H/) s+=$9} END{print s}' < protein.pqr
Note that you can't simply search for an occurrence of "H", as it
can also appear outside of the atom name. The ^ anchors the match to
the start of field 3, so you don't pick up something like "CHH".
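The effect of the ^ anchor can be checked on a small made-up file. The sketch below is only an illustration: the four atoms and their charges are hypothetical and are not taken from protein.pqr, and it assumes the same column layout as the commands above (atom name in field 3, charge in field 9). It uses awk, which should behave the same as gawk here.

```shell
# Hypothetical four-atom sample; the values are made up for illustration.
# Assumed layout: $3 = atom name, $4 = residue name, $9 = charge.
sample=$(mktemp)
cat > "$sample" <<'EOF'
REMARK made-up sample, not taken from protein.pqr
ATOM 1 N  HIS 1 0.0 0.0 0.0 -0.50 1.8
ATOM 2 H  HIS 1 0.0 0.0 0.0  0.25 1.0
ATOM 3 OH TYR 2 0.0 0.0 0.0 -0.50 1.7
ATOM 4 HH TYR 2 0.0 0.0 0.0  0.50 1.0
EOF

# Whole-line match: also hits the "H" in the residue name HIS.
whole_line=$(awk '/H/ { s += $9 } END { print s }' "$sample")

# Field match without ^: still hits the oxygen named OH.
unanchored=$(awk '$3 ~ /H/ { s += $9 } END { print s }' "$sample")

# Field match with ^: only atom names that start with H.
anchored=$(awk '$3 ~ /^H/ { s += $9 } END { print s }' "$sample")

echo "$whole_line $unanchored $anchored"   # prints: -0.25 0.25 0.75
rm -f "$sample"
```

Only the anchored version sums just the two atoms whose names start with H (0.25 + 0.50).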
gawk '{ gsub(/ASP/,"ASH"); print }' protein.pqr