CS 6824: Computing the Brain

Spring 2016, 11am-12:15pm, Tuesdays and Thursdays, Torgerson 1080

About the Course

What is the focus of this course?

The goal of this course is to teach you computational methods that scientists use to understand the brain at three levels:
  • Molecular
  • Cellular
  • Anatomical
in order to gain insights into structure-function relations, health, and disease. We will primarily use the tools of graph (network) theory but will also occasionally delve into machine learning and data mining.

Who should take this course?

You should take this course if you are curious to find out how the latest research is shaping our understanding of the brain. We will spend a lot of time learning basic concepts and tools in graph theory and how they are applied to understand networks that connect regions in the brain, cells in the brain, and molecules in the brain. There are many exciting and profound issues that researchers in this area are actively investigating, such as the robustness of the brain, network structures and dynamics, and applications to drug discovery. During this course, we will come across many interesting open research problems. Taking this course might be an excellent way to create research topics and projects for your Master's or Ph.D. thesis in the area of bioinformatics/computational biology. In this course, you will be able to communicate and work with students and researchers with varied backgrounds.


The course is open to students with graduate standing. There are no explicit prerequisites, except an exhortation to keep your brain open, à la Erdős. You must know some programming (the language does not matter much) in order to complete assignments. I hope that both students with computational backgrounds and students with experience in the life sciences will take this course. If you find this course interesting but are not sure whether your background matches the course, please talk to me.

Course structure

The course will primarily be driven by lectures by me and by student presentations. Your grade will depend on your presentations (35%), on class participation (35%), and assignments (30%).


After the introductory lectures, each lecture given by me will focus on a particular paper. I expect students to read the paper carefully before the class and be prepared to participate in extensive discussions in the classroom. Remember that class participation counts for 35% of the grade. I strongly encourage students to form reading groups in order to discuss and understand papers.


Each student will have to present one or two papers in class. I will work with each student to decide the papers. In a single class, I expect you to prepare a presentation for 45 minutes and expect 30 minutes of questions and discussion during the presentation. Be prepared for some discussions to take over your presentation. In fact, some papers will require two full classes, i.e., a total of 150 minutes, including time for questions. Prepare your presentation well in advance! Practise multiple times!


A typical assignment will involve writing code to replicate or extend the analysis in one or two figures in a paper we study. Sometimes, you will have to perform this analysis for a dataset not considered in the original paper. These assignments may organically come about from class discussions. You will have about two weeks to complete assignments. Your solution will include the following items:
  1. Fully working code, e.g., on GitHub.
  2. A short report on the results of your analysis, including the figures, discussion of difficulties you faced, how you solved them, and observations on your results.

Papers to be covered

A CiteULike page collects a superset of the papers that we will discuss. The actual set of papers we will cover will depend on the interests of the students. This list will evolve over the course of the semester.

(Large) Networks among brain regions and neurons

Creating Networks

Networks among brain molecules

Notes on Reading Papers

The papers and notes in this section appear in the order of discussion in the class.

Collective dynamics of 'small-world' networks

  • We will focus more on the "small-world" aspect of this paper than on the "dynamics" aspect.
  • What is the intuitive meaning of a small-world network? The main purpose of this paper is to answer how small-world networks could arise.
  • What do you think the authors mean by random graphs? How do we construct random graphs? Answering this question might involve reading articles similar to Bollobas's book (reference 16). There are several resources on the web that I encourage you to find.
  • A key point to appreciate is that a random graph (whatever that may mean) is guaranteed to be connected if \(k \gg \ln n\). What could this statement in the third paragraph of the paper mean?
  • Carefully understand the rewiring procedure described in the caption of Figure 1. If n is the number of nodes in a network, there are two other parameters: k, the average degree and p, the probability of rewiring an edge. Try to figure out the effect of k and p on the average path length and clustering coefficient.
    • What if p is 0? What if p is 1? (k is fixed.)
    • What if k is 2? What if k is n-1? (p is fixed.)
    • More generally, obtain an intuitive understanding of the limits on \(L(p)\) and \(C(p)\) as \(p \to 0\) and \(p \to 1\), given in the third paragraph of the paper.
  • Next, understand Figure 2 and the text accompanying it. Note the scales of the axes. What do you notice about the plots?
  • Turn your attention to Table 1. Try to find out more about the networks described here. How big are these graphs?
  • Do the differences between \(L_\text{actual}\) and \(L_\text{random}\) and between \(C_\text{actual}\) and \(C_\text{random}\) appear to be significant? Is there a way to assess if these differences are statistically significant?
  • Finally, focus on the model of disease spreading described on page 441 and later (second half of the second page onwards). Does the model appear reasonable? Is it over-simplified?
  • Digest the trends in Figure 3 and relate them to Figure 2.
  • What does Figure 3 suggest about disease transmission? Why were people worried that the Rio Olympics would cause Zika transmission to the rest of the world?
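
One way to build intuition for the questions on k and p above is to run a small experiment. The following is a minimal sketch in plain Python, not the authors' code: the function names (`ring_lattice`, `rewire`, `clustering`, `average_path_length`) are mine, and the rewiring step is a simplified reading of the procedure in the caption of Figure 1. It assumes that k is even, and the path-length average is taken over reachable pairs only, so it is meaningful mainly when the rewired graph stays connected.

```python
import random
from collections import deque


def ring_lattice(n, k):
    """Ring of n nodes, each joined to its k nearest neighbours (k even)."""
    adj = {v: set() for v in range(n)}
    for v in range(n):
        for step in range(1, k // 2 + 1):
            w = (v + step) % n
            adj[v].add(w)
            adj[w].add(v)
    return adj


def rewire(adj, k, p, rng):
    """With probability p, move the far end of each lattice edge to a random node.

    Self-loops and duplicate edges are never created, so the number of
    edges stays the same.
    """
    n = len(adj)
    for step in range(1, k // 2 + 1):
        for v in range(n):
            w = (v + step) % n
            if w in adj[v] and rng.random() < p:
                choices = [u for u in range(n) if u != v and u not in adj[v]]
                if choices:
                    u = rng.choice(choices)
                    adj[v].remove(w)
                    adj[w].remove(v)
                    adj[v].add(u)
                    adj[u].add(v)
    return adj


def clustering(adj):
    """Average over nodes of the fraction of neighbour pairs that are linked."""
    total = 0.0
    for nbrs in adj.values():
        d = len(nbrs)
        if d < 2:
            continue  # such nodes contribute 0 to the average
        links = sum(1 for a in nbrs for b in nbrs if a < b and b in adj[a])
        total += links / (d * (d - 1) / 2)
    return total / len(adj)


def average_path_length(adj):
    """Mean shortest-path length over all reachable ordered pairs (BFS)."""
    total, pairs = 0, 0
    for s in adj:
        dist = {s: 0}
        queue = deque([s])
        while queue:
            v = queue.popleft()
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
        total += sum(dist.values())
        pairs += len(dist) - 1
    return total / pairs


if __name__ == "__main__":
    rng = random.Random(0)
    for p in (0.0, 0.01, 0.1, 1.0):
        g = rewire(ring_lattice(200, 8), 8, p, rng)
        print(p, average_path_length(g), clustering(g))
```

Try plotting the two printed quantities against p on a logarithmic axis and compare the shape with Figure 2.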

The small world of the cerebral cortex

  • The Connectome Debate: Is Mapping the Mind of a Worm Worth It? An article in Scientific American on the value of connectomics.
  • What is the cerebral cortex? How is it different from the visual cortex?
  • We have been throwing around the number 100 billion as the number of neurons in the brain. Where did that number come from? The introduction of this paper introduces a new number: \(10^{15}\) connections. How do scientists estimate this number? On average, each neuron has 5,000 connections!
  • Does the discussion on the types of neuronal connections in the first paragraph of this paper suggest why some neurons are myelinated and others are not?
  • Being published six years after the Watts and Strogatz paper, the second paragraph in the introduction provides a perspective on the impact of small world networks.
  • What is the motivation for this paper?
  • Are the datasets used in this paper equivalent to graphs? If so, what types of graphs are they?
    • You can ignore the details on how the authors modified some of the original datasets.
    • Do the authors explain the rationale for the specific steps they used to create the density-based connectivity data set for the cat cortex?
    • What are the reference networks created by the authors? How do they create them? Which one is similar to the E-R model and which one to the Watts-Strogatz model?
    • Are the methods used to create random networks with the same in-degree/out-degree distribution clear?
  • Understand all the graph-theoretic quantities computed by the authors. How closely do they follow Watts and Strogatz?
  • What is the new concept introduced in this paper? (New in the context of what we have seen applied to brain networks, not new to graph theory!)
    • Do you understand the distinction being made in Figure 3D between the methods employed here and those used by Milo et al.?
    • How do the authors define the quantity \(p_{\text{cyc}}(q)\)? The caption of Figure 3D is helpful.
  • In the results displayed in Table 1, how do the authors interpret \(\lambda_{\text{scl}}\) and \(\gamma_{\text{scl}}\) as being closer to random networks or to lattice networks?
    • The authors perform one type of analysis in addition to what Watts and Strogatz did in their paper. What is it? (I know there are multiple answers in general. What is it that you see in Table 1?)
    • How do you think the authors computed the statistical significance values in Table 1?
    • How do they reach the conclusion that local vertex statistics cannot explain low \(\lambda\) and high \(\gamma\) seen for the connectivity matrices?
    • In class, I will put up Figure 4 on the screen. I want you to tell me what the authors discuss about \(\lambda\) and \(\gamma\) for specific regions of the brain. You may need to do a bit of background reading on the functions of different brain regions.
  • Next, consider the results in Table 2. Do you notice anything interesting? What do the authors say about this table?
  • I think we can ignore Table 3. What is your opinion of the dataset on which it is based?
  • Ignore the section "Probabilistic Connectivity Data Sets".
  • Read the "Discussion" carefully. What are the main points made by the authors? What do you believe?
  • Read the sub-section on "Distances between Neurons." Even though its point is self-evident, it is useful to keep in mind.
  • Finally, read about what the authors have to say about scale-free networks. What are scale-free networks? We are likely to come across them again in this course.
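
For the questions on reference networks, it may help to see one common way of randomizing a directed graph while keeping every node's in-degree and out-degree fixed: repeatedly pick two edges and swap their targets. The sketch below illustrates that idea only. I am not claiming it is the exact procedure the authors used, and the names `degree_preserving_shuffle` and `degree_sequences` are hypothetical. It assumes a simple directed graph (no self-loops, no repeated edges) with at least two edges.

```python
import random
from collections import Counter


def degree_sequences(edges):
    """Return (out-degree, in-degree) counters for a list of directed edges."""
    out_deg = Counter(a for a, _ in edges)
    in_deg = Counter(b for _, b in edges)
    return out_deg, in_deg


def degree_preserving_shuffle(edges, n_swaps, rng):
    """Randomize a directed graph by swapping edge targets.

    A swap turns (a -> b, c -> d) into (a -> d, c -> b). Swaps that would
    create a self-loop or a duplicate edge are skipped.
    """
    edges = list(edges)
    present = set(edges)
    done, attempts = 0, 0
    while done < n_swaps and attempts < 100 * n_swaps:
        attempts += 1
        i, j = rng.sample(range(len(edges)), 2)
        (a, b), (c, d) = edges[i], edges[j]
        if a == d or c == b:
            continue  # would create a self-loop
        if (a, d) in present or (c, b) in present:
            continue  # would duplicate an existing edge
        present.discard((a, b))
        present.discard((c, d))
        present.add((a, d))
        present.add((c, b))
        edges[i], edges[j] = (a, d), (c, b)
        done += 1
    return edges


if __name__ == "__main__":
    original = [(0, 1), (1, 2), (2, 3), (3, 0), (0, 2), (1, 3), (4, 0), (2, 4)]
    shuffled = degree_preserving_shuffle(original, 20, random.Random(0))
    print(sorted(shuffled))
    print(degree_sequences(original) == degree_sequences(shuffled))
```

Computing the path length and clustering coefficient on many such shuffled copies gives a null distribution against which the values for the real network can be compared.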

An introduction to diffusion tensor image analysis

  • What is Diffusion tensor imaging (DTI)? Is it different from MRI? How is DTI measured?
  • Is the diffusion of molecules similar in the case of white and gray matter? Intuitively, what do you think happens in each case?
  • Can we map gray matter using this technique?
  • What does anisotropy mean? How is it relevant for DTI?
  • What does FA stand for in the paper? Why do the authors use it as a weighting parameter?
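
As a reference point for the questions on anisotropy, here is a small sketch of how a scalar anisotropy measure is usually computed from the three eigenvalues of a diffusion tensor. It uses the standard textbook formula for fractional anisotropy; the function name and the example eigenvalues are mine, so check the definition against the one in the paper.

```python
import math


def fractional_anisotropy(l1, l2, l3):
    """Fractional anisotropy from the three eigenvalues of a diffusion tensor.

    Returns 0 for perfectly isotropic diffusion (all eigenvalues equal) and
    approaches 1 when diffusion happens along a single direction.
    """
    mean = (l1 + l2 + l3) / 3.0
    spread = (l1 - mean) ** 2 + (l2 - mean) ** 2 + (l3 - mean) ** 2
    size = l1 ** 2 + l2 ** 2 + l3 ** 2
    if size == 0:
        return 0.0  # no diffusion at all; treat as isotropic
    return math.sqrt(1.5 * spread / size)


if __name__ == "__main__":
    print(fractional_anisotropy(1.0, 1.0, 1.0))  # equal in all directions
    print(fractional_anisotropy(1.7, 0.3, 0.2))  # one dominant direction
```

Think about which of the two example inputs looks more like a bundle of aligned fibers and which looks more like a region without a preferred direction.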

Rich-Club Organization of the Human Connectome

  • What is the "rich-club" phenomenon and why is it relevant to brain networks? What other real-world examples exhibit this rich-club architecture? What is the main motivation behind studying rich-club nodes? What major role do they play according to the authors?
  • In the third paragraph in the introduction, the authors say the following about rich-club organization: "The presence, or absence, of rich-club organization can provide important information on the higher-order structure of a network, particularly on the level of resilience, hierarchal ordering, and specialization." What do they mean by "specialization?" How many of these properties do they talk about in this paper?
  • Why do we need to weight the networks in this paper? What information are we missing in unweighted networks? Which of the networks do you think is a more accurate depiction of the connections in the brain?
  • How many brain networks did they aggregate in this paper?
  • You can ignore the specific details under "MR acquisition" and "DTI preprocessing and deterministic fiber tracking", especially the parameters used.
  • There are three types of weighted networks: "NOS weighted," "NOS-ROI weighted," and "FA-weighted." What do these terms (NOS, ROI, FA) mean? (Hint: They are not defined in this paper.)
  • What conclusions can we make from all those values in Table 1 about different networks?
  • What do \(g\) and \(\lambda\) stand for in Table 1?
  • Is there a difference between "degree" and "strength" in row 6 of Table 1? Do they mean the same thing?
  • How do we compute the different terms in Table 1 for weighted and unweighted networks?
  • Are we all wired in the same way in our brains? Can you infer anything about this question from Table 1?
  • In Figure 2, why are some bars highlighted in yellow? How many subcortical regions are highlighted in yellow?
  • What can you conclude about subcortical regions based on Figure 2? Are they related to rich-club nodes in any way?
  • Why do you think the "rich club coefficient" has a parameter called \(k\)? How does the computation of \(\phi(k)\) differ for weighted and unweighted networks? Why do we need to normalize this quantity? If you are convinced we need to normalize the \(\phi(k)\), can we use any methods discussed in the previous lectures to generate the random networks?
  • In Figure 3, what does the region highlighted in gray indicate?
  • How did the authors compute the significance of the rich-club coefficient?
  • What is the rationale behind the computation of the \(s\text{-core}\)? What is the \(k\text{-core}\) and how does it differ from the \(s\text{-core}\)? Can we define both quantities for both weighted and unweighted networks?
  • Read and understand Figures 4 and 5 and the text accompanying them.
  • What does the high-resolution network in Figure 8 (page 9 or page 15783) show? Why do you think the authors included this additional network? What can you conclude based on the results in this figure?
  • In the section "Rich-club centrality," the authors say the following:

"Two heuristics were examined: (1) the percentage of the number of shortest paths that passed through at least one of the rich-club nodes, normalized to the number of all shortest paths in the network, and (2) the number of shortest paths between all nodes in the network that passed through at least one rich-club edge."

  • What exactly is a "rich-club edge?" Is it an edge between two rich-club nodes or an edge between a rich-club node and another node? Under either definition, how is the heuristic in (1) different from the heuristic in (2)?
  • In order to understand the module finding algorithm in the section, "Modularity, provincial and connector hubs," you'll have to go through the paper "Modularity and community structure in networks." I may not have enough time to go through the entire algorithm, but I'll briefly try to convey my understanding of the method in class.
  • How did the authors define provincial and connector hubs, and what did they say was the significance of these nodes?
  • What do the authors mean by attacking a node/edge, in "Rich club in targeted and random attack?" How did the authors "attack" the network? What measure do they compare before and after the "attack" and why?
  • At this point, you may wonder why the authors choose only the \(M^{w-nos}\) network for most of their analyses such as \(s\text{-core}\), modularity, network attack and finally the high-resolution network. Why do you think the authors chose this while they had two other weighted networks? The answer (I think) is buried somewhere in the paper. \(\smile\)
  • A complete understanding of all the graph-related quantities in Table 1 requires reading "Complex network measures of brain connectivity: uses and interpretations." I'll try my best to give an intuitive explanation, but I would recommend reading other online articles and looking at different examples before we discuss them in class.
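
To make the question about \(\phi(k)\) concrete, here is a minimal sketch of the unweighted rich-club coefficient as it is commonly defined: the density of the subgraph induced by nodes whose degree is greater than \(k\). The function name and the toy graph are mine. The weighted variants used in the paper and the normalization against random networks are deliberately left out, so treat this as a starting point rather than the authors' computation.

```python
def rich_club_coefficient(adj, k):
    """Density of the subgraph induced by nodes with degree greater than k.

    adj maps each node to the set of its neighbours (undirected graph).
    Returns None when fewer than two nodes have degree greater than k.
    """
    rich = {v for v, nbrs in adj.items() if len(nbrs) > k}
    n = len(rich)
    if n < 2:
        return None
    # Each edge inside the rich set is seen from both endpoints.
    e = sum(1 for v in rich for w in adj[v] if w in rich) // 2
    return 2 * e / (n * (n - 1))


if __name__ == "__main__":
    # A triangle of hubs (0, 1, 2), each with one extra leaf neighbour.
    toy = {
        0: {1, 2, 3},
        1: {0, 2, 4},
        2: {0, 1, 5},
        3: {0},
        4: {1},
        5: {2},
    }
    for k in range(0, 4):
        print(k, rich_club_coefficient(toy, k))
```

A normalized coefficient would divide \(\phi(k)\) by its average value over a set of randomized networks with the same degree sequence, which is where the question about generating random networks comes in.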

Cortical High-Density Counterstream Architectures

  • How many distinct areas of the macaque brain were considered in this study? Did the authors focus on any particular subset of areas?
  • What does the word "interareal" mean? What percentage of total connectivity is interareal, according to the authors?
  • The initial argument that the authors want to make is that the small-world hypothesis does not hold for interareal networks. Who authored the studies referenced on the first page that challenge the small-world hypothesis? (References 8, 11)
  • Why do the authors argue that their experimental methodology provides a better representation of brain structure than prior publications?
  • Find out what a retrograde tracer is. Can you relate how such a tracer works to slide 84 in the first lecture in the class?
  • There are many abbreviations used by the authors, e.g., FLN, NFP, FIN.
    • What is the FLN? How does the FLN relate to the weight of edges?
    • What is an NFP? What reason do the authors give for the NFPs having been missed in previous studies?
    • What is the FIN?
  • What is the notion of edge-completeness, according to the authors? Read the definition in the glossary at the end of the paper. Then read the sentence that includes the word "edge-incomplete" at the end of the first page of the paper (in the paragraph that starts with "The \(G_{29\times 29}\) interareal subgraph"). Are the two notions (edge-complete and edge-incomplete) consistent?
  • What is the crux of Figure 1A? Which of the data sets listed here are from papers that we have analyzed or considered in this class? What were the authors of those papers arguing about the structure of the brain? What are the current authors saying?
  • Look at Figure 1B. What is the source of this data?
    • For Figure 1B, what are the authors arguing? What is the threshold at which the small-world network property is no longer realized?
    • What do the authors argue this means for the structure of the macaque brain?
  • Why are the authors not using thresholding? What does this mean for their dataset?
  • What are the five specific regions of the brain that the authors consider?
  • What is the connectivity behavior of the NFPs?
  • What is a dominating set? How is it different from a vertex cover? Find out how difficult it is to compute the dominating set of smallest size in an arbitrary graph.
  • Look at Figure 2A. Can you visually pick out a dominating set? Is it easy to find many dominating sets in this graph? Study Fig S3 in the supplementary material for citation 11.
  • How does density relate to the domination statistics?
  • Ignore Figure 2D.
  • Ignore the paragraphs starting with the sentence "Besides suggesting a key role for the NFPs" till the end of the sub-section titled "Hierarchical Organization." Ignore Figure 3 as well.
  • How did the authors compute the high-density core of the \(G_{29 \times 29}\) graph? What is the density of this core? What does this statistic mean about the FIN?
  • If time permits, we will discuss the bowtie structure.
    • What is a feedforward (FF) pathway and what is a feedback (FB) pathway? Read the caption of Table 1 and the glossary for relevant definitions.
    • What are supragranular layer neurons (SLNs)?
  • We will first consider the exponential distance rule (EDR).
    • What are the parameters of the EDR? How do the authors "fit" an EDR to their data?
    • What is the biological explanation for why an exponential distance rule is an appropriate model for connections in the brain?
    • How does the EDR influence the cortical structure of the brain?
  • The authors have released their dataset at http://www.core-nets.org. It is available locally. How large is the data set? How difficult is it for a researcher to obtain a data set of this type and size?
  • How do we calculate the global efficiency of the network? What is link resistance?
  • What does Figure 5D tell us about the existence of a high-bandwidth network core? What percentage of weak links need to be removed before the global efficiency of the network is impacted?
  • Ignore Figure 6 and text relevant to it.
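
For the question on dominating sets and vertex covers, a tiny example can make the difference tangible. The sketch below works on undirected graphs for simplicity (the paper's network is directed, so this is an assumption on my part), and all function names are mine. The brute-force search is only meant for very small graphs; its running time is itself a hint about the difficulty of the general problem.

```python
from itertools import combinations


def is_dominating_set(adj, chosen):
    """Every node is either chosen or has at least one chosen neighbour."""
    chosen = set(chosen)
    return all(v in chosen or bool(adj[v] & chosen) for v in adj)


def is_vertex_cover(adj, chosen):
    """Every edge has at least one chosen endpoint."""
    chosen = set(chosen)
    return all(u in chosen or v in chosen for u in adj for v in adj[u])


def minimum_dominating_set(adj):
    """Smallest dominating set by exhaustive search (exponential time)."""
    nodes = sorted(adj)
    for size in range(1, len(nodes) + 1):
        for candidate in combinations(nodes, size):
            if is_dominating_set(adj, candidate):
                return set(candidate)
    return set()


if __name__ == "__main__":
    # Path graph 0 - 1 - 2 - 3.
    path = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2}}
    print(is_dominating_set(path, {0, 3}), is_vertex_cover(path, {0, 3}))
    print(is_dominating_set(path, {1, 2}), is_vertex_cover(path, {1, 2}))
    print(minimum_dominating_set(path))
```

On the path graph, the two end nodes dominate every node but leave the middle edge uncovered, which is exactly the distinction between the two notions.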

Randomization and resilience of brain functional networks as systems-level endophenotypes of schizophrenia

  • What is the hypothesis of this study?
  • What is an endophenotype? You can read the Wikipedia article, of course, but try to understand this concept from a research paper on the topic.
  • What do the abbreviations Sz, Rel, and HV stand for? Why did the authors select these groups of patients?
  • What were the three main objectives of this study?
  • List the network measures used in this study. (Hint: There are five.)
  • Read the Methods section and the Supplementary Methods before you read the Results.
    • In the Methods, you can ignore the details in the sections on fMRI preprocessing and connectivity estimation.
    • Try to get a sense of the range of possible values of the wavelet correlation.
  • What type of graphs do the authors construct from this data? How many nodes and edges do the graphs have? Do the numbers given in different parts of the paper conflict?
    • How would you construct a graph from a full correlation matrix? Why do the authors propose their approach that starts from a minimum spanning tree?
    • As we all know, a minimum spanning tree is the subgraph of smallest total edge weight that connects all the nodes in the graph. How do we reconcile this definition with the fact that the authors subsequently add edges in decreasing order of wavelet correlation? (Hint: See reference 5 and find the magic words therein.)
  • Compared to the papers we've read already, are there any new aspects of the graph that they're measuring?
  • Analyze Figure 1.
    • First, read up on the Jonckheere-Terpstra test. How is it different from ANOVA or the Kruskal-Wallis test?
    • What do the authors claim that Figure 1 proves?
    • Do you believe the figure is consistent with this claim? Why or why not?
  • We will spend some time on the panels in Figure 2, except Figure 2D.
    • Look at the red dotted lines (they are actually asterisks) at the top of Figures 2A, 2B, 2C, and 2F. What do they signify?
    • How many statistical tests are the authors performing for each figure?
    • What does "FDR correction" mean?
  • Analyze Figure 3.
    • Why is the Connection Density measured on a scale of 0 to 1?
    • What conclusions do the authors draw from these graphs?
  • Analyze Figure 4.
    • Which areas of the brain demonstrated greater differences in clustering?
    • Which areas of the brain demonstrated greater differences in efficiency?
    • What do these results imply about small-worldness?
  • How do the authors measure correlation between network metrics? Which correlational results did they deem significant? Do you agree with this determination? (Hint: Table S3 in the Supporting Information may be useful.)
  • Read the "Discussion" carefully. It may require multiple readings for you to grasp the arguments.
    • What are the two major hypotheses presented in the Discussion section? How would one test these hypotheses?
    • What are the two questions presented in the Discussion section?
  • For the following questions, refer to the blue "Significance" box on page 1.
    • What is the most novel thing about this study?
    • What is the authors' interpretation of their results?
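
For the question on "FDR correction" in Figure 2, it may help to see how such a correction changes which tests count as significant. The sketch below implements the Benjamini-Hochberg step-up procedure, which is the most widely used FDR method. That the paper uses this particular variant is my assumption, and the function name and the p-values are made up for illustration.

```python
def benjamini_hochberg(pvalues, q=0.05):
    """Return a list of booleans: True where the hypothesis is rejected.

    Sort the m p-values, find the largest rank i with p_(i) <= (i / m) * q,
    and reject every hypothesis with rank at most i.
    """
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    cutoff = 0
    for rank, idx in enumerate(order, start=1):
        if pvalues[idx] <= rank / m * q:
            cutoff = rank  # keep the largest rank that passes
    rejected = [False] * m
    for idx in order[:cutoff]:
        rejected[idx] = True
    return rejected


if __name__ == "__main__":
    # Hypothetical p-values from ten tests (e.g., one per connection density).
    pvals = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.216]
    uncorrected = [p <= 0.05 for p in pvals]
    corrected = benjamini_hochberg(pvals, q=0.05)
    print(sum(uncorrected), "tests pass an uncorrected 0.05 threshold")
    print(sum(corrected), "tests survive the FDR correction")
```

Keep the number of tests per panel in mind when you answer the previous question: the more tests are performed, the stricter the per-test threshold becomes for the smallest p-values.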

DeltaCon: Principled Massive-Graph Similarity Function with Attribution

  • Unless otherwise noted, we will be ignoring parts of this paper that deal with node and edge attribution (the DeltaConAttr algorithm).
  • We will also not be discussing proofs beyond intuition: thoroughly going through most of these proofs would easily take half a class, for little benefit toward our main objectives in this course.
  • First class (2016-10-11)
    • Abstract
      • What kind of similarity comparison are we doing here? In other words, what is the critical piece of information we have about the two graphs we are comparing?
      • What else is particularly striking about the claims in the abstract? (Hint: think big)
    • Section 1. Introduction
      • What are some useful applications of similarity scores in general?
      • What is the real new challenge that the "passage of time" has brought to graph analysis?
      • What are the main contributions of the paper?
      • Take a careful look at Table 1: we will be referring to this quite frequently as we discuss the main algorithms in this paper.
    • Figure 1/Section 6.2: Skip ahead to Section 6.2 and read more about how the authors derived Figure 1.
      • Figures 1c and 1d: What are these graphs really telling us? Is the layout of the graphs important here? How did the authors derive it?
      • Read the caption carefully. Do you notice a discrepancy between the caption and Figure 1b?
      • What kind of network are we working with here? Is it at the scale claimed by the paper? How was it constructed?
      • What is hierarchical clustering? What is Ward's method? Try to research these topics.
      • How do the authors interpret the results of the clustering? What are their findings?
    • Section 2. DeltaCon: Intuition
      • Familiarize yourself with the problem definition.
      • What is graph similarity? Do two graphs need to have the same node set if you want to measure their similarity? What if two graphs have different sets of nodes?
      • What are some intuitive ways to measure similarity? Skip ahead to Section 5 and read the description of Vertex/Edge Overlap and Graph Edit Distance. Try to understand how these work. Which of these is the simple algorithm the authors mention in Section 2? What is the objection the authors give in rejection of this algorithm?
      • What is node affinity? How do we construct similarity matrix \(S\)? What are the two conceptual steps that this paper proposes to obtain a graph similarity score?
      • What is the definition of a neighborhood in the context of graph theory? Should nodes that are farther away have less influence?
      • "Intuitively, node \(i\) has more influence/affinity to node \(j\) if there are many short, relatively heavily weighted paths from node \(i\) to node \(j\)." Think about this statement. Is it intuitive? Could you raise objections to this statement?
      • Why use Belief Propagation (BP)? How does it take into account the direct and \(k\)-neighbors?
      • Consider the properties that a similarity score should satisfy. Do you think they are all needed? There are three axioms, three statements on intuition, and one informal property. What are they? Why is the informal property not listed with the others?
      • Is it easy to compute pairwise node affinity scores for each graph?
  • Second and third classes (2016-10-13 and 2016-10-18)
    • Section 3. DeltaCon: Details
      • Why have the authors used rooted Euclidean distance (ED) instead of regular ED? How does regular ED violate some axioms?
      • How is DeltaCon faster than DeltaCon0? What is the expected time complexity? Will these two give the exact same result?
      • Does DeltaCon also satisfy all the axioms and statements on intuition listed in Section 2.4?
      • The first paragraph of Section 3.2 is very important. Read it, and try to answer the following questions:
        • Why is DeltaCon0 slow? Try to think through the time complexity of this algorithm.
        • What is its time complexity? Is this acceptable? How does it scale?
        • What is different about DeltaCon versus DeltaCon0 that allows it to be faster?
        • What method do the authors use to derive the \(g\) groups? What is the most critical criterion placed on \(g\)?
        • What range of values do the authors use for \(g\)? (Hint: see pg. 30, Fig. 9 for some answers on the last question.)
        • What trick do the authors employ to achieve the linear running time they claim for DeltaCon? (Hint: Lemma 3.1)
        • What are the authors assuming about their graphs in the proof of Lemma 3.2?
      • How fast is their algorithm on a 1.6 million node graph? How long did your code take to run for our last assignment?
      • Take note of their discussion of random node partitioning. What do they say is the reason for the unintuitive failure of METIS? How do we normally partition our brain graphs? Could this be cause for concern?
    • Section 3.3
      • Does DeltaCon obey the authors' identity axiom? Is this an intuitive result?
      • Does DeltaCon obey the authors' symmetry axiom? Is this an intuitive result?
      • Does DeltaCon obey the authors' zero axiom? Try to get a high-level grasp of the proof.
      • Does DeltaCon obey the edge importance property? Do not concern yourself with the proof. Rather, think of the barbell graph the authors give as an example and think about how similarity scores would change afterward.
      • Does DeltaCon obey the edge-"submodularity" property? Do the authors know the answer to the question?
      • Does DeltaCon obey the weight-awareness property? Again, spend a few minutes trying to get a high-level intuition for this proof (this math is not that bad). We will discuss this proof briefly in class.
      • What do the authors mention at the very end of this section?
    • Section 4. DeltaCon-ATTR: Remember, skip this section!
  • Fourth class (2016-10-20)
    • Section 5. Experiments
      • Take a look at the other methods that the authors have compared their method to, and try to understand how each one works.
      • Figure 3 and Table II: understand how the authors named the graphs (useful for later figures), why the authors chose these graphs, and try to work out how the above algorithms would apply.
      • In Table III, how do the other methods fail with regard to the axioms the authors defined? Be careful about the delta similarity and delta distance.
      • Table VI lists the datasets the authors will use later. Take a look at the magnitude of nodes and edges.
      • What do the authors conclude from Figure 4?
      • Skip Section 5.2.
      • In Section 5.3, how do the authors show that DeltaCon is scalable? (Figure 8) Is DeltaCon robust to the number of groups? What does "robust" mean here? (Figure 9)
    • Section 6: DeltaCon and DeltaCon-attr at work (Application)
      • Enron dataset: what does the dataset contain? What does the similarity score mean in Fig. 10?
      • For the anomalies in Fig. 10, you may have noticed that they appear in groups of at least two days at a time. Why does this happen?
      • Section 6.3: how many subjects are there in test-retest brain connectomes? What does test-retest mean?
      • What conclusion do the authors draw from Fig. 12?
    • Section 7: we won't discuss this section.
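
To connect Sections 2 and 3, here is a small, slow sketch of the DeltaCon0 idea as I read it: compute a node-affinity matrix \(S = [I + \epsilon^2 D - \epsilon A]^{-1}\) for each graph, compare the two matrices with the rooted Euclidean distance \(d\), and report \(1/(1+d)\). Several things here are my own simplifications and not taken from the paper: the function names, the dense Gauss-Jordan inverse (usable only for tiny graphs), the restriction to unweighted undirected graphs without self-loops, and the choice of a single \(\epsilon\) for both graphs based on the larger maximum degree.

```python
import math


def adjacency(n, edges):
    """Symmetric 0/1 adjacency matrix for an undirected edge list."""
    adj = [[0] * n for _ in range(n)]
    for u, v in edges:
        adj[u][v] = adj[v][u] = 1
    return adj


def invert(matrix):
    """Gauss-Jordan inverse with partial pivoting (small dense matrices only)."""
    n = len(matrix)
    aug = [list(row) + [float(i == j) for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [x / scale for x in aug[col]]
        for r in range(n):
            if r != col and aug[r][col] != 0.0:
                factor = aug[r][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def affinity_matrix(adj, eps):
    """S = (I + eps^2 D - eps A)^(-1), where D holds the node degrees."""
    n = len(adj)
    degree = [sum(row) for row in adj]
    m = [[(1.0 + eps * eps * degree[i] if i == j else 0.0) - eps * adj[i][j]
          for j in range(n)] for i in range(n)]
    return invert(m)


def deltacon0_similarity(adj1, adj2):
    """Similarity in (0, 1]; 1 means the affinity matrices are identical."""
    max_degree = max(max(sum(r) for r in adj1), max(sum(r) for r in adj2))
    eps = 1.0 / (1.0 + max_degree)  # assumption: one eps shared by both graphs
    s1 = affinity_matrix(adj1, eps)
    s2 = affinity_matrix(adj2, eps)
    # Rooted Euclidean distance; max(., 0) guards against rounding noise.
    d = math.sqrt(sum((math.sqrt(max(a, 0.0)) - math.sqrt(max(b, 0.0))) ** 2
                      for r1, r2 in zip(s1, s2) for a, b in zip(r1, r2)))
    return 1.0 / (1.0 + d)


if __name__ == "__main__":
    # Two triangles joined by a bridge between nodes 2 and 3.
    edges = [(0, 1), (0, 2), (1, 2), (2, 3), (3, 4), (3, 5), (4, 5)]
    full = adjacency(6, edges)
    no_bridge = adjacency(6, [e for e in edges if e != (2, 3)])
    no_inner = adjacency(6, [e for e in edges if e != (0, 1)])
    print("bridge removed:    ", deltacon0_similarity(full, no_bridge))
    print("inner edge removed:", deltacon0_similarity(full, no_inner))
```

The paper's edge importance property suggests that removing the bridge should lower the score more than removing an edge inside a triangle; run the sketch and see whether the two printed scores agree with that expectation. Also think about which step dominates the running time, since that is what DeltaCon is designed to avoid.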

A high-throughput framework to detect synapses in electron microscopy images

  • First class (2016-10-25)
    • Introduction
      • What is the topic or main question to be answered in this paper? Why do they study synapses?
      • How many synapses are there in the mammalian brain? Compare this number with the number of neurons in the brain.
      • Understand why synapses are important and how some diseases are related to synapse dysfunction.
      • Understand why other techniques for studying synapse distribution fail to provide accurate and large-scale measurements.
      • The authors stress the notion of "high-throughput". What do you think is "high-throughput" about their framework?
      • Why is bioimage informatics important? How much data does a single confocal microscopy imaging session produce?
      • What are the overall steps in the framework for high-throughput synapse analysis?
      • What are the key computational challenges listed in the paper? Are there any more that you can think of?
    • Approach and Methods
      • Please ignore the preparation of transmission electron microscopy samples. (second paragraph of 3.1)
      • Section 3.1
        • What is the experimental technique to selectively stain for synapses? How does it work?
        • The authors used the barrel field, a part of primary somatosensory cortex area, to study synapses. What is the function of the barrel field? You can use the mouse brain atlas to find this region (click "coronal" from "Launch interactive atlas viewer").
        • They used mice at three postnatal days (P14, P17 and P75). What developmental stages do these days correspond to? Are these samples sufficient to study changes in the density or strength of synapses?
      • Section 3.2
        • What is the two-step approach in their machine-learning framework to detect synapses?
        • What are the steps in the image processing pipeline to reduce noise? Why are EM images noisy?
        • What are the drawbacks of non-overlapping sliding windows and overlapping sliding windows?
        • What is the contrast-limited adaptive histogram equalization (CLAHE) algorithm? How does it solve the problems in the sliding window approach?
        • Why do the authors set a 10% sample-independent threshold to do the final segmentation? Is this important or necessary?
        • How do the authors validate their segmentation? Does their segmentation preserve synapses?
        • Why is the background of a segment important? How can we use background information to separate synapses from synapse-like structures such as mitochondria or nuclei?
        • Why do the authors normalize each candidate window? Understand the equation. Is the value in this equation similar to the \(z\)-score in the \(z\)-test?
        • How does the Hough transformation work? What is the pixel size after rotation?
      • Figure 1
        • In Figure 1A, does EPTA staining selectively stain synapses well? Can you recognize the positive and negative segments in EPTA staining images?
        • In Figure 1B, did normalization and alignment reduce the heterogeneity?
      • Section 3.3
        • What is texture? What is the method to generate texture-based features?
        • What is the MR8 filter bank?
        • Get a basic idea of the 10 shape descriptors of a synapse
        • Why do they use histogram of oriented gradients (HoG)? How does HoG work?
        • What is the total number of features for classifiers?
        • What three classifiers are they using? How do the three classifiers work?
  • Second class (2016-10-27)
    • Figure 2
      • The image used in Figure 2 is the same as the image on the right of Figure 1A. How many identified synapses are highlighted?
      • There are two groups of features in the sub-image titled "Feature Extracted" and in the pseudocode. Which group do HoG features belong to?
    • Section 3.4
      • What are the compatibility conditions for co-training algorithms? Why can't the two conditions be satisfied simultaneously?
      • The iteration process retrains a final single classifier. Is this classifier a new one?
      • Did this approach provide another way to account for variability in never-seen-before samples requiring explicit annotation?
    • Section 3.5
      • What is the negative-to-positive ratio in supervised learning?
      • In semi-supervised learning, why do we train the classifiers using data from sample A and test on data from sample B? Why don't we mix the labeled samples?
    • Section 4.1
      • Did the two experts annotate the same images?
      • The authors have two thresholds in Table 1. There is a difference of 8% for threshold=0.5. Is this value of 8% statistically significant?
    • Section 4.2 and Figure 3
      • What is k-fold cross-validation?
      • What do the terms ROC, AUC, precision, and recall mean? How do we calculate them?
      • Where does perfect classification lie in ROC space?
      • What are the differences between the curves in Figure 3A and those in Figure 3B? Do they show the same result for classifier performance?
    • Section 4.2 and Table 2
      • What does precision-recall AUC mean?
      • Which threshold is better, 0.5% or 1.5%, and why?
      • Why did the authors do semi-supervised learning on mice at the same age? What did the result show us?
    • Section 4.3
      • Read this section and be prepared for a discussion in class.
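
The evaluation terms in Section 4.2 (ROC, AUC, precision, recall) can be made concrete with a small sketch. The Python below is illustrative only and is not the authors' code: the function names and the toy labels and scores are made up for this example, and it assumes that a candidate is called a synapse when its classifier score is at or above a threshold.

```python
def confusion_counts(labels, scores, threshold):
    """Count TP, FP, TN, FN when score >= threshold is called positive."""
    tp = fp = tn = fn = 0
    for label, score in zip(labels, scores):
        predicted_positive = score >= threshold
        if predicted_positive and label == 1:
            tp += 1
        elif predicted_positive and label == 0:
            fp += 1
        elif label == 0:
            tn += 1
        else:
            fn += 1
    return tp, fp, tn, fn


def precision_recall(labels, scores, threshold):
    """Precision = TP / (TP + FP); recall = TP / (TP + FN)."""
    tp, fp, _, fn = confusion_counts(labels, scores, threshold)
    precision = tp / (tp + fp) if tp + fp else 1.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


def roc_points(labels, scores):
    """(false positive rate, true positive rate) at each distinct threshold."""
    positives = sum(labels)
    negatives = len(labels) - positives
    points = [(0.0, 0.0)]  # threshold above every score: nothing called positive
    for threshold in sorted(set(scores), reverse=True):
        tp, fp, _, _ = confusion_counts(labels, scores, threshold)
        points.append((fp / negatives, tp / positives))
    return points


def area_under_curve(points):
    """Trapezoidal area under a curve given as (x, y) points sorted by x."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area


# Toy data (hypothetical): 1 = synapse, 0 = not a synapse.
toy_labels = [1, 1, 0, 1, 0, 0]
toy_scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
print(precision_recall(toy_labels, toy_scores, threshold=0.5))
print(area_under_curve(roc_points(toy_labels, toy_scores)))
```

Sweeping the threshold traces out the ROC curve one point at a time, which is why a single AUC number summarizes performance over all thresholds while precision and recall describe one particular threshold.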

Resynchronization of circadian oscillators and the east-west asymmetry of jet-lag

  • Overall paper:
    • What is the problem that the authors are trying to solve?
    • What approach do they use?
    • What is their main result?
  • Abstract and Introduction:
    • What are circadian rhythms?
    • What is special about the circadian system that the authors want to investigate?
    • What is the SCN and where is it located? How does the SCN contribute to circadian rhythms?
    • Are we studying brain networks of the type we have considered earlier in the semester?
      • What property of neuronal networks are we studying?
    • What does the word "dimensionality" represent in this paper?
      • How many neurons comprise the SCN? What is the dimensionality of the system they are considering?
    • What is a phase space?
    • Deconstruct the sentence: "We model synchronization of SCN cells using the forced Kuramoto model, which consists of a large population of coupled phase oscillators (modeling individual SCN cells) with heterogeneous intrinsic frequencies and external periodic forcing."
      • This sentence describes the entire mathematical model used here, and our objective is to understand what it means.
  • Model
    • Does the model consider a frame of reference?
    • What are the major simplifications/assumptions made in the model?
    • Can there be any reason for the authors choosing a Lorentzian or Cauchy distribution?
    • What is the advantage of introducing an order parameter?
    • What is the intuitive meaning of the order parameter \(z\)?
    • What is the initial condition used in the reduced model?
    • What is \(p\) and how is it defined?
    • What is a steady state? What is the steady state of the infinite-dimensional model?
    • Discuss the meanings of the parameters in the final reduced model.
  • Dynamics
    • What are bifurcations?
    • What do the following mean intuitively?
      • \(|z|<1\)
      • \(|z|=1\)
      • \(|z|>1\)
  • Recovery from jet-lag
    • Discuss the implications of recovering by phase advance or phase delay.
  • Reference parameter set
    • What are the arguments proposed by the authors in selecting parameters?
      • Do these arguments impact their final selection of parameter values? If so, how?
    • How do they study the effect of the parameter set on the dynamics of the system? Do you agree with this method?
    • What is their interpretation of the dynamics resulting from the reference parameter set?
  • Parameter Dependence
    • Do the actual values of parameters used in this discussion have any "real" meaning? Can we accept the authors' interpretation of these results?
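
To get a feel for the model before class, here is a minimal simulation sketch. It assumes the standard textbook form of the forced Kuramoto model, \(\dot{\theta}_i = \omega_i + \frac{K}{N}\sum_j \sin(\theta_j - \theta_i) + F \sin(\sigma t - \theta_i + p)\), with the order parameter \(z = \frac{1}{N}\sum_j e^{i\theta_j}\). The paper's notation, frame of reference, and parameter values may differ. All function names are made up for this example, \(p\) is treated simply as a constant phase offset in the forcing term, and the integration is plain forward Euler.

```python
import cmath
import math
import random


def order_parameter(phases):
    """Complex order parameter z: |z| near 1 means synchrony, near 0 incoherence."""
    return sum(cmath.exp(1j * theta) for theta in phases) / len(phases)


def lorentzian_sample(center, width, rng):
    """Draw one intrinsic frequency from a Lorentzian (Cauchy) distribution."""
    return center + width * math.tan(math.pi * (rng.random() - 0.5))


def kuramoto_step(phases, omegas, K, F, sigma, p, t, dt):
    """One forward-Euler step of the forced Kuramoto model (assumed form)."""
    z = order_parameter(phases)
    r, psi = abs(z), cmath.phase(z)
    updated = []
    for theta, omega in zip(phases, omegas):
        # K * r * sin(psi - theta) equals (K/N) * sum_j sin(theta_j - theta).
        coupling = K * r * math.sin(psi - theta)
        forcing = F * math.sin(sigma * t - theta + p)
        updated.append(theta + dt * (omega + coupling + forcing))
    return updated


def simulate(phases, omegas, K, F=0.0, sigma=0.0, p=0.0, dt=0.01, steps=2000):
    """Integrate the model for `steps` Euler steps and return the final phases."""
    t = 0.0
    for _ in range(steps):
        phases = kuramoto_step(phases, omegas, K, F, sigma, p, t, dt)
        t += dt
    return phases


# Hypothetical parameter values, chosen only to make the sketch run.
rng = random.Random(1)
n_cells = 200
omegas = [lorentzian_sample(1.0, 0.05, rng) for _ in range(n_cells)]
phases = [rng.uniform(0.0, 2.0 * math.pi) for _ in range(n_cells)]
final = simulate(phases, omegas, K=1.0, F=0.2, sigma=1.0)
print(abs(order_parameter(phases)), abs(order_parameter(final)))
```

Tracking only \(z\) instead of every phase is the idea behind the order parameter: one complex number summarizes the state of the whole population.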

Efficient Physical Embedding of Topologically Complex Information Processing Networks in Brains and Computer Circuits

  • Introduction
    • What factor drives the evolution of VLSI circuits?
    • What do the authors hypothesize about the evolution of biological information networks?
    • What do the authors say about the properties shared by artificial and biological information processing networks?
    • What is Rent's rule? See equation 1.
    • What are the different datasets used in this study? You will find details in the "Network data" section in "Materials and Methods."
    • What do the authors aim to do with these datasets?
  • Topological Rentian scaling
    • What did the authors do to show that all networks follow Rent's rule in topological space? Look at Figure 3.
    • In "Materials and Methods," the authors say that they used hMetis software to compute the topological Rent exponent. How does the algorithm work? Figure 1B is illustrative.
    • What is the definition of topological dimension and how is it related to the Rent exponent? Can you explain the reasoning behind the formula? You will have to read reference number 5.
    • Examine Table 1 and see how Topological dimension and Euclidean dimension compare against each other.
    • What do the authors say about the relationship between high dimensional topologies and their logical capacity? Look at Figure 4.
    • What do the authors say about the wiring cost of high dimensional topologies? Is it high or low?
    • Can we compute the physical layout of a high dimensional topology with an optimal wiring cost?
  • Physical Rentian scaling
    • What did the authors do to show that all networks follow Rent's rule in physical space? Look at Figure 1C.
    • How is the method used for physical partitioning different from the method used for topological partitioning? Look at the caption for Figure 1C.
    • How did the authors estimate physical Rent exponent?
  • Wiring Length
    • How is wiring length related to Topological dimension and Euclidean dimension? Look at equation 3.
    • What is the criterion for a cost-efficient embedded network?
    • When the authors say a network is cost efficiently embedded, does it mean that the cost of wiring is minimum?
    • Look at Table 1 and see if all the networks satisfy the criterion that the coefficient \(k \approx 1\) if the network has been cost-efficiently embedded.
    • Are the brain networks studied in this paper cost-efficient? If not, how are these networks different from the cost efficient networks? Look at Figure 5B.
    • Look at Figure 5B and try to intuitively understand how our brain network has evolved by natural selection. Hint: What was the deduction from Figure 4?
  • Allometric scaling
    • Look at Figure 4B. What is the source of the data shown as gray circles?
    • According to the figure, what is the relationship between gray matter volume and white matter volume in log-log scale? Is it linear? What does it mean to have a linear relationship in log-log scale?
    • Did the MRI and DSI datasets contain gray and white matter volume measurements? If no, how did the authors plot the relationship between gray matter volume and white matter volume in the MRI and DSI networks?
    • What conclusions do the authors draw from Figure 4B?
    • The authors later compare the Rent exponents of the cortex and the cerebellum. How did they estimate Rent exponent for the cortex? Did they use the Rent exponent computed for the MRI network? How did the authors justify their estimation for cortical Rent exponent?
    • How did they estimate the Rent exponent for the cerebellum? Did they justify their choice of Rent exponent?
    • What is the relationship between the Rent exponents of the cortex and the cerebellum? Are they equal? If not, which one is greater?
    • What conclusions do the authors draw from this analysis?
  • Hierarchical modularity
    • The authors say that "each network could be sub-divided into a number of sparsely interconnected modules each comprising a number of densely intra-connected nodes." Is this property necessary to divide a network into modules?
    • What is hierarchical modularity? The authors have given a definition in the "Hierarchical modularity" section.
    • Which algorithm did the authors use to compute hierarchical modules? We will be discussing the algorithm in class.
    • How did the authors show that modularity was not occurring randomly? Identify the two types of benchmark networks they generated for this purpose.
    • Study Figure 2.
      • What does the number in each circle represent?
      • What does the value \(m\) represent?
      • Figure out the significance of each color in the dendrograms.
      • Are these the full hierarchical decomposition of the given networks?
    • What conclusions do the authors draw from this analysis?
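
Rent's rule and the allometric scaling question both come down to power laws, which appear as straight lines on log-log axes. The sketch below shows one common way to estimate a power-law exponent: a least-squares line through the log-transformed data. It is an illustration only, not the authors' estimation procedure; the function name and the synthetic data are made up, and the variable names `coefficient` and `exponent` are generic rather than the paper's symbols.

```python
import math


def fit_power_law(xs, ys):
    """Fit y ~ coefficient * x**exponent by least squares on (log x, log y)."""
    log_x = [math.log(x) for x in xs]
    log_y = [math.log(y) for y in ys]
    n = len(log_x)
    mean_x = sum(log_x) / n
    mean_y = sum(log_y) / n
    s_xx = sum((a - mean_x) ** 2 for a in log_x)
    s_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(log_x, log_y))
    exponent = s_xy / s_xx  # slope of the line in log-log space
    coefficient = math.exp(mean_y - exponent * mean_x)  # exp(intercept)
    return coefficient, exponent


# Synthetic example: partition sizes and the number of connections that
# cross each partition boundary, generated from an exact power law.
nodes_per_partition = [2, 4, 8, 16, 32, 64]
boundary_connections = [3.0 * n ** 0.75 for n in nodes_per_partition]
print(fit_power_law(nodes_per_partition, boundary_connections))
```

On real data the points scatter around the line, so it is worth plotting them on log-log axes to check that a straight line is a reasonable description before trusting the fitted exponent.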

Tissue-based map of the human proteome

  • The data produced by this research is available at The Human Protein Atlas
  • See this video to learn about how a gene is transcribed into messenger RNA and then translated into protein.
  • We will not spend much time on the experimental techniques used to generate the data in this paper. To the extent possible, familiarise yourself with the two key ways of measuring gene and protein expression described in the first half of the second paragraph of the paper (not including the abstract).
  • We will discuss the paper through its figures. Therefore, please study each figure carefully and understand the accompanying text.
  • What are the different types of proteins they attempt to characterise in this paper? (Second half of the second paragraph of the paper.)
  • What are housekeeping genes? (Bottom of paragraph four on page one of the PDF.)
  • Fig 1
    • Fig 1A: How many tissues do they study in this paper? There are different numbers mentioned: 32 and 44. Explain why there are two different numbers. Which tissues correspond to the brain?
    • Fig 1B: How did the authors generate this figure? Where have we seen a similar algorithm used in this class?
    • Fig S1: Examine Fig 1B and Fig S1 to understand the authors' claims on tissues that are outliers and tissues that have a clear connectivity.
    • We will ignore Fig S2 and S3.
    • Fig 1C and Table 1
      • Use Table 1 to understand the different classifications of proteins.
      • What is this mysterious quantity called FPKM?
      • What is a "tissue-specific" gene? Is the use of this term supported by the authors?
    • Fig 1B: What do you observe about testes and brain?
    • We will spend just a little time on Figs 1D-1F.
  • Fig 2
    • Fig 2A-N: We will ignore these panels.
    • Fig 2O and Fig S4: Study these two figures together. In Fig S4, what property distinguishes a red circle from an orange circle? Which tissues jump out at you?
    • Fig 2P: To understand this figure, I will introduce the Gene Ontology and functional enrichment. Read the text to get an understanding of the types of functions that are unique to specific tissues.
  • Fig 3.
    • What are the authors predicting? Try to get a high-level idea of how they made these predictions (read the supplement).
    • We will use Figs 3A/B to introduce the relevant concepts, study Figs 3D/F in more detail, and ignore Figs 3C/E.
  • If there is time, we will study Fig 4, especially to understand the implications of the sentences at the end of the sections "The druggable proteome," "The cancer proteome," and "Tissues versus cell lines."
  • We will ignore Fig 5.
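
As a starting point for the FPKM question under Fig 1C, here is a small sketch of how the quantity is usually computed. FPKM is commonly expanded as "fragments per kilobase of transcript per million mapped fragments." The function name and the numbers are made up for illustration, and the paper's own pipeline may involve additional steps.

```python
def fpkm(fragments_for_gene, gene_length_bp, total_mapped_fragments):
    """Fragments per kilobase of transcript per million mapped fragments."""
    kilobases = gene_length_bp / 1_000
    millions_mapped = total_mapped_fragments / 1_000_000
    return fragments_for_gene / (kilobases * millions_mapped)


# Hypothetical gene: 500 fragments, 2,000 bp long, 20 million mapped fragments.
print(fpkm(500, 2_000, 20_000_000))
```

Dividing by gene length and by sequencing depth is what lets us compare expression between genes of different lengths and between samples sequenced to different depths.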

Genome-wide prediction and functional characterization of the genetic basis of autism spectrum disorder

  • Introduction
    • What causes the different levels of Autism Spectrum Disorder?
    • How many genes were known to contribute to autism risk? How many are estimated to contribute to autism risk?
    • What are the limitations of the previous approaches of the network-based analyses? (paragraph 2)
    • What is the goal of this paper?
  • Figure 1
    • What is the set of positive genes and negative genes they use to train their classifier?
      • What are the different evidence levels of positive genes?
      • How many genes belong to each category?
      • Do you agree with their choice of negatives?
    • Who developed the human brain-specific functional interaction network? (see reference 21)
    • How is the interaction network incorporated with the positive and negative genes?
      • What type of classifier did they use?
      • See the end of the first paragraph of the "Learning and cross-validation of the network-based classifier" section of the methods
      • Could they have used other classifiers? Or other network-based algorithms such as diffusion?
    • How are genes ranked based on their predicted autism association?
      • See the "Genome-wide prediction using evidence-weighted network classifier" section of the methods.
      • Supplementary Fig. 1 gives an example of a highly predicted gene with no previous evidence
  • Figure 2
    • How did they generate the AUC boxplots in Figure 2a?
      • See the second paragraph of the "Learning and cross-validation of the network-based classifier" section of the methods
    • In Figure 2a, why did "E1 + E2 + E3 + E4" perform better than "E1 + E2 + rand34"?
    • In Figure 2b, what are the different gene sets? Where did the data for this analysis come from?
      • See the "Evaluation of autism-associated gene ranking …" section.
      • What does it mean that the sibling gene set was not significant while the proband gene sets were?
    • They say they used a permutation test to get the p-values. What are they permuting and comparing the permutations with?
    • In Figures 2c-e, what is a decile? Notice that in 2c, the gene sets are the same as used in 2b.
      • What does it mean for a high fraction of genes to be in the first decile compared to a uniform distribution of genes?
      • Why are they looking at the regulators and pathways in 2d and 2e?
  • Figure 3
    • Where did the data for Figure 3 come from?
      • See reference 29
    • What does each cell represent? What does it mean that most of the red squares are focused around the early to late midfetal range? Is that when ASD develops?
    • Notice that the autism-associated genes seem to be spread out across all of the brain regions. How do the authors interpret this?
  • Figure 4
    • What is the purpose or contribution of this analysis?
    • How do they link clusters to specific cellular functions and relate them to autism?
    • Choose a cluster and try to figure out how the cellular functions of that cluster could relate to autism.
  • Figure 5
    • What is a CNV? Why would having more or fewer copies of a gene be bad?
    • How common are these CNVs in persons with autism? (See the percentages at the top of each CNV region.)
    • What do they say about the previously non-associated, top-ranked PPP4C and MAZ genes in the most common CNV?
  • Figure 6
    • What is an "intermediate gene"?
    • What is the purpose of this analysis?
  • Results/Discussion
    • What do they say is their chief contribution?
    • Have they updated their original analysis with updated versions of SFARI and other databases?
      • What version are they on now?
    • What is their hypothesis about the transcription factor MAZ?
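
For the permutation-test question under Figure 2, the following sketch shows the general idea of such a test. It is a generic illustration and not necessarily the authors' procedure: it assumes that each gene has a numeric score, that the statistic is the mean score of a gene set, and that the null distribution comes from random gene sets of the same size. The function and gene names are hypothetical.

```python
import random


def permutation_p_value(scores, gene_set, n_permutations=1000, seed=0):
    """Empirical p-value for the mean score of gene_set being this high by chance."""
    rng = random.Random(seed)
    genes = sorted(scores)  # fixed gene order keeps the result reproducible
    set_size = len(gene_set)
    observed = sum(scores[g] for g in gene_set) / set_size
    at_least_as_high = 0
    for _ in range(n_permutations):
        random_set = rng.sample(genes, set_size)
        random_mean = sum(scores[g] for g in random_set) / set_size
        if random_mean >= observed:
            at_least_as_high += 1
    # Add one to numerator and denominator so the p-value is never exactly zero.
    return (at_least_as_high + 1) / (n_permutations + 1)


# Toy data: 100 hypothetical genes whose score equals their index.
toy_scores = {f"gene{i}": i for i in range(100)}
top_genes = [f"gene{i}" for i in range(90, 100)]
print(permutation_p_value(toy_scores, top_genes))
```

When reading the paper, identify the two choices that this sketch fixes by assumption: what is being shuffled, and which statistic the shuffled versions are compared against.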

Introductory Videos

These videos introduce brain structure and function, and molecular and cell biology. I will play some of them in class.

These videos discuss general molecular and cell biology.


Date Topic and Papers Presenter(s)
Aug 23, 2016 Introduction to Computing the Brain T. M. Murali
Aug 25, 2016 Introduction to Computing the Brain, Discussion of papers T. M. Murali
Aug 30, 2016 Collective dynamics of 'small-world' networks (Reading notes) T. M. Murali
Sep 1, 2016 Collective dynamics of 'small-world' networks (continued) T. M. Murali
Sep 6, 2016 The small world of the cerebral cortex (Reading notes) T. M. Murali
Sep 8, 2016 No class  
Sep 13, 2016 An introduction to diffusion tensor image analysis (Reading notes); Rich-Club Organization of the Human Connectome (Reading notes) Aditya Pratapa
Sep 15, 2016 Rich-Club Organization of the Human Connectome Aditya Pratapa
Sep 20, 2016 Rich-Club Organization of the Human Connectome Aditya Pratapa
Sep 22, 2016 Cortical High-Density Counterstream Architectures (Reading notes) Noah Luther
Sep 27, 2016 Cortical High-Density Counterstream Architectures Noah Luther
Sep 29, 2016 Randomization and resilience of brain functional networks as systems-level endophenotypes of schizophrenia (Reading notes) Audrey Decker
Oct 4, 2016 Randomization and resilience of brain functional networks as systems-level endophenotypes of schizophrenia Audrey Decker
Oct 6, 2016 No class  
Oct 11, 2016 DeltaCon: Principled Massive-Graph Similarity Function with Attribution (Reading notes) Mitch Wagner and Xinfeng Xu
Oct 13, 2016 DeltaCon: Principled Massive-Graph Similarity Function with Attribution Mitch Wagner and Xinfeng Xu
Oct 18, 2016 DeltaCon: Principled Massive-Graph Similarity Function with Attribution Mitch Wagner and Xinfeng Xu
Oct 20, 2016 DeltaCon: Principled Massive-Graph Similarity Function with Attribution Mitch Wagner and Xinfeng Xu
Oct 25, 2016 A high-throughput framework to detect synapses in electron microscopy images (Reading notes) Zhen Guo
Oct 27, 2016 A high-throughput framework to detect synapses in electron microscopy images Zhen Guo
Nov 1, 2016 Resynchronization of circadian oscillators and the east-west asymmetry of jet-lag (Reading notes) Amogh Jalihal
Nov 3, 2016 Resynchronization of circadian oscillators and the east-west asymmetry of jet-lag Amogh Jalihal
Nov 8, 2016 No class  
Nov 10, 2016 Efficient Physical Embedding of Topologically Complex Information Processing Networks in Brains and Computer Circuits (Reading notes) Aditya Bharadwaj
Nov 15, 2016 Efficient Physical Embedding of Topologically Complex Information Processing Networks in Brains and Computer Circuits Aditya Bharadwaj
Nov 17, 2016 Tissue-based map of the human proteome (Reading notes) T. M. Murali
Nov 29, 2016 No class  
Dec 1, 2016 Genome-wide prediction and functional characterization of the genetic basis of autism spectrum disorder (Reading notes) Jeff Law
Dec 6, 2016 Genome-wide prediction and functional characterization of the genetic basis of autism spectrum disorder Jeff Law


  1. Assignment 1, released on Sep 20, 2016, due on October 4, 2016