You will have three individual assignments, eleven seminar submissions, and one group project.
All deadlines are at 11:59 pm (Pacific time) on the due date. Any submission or modification made after the deadline will not be graded unless you have requested an extension. If you anticipate trouble meeting a deadline and need to request an extension or academic concession, please reach out via email.
For detailed instructions on how to submit your work, see the submission guide.
For a visual summary, click here.
Category | Assignment | Due Date |
---|---|---|
Seminar | Seminar 1 | Fri Jan 12 |
Seminar | Seminar 2a & 2b | Fri Jan 19 |
Individual Assignment | Intro Assignment | Thu Jan 25 |
Seminar | Seminar 3 | Fri Jan 26 |
Group Project | Proposal Lightning Talks | Tue Jan 30 |
Seminar | Seminar 4 | Fri Feb 02 |
Group Project | Written Proposal | Thu Feb 08 |
Seminar | Seminar 5 | Fri Feb 09 |
Seminar | Seminar 6 | Fri Feb 16 |
Paper Critique | Paper Critique | Thu Feb 22 |
Individual Assignment | Analysis Assignment | Thu Feb 29 |
Seminar | Seminar 7 | Fri Mar 01 |
Group Project | Progress Report | Thu Mar 07 |
Seminar | Seminar 8 | Fri Mar 15 |
Seminar | Seminar 9 | Fri Mar 22 |
Seminar | Seminar 10 | Fri Mar 29 |
Group Project | Final Report | Tue Apr 02 |
Group Project | Presentation Day 1 | Thu Apr 04 |
Group Project | Presentation Day 2 | Tue Apr 09 |
Group Project | Presentation Day 3 | Thu Apr 11 |
Group Project | Individual & Group Evaluation | Fri Apr 12 |
Seminar | Seminar 11 | Fri Apr 12 |
This assignment is designed to give you independent practice in the workflow used for completing and submitting course work: committing and pushing files to GitHub, formatting an R Markdown document, using R to do simple analyses, and writing about your results. Grade point values are listed in the assignment.
The instructions and questions are available here.
You will submit short “deliverables” to demonstrate your participation in every seminar. These deliverables give you practical experience applying knowledge that will be helpful on the homework assignments, the final project, and (hopefully) your future research. Each seminar is weighted equally, but the lowest score will be dropped, so the 10 seminars with the highest scores each count for 2% of the final grade. Seminar deliverables are due on the Friday following the TA-led session for that seminar. See the Seminars page for the submission materials and schedule.
Each student will review a paper that will be posted on Canvas and provide a 500-700 word critique of it.
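The drop-the-lowest arithmetic described above can be sketched as follows. This is only an illustration, not the official grade calculation: the function name `seminar_contribution`, the convention that each score is a fraction between 0 and 1, and the example scores are all assumptions made here.

```python
def seminar_contribution(scores, weight_each=2.0):
    """Percent of the final grade earned from seminars.

    scores: one fraction in [0, 1] per seminar (assumed convention).
    The single lowest score is dropped; each remaining seminar is
    worth `weight_each` percent of the final grade.
    """
    if len(scores) < 2:
        raise ValueError("need at least two seminar scores to drop one")
    kept = sorted(scores)[1:]  # drop the lowest score
    return sum(score * weight_each for score in kept)


# Hypothetical student: 11 seminars, nine full marks, one half mark, one missed.
scores = [1.0] * 9 + [0.5, 0.0]
total = seminar_contribution(scores)  # the missed seminar is the one dropped
```

With these made-up scores, the missed seminar is dropped and the half-mark seminar still counts toward the total.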
Please see critique rubric for detailed instructions on this task.
This assignment will assess your understanding of the seminar and lecture materials. Start early because this assignment will take time to fully work through. Use the issues in the Discussion repo and the seminar time to ask questions. You will find most of the analysis workflow of the assignment in the seminar materials.
The instructions and questions will be available here.
The grade point values for each question are listed right in the assignment.
Identify a biological question of interest and a relevant dataset. Develop and apply a statistical approach that allows you to use the dataset to answer the question.
We assume the biological question and data fall in the general area of high-throughput, large-scale biological investigations targeted by the course. Beyond that, it is wide open: methylation, SNPs, miRNAs, CNVs, RNA-Seq, ChIP-Seq, gene networks, … it’s fair game. Avoid a dataset that has little or no quantitative data, i.e. one that contains only sequence or discrete data. If you are using published data, it is critical to be clear about how your project differs from previous literature.
Note that definitive answers are not necessarily expected. Rather, aim to provide a critical appraisal of the data, the analytical approach, and the results. You will have to handle the competing pressures to “get it right” and “get it done”. Shortcomings of the data, misfits between the data or the biological question and the statistical model, etc. are inevitable. Your goal is to identify such issues and discuss them critically, without becoming paralyzed. Demonstrate understanding of the statistical concepts and methods that are the foundation of your analytical approach.
We assume the analytical and computing task will have a substantial statistical component, probably enacted via R. So beware of a major analytical or computational undertaking that is, nonetheless, not statistical (example: constructing a database). Creating useful data visualizations can be absolutely vital and is arguably statistical, but your analysis should go beyond merely creating pretty pictures (but please do include some!). Key concepts, at least some of which should come up in your analysis:
- the (hypothesized, probably artificial) data-generating model
- background variation, variance, signal-to-noise ratio, estimates and their associated standard errors
- relationship between biological factors and experimental factors, apparent relative importance in terms of “explaining” observed data
- attention to large-scale inference, e.g. control of family-wise error rate or false discovery rate
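As an illustration of the last concept, here is a minimal sketch of false discovery rate control using the Benjamini-Hochberg adjustment. It is written in plain Python for self-containment; in R, the course's working language, the same adjustment is available as `p.adjust(p, method = "BH")`. The function name `benjamini_hochberg` and the p-values are made up for this example, and the 0.05 threshold is an arbitrary choice.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down to the smallest so that the
    # adjusted values never increase as the raw p-values decrease.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Made-up p-values, e.g. one per gene in a differential expression test.
p = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205]
q = benjamini_hochberg(p)
# Tests declared significant at a 5% false discovery rate.
discoveries = [i for i, value in enumerate(q) if value <= 0.05]
```

Note that several raw p-values below 0.05 are no longer significant after adjustment, which is the point of accounting for the number of tests performed.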
If your project involves using unpublished data, ensure your plans are known to the data providers (e.g., your supervisors), and think about implications for publishing - are you bringing the project team in as collaborators, in effect? Are you planning to publish the results of your project, and if so, who will be the co-authors? It is best to deal with these questions at the outset of the project.
The projects are not made public (other than an oral presentation of your work in front of your classmates, which will be recorded and made available only to the teaching team and registered students in the course). The project report materials are loaded into GitHub, the secure site we use to manage the course. The course staff and instructors are the only people who have access to the project GitHub repo other than the members of the project group. The data used can be uploaded to the project, but this can be limited or omitted if there are special concerns about privacy etc. - it’s primarily the code and write-up about the results that need to be provided for evaluation.
You can read GitHub’s security and privacy policies.
Groups will be formed by the instructional team following the results of an in-class survey, and posted in the Discussion repo. Groups will have a target size of 4 members. Groups will be formed with priority given to diversity of backgrounds. In practice, this probably means the team members come from a mix of programs/departments. We will try to honour requests to work with specific teammates, and you may come talk to us immediately after receiving group assignments if you’d like to make a change.
Details and grading rubrics for each component of the final group project can be found on the Group project rubrics page.