You will have three individual assignments, eleven seminar submissions, and one group project.
All deadlines are at 11:59 pm (Pacific time) on the due date. Any submission or modification made after the deadline will not be graded unless you have requested an extension. If you anticipate having trouble meeting a deadline and need to request an extension or academic concession, please reach out via email.
For detailed instructions on how to submit your work, see the submission guide.
For a visual summary, click here.
Category | Assignment | Due Date |
---|---|---|
Seminar | Seminar 1 | Fri Jan 10 |
Seminar | Seminar 2a & 2b | Fri Jan 17 |
Individual Assignment | Intro Assignment | Thu Jan 23 |
Seminar | Seminar 3 | Fri Jan 24 |
Group Project | Initial Proposal | Mon Jan 27 |
Seminar | Seminar 4 | Fri Jan 31 |
Seminar | Seminar 5 | Fri Feb 07 |
Group Project | Final Proposal | Tue Feb 11 |
Seminar | Seminar 6 | Fri Feb 14 |
Paper Critique | Paper Critique | Thu Feb 20 |
Seminar | Seminar 7 | Fri Feb 28 |
Individual Assignment | Analysis Assignment | Thu Mar 06 |
Group Project | Progress Report | Tue Mar 11 |
Seminar | Seminar 8 | Fri Mar 14 |
Seminar | Seminar 9 | Fri Mar 21 |
Seminar | Seminar 10 | Fri Mar 28 |
Group Project | Final Report | Tue Apr 01 |
Group Project | Presentation Day 1 | Tue Apr 01 |
Group Project | Presentation Day 2 | Thu Apr 03 |
Group Project | Presentation Day 3 | Tue Apr 08 |
Group Project | Individual & Group Evaluation | Fri Apr 11 |
Seminar | Seminar 11 | Fri Apr 11 |
This assignment is designed to give you independent practice in the workflow used for completing and submitting course work: committing and pushing files to GitHub, formatting an R Markdown document, using R to do simple analyses, and writing about your results. Grade point values are listed in the assignment.
The instructions and questions are available here.
You will submit short “deliverables” to demonstrate your participation in every seminar. These deliverables give you practical experience applying knowledge that will be helpful for homework, the project, and your future research. Each seminar is weighted equally, but the lowest score will be dropped (so that the 10 seminars with the highest scores will each count for 3% of the final grade). Seminar deliverables are due on the Friday following the TA-led session for that seminar. See the Seminars page for the submission materials and schedule.
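As a rough illustration of the drop-lowest rule above, here is a minimal sketch. It assumes each seminar is scored as a fraction between 0 and 1 and that there are eleven seminars, as in the schedule table; the function name `seminar_contribution` and the scoring scale are hypothetical, and the teaching team's actual grade calculation may differ.

```python
def seminar_contribution(scores, weight_each=3.0):
    """Return the percentage points contributed to the final grade.

    scores: one score per seminar, each a fraction in [0, 1] (assumed scale).
    weight_each: percentage points each retained seminar is worth.
    """
    if len(scores) < 2:
        raise ValueError("need at least two seminar scores to drop one")
    # Sort ascending and discard the single lowest score.
    kept = sorted(scores)[1:]
    return sum(score * weight_each for score in kept)


# Eleven seminars: ten with full marks and one missed entirely.
example_scores = [1.0] * 10 + [0.0]
print(seminar_contribution(example_scores))
```

Under these assumptions, a single missed seminar does not reduce the seminar component, because the zero is the score that gets dropped.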
Each student will review and provide a written critique (max 1000 words) of a paper that will be posted on Canvas.
Please see the critique rubric for detailed instructions on this task.
This assignment will assess your understanding of the seminar and lecture materials. Start early because this assignment will take time to fully work through. Use the issues in the Discussion repo and the seminar time to ask questions. You will find most of the analysis workflow of the assignment in the seminar materials.
The instructions and questions will be available here.
The grade point values for each question are listed right in the assignment.
Identify a biological question of interest and a relevant dataset. Develop and apply a statistical approach that allows you to use the dataset to answer the question.
We assume the biological question and data fall in the general area of high-throughput, large-scale biological investigations targeted by the course. Beyond that, it is wide open: methylation, SNPs, miRNAs, CNVs, RNA-Seq, ChIP-Seq, gene networks, … it’s all fair game. Avoid datasets with little or no quantitative data, i.e. those containing only sequence or discrete data. If you are using published data, it is critical to be clear about how your project differs from previous literature.
Note that definitive answers are not necessarily expected. Rather, aim to provide a critical appraisal of the data, the analytical approach, and the results. You will have to handle the competing pressures to “get it right” and “get it done”. Shortcomings of the data, misfits between the data or the biological question and the statistical model, etc. are inevitable. Your goal is to identify such issues and discuss them critically, without becoming paralyzed. Demonstrate understanding of the statistical concepts and methods that are the foundation of your analytical approach.
We assume the analytical and computing task will have a substantial statistical component, probably enacted via R. So beware of a major analytical or computational undertaking that is, nonetheless, not statistical (for example, constructing a database). Creating useful data visualizations can be absolutely vital and is arguably statistical, but your analysis should go beyond merely creating pretty pictures (though please do include some!). Key concepts, at least some of which should come up in your analysis:
the (hypothesized, probably artificial) data-generating model
background variation, variance, signal-to-noise ratio, estimates and their associated standard errors
relationship between biological factors and experimental factors, apparent relative importance in terms of “explaining” observed data
attention to large-scale inference, e.g. control of family-wise error rate or false discovery rate
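To make the last concept concrete, here is a minimal sketch of two standard multiple-testing adjustments: Bonferroni (which controls the family-wise error rate) and Benjamini-Hochberg (which controls the false discovery rate). The function names and the p-values are made up for illustration; in R, the built-in `p.adjust` function provides both methods.

```python
def bonferroni(pvals):
    """Bonferroni-adjusted p-values: multiply by the number of tests, cap at 1."""
    m = len(pvals)
    return [min(1.0, p * m) for p in pvals]


def benjamini_hochberg(pvals):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvals)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvals[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, so each adjusted value is never
    # larger than the adjusted value of any bigger raw p-value.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvals[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical p-values from four tests (illustrative only).
pvals = [0.01, 0.04, 0.03, 0.005]
print(bonferroni(pvals))
print(benjamini_hochberg(pvals))
```

With thousands of features, as is typical in high-throughput data, the Bonferroni adjustment is often very conservative, which is one reason false discovery rate control is commonly used in this setting.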
If your project involves unpublished data, ensure your plans are known to the data providers (e.g., your supervisors), and think about the implications for publishing: are you in effect bringing the project team in as collaborators? Are you planning to publish the results of your project, and if so, who will be the co-authors? It is best to deal with these questions at the outset of the project.
The projects are not made public (other than an oral presentation of your work in front of your classmates, which will be recorded and made available only to the teaching team and registered students in the course). The project report materials are loaded into GitHub, the secure site we use to manage the course. Other than the members of the project group, only the course staff and instructors have access to the project GitHub repo. The data used can be uploaded to the project repo, but this can be limited or omitted if there are special concerns about privacy etc.; it’s primarily the code and write-up about the results that need to be provided for evaluation.
You can read GitHub’s security and privacy policies.
Groups will be formed by the instructional team following the results of an in-class survey, and posted in the Discussion repo. Groups will have a target size of 4 members and will be formed with priority given to diversity of backgrounds. In practice, this probably means the team members come from a mix of programs/departments. We will try to honour requests to work with specific teammates, and you may come talk to us immediately after receiving group assignments if you’d like to make a change.
Details and grading rubrics for each component of the final group project can be found on the Group project rubrics page.