2020 Summer School

Course Listings

Introduction to Python (THIS COURSE IS CLOSED)

Status: closed
Date: June 8 - June 11
Time: 9:00 am - 12:00 pm
Location: ONLINE
Instructor: Chris Larson

Course Closes: May 27

Description: This four-day course will introduce students to basic concepts in scientific computing in the Python language. Trainees will learn introductory topics such as data structures, control flow, functions, file input/output, and data parsing.

Instructor Bio: Chris Larson is an independent Python consultant who has been working with the language since the late ’90s. As a consultant, he has worked to implement solutions for computational phylogenetics researchers, UT’s GSAF and a broad range of other clients in industry and academia. He’s happiest creating order from chaos, exploring interesting puzzles, and introducing folks to the power and simplicity of Python.

Preferred or Prerequisite Skills: None

Computer Requirement: Students must provide a laptop that can connect to the internet and has a Firefox or Chrome browser. UTEID required for wireless access. Please be sure you know your UT EID when you come to class. To obtain a UTEID, go here.

Introduction to Biocomputing (THIS COURSE IS CLOSED)

Status: closed
Date: June 8 - June 11
Time: 1:30 pm - 4:30 pm
Location: ONLINE
Instructor: Benni Goetz (Associate Research Scientist and Bioinformatics Consultant, CBRS)

Course Closes: June 1

Description: An introduction to the Unix command line and Python. Unix basics will include file navigation, pipes, and core utilities. Python basics will cover data types, loops, conditionals, and objects. After the basics are covered, the focus will turn to bioinformatics applications. No previous programming experience assumed.

Instructor Bio: Benni Goetz is a bioinformatics consultant in the CBRS. Python, Bash, and huge computing clusters are some of his favorite things. In a previous life, Benni studied pure math: differential geometry in particular.

Preferred or Prerequisite Skills: No previous programming experience is assumed.

Computer Requirement: Students should have their own laptop computer. UTEID required for wireless access. Please be sure you know your UT EID when you come to class. To obtain a UTEID, go here.

TACC Account Required: Attendees will need a TACC account. Please be sure you know your TACC username when you come to class. To sign up for a TACC account, go here.

Introduction to Core NGS Concepts and Tools (THIS COURSE IS CLOSED)

Status: closed
Date: June 15 - June 19
Time: 9:00 am - 12:00 pm
Location: ONLINE
Instructor: Anna Battenhouse (Associate Research Scientist and Bioinformatics Consultant, CBRS)

Course Closes: June 8

Description: This course provides an introduction to the concepts and vocabulary of Next Generation Sequencing (NGS) with an emphasis on common protocols, tools and file formats used in NGS data analysis. Subjects covered include quality assessment and manipulation of raw NGS sequences (FastQC, cutadapt), read mapping (bwa, bowtie2), the Sequence Alignment Map (SAM) format, and tools for manipulating BAM files (samtools, bedtools). Participants will gain hands-on experience using these and other NGS tools in the Linux command line environment at TACC, as well as exposure to the many bioinformatics resources TACC makes available.

Instructor Bio: Anna Battenhouse is a research scientist in the lab of Dr. Edward Marcotte and leads the Biomedical Research Support Facility’s mission to support the IT and computational needs of UT Austin’s biological sciences community. She has worked extensively with NGS data over the last 10 years, and develops and maintains NGS analysis pipeline scripts for UT’s BioITeam. Anna received a B.A. in English Literature from Carleton College in 1978. After a long career in commercial software development, Anna began her “retirement career” at UT Austin in 2007 and obtained a B.S. in Biochemistry in 2013.

Preferred or Prerequisite Skills: None

Computer Requirement: Students must bring their own laptops. UTEID required for wireless access. Please be sure you know your UT EID when you come to class. To obtain a UTEID, go here.

Introduction to RNA-Seq (THIS COURSE IS CLOSED)

Status: closed
Date: June 15 - June 19
Time: 1:30 pm - 3:30 pm
Location: ONLINE
Instructor: Dhivya Arasappan (Assistant Professor of Practice and Bioinformatics Consultant, CBRS)

Course Closes: June 8

Description: This four-day course provides an introduction to methods for analysis of RNA-seq data. It assumes familiarity and comfort with the Linux command line and TACC. A typical RNA-seq workflow will be featured, starting with quality assessment of raw data and continuing through mapping (bwa, kallisto), differential expression analysis (DESeq2), and downstream analyses and visualization. The course also describes analysis methods for single-cell RNA-Seq data. Participants will gain hands-on experience using these tools in a Linux command line environment at TACC.

Instructor Bio: Dhivya Arasappan has 10 years of experience analyzing NGS data from multiple platforms. Her areas of expertise include RNA-Seq analysis (specifically involving large-scale brain expression datasets and coexpression network analysis), de novo genome assembly (particularly using hybrid sequencing data), and benchmarking of bioinformatics tools. She is the research educator for the Big Data in Biology Freshman Research Initiative stream.

Preferred or Prerequisite Skills: Familiarity with working in a UNIX environment. You may register for Intro to UNIX, a one-day short course in late April, by clicking here.

Computer Requirement: Students should have their own laptop computer. UTEID required for wireless access. Please be sure you know your UT EID when you come to class. To obtain a UTEID, go here.

Genome Variant Analysis (THIS COURSE IS CLOSED)

Status: closed
Date: June 22 - June 26
Time: 9:00 am - 11:30 am
Location: ONLINE
Instructor: Daniel Deatherage, Ph.D. (Postdoctoral Research Associate, Molecular Biosciences)

Course Closes: June 15

Description: This course is designed to teach you how to identify genomic variants from a variety of NGS library sources (mixed populations, whole genome, enriched/targeted panels, rare variant, amplicon, etc.) for both prokaryotic and eukaryotic organisms. The course emphasizes using existing data sources to allow participants to analyze real data in the same step-by-step manner that one would analyze their own data. The modular nature of exercises allows participants of all computational skill levels to benefit from both instruction and hands-on practice in areas they are personally most interested in while providing introductory resources for analysis types they may encounter in the future. Additional lecture/discussion will focus on understanding strengths and weaknesses of different sequencing library types, alternative analysis programs, different sequencing platforms, and how to best utilize TACC resources and existing pipelines to make analysis faster. Major data analysis steps include sequencing quality assessment and improvement, reference genome construction, read mapping, variant calling, visualization, and reporting. Programs and pipelines used include FastQC, Trimmomatic, SPAdes, SAMtools, Bowtie2, bedtools, breseq, IGV, and GATK.

Instructor Bio: Daniel Deatherage earned his doctorate at The Ohio State University studying epigenetic effects of ovarian cancer. His postdoctoral work in Dr. Jeffrey Barrick’s lab has focused on using next generation sequencing to identify ultra-rare mutations within evolving populations and diagnose failure modes of synthetic biology constructs. In general, he is interested in using next generation sequencing to answer novel questions that may not be answerable by other methods.

Preferred or Prerequisite Skills: None

Computer Requirement: Students must use their own laptops. TACC Account and UTEID required. Please be sure you know both your UT EID and your TACC username when you come to class. To obtain a UTEID, go here. To sign up for a TACC account, go here.

Comparative Genomic and Computational Approaches to the Evolution of Complex Phenotypes (THIS COURSE IS CLOSED)

Status: closed
Date: June 22 - June 25
Time: 1:30 pm - 4:30 pm
Location: ONLINE
Instructor: Dr. Hans Hofmann (Professor, Integrative Biology); Dr. Becca Young (Research Associate, Integrative Biology); Dr. Kevin Liu (Assistant Professor, Computer Science and Engineering, Michigan State University)

Course Closes: June 19

Description: Recent progress in genome sequencing technologies and phylogenetic comparative analysis has made it possible to ask novel questions about organismal evolution across the tree of life, including the role of convergence and parallelism in the evolution of complex phenotypes. Huge amounts of data are publicly available for these kinds of analyses. Similarly, advances in digital evolution approaches have enabled researchers to simulate historical processes in silico within an artificial life framework. This course will introduce participants to various comparative and molecular evolution methods at an -omics scale and demonstrate how in silico modeling platforms and publicly available -omics data can be leveraged to test hypotheses of phenotypic evolution. There are no prerequisites.

Instructor Bio: Dr. Hans Hofmann is an evolutionary neuroscientist and professor in Integrative Biology at UT Austin. His research uses neurobiological, genomics, and comparative approaches to uncover the neural and molecular underpinnings of social behavior. Dr. Becca Young is an evolutionary developmental biologist in Integrative Biology at UT Austin. Her research uses genomics, bioinformatics, and comparative approaches to identify ecological, evolutionary, and genomic mechanisms of phenotypic variation among populations and species. Dr. Kevin Liu is an assistant professor of Computer Science and Engineering at Michigan State University. His research develops new computational methodologies for efficient and accurate phylogenomic and comparative genomic analyses – especially in the context of complex evolutionary scenarios – and then connects the resulting insights to phenotype and function.

Preferred or Prerequisite Skills: None

Computer Requirement: Students will need their own laptops. UTEID required for wireless access. Please be sure you know your UT EID when you come to class. To obtain a UTEID, go here.

Principles of Machine Learning for Bioinformatics (THIS COURSE IS CLOSED)

Status: closed
Date: June 29 - July 2
Time: 9:00 am - 12:00 pm
Location: ONLINE
Instructor: Dennis Wylie (Research Scientist and Bioinformatics Consultant, CBRS)

Description: This four-day course will introduce a selection of machine learning methods used in bioinformatic analyses with a focus on RNA-seq gene expression data. We will cover unsupervised learning, dimensionality reduction and clustering; feature selection and extraction; and supervised learning methods for classification (e.g., random forests, SVM, LDA, kNN) and regression (with an emphasis on regularization methods appropriate for high-dimensional problems). Participants will have the opportunity to apply these methods as implemented in R and Python to publicly available data.

Instructor Bio: Dennis Wylie joined the bioinformatics group in 2015. He has experience in NGS data analysis including variant calling and RNA-Seq-based biomarker discovery and predictive modeling (classification, regression, etc.). Prior to UT, he earned a PhD in Biophysics from UC Berkeley applying stochastic simulation methods in immunology, did postdoctoral work modeling the transmission of infectious disease, and spent six years as a bioinformatician in industry.

Preferred or Prerequisite Skills: This course is recommended for students with some prior knowledge of either R or Python. Participants are expected to provide their own laptops with recent versions of R and/or Python installed. Students will be instructed to download several free software packages (including R packages and Python libraries such as pandas and sklearn).

Computer Requirement: Students should have their own laptop computer. UTEID required for wireless access. Please be sure you know your UT EID when you come to class. To obtain a UTEID, go here.

Practical Approaches to Analyzing Biological Data with R (THIS COURSE IS CLOSED)

Status: closed
Date: June 29 - July 2
Time: 1:30 pm - 4:30 pm
Location: ONLINE
Instructor: Rachael Cox (graduate student, Marcotte lab)

Description: Modern researchers need basic data literacy. This four-day course will introduce how to use the R programming language to analyze and visualize biological data on small and large scales. We will focus on the practical tools you need to quickly import your data, clean it up, analyze it, and then generate publication-quality plots. Along the way we’ll briefly address best practices for coding in R, and even how to effectively find help online. This course uses the tidyverse ecosystem of R packages, and upon completion you’ll have used dplyr, tidyr, ggplot2, and more. Finally, we will introduce Bioconductor, a collection of R packages designed for numerous biology-specific applications like RNA-sequencing, 16S ribosomal profiling, and genomic sequence analysis. No previous coding experience is required for this course.

Instructor Bio: Rachael Cox is a graduate student in the Biochemistry PhD program. Previously, Rachael obtained a BS in Chemistry (Texas A&M, 2013) and spent several years working as an organic chemist designing and building continuous reaction systems at Eastman Chemical Company. Since then, she’s decided big molecules are much more interesting than small molecules, and now studies the evolution of protein complexes in Dr. Edward Marcotte’s lab. Rachael frequently works with multi-omics datasets, using both Python and R.

Preferred or Prerequisite Skills: None

Computer Requirement: Students will need their own laptops. UTEID required for wireless access. Please be sure you know your UT EID when you come to class. To obtain a UTEID, go here.

If you use the UT ProCard for payment of courses, please be aware that you can only charge ONCE per 24-hour period. Any attempts to charge more courses will fail, and you will not be registered.

For example, you may add one or more courses for one student to your shopping cart at one time and charge them to the ProCard, and you should receive a "registration successful!" page at the end. This works because you registered ONCE for ONE student. If you attempt to register and pay again, for example for a different student, this will trigger the UT ProCard security system to stop payment, and your registration will not be successful. A page stating this will appear after you attempt to process payment. It looks a lot like the "registration was successful" page.

Ways to avoid this: use the ProCard after 24 hours have passed; have the student use their own credit card and be reimbursed later through the usual UT accounting methods; or process the registration with an IDT (Interdepartmental Transfer; talk to someone in your department who handles the accounts).