A Primer for Computational Biology
A Primer for Computational Biology aims to provide life scientists and students with the skills necessary for research in a data-rich world. The text covers accessing and using remote servers via the command line and writing programs and pipelines for data analysis, and it provides useful vocabulary for interdisciplinary work. The book is broken into three parts:

  1. Introduction to Unix/Linux: The command line is the “natural environment” of scientific computing, and this part covers a wide range of topics, including logging in, working with files and directories, installing programs and writing scripts, and the powerful “pipe” operator for file and data manipulation.
  2. Programming in Python: Python is both a premier language for learning and a common choice in scientific software development. This part covers the basic concepts in programming (data types, if-statements and loops, functions) via examples of DNA-sequence analysis. It also covers more complex subjects in software development, such as objects and classes, modules, and APIs.
  3. Programming in R: The R language specializes in statistical data analysis, and is also quite useful for visualizing large datasets. This third part covers the basics of R as a programming language (data types, if-statements, functions, loops and when to use them) as well as techniques for large-scale, multi-test analyses. Other topics include S3 classes and data visualization with ggplot2.
Shawn T. O’Neil earned a BS in computer science from Northern Michigan University, and later an MS and PhD in the same subject from the University of Notre Dame. His past and current research focuses on bioinformatics. O’Neil has developed and taught several courses in computational biology at both Notre Dame and Oregon State University.
Table of Contents
    Part I: Introduction to Unix/Linux
        Logging In
        The Command Line and Filesystem
        Working with Files and Directories
        Permissions and Executables
        Installing (Bioinformatics) Software
        Command Line BLAST
        The Standard Streams
        Sorting, First and Last Lines
        Rows and Columns
        Patterns (Regular Expressions)
    Part II: Programming in Python
        Hello, World
        Elementary Data Types
        Collections and Looping: Lists and for
        File Input and Output
        Conditional Control Flow
        Python Functions
        Command Line Interfacing
        Bioinformatics Knick-knacks and Regular Expressions
        Variables and Scope
        Objects and Classes
        Application Programming Interfaces, Modules, Packages, Syntactic Sugar
        Algorithms and Data Structures
    Part III: Programming in R
        An Introduction
        Variables and Data
        R Functions
        Lists and Attributes
        Data Frames
        Character and Categorical Data
        Split, Apply, Combine
        Reshaping and Joining Data Frames
        Procedural Programming
        Objects and Classes in R
        Plotting Data and ggplot2
    About the Author
