Tag Archives: Bioinformatics

Black Forest Summer School on Bioinformatics for Molecular Biologists, September 2013


Here’s another course for the summer aimed at PhD students and early career molecular biologists on mastering the use of pre-existing bioinformatics tools.  The venue sounds amazing and it looks like the perfect place to learn or hone your existing bioinformatics skills.  Francis Martin will be giving the keynote lecture so there is more incentive to attend!

Summer 2013 Bioinformatics Workshop Roundup Part Two

Here are a couple more promising bioinformatics workshops taking place in the summer of 2013:

Metagenomics: From The Bench To Data Analysis, Heidelberg, Germany, April 14th to April 20th, 2013


Joint EU-US Training in Marine Bioinformatics, Newark, Delaware, USA, June 16th to June 29th, 2013


Summer 2013 Bioinformatics Workshop Roundup Part One

The summer is a great time to learn new skills and hone data analysis techniques.  I think some topics (bioinformatics tools and data analysis scripting in particular) are best learned in intense multi-day workshops or one- to two-week short courses.  Here are a few courses being held this summer that may be of interest to you.  I’ll be sure to post more as I hear about them.

Programming for Evolutionary Biology, Leipzig, Germany, April 3rd to April 19th, 2013


Informatics for RNA-sequence Analysis, Toronto, Canada, June 3rd to June 4th, 2013


Pathway & Network Analysis of -Omics Data, Toronto, Canada, June 10th to June 12th, 2013


Book Review: Practical Computing For Biologists

I have a couple of book reviews in the pipeline, so I am starting a new category for reviews of books I find useful (or not so useful).  I wrote this review of this great book months ago, but, like many things in my life, I’m just now getting it online.

Like many people, my research has been changing in recent years.  I have been spending an increasing amount of time in front of a computer and less time at the lab bench.  I can’t see myself ever forsaking the wet lab or field experiments, but I’m using computers more than ever before.  There’s now so much data to process – mostly text in the form of sequence data – and I’ve become increasingly reliant on a computer to search large data sets and convert data file formats.  Even if you aren’t a biologist in the area of genomics/genetics, new data collection instruments for physiology, ecology, and atmospheric sciences are recording data at incredible rates, and, additionally, sorting through citations is getting more and more time consuming.  It’s impossible to ignore the data revolution that is taking place no matter where your foundation within the biological sciences (or physics, chemistry, etc.) lies.

I wish the book Practical Computing For Biologists (and Companion Website), by Steven H. D. Haddock & Casey W. Dunn, had come along sooner, but I am so glad it’s available now, because it teaches exactly what I needed: how to deal with data more efficiently.  In terms of my research and use of time, this has been the most important book I’ve read in the last year, perhaps the last decade.  If you’re a biologist (or anyone for that matter) who finds themselves clicking away at a spreadsheet (such as Excel) or cutting and pasting from online data repositories (such as GenBank, national weather databases, etc.), then this book is for you.  In reality, this book is for anyone who wants to use a computer to work more efficiently with data.

The book can be broken up into six sections dedicated to the following topics: (1) manipulating and searching text files, (2) working within your computer’s shell, (3) basic programming for biologists, (4) combining methods (this is a section on database management and tool selection), (5) dealing with graphics for data communication, and (6) advanced topics such as remote computer access and installing software.

This book devotes a large portion, and rightfully so, to addressing how to manipulate text files and other file formats used to store and communicate data.  The initial chapters begin with text editing using regular expressions, and what I learned there immediately saved me time when processing large text files and parsing sequencing data.  A section at the end of the book focused on remote access and remote scripting helped me to start dealing with text and files on other computers.
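To give a flavor of the kind of regular-expression text parsing described above, here is a minimal Python sketch.  It is my own illustration and not an example taken from the book: the function name, the pattern, and the sample sequence IDs are all made up, and it assumes FASTA-style header lines that start with ">" followed by an ID and an optional description.

```python
import re

# Assumed header layout (illustrative only): ">ID optional description"
HEADER_PATTERN = re.compile(r"^>(\S+)\s*(.*)$")


def parse_fasta_headers(text):
    """Return a list of (sequence_id, description) tuples, one per header line."""
    records = []
    for line in text.splitlines():
        match = HEADER_PATTERN.match(line)
        if match:  # sequence lines do not start with ">" and are skipped
            records.append((match.group(1), match.group(2)))
    return records


# Made-up sample data, not real sequences
example = """>seq1 hypothetical protein
ATGCATGC
>seq2
GGCCTTAA
"""

headers = parse_fasta_headers(example)
for sequence_id, description in headers:
    print(sequence_id, description)
```

A few lines like these can replace a lot of manual cutting and pasting when you only need the IDs and descriptions out of a large sequence file.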

The book focuses on Unix-based platforms (Linux, OS X) due to ease of programming, but it does not ignore DOS-based (Windows) platforms.  An appendix at the end of the book is useful for translating between platforms.  When the book recommends the use of specific software, which is rare, the focus is on free open-source options.  Python is the programming language of choice for much of the book, but another appendix helps to sort out the differences among the many programming languages used in biology.  The open-source MySQL database platform is addressed for storing and communicating data.  One important goal of the programming and data organization aspects of the book is to improve reproducibility and collaborative work through automation and transparency.

Surprisingly little attention is given to the actual communication of data in graduate coursework and training, so it’s refreshing to see a focus on image basics communicated here across a few chapters in the book.  These sections focus on basic image creation and manipulation using both commercial and open-source options.

The book strikes a good balance between guiding you through tutorials and nudging you toward your own exploration, with just enough direction to neither annoy nor overwhelm.  This text is not a solution cookbook but, more importantly, a guide to get you started in data analysis and file format manipulation and to help you think for yourself when addressing your research problems.  While this book will help you deal with text, it doesn’t address software for word processing (Word, OpenOffice), presentations (PowerPoint, Keynote), spreadsheets (Excel), or statistics (R, SAS, SPSS, etc.), as this would make for a giant book.  It also does not cover software for phylogenetics or population genetics, and I don’t think it should.

Just to be clear, I’m not being paid here to promote this book.  I just honestly have found this book extremely helpful to my own research and I want to communicate that.  I haven’t read many books which have been able to change my life in a self-actualizing way, but this book helped (…and is still helping) me to do what I was doing before, but more efficiently.

Galaxy Workshop and Community Conference, July 2012

Galaxy is a free web-based platform for bioinformatics and data mining initiated by some of my colleagues at Penn State’s Center for Bioinformatics and Genomics.  There are Galaxy platforms popping up all over the place, including at JGI and JCVI, and you can run it on your own desktop or computer cluster.  In case you’d like to gain hands-on experience using the platform or want to learn more about setting up your own Galaxy platform, you can attend the 2012 Galaxy Workshop and Community Conference:

The 2012 Galaxy Community Conference (GCC2012) will be held July 25-27 at the UIC Forum at University of Illinois Chicago.

GCC2012 will run for two full days, and be preceded by a full day of training workshops. GCC2012 will have things in common with previous meetings (see GDC 2010, GCC 2011), and will also incorporate new features, such as the training day, based on feedback we received after the 2011 conference.

GCC2012 is hosted by the University of Illinois at Chicago, the University of Illinois at Urbana-Champaign, and the Computation Institute.

2012 Workshop Series in Microbial Genomics & Metagenomics at JGI

There’s another series of workshops for both microbial genomics and metagenomics presented by the U.S. Department of Energy’s Joint Genome Institute in 2012.   These workshops include two days of seminars and three days of hands-on tutorials for both microbial genomics and metagenomics.  These workshops are centered on the use of the following bioinformatic tools: IMG, IMG/M, IMG-ER, IMG-EDU, VISTA, GREENGENES and ARB.  Registration for this workshop can be found here.