Python: It's a snake. No, it's a programming language. No, it's a snake.

... and so on and so on, people argue. But they are both right: it is a snake, and it is a programming language. I want to talk about the programming language now. We'll deal with the snake another time. (And boy, do I have snake stories ...)

Python is an interpreted computer language, also known as a scripting language (being a scripting language and being an interpreted language are not necessarily the same thing, but Python is both). I never met a computer language I didn't like, and I also never met a computer language that wasn't somehow superior to all other languages according to someone, for some reason.

Python is very much like Perl in that both are especially good at handling text, and both have piles of software already written that you can use (as libraries) for handling both bioinformatics requirements and web-related jobs. (Otherwise, Perl and Python are very different, of course.)

The reason I'm even talking about this (because really, me expressing my opinion about programming languages is nothing but troll bait, and trolls are not in season) is to point out that PLoS has a paper on Python, focusing on its use in bioinformatics. The paper is here.

One thing I think is cool about Python is the way it integrates data entities with control structures. So, for instance, if you have a sentence in a string called "S":

"Now is the time for all good men to come to the aid of their country"

and you want to examine each letter in this sentence to see if it meets a specific criterion, you can do, essentially, this:

For each letter in S
..... do(a thing)

and this thing you are doing will loop pretty much automatically across the sentence. (Note: That was not Python code, just pseudocode, to give you the idea.)
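For the curious, here is roughly what that pseudocode might look like as real Python. This is only a sketch: the criterion (is the letter a vowel?) and the names is_vowel and vowel_count are made up for illustration and are not from the paper.

```python
# The sentence from the example above, stored in a string called S
S = "Now is the time for all good men to come to the aid of their country"


def is_vowel(letter):
    """Hypothetical criterion, chosen only for illustration."""
    return letter.lower() in "aeiou"


vowel_count = 0
for letter in S:          # the string itself drives the loop, one character at a time
    if is_vowel(letter):  # "do(a thing)": here, test the letter and count it
        vowel_count += 1

print(vowel_count)
```

There is no index variable and no length check. The for statement simply asks the string for its letters, one after another.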

Another distinctive thing about Python, which most people say is a great thing but others probably quietly think of as a total pain in the arse, is the way the language is formatted. White space matters. Certain code structures must be indented (or not) in relation to each other. You can make a mistake by not indenting something.

This is because Guido van Rossum thinks your code should be pretty, and Guido thinks indentation is pretty.

Guido is "Benevolent Dictator for Life" in the Python world. (He invented the language.)
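To make the white space point concrete, here is a small sketch. The function name count_long_words and the sample sentences are invented for illustration. The first part shows indentation deciding which lines belong to the loop; the second part hands Python a loop whose body is not indented and catches the complaint.

```python
def count_long_words(sentence, min_length=5):
    count = 0
    for word in sentence.split():
        if len(word) >= min_length:
            count += 1  # indented under the if: runs only for long words
    return count        # lined up with the for: runs once, after the loop ends


print(count_long_words("White space matters in Python"))

# A loop whose body is NOT indented. Python refuses to compile it.
bad_source = "for letter in 'abc':\nprint(letter)\n"
got_error = False
try:
    compile(bad_source, "<example>", "exec")
except IndentationError as err:
    got_error = True
    print("IndentationError:", err.msg)
```

Moving the return line four spaces to the right would put it inside the loop, and the function would quit after the first word. No braces, no "end" keyword, just the indentation.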

Finally, the other thing about Python that is kind of cool is that since it is named after Monty Python, almost everything you read about it uses jokes from Monty Python. So, for instance, your standard code examples for handling strings in an intro Python book will have you do things like converting:

spam

to

spam spam spam spam

and so on.
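In case you want to try that one at home, a minimal sketch of one way to do it follows (there are several, and the variable names are just for illustration):

```python
word = "spam"

# Make a list holding four copies of the word, then join them with spaces
repeated = " ".join([word] * 4)

print(repeated)  # spam spam spam spam
```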


Hey, was that peer-reviewed research I was blogging on? Not sure. Anyway, here's the ref:

A Primer on Python for Life Science Researchers. Bassi S. PLoS Computational Biology Vol. 3, No. 11, e199 doi:10.1371/journal.pcbi.0030199

Thanks for rendering this in pseudocode so us lay-folk can understand it. :)

Have you read Dreaming in Code? It's a pretty interesting account of a large software project, and it includes some fascinating and very accessible discussions of how Python and other languages function.

That sounds pretty nice. Having taken an intro to computer programming with C++ this semester, I feel confident that C++ is not the language I want to use. I'm sure it's great for certain applications, but not for the mathematical work I would like to do. I can think of about 15 ways to improve the language off the top of my head, but I don't know what to replace it with. Maybe I'll take a look at Python. I've heard good things about OCaml too, and it seems like its functional programming paradigm might jibe better with my mathematical way of thinking.

Lucas:

Say you want a nice painting on your wall. The best way to get the exact painting you want is to paint it yourself, no matter what that takes. That would be programming in C.

Buying a nice, expensive painting that is pretty good ... that's programming in a major compiled language that is not C but that does the job and has an excellent development environment. You'll get what you need but it will cost you.

Going to the library, saying "Hey, can I check out a cool print of a nice painting ...whatever you recommend..." and then getting the perfect painting because the librarian is empathic, that's programming in Python.

Lucas,

You might want to avoid Python for mathematics. It is a nice language and lets you create things very quickly, but it is lacking in other areas. It is interpreted, so it is slower when your code becomes very complex, much slower at times. It is also not statically typed, so type errors only show up at run time and can become a problem. Despite this, I will admit, I really like Python. It is an excellent language to hack something up to test a concept.

If you really must do a lot of massive number crunching, you will want to learn either Fortran or Ada. The GCC compiler supports Fortran out of the box. It is free at gcc.gnu.org and can be run under Cygwin on Windows. As Greg said, C is good too, but C is not the best fit for mathematics.

Hope this helps.
Thomas

Actually, Thomas, there is Numerical Python, either in the form of the old "Numeric" module or the newer "numpy" module, which is meant to handle large computations and should be reasonably fast, at least compared to, say, Matlab and the like. Of course, if the computations are huge enough that Matlab is inappropriate, then Python will likely not be so wonderful either.

BTW, for Fortran, there is also the g95 compiler, which is a fork from the GCC project. Last I heard, it is more mature than its canonical GCC counterpart, gfortran, but I'm not sure if that is still the case. Of course, you can always try both.

Thomas: cygwin? Windows? I don't understand what you're talking about. Why wouldn't you just open up a terminal window and type gcc?

My current interests are mainly in combinatorics and logic, and don't involve very much "number crunching." My main frustration with C++ is how inflexible most of the data structures I've used are. I can think up an algorithm in 10 minutes that takes me hours to implement. It's very frustrating. I've heard that Python reads like pseudocode, and has native set and string data types, which sounds like music to my ears. Maybe this is a good winter break project.

Cygwin is a program to allow you to run a Unix environment in Windows. If you don't use Windows, then it is of no importance.

My concern was that most languages are limited in the size of integer that can be represented. Using a float or double will often not be a good solution either, due to lost accuracy. Even a small factorial will rapidly overflow an integer or long integer. There are numerical solutions, like J.J. mentioned above, which solve this. I made my suggestion based on the fact that Fortran handles many of these issues natively. If this is not a problem, then don't worry.

I think you will find Python useful. Strings are nice in Python, like you mentioned. It is also easy to write, but this can make debugging a bit annoying. The lack of type checking can really frustrate at times, but once you get a good handle on the language you should be fine. Python does have set capabilities native to the language. I've never used sets in Python, so I can't tell you much about them. Ultimately, I think you will find Python limited after a while, but I think you will like it.

You mentioned OCaml in your first post, which I have never used, but you may find a functional or logic programming language a better fit. I've never used a functional language, so I really can't give advice on which is better. I have played with Prolog, a logic programming language, which is good for set operations. It is not very easy to learn, though.

Don't know if this helps.
Thomas

I know what cygwin is--I just can't resist poking fun at all you hopelessly misguided windows people out there. :-P

rehana: I love those languages. (There is a difference between loving a language and loving to live with a language. Maybe I'm just a computer language slut, I dunno.)

Lucas: You can go from zero to sixty in Python in winter break, no problem.

More comments on Python:

1) Namespaces are implemented and help with all sorts of issues.

2) My impression is that Python is a functional language, and an object-oriented language, and a language that you don't have to consider to be either of the above. This is the pseudo-code-osity factor. There are a lot of ways to program in Python.

Considering the size of the development community and the great strength of the libraries, etc., you are not limited in any important way by Python except one: since it is interpreted, your enormous crunching programs will run slowly, or if you need crispy interaction, you may not get that.

So you start with Python, and then over time, when you start to run into one of these walls, you start changing your code over to C.

Hi Lucas,
As Ramsey mentioned, there are some good libraries for number crunching in Python. I have used sets, lists and dictionaries extensively in Python and I can assure you they are pretty fast. You just have to know which data structure is appropriate for your use.

Regards
Sharmila