Lately I have had a problem. I am essentially working two jobs: research for my RA and research for what I would like my thesis topic to be. As a result, any hobbies that I usually indulge in, such as blogging, have taken a back seat.
In any case, I am going to do some selling today, and what I would like to sell you is Python. I have been using Python for all my scientific computing for about 2 months now and I thought I would write about the experience.
First, why did I even start using Python? Well, like most engineers I know, I “grew up” using MATLAB as my programming language of choice, first for assignments and eventually for my research. I realized recently that I had built up quite a bit of legacy code that would be worthless if I were ever to leave a university environment. It really comes down to the fact that to use my code I would have to spend money, and there is a part of me that deeply fears that. So, about to embark on a new coding project, I decided to jump the MATLAB ship and try a more open language.
Why Python? I would have chosen Lisp if I thought I could get anything done quickly enough. Python had, as far as I could tell, a no-nonsense reputation, a good amount of online support, and well-publicized numerical, scientific, and plotting packages.
First I would like to say that for the most part Python is like MATLAB. There are small differences, such as zero-indexed arrays and other syntax details, but the feel is much the same. What's more important is that, unlike in Java, you can program much the same way as you do in MATLAB. While everything is an object in Python, you don't have to declare it as such, so you can write small data analysis scripts with a minimum of fuss. I will give you an example of this. Lately I have been running quite a few molecular dynamics simulations and I have had to analyze the data. Reading in these data files takes only a few lines of code:
fileID = open('fileName', 'r')
for line in fileID:
    dummyString = line.split()  # list of whitespace-separated fields
fileID.close()
This iterates through the file, loads each line as a string into the variable “line”, then splits it into a list according to whitespace and stores that in dummyString. From here it is a piece of cake to convert the entries into floating point numbers and, more importantly, to figure out what each line is and what to do with it.
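As a rough sketch of that conversion step, something like the following would work. The helper name parse_line and the sample lines are made up for illustration; they are not from my actual analysis scripts, and a real data file may need different handling for headers.

```python
def parse_line(line):
    """Split a line on whitespace and convert every field to a float.

    Returns None for blank lines and for lines with non-numeric fields
    (for example, a header), so the caller can decide what to do with them.
    """
    fields = line.split()
    if not fields:
        return None
    try:
        return [float(field) for field in fields]
    except ValueError:
        return None


# Hypothetical sample lines standing in for a simulation output file.
sample_lines = ["step temp energy\n", "0 300.0 -1.25\n", "\n", "1 301.5 -1.30\n"]

# Keep only the lines that parsed cleanly into numbers.
rows = [parsed for parsed in map(parse_line, sample_lines) if parsed is not None]
print(rows)
```

The header and the blank line are skipped, and each remaining line becomes a list of floats that is ready for further analysis.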
In MATLAB this would have taken at least twice as many lines and would have involved twice as many variables. Yes, I could have gotten it done, but the important part is how easy it is to spot mistakes when there is less code. My projects get done faster because debugging is faster.
And that is the main thing I like about Python: I get things done faster. For example, the array data type in NumPy has a fixed length once it is created. Usually, though, I build my arrays as lists first, using the .append() method to simply add stuff onto the end of the list. Then, when I finally need to do something to the array or matrix, the built-in functions in NumPy or SciPy automatically turn the list into an array (as long as the dimensions are conducive to an array; one can make some pretty funky convoluted lists if one tries). Again, this saves me time because I am not trying to hunt through code to find the point where I screwed up two indices in a multiply nested for loop.
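Here is a minimal sketch of that list-first workflow, assuming NumPy is installed. The variable names and the numbers are invented for illustration only.

```python
import numpy as np

# Build the data as plain Python lists, which can grow as needed.
energies = []
for step in range(5):
    energies.append(0.5 * step)  # made-up values for illustration

# Explicit conversion to a fixed-size NumPy array.
energy_array = np.array(energies)

# Many NumPy functions also accept the list directly and convert it internally.
mean_energy = np.mean(energies)

# A list of equal-length rows converts cleanly to a 2-D array.
rows = []
rows.append([0.0, 300.0, -1.25])
rows.append([1.0, 301.5, -1.30])
table = np.array(rows)

print(energy_array.shape, mean_energy, table.shape)
```

The list only turns into an array once, at the end, so there is no need to know its final length in advance.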
Lastly, Pylab is amazing. Not only does it make it really easy to plot data, the plots look good, better than MATLAB's by far in my opinion. I am also not aware of such a robust plotting package in any other language. I would be interested in hearing of one, though!
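To give a feel for how little code a plot takes, here is a small sketch using matplotlib.pyplot, the plotting interface that Pylab bundles. The function name plot_to_png and the sine data are placeholders I made up, and the "Agg" backend is chosen only so the script runs without a display.

```python
import io
import math

import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so no display is needed
import matplotlib.pyplot as plt


def plot_to_png(x_values, y_values):
    """Draw a labeled line plot and return it as PNG bytes."""
    fig, ax = plt.subplots()
    ax.plot(x_values, y_values, label="sin(x)")
    ax.set_xlabel("x")
    ax.set_ylabel("y")
    ax.legend()
    buffer = io.BytesIO()
    fig.savefig(buffer, format="png")
    plt.close(fig)  # release the figure's memory
    return buffer.getvalue()


# Placeholder data: one and a half periods of a sine wave.
x_values = [0.1 * i for i in range(100)]
y_values = [math.sin(x) for x in x_values]
png_bytes = plot_to_png(x_values, y_values)
```

In an interactive session you would more likely call plt.show(), or pass a file name to fig.savefig() instead of an in-memory buffer.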