One of my dear readers, C S, has pointed me to the R package latticist. At first I was sceptical, since the package is little more than an interface to existing visualization routines. However, I now consider it astonishingly useful and use it almost every day. The reason is simple: getting an initial glimpse of a large set of multivariate data is tedious in R because of the command line interface. With latticist, you get an instant overview of all variables of interest, and you can quickly dive into details by selecting subgroups and looking at potential correlations. Recommended.
Tags:english R research |
Filed on July 23rd, 2010 | No Comments »
Have you ever had trouble working out how you created a particular image or a specific result in a recent bioinformatics project? I have, and more than once. In his article “A quick guide to organizing computational biology projects”, the distinguished scientist William S. Noble gives great practical advice on how to organize a research project. His key suggestions include:
- use a date-based directory structure for the experiments you perform
- keep a lab notebook containing documentation and code for each experiment (How about Org-mode?)
- create scripts that generate the results from the original data, plus a general runner script for the complete experiment
His workflow applies mostly to *nix environments. One thing I’d like to add: use symbolic links! Because of large data volumes and backup strategies, it may be impractical to keep all of a project’s data in a single directory. Create subdirectories of your data and result directories according to the naming convention and link them to other network drives using the ‘ln -s’ command.
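If you prefer to script this setup, here is a minimal Python sketch of the symlink idea. The helper `link_experiment_dir`, the `results` directory name and the ISO date format are my own assumptions for illustration; none of them come from Noble's article.

```python
import datetime
import pathlib
import tempfile


def link_experiment_dir(project_root, storage_root, day=None):
    """Create a dated directory on external storage and link it into the project.

    Hypothetical helper: the layout <project>/results/<YYYY-MM-DD> is an
    assumed convention, not something prescribed by the article.
    """
    day = day or datetime.date.today()
    name = day.isoformat()  # e.g. "2010-06-08"

    # The real directory lives on the (possibly remote) storage location.
    target = pathlib.Path(storage_root) / name
    target.mkdir(parents=True, exist_ok=True)

    # Inside the project there is only a symbolic link, like `ln -s`.
    link = pathlib.Path(project_root) / "results" / name
    link.parent.mkdir(parents=True, exist_ok=True)
    if not link.is_symlink():
        link.symlink_to(target, target_is_directory=True)
    return link


if __name__ == "__main__":
    # Demo in throwaway directories so nothing real is touched.
    with tempfile.TemporaryDirectory() as project, \
            tempfile.TemporaryDirectory() as storage:
        link = link_experiment_dir(project, storage, datetime.date(2010, 6, 8))
        print(link.name, "->", link.resolve())
```

Anything written through the link ends up on the storage side, so the project directory stays small while the naming convention is preserved.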
Tags:english research scripting |
Filed on June 8th, 2010 | No Comments »
One of my greatest fears is losing my photo collection, those irretrievable links to good memories. Additional physical media (CD, hard disk) stored at home aren’t safe either, so I’d like to have an online backup solution. Windows Live Skydrive offers 25 GB of free storage for files up to 50 MB in size. Using the free tool SDExplorer you can also upload your own photo folder structure to the Skydrive. Alternatives are Gladinet (but the free version allows only 1000 files to be uploaded!) and Windows Live Photo Gallery (each folder has to be uploaded individually, no subfolders).
|
Filed on February 27th, 2010 | 1 Comment »
Scientific data commonly comes as tab-separated text files containing comment lines. What is the best way to read such data? Analogous to the recipe given by skip.montanaro, wrap the file in a comment-skipping class as follows:
# Python 3
import csv

class CommentedFile:
    """Iterator wrapper that skips lines starting with a comment string."""
    def __init__(self, f, commentstring="#"):
        self.f = f
        self.commentstring = commentstring

    def __iter__(self):
        return self

    def __next__(self):
        # Read ahead until a non-comment line turns up; StopIteration
        # from the underlying file ends the iteration.
        line = next(self.f)
        while line.startswith(self.commentstring):
            line = next(self.f)
        return line

with open("inputfile.txt", newline="") as f:
    tsv_file = csv.reader(CommentedFile(f), delimiter="\t")
    for row in tsv_file:
        print(row[2])  # prints column 3 of each line
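A more compact variant of the same idea, sketched for Python 3: since csv.reader accepts any iterable of lines, a generator expression can do the filtering. The function name `read_commented_tsv` and the sample data are made up for illustration.

```python
import csv
import io


def read_commented_tsv(lines, commentstring="#"):
    """Yield rows of a tab-separated source, skipping comment lines."""
    data_lines = (line for line in lines if not line.startswith(commentstring))
    yield from csv.reader(data_lines, delimiter="\t")


# Small in-memory stand-in for a real file (made-up sample data).
sample = io.StringIO(
    "# gene\tchrom\tscore\n"
    "abc1\tchr1\t0.5\n"
    "# a comment in the middle\n"
    "xyz2\tchr2\t0.9\n"
)

rows = list(read_commented_tsv(sample))
for row in rows:
    print(row[2])  # third column of each data line
```

For a real file, pass the object returned by open("inputfile.txt", newline="") instead of the StringIO stand-in.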
Tags:english programming python |
Filed on February 15th, 2010 | 4 Comments »
Read this post about how to reload your pdf document. This is particularly useful when you are creating a new document with LaTeX. Sweet, I was waiting for that functionality.
Tags:english tech |
Filed on December 21st, 2009 | No Comments »
Every (hobby) cook needs a good chef's knife. The weblog Lifehacker recently recommended a knife of very good quality that costs only around $30 instead of the $100-200 of comparable products. The knife, made by the Swiss manufacturer Victorinox, seems to be available in this form only in America, however. At least I could not find it in the European Victorinox catalog. On Amazon.com it is, at any rate, the best-selling knife.
When I asked Victorinox, however, I was told that it is sold in Europe as well, just with slightly different printing. The article number is 5.2063.20. I have already bought it and it makes a good impression.
Tags:cooking german |
Filed on November 23rd, 2009 | No Comments »
Emacs usually takes quite some time to fully start up. However, as described in the great blog Emacs-Fu, Emacs 23 can now be started in the background as a daemon. This lets you fire up a new Emacs instance really fast. Thanks djcb!
Tags:emacs english tech |
Filed on November 20th, 2009 | No Comments »
Say you want to print lines 3 and 7, and all lines from 11 to 15, of a text file. The following sed one-liner will do it for you:
sed -n -e '3p' -e '7p' -e '11,15p' textfile.txt
Tags:commandline english linux tech |
Filed on October 31st, 2009 | 1 Comment »
For the first time I could personally sense the effects of the economic crisis. The manufacturer of my Bluetooth device ANYCOM USB-200, the Germany-based ITM Technology AG, is insolvent. The immediate effect for customers like me: no more driver updates, and the existing drivers are no longer available on the homepage.
Here is the good news for everybody who wants to use an ANYCOM Bluetooth USB adapter (200, 250, 500) on Windows 7: the Vista driver runs just fine under Windows 7. And I have the driver (“anycom-bluetooth-usb200-250-500-vista-v6-1-0-4700.exe”). If anybody needs it, feel free to send me an email (see About). It may be worth noting that Windows 7 complains about not being able to correctly install Bluetooth devices like a headset (Plantronics Voyager 510 for me), while in fact you only need the correct driver for the adapter.
Update: After brisk demand I decided to allow you to download the driver directly from this website. Of course, no warranty whatsoever provided.
Tags:english tech windows |
Filed on October 6th, 2009 | 8 Comments »
Several people have recently asked me whether it is possible to use tuples in their shell scripts. One example is running a program with a varying set of parameters. Since they often did not find a good solution, they began to formulate their problem in a higher-level scripting language like Ruby. Surprisingly, you can accomplish the same task easily with simple shell scripting (supported by bash, zsh, ...). Consider the following (semi-stupid) example:
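#!/bin/bash
# One parameter tuple per line
paramset="foo.txt 1 --with-graphics
bar.txt 8 --no-graphics
flock.txt 4 --with-graphics"
echo "$paramset" | while read -r file p1 p2 ; do
    # $p2 is left unquoted so that any extra words become separate arguments
    ./myProgram "$file" -t "$p1" --verbose $p2
done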
Here we run the program myProgram three times (once for each line in the multi-line string). Every line contains three whitespace-separated values (words), to which we assign the variable names file, p1, p2 in the loop header. Note that the last variable (in this case p2) always contains all remaining words of a given line if there are more words than variables.
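To make the splitting rule concrete, here is a rough imitation in Python. This is for illustration only, since the shell does all of this itself. The helper `split_like_read` and the extra `--fast` flag are made up, and the sketch assumes the default whitespace splitting.

```python
def split_like_read(line, nvars):
    """Split a line into at most nvars fields; the last field keeps the rest.

    Rough imitation of the shell's `read` with default whitespace splitting;
    hypothetical helper for illustration only.
    """
    fields = line.strip().split(None, nvars - 1)
    # Pad with empty strings when the line has fewer words than variables.
    return fields + [""] * (nvars - len(fields))


file, p1, p2 = split_like_read("foo.txt 1 --with-graphics --fast", 3)
print(file)  # foo.txt
print(p1)    # 1
print(p2)    # --with-graphics --fast
```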
The set of parameters can also be stored in a file. In that case, replace the loop header with cat params.txt | while read file p1 p2 ; do. If the script is not working properly, examine the internal field separator (IFS) variable, which should be set to IFS=" ".
Tags:commandline english linux scripting tech |
Filed on October 2nd, 2009 | 1 Comment »