CACHE researcher David Vendrami continues his CACHE journey:
“I am very excited to let you know that my first ‘CACHE paper’ is almost done! Specifically, I am currently waiting for ‘Our Server’ to produce the last set of results. I am really happy about that and also, if I may say, quite proud of myself: I began my PhD three and a half months ago and the learning curve to achieve this goal was pretty steep (but really interesting and amazing).
In the meantime, my colleague Abhi and I are working on optimizing our ‘pipeline’ for data processing. Do you know what a ‘pipeline’ is? In a nutshell, a pipeline is a set of data processing elements connected in series, where the output of one element is the input of the next. This is a necessary (really, it is NECESSARY) tool if you are working with enormous files, like the data generated by Next Generation Sequencing, which is exactly the kind of data I mostly play with, because it allows you to process these data in just a few days or weeks. I actually have no idea how long it would take without a pipeline (I am talking about tons of files made of, literally, millions of rows!).
So yeah, since ‘the pipeline’ is also built within ‘our server’, you can imagine how hard we are making ‘our poor computer’ work… Probably he’s the one who really deserves a vacation!
Last but not least, Lucy is learning really fast and she’s doing a great job with scallop mitochondrial DNA sequences!”
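
For readers curious about what David means by elements "connected in series", here is a minimal sketch in Python. It is not the CACHE pipeline itself: the stage names (`drop_blank`, `strip_whitespace`, `uppercase`), the helper `run_pipeline`, and the toy input are all made up for illustration. Each stage is a generator, so rows flow through one at a time and a very large file would never need to sit in memory all at once.

```python
from functools import reduce
from typing import Callable, Iterable, Iterator

# A stage takes a stream of text rows and yields a transformed stream.
Stage = Callable[[Iterable[str]], Iterator[str]]


def drop_blank(lines: Iterable[str]) -> Iterator[str]:
    """Hypothetical stage: skip rows that contain only whitespace."""
    for line in lines:
        if line.strip():
            yield line


def strip_whitespace(lines: Iterable[str]) -> Iterator[str]:
    """Hypothetical stage: remove leading and trailing whitespace."""
    for line in lines:
        yield line.strip()


def uppercase(lines: Iterable[str]) -> Iterator[str]:
    """Hypothetical stage: normalise rows to upper case."""
    for line in lines:
        yield line.upper()


def run_pipeline(source: Iterable[str], stages: Iterable[Stage]) -> Iterable[str]:
    """Chain the stages so the output of each one feeds the next."""
    return reduce(lambda data, stage: stage(data), stages, source)


# Toy input standing in for the rows of a large sequence file.
reads = ["acgt\n", "\n", "  ttga \n"]
result = list(run_pipeline(reads, [drop_blank, strip_whitespace, uppercase]))
print(result)  # ['ACGT', 'TTGA']
```

In a real setting the `reads` list would typically be replaced by an open file handle, and the stages by whatever filtering and formatting steps the analysis calls for.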