Marco Brandizi's Site

Welcome to a 5-min-laziness site...


Every Programmer has its Own Toolbox: jUtils

[Young Worker With his Toolbox - courtesy of] My personal library of utilities is now a fairly public project, jUtils (part of ISA-Tools), and the code in it no longer comes from me alone. In this post, I'd like to give a brief overview of what it offers. Who knows, the bit you were about to code yourself might be described in the next paragraphs.

graph2tab, a library to convert experimental workflow graphs into tabular formats.

Brandizi, M., N. Kurbatova, U. Sarkans, and P. Rocca-Serra, "graph2tab, a library to convert experimental workflow graphs into tabular formats.", Bioinformatics, 2012 May 3.


Motivations: Spreadsheet-like tabular formats are ever more popular in the biomedical field as a means of experimental reporting. The problem of converting the graph of an experimental workflow into a table-based representation occurs in many such formats and is not easy to solve.

Results: We describe graph2tab, a library that implements methods to realise such a conversion in a size-optimised way. Our solution is generic and can be adapted to specific cases of data exporters or data converters that need to be implemented.

Availability and Implementation: The library source code and documentation are available at


Supplementary Information: A supplementary document describes the theoretical and technical details about the library implementation.
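
To give a feel for the conversion problem the abstract describes, here is a minimal Python sketch. It is not graph2tab's method: it simply writes one table row per source-to-sink path of a small workflow graph, which is the naive approach and is not size-optimised. The function name `graph_to_rows`, the adjacency-dictionary input and the node labels are all assumptions made for this illustration.

```python
from typing import Dict, List


def graph_to_rows(edges: Dict[str, List[str]]) -> List[List[str]]:
    """Naively flatten a directed acyclic graph into rows.

    Hypothetical helper, NOT the graph2tab algorithm: every path from a
    source node to a sink node becomes one row of the table.
    """
    # Nodes that are the target of at least one edge.
    targets = {t for outs in edges.values() for t in outs}
    # Source nodes: listed in the dictionary but never reached by an edge.
    sources = [n for n in edges if n not in targets]

    rows: List[List[str]] = []

    def walk(node: str, path: List[str]) -> None:
        path = path + [node]
        successors = edges.get(node, [])
        if not successors:
            # Reached a sink: the accumulated path is a complete row.
            rows.append(path)
            return
        for nxt in successors:
            walk(nxt, path)

    for source in sources:
        walk(source, [])
    return rows


# Made-up workflow: one source split into two samples, pooled into one extract.
workflow = {
    "source 1": ["sample 1", "sample 2"],
    "sample 1": ["extract 1"],
    "sample 2": ["extract 1"],
    "extract 1": ["assay 1"],
}

table = graph_to_rows(workflow)
for row in table:
    print("\t".join(row))
```

In this sketch, shared nodes such as "extract 1" are repeated in every row that passes through them, so the number of rows grows with the number of paths. That redundancy is the kind of thing a size-optimised conversion would aim to reduce.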

The BioSample Database (BioSD) at the European Bioinformatics Institute.

Gostev, M., A. Faulconbridge, M. Brandizi, J. Fernandez-Banet, U. Sarkans, A. Brazma, and H. Parkinson, "The BioSample Database (BioSD) at the European Bioinformatics Institute.", Nucleic Acids Res, vol. 40, issue Database issue, pp. D64-70, 2012 Jan.

The BioSample Database is a new database at EBI that stores information about biological samples used in molecular experiments, such as sequencing, gene expression or proteomics. The goals of the BioSample Database include: (i) recording and linking of sample information consistently within EBI databases such as ENA, ArrayExpress and PRIDE; (ii) minimizing data entry efforts for EBI database submitters by enabling submitting sample descriptions once and referencing them later in data submissions to assay databases and (iii) supporting cross database queries by sample characteristics. Each sample in the database is assigned an accession number.

Knowledge sharing and collaboration in translational research, and the DC-THERA Directory.

Splendiani, A., M. Gündel, J. M. Austyn, D. Cavalieri, C. Scognamiglio, and M. Brandizi, "Knowledge sharing and collaboration in translational research, and the DC-THERA Directory.", Brief Bioinform, vol. 12, issue 6, pp. 562-75, 2011 Nov.

...In this article we introduce the DC-THERA Directory, which is an information system designed to support knowledge management for this research community and beyond. We present how the use of metadata and Semantic Web technologies can effectively help to organize the knowledge generated by modern collaborative research, how these technologies can enable effective data management solutions during and beyond the project lifecycle, and how resources such as the DC-THERA Directory fit into the larger context of e-science...

Internet closed, due to World Cup (and not only)

[Net Consumerism Icons]

Here we go again: the World Cup begins and, as with any other big sports event, online TV from your country is blocked if you're abroad. And that's not the only restriction that limits your freedom, including your chance to be a European citizen. There are technical solutions that allow one to get around the sharks, and you'll discover them in this post, although there is quite a bit more to it than that.



Redoing the microarray analyses

[Galileo Galilei on a money note] Nature Genetics has recently published this interesting work on the repeatability of gene expression analysis. That the microarray is a powerful tool for studying a number of biological phenomena is obvious nowadays, as is the fact that a microarray study should be repeatable, being an instance of the experimental approach that science is based on. So, those who try to verify the repeatability of such experiments are certainly welcome (sorry for the repetitions... ;-) ), even when they consider only part of it, as these authors do, by focusing on the possibility of redoing the analysis starting from either the raw data or the processed (i.e. normalised) data.

What we learn from the paper is, IMHO, that the situation is better than one would have expected, but there is still plenty of room for improvement.

They started from an initial list of 20 papers, chose a single conclusion from each of them (i.e. a single figure or table) and tried to arrive at the same one by redoing the reported analysis. An important part of the test is that they purposely didn't contact the original authors; rather, they tried to get everything they needed from the paper and from the data published in ArrayExpress or GEO, hence exploiting the available data annotations.

Error #AEF039483CD: I was programmed by a damned jerk

I've been told about all the good Software Engineering principles, good programming and all that jazz since my childhood. Be simple. Be clear. Be modular. Reuse. Document.

(Late) notes from ISWC 2008

I've finally found the time to review my notes from the last International Semantic Web Conference and put them in a clean document. Yes, it's a bit late, but I'd like to put this here and maybe someone will still find it interesting.

If you'd like the longer read, please see the slides below. If you're just looking for short impressions, I can summarize them as follows.

The Bowing Mighty

The picture is taken from this fine article (in Italian) about the recent earthquake in the Chinese region of Sichuan. It depicts a local leader bowing down and apologising for the young victims who died when school buildings collapsed.

Holidays in Croatia

Managing Microarray Knowledge with the Semantic Web

[MannOnto, relatedAssertions] Geeez! I have finally submitted the thesis and given the defence presentation! My PhD project is about representing the results of microarray data analysis. I represent concepts like sets of differentially expressed genes, results from clustering algorithms, and claims about the role of genes in a biological pathway. I use Semantic Web technologies and an OWL model (ontology) to provide a formal representation of such knowledge. Everything is focused on assertions (e.g. "IL2 is expressed under LPS infection condition"), which may be supported by experimental evidence (e.g. a data set) or by other kinds of evaluation, including rankings and comments from users. Moreover, users and their roles (e.g. is an expert on Immunology) may be considered, especially for ranking the assertions. I have also developed a demo, based on the Makna Semantic Wiki.


A Database for Genomic Expression Data Management

This is the project that made me discover the charm of Bioinformatics, Microarrays and Biology. Here you can find:


According to the guide, it seems you can learn this popular Cambridge activity after a little exercise. Reality...

Being World Champions in London

Piccadilly Circus, 9 July 2006.

