Marco Brandizi's Site

Welcome to a 5-min-laziness site...

Comp Sci

The BioInvestigation Index and ISA Tools are out!

[The BII Architecture]

After a long stretch of work, the other brilliant guys from the NET Project team and I have finally released a public version of the BioInvestigation Index, along with binaries for local installation! (Sources are coming soon, just give us time to clean up, document, etc.)

What is it?

The BII project provides an infrastructure for the storage and retrieval of multi-omics experiments (we call them studies), i.e., studies whose typical design is: prepare a sample and take different measurements on it, such as microarray data, proteomics, 2D-gels, sequencing, etc. We aim at keeping together the metadata about these experiments (experiment design, sample characteristics and preparation), while leveraging existing omics-specific repositories and formats (e.g., ArrayExpress + MAGETAB, PRIDE + PRIDEML).

How does it work?

We cover the whole pipeline consisting of prepare-submission/submit/search-stored-information. Submissions can be crafted by means of the ISATAB format, a tabular, spreadsheet-based format, which is a compromise between the need for an (end)user-friendly format and the need for something decently structured and formal (a similar approach was followed for the definition of MAGETAB). Things are further eased by the ISAcreator, a graphical Java tool that works similarly to Excel (or OpenOffice Calc, I don't like to mention MS only...), with the difference that ISAcreator has many more nice ISATAB-specific features. For instance, one of the most interesting things is that it connects to ontology servers (OLS or Bioportal) and allows you to select the right annotation terms for your submission.
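To give a feeling for what "tabular, but decently structured" means in practice, here is a minimal Python sketch. The column names and the two rows are made up for illustration and are not copied from the ISATAB specification or from any real submission; the only point is that a spreadsheet-like, tab-delimited table can be read by a program as easily as by a human.

```python
import csv
import io

# Hypothetical, simplified study table: headers and values are illustrative only,
# not taken from the ISATAB specification.
STUDY_TABLE = (
    "Source Name\tCharacteristics[organism]\tProtocol REF\tSample Name\n"
    "donor1\tHomo sapiens\tsample preparation\tsample1\n"
    "donor2\tHomo sapiens\tsample preparation\tsample2\n"
)


def read_study_table(text):
    """Parse a tab-delimited table into a list of dicts keyed by column header."""
    reader = csv.DictReader(io.StringIO(text), delimiter="\t")
    return list(reader)


rows = read_study_table(STUDY_TABLE)
for row in rows:
    # Each row links a sample back to its source and its annotations.
    print(row["Sample Name"], "derives from", row["Source Name"],
          "/", row["Characteristics[organism]"])
```

The same file opens unchanged in a spreadsheet program, which is the kind of compromise described above.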

Redoing the microarray analyses

[Galileo Galilei on a money note] Nature Genetics has recently published this interesting work on the repeatability of gene expression analysis. That the microarray is a powerful tool for studying a number of biological phenomena is obvious nowadays. So is the fact that a microarray study should be repeatable, being an instance of the experimental approach that science is based on. So, those who try to verify the repeatability of this kind of experiment are certainly welcome (sorry for the repetitions... ;-) ). Even when one considers only a part of it, as they do, by focusing on the possibility of redoing the analysis, starting from the raw data or the processed (i.e., normalised) data.

What we learn from the paper is, IMHO, that the situation is better than one would have expected, but there is still quite some room for improvement.

They started from an initial list of 20 papers, chose a single conclusion from each of them (i.e., a single figure or table) and tried to arrive at the same one by redoing the reported analysis. An important part of the test is that they purposely didn't contact the original authors; rather, they tried to get all they needed from the paper and from the data published in ArrayExpress or GEO, hence exploiting the available data annotations.

(Late) notes from ISWC 2008

I've finally found time to review the notes from the last International Semantic Web Conference and put them in a clean document. Yes, it's a bit late, but I like to put this here and maybe someone will still find it interesting.

If you'd like the longer read, please see the slides below. If you're just looking for short impressions, I can summarize them as follows.

Networking with Dendritic Cells

I've recently been to the heart of the Mediterranean, Athens, for the annual meeting of the DC-THERA network, an FP6 project. A number of people and organisations collaborate in the context of this project, under the umbrella of dendritic cell (DC) research. Even I, the amateur biologist, can guess the importance of this kind of cell for the immune system, and their consequent importance for the development of a number of vaccines and therapies.

My new job @EBI

So, I came back to the EBI in March. I am now working on the BioInvestigation Index project (formerly BioMAP), for which I brushed up on good old Java.

What is it about?

[isa-tab] In short: we aim at integrating the submission of and access to transcriptomics, proteomics and metabolomics data. We will leverage the existing EBI repositories, ArrayExpress and PRIDE. We are also developing a tabular submission format, ISA-TAB, which is inspired by MAGE-TAB and FUGE.

Managing Microarray Knowledge with the Semantic Web

[MannOnto, relatedAssertions] Geeez! I have finally submitted the thesis and given the defense presentation! My PhD project is about representing the results coming from microarray data analysis. I represent concepts like: sets of differentially expressed genes, results from clustering algorithms, claims about the role of genes in a biological pathway. I use Semantic Web technologies and an OWL model (ontology) to provide a formal representation of such knowledge. Everything is centred on assertions (e.g., "IL2 is expressed under LPS infection condition"), which may be supported by experimental evidence (e.g., a data set) or other kinds of evaluations, including rankings and comments from the users. Moreover, users and their roles (e.g., being an expert on Immunology) may be considered, especially for ranking the assertions. I have also developed a demo, based on the Makna Semantic Wiki.
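As a rough illustration of the idea of assertions backed by evidence and ranked by users, here is a small Python sketch. It is not the OWL model from the thesis: the class names, the fields and, in particular, the rule that an expert's rating counts double are all assumptions made up for this example.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Evidence:
    description: str  # e.g. a pointer to a data set


@dataclass
class Rating:
    user: str
    user_is_expert: bool  # hypothetical stand-in for a user role
    score: int            # hypothetical 1..5 scale


@dataclass
class Assertion:
    statement: str
    evidence: List[Evidence] = field(default_factory=list)
    ratings: List[Rating] = field(default_factory=list)

    def weighted_score(self, expert_weight: float = 2.0) -> float:
        """Average the ratings, weighting experts more (an assumed rule)."""
        total = 0.0
        weight_sum = 0.0
        for rating in self.ratings:
            weight = expert_weight if rating.user_is_expert else 1.0
            total += weight * rating.score
            weight_sum += weight
        return total / weight_sum if weight_sum else 0.0


assertion = Assertion("IL2 is expressed under LPS infection condition")
assertion.evidence.append(Evidence("a data set"))
assertion.ratings.append(Rating("alice", user_is_expert=True, score=5))
assertion.ratings.append(Rating("bob", user_is_expert=False, score=2))
print(assertion.weighted_score())
```

In an OWL/RDF setting the same pieces would be resources linked by properties rather than Python objects, which is what makes inference over them possible.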


PhD project, yet another ontology

I have revised the OWL file once more. I did this after analysing case studies and considering how to make useful inferences with OWL. Details are reported in the doc.

A Database for Genomic Expression Data Management

This is the project that made me discover the charm of Bioinformatics, Microarrays and Biology. Here you can find:

Gene Lists, Microarray Knowledge, Semantic Web

I've started this discussion on the semweb-lifesci mailing list.

I was interested in how to handle gene lists (and other lists of gene-expression-related entities) in OWL. The discussion has expanded to: MAGE-OM in RDF format, microarrays and the Semantic Web, Semantic Web technologies...

Awkward identifiers

[Figures about TO gene]
Today I discovered that a gene named "to" exists. It's not difficult to imagine how little hope you have of finding it in Google. That's interesting: it was mentioned in a discussion about the use of natural language identifiers in computer science modelling. But actually maybe it is a case that shows how the Semantic Web could be useful. And complement Google. Well, Google is still able to find meaningful info by means of the extended name, "takeout". So, if you already know what "to" means, you'll probably look it up on a keyword-based search engine via "takeout" + something else (drosophila, gene). I wonder if marking something like "to" with something like "is the label for..." could actually improve searches. Uhm... probably relating "to or whatever" to other resources, such as publications or chromosomes, or who is studying this gene, which disease or biological process it is related to, etc. Uhm... Uhm... anyway I am impressed by this "to" thing...

PhD, Progress report of Mar 2006

Now I am working on my project from the EBI. This is a progress report where I show the application of the Semantic Web to my stuff.

