CL-SBML: A Common Lisp SBML Library

The Systems Biology Markup Language (SBML) is a useful format specification for exchanging models of networks of biochemical systems. SBML files can contain information about regulatory, metabolic, or signaling processes.

A Common Lisp "binding" is a desirable thing, as the language offers numerous advantages for generic bioinformatics programming.

SBML is based on XML and is therefore relatively language-neutral. Nevertheless, the main language binding for SBML, libsbml, is based on C/C++. Language bindings for Python, Perl, and Java are also available.

An SBML Common Lisp binding that binds libsbml directly via UFFI is also available. That binding postdates the code presented here.

However, having a pure CL implementation of SBML available as a read/write library is still desirable. The main motivations are the validation of the SBML specification itself and, of course, the independence of the CL code from foreign-language libraries. In particular, the existence of "good" SBML libraries other than libsbml demonstrates the validity of the contents of the specification, i.e., it demonstrates that the specification is both truly language-independent and well formulated.

The exercise was successful, and the result is described in these pages.

Example Use

The main entry point in the CL-SBML library is the cl-sbml:build-model-from-sbml generic function.

The tests sub-directory contains a few files downloaded from the main SBML site. You can use one of these as a test. Note that once CL-SBML is loaded, the CL-SBML logical pathname translation is also set to the appropriate directory.

  cl-prompt> (defvar *m* (cl-sbml:build-model-from-sbml (pathname "CL-SBML:tests;100Yeast.xml")))
  

The call to pathname is necessary because, if a string were supplied, cl-sbml:build-model-from-sbml would try to read the model directly from the string itself.

At this point the variable *m* will contain a CL-SBML model. The result is:

  cl-prompt> *m*
  #<CL-SBML:MODEL _100_Yeast_glycolytic_cells_[multi_unit] (NIL) 2525 species, 101 compartments, 2000 reactions 20610964>
  

The model has been loaded in an internal CL data structure and can now be manipulated at will.
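
The difference between passing a pathname and passing a string, noted above, comes down to generic function dispatch on the class of the argument. The following is a minimal sketch of that mechanism only. The function source-kind is a hypothetical stand-in invented for this illustration; it is not part of CL-SBML and says nothing about how cl-sbml:build-model-from-sbml is actually implemented.

```lisp
;; Hypothetical stand-in for a reader entry point; NOT part of CL-SBML.
(defgeneric source-kind (source)
  (:documentation "Report how a reader would interpret SOURCE."))

(defmethod source-kind ((source pathname))
  ;; A pathname names a file whose contents are to be read.
  :file)

(defmethod source-kind ((source string))
  ;; A string is taken to be the document text itself.
  :inline-text)

;; The same characters select different methods depending on their class.
(source-kind "tests/100Yeast.xml")             ; a string
(source-kind (pathname "tests/100Yeast.xml"))  ; a pathname
```

Under this scheme a file name passed as a plain string would be treated as document text, which is why the explicit call to pathname matters.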

Please note that the efficiency of the CL-SBML library depends on the efficiency of the underlying CL-XML library. If you find the performance lacking, you are advised to help the CL-XML people to improve the efficiency of their library.

Features

CL-SBML provides a full-blown read and print library to manipulate SBML Level 2 models.

The library is very well integrated with CL and provides some bells and whistles on top of the usual expected functionality. In particular, it provides nice math parsing routines even for string representations of formulae, thanks to the INFIX package by Mark Kantrowitz. Of course, the SBML Level 2 MathML subset is handled as well.

Remember: XML is S-expressions in drag!

Mathematical Formulae Parsing

SBML admits the presence of mathematical formulae in several positions in an element tree. As an example, the test file Metabolism-2002Lam.xml contains the following kineticLaw (edited to improve readability):

  <kineticLaw formula="(((Vfpglm_2) * (G1P)) / (Kpglmg1p_2) - ((((Vfpglm_2) * (Kpglmg6p_2))
                                                               / ((Kpglmg1p_2) * (16.62))) * (G6P)) / (Kpglmg6p_2))
                          / (1 + (G1P) / (Kpglmg1p_2) + (G6P) / (Kpglmg6p_2))">
      <listOfParameters>
          <parameter name="Vfpglm_2" value="0.48"/>
	  <parameter name="Kpglmg1p_2" value="6.3e-005"/>
	  <parameter name="Kpglmg6p_2" value="3e-005"/>
      </listOfParameters>
  </kineticLaw>
  
This kinetic law is translated into the following CL object, where the parsed formula is directly available. (Apart from the MATH slot, which would contain the SBML Level 2 MathML formula specification, all NIL slots have been omitted and replaced with "...".)

  #S(KINETIC-LAW  FORMULA "(((Vfpglm_2) * (G1P)) / (Kpglmg1p_2) - ((((Vfpglm_2) * (Kpglmg6p_2))
                                                               / ((Kpglmg1p_2) * (16.62))) * (G6P)) / (Kpglmg6p_2))
                          / (1 + (G1P) / (Kpglmg1p_2) + (G6P) / (Kpglmg6p_2))"
		  PARSED-FORMULA (/ (- (/ (* VFPGLM_2 G1P) KPGLMG1P_2)
				       (/ (* (/ (* VFPGLM_2 KPGLMG6P_2)
						(* KPGLMG1P_2 16.62))
					     G6P) KPGLMG6P_2))
				    (+ 1 (/ G1P KPGLMG1P_2)
				       (/ G6P KPGLMG6P_2)))
		  MATH NIL
		  PARAMETERS (#S(PARAMETER NAME "Vfpglm_2" VALUE "0.48" ...)
			      #S(PARAMETER NAME "Kpglmg1p_2" VALUE "6.3e-005" ...)
			      #S(PARAMETER NAME "Kpglmg6p_2" VALUE "3e-005" ...))
		  ...)
  
As you can see, the PARSED-FORMULA slot already contains a properly formed piece of CL code.

Exercise. The PARSED-FORMULA slot contains a prefix rendition of the kinetic law formula. MathML formulae are also rendered in prefix form. Although this is an unfair request, as MathML also deals with mathematical formulae layout, you can try to render the kinetic law already in S-expression form directly in MathML. Have fun!
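
As a starting point for the exercise, here is one possible sketch of a converter from a prefix expression to MathML content markup. It is not code from CL-SBML: the names *mathml-operators* and prefix->mathml are made up for this illustration, only the four arithmetic operators that appear in the kinetic law above are handled, and no attempt is made to check operator arity or to emit the enclosing math element.

```lisp
;; Illustrative only; these names are NOT part of CL-SBML.
(defparameter *mathml-operators*
  '((+ . "plus") (- . "minus") (* . "times") (/ . "divide"))
  "Maps prefix operator symbols to MathML content element names.")

(defun prefix->mathml (form)
  "Return FORM, a prefix arithmetic expression, as a MathML content string."
  (cond ((numberp form)
         ;; Numbers become <cn> elements.
         (format nil "<cn>~A</cn>" form))
        ((symbolp form)
         ;; Identifiers become <ci> elements, named by the symbol name.
         (format nil "<ci>~A</ci>" (symbol-name form)))
        ((consp form)
         (let ((op (cdr (assoc (first form) *mathml-operators*))))
           (unless op
             (error "Unsupported operator: ~S" (first form)))
           ;; An operator application becomes <apply><op/>args...</apply>.
           (format nil "<apply><~A/>~{~A~}</apply>"
                   op (mapcar #'prefix->mathml (rest form)))))
        (t (error "Cannot render ~S" form))))

;; First term of the kinetic law shown above.
(prefix->mathml '(/ (* vfpglm_2 g1p) kpglmg1p_2))
```

Note that the identifiers come out in upper case, because the standard reader upcases symbol names; recovering the original SBML spelling is left as part of the exercise.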

Full Integration in a CL Environment

The ability of CL-SBML to parse the mathematical renditions of the various formulae in an SBML file ties in nicely with the CL ability to manipulate code as data and vice versa. On-the-fly compilation of code is trivial in CL, so the construction of simulators and analyzers of various types is practically immediate.
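
To make the point about on-the-fly compilation concrete, here is a minimal sketch using only standard CL. The helper compile-formula is hypothetical and not part of CL-SBML. The formula is the first term of the kinetic law shown earlier, written out by hand, and the argument values are arbitrary numbers chosen for easy arithmetic, not data from any model. The sketch also assumes that the symbols in the formula are the same symbols used in the variable list, i.e. that they live in the same package.

```lisp
;; Hypothetical helper; NOT part of CL-SBML.
(defun compile-formula (variables formula)
  "Compile FORMULA, a prefix expression over VARIABLES, into a function."
  (compile nil `(lambda ,variables
                  (declare (ignorable ,@variables))
                  ,formula)))

;; First term of the kinetic law above, as a function of its three symbols.
(defparameter *first-term*
  (compile-formula '(vfpglm_2 g1p kpglmg1p_2)
                   '(/ (* vfpglm_2 g1p) kpglmg1p_2)))

;; Arbitrary illustrative values: (6 * 2) / 3 = 4.
(funcall *first-term* 6 2 3)
```

A simulator could build one such compiled function per reaction and call it at every integration step.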

The features described here are not directly available in the libsbml-based CL binding.

The CL-SBML Dictionary

The CL-SBML dictionary contains all the definitions comprising the library.

Site Map

None yet.

Enjoy!


Questions? Queries? Suggestions? Comments? Please direct them at me.