Larry Lok, Ph.D.
Senior Research Fellow
Email: llok molsci.org
Tel: 510-981-8740
In the realm of practical programming, my work at MSI has had two main foci. The first, a part of MSI's Alpha Project, is the simulation of cellular chemistry. This work has focused particularly on managing large (>5000) reaction systems that involve many species of protein complexes carrying modifications, such as phosphorylation, that form the "currency" of intracellular information transmission systems, such as the yeast mating pheromone pathway. The principal outcome of this work has been the Moleculizer simulator, which was the first to generate a system's reactions "on the fly" from user-supplied rules, so that the user does not have to specify the entire reaction network explicitly.
My second main line of practical programming work at MSI is the computational statistics of gene expression and of complex systems of intracellular reactions by means of Bayesian networks, and in particular the causal analysis of these systems using the statistical theory of causality developed recently by Pearl, Heckerman, and others. The principal outcome of this work, which is unpublished, is a package of software tools for Bayesian networks and information theory. These tools are implemented as C++ libraries linked into an object-oriented form of the Scheme programming language.
Most of my more purely mathematical interests currently lie at the intersection of differential geometry and statistics, a field called information geometry (also known as the theory of statistical manifolds), which is primarily due to S. Amari. Loosely, this is the geometrical structure that a manifold M acquires by being embedded in the space of probability distributions on a measurable space X. For example, in a biological context, X might be the set of possible combinations of expression levels of the genes in a gene-regulatory network. The Fisher information matrix (or metric) is the best-known element of information geometry. Much of the mathematics that underlies practical Bayesian network calculation can be cast into information-geometric terms, usually resulting in great clarification. Information geometry similarly clarifies many of the most important theorems of modern statistics, e.g. the famous Cramér-Rao inequality. Practically, I am currently working on an information-geometric solution to a problem of multivariate estimation from (typically flow-cytometric or image-cytometric) observations of only a few variables at a time (e.g. fluorescence values of several engineered fluorescent reporter proteins [FPs]); this is similar to the common "missing data" problem in statistics. On a more abstract plane, I want to initiate an intrinsic approach to this kind of geometry using Élie Cartan's theory of moving frames, Cartan connections, etc., which permit a unified treatment of higher-order structure, in this case related to practical problems in statistical estimation.
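For reference, the Fisher information metric and the Cramér-Rao inequality mentioned above can be written in their standard textbook form. The notation is an assumption introduced here for illustration: θ = (θ^1, ..., θ^n) are local coordinates on M, p(x; θ) is the density on X of the distribution at the point θ, and the estimator \hat{θ} is assumed unbiased.

```latex
% Fisher information metric on M in local coordinates \theta
g_{ij}(\theta)
  = \mathbb{E}_{\theta}\!\left[
      \frac{\partial \log p(x;\theta)}{\partial \theta^{i}}\,
      \frac{\partial \log p(x;\theta)}{\partial \theta^{j}}
    \right]

% Cramer-Rao inequality for an unbiased estimator \hat{\theta} of \theta:
% the difference of the two sides is positive semidefinite
\operatorname{Cov}_{\theta}\bigl(\hat{\theta}\bigr) \succeq g(\theta)^{-1}
```

Read geometrically, the inequality says that the Fisher metric sets the limit on how precisely nearby points of M can be distinguished from data.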