Sunday, January 21, 2007

Hurray! Interview Invitation from USC

After nearly a week of nervous waiting, I received my first interview invitation from USC yesterday. Prof. Xianghong Zhou wrote me the email, and it is possible that she will interview me. I visited her lab's website when I was choosing schools. She is a Chinese assistant professor with a sweet smile. Hopefully she will not be too strict with me in the interview.

Well, know yourself and know your enemy, and you will never be beaten even in a hundred battles. I plan to read her papers in the next two days. Her research is centered on methods for integrating cross-platform microarray datasets. I have just read her paper on Gene Aging Nexus. She may have rigorous requirements for statistics. (My disadvantage is my rather limited mathematical vocabulary!) I feel confident in my computer knowledge and programming ability.

Go on! Fighting!

Comments on Zhou's papers:
1. Gene Aging Nexus: A Web Database and Data Mining Platform for Microarray Data on Aging
keywords:
meta-analysis: by first extracting expression patterns from individual microarray datasets and then identifying recurrent signals, these approaches may enhance signal-noise separation.
differential expression analysis:
co-expression analysis: Zhou proposed a new method to mine regulatory modules in a previous paper, "Mining dense subgraphs across massive biological networks for functional discovery".
no major biological breakthrough.
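
To make the meta-analysis idea concrete for myself, here is a minimal sketch: compute gene-pair correlations inside each dataset separately, then keep only the pairs that are strongly correlated in several datasets. This is my own toy illustration, not the method from the paper; the function names, the Pearson measure, the 0.8 threshold and the min_support parameter are all assumptions.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length lists (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / sqrt(sxx * syy)


def recurrent_pairs(datasets, threshold=0.8, min_support=2):
    """Return gene pairs that are co-expressed in at least min_support datasets.

    datasets: list of dicts mapping gene name -> list of expression values.
    The threshold and min_support values are illustrative assumptions.
    """
    support = {}
    for data in datasets:
        # Step 1: extract the pattern (strong correlation) within one dataset.
        for g1, g2 in combinations(sorted(data), 2):
            if abs(pearson(data[g1], data[g2])) >= threshold:
                support[(g1, g2)] = support.get((g1, g2), 0) + 1
    # Step 2: keep only the signals that recur across datasets.
    return {pair: n for pair, n in support.items() if n >= min_support}


dataset_a = {"A": [1, 2, 3, 4], "B": [2, 4, 6, 8], "C": [4, 1, 3, 2]}
dataset_b = {"A": [5, 3, 1, 2, 4], "B": [10, 6, 2, 4, 8], "C": [1, 1, 2, 5, 3]}
print(recurrent_pairs([dataset_a, dataset_b]))
```

The datasets may have different numbers of samples, because the correlation is always computed within one dataset and only the yes/no result is combined across datasets.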

2. Integrative missing value estimation for microarray data
Question Answered:
Due to the inherent noise and the limitations of experimental systems, a microarray dataset on average has more than 5% missing values, affecting more than 60% of the genes. Such missing values make some subsequent analysis methods inapplicable or greatly decrease their performance. Hence the question of missing value estimation.
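
The two percentages above measure different things: the share of missing entries versus the share of genes with at least one missing entry. A small sketch of how both could be counted, assuming the data is a list of gene rows with None marking a missing value (the function name and the toy matrix are made up for illustration):

```python
def missing_stats(matrix):
    """Return (fraction of missing entries, fraction of genes affected).

    matrix: list of rows, one per gene; None marks a missing value.
    """
    total = sum(len(row) for row in matrix)
    missing = sum(v is None for row in matrix for v in row)
    affected = sum(any(v is None for v in row) for row in matrix)
    return missing / total, affected / len(matrix)


# Toy matrix: 4 genes x 5 samples, 2 missing entries in 2 different genes.
toy = [
    [1.0, 2.0, None, 4.0, 5.0],
    [0.5, 0.7, 0.9, 1.1, 1.3],
    [3.0, None, 3.2, 3.1, 2.9],
    [2.0, 2.1, 2.2, 2.3, 2.4],
]
print(missing_stats(toy))
```

In this toy case only 10% of the entries are missing, but half of the genes are affected, which is the same kind of gap as between the 5% and 60% figures.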

Basic Idea:
How to choose neighboring genes when not enough information is available within the microarray dataset itself. Intuitively, if a set of genes frequently shows expression similarity to the target gene over multiple datasets, those genes constitute a robust neighborhood that tends to show expression co-variation with the target gene.
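
A rough sketch of how I understand this idea: let every external dataset "vote" for the genes that correlate with the target gene, take the genes with the most votes as neighbors, and then fill the missing value from those neighbors. The voting rule, the 0.8 threshold and the plain neighbor average are my own simplifications (a KNN-style stand-in), not the estimator used in the paper, and all names here are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length lists (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / sqrt(sxx * syy)


def robust_neighbors(target, reference_datasets, k=2, threshold=0.8):
    """Pick the k genes that correlate with target in the most reference datasets.

    reference_datasets: list of dicts mapping gene name -> complete value list.
    Only positive correlation counts, so that averaging neighbors makes sense.
    """
    votes = {}
    for data in reference_datasets:
        if target not in data:
            continue
        for gene, values in data.items():
            if gene != target and pearson(data[target], values) >= threshold:
                votes[gene] = votes.get(gene, 0) + 1
    # Most votes first; ties broken alphabetically for a stable result.
    ranked = sorted(votes, key=lambda g: (-votes[g], g))
    return ranked[:k]


def impute(dataset, sample, neighbors):
    """Estimate a missing value as the mean of the neighbors' values in that sample."""
    observed = [
        dataset[g][sample]
        for g in neighbors
        if g in dataset and dataset[g][sample] is not None
    ]
    if not observed:
        return None
    return sum(observed) / len(observed)


references = [
    {"T": [1, 2, 3, 4], "A": [2, 3, 4, 5], "B": [4, 3, 2, 1], "C": [3, 1, 4, 2]},
    {"T": [2, 4, 1, 3], "A": [3, 5, 2, 4], "B": [1, 2, 3, 4], "C": [2, 4, 1, 3]},
]
incomplete = {"T": [5.0, None, 7.0], "A": [6.0, 6.5, 8.0], "C": [1.0, 2.5, 3.0]}

neighbors = robust_neighbors("T", references, k=1)
print(neighbors, impute(incomplete, 1, neighbors))
```

Gene A is chosen because it follows T in both reference datasets, while C follows T in only one of them. A real method would weight the neighbors or fit a regression instead of taking a simple mean.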

other concepts:
LLS (local least squares)
Bayesian principal component analysis
singular value decomposition
support vector machines
