Department of Biostatistics, University of Pennsylvania, Philadelphia, PA 19104 USA
Bioinformatics and biostatistical methods for high-throughput sequencing data analysis (RNA-Seq, ChIP-Seq, etc.). Genomics, including the microbiome and non-coding RNA.
- Seeking a data science position that leverages my scientific background and quantitative skills. Interested in developing and implementing statistical and machine learning algorithms for big data in healthcare, finance, and business.
- Accomplished researcher with a strong publication record (17 papers with over 1000 citations and one US patent), a solid statistical and machine learning background (probability, statistical inference and methods, Bayesian methods and computation, stochastic inference), proficient programming skills (R, Python, etc.), and hands-on experience analyzing large biological and clinical data sets.
- Programming: R (tidyr, dplyr, ggplot2, randomForest), Python (pandas, NumPy, scikit-learn, matplotlib, Keras, TensorFlow), Java, C++, Git/GitHub
- Computation: Linux, parallel computing with the R foreach package, parallel computing on HPC clusters, and basic knowledge of Hadoop and MapReduce
- Statistics: hypothesis testing, linear models, generalized linear models, random effects models, Bayesian hierarchical models, etc.
- Machine learning: deep learning, XGBoost, MDS, PCA, KNN, SVM, random forest, etc.
- Data visualization: ggplot2, pheatmap, circos plot
- Financial investment (passed CFA Level 1 exam)