Data Science + Research

Data Science:


Data Science at City of Hope

What is patient-similarity?

Oftentimes in healthcare, predictions are made about the “average patient”. Patient-similarity models make predictions about individual patients by identifying patients who are similar to an index patient and tailoring the predictions to these sets of similar patients. (click here for more information)

Patient-similarity project

I used machine learning tools to better understand breast cancer genomic clusters. Specifically, I conducted research identifying patient features that correlate strongly with breast cancer genomic clusters. I found that T-staging and M-staging (as in TNM staging) are correlated with genomic clustering.

I used SQL to pull and prepare data from several tables in Poseidon in the DNAnexus (AWS) environment. I engineered new features from the available data, such as age at onset of breast cancer. I gained key insights using various data visualizations, including violin plots, t-SNE visualizations, raincloud plots, and histograms. I used a permutation method to compute p-values. I also used the Kneedle algorithm to find the concavity of the clusters.
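As an illustration of the permutation idea mentioned above, here is a minimal sketch of a two-sided permutation test for a difference in group means. It is not the project's actual code: the function name `permutation_p_value`, the synthetic "age at onset" values, and the two-cluster setup are all assumptions made for the example.

```python
import numpy as np


def permutation_p_value(values, in_group, n_permutations=10_000, seed=0):
    """Two-sided permutation test for a difference in means between two groups.

    `values` holds one numeric feature per patient and `in_group` is a boolean
    mask marking membership in one of two clusters (both names are hypothetical).
    """
    values = np.asarray(values, dtype=float)
    in_group = np.asarray(in_group, dtype=bool)
    rng = np.random.default_rng(seed)

    # Observed absolute difference in means between the two clusters.
    observed = abs(values[in_group].mean() - values[~in_group].mean())

    # Shuffle the cluster labels and count how often the shuffled
    # difference is at least as large as the observed one.
    extreme = 0
    for _ in range(n_permutations):
        shuffled = rng.permutation(in_group)
        stat = abs(values[shuffled].mean() - values[~shuffled].mean())
        if stat >= observed:
            extreme += 1

    # The +1 correction keeps the estimated p-value strictly positive.
    return (extreme + 1) / (n_permutations + 1)


# Synthetic data only: two clusters whose mean age at onset differs by 10 years.
rng = np.random.default_rng(42)
ages = np.concatenate([rng.normal(50, 8, size=60), rng.normal(60, 8, size=60)])
in_cluster_a = np.array([True] * 60 + [False] * 60)

p_value = permutation_p_value(ages, in_cluster_a, n_permutations=2000)
print(f"permutation p-value: {p_value:.4f}")
```

Shuffling the cluster labels simulates the null hypothesis that the feature is unrelated to cluster membership, so a small p-value suggests the observed difference is unlikely to be due to chance alone.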

Data Science tools used:

General programming: Python (pandas, NumPy), SQL, Git
Machine Learning modeling: K-means clustering, k-modes clustering, Kneedle algorithm
Statistics: SHAP, p-values
Visualization: Plotly, Matplotlib, bar graphs, SHAP graphs

Math Research:


Differential Geometry Research:

The Splitting Theorem and Topology of Noncompact Spaces with Nonnegative N-Bakry Émery Ricci Curvature

Abstract: In this paper, we generalize topological results known for noncompact manifolds with nonnegative Ricci curvature to spaces with nonnegative N-Bakry Émery Ricci curvature. We study the Splitting Theorem and a property called the geodesic loops to infinity property in relation to spaces with nonnegative N-Bakry Émery Ricci curvature. In addition, we show that if M^n is a complete, noncompact Riemannian manifold with nonnegative N-Bakry Émery Ricci curvature where N > n, then H_{n-1}(M,Z) is 0.

(click here for the arXiv version of my paper)

(published in the Proceedings of the AMS)
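
For readers unfamiliar with the curvature condition in the abstract above, one common convention (stated here as background, and possibly differing in notation from the paper) defines the N-Bakry Émery Ricci tensor of an n-dimensional Riemannian manifold (M^n, g) with a smooth vector field X, for N > n, as follows. The symbols X^* (the one-form dual to X) and \mathcal{L}_X g (the Lie derivative of the metric) are standard notation rather than notation taken from this page.

```latex
\[
  \operatorname{Ric}_X^N
    \;=\; \operatorname{Ric}
    \;+\; \tfrac{1}{2}\,\mathcal{L}_X g
    \;-\; \frac{1}{N-n}\, X^{*} \otimes X^{*},
  \qquad N > n .
\]
% Gradient case: if X = \nabla f for a smooth function f, then
% \tfrac{1}{2}\mathcal{L}_X g = \operatorname{Hess} f and X^{*} = df, so
\[
  \operatorname{Ric}_f^N
    \;=\; \operatorname{Ric}
    \;+\; \operatorname{Hess} f
    \;-\; \frac{1}{N-n}\, df \otimes df .
\]
```

"Nonnegative N-Bakry Émery Ricci curvature" then means that this tensor is nonnegative on every tangent vector.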

Locally Homogeneous Non-gradient Quasi-Einstein 3-Manifolds

Abstract: In this paper, we classify the compact locally homogeneous non-gradient m-quasi Einstein 3-manifolds. Along the way, we prove that given a compact quotient of a Lie group of any dimension that is m-quasi Einstein, the potential vector field X must be left invariant and Killing. We also classify the nontrivial m-quasi Einstein metrics that are a compact quotient of the product of two Einstein metrics. We also show that S^1 is the only compact manifold of any dimension which admits a metric which is nontrivially m-quasi Einstein and Einstein.

(click here for the arXiv version of my paper)

(published in Advances in Geometry)

An example related to the N-Bakry-Émery Ricci curvature and a punctured torus

Abstract: This is a note related to the N-Bakry-Émery Ricci curvature and the punctured torus. Specifically, we attempt to answer the question, "Can a punctured torus admit metrics which satisfy nonnegative ∞-Bakry-Émery Ricci curvature?"

Click here to read my note

My thesis:

I'm very excited about my thesis, and I hope you'll enjoy reading it as much as I enjoyed writing it! All drawings were made using Procreate and Gravit.

Abstract: We begin the thesis by giving an intuitive introduction to calculus on manifolds for the non-mathematician. We then give a semi-intuitive description of Ricci curvature for the non-geometer. We give a description of the N-Bakry-Émery Ricci curvature and the N-quasi Einstein metric. The main results in this thesis are related to the N-Bakry-Émery Ricci curvature and the N-quasi Einstein metric.

Our first set of main results is as follows. We generalize topological results known for noncompact manifolds with nonnegative Ricci curvature to spaces with nonnegative N-Bakry Émery Ricci curvature. We study the Splitting Theorem and a property called the geodesic loops to infinity property in relation to spaces with nonnegative N-Bakry Émery Ricci curvature. In addition, we show that if M^n is a complete, noncompact Riemannian manifold with nonnegative N-Bakry Émery Ricci curvature where N > n, then H_{n-1}(M,Z) is 0.

For our second set of main results, we classify the compact locally homogeneous non-gradient N-quasi Einstein 3-manifolds. Along the way, we also prove that given a compact quotient of a Lie group of any dimension that is N-quasi Einstein, the potential vector field X must be left invariant and Killing. We also classify the nontrivial N-quasi Einstein metrics that are a compact quotient of the product of two Einstein metrics. We also show that S^1 is the only compact manifold of any dimension which admits a metric which is nontrivially N-quasi Einstein and Einstein.

(click here for my thesis)