<aside> <img src="/icons/list_gray.svg" alt="/icons/list_gray.svg" width="40px" /> Home | Research Aims | My journey

</aside>

Home

<aside> <img src="/icons/mail_gray.svg" alt="/icons/mail_gray.svg" width="40px" /> Email

</aside>

<aside> <img src="/icons/follow_gray.svg" alt="/icons/follow_gray.svg" width="40px" /> @KlieAdam

</aside>

<aside> <img src="https://s3-us-west-2.amazonaws.com/secure.notion-static.com/911fee6a-b4c8-408c-bab9-97a18ece4d72/github.png" alt="https://s3-us-west-2.amazonaws.com/secure.notion-static.com/911fee6a-b4c8-408c-bab9-97a18ece4d72/github.png" width="40px" /> GitHub

</aside>

<aside> <img src="https://s3-us-west-2.amazonaws.com/secure.notion-static.com/7e8fa7d2-6aab-431e-994a-e9fcf34b7b12/scholar.png" alt="https://s3-us-west-2.amazonaws.com/secure.notion-static.com/7e8fa7d2-6aab-431e-994a-e9fcf34b7b12/scholar.png" width="40px" /> Scholar

</aside>

Ph.D. Candidate

Bioinformatics & Systems Biology

I’m a fifth-year Ph.D. candidate in the Bioinformatics and Systems Biology (BISB) program at the University of California, San Diego.

My research focuses on understanding how gene expression is regulated (gene regulation). See below for my specific research aims.

🎓 Education

Curriculum Vitae

CV

Research Aims

The completion of the Human Genome Project in 2003 provided a nearly complete map of human DNA, setting the stage for the field of genomics to blossom. This milestone enabled scientists to explore differences in our DNA sequence (genetic variation) by comparing genomes from many different individuals.

Today, we have genome sequences for over half a million people, and studies of these genomes have identified hundreds of thousands of genetic variants linked to common diseases like heart disease, Alzheimer’s, and diabetes. Yet understanding the biological mechanisms behind these links remains a challenge. Many of these variants fall in regions of the genome that do not produce proteins (called the non-coding genome) but instead play a pivotal role in regulating protein production in cells. Despite its importance, the non-coding genome is far less understood than the regions that code for proteins.

In my thesis, I use machine learning to analyze large-scale genomics datasets, aiming to predict how genetic variation in the non-coding genome impacts biological function.

Building a software ecosystem for machine learning in genomics

When I started my Ph.D. in 2019, machine learning (ML) had already made a substantial impact in genomics. However, turning a new dataset into biological insight using ML was, and remains, much harder than publications make it seem.

Tools

EUGENe

Docs | Preprint | Publication | GitHub

Talks

https://www.youtube.com/watch?v=47wbTR9yUpg

Learning the grammar of enhancer function


An exciting direction for ML in genomics is the study of genetic switches called enhancers: short DNA fragments that, when activated, signal to the cell to create certain proteins. Several ML models have been developed and interpreted in an effort to learn more about the sequence features and their inter-dependencies (collectively termed syntax or grammar) that drive enhancer activity, including models that can be used to design cell-type-specific enhancers. Still, defining the mechanistic roles of enhancer features during development and in tissues with complex patterns of expression remains a substantial challenge.

I leverage massively parallel reporter assays (MPRAs), a high-throughput methodology that directly tests the regulatory potential of regulatory elements (REs) such as enhancers. Predicting the outcomes of MPRAs with interpretable ML is a powerful way to understand the sequence features within REs that encode function. Such models can help us find and validate functional genomic enhancers, prioritize enhancer features to test, and uncover dependencies between features. We are hoping to find a set of unifying principles or rules that govern enhancer function. ML models trained on MPRA data, especially in combination with other data types, will play a key role in helping us determine if, when, and where such principles exist in biology.

Building models that capture both cis and trans gene regulation

Enhancers act in cis. In many tissues, many enhancers act in concert.

To take the above aim a step further, we can also ask what roles these enhancers and other regulatory elements (REs) play in coordinating what exactly a given cell does. Much work has been done to develop gene regulatory network (GRN) models of biological systems, but linking these GRNs to the CPs, and other phenotypes, remains a challenge.

My journey

“The road goes ever on and on…”

Father and Son.mp3