<aside> <img src="/icons/list_gray.svg" alt="/icons/list_gray.svg" width="40px" /> Home | Research Aims | My journey

</aside>

Home

<aside> <img src="/icons/mail_gray.svg" alt="/icons/mail_gray.svg" width="40px" /> Email

</aside>

<aside> <img src="/icons/follow_gray.svg" alt="/icons/follow_gray.svg" width="40px" /> @KlieAdam

</aside>

<aside> <img src="https://s3-us-west-2.amazonaws.com/secure.notion-static.com/911fee6a-b4c8-408c-bab9-97a18ece4d72/github.png" alt="https://s3-us-west-2.amazonaws.com/secure.notion-static.com/911fee6a-b4c8-408c-bab9-97a18ece4d72/github.png" width="40px" /> GitHub

</aside>

<aside> <img src="https://s3-us-west-2.amazonaws.com/secure.notion-static.com/7e8fa7d2-6aab-431e-994a-e9fcf34b7b12/scholar.png" alt="https://s3-us-west-2.amazonaws.com/secure.notion-static.com/7e8fa7d2-6aab-431e-994a-e9fcf34b7b12/scholar.png" width="40px" /> Scholar

</aside>

Ph.D. Candidate

Bioinformatics & Systems Biology

I’m a fifth-year Ph.D. candidate in the Bioinformatics and Systems Biology (BISB) program at the University of California, San Diego.

My research focuses on understanding how gene expression is regulated (see below for my specific research aims).

🎓 Education

Curriculum Vitae

CV

Research Aims

The completion of the Human Genome Project in 2003 provided a nearly complete map of human DNA, setting the stage for the field of genomics to blossom. This milestone enabled scientists to explore differences in our DNA sequence (genetic variation) by comparing genomes from many different individuals.

Today, we have genome sequences for over half a million people, and studies of these genomes have identified hundreds of thousands of genetic variants linked to common diseases like heart disease, Alzheimer’s, and diabetes. Yet understanding the biological mechanisms behind these links remains a challenge. A significant portion of these variants occurs in regions of the genome that do not produce proteins (called the non-coding genome) but instead play a pivotal role in regulating protein production in cells. Despite its importance, the non-coding genome is far less understood than the regions that code for proteins.

In my thesis, I use machine learning to analyze large-scale genomics datasets, aiming to predict how genetic variation in the non-coding genome impacts biological function.

Building a software ecosystem for machine learning in genomics

When I started my Ph.D. in 2019, machine learning (ML) was already making a substantial impact in genomics. However, applying ML successfully is consistently more challenging than the latest papers suggest.

Tools

EUGENe

Docs | Publication | Preprint | GitHub

Talks

https://www.youtube.com/watch?v=47wbTR9yUpg

Learning the grammar of enhancer function

An exciting direction for ML in genomics is the study of genetic switches called enhancers. Enhancers are short DNA fragments that, when activated, signal to the cell to create certain proteins. Several ML models have been developed and interpreted in an effort to learn more about the sequence features and their interdependencies (collectively termed syntax or grammar) that drive enhancer activity. These include models that can be used to design cell-type-specific or tissue-specific enhancers, a particularly exciting application with implications for synthetic biology. Despite this progress, defining the mechanistic roles of enhancer features during development or in tissues with complex patterns of expression remains a substantial challenge.

Building models that capture both cis and trans gene regulation (WIP)

Enhancers act in cis. In many tissues, multiple enhancers act in concert.

To take the above aim a step further, we can also ask what roles these enhancers and other regulatory elements (REs) play in coordinating what exactly a given cell does. Much work has been done to develop gene regulatory network (GRN) models of biological systems, but linking these GRNs to the CPs and other phenotypes remains a challenge.

My journey

“The road goes ever on and on…”