Ph.D. Dissertation Defense: Zeynep Hakguder
Time: 12:00 pm – 1:00 pm
Zoom
“High Dimensional Data Modeling Using Graphical Models”
Abstract: One of the defining features of the current era is the availability of massive amounts of data. Interest in data collection is driven by methods that can draw valuable insights from data and by automated responses to analytical results that bring down costs. Statistical methods and machine learning algorithms have added tremendous value to data that might not otherwise have been tapped due to a lack of resources. Models that distill and serve insights from data are being made commercially available and are transforming our lives. Science has also been undergoing a transformation that welcomes data-driven approaches. While many scientific disciplines are adopting a data-centric approach to discovery, the biomedical sciences have been an early adopter of, and contributor to, data-driven research. The data produced by biomedical fields is huge and spans many levels of organism structure and function. The hopes and expectations for computational and data-centric methods in this field are vast: automated or computer-assisted diagnosis, automated drug discovery, personalized medicine, personalized nutrition, and lifestyle monitoring. However, the challenges are equally grand: biomedical data is high-dimensional, noisy, and incomplete.
In this dissertation, we propose data-driven models that address biomedical problems. We take a probabilistic perspective on these problems and use the practical formalism of probabilistic graphical models. First, we consider the problem of discovering microRNA-gene interactions and interaction patterns. A highly accurate computational approach is valuable here, given the huge number of possible pairings between microRNAs and genes and the high cost of verifying candidate interaction pairs. We developed a non-parametric directed graphical model that was highly accurate in discovering these interactions. Then, we present our contributions toward automated dietary monitoring. We use undirected graphical models to solve two problems we faced while building a smart mobile dietary monitoring application that tracks homemade meals. These models were built to model food images and recipe texts as well as frequently seen ingredients. We compared several deep learning architectures as the main image classifier and obtained highly accurate results. We incorporated co-occurrence frequencies using a conditional random field on the image predictions. We trained a highly accurate linear-chain conditional random field to parse recipe data in order to produce these recipes.
Committee:
- Dr. Stephen Scott
- Dr. Juan Cui
- Dr. Vinod Variyam
- Dr. Bertrand Clarke
Zoom: https://unl.zoom.us/j/97097346085