Projects

        Medical Informatics:

      • I3RIS: Interactive, Iterative, Integrated Radiology Image Search
        • Description: Advances in medical imaging technologies have generated billions of images that are digitally stored and indexed in different data repositories worldwide. Current search mechanisms and query tools used to access these images in clinical practice are text-based only and are not sophisticated enough to fulfill the types of queries that clinicians need. Leveraging the richness of the medical data, the long-term objective of this interdisciplinary effort between DePaul University and the University of Chicago is to provide the most useful information, the best images, and the most relevant data sources to clinicians at the point of care. Our specific goals are to design, develop, and evaluate a hybrid search engine that unlocks valuable information from onsite and online radiology data sources (in-house proprietary teaching files and publicly available online peer-reviewed teaching files, radiology journals, and imaging-related textbooks) to provide radiologists with the most relevant information needed at the time of patient care. Our central hypothesis is that a search mechanism that maps naturally from the user’s limited internal memory of observed cases to a wealth of examples available onsite and online would allow clinicians to make faster, more confident, and more accurate diagnoses by removing the innate error caused by the limits of human memory. To test the central hypothesis, we propose to 1) create a hybrid text and image distributed database by integrating radiology teaching files, textbooks, and journals, 2) extract knowledge from integrated data sources to augment medical decision making, and 3) develop a domain-specific interactive user interface with iterative query refinement.
        • Student Involvement: Now looking for Undergraduate, Master and PhD students. Master students can work on this project for Capstone.
        • Faculty Contact: For more information, please contact Dr. Daniela Raicu.

      • Medical Health Informatics
        • Description: Suicide is an alarming public health problem accounting for a considerable number of deaths each year worldwide. Many more individuals contemplate suicide. Understanding the attributes, characteristics, and exposures correlated with suicide remains an urgent and significant problem. As social networking sites have become more common, users have adopted these sites to talk about intensely personal topics, among them their thoughts about suicide. Such data has previously been evaluated by analyzing the language features of social media posts and using factors derived by domain experts to identify at-risk users. In this project, we automatically extract informal latent recurring topics of suicidal ideation found in social media posts. Our evaluation has demonstrated that we are able to automatically reproduce many of the expertly determined risk factors for suicide. Moreover, we have identified many informal latent topics related to suicidal ideation, such as concerns over health, work, self-image, and financial issues. Current projects include 1) expanding this work to other mental health issues, 2) testing additional feature extraction techniques, and 3) designing procedures to acquire a robust and reliable ground truth.
        • Student Involvement: Now looking for Undergraduate, Master and PhD students. Master students can work on this project for Capstone.
        • Faculty Contact: For more information, please contact Dr. Jonathan Gemmell.

      • Revolutionizing Medicine with Machine Learning
        • Description: Machine learning is on the cusp of revolutionizing medical diagnosis. Diagnostic applications of machine learning are rapidly transitioning from the theoretical to the real world. The transformational potential of these applications is wide-ranging, from an at-home tool for early detection to an instant “second opinion” for a complex diagnostic case. Machine learning as a diagnostic tool is expected to generate efficiencies and cost savings for patients, doctors, and hospitals and, most importantly of all, to save lives. In a quest to build more trustworthy Computer-Aided Diagnosis (CAD) systems for lung cancer, the CDM Medical Informatics Lab and the Imaging Institute at the University of Chicago have been collaborating for over a decade to build the next-generation CAD system with advanced imaging analytics and reasoning capabilities that can assist in the clinical decision-making process. The collaboration involves three stages of research: 1) predictive modeling for high-level diagnostic interpretation derived from low-level image data, 2) learning the human visual perception of similarity using low-level image features and expert-in-the-loop feedback, and 3) evaluating the effects of smart capabilities on traditional CAD systems and medical experts’ performance.
        • Student Involvement: Now looking for Undergraduate, Master and PhD students. Master students can work on this project for Capstone.
        • Faculty Contact: For more information, please contact Dr. Daniela Raicu.

      • Computer-Aided Prognosis of Age-Related Macular Degeneration
        • Description: The advanced form of age-related macular degeneration (AMD) is a major health burden that can lead to irreversible vision loss in the elderly population. For early preventative interventions, there is a lack of effective tools to predict the prognosis of advanced AMD because of the similar visual appearance of retinal image scans in the early stage and the variability of prognosis paths among patients. The existing prognosis models have several limitations. First, previous studies assume constant time intervals between doctor visits; however, in real-world clinical settings, the visits may happen at irregular time intervals. The assumption of constant time intervals leads to over-optimistic prediction results on specific training data sets while failing to produce generalizable results on new patient data sets. Second, current studies only predict one form of advanced AMD at a time. Third, computer-based prognosis results are typically not validated on new patients and, therefore, it is difficult to evaluate the generalizability of the proposed approaches. Lastly, there is a lack of interpretability of the models and explainability of how a computer-based prognosis determination has been made. The overall objective of this project is to design, develop, and evaluate AMD prognosis prediction models that can detect the most relevant images containing AMD biomarkers, manage unevenly spaced sequential optical coherence tomography (OCT) images, and predict all advanced AMD forms, while supporting the interpretation and explainability of computer-aided prognosis models.
        • Faculty Contact: For more information, please contact Dr. Daniela Raicu.
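        To illustrate the irregular-visit issue described above, here is a minimal sketch of one common way to expose uneven spacing to a sequence model: attach the elapsed time since the previous visit to each observation. This is not the project's method; the function name `add_time_gaps` and the per-visit "score" values are hypothetical and used only for illustration.

```python
from datetime import date


def add_time_gaps(visits):
    """Pair each observation with the days elapsed since the previous visit.

    `visits` is a list of (visit_date, score) tuples, where `score` is a
    hypothetical per-visit measurement. The first visit gets a gap of 0.
    """
    ordered = sorted(visits, key=lambda v: v[0])  # sort by visit date
    result = []
    previous_date = None
    for visit_date, score in ordered:
        gap_days = 0 if previous_date is None else (visit_date - previous_date).days
        result.append((gap_days, score))
        previous_date = visit_date
    return result


# Example with made-up visit dates at irregular intervals.
example_visits = [
    (date(2020, 3, 15), 0.40),
    (date(2020, 1, 10), 0.20),
    (date(2020, 3, 1), 0.25),
]
gaps = add_time_gaps(example_visits)
```

        A model that receives the gap alongside each observation no longer has to assume that consecutive scans are equally spaced in time.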

        Recommender Systems:

      • Recommender Systems
        • Description: Recommender systems assist users in navigating complex information spaces and focus their attention on the content most relevant to their needs. Often these systems rely on user activity or descriptions of the content. Social annotation systems, in which users collaboratively assign tags to items, provide another means to capture information about users and items. Each of these data sources provides unique benefits, capturing different relationships. We propose leveraging multiple sources of data: ratings data as users report their affinity toward an item, tagging data as users assign annotations to items, and item data collected from an online database. Taken together, these datasets provide the opportunity to learn rich distributed representations by exploiting recent advances in neural network architectures.
        • Student Involvement: Now looking for Master and PhD students. Master students can work on this project for Capstone.
        • Faculty Contact: For more information, please contact Dr. Jonathan Gemmell.

        Computational Reproducibility:

      • Sciunits: Tools for conducting Reproducible Science
        • Description: Sciunits are efficient, lightweight, self-contained packages of computational experiments that can be guaranteed to repeat or reproduce regardless of deployment issues. Sciunit answers the call for a reusable research object that containerizes and stores applications simply and efficiently, facilitates sharing and collaboration, and eases the task of executing, understanding, and building on shared work. Explore Sciunits at: http://sciunit.run
        • Student Involvement: Now looking for Undergraduate, Master and PhD students. Master students can work on this project for Capstone.
        • Faculty Contact: For more information, please contact Dr. Tanu Malik.

        Transportation:

      • Traffic Crashes
        • Description: Traffic crashes have a significant impact on the economy, both in the form of property damage and in the form of lost time. The resulting congestion in busy areas wastes fuel and increases air pollution. The worst outcomes are fatalities and severe injuries. The most vulnerable road users in traffic crashes are pedestrians and cyclists. Identifying crash-prone locations will help traffic safety, transportation planning, and law enforcement agencies prioritize their efforts and resources to minimize the risk of accidents.
          The Michigan traffic crash data is rich and contains a lot of information about each incident, including severity, time, and involvement of pedestrians or bikes, along with other factors. There are 300K rows each year for the past 10 years. Analyzing this data will provide insight into many aspects of traffic crashes. Joining it with other data sources can improve the quality of the data. Several paths can be explored here, one of which is how pedestrians or cyclists are affected when involved in a traffic crash. Another is whether drug use increases the severity of a traffic crash.
          There is a lot of data cleaning, wrangling, visualization, and modeling involved. By participating in this project, the student will learn how to perform in-depth data analysis and apply different machine learning models to a real-life data set.
          Skills required:
          • Good knowledge of Python, R, or a similar data analytics platform
          • Past experience in creating a data science project from beginning to end
          • Ability to apply data mining and machine learning techniques
          • Geographical analysis experience using coding or GIS tools is a plus
          • Experience with neural networks and deep learning is a plus
        • Student Involvement: Now looking for Undergraduate and Master students. Master students can work on this project for their Capstone.
        • Faculty Contact: For more information, please contact Dr. Ilyas Ustun.
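        As a rough illustration of what identifying crash-prone locations can look like, the sketch below bins crash coordinates into a latitude/longitude grid and counts crashes per cell. It is only a starting point under stated assumptions: the function name `crash_hotspots`, the grid size, and the sample coordinates are hypothetical, and the actual data set's column names and format are not specified here.

```python
import math
from collections import Counter


def crash_hotspots(crashes, cell_size=0.01, top_n=3):
    """Count crashes per grid cell and return the busiest cells.

    `crashes` is an iterable of (latitude, longitude) pairs in degrees.
    `cell_size` is the grid resolution in degrees (an arbitrary choice here).
    Returns a list of ((lat_index, lon_index), count), most frequent first.
    """
    counts = Counter()
    for lat, lon in crashes:
        # Map each coordinate to the integer index of its grid cell.
        cell = (math.floor(lat / cell_size), math.floor(lon / cell_size))
        counts[cell] += 1
    return counts.most_common(top_n)


# Made-up coordinates: three crashes close together and one far away.
sample_crashes = [
    (42.3314, -83.0458),
    (42.3318, -83.0452),
    (42.3311, -83.0455),
    (42.9634, -85.6681),
]
top_cells = crash_hotspots(sample_crashes, cell_size=0.01, top_n=1)
```

        A fixed grid is the simplest possible choice; density-based clustering or road-segment matching with GIS tools would be natural refinements.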

      • Traveling Equipment
        • Description: This project investigates the problem of planning the allocation of resources to provide services to spatially dispersed customers from a single hub or a network of hub locations where resources are stored. These resources may be equipment such as seating and staging equipment in the entertainment industry, freight distribution vehicles, and trade-show booths. The resources may also be human resources such as consultants. One key decision for these problems is the dynamic reallocation of the resources to the hubs, where these resources are routed based on possible demand fluctuations.
          We consider this problem in the context of seating and staging services in the entertainment industry, where the planning process is quite complicated as it involves managing thousands of pieces of modular equipment among warehouses and events, such as concerts and sports games. We investigate improvement opportunities in terms of inventory and transportation management. This study is conducted in collaboration with SGA Production Services, which provides temporary seating and staging for large-scale entertainment events and stores its equipment in a network of six warehouses in the USA.
          The project has two aspects. One aspect is to analyze the data thoroughly and gain insights followed by predictive modeling and machine learning. The other aspect involves optimization of the equipment allocation which requires significant knowledge in operations research. The knowledge gained in the data science part will support the optimization part of the project. Students interested in one or both aspects of the project are welcome to join.
          There is a lot of data cleaning, wrangling, visualization, and modeling involved. By participating in this project, the student will learn how to perform in-depth data analysis and apply different machine learning models to a real-life data set.
          The students participating in the project are required to sign a Non-Disclosure Agreement (NDA).
          Skills required:
          • Good knowledge of Python, R, or a similar data analytics platform
          • Past experience in creating a data science project from beginning to end
          • Ability to apply data mining and machine learning techniques
          • Geographical analysis experience using coding or GIS tools is a plus
          • Experience with neural networks and deep learning is a plus
          • Knowledge of operations research and optimization is a plus
          • Experience with operations research and optimization tools, or with coding optimization models, is a plus
        • Student Involvement: Now looking for Undergraduate and Master students. Master students can work on this project for their Capstone.
        • Faculty Contact: For more information, please contact Dr. Ilyas Ustun.

        Materials:

      • Human-in-the-loop Image Pattern Detection
        • Description: New materials can provide solutions for key challenges in sustainability, e.g., in energy, new catalysts for more efficient fuel cell technology. One of the several challenges in new materials discovery is the identification of the crystalline phases of inorganic compounds based on an analysis of high-intensity X-ray patterns. Identifying these phases is equivalent to finding the crystal structures (arrangement of atoms) of new compounds, which then leads to determining their properties. Fully automated phase identification is challenging because the images generated with the X-ray instrument can be noisy and the patterns (or series of matching peaks) have to be identified across sets of multiple samples of materials. In this project, we explore a combination of visualization techniques, guided by humans, to accelerate crystalline phase identification. We will build on previous work to develop automated pattern detection and investigate opportunities to integrate expert and non-expert feedback.
        • Student Involvement: Now looking for Undergraduate, Master and PhD students. Master students can work on this project for Capstone.
        • Faculty Contact: For more information, please contact Dr. Roselyne Tchoua.

      • Hybrid Human-Machine Information Extraction
        • Description: Materials informatics is an emerging field that has the potential to dramatically reduce the development time and time-to-market for new materials; computational models scan large datasets to identify candidates for new materials. As such methods rely on access to large, machine-readable databases, the traditional text-based physical handbooks will not suffice. However, there are few examples of these scientific digital databases, and constructing new databases is a monumental and costly task requiring years of expert labor, as the data that populate these databases must often be extracted manually from free-text publications. While machine learning efforts have begun in materials science, the lack of annotated text hinders attempts to leverage approaches developed for other fields such as biomedicine. In this project, we will build on previous work which leverages human and automated approaches to extract scientific named entities from text. We will enhance this work to tackle scientific entity relation extraction. Specifically, we will explore comparable human-in-the-loop extraction approaches to continue to contribute to existing datasets of annotated materials entities and properties.
        • Student Involvement: Now looking for Undergraduate, Master and PhD students. Master students can work on this project for Capstone.
        • Faculty Contact: For more information, please contact Dr. Roselyne Tchoua.

        Bioinformatics:

      • Functional Neural Mapping for Behavior Modeling Using Big Data Computing
        • Description: A major goal in neuroscience research is to understand behavior at the level of neural networks. While many studies have attempted to tackle this goal, their resolution is not at the single-neuron level or their scope is not extensive enough to make a concrete connection between behavior and neural networks. Caenorhabditis elegans provides clear advantages to overcome both of these challenges due to its simple nervous system and completely deciphered anatomical neural map. Moreover, C. elegans exhibits behaviors found in higher organisms, including food search behavior. In this interdisciplinary collaborative project between DePaul University and Rosalind Franklin University Medical School, we will use C. elegans to build functional networks of interneurons for food search behavior. We propose to perform in-depth research and develop new, powerful, and scalable image processing, indexing, and data mining methods for efficient and effective analysis-based mapping of neural networks to locomotory search behaviors. Our proposed study will work on neuron-ablated C. elegans image datasets and focus on (1) extracting representations of movement characteristics, (2) discovering and indexing behavior patterns in large sequential image data, (3) modeling search behavior similarity based on the discovered patterns, and (4) learning functional neural networks from combinations of behavioral models. The amount of data generated by this research study will be in the petabyte range, making it crucial to employ cutting-edge big data computing techniques on advanced large-scale distributed systems to make this study tractable.
        • Student Involvement: Now looking for Undergraduate, Master and PhD students. Master students can work on this project for Capstone.
        • Faculty Contact: For more information, please contact Dr. Daniela Raicu.

      • Building Ontologies
        • Description: Successful development of a biofilms information system requires a framework for representing and communicating information about this highly complex domain. To meet this requirement we will develop a biofilms ontology that captures the concepts used in biofilms research, the attributes of these concepts, and the relationships among them. This ontological framework will directly inform the data model that will support the information system, and its database implementation. To the extent possible, we will reuse existing ontologies in domains that overlap with biofilms research.
        • Student Involvement: Now looking for Senior Level Undergraduate or Master students.
        • Technical Skills Required: (1) Learn to use PROTÉGÉ/WEBPROTEGE to implement the ontology, (2) Python, (3) SQL
        • Project Completion deadline: End of June 2020.
        • Faculty Contact: For more information, please contact Dr. Thiru Ramaraj.

Interested?

If you are interested in working on any of these projects as part of your capstone project or independent study, please contact Dr. Daniela Raicu, Dr. Raffaella Settimi, or the faculty listed for the project.