Research Opportunities

Current Project Openings (Updated on February 21, 2026)

DREAM Committee: Nazli Ikizler Cinbis and Erkut Erdem

Memory-efficient and Real-time Continuous Sign Language Translation
Name of the faculty mentor(s): Hacer Yalim Keles (hacerkeles@gmail.com)
Project Description: The goal of this project is to design and implement a memory-efficient and real-time Continuous Sign Language Translation (CSLT) model that operates under realistic computational constraints. Current transformer-based CSLT models achieve strong translation accuracy but require high GPU memory and offline processing, which limits their deployment in real-world scenarios such as edge devices or low-latency applications. Students will develop a compressed and optimized CSLT framework that integrates model pruning, knowledge distillation, and lightweight temporal modeling. The project will investigate how architectural simplifications (such as structured attention head pruning, token compression, layer reduction, and student–teacher distillation) affect translation accuracy under strict memory and latency budgets. The central research question is: Can a continuous sign language translation model operate in real time with significantly reduced memory consumption without sacrificing translation quality? Students are expected to:
1. Reproduce a strong baseline CSLT model,
2. Define clear memory and latency constraints,
3. Implement pruning and distillation strategies, and
4. Conduct a systematic performance–efficiency trade-off analysis.
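As a point of reference, the student–teacher distillation objective mentioned above is commonly written as a weighted blend of a softened KL term and ordinary cross-entropy. A minimal PyTorch sketch (the function name, temperature, and weighting are illustrative, not part of the project specification):

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, targets,
                      temperature=2.0, alpha=0.5):
    """Blend soft teacher targets with hard ground-truth labels."""
    # Soft part: KL divergence between temperature-scaled distributions;
    # the T^2 factor keeps gradient magnitudes comparable across temperatures.
    soft = F.kl_div(
        F.log_softmax(student_logits / temperature, dim=-1),
        F.softmax(teacher_logits / temperature, dim=-1),
        reduction="batchmean",
    ) * (temperature ** 2)
    # Hard part: ordinary cross-entropy against the gold labels.
    hard = F.cross_entropy(student_logits, targets)
    return alpha * soft + (1.0 - alpha) * hard

# Toy batch: 4 samples over a 10-class output vocabulary.
torch.manual_seed(0)
student = torch.randn(4, 10)
teacher = torch.randn(4, 10)
labels = torch.tensor([1, 3, 0, 7])
loss = distillation_loss(student, teacher, labels)
```

In a CSLT setting the logits would come from the compressed student decoder and the full baseline teacher, respectively, with the same objective applied per decoding step.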
Number of students: 2-3 students
Requirements and Additional Info: Required: Strong Python programming skills, a solid understanding of Data Structures and Algorithms, and basic knowledge of Machine Learning and Deep Learning. Preferred: Experience with PyTorch, familiarity with Transformers, and basic knowledge of model compression (pruning, quantization, distillation). Motivation: Interest in efficient deep learning, sign language technologies, and deploying AI systems in real-world constrained environments.
Enhancing Prostate Region/Lesion Segmentation in CT/PET Imaging via Super-Resolution Techniques: A Comparative Study on Institutional and Benchmark Datasets
Name of the faculty mentor(s): Aydın Kaya (aydinkaya83@gmail.com)
Project Description: Accurate segmentation of the prostate region and lesions in CT and PET imaging plays a critical role in diagnosis, staging, radiotherapy planning, and treatment response monitoring. However, the inherently limited spatial resolution and noise characteristics of PET, as well as partial volume effects in CT/PET imaging, often degrade segmentation performance. This project aims to systematically investigate whether super-resolution (SR) techniques can improve downstream prostate lesion segmentation accuracy in multimodal CT/PET imaging. The study will leverage both a newly collected institutional prostate CT/PET dataset and publicly available benchmark datasets to ensure methodological robustness and generalizability. Deep learning-based super-resolution models will be applied and optimized to enhance image quality prior to segmentation. The impact of SR preprocessing will be quantitatively evaluated by comparing segmentation performance (e.g., Dice similarity coefficient, Hausdorff distance) against baseline models trained on native-resolution images. Multiple experimental configurations will be explored, including: (i) applying SR as a preprocessing step before segmentation, (ii) joint SR-segmentation pipelines, and (iii) modality-specific versus multimodal SR strategies. Statistical analyses will be conducted to determine whether performance gains are consistent across datasets and imaging modalities. By systematically evaluating the interaction between image resolution enhancement and segmentation performance, this project seeks to provide evidence-based guidance on the utility of super-resolution techniques in prostate CT/PET imaging workflows. The findings are expected to contribute to more reliable automated delineation systems and ultimately support improved clinical decision-making in prostate cancer management.
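For reference, the Dice similarity coefficient used as the primary evaluation metric above reduces to a few lines of NumPy (a minimal sketch; the toy masks are made up for illustration):

```python
import numpy as np

def dice_coefficient(pred, target, eps=1e-7):
    """Dice similarity coefficient between two binary segmentation masks."""
    pred = np.asarray(pred, dtype=bool)
    target = np.asarray(target, dtype=bool)
    intersection = np.logical_and(pred, target).sum()
    # 2|A ∩ B| / (|A| + |B|); eps guards against two empty masks.
    return (2.0 * intersection + eps) / (pred.sum() + target.sum() + eps)

# Toy 4x4 masks: two of three foreground pixels overlap.
a = np.zeros((4, 4), dtype=bool); a[0, 0] = a[0, 1] = a[1, 1] = True
b = np.zeros((4, 4), dtype=bool); b[0, 0] = b[1, 1] = b[2, 2] = True
print(round(dice_coefficient(a, b), 3))  # 2*2/(3+3) ≈ 0.667
```

The same quantity, computed per patient on native-resolution versus SR-enhanced inputs, is what the project's statistical comparisons would aggregate.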
Number of students: 1 student
Requirements and Additional Info: Machine learning, deep learning.
ContVAR: Protein Function Prediction with Contrastive Representation Learning
Name of the faculty mentor(s): Tunca Doğan (tuncadogan@gmail.com)
Project Description: This project tackles a bioinformatics problem: explaining how genomic mutations that produce single-amino-acid variants alter protein function—an essential issue in biomedicine and biotechnology. Most current predictors can flag variants as pathogenic or broadly gain-/loss-of-function, but they cannot specify exactly which biological roles are affected. We propose ContVAR, an AI framework for fine-grained functional interpretation of protein variants that integrates self-supervised contrastive learning with sequence-based representation learning. At its core, ContVAR learns variant-sensitive embeddings from protein sequences using a masked language modelling (MLM) objective, allowing it to capture contextual constraints and functional signals without requiring large datasets containing variant-specific function labels. We leverage deep mutational scanning data to support self-supervised training and to improve discrimination between function-altering and neutral variants. The resulting embeddings are then used in a supervised module trained to map representations to Gene Ontology functional profiles learned from natural proteins. We will validate ContVAR on independent benchmarks and resources, and demonstrate utility through targeted case studies in biologically important settings such as the p53–MDM2 pathway.
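A standard formulation of the contrastive component is the InfoNCE objective, which pulls matched embedding pairs together and pushes mismatched pairs apart. A minimal PyTorch sketch (the pairing scheme and temperature are illustrative assumptions, not the ContVAR design):

```python
import torch
import torch.nn.functional as F

def info_nce(anchors, positives, temperature=0.07):
    """InfoNCE loss: each anchor's positive is the same-index row of `positives`;
    all other rows in the batch act as negatives."""
    a = F.normalize(anchors, dim=-1)
    p = F.normalize(positives, dim=-1)
    logits = a @ p.T / temperature          # cosine similarity matrix
    labels = torch.arange(a.size(0))        # matching pairs sit on the diagonal
    return F.cross_entropy(logits, labels)

# Toy example: 8 embedding pairs (e.g., variant vs. reference sequence) of dim 32.
torch.manual_seed(0)
z1, z2 = torch.randn(8, 32), torch.randn(8, 32)
loss = info_nce(z1, z2)
```

With identical inputs the diagonal dominates and the loss collapses toward zero, which is the sanity check one would run before training on real variant/wild-type pairs.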
Number of students: 2 students
Requirements and Additional Info: Machine/deep/representation learning, bioinformatics, language models, transformers
A System-Level Simulation Framework for In-Network Computing on High-Performance Networks
Name of the faculty mentor(s): Kayhan İmre (ki@hacettepe.edu.tr)
Project Description: This project aims to design and implement a comprehensive discrete event simulation framework to analyze the performance of In-Network Computing (INC) capabilities within High-Performance Computing (HPC) environments. As data movement becomes a primary bottleneck in exascale computing, offloading computation to network switches (INC) is a critical research area. The project focuses on three key objectives:
1. Framework Development: Developing a scalable simulation environment using the open-source ns-3 network simulator.
2. Topology Implementation: Modeling next-generation HPC network topologies, specifically Fat-Tree, Dragonfly, and Dragonfly+.
3. Algorithm Validation: Simulating a novel INC-accelerated Parallel Quicksort algorithm to benchmark performance gains.
Methodology & Resources: Students will not be starting from zero. An in-house reference prototype, developed from scratch (without simulation libraries), currently simulates both the conventional and INC-accelerated Parallel Quicksort algorithms on a Fat-Tree topology.
• Phase 1: Analyze the existing in-house prototype to understand the INC-accelerated sorting logic.
• Phase 2: Port and expand this logic into the ns-3 ecosystem, enabling standard protocol support and more complex network models.
• Phase 3: Implement Fat-Tree and/or Dragonfly/Dragonfly+ topologies and compare the ns-3 simulation results against the in-house prototype for validation.
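The target platform, ns-3, is a C++ framework, but the core abstraction in every discrete event simulator (ns-3 exposes it as Simulator::Schedule) is a time-ordered event queue. A generic Python sketch of that idea, not code from either the prototype or ns-3, and with made-up latencies:

```python
import heapq

class EventQueue:
    """Minimal discrete-event scheduler: pop the earliest event, advance
    the clock to its timestamp, run its callback, repeat."""
    def __init__(self):
        self.now = 0.0
        self._heap = []
        self._seq = 0  # tie-breaker for events scheduled at the same time

    def schedule(self, delay, callback, *args):
        heapq.heappush(self._heap, (self.now + delay, self._seq, callback, args))
        self._seq += 1

    def run(self):
        while self._heap:
            self.now, _, callback, args = heapq.heappop(self._heap)
            callback(*args)

# Toy scenario: a packet traverses two switch hops, 1.5 us apart.
log = []
sim = EventQueue()

def arrive(hop):
    log.append((sim.now, hop))
    if hop < 2:
        sim.schedule(1.5, arrive, hop + 1)

sim.schedule(0.0, arrive, 1)
sim.run()
print(log)  # [(0.0, 1), (1.5, 2)]
```

Phase 2's porting work amounts to re-expressing logic like this on top of ns-3's scheduler, packet objects, and protocol stack instead of a hand-rolled loop.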
Number of students: 2 students
Requirements and Additional Info: 1) Strong C++ skills (Essential for ns-3 development). 2) Computer Networks concepts (Routing, Topologies, Packet Switching). 3) Familiarity with Linux environment and Git.
Evaluation of Structured and Unstructured DNN Pruning Across Heterogeneous GPU Architectures
Name of the faculty mentor(s): Suleyman Tosun (stosun@hacettepe.edu.tr)
Project Description: Traditional research in Deep Neural Network (DNN) pruning typically treats structured and unstructured methods as distinct paradigms. While unstructured pruning offers higher theoretical compression, it often introduces irregular sparsity that may degrade performance on general-purpose architectures like GPUs. Conversely, structured pruning aligns better with hardware data layouts but may sacrifice model accuracy. This project aims to conduct a comprehensive empirical investigation into how these two pruning types interact with diverse GPU architectures. By utilizing established benchmark datasets, we will quantify the trade-offs between model sparsity and real-world computational throughput (latency, energy consumption, and memory bandwidth). Key Objectives:
• Architectural Analysis: Evaluating the "irregularity penalty" of unstructured pruning across various GPU generations to identify specific hardware constraints where structured methods are more beneficial.
• Comparative Benchmarking: Utilizing standard datasets to provide a rigorous baseline of performance-to-accuracy ratios.
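Both pruning styles can be reproduced directly with PyTorch's torch.nn.utils.prune module; a minimal sketch contrasting them (layer sizes and the 50% amount are illustrative):

```python
import torch
import torch.nn as nn
import torch.nn.utils.prune as prune

# Two identical linear layers, pruned to 50% sparsity in two different ways.
torch.manual_seed(0)
unstructured = nn.Linear(64, 64)
structured = nn.Linear(64, 64)
structured.load_state_dict(unstructured.state_dict())

# Unstructured: zero the 50% smallest-magnitude weights anywhere in the matrix,
# producing irregular sparsity that dense GPU kernels cannot exploit directly.
prune.l1_unstructured(unstructured, name="weight", amount=0.5)

# Structured: zero 50% of whole output rows (L2-norm criterion along dim=0),
# which maps onto smaller dense shapes that GPUs handle efficiently.
prune.ln_structured(structured, name="weight", amount=0.5, n=2, dim=0)

def sparsity(m):
    return (m.weight == 0).float().mean().item()

print(sparsity(unstructured), sparsity(structured))  # both ≈ 0.5
```

The project's benchmarking would then time forward passes of both variants on each target GPU to quantify the "irregularity penalty" at equal nominal sparsity.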
Number of students: 2 students
Requirements and Additional Info: None
AI-Driven Blockchain Prediction Markets for Collective Intelligence and Forecasting Accuracy
Name of the faculty mentor(s): Adnan Özsoy, Cemil Zalluhoğlu (adnan.ozsoy@hacettepe.edu.tr, cemil@cs.hacettepe.edu.tr)
Project Description: This project proposes a blockchain-based, AI-enhanced prediction market platform where participants trade tokenized shares on the outcomes of real-world events (e.g., elections, scientific breakthroughs, or technology trends), while machine learning models continuously analyze external data sources (news, social media, economic indicators) to generate probabilistic forecasts and detect market inefficiencies. Smart contracts on a public or permissioned blockchain will ensure transparent order matching, automated settlement, and tamper-proof recording of trades and outcomes, while oracles securely feed verified real-world results into the system. The platform will integrate large language models for semantic signal extraction, time-series models for trend forecasting, and reinforcement learning agents to simulate and evaluate trading strategies. A key research goal is to study the interaction between decentralized market incentives and AI-generated signals, measuring whether AI guidance improves collective forecasting accuracy or introduces bias or herding. The project will include a web dashboard, data ingestion pipelines, on-chain analytics, and an evaluation framework comparing market prices, AI predictions, and ground truth outcomes over time. Student 1 will design and implement the blockchain smart contracts and on-chain market logic; Student 2 will develop the AI models for data ingestion, forecasting, and signal extraction; and Student 3 will build the web platform, dashboards, and evaluation framework integrating both components.
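One standard ingredient of such an evaluation framework is the Brier score, which compares probabilistic forecasts against binary ground-truth outcomes. A sketch with entirely made-up numbers (the variable names and values are illustrative only):

```python
def brier_score(forecasts, outcomes):
    """Mean squared error between predicted probabilities and 0/1 outcomes.
    Lower is better; an uninformative constant 0.5 forecast scores 0.25."""
    return sum((p - o) ** 2 for p, o in zip(forecasts, outcomes)) / len(forecasts)

# Hypothetical comparison: market-implied probabilities vs. an AI model's.
outcomes = [1, 0, 1, 1, 0]
market = [0.70, 0.20, 0.60, 0.85, 0.40]
model = [0.55, 0.35, 0.50, 0.70, 0.45]
print(brier_score(market, outcomes) < brier_score(model, outcomes))  # True
```

Tracking this score for market prices, AI predictions, and AI-guided markets over time is one concrete way to measure whether AI guidance improves or degrades collective accuracy.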
Number of students: 3 students
Requirements and Additional Info: A strong willingness to work and actively contribute to the project.
Generative Motion Priors for Robust Navigation in Highly Constrained Environments: The BARN Challenge on Duckiebots
Name of the faculty mentor(s): Ozgur Erkent (ozgurerkent@hacettepe.edu.tr)
Project Description: The goal of this project is to develop a state-of-the-art autonomous navigation algorithm for mobile robots operating in dense, cluttered, and unseen environments. Specifically, students will tackle The BARN (Benchmark Autonomous Robot Navigation) Challenge, an ICRA-affiliated competition that tests a robot's ability to navigate from a start to a goal without collisions in complex obstacle courses. The project will focus on Diffusion-based Generative Models to generate motion trajectories. Unlike classical planners (like DWA) or standard Reinforcement Learning, diffusion models can learn complex, multi-modal distributions of "successful" paths, allowing the robot to "imagine" and execute safe trajectories even in extremely tight spaces.
Number of students: 3 students
Requirements and Additional Info: Required: Strong Python programming skills, Data Structures, and Algorithms. Preferred: Knowledge of ROS (Robot Operating System), familiarity with Deep Learning frameworks (PyTorch/TensorFlow). Motivation: Interest in Generative AI, Robotics, and hardware-software integration.
Scaling CROSSBAR System for Real World Discovery: Knowledge-Graph Expansion, LLM Agents, and Grounded Biomedical Question Answering
Name of the faculty mentor(s): Tunca Doğan (tuncadogan@gmail.com)
Project Description: Biomedical discovery remains constrained by fragmented repositories, inconsistent metadata, and limited support for integrative, mechanism-oriented reasoning. CROssBARv2 (https://crossbarv2.hubiodatalab.com/) addressed these challenges by unifying heterogeneous biomedical sources into a provenance-rich knowledge graph with standardized ontologies, rich metadata, and vector embeddings, and by providing interactive exploration and CROssBAR-LLM—a natural-language interface that grounds LLM outputs in the underlying graph to reduce hallucinations. This foundation enables scalable access to integrated biomedical evidence and supports downstream tasks such as hypothesis generation, drug repurposing, and protein function prediction.
This follow-on project will scale CROssBAR along two axes: (i) knowledge expansion and (ii) next-generation interaction and reasoning. We will substantially enlarge the graph by integrating additional biological resources and modalities, introducing new node/edge types to represent richer molecular, cellular, and functional relationships, while preserving maintainability. In parallel, we will enable more powerful human–system interaction through agentic, grounded mechanistic QA that can browse both the CROssBAR graph and the scientific literature, consolidate evidence, and return traceable, citation-ready explanations.
Concretely, the work will be organized as: WP1 data onboarding and schema extension (new node/edge types, harmonization, provenance); WP2 an MCP (Model Context Protocol) server for CROssBAR to expose standardized, tool-callable capabilities (graph queries, hybrid retrieval, provenance/evidence export); WP3 graph-native GraphRAG and literature-aware agents for grounded molecular-mechanism answers; WP4 advanced knowledge-graph visualization and UX for interpretable exploration; and WP5 systematic evaluation (use cases, QA benchmarks, grounding/citation quality, usability).
Number of students: 3 students
Requirements and Additional Info: Advanced level programming (Python), Motivation to learn data representation, agentic LLMs, knowledge graphs and NoSQL databases.
Domain-Specific Modeling Infrastructure for an OSS Quality Assessment Tool
Name of the faculty mentor(s): Prof. Dr. Ayça Kolukısa, Dr. Tuğba Erdoğan, Dr. Nebi Yılmaz, Res. Asst. Aslı Taşgetiren (atarhan@cs.hacettepe.edu.tr, aslitasgetiren@cs.hacettepe.edu.tr)
Project Description: This project focuses on designing and implementing the modeling infrastructure of an open-source software (OSS) quality modeling and evaluation tool. The tool enables users to define quality models through a domain-specific language (DSL) derived from a formally defined meta-model. These definitions must be parsed, validated, transformed into an Entity–Relationship (E-R) representation, and persisted in the backend system.
The student(s) will (1) design and implement a DSL parsing mechanism that translates user-defined specifications into an E-R representation, (2) develop validation mechanisms to enforce structural constraints and consistency rules derived from the meta-model, and (3) implement backend services that support creation, storage, retrieval, and integrity control of defined structures. The outcome will be a robust modeling engine that ensures correctness, consistency, and maintainability within the tool.
Number of students: 3 students
Requirements and Additional Info: Students should have: Strong Java knowledge, familiarity with database design and E-R modeling and basic knowledge of parsing concepts or compiler fundamentals.
Automated Quality Evaluation Infrastructure for Open-Source Software
Name of the faculty mentor(s): Prof. Dr. Ayça Kolukısa, Dr. Tuğba Erdoğan, Dr. Nebi Yılmaz, Res. Asst. Aslı Taşgetiren (atarhan@cs.hacettepe.edu.tr, tugbagurgen@cs.hacettepe.edu.tr, nebi.yilmaz@hacettepe.edu.tr, aslitasgetiren@cs.hacettepe.edu.tr)
Project Description: Assessing the quality of open-source software (OSS) requires the systematic collection and processing of both repository-level and code-level metrics. While many tools extract partial information (e.g., repository statistics), integrating static code analysis, repository mining, metric normalization, and comparative reporting into a single evaluation pipeline remains a practical challenge.
This project focuses on designing and implementing the evaluation infrastructure of an OSS quality assessment tool developed using Java (Spring Boot), React, and PostgreSQL. The student(s) will (1) investigate and benchmark suitable static analysis tools (e.g., SonarQube, PMD, CK, Understand, Radon), (2) design mechanisms to extract repository metrics via the GitHub API or mining libraries (e.g., PyDriller), and (3) implement backend services that collect, normalize, store, and aggregate metrics within a PostgreSQL-backed system. The outcome will be a modular evaluation layer that supports automated metric extraction, score computation, and comparative analysis across OSS projects (traditional software and AI-based repositories).
Students will conduct a small-scale comparative study of static analysis tools as part of the project and document integration trade-offs (accuracy, performance, extensibility, licensing). Some related works: https://github.com/nirhasabnis/gitrank [1] & https://gomstory.github.io/oss-aqm/#/compare [2]
[1] Hasabnis, Niranjan. "GitRank: a framework to rank GitHub repositories." Proceedings of the 19th International Conference on Mining Software Repositories. 2022.
[2] Madaehoh, Arkom, and Twittie Senivongse. "OSS-AQM: An open-source software quality model for automated quality measurement." 2022 International Conference on Data and Software Engineering (ICoDSE). IEEE, 2022.
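To illustrate the metric-normalization step in (3) above: heterogeneous metrics (LOC, complexity, issue counts) must be rescaled before they can be aggregated into a score. A minimal min-max sketch (the project names and values are hypothetical):

```python
def min_max_normalize(metrics):
    """Scale one metric to [0, 1] across projects so that values with
    different units become comparable before aggregation."""
    lo, hi = min(metrics.values()), max(metrics.values())
    if hi == lo:
        return {k: 0.0 for k in metrics}  # degenerate case: all equal
    return {k: (v - lo) / (hi - lo) for k, v in metrics.items()}

# Hypothetical average cyclomatic complexity for three OSS projects.
cc = {"proj-a": 4.2, "proj-b": 9.8, "proj-c": 6.0}
norm = min_max_normalize(cc)
print(norm["proj-a"], norm["proj-b"])  # 0.0 1.0
```

In the actual backend this would run per metric after collection, with the normalized values feeding the score-computation and comparative-reporting services.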
Number of students: 3 students
Requirements and Additional Info: Students should have: Solid Java knowledge, a basic understanding of Git and open-source development workflows, and familiarity with REST APIs and relational databases. Preferred (not mandatory): Experience with static analysis tools, basic knowledge of software metrics (e.g., complexity, cohesion, coupling), and interest in empirical software engineering.
Efficient Multimodal Retrieval for Video Question Answering with Adaptive Modality Selection
Name of the faculty mentor(s): Gulden Olgun
Project Description: This project aims to develop an AI-based retrieval system for educational videos that can answer user queries by adaptively selecting the most informative modality. Instead of always relying solely on transcript data or only on visual information, the system first analyzes the query and determines whether the answer should be retrieved primarily from the transcript, from visual keyframes, or from a combination of both. The main motivation is that many instructional and lecture videos contain important information that is not fully captured in speech alone, while processing all visual content with large multimodal models is computationally expensive. Therefore, the project will focus on building an efficient retrieval pipeline that selectively uses transcript-based retrieval and keyframe-based multimodal reasoning to improve answer accuracy while controlling token usage, computational cost, and latency.
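The modality-selection step could be prototyped as a simple rule-based router before moving to a learned classifier. A sketch of that baseline (the cue lists are assumptions for illustration, not part of the project plan):

```python
import re

# Illustrative lexical cues; a real system would use a trained query
# classifier or embedding similarity against each modality's index.
VISUAL_CUES = re.compile(r"\b(diagram|figure|graph|chart|slide|shown|screen|plot)\b", re.I)
VERBAL_CUES = re.compile(r"\b(say|said|mention|explain|define|definition|quote)\b", re.I)

def route_query(query):
    """Return which index to search first: 'keyframes', 'transcript', or 'both'."""
    visual = bool(VISUAL_CUES.search(query))
    verbal = bool(VERBAL_CUES.search(query))
    if visual and not verbal:
        return "keyframes"
    if verbal and not visual:
        return "transcript"
    return "both"

print(route_query("What formula is shown on the slide?"))    # keyframes
print(route_query("How does the lecturer define entropy?"))  # transcript
```

Routing only ambiguous queries to the expensive multimodal path is what keeps token usage, cost, and latency under control relative to always processing keyframes.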
Number of students: 2 students
Requirements and Additional Info: None