DREAM Committee: Nazli Ikizler Cinbis and Erkut Erdem
Project Description: One form of invoice fraud is the submission of duplicate invoices, so that the same invoice is used several times. It is therefore crucial to recognize a previously seen invoice when it arrives. In this study, we will use a novel, specially crafted document dataset to detect identical receipts regardless of changes in orientation, viewpoint, illumination, and clutter. In addition, we aim to develop a language-agnostic approach by leveraging deep neural networks from computer vision and NLP.
Requirements and Additional Info: Python, Keras or PyTorch, special interest in document understanding
Project Description: Pruning is a method for reducing the number of parameters in a neural network. It helps to improve runtime performance and to reduce the network's size and computation cost. In this project, the goal is to compare different DNN pruning methods on different GPU architectures in order to see how the effect of each method varies across architectures.
Requirements and Additional Info: Successful completion of an Artificial Intelligence / Machine Learning course, or strong interest and wish to learn the relevant topics
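To make the pruning idea in the description above concrete, here is a minimal sketch of magnitude-based pruning, one common family of pruning methods. It is an illustration only, not the method the project will use: the function names, the flat list of weights, and the sample values are all hypothetical, and a real study would rely on a framework such as PyTorch or Keras.

```python
def magnitude_prune(weights, sparsity):
    """Zero out the smallest-magnitude fraction of the weights.

    weights:  list of floats (a hypothetical flattened layer)
    sparsity: fraction in [0, 1] of the weights to remove
    """
    if not 0.0 <= sparsity <= 1.0:
        raise ValueError("sparsity must be in [0, 1]")
    n_prune = int(len(weights) * sparsity)
    # Indices ordered by absolute value, smallest first.
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    pruned_idx = set(order[:n_prune])
    return [0.0 if i in pruned_idx else w for i, w in enumerate(weights)]


def count_nonzero(weights):
    """Number of parameters that survive pruning."""
    return sum(1 for w in weights if w != 0.0)


# Hypothetical layer weights, chosen only for illustration.
layer = [0.8, -0.05, 0.3, 0.01, -0.6, 0.02, 0.45, -0.1]
pruned = magnitude_prune(layer, sparsity=0.5)
print(pruned)                 # half of the entries are now 0.0
print(count_nonzero(pruned))  # 4
```

Whether zeroed weights actually translate into faster inference depends on how well the hardware exploits sparsity, which is the kind of architecture-dependent effect the project sets out to measure.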
Project Description: In this project, we will select a machine learning application such as traffic density prediction, disease prediction, or a classification problem. Then, we will apply different methods to reduce the size of the application so that it fits on a small embedded device such as a Raspberry Pi. We will use quantization and pruning to reduce the size of the neural network used.
Requirements and Additional Info: Successful completion of an Artificial Intelligence / Machine Learning course, or strong interest and wish to learn the relevant topics
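As an illustration of the quantization step mentioned in the description above, the following sketch applies symmetric 8-bit linear quantization to a handful of weights. It is a simplified example under stated assumptions: the function names and sample weights are hypothetical, a single scale is shared by all weights, and production toolchains typically add calibration and per-channel scales.

```python
def quantize_int8(weights):
    """Symmetric linear quantization of floats to integers in [-127, 127]."""
    max_abs = max(abs(w) for w in weights)
    # One shared scale maps the largest magnitude to 127.
    scale = max_abs / 127 if max_abs > 0 else 1.0
    q = [max(-127, min(127, round(w / scale))) for w in weights]
    return q, scale


def dequantize(q, scale):
    """Recover approximate float values from the quantized integers."""
    return [v * scale for v in q]


# Hypothetical weights, chosen only for illustration.
weights = [0.5, -1.27, 0.02, 1.0]
q, scale = quantize_int8(weights)
restored = dequantize(q, scale)

print(q)
# The rounding error per weight is bounded by half a quantization step.
print(max(abs(a - b) for a, b in zip(weights, restored)) <= scale / 2 + 1e-12)
```

Storing each weight as one byte instead of a four-byte 32-bit float cuts the memory needed for the weights by roughly a factor of four, at the cost of the small rounding error checked above.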
Project Description: In this project, the student will be working on developing methodological solutions to understand medical images by using vision and NLP techniques. The participant will particularly investigate different approaches to combine vision models with LLMs, and train and evaluate these approaches on datasets consisting of medical data that contain images and text. By the end of the project, the student will become familiar with the recent multimodal generative models.
Requirements and Additional Info: Candidates should have strong programming skills and a solid machine learning background. Prior experience with LLMs and vision models is recommended. This project will be carried out in collaboration with Dr. Aykut Erdem of Koc University, Istanbul.
Project Description: In this project, the student will be working on developing methodological solutions to understand satellite images by using vision and NLP techniques. The participant will particularly investigate different approaches to combine vision models with LLMs, and train and evaluate these approaches on datasets consisting of textual and visual data. By the end of the project, the student will become familiar with the recent multimodal generative models.
Requirements and Additional Info: Candidates should have strong programming skills and a solid machine learning background. Prior experience with LLMs and vision models is recommended. This project will be carried out in collaboration with Dr. Aykut Erdem of Koc University, Istanbul.
Project Description: Using only graphs to model group relations in complex real-life networks results in information loss. For example, authors in co-authorship networks form groups through their collaborations, and group correspondence in email networks defines group relations among the participants. In the link prediction problem, one models a network as a graph and aims to predict the pairwise relations that could occur in the future. When groupwise relations, not only pairwise ones, are to be predicted, a hypergraph model is more suitable than a graph. In this project, we plan to develop a model for predicting group relations in social networks.
Requirements and Additional Info: Students in their third year or above
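The information loss mentioned in the description above can be seen in a small example. The sketch below projects a hypergraph onto a plain graph by connecting every pair of nodes that share a group (often called clique expansion) and shows that two different collaboration histories collapse to the same graph. The author names and papers are hypothetical and serve only as an illustration.

```python
from itertools import combinations


def clique_expansion(hyperedges):
    """Project a hypergraph onto a plain graph: every pair of nodes
    that share a hyperedge becomes one pairwise edge."""
    edges = set()
    for group in hyperedges:
        for u, v in combinations(sorted(group), 2):
            edges.add((u, v))
    return edges


# Hypothetical co-authorship data: each set is the author list of one paper.
one_joint_paper = [{"A", "B", "C"}]                        # one three-author paper
three_pair_papers = [{"A", "B"}, {"B", "C"}, {"A", "C"}]   # three two-author papers

g1 = clique_expansion(one_joint_paper)
g2 = clique_expansion(three_pair_papers)

print(sorted(g1))
print(g1 == g2)  # True: the graph cannot tell the two situations apart
```

The hypergraph keeps the two cases distinct because each paper is stored as its own group, which is why group-level prediction is posed on hypergraphs in this project.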
Project Description: Proteins are the workhorses in the body of living beings, performing almost every function at the molecular level, from catalyzing reactions for digesting food to carrying oxygen to our cells for energy production. Identifying the specific functions of each protein is crucial to understanding the mechanisms of life and developing new treatments against deadly diseases. A protein’s functionality is closely related to its 3-D structure, which is recorded as the positions of every atom in the protein (roughly 10,000 to 50,000 per protein) in 3-D space. Using transformer-based sequence modelling architectures to learn embeddings of all-atom protein encodings is computationally inefficient because the cost of self-attention grows quadratically with sequence length. With their subquadratic complexity, structured state space models (SSMs) offer the potential to address this problem. In this project, the aim is to develop an SSM-based foundation model that successfully learns proteins' structural and functional properties. The system will be trained and validated on open-access ultra-large protein datasets. The optimized models will be fine-tuned for various supervised tasks related to protein function, interaction, and property prediction. Finally, the model will be deployed as an open-access tool for the scientific community to benefit from.
Requirements and Additional Info: Must: experience in machine learning (courses & projects). Advantage: experience in deep learning and language models
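To illustrate the complexity argument in the description above, the sketch below contrasts the number of pairwise scores that full self-attention computes with the number of steps taken by a simple linear state space recurrence. This is only a toy model: the scalar recurrence and its parameters a, b, and c are assumptions made for illustration, and structured SSMs used in practice have vector-valued states and specially parameterized matrices.

```python
def ssm_scan(inputs, a=0.9, b=1.0, c=1.0):
    """Scalar discrete linear state space recurrence:
        x_k = a * x_{k-1} + b * u_k,    y_k = c * x_k
    One state update per input element, so the cost grows linearly
    with the sequence length.
    """
    x = 0.0
    outputs = []
    for u in inputs:
        x = a * x + b * u
        outputs.append(c * x)
    return outputs


def attention_pair_count(n):
    """Number of query-key score entries in full self-attention over n tokens."""
    return n * n


# Sequence lengths in the range of atom counts quoted in the description.
for n in (10_000, 50_000):
    steps = len(ssm_scan([1.0] * n))
    print(n, attention_pair_count(n), steps)
```

For 50,000 atoms, full self-attention would need 2.5 billion pairwise scores per layer and head, while the recurrence takes 50,000 steps.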