Monoshiz Mahbub Khan

(মনসিজ মাহবুব খান)

✉️ monoshizmk@gmail.com

🔵 LinkedIn profile

⚫ Personal GitHub repository

CV - Monoshiz Mahbub Khan.pdf (64.5 KiB)
Resume - Monoshiz Mahbub Khan.pdf (60 KiB)

Hello There!

My name is Monoshiz Mahbub Khan. I am a fifth-year PhD candidate in the Computing and Information Sciences program at Rochester Institute of Technology, having started in Fall 2021. I have also worked as a Graduate Research Assistant under Dr. Zhe Yu in the Human-in-the-Loop Software Engineering (hil-se) lab. I am interested in using AI, deep learning and NLP to build human-centered tools with visible and tangible uses. My work has primarily focused on using NLP tools in the Software Engineering domain to develop support tools, with notable work on the Code Search task and the adaptation of Comparative Learning for Agile Story Point estimation. I have also been involved in projects on multi-modal data, image processing, LLMs and more.

I have experience as a Graduate Teaching Assistant for the courses IDAI-710: Fundamentals of Machine Learning and IDAI-720: Research Methods for Artificial Intelligence at Rochester Institute of Technology.

From June to August 2024, I worked as an Intern at ABB in Mannheim, Germany through the DAAD RISE Professional Program 2024.

I am from Dhaka, Bangladesh. I completed my B.Sc. in the Department of Computer Science and Engineering at the University of Dhaka in 2020.

I am currently looking for full-time opportunities in AI/ML engineering, applied research or NLP-focused roles with a start date after May 2026.

Feel free to reach out. I’d love to collaborate, network and build new opportunities together!

Education
  • PhD in Computing and Information Sciences
  • Rochester Institute of Technology

    August 2021 - Present (Expected: May 2026)

    CGPA: 3.93 (out of 4.00)

  • BSc in Computer Science and Engineering
  • University of Dhaka

    January 2016 - January 2020

    CGPA: 3.55 (out of 4.00)

Work Experience
  • Intern, ABB, Mannheim, Germany (June 2024 - August 2024)
    • Developed an end-to-end Named Entity Recognition (NER) pipeline to serve as an internal product for engineers. Used traditional NLP methods, ML models, deep learning models and LLMs.
    • Final NER model improved the F1 score by 0.36 over the initial NER model.
    • Used various tools including PyTorch, spaCy, scikit-learn, Hugging Face, MLflow.
    • Conducted under the supervision of Nika Strem as part of the DAAD RISE Professional Program 2024.
  • Graduate Research Assistant, Lab of Human-In-the-Loop Software Engineering, Rochester Institute of Technology (Fall 2021 - Fall 2023, Fall 2024, Fall 2025). Supervisor: Dr. Zhe Yu
    • Conducted research on code search, published in EMSE, using NLP and ML tools to retrieve the most relevant code snippet for a text query. The proposed method showed an average improvement of 10.03% over state-of-the-art methods in terms of MRR scores.
    • Conducted research on comparative learning, using NLP and ML tools for agile story point estimation, showing an average increase of 21.84% in Spearman’s rank correlation coefficient scores.
    • Conducted human subject experiments to support comparative learning research.
    • Also explored research topics involving explainable AI and image classification.
    • Served as Graduate mentor for REU Site: Trustworthy AI Workshop 2025.
    • Mentored Master’s students on thesis projects: guided experimental design, advised research direction, and provided feedback on thesis writing.
  • Graduate Teaching Assistant Rochester Institute of Technology
    • IDAI-720: Research Methods for Artificial Intelligence (Spring 2024)
      • Instructors: Dr. Zhe Yu & Dr. Esa Rantanen
      • Graded assignments and final projects, and hosted office hours
    • IDAI-710: Fundamentals of Machine Learning (Spring 2024, Spring 2025)
      • Instructor: Dr. James Heard
      • Graded assignments, hosted office hours, and conducted review classes
Publications

1. Khan, M. M., & Yu, Z. (2024). Approaching Code Search for Python as a Translation Retrieval Problem with Dual Encoders. Empirical Software Engineering. DOI: 10.1007/s10664-024-10580-3
2. Khan, M. M., Xi, X., Meneely, A., Tang, Y., & Yu, Z. (2026). Efficient Story Point Estimation With Comparative Learning. arXiv preprint arXiv:2507.14642.
3. Bethi, M. R., Jhade, S. R., Yaganti, P., Khan, M. M., & Yu, Z. (2026). Modeling Art Evaluations from Comparative Judgments: A Deep Learning Approach to Predicting Aesthetic Preferences. arXiv preprint arXiv:2602.00394.
4. Minni, K., Zhang, Q., Khan, M. M., & Yu, Z. (2026). Modeling Image-Caption Rating from Comparative Judgments. arXiv preprint arXiv:2602.00381.

Research Experience

Code Search (2021-2024)

Research project on retrieving the programming language artifacts most relevant to a natural language query from a pool of candidates and ranking them by relevance, using dual encoder models. The model is built in Python using TensorFlow and Keras. It showed an average improvement of 10.03% over state-of-the-art methods in terms of MRR scores. The research was conducted under the guidance of Dr. Zhe Yu. This work has been published in the Empirical Software Engineering (EMSE) journal and presented at FSE 2025 in the journal-first track.

https://github.com/hil-se/CodeSearch
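For readers unfamiliar with the MRR (mean reciprocal rank) metric reported for the code search work, the following is a minimal illustrative sketch of how it is commonly computed. The function name and the toy rankings are hypothetical and are not taken from the CodeSearch repository.

```python
def mean_reciprocal_rank(ranked_results, relevant_items):
    """Average over queries of 1/rank of the relevant item (0 if it is missing)."""
    total = 0.0
    for ranking, relevant in zip(ranked_results, relevant_items):
        for position, item in enumerate(ranking, start=1):
            if item == relevant:
                total += 1.0 / position
                break
    return total / len(ranked_results)


# Hypothetical example: three queries, each with a ranked list of snippet ids.
# The correct snippet "a" is ranked 1st, 2nd and 3rd respectively.
rankings = [["a", "b", "c"], ["b", "a", "c"], ["c", "b", "a"]]
targets = ["a", "a", "a"]
mrr = mean_reciprocal_rank(rankings, targets)
print(round(mrr, 4))
```

A higher MRR means the correct snippet tends to appear nearer the top of the ranked list.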

Comparative Learning (2023-2026)

Research project on learning from comparative judgments for Agile story point estimation through machine learning and human subject experiments. The machine learning experiments involved building a model that learns from pairwise story point data and ranks the items. These experiments used GPT2, SBERT and FastText language models as well as traditional machine learning methods. The framework was built using TensorFlow. The proposed model showed an average increase of 21.84% in Spearman’s rank correlation coefficient scores over state-of-the-art models. The research has been conducted under the guidance of Dr. Zhe Yu.

https://github.com/hil-se/EfficientSPEComparativeLearning
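As a brief illustration of the Spearman's rank correlation coefficient used to evaluate the story point estimation work, here is a minimal sketch. It assumes there are no tied values, and the function names and toy data are hypothetical rather than taken from the project repository.

```python
def ranks(values):
    """Return the 1-based rank of each value (assumes no ties)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, index in enumerate(order, start=1):
        result[index] = rank
    return result


def spearman_rho(predicted, actual):
    """Spearman's rho via the no-ties formula 1 - 6*sum(d^2) / (n*(n^2 - 1))."""
    n = len(predicted)
    d2 = sum((p - a) ** 2 for p, a in zip(ranks(predicted), ranks(actual)))
    return 1 - 6 * d2 / (n * (n * n - 1))


# Hypothetical example: model scores versus true story points for four items.
predicted_scores = [0.2, 0.9, 0.4, 0.7]
true_story_points = [1, 8, 5, 3]
rho = spearman_rho(predicted_scores, true_story_points)
print(round(rho, 2))
```

Because only the ordering matters, a model trained on pairwise comparisons can be scored this way without predicting the story point values themselves.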

Explainable image classification (2024)

Image processing and Explainable AI-based research project focusing on explaining a pre-trained VGG model's classification decisions on face image data. This work involved fine-tuning a pre-trained VGG model on SCUT face image data for classification, and using the model's gradients on the images to explain why the model made those decisions.

Comparative learning for face image attractiveness (2024)

Research project focused on modeling comparative learning on face image data. This work involved using the comparative judgment framework with a pre-trained VGG model as the encoder to predict a ranked preference order for the images.

Comparative learning for image captioning (2024 - 2026)

Research project focused on modeling comparative learning on image and associated caption data. This work involved using the comparative judgment framework on this multi-modal data to predict whether a paired image and text caption are likely to be connected.

Outdated comment detection for repository commits (2024 - 2025)

Research project focused on detecting whether the comments associated with repository commits are still up to date or have become outdated after new commits. This work involved the use of various deep learning architectures, including dual encoders.

Modeling Art Evaluations from Comparative Judgments (2024 - 2026)

Research project focused on modeling comparative learning on image data. This work involved using the comparative judgment framework on image-based data to evaluate both direct and comparative judgments.

Bangla Abstractive Text Summarization using Encoder-Decoder Model (2019-2020)

A research project on constructing a dataset for abstractive text summarization in Bangla, and building a deep learning-based model capable of using that dataset. The model was written in Python using TensorFlow. The research was conducted as the final-year research project at University of Dhaka under the supervision of Dr. Muhammad Asif Hossain Khan.

https://github.com/monoshizmkhan/Bangla-Abstractive-Text-Summarization

Mentorship & Supervision Experience
  • Graduate mentor, REU Site: Trustworthy AI Workshop 2025
    Faculty member: Dr. Zhe Yu

    Mentored a visiting student on a research project focused on outdated comment detection in repository commits. Mentorship responsibilities included:

    1. Guiding the overall direction of the research project
    2. Contributing to experimental design and methodology
    3. Providing detailed feedback on literature review, experiment execution, and report writing
  • Informal Graduate Mentor, Human-in-the-Loop Software Engineering Lab, RIT
    Mentored two Master’s students on thesis projects involving comparative learning on image-caption data using multi-modal inputs. Support included:

    1. Guiding overall research direction and experiment planning
    2. Offering technical assistance (e.g., code-level help and architecture design)
    3. Providing regular feedback in biweekly meetings
    4. Supporting extension of this work into a peer-reviewed article submission
Other Past Projects

Personal projects

  • MLOps and Data Pipeline Projects (2025) Small toy project to learn and brush up on several tools, including MLflow, Airflow and PySpark. https://github.com/monoshizmkhan/BostonToyProjects/
  • LLM and RAG Projects (2025) Small toy project to learn fine-tuning an LLM (GPT2) and implementing retrieval-augmented generation (RAG). Planned future parts of this project include using LangChain modules. https://github.com/monoshizmkhan/LLM-Experiments/

Course projects

  • Kabaddi (2016) A single or multiplayer video game based on the sport of the same name. Written in C++ as the Fundamentals of Programming Lab project at University of Dhaka. https://github.com/monoshizmkhan/Kabaddi
  • Trapped (2017) A single-player or online multiplayer interactive puzzle game. Written in Java as the Object Oriented Programming Lab project at University of Dhaka. https://github.com/monoshizmkhan/Trapped
  • Musyc (2018) A music-based social networking application with a built-in offline music player for the Android platform. Made using Java and SQLite as the Application Development Lab project at University of Dhaka. https://github.com/monoshizmkhan/Musyc
  • EasyML (2018) A web-based application for applying and visualizing several machine learning algorithms on datasets. Written using Python and JavaScript as the Software Engineering Lab project at University of Dhaka. https://github.com/Saad-Mahmud/EasyML
  • Pharmassistant (2018) A software interface that lets pharmacy employees manage online product inventory, search, sales and finances. Written in Python and JavaScript as the Software Design Patterns Lab project at University of Dhaka. https://github.com/HHMoon13/Pharmassist
  • CSEDU Project Hub (2019) A web application for the purpose of storing, sharing and viewing undergrad research projects. Written in Python and JavaScript as the Internet Programming Lab project at University of Dhaka.
  • BackPack (2022) An e-store for selling hiking, camping and miscellaneous equipment and equipment collections (known on the store as backpacks). Written using Java Spring and Angular frameworks as the Foundations of Software Engineering course project at Rochester Institute of Technology.
Technical Strengths
  • Programming languages: Python, Java, R, C, C++, JavaScript
  • Machine Learning & AI: TensorFlow, Keras, PyTorch, scikit-learn, LLM fine-tuning, RAG
  • MLOps & Data Engineering: MLflow, Airflow, PySpark, Docker
  • Frameworks & Databases: Flask, Spring, Angular, SQL (Oracle, SQLite), NoSQL (MongoDB)
  • Tools & Methodologies: Git (GitHub, Azure DevOps), LaTeX, Agile, Scrum
Scholarships
  • DAAD RISE Professional Program 2024
Volunteering Experience
  • Student Volunteer, ACM FSE 2025 Conference

    Trondheim, Norway

    June 23-25, 2025

    Assisted with session logistics including AV setup, attendee support, and slide coordination.