AI Researcher - Software Engineer
I am Hassan, currently working as an AI Researcher at the German Research Center for Artificial Intelligence (DFKI) in Berlin. I hold a Master’s degree in Computer Science from Saarland University, where I graduated with distinction. During my Master's studies, I gained industry experience at prominent organizations, including Bosch Center for AI (BCAI) in Germany, Amazon EU in Luxembourg, and the Max Planck Institute for Informatics, where I served as a Research Assistant.
My research focuses on Large Language Models (LLMs) and Generative AI. I specialize in developing cutting-edge LLM-based applications, from research prototypes to applied demonstrations. One of my key projects involved creating a chatbot for graduate students, designed to enhance their understanding of university courses. This work resulted in two published papers. In addition, I have contributed to improving the user experience of AI-based phone assistants by leveraging the capabilities of LLMs.
My Master's thesis, completed in collaboration with the NLP and Semantic Reasoning group at Bosch Center for AI, focused on Cross-Domain Neural Entity Linking. I investigated a Transformer-based model to help facilitate domain extension by identifying the best data for fine-tuning across different knowledge bases.
I am passionate about researching AI models, experimenting with their trade-offs, and applying the most suitable ones to real-world use cases that can impact millions of users and fellow researchers. Outside of work, I enjoy graphic design, fitness, philosophy, and exploring new cultures through travel and group activities.
Updates
2024
[09-2024] Presented my publication titled "Scalable Mentoring Support with a Large Language Model Chatbot" at the ECTEL'24 conference and won the 2nd Best Demo Paper award.
[09-2024] Presented my publication titled "Generative KI zur Lernenbegleitung in den Bildungswissenschaften: Implementierung eines LLM-basierten Chatbots im Lehramtsstudium" at the DELFI'24 conference.
[07-2024] Presented my publication titled "Using Large Language Models for Adaptive Dialogue Management in Digital Telephone Assistants" at the HAAPIE workshop of the UMAP'24 conference.
[07-2024] Attended the UMAP'24 (User Modeling, Adaptation and Personalization) conference in Cagliari, Italy, including several keynotes and workshops.
[05-2024] Our demo paper at DFKI, "Scalable Mentoring Support with a Large Language Model Chatbot", was accepted for publication at the ECTEL'24 conference in September.
[05-2024] My work on chatbot design at DFKI, "Generative KI zur Lernenbegleitung in den Bildungswissenschaften: Implementierung eines LLM-basierten Chatbots im Lehramtsstudium", was accepted for publication at the DELFI'24 conference in September.
[04-2024] My work at DFKI on adapting dialogue utterances to the user's context, "Using Large Language Models for Adaptive Dialogue Management in Digital Telephone Assistants", was accepted for publication at the HAAPIE workshop of the UMAP'24 conference.
[03-2024] Presented recent advancements in chatbot design for university-level courses during my work at DFKI, in collaboration with our partners from the University of Leipzig.
Interests
Natural Language Processing (NLP)
Large Language Models (LLMs)
Conversational AI & Chatbot Design
Retrieval Augmented Generation
Neural Information Extraction
Education
Saarland Informatics Campus, Saarland University (Saarbrücken, Germany)
Master of Science in Computer Science, 2022
GPA: 1.40 (German scale; 1.00 is the highest grade)
Faculty of Engineering, Alexandria University (Alexandria, Egypt)
Bachelor of Science in Computer and Communication Engineering, 2018
GPA: 3.96/4.00
Awards
ECTEL: Second Best Demo Paper, 2024 (Krems, Austria)
Alexandria University: First Class Honors Degree, 2018 (Alexandria, Egypt)
Skills
Fields: Machine Learning, Deep Learning, Natural Language Processing (NLP), Natural Language Understanding, Large-Scale Language Modeling, Generative AI, Conversational AI, Dialogue Systems, Neural Machine Translation, Information Extraction, Data Analysis, Data Visualization, Distributed Computing.
Technologies and Libraries: LangChain, LangGraph, LangSmith, LlamaIndex, HuggingFace, Transformers, Scikit-Learn, Keras, PyTorch, TensorFlow, Pandas, NumPy, Matplotlib, Leaflet, Docker, Kubernetes, Airflow, Git, Jira.
Programming and Databases: Python, R, Java, C/C++, Java Spring, Angular.js, SQL, MongoDB, Redshift, DynamoDB.
Languages: Native Arabic, Fluent English, Intermediate German.
Publications
[09-2024] Hassan Soliman, Miloš Kravčík, Alexander Tobias Neumann, Yue Yin, Norbert Pengel and Maike Haag. 2024. Scalable Mentoring Support with a Large Language Model Chatbot. Technology Enhanced Learning for Inclusive and Equitable Quality Education (ECTEL), September 16–20, 2024, Krems, Austria, 6 pages.
https://doi.org/10.1007/978-3-031-72312-4_37
[09-2024] Hassan Soliman, Miloš Kravčík, Alexander Tobias Neumann, Yue Yin, Norbert Pengel, Maike Haag and Heinz-Werner Wollersheim. 2024. Generative KI zur Lernenbegleitung in den Bildungswissenschaften: Implementierung eines LLM-basierten Chatbots im Lehramtsstudium. 22. Fachtagung Bildungstechnologien (DELFI), September 9-11, 2024, Fulda, Germany, 7 pages.
https://doi.org/10.18420/delfi2024_15
[07-2024] Hassan Soliman, Miloš Kravčík, Nagasandeepa Basvoju, and Patrick Jähnichen. 2024. Using Large Language Models for Adaptive Dialogue Management in Digital Telephone Assistants. In Adjunct Proceedings of the 32nd ACM Conference on User Modeling, Adaptation and Personalization (UMAP Adjunct ’24), July 1–4, 2024, Cagliari, Italy. ACM, New York, NY, USA, 12 pages.
https://doi.org/10.1145/3631700.3664902
[05-2022] Hassan Soliman, Heike Adel, Mohamed H. Gad-Elrab, Dragan Milchevski, and Jannik Strötgen. 2022. A Study on Entity Linking Across Domains: Which Data is Best for Fine-Tuning?. In Proceedings of the 7th Workshop on Representation Learning for NLP, ACL, 184–190, Dublin, Ireland. https://aclanthology.org/2022.repl4nlp-1.19
Preprints
[01-2021] Effective General-Domain Data Inclusion for Machine Translation by Vanilla Transformers.
https://arxiv.org/abs/2209.14073
Built and trained a Transformer from scratch on the WMT'13 German-English translation task.
Utilized a general-domain dataset of IWSLT'16 TED talks to improve the Transformer model's performance, achieving a 25.8 BLEU score.
[08-2019] Offensive Language Detection & Classification on Twitter.
https://arxiv.org/abs/2209.14091
Trained an SVM classifier to detect offensive tweets from Twitter, after iterative feature and model experiments.
Achieved a binary accuracy of 74% in classifying offensive tweets, the highest score among all participating teams.
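A minimal sketch of the kind of baseline this describes: TF-IDF features feeding a linear SVM. The toy tweets and labels below are purely illustrative, not the shared-task data or the actual feature set.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

# Tiny illustrative dataset (hypothetical examples, not the competition data)
tweets = ["you are awesome", "have a great day",
          "you are terrible", "awful horrible person"]
labels = ["not_offensive", "not_offensive", "offensive", "offensive"]

# TF-IDF unigrams/bigrams feeding a linear SVM, a common text-classification baseline
model = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), LinearSVC())
model.fit(tweets, labels)
pred = model.predict(["awful terrible person"])[0]
```

In practice the iterative experiments would sweep n-gram ranges, class weights, and the SVM regularization parameter on a held-out split.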
[06-2019] Data Augmentation using Feature Generation for Volumetric Medical Images.
https://arxiv.org/abs/2209.14097
Proposed using U-Net and ACGAN as a learning framework for feature generation on medical images of two complex types of brain tumors.
Deployed a classifier pipeline to test & validate the quality of the generated features.
Experience
German Research Center for AI (DFKI)
AI Researcher
Jan 2023 - Present
Led two projects in the Educational Technology lab, managing technical implementation and supervising two students.
Developed a chatbot for a graduate-level course that answered student queries with 87% accuracy. One of the two papers published on the project was nominated for the Best Demo Award at ECTEL 2024.
Applied advanced Retrieval-Augmented Generation (RAG) techniques, including Hybrid Ensemble Search and Reranking Mechanism, to enhance chatbot interactions and improve the retrieval of course materials.
Supported mentoring-style conversations by leveraging flexible agentic workflows with LangGraph, utilizing multiple small open-source models hosted on Azure, and using databases to track and monitor usage.
Implemented a sub-module for adaptive dialogue systems, customizing responses based on user emotional state and demographics, and benchmarking performance using OpenAI LLMs and open-source models.
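Hybrid ensemble search typically merges a keyword-based ranking with a vector-based ranking before reranking. A minimal sketch of one common fusion strategy, reciprocal rank fusion; the function name and toy document IDs are illustrative, not the project's NDA-covered code:

```python
from collections import defaultdict

def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists (e.g., keyword hits and vector hits)
    into a single ranking. Each input list holds doc IDs, best first."""
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking):
            # Documents near the top of any list accumulate the most score
            scores[doc_id] += 1.0 / (k + rank + 1)
    return sorted(scores, key=scores.get, reverse=True)

# Toy example: the two retrievers disagree on ordering
keyword_hits = ["doc_a", "doc_b", "doc_c"]
vector_hits = ["doc_b", "doc_d", "doc_a"]
fused = reciprocal_rank_fusion([keyword_hits, vector_hits])
```

A cross-encoder reranker would then rescore only the top few fused candidates, which keeps the expensive model off the long tail.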
Berlin, Germany
Bosch Center for Artificial Intelligence (BCAI)
Applied Scientist Intern
May 2022 - Aug 2022
Contributed to the NLP & Semantic Reasoning group, applying findings from my master's thesis on Neural Entity Linking to a high-impact industrial project using real data at Bosch.
Refactored, tested, and documented production-level code for machine learning models, ensuring scalability and efficiency for real-world deployment, leveraging the in-house GPU cluster for model fine-tuning.
Trained and evaluated machine learning models on a large-scale domain-specific dataset, achieving 77% end-to-end recall for top-3 entity predictions, outperforming existing models.
Renningen, Germany
Bosch Center for Artificial Intelligence (BCAI)
Master's Thesis
Student
Jun 2021 - Jan 2022
Joined the NLP & Semantic Reasoning group and worked on a unified system for linking named entities to general-domain (Wikipedia) and domain-specific knowledge bases (KBs), using context-aware embeddings (BERT) to learn a joint vector space. A preprint of the thesis is available on arXiv: https://arxiv.org/abs/2210.15616.
Optimized a state-of-the-art model for cross-domain applications, supporting domain extension and identifying optimal data sources for fine-tuning, and improved GPU memory utilization for efficient embedding calculations.
Achieved a 9% increase in Average Precision for the top-1 entity and a 20% gain in Mean Average Precision (MAP) for top-10 entity linking across four domain-specific KBs, resulting in a workshop publication at ACL 2022.
Renningen, Germany
Max Planck Institute for Informatics (MPII)
Research Assistant
Nov 2020 - May 2021
Developed a model prototype within the Database & Information Systems group to identify diverse peer groups for entities, contributing to advanced set expansion techniques.
Implemented a baseline model for entity set expansion, leveraging Wikipedia lists as a knowledge source to enhance the model’s accuracy and comprehensiveness in the expanded sets.
Optimized the algorithm’s performance by achieving a 3x faster runtime using efficient sparse matrix multiplication techniques, significantly improving computational efficiency.
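The 3x speedup came from exploiting sparsity. A sketch of the idea using SciPy (toy dimensions and a random membership-style matrix stand in for the actual entity data): multiplying in a compressed sparse format skips the zero entries that dominate such matrices, while producing the same result as the dense product.

```python
import numpy as np
from scipy import sparse

rng = np.random.default_rng(0)
# Toy stand-in for an entity-membership matrix: mostly zeros,
# like entity-to-Wikipedia-list membership indicators
dense = rng.random((200, 300))
dense[dense < 0.99] = 0.0                 # keep only ~1% of entries as non-zero

csr = sparse.csr_matrix(dense)            # compressed storage of the non-zeros
overlap_sparse = (csr @ csr.T).toarray()  # pairwise co-membership scores
overlap_dense = dense @ dense.T           # identical result, computed densely
```

On real data, the denser the matrix, the smaller the gain; the win here depends on the membership matrix being overwhelmingly zero.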
Saarbrücken, Germany
Amazon
Software Development Engineer Intern
Aug 2019 - Feb 2020
Maintained a web-based simulation tool for the Fulfillment Acceleration team using the AWS cloud platform, working as a full-stack software engineer.
Enhanced delivery-speed simulations for Prime customers, contributing to a successful report on fulfillment operations and improving delivery efficiency.
Collaborated as a system administrator in an Agile environment, managing server infrastructure and providing technical support for team tools.
Luxembourg, Luxembourg
Theses
Cross-Domain Neural Entity Linking
Worked on a unified system for linking named entities to both general-domain (Wikipedia) and domain-specific knowledge bases (KBs).
Improved semantic search using context-aware embeddings (BERT) to learn a joint vector space for KBs from different domains.
Achieved a 20% gain in Mean Average Precision (MAP) for top-10 entity linking across four domain-specific KBs.
Co-authored an invention report based on thesis work, leading to a US patent and receiving an Incentive-Prämie.
A pre-print of this thesis is available on arXiv: https://arxiv.org/abs/2210.15616.
Masters'22
Egyptian Car License Plate Information Detection
Implemented an application that extracts license information from car images in Egypt, covering the different product life-cycle stages.
Collected datasets of various kinds of Egyptian car plates under various conditions, and applied different data transformation (ETL) techniques.
Fine-tuned pre-trained CNN models for object detection, localization, semantic segmentation, and OCR of the plate letters & numbers.
Bachelors'18
Projects
Scalable Mentoring Support with a LLM Chatbot
Designed and implemented an LLM-based agent chatbot to provide scalable educational support and timely feedback to students of education sciences, demonstrating the significant potential of generative AI in education.
Utilized advanced Retrieval-Augmented Generation (RAG) techniques, e.g., Hybrid Ensemble Search and a Reranking Mechanism, enabling the chatbot to retrieve and analyze course materials effectively.
The code is subject to a Non-Disclosure Agreement.
Using LLMs for Adaptive Dialogue Management
Adapted user-directed utterances with LLMs based on user parameters such as gender, age, and sentiment, aiming to optimize user satisfaction in conversational AI systems, with a focus on patient-practice interactions in healthcare.
Evaluated different LLMs and open-source tools for utterance adaptation in terms of speed, cost-effectiveness, and quality of the generated text, judged by adaptation relevancy and adequacy.
The code is subject to a Non-Disclosure Agreement.
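Since the code is under NDA, here is only a hypothetical sketch of the general pattern: build an adaptation prompt from the user's parameters and hand it to an LLM. The template wording and function name are illustrative assumptions, not the project's actual prompts.

```python
def build_adaptation_prompt(utterance, age, gender, sentiment):
    """Illustrative prompt template: ask an LLM to rephrase a system
    utterance for the caller's context (wording is hypothetical)."""
    return (
        "Rewrite the following phone-assistant utterance for a "
        f"{age}-year-old {gender} caller whose current sentiment is {sentiment}. "
        "Keep the meaning identical and the tone appropriate.\n"
        f"Utterance: {utterance}"
    )

prompt = build_adaptation_prompt(
    "Your appointment is confirmed for Monday at 9 am.",
    age=72, gender="female", sentiment="frustrated",
)
# `prompt` would then be sent to the chosen LLM for the rewritten utterance
```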
Information Extraction Pipeline in Medical Text
Extracted symptoms from German-language medical text (prescriptions) based on a set of doctor-predefined symptoms and their synonyms.
Additionally extracted other symptoms using a comprehensive symptom ontology provided by the German Ministry of Health.
The code is subject to a Non-Disclosure Agreement.
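The synonym-matching step can be sketched as a dictionary lookup that maps surface variants back to canonical symptoms. The two entries below are made-up examples, not the doctors' actual list or the ministry ontology:

```python
# Hypothetical canonical symptoms mapped to their surface variants
symptom_synonyms = {
    "Kopfschmerzen": ["Kopfschmerz", "Cephalgie", "Kopfweh"],
    "Fieber": ["erhöhte Temperatur", "febril"],
}

def extract_symptoms(text, synonyms):
    """Return the canonical symptoms whose name or any synonym
    occurs in the text (simple case-insensitive substring match)."""
    text_lower = text.lower()
    found = set()
    for canonical, variants in synonyms.items():
        for term in [canonical] + variants:
            if term.lower() in text_lower:
                found.add(canonical)
    return found

found = extract_symptoms(
    "Patient klagt über Kopfweh und erhöhte Temperatur.", symptom_synonyms
)
```

A production pipeline would replace the substring test with proper tokenization and lemmatization, since German compounds and inflections defeat naive matching.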
Better Diet to fight COVID-19
Analyzed food-consumption data from all countries to investigate the relationship between a country's food culture and its COVID-19 recovery rate, using data analytics.
Implemented three Random Forest models, benchmarked the results to report the best one, and visualized the findings using the Seaborn library.
The code is published on: https://github.com/HassanMahmoudd/COVID-19-Diet
SmolLM: Implementing, Fine-Tuning, and Aligning a LLM for Grammatical Error Correction
Implemented the SmolLM-135M (by HuggingFace) language model architecture, including Rotary Positional Embeddings, a KV Cache, Grouped-Query Attention, RMS Normalization, and SwiGLU Activation.
Fine-tuned the model on the Grammatical Error Correction (GEC) task using the Grammarly CoEdIT dataset.
Applied RLAIF through Direct Preference Optimization (DPO) to align model outputs with desired corrections.
Created a Colab notebook to guide users through implementation, fine-tuning, and evaluation processes.
Achieved significant improvements in grammatical error correction accuracy, reaching a BLEU score of ∼0.48.
Leveraged Python libraries such as PyTorch, Transformers, Datasets, and TRL to build and train the model effectively.
The code is published on: https://github.com/HassanMahmoudd/SmolLM_RL.
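Two of the listed components are compact enough to sketch in NumPy; this is a simplified illustration of the math, not the repository's PyTorch code, and the toy shapes are arbitrary:

```python
import numpy as np

def rms_norm(x, weight, eps=1e-6):
    # RMSNorm: rescale by the root mean square of the features
    # (no mean centering, unlike LayerNorm)
    rms = np.sqrt(np.mean(x ** 2, axis=-1, keepdims=True) + eps)
    return x / rms * weight

def swiglu_ffn(x, w_gate, w_up, w_down):
    # SwiGLU feed-forward block: a SiLU-gated linear unit,
    # as used in LLaMA-style transformer MLPs
    silu = lambda z: z / (1.0 + np.exp(-z))
    return (silu(x @ w_gate) * (x @ w_up)) @ w_down

rng = np.random.default_rng(1)
hidden, inner = 8, 16                     # toy dimensions for illustration
x = rng.standard_normal((2, hidden))
normed = rms_norm(x, np.ones(hidden))     # per-row RMS becomes ~1
out = swiglu_ffn(normed,
                 rng.standard_normal((hidden, inner)),
                 rng.standard_normal((hidden, inner)),
                 rng.standard_normal((inner, hidden)))
```

In the real model the `weight` vector and the three projection matrices are learned parameters, and the block sits inside each transformer layer with a residual connection.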
LinguaLexMatch: Enhanced Document Language Detection
Developed and evaluated three language detection models, including an Embedding-Based approach, a TF-IDF-based Multinomial Naive Bayes model, and a fine-tuned Transformer-Based methodology.
Implemented an embedding-based approach using the intfloat/multilingual-e5-large-instruct model by generating a representative embedding for each language and classifying documents based on cosine similarity.
Benchmarked models on the papluca/language-identification dataset, achieving 99.81% accuracy with the embedding-based model.
Analyzed performance metrics such as Accuracy, F1 Scores, and Confusion Matrices across 20 different languages.
Developed a Colab notebook for replicable implementation and evaluation of different language detection models.
Utilized Python libraries including Datasets, Transformers, and Scikit-learn for model development and evaluation.
The code is published on: https://github.com/HassanMahmoudd/LinguaLex_Match.
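The embedding-based approach reduces to nearest-centroid classification under cosine similarity. A self-contained sketch with tiny made-up vectors in place of the multilingual-e5 embeddings the project actually uses:

```python
import numpy as np

def classify_by_centroid(doc_vec, centroids):
    """Assign the language whose representative embedding has the highest
    cosine similarity to the document embedding. Vectors here are toy
    3-d stand-ins for real multilingual sentence embeddings."""
    def cos(a, b):
        return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))
    return max(centroids, key=lambda lang: cos(doc_vec, centroids[lang]))

# Hypothetical per-language centroid embeddings
centroids = {
    "en": np.array([0.9, 0.1, 0.0]),
    "de": np.array([0.1, 0.9, 0.2]),
}
doc = np.array([0.8, 0.2, 0.1])   # embedding of the document to classify
lang = classify_by_centroid(doc, centroids)
```

In the full system each centroid is the mean of embeddings of training documents in that language, which is what makes a single cosine comparison per language sufficient at inference time.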