About Me

Hello! I'm Mahdi Khemakhem, a Machine Learning Engineer and developer who builds scalable, AI-powered systems across diverse domains. My work spans healthcare, research, finance, and blockchain technologies, where I develop solutions that combine technical innovation with practical impact.

At the Precision Brain Health Initiative, I develop a range of systems, from information retrieval architectures to natural language processing pipelines. Beyond my current role, I've created trading systems for blockchain networks, built data platforms for large-scale research initiatives, and implemented automated workflows that significantly enhance productivity.

With a Master's in Computer Science specializing in Data-Centric Computing and a Bachelor's in Neuroscience, I bring a multidisciplinary approach to problem-solving. I'm proficient with Python, ML/DL frameworks, cloud architecture, containerization technologies, and more, allowing me to build end-to-end solutions that deliver concrete value in diverse environments.

Education

My academic background and qualifications.

Master of Science in Computer Science

Sep 2023 - May 2025 (Expected)
Boston, MA

Boston University

Specialization in Data-Centric Computing, 3.82/4.0 GPA

  • Coursework: Algorithms, Networks, Databases, Machine Learning, Deep Learning, Object-Oriented Programming, Neural Modeling, Natural Language Processing

Bachelor of Arts in Neuroscience with Honors

Sep 2017 - Sep 2020
Boston, MA

Boston University

Graduated Summa Cum Laude with a 3.95/4.0 GPA

  • Presidential Scholarship recipient
  • Dean's List all semesters

Work Experience

My professional journey and roles I've held over the years.

Machine Learning Support Engineer

Apr 2024 - Present
Boston, MA

Precision Brain Health Initiative, Boston University Medical School

Applying machine learning techniques to develop solutions in the dementia research space.

  • Academic Chatbot: Architected a retrieval-augmented generation (RAG) system that enriches LLM prompts with real scientific text and supports responses with accurate citations, improving answers to research-related questions.
  • Implemented a pipeline to crawl and embed 3M+ articles using transformers and the Pinecone vector database.
  • Deployed a Gradio interface using LLMs (OpenAI, Ollama, vLLM) to support internal research queries.
  • Literature Review Crawler: Developed a Python-based web crawler that uses LLMs to automate paper discovery and selection. Scaled it to process thousands of papers daily, supporting grant-writing efficiency for researchers.
  • Docker Microservices: Developed natural language processing (NLP) and audio processing containers that expose popular libraries (spaCy, NLTK, SpeechBrain) through abstracted REST API endpoints, reducing workflow integration time.

Research Developer

Mar 2022 - Apr 2024
Boston, MA

Davos Alzheimer's Collaborative

Developed systems to support research cohorts and team operations.

  • Real-Time Reports: Delivered real-time project management reports to leadership, using the Monday.com GraphQL API, webhooks, Azure Functions, and Power BI push datasets to maintain an up-to-date dashboard.
  • Task Classification: Implemented a real-time SVM classifier, triggered by webhooks and powered by Azure Functions, to label hundreds of tasks daily, increasing labeling compliance and improving data completeness.
  • Data Delivery: Automated participant data upload using AWS Lambda, S3, and EventBridge.

Research Assistant

Jan 2023 - Sep 2024
Boston, MA

Kolachalama Lab, Boston University

Focused on LLM development for neuroscience research.

  • NeuroLLM: Fine-tuned Falcon-40B with LoRA, improving on base-model performance by 50% on the PubMedQA benchmark.
  • Led a team of three engineers to develop an internal site hosting LLMs fine-tuned on scientific neuroscience text.

Skills & Technologies

My technical skill set and expertise across different domains.

Programming Languages

Python, Java, C++, MATLAB, JavaScript, SQL, Shell/Bash

Machine Learning & AI

PyTorch, Scikit-Learn, LLMs, Transformers, RAG Systems, Model Fine-tuning, SVM, Random Forest

Data Science

Pandas, NumPy, Data Analysis, OLS Regression, Vector Databases, Power BI

Cloud & DevOps

AWS, Azure, Google Cloud, Docker, Microservices, Git, Azure Functions, AWS Lambda

Web Development

React, Flask, REST APIs, Gradio, GraphQL, Webhooks

Domain Expertise

NLP, Neuroscience, Computational Modeling, Blockchain, Web3, Solidity, Redis