About Me

I build intelligent systems that solve real problems. With expertise in Computer Vision, NLP, LLMs and generative AI, I develop scalable solutions that deliver measurable impact.

My approach goes beyond writing code: I design intelligent, intuitive experiences that push boundaries and explore new possibilities.

As a quick learner and creative problem-solver, I thrive when facing complex challenges and enjoy transforming ambitious ideas into reality. Ready to collaborate on something revolutionary?

Tech Stack

Docker
DBeaver
Lens
SearXNG
Amazon S3
Amazon RDS
Azure
SQLAlchemy
MySQL
PostgreSQL
MongoDB
Neo4j
Redis
Chroma
Qdrant
Milvus
Weaviate
Python
Pytest
PyTorch
Pydantic
TensorFlow
Locust
Prometheus
Grafana
Airflow
Hugging Face
Ollama
NVIDIA Triton
vLLM
LangChain
LangGraph
Langfuse
Model Context Protocol
Parlant
Google ADK
CrewAI
LlamaIndex
Dify
ComfyUI
YOLO
PaddlePaddle
FastAPI
gRPC
Streamlit
Gradio
Git
GitHub
GitLab
Git LFS
Slack
Jira

Experiences

AI Engineer
VinSmart Future - VinGroup
Jul 2025 - Present
Ho Chi Minh City, Vietnam
  • Engineered a universal search and recommendation engine for the Vin super-app based on query analysis and behavioral profiling, achieving sub-second latency while significantly enhancing search relevance and user engagement.
  • Optimized query understanding by fine-tuning Qwen-1.7B for semantic rewriting (abbreviation expansion, ambiguity correction) and deploying MLP models for intent classification, ensuring precise merchant/template detection.
  • Architected a scalable multi-agent chatbot system for the VinGroup ecosystem with caching, guardrails, and orchestration layers, capable of handling 1,000 concurrent users with <5s response time through seamless Router-RAG coordination.
  • Built a hybrid knowledge system combining document indexing with a knowledge graph, using Qdrant for vector storage and MongoDB for graph relationships, enabling contextual understanding and multi-hop reasoning.
  • Implemented semantic caching for RAG to reduce latency and API costs by retrieving cached responses for semantically similar queries.
  • Deployed LLM serving infrastructure using vLLM for high-throughput inference, supporting both indexing and real-time query processing.
  • Engineered a comprehensive prompt and context management system with custom tools, memory persistence, and RAG pipeline integration.
  • Developed a comprehensive evaluation framework for hallucination detection and fact-checking to ensure response accuracy and reliability.
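To illustrate the semantic caching idea mentioned above, here is a minimal sketch rather than the production implementation. The bag-of-words `toy_embed` function, the 0.8 similarity threshold, and the names `SemanticCache` and `answer_with_cache` are all assumptions made for this example; a real deployment would likely use a learned embedding model and a vector store.

```python
import math
from collections import Counter
from typing import Callable, Optional

Vector = dict[str, float]


def toy_embed(text: str) -> Vector:
    """Hypothetical stand-in embedding: lowercase bag-of-words counts."""
    return dict(Counter(text.lower().split()))


def cosine(a: Vector, b: Vector) -> float:
    """Cosine similarity between two sparse vectors."""
    dot = sum(value * b.get(key, 0.0) for key, value in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class SemanticCache:
    """Returns a stored answer when a new query is similar enough to an old one."""

    def __init__(self, embed: Callable[[str], Vector] = toy_embed,
                 threshold: float = 0.8) -> None:
        self.embed = embed
        self.threshold = threshold  # assumed value; tune per application
        self.entries: list[tuple[Vector, str]] = []

    def get(self, query: str) -> Optional[str]:
        query_vec = self.embed(query)
        best_score, best_answer = 0.0, None
        # Linear scan for clarity; a vector index would replace this at scale.
        for vec, answer in self.entries:
            score = cosine(query_vec, vec)
            if score > best_score:
                best_score, best_answer = score, answer
        return best_answer if best_score >= self.threshold else None

    def put(self, query: str, answer: str) -> None:
        self.entries.append((self.embed(query), answer))


def answer_with_cache(query: str, cache: SemanticCache,
                      rag_pipeline: Callable[[str], str]) -> str:
    """Serve from the cache on a hit; otherwise run the RAG pipeline and store."""
    cached = cache.get(query)
    if cached is not None:
        return cached
    answer = rag_pipeline(query)
    cache.put(query, answer)
    return answer
```

With this sketch, a second query that differs from a cached one by a word or two is served from the cache, so the expensive pipeline call is skipped.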
Tech Stack
Python
Pydantic
Pytest
Locust
Langfuse
Prometheus
Grafana
Airflow
Hugging Face
vLLM
FastAPI
gRPC
Docker
DBeaver
SQLAlchemy
PostgreSQL
MongoDB
Redis
Qdrant
Git
Azure
AI Engineer
Inspire Lab Technology
Mar 2024 - Apr 2025
Ho Chi Minh City, Vietnam
  • Architected and built from scratch the core infrastructure of an enterprise-level Agentic AI system for automated SEO content generation. The system was regularly validated by SEO professionals for quality, reducing costs and cutting content production time by 80%.
  • Developed an Agentic RAG system that enhances information retrieval capabilities and fine-tuned prompts to maximize language model performance. Built a comprehensive framework for evaluating RAG results, achieving over 90% accuracy on our custom evaluation dataset.
  • Fine-tuned various LLMs (Llama 3.1, Qwen 2.5, etc.), resulting in a 35% improvement in content quality based on professional feedback.
  • Developed pipelines for automated image generation and editing to complement textual content.
  • Implemented CI/CD workflows that reduced deployment time from 3 hours to 20 minutes, enabling bi-weekly feature releases.
  • Presented complex analytical findings and optimization recommendations to management in clear, actionable terms.
Tech Stack
Python
Pydantic
Hugging Face
Ollama
LangChain
LangGraph
CrewAI
ComfyUI
Gradio
FastAPI
Docker
DBeaver
Lens
SearXNG
Amazon S3
Amazon RDS
SQLAlchemy
MySQL
PostgreSQL
Chroma
Qdrant
Git
AI Engineer Intern
ISODS George Washington Institute of Data Science & Artificial Intelligence
Oct 2023 - Dec 2023
Remote
  • Developed a specialized approach to detect abnormal head movements and positioning, reaching ~83% precision on synthetic data.
  • Created ear-related behavior detection capabilities to identify prohibited actions such as wearing earbuds or headphones, achieving ~91% precision on synthetic data and ~89% on collected real-world data.
6DRepNet, Euler Angle, Ear Landmark, Image Processing
Tech Stack
Python
OpenCV
YOLO
PyTorch

Awards & Achievements

Best Methodology Report of Surgical Tool Detection at MICCAI 2024

Oct 2024
MICCAI 2024

Honored to receive the Best Methodology Report award for our work on Surgical Tool Detection at MICCAI 2024 (27th International Conference on Medical Image Computing and Computer Assisted Intervention), one of the premier conferences in medical imaging and computer-assisted interventions.

Top 2 of Surgical Tool Detection at MICCAI 2024

Oct 2024
MICCAI 2024

Honored to receive 2nd place recognition for our work on Surgical Tool Detection at MICCAI 2024 (27th International Conference on Medical Image Computing and Computer Assisted Intervention), one of the premier conferences in medical imaging and computer-assisted interventions.

Top 5 of Surgical Task Recognition at MICCAI 2024

Oct 2024
MICCAI 2024

Honored to receive 5th place recognition for our work on Surgical Task Recognition at MICCAI 2024 (27th International Conference on Medical Image Computing and Computer Assisted Intervention), one of the premier conferences in medical imaging and computer-assisted interventions.

High Distinction - George Washington Fellowship in Computer Vision and AI Proctoring

Oct 2023
ISODS George Washington Institute of Data Science & Artificial Intelligence

Awarded by the International Society of Data Scientists (ISODS) for completing the Summer 2023 Practicum Program with High Distinction. Specialized in Computer Vision with a focus on creating functions for AI Proctoring Applications. Supervised by Drs. Christopher Do, Thanh Ha, Dinh-Lam Pham, and Tran-Minh-Khuong Vu at the George Washington Institute of Data Science & Artificial Intelligence.

Publications

Evolving Prompts for Synthetic Image Generation with Genetic Algorithm

Khoi Dinh Tran, Dat Viet Bui, Ngoc Hoang Luong

MAPR 2023

Oct 2023

This paper enhances the EvoGen framework by implementing an improved genetic algorithm with an elitism mechanism that preserves high-performing prompts during evolution. We introduce a cosine loss function as the fitness measure, resulting in faster convergence and better image guidance compared to previous approaches. Our modifications address the inconsistency issues in existing text-to-image prompt evolution methods, making the automatic generation of high-quality, preference-satisfying prompts more reliable.

Genetic Algorithm, Prompt Engineer, Stable Diffusion, Synthetic Image Generation, Generative AI
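The elitism mechanism and cosine-based fitness described in the abstract can be sketched roughly as follows. This is an illustrative toy, not the paper's implementation: candidates are plain numeric vectors standing in for prompt representations, fitness is cosine similarity to a target vector (equivalently, one minus a cosine loss), and the crossover and mutation operators, the function names, and all hyperparameters are assumptions.

```python
import math
import random


def cosine_fitness(candidate: list[float], target: list[float]) -> float:
    """Cosine similarity to the target; higher is better (1 - cosine loss)."""
    dot = sum(c * t for c, t in zip(candidate, target))
    norm_c = math.sqrt(sum(c * c for c in candidate))
    norm_t = math.sqrt(sum(t * t for t in target))
    return dot / (norm_c * norm_t) if norm_c and norm_t else 0.0


def evolve(target: list[float], pop_size: int = 20, n_elite: int = 2,
           generations: int = 30, mutation_scale: float = 0.3,
           seed: int = 0) -> tuple[list[float], list[float]]:
    """Run a small genetic algorithm; the target needs at least 2 dimensions."""
    rng = random.Random(seed)
    dim = len(target)
    population = [[rng.uniform(-1.0, 1.0) for _ in range(dim)]
                  for _ in range(pop_size)]
    best_per_generation = []

    for _ in range(generations):
        ranked = sorted(population,
                        key=lambda ind: cosine_fitness(ind, target),
                        reverse=True)
        best_per_generation.append(cosine_fitness(ranked[0], target))

        # Elitism: copy the top individuals into the next generation unchanged,
        # so the best fitness can never decrease.
        elites = [ind[:] for ind in ranked[:n_elite]]

        # Truncation selection: the fitter half of the population breeds.
        parents = ranked[:pop_size // 2]
        children = []
        while len(children) < pop_size - n_elite:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, dim)          # one-point crossover
            child = a[:cut] + b[cut:]
            child = [g + rng.gauss(0.0, mutation_scale) for g in child]
            children.append(child)

        population = elites + children

    best = max(population, key=lambda ind: cosine_fitness(ind, target))
    return best, best_per_generation
```

Because the elites are carried over untouched, the recorded best fitness is non-decreasing from one generation to the next, which is the stabilizing effect that elitism is meant to provide.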

Connect With Me

I'm always open to discussing new projects, creative ideas, or opportunities to be part of your vision. Feel free to reach out through any of the channels above!