Md. Mutasim Billah Abu Noman Akanda | Document AI & ML Research

Industry track

Built production AI systems across OCR/document intelligence, voice AI, retrieval systems, and large-scale crawling. Engineering decisions prioritize measurable outcomes: throughput, latency, cost, and reliability.

Best for: AI/ML engineering roles, research engineer roles, startup applied AI positions, and teams needing end-to-end ML ownership from experimentation to deployment.

Core signal: practical systems at scale (10K+ OCR docs/day, multimillion-site pipelines, real-time voice interactions) with quantifiable optimization and delivery impact.

Document: Industry CV (PDF) · source (TeX)

Academic track

Research direction

Document AI and OCR in low-resource language settings, with emphasis on robust methods, reproducible experimentation, and practical deployment constraints.

Evidence of preparation

Peer-reviewed publications (ACM, Springer), strong undergraduate academic record (CGPA 3.96/4.00), and teaching experience in core CS/AI courses.

PhD positioning

Targeting fully funded US PhD programs (Fall 2027) with interests spanning document understanding, multimodal learning, and evaluation-oriented ML systems.

Efficient multimodal learning and edge-centric vision: VLM/OCR adaptation, quantization, and comparative computer vision evaluation (ACM SE'24 DLA).

Profile: Academic track page · Document: Academic CV (PDF) · source (TeX)

Work experience

Senior Machine Learning Engineer
Apurba Technologies Ltd. · On-site · Dhaka, Bangladesh · Sep 2025 – present

Project: Borno OCR — funded by the ICT Ministry of Bangladesh

Leading enterprise deep learning OCR for Bengali text recognition, document layout analysis, and automated PDF/image → editable DOCX.
- Architected dots.ocr backend integration for Bengali, scaling to 10,000+ documents daily.
- Optimized Qwen2.5-VL with 4-bit quantization: ~4× memory reduction and ~40% lower infrastructure cost.
- Proprietary bounding-box alignment: +25% conversion accuracy, −80% manual corrections.
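
The ~4× memory figure above matches simple weight-size arithmetic if the baseline is 16-bit weights. That baseline is an assumption, since the precision is not stated here. The sketch below is a back-of-the-envelope check only: the parameter count is hypothetical, and it ignores quantization constants, activations, and the KV cache, which is one reason the real-world figure is approximate.

```python
def weight_memory_gib(num_params: int, bits_per_weight: int) -> float:
    """Approximate memory for model weights only, in GiB."""
    return num_params * bits_per_weight / 8 / 2**30


# Hypothetical parameter count, chosen only for illustration.
NUM_PARAMS = 7_000_000_000

fp16_gib = weight_memory_gib(NUM_PARAMS, 16)  # assumed 16-bit baseline
int4_gib = weight_memory_gib(NUM_PARAMS, 4)   # 4-bit quantized weights
reduction = fp16_gib / int4_gib

print(f"16-bit: {fp16_gib:.1f} GiB, 4-bit: {int4_gib:.1f} GiB, reduction: {reduction:.1f}x")
```
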
Project: AI Interview Bot — real-time voice interview platform

Built the full interview engine and LiveKit WebRTC voice pipeline for automated AI-driven candidate interviews with session-isolated RAG and automated evaluation.
- End-to-end voice pipeline (STT/LLM/TTS): sub-3 s average turn latency; 4-tier TTS fallback for ~99% synthesis uptime.
- Session-isolated RAG: 10+ concurrent interviews with independent CV/JD context; zero cross-session data leakage.
- Automated evaluation replacing ~20 min manual review per candidate; job-level analytics across multiple roles.
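
A tiered TTS fallback like the one listed above can be sketched as an ordered list of backends tried in turn. This is a minimal illustration, not the platform's implementation: the backend functions and tier names are hypothetical stand-ins, and treating empty audio as a failure is an assumption.

```python
from typing import Callable, Sequence

TTSBackend = Callable[[str], bytes]


def synthesize_with_fallback(
    text: str, backends: Sequence[tuple[str, TTSBackend]]
) -> tuple[str, bytes]:
    """Try each backend in priority order; return (name, audio) from the first success."""
    errors: list[str] = []
    for name, backend in backends:
        try:
            audio = backend(text)
            if audio:  # assumption: empty output counts as a failure
                return name, audio
            errors.append(f"{name}: empty audio")
        except Exception as exc:  # any backend error moves on to the next tier
            errors.append(f"{name}: {exc}")
    raise RuntimeError("all TTS backends failed: " + "; ".join(errors))


# Hypothetical stand-in backends for the four tiers.
def flaky_primary(text: str) -> bytes:
    raise TimeoutError("primary timed out")


def empty_secondary(text: str) -> bytes:
    return b""


def local_tertiary(text: str) -> bytes:
    return b"AUDIO:" + text.encode("utf-8")


def silent_last_resort(text: str) -> bytes:
    return b"\x00" * 16


TIERS = [
    ("primary", flaky_primary),
    ("secondary", empty_secondary),
    ("tertiary", local_tertiary),
    ("last_resort", silent_last_resort),
]

used_tier, audio = synthesize_with_fallback("Hello", TIERS)
print(used_tier, len(audio))
```

Keeping the last tier local and dependency-free is one common way to make sure a turn always produces some audio, at the cost of voice quality.
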
Data Scientist
Pixalate Inc. · Remote · California, USA · Mar 2025 – Aug 2025

Project: Ad Intel Crawler — AI-powered ad network detection

AI-driven large-scale web crawling, automated ad network identification, and real-time media analysis for advertising intelligence.
- End-to-end detection system: 73% accuracy across 2M+ websites, 50K+ pages/day.
- Deterministic pattern matching: reduced AI dependency, $50K+/year compute savings.
- Scalable crawler orchestration: 1M+ requests/day with asynchronous processing.
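
Deterministic pattern matching of the kind mentioned above can be illustrated with a small table of regular expressions checked against page source before any model is called. The network names and domains below are hypothetical placeholders on reserved example domains, not the signatures used in the project.

```python
import re

# Hypothetical signatures; real rules would be curated per ad network.
AD_NETWORK_PATTERNS = {
    "ExampleAds": re.compile(r"https?://cdn\.example-ads\.test/[\w./-]+\.js", re.IGNORECASE),
    "SampleBid": re.compile(r"samplebid\.example/(?:prebid|tag)\b", re.IGNORECASE),
}


def detect_ad_networks(html: str) -> list[str]:
    """Return the names of all networks whose signature appears in the page source."""
    return sorted(
        name for name, pattern in AD_NETWORK_PATTERNS.items() if pattern.search(html)
    )


SAMPLE_HTML = """
<script src="https://cdn.example-ads.test/v2/loader.js"></script>
<img src="https://static.example.org/logo.png">
"""

found = detect_ad_networks(SAMPLE_HTML)
print(found)
```

Pages that match a known signature can skip model inference entirely, which is where compute savings of this kind would typically come from.
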
AI Engineer
Green Pants Studio · Remote · Texas, USA · Jun 2024 – May 2025

Project: Vendidit — AI eCommerce intelligence

Large-scale scraping, automated fair market valuation, and intelligent data mapping.
- Vendidit Scraper: +40% speed, 100K+ products/day.
- OLS regression for valuation: 92% accuracy, −60% pricing errors.
- Dockerized microservices: −50% deploy time, +35% reliability.
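
As a reminder of what OLS valuation involves, the sketch below fits a single-feature least-squares line in closed form. The feature (a reference price), the data, and the function names are hypothetical; the production model presumably used more features than this.

```python
def fit_ols_1d(xs: list[float], ys: list[float]) -> tuple[float, float]:
    """Closed-form ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Synthetic, noise-free data generated from value = 5 + 0.6 * reference_price.
reference_prices = [100.0, 150.0, 200.0, 250.0, 300.0]
observed_values = [65.0, 95.0, 125.0, 155.0, 185.0]

intercept, slope = fit_ols_1d(reference_prices, observed_values)
estimate = intercept + slope * 180.0  # valuation for an unseen reference price
print(f"intercept={intercept:.2f}, slope={slope:.2f}, estimate={estimate:.2f}")
```
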
Machine Learning Engineer
Apurba Technologies Ltd. · On-site · Dhaka, Bangladesh · Mar 2023 – May 2024

Project: Bengali OCR — ICT Ministry funded

Deep learning OCR for Bengali, layout analysis, and scene text for government digitization.
- YOLOv8 document layout: 85%+ accuracy on Bengali documents.
- 8-bit quantization: 2× faster inference, 50% less memory.
- Co-authored peer-reviewed papers at ACM SE ’24 and Springer (ICTCS / LNNS).
Teaching Assistant, Computer Science & Engineering
University of Liberal Arts Bangladesh (ULAB) · On-site · Dhaka, Bangladesh · Feb 2021 – Jan 2023

Supported core undergraduate CS courses for cohorts of 30+ students per offering.
- Courses: Introduction to Programming, Object-Oriented Programming, Artificial Intelligence, Software Engineering.
- Led lab sessions and tutorials; prepared lab materials; graded assignments and projects.
- Office hours and one-to-one mentoring, including guidance on AI/ML course projects.
President
ULAB Computer Programming Club · Dhaka, Bangladesh · 2022 – 2023

Led UCPC as President; organized 4 CSE competitive programming contests (2022–2023).
- Take Off Summer 2022: 40+ participants; lead problem setter on a 6-problem contest (CSE department co-hosted).
- Additional UCPC contests: Winter Fall 2022 (6 problems), Independence Day Spring 2023 (5 problems), Take Off Summer 2023 (6 problems).

Technical skills

Programming & development

Python (expert), TypeScript, SQL, RESTful APIs, modular service architecture

Machine learning & AI

PyTorch, TensorFlow, scikit-learn, Sentence Transformers, Hugging Face, faster-whisper, Piper (TTS), vLLM, Ollama, LLM integration, deployment, OpenCV, EAST/CRNN handwritten OCR, QAT, DataParallel

Deep learning & NLP

LangChain, RAG, ChromaDB, transformers, GPT/Qwen workflows, semantic search, prompt engineering

Backend & data

FastAPI, WebSockets, LiveKit/WebRTC, SQLite, Redis, SQLAlchemy, PostgreSQL, pgvector, Pydantic, rate limiting, HTTP microservices

Web automation & discovery

Playwright, Crawl4AI, browser-use, BeautifulSoup4, Scrapy, async crawling pipelines

Frontend & DevOps

Gradio, Next.js, React, Docker, Docker Compose, GitHub Actions, CI/CD, Pytest, Vitest

Data science & analytics

Pandas, NumPy, Matplotlib, Seaborn, MLflow, Weights & Biases, statistical analysis

Tools & collaboration

Git, GitHub, Jira, Notion, Slack, Agile, project management

Research & Technical Projects

Research-relevant flagships first: low-resource OCR, biomedical imaging methods, multimodal systems, and retrieval. Additional demos listed after.

SenseScan

Bengali handwritten document OCR

Full-page Bengali handwriting: EAST + LANMS detection, CRNN recognition with QAT-ready checkpoints, reading-order assembly. FastAPI endpoints (/plugin, /v1/ocr/handwritten), optional Gradio UI, DataParallel multi-GPU.

Python · PyTorch · FastAPI · OpenCV · LANMS · Gradio

GitHub →

RSNA biomedical imaging

Brain MRI and abdominal CT (Kaggle)

EfficientNet3D on mpMRI DICOM (FLAIR, T1w, T1wCE, T2w) for MGMT classification; RSNA 2023 abdominal trauma with 2.5D EfficientNet CNN, 3D R3D-18, and DICOM-to-3D preprocessing. Methods notebooks public; competition ranks not claimed here.

PyTorch · pydicom · EfficientNet3D · KerasCV · R3D-18

Kaggle →

Voice Healthcare Agent

Full-stack conversational agent (FastAPI · Next.js 14)

Open-source monorepo: FastAPI backend with Next.js /call UI—REST and WebSockets for STT/TTS and agent turns, plus optional LiveKit WebRTC via a dedicated worker calling the same API. Uses Ollama, faster-whisper, and Piper; SQLite for appointment tooling and transcript persistence aligned across transports; optional MuseTalk lip-sync. Backend and frontend CI with pytest and Vitest.

Python · TypeScript · FastAPI · Next.js · LiveKit · Ollama · Docker Compose · GitHub Actions

GitHub →

GradConnectAI

Supervisor discovery & matching

End-to-end platform: CV signal extraction, AI-driven professor/opportunity discovery, ranked matches with evidence and outreach drafts.

FastAPI · Next.js · TypeScript · PostgreSQL · pgvector · Playwright · Docker

GitHub →

AI Interview Bot

Real-time voice interview platform

LiveKit WebRTC voice pipeline (faster-whisper STT, Ollama LLM, multi-backend TTS) with session-isolated concurrent interviews. Interview engine handles CV/JD parsing, ChromaDB RAG, ATS scoring, evaluation worker (LLM rubrics + deterministic fallback), and transcript APIs. Rate limiting (Redis/in-memory), turn-level prompt injection defense, Next.js candidate UI with mic/cam pre-checks.

Python · FastAPI · LiveKit · Next.js · ChromaDB · Redis · pytest · Docker

GitHub →

Additional: parking tracking

YOLOv9 + centroid tracking (demo)

Overhead CCTV vehicle detection and tracking prototype using YOLOv9 and Euclidean centroid association. Exploratory computer-vision demo; not a primary research claim.
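
Euclidean centroid association can be sketched as greedy nearest-neighbour matching between existing track centroids and new detection centroids. The function name, the distance threshold, and the greedy strategy are assumptions for illustration; the prototype may match differently.

```python
import math

Point = tuple[float, float]


def associate_centroids(
    tracks: dict[int, Point], detections: list[Point], max_distance: float
) -> tuple[dict[int, int], list[int]]:
    """Greedily match tracks to detections by ascending Euclidean distance.

    Returns (track_id -> detection index, indices of unmatched detections).
    """
    pairs = sorted(
        (math.dist(centroid, det), track_id, det_idx)
        for track_id, centroid in tracks.items()
        for det_idx, det in enumerate(detections)
    )
    matches: dict[int, int] = {}
    used: set[int] = set()
    for distance, track_id, det_idx in pairs:
        if distance > max_distance:
            break  # remaining pairs are even farther apart
        if track_id in matches or det_idx in used:
            continue
        matches[track_id] = det_idx
        used.add(det_idx)
    unmatched = [i for i in range(len(detections)) if i not in used]
    return matches, unmatched


# Two existing tracks and three detections in the next frame (pixel coordinates).
tracks = {1: (10.0, 10.0), 2: (50.0, 50.0)}
detections = [(52.0, 49.0), (11.0, 12.0), (200.0, 200.0)]

matches, unmatched = associate_centroids(tracks, detections, max_distance=20.0)
print(matches, unmatched)
```

Unmatched detections would typically start new tracks, and tracks left without a match for several frames would be dropped.
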

PyTorch · OpenCV · Ultralytics · YOLOv9

GitHub →

Additional: RAG local chat

Local document Q&A demo

Streamlit app with ChromaDB, LangChain, and Ollama for local document question answering. Teaching / prototyping demo.

Python · ChromaDB · LangChain · Ollama · Streamlit

GitHub →

Publications

Akanda, M.B.A.N.*, Ahmed, M.*, Rabby, A.S.A., & Rahman, F. Optimum Deep Learning Method for Document Layout Analysis in Low Resource Languages. ACM Southeast Conference (ACM SE ’24), 199–204. *Equal contribution.
doi:10.1145/3603287.3651184 · Google Scholar
Akanda, M.B.A.N., Prodhan, M., Sarwar, S., Raatul, A.M., Paul, B. Voice Controlled Home Automation with Cloud-Based Environment Monitoring System. ICTCS 2022, LNNS vol. 623, Springer, Singapore.
doi:10.1007/978-981-19-9638-2_21

Education

University of Liberal Arts Bangladesh (ULAB)

Dhaka, Bangladesh · Summer 2019 – Fall 2022

B.Sc. Computer Science and Engineering (Minor: Business Administration)

Summa Cum Laude — top 1% of graduating class (conferred May 2024)
CGPA: 3.96 / 4.00 (145 credits earned)
Honors: Vice Chancellor’s Honors List; Dean’s List Scholarship (3×)
Leadership: President, ULAB Computer Programming Club (2022–2023); organized 4 contests, including Take Off Summer 2022 (40+ participants)
Coursework: Artificial Intelligence, Digital Image Processing, Algorithms, Data Structures, Statistics and Probability, Software Engineering, Discrete Mathematics, Operating Systems

Honors & awards

Summa Cum Laude · May 2024
President, ULAB Computer Programming Club · 2022 – 2023 · 4 contests · 40+ participants (Take Off Summer 2022)
Problem Setter, ULAB Take Off Programming Contest · Summer 2022
Vice Chancellor’s Honors List Scholarship · Summer 2021
2nd Runner-Up, ULAB Take Off Programming Contest · Fall 2021
Dean’s List Scholarship · Summer 2020, Fall 2020, Spring 2021
1st Runner-Up, ULAB Take Off Programming Contest · Spring 2021
