
Mayank Sharma
I'm an engineering student who's trying to get into research while learning and working on backend and GenAI stuff. Amateur badminton player, pro cook (at home), and an avid music lover.
Experience
Developed a document-grounded RAG assistant with LangChain, Ollama (Gemma 3), FAISS, and FastAPI, enabling scalable semantic retrieval and inference from internal PDFs.
Designed and implemented a document-grounded Retrieval-Augmented Generation (RAG) assistant using LangChain and Ollama (Gemma 3), leveraging open-source embedding models to vectorize internal PDF content for high-accuracy semantic retrieval.
Built end-to-end preprocessing and chunking pipelines for unstructured data, fine-tuned document retrieval with FAISS, and integrated the system with FastAPI to expose scalable inference endpoints; also documented APIs and conducted functional validation.
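The chunking step mentioned above could look roughly like the following. This is a minimal sketch, not the project's actual pipeline: the function name `chunk_text`, the character-based splitting, and the default sizes are all assumptions made for illustration.

```python
def chunk_text(text: str, chunk_size: int = 500, overlap: int = 50) -> list[str]:
    """Split text into fixed-size character chunks that overlap.

    Hypothetical helper: the real pipeline may split on tokens,
    sentences, or PDF layout instead of raw characters.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be in [0, chunk_size)")

    step = chunk_size - overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once the current window has reached the end of the text,
        # so no trailing chunk is fully contained in the previous one.
        if start + chunk_size >= len(text):
            break
    return chunks
```

The overlap keeps a sentence that straddles a chunk boundary retrievable from at least one chunk, at the cost of embedding some text twice.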
Building long-term memory for conversational AI agents, enabling them to remember and recall information across sessions.
Got selected for an interview at Y Combinator for the Summer 2025 batch.
Co-founded an AI home automation company, shipping v1.0 of the product in a month and acquiring 50+ beta testers.
Developed a high-end home automation solution, integrating an AI agent with Smart TV, Fire Stick, Apple CarPlay, and switchboards.
White-labeling the home automation tech to other players.
Technical core member of the AI & ML team of the GDSC club at SRMIST.
Co-organized Research Summit: SRM 2024, featuring Mr. Abhijeet Bhattacharya, Product Engineer at Coding Blocks, attended by 50+ students.
Facilitated student presentations and discussions on research ideas.
Education
Bachelor of Technology in Computer Science, Specialization in AI and ML; CGPA: 9.1
Higher Secondary Education; Percentage: 82%
Secondary Education; Percentage: 92%
GitHub Contributions
Skills
Side Projects
Architected a local-first RAG chat assistant using LangChain, FAISS, and Ollama (Gemma 3) to process internal PDFs, ensuring 100% data privacy by executing all LLM inference and vector embeddings strictly on-device.
Engineered a 3-tier document ingestion pipeline integrating PyMuPDF, PyPDF, and Tesseract OCR to handle diverse document formats, enabling automated extraction fallbacks for scanned images and eliminating unreadable file errors.
Deployed an OpenAI-compatible REST API via FastAPI and built a custom Open WebUI Pipe function to bridge backend operations with a browser interface, allowing users to execute live file uploads and stream token-by-token responses.
Optimized query performance with Jaccard-similarity conversational caching, which prevents redundant model executions and significantly reduces response latency, and added Watchdog folder monitoring for automated, real-time document indexing.
Two Whisper models (750M & 1.1B) were fine-tuned for Hindi ASR using 10k hours of Gram Vani audio. The models were optimized for real-time transcription with a latency of 200-300 ms on 30 s audio chunks.
Garden Of Days is a lightweight SwiftUI journaling app that encourages a short daily memory entry for every day of the year. It stores entries using SwiftData and includes a companion widget.
Achievements
Ranked among the top 10 teams out of 100+ in multiple hackathons.
Scored 99/100 in English in the Class 10 board exams.
Certifications
IIT, Kharagpur
CDAC, Noida