About the Author
A formal record of training, employment, and the experiments that did not make it into the main paper.
B.1 Author note
Samarth Suthar is a final-year Computer Engineering undergraduate at the Vidush Somany Institute of Technology and Research, Kadi (2022 — 2026), and a contributor at eInfochips' AI-Studio division. His work centres on retrieval-augmented generation, medical computer vision, and the engineering questions that arise when language models are made to ground their outputs in evidence.
He has authored, contributed to, or shipped a renal-pathology classifier (Kidnex), an open-source cited-RAG document-chat tool (Kneen), a pothole-measurement pipeline at CSIR · CRRI, and the production embedded-test-case RAG inside AI-Studio. He has solved more than three hundred algorithm problems across LeetCode, Codeforces, and GeeksforGeeks, and completed NPTEL's Python for Data Science course in the top 2% of his cohort.
B.2 Experimental record
B.2.1 Employment
- Architecting a RAG system for embedded-device test-case generation — ingesting datasheets, errata, and historical bug reports.
- Authoring the next-generation retrieval pipeline for AI-Studio's website test-case generator; lifted engineer-accept rate by 34 pp on the internal benchmark.
- Productionising RAG primitives (hybrid retrieval, HyDE, reranking) as reusable internal libraries in Python.
- Built a Flask + OpenCV pipeline to measure pothole area from monocular phone imagery, with a reference object for scale.
- Achieved 8.7% mean absolute percentage error (MAPE) on a 120-sample field set; the project brief required <15%.
- Internship deliverable detailed in Appendix A.3.
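
For readers unfamiliar with the two quantities in the pothole bullets above, the following is a minimal sketch of reference-object scaling and of MAPE. It is not the internship pipeline itself (that is described in Appendix A.3). The function names, the assumption that the reference object lies in the same plane as the pothole, and the sample numbers are all hypothetical and chosen only for illustration.

```python
from typing import Sequence


def pixel_area_to_cm2(pixel_area: float, ref_width_px: float, ref_width_cm: float) -> float:
    """Convert an area in pixels to cm^2 using a reference object of known width.

    Assumption (illustrative): the reference object and the pothole lie in
    roughly the same plane and the camera looks at them roughly head-on,
    so a single scale factor applies to the whole image.
    """
    if ref_width_px <= 0 or ref_width_cm <= 0:
        raise ValueError("reference dimensions must be positive")
    cm_per_px = ref_width_cm / ref_width_px
    # Area scales with the square of the linear scale factor.
    return pixel_area * cm_per_px ** 2


def mape(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Mean absolute percentage error, returned in percent."""
    if len(actual) == 0 or len(actual) != len(predicted):
        raise ValueError("inputs must be non-empty and of equal length")
    if any(a == 0 for a in actual):
        raise ValueError("MAPE is undefined when an actual value is zero")
    total = sum(abs(a - p) / abs(a) for a, p in zip(actual, predicted))
    return 100.0 * total / len(actual)


if __name__ == "__main__":
    # Hypothetical numbers: a 20 cm reference object spans 80 px in the image.
    area = pixel_area_to_cm2(pixel_area=48000, ref_width_px=80, ref_width_cm=20)
    print(f"estimated area: {area:.1f} cm^2")

    # Hypothetical ground-truth vs. estimated areas for two potholes.
    print(f"MAPE: {mape([100.0, 200.0], [90.0, 220.0]):.1f}%")
```

Under this definition, the reported 8.7% means that the estimated areas differed from the field-measured areas by 8.7% on average across the 120 samples, against a brief that allowed up to 15%.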
B.2.2 Education
B.3 Skill matrix
| Family | Tools & techniques | Self-rated |
|---|---|---|
| Generative AI | LLM prompting, structured output, function-calling, evals | |
| Retrieval & RAG | pgvector, BM25, RRF, HyDE, reranking, chunking strategy | |
| Deep learning | PyTorch, transfer-learning (ResNet), fine-tuning, calibration | |
| Computer vision | OpenCV, classical pipelines, CT image preprocessing | |
| Backend | FastAPI, Flask, Python, SSE / streaming, Postgres / DBMS | |
| Infra | Docker, Redis, Postgres + pgvector, basic Linux ops | |
| Algorithms | 300+ problems on LeetCode / Codeforces / GFG; 5 LC badges | |
B.4 Achievements & awards
- 300+ algorithm problems solved across LeetCode, Codeforces, and GeeksforGeeks; five achievement badges on LeetCode covering streaks and topic-specific milestones.
- NPTEL · Python for Data Science — 84% final mark; top 2% of national cohort (Swayam programme).
- CSIR · CRRI internship: brief delivered; pipeline beat the brief's accuracy target by 6+ percentage points.
- Open-source maintainer — Kneen (cited-RAG document chat), released under permissive licence.
B.5 Correspondence
Recruiters, collaborators, and curious peers are welcome. The fastest channel is email; for project source, the linked GitHub. The fully formatted BibTeX record below is for the meta-paper-inclined.
Cite this portfolio
Should the reader wish to refer back to this document in writing, the canonical BibTeX entry is below.
@misc{suthar2026portfolio,
  author       = {Suthar, Samarth},
  title        = {Building Retrieval-Augmented Systems at the Edge of Healthcare and Embedded Intelligence},
  howpublished = {Working portfolio (arXiv:2026.0519v3)},
  year         = {2026},
  month        = {may},
  institution  = {eInfochips, An ARROW Company; VSITR Kadi},
  email        = {sutharsamarth16@gmail.com},
  url          = {https://github.com/Samarth0016},
  keywords     = {RAG, LLM, pgvector, ResNet-50, OpenCV},
  note         = {In revision; recruiter comments welcome.},
}
B.6 Acknowledgements
Thanks are owed to the AI-Studio team at eInfochips for the latitude to ship the embedded-test-case RAG into a real product, and to the mentors at CSIR · CRRI who tolerated my insistence on doing the pothole work in OpenCV rather than a pre-trained YOLO. Thanks also to the maintainers of pgvector, FastAPI, and the original PyTorch ResNet weights; standing on shoulders, as ever. The remaining errors and overclaims in this document are entirely the author's.
B.7 Statements
Availability
Open to full-time roles from June 2026; open to internships and contract work immediately. Remote, India-based, or relocation-friendly.
Funding
This portfolio was self-funded. No external sponsor. Coffee was sourced internally.
Competing interests
The author is currently employed by eInfochips (Arrow). All opinions expressed are personal and do not represent the position of his employer.
Data & code availability
Kneen: open source, see GitHub. Kidnex: code on request, weights gated by dataset licence. Pothole work: deliverable property of CSIR — CRRI; technique reproducible from A.3.