This project is designed to process Azure Data Factory (ADF) JSON files, standardize their structure, and store them as Delta files in a specified Azure Data Lake Storage account. The project is ...
A focused pipeline to parse medical guidelines (PDF/HTML) into structured JSON for downstream clinical RAG or summarization. This implements models, parsers, normalization utils, and a CLI to ingest ...