A comprehensive guide to PDF document parsing: Leveraging Tesseract, PyPDF2 & spaCy