Data Science and Machine Learning projects
OCR pipeline
Implementation of a generic and configurable OCR web service for text extraction from scanned documents.
Configurable image-processing preprocessing modules, configurable modules for different OCR engines, and language-sensitive word-correction post-processing modules.
Garnishment document classification and enrichment
Testing and evaluation of different enrichment models.
Implementation of deep learning techniques for document classification and named-entity recognition.
Dockerization, integration, and testing of web services.
Stamp recognition and information extraction
Recognition of specific stamps in scanned documents and extraction of date and time information.
Implemented with Keras, scikit-learn, OpenCV.
Achieved 83% accuracy.