AI Techniques for Document Management: The Case of Umm Al-Qura University in Saudi Arabia
Document image classification is an essential process in digital libraries, office automation, and various image analysis applications. Approaches to document classification differ considerably in the problem to be solved, in how training data is used to build class models, in the classification approach, and in the document features used. This thesis addresses text document image classification through a case study at Umm Al-Qura University in Saudi Arabia. We highlight the importance of the classifier's design, the choice of features and their representation, the best classifier model, and the learning mechanism. Developing a specific classification approach is challenging because of the variety of document types. Document image classification serves the following purposes:
• Distinguishing between documents automatically
• Improving indexing efficiency
• Speeding up retrieval of document images
• Making high-level document analysis more accessible and less complex, because most high-level document analysis depends on domain-dependent knowledge to achieve higher accuracy.
Many of the available information extraction systems are built for a particular document type, such as postal addresses or form processing. It is therefore essential to classify a document first so that a suitable document analysis method can be applied. The document image classifier in this thesis applies AI techniques to document management, using Umm Al-Qura University in Saudi Arabia as the case. Document image classification can be performed without relying on text features, so the proposed model uses HOG (histogram of oriented gradients) features to train our machine learning models: support vector machine, k-nearest neighbor, and naïve Bayes. We also use the CNN models VGG16, InceptionV3, ResNet50, InceptionResNetV2, MobileNetV2, DenseNet121, and Xception.
The dataset used to train this system consists of scanned document images from Umm Al-Qura University in Saudi Arabia.
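
As a rough illustration of the HOG-based pipeline described above, the following Python sketch extracts gradient-orientation features from images and trains the three classical classifiers named in the abstract. It is not the thesis implementation. The helpers `simple_hog` and `make_synthetic_page` are hypothetical, the striped synthetic images only stand in for the university's scanned documents, the HOG variant is simplified (it omits the block normalisation of full HOG), and the classifier hyperparameters are arbitrary choices made for the example. NumPy and scikit-learn are assumed to be installed.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC


def simple_hog(image, cell=8, bins=9):
    """Simplified HOG (hypothetical helper): per-cell histograms of unsigned
    gradient orientation, weighted by gradient magnitude, then one global
    L2 normalisation. Full HOG also normalises over overlapping blocks."""
    gy, gx = np.gradient(image.astype(float))
    magnitude = np.hypot(gx, gy)
    angle = np.degrees(np.arctan2(gy, gx)) % 180.0  # unsigned orientation
    rows, cols = image.shape[0] // cell, image.shape[1] // cell
    features = []
    for r in range(rows):
        for c in range(cols):
            window = (slice(r * cell, (r + 1) * cell),
                      slice(c * cell, (c + 1) * cell))
            hist, _ = np.histogram(angle[window], bins=bins,
                                   range=(0.0, 180.0),
                                   weights=magnitude[window])
            features.append(hist)
    vector = np.concatenate(features)
    return vector / (np.linalg.norm(vector) + 1e-9)


def make_synthetic_page(kind, rng, size=32):
    """Assumed stand-in for a scanned page: class 0 has horizontal
    (text-line-like) stripes, class 1 has vertical (column-like) stripes."""
    page = np.zeros((size, size))
    if kind == 0:
        page[::4, :] = 1.0
    else:
        page[:, ::4] = 1.0
    return page + rng.normal(0.0, 0.1, page.shape)  # mild scanner-like noise


rng = np.random.default_rng(0)
labels = np.array([i % 2 for i in range(80)])
X = np.array([simple_hog(make_synthetic_page(k, rng)) for k in labels])
X_train, X_test, y_train, y_test = train_test_split(
    X, labels, test_size=0.25, random_state=0, stratify=labels)

# The three classical models named in the abstract; settings are illustrative.
models = {
    "SVM": SVC(kernel="linear"),
    "k-NN": KNeighborsClassifier(n_neighbors=3),
    "Naive Bayes": GaussianNB(),
}
accuracies = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    accuracies[name] = model.score(X_test, y_test)
    print(f"{name}: test accuracy = {accuracies[name]:.2f}")
```

With real scanned pages, each image would first be converted to grayscale and resized to a fixed shape so that every feature vector has the same length. A library implementation such as scikit-image's `hog` function would normally replace the simplified descriptor used here.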