Layout Aware Parsing in Python

Multimodal Fine-Tuning of LLMs for Robust Document Visual Question Answering

Abstract: Document Visual Question Answering (DocVQA) necessitates comprehension of both the spatial layout and the textual content. Multimodal pretraining is a foundational component of existing ...

IEEE

Medical Image Object Detection via Layout-Aware Convolution and Optimal Transport Collaboration

Abstract: To address the dual challenges of limited local feature representation in conventional convolutional networks and the loss of spatial context in knowledge distillation for medical image ...

blockchain

x.ai Launches Grok Collections API for Enhanced Data Retrieval

x.ai introduces the Grok Collections API, enabling efficient data management and retrieval with advanced features like OCR and hybrid search, supporting various file types. In a significant ...

GitHub

Add LightOnOCR model implementation #41621

*This model was released on {release_date} and added to Hugging Face Transformers on 2025-10-27.* *This model was released on {release_date} and added to Hugging Face Transformers on 2025-10-31.* ...

GitHub

zainali2004/android-malware-detection

This implementation demonstrates practical application of academic research in malware detection, making advanced security analysis accessible through a modern web interface.
