OCR-paper-arxiv-daily latest papers
Automated deployment @ 2023-06-07 08:05:21 Asia/Shanghai
Contributions are welcome! Add your topics and keywords in topic.yml. You can also view historical data through the storage.
OCR
Publish Date | Title | Authors | arXiv ID | Code |
---|---|---|---|---|
2023-06-05 | Transformer-Based UNet with Multi-Headed Cross-Attention Skip Connections to Eliminate Artifacts in Scanned Documents | David Kreuzer et.al. | 2306.02815v1 | null |
2023-06-03 | TransDocAnalyser: A Framework for Offline Semi-structured Handwritten Document Analysis in the Legal Domain | Sagar Chakraborty et.al. | 2306.02142v1 | link |
2023-06-02 | DocFormerv2: Local Features for Document Understanding | Srikar Appalaraju et.al. | 2306.01733v1 | null |
2023-06-01 | Layout and Task Aware Instruction Prompt for Zero-shot Document Image Question Answering | Wenjin Wang et.al. | 2306.00526v1 | link |
2023-05-31 | Improving Handwritten OCR with Training Samples Generated by Glyph Conditional Denoising Diffusion Probabilistic Model | Haisong Ding et.al. | 2305.19543v1 | null |
2023-05-30 | DuoSearch: A Novel Search Engine for Bulgarian Historical Documents | Angel Beshirov et.al. | 2305.19392v1 | link |
2023-05-29 | GlyphControl: Glyph Conditional Control for Visual Text Generation | Yukang Yang et.al. | 2305.18259v1 | link |
2023-05-28 | FuseCap: Leveraging Large Language Models to Fuse Visual Data into Enriched Image Captions | Noam Rotstein et.al. | 2305.17718v1 | link |
2023-05-27 | Exploring Better Text Image Translation with Multimodal Codebook | Zhibin Lan et.al. | 2305.17415v2 | link |
2023-05-27 | Super-Resolution of License Plate Images Using Attention Modules and Sub-Pixel Convolution Layers | Valfride Nascimento et.al. | 2305.17313v1 | link |
2023-05-26 | People and Places of Historical Europe: Bootstrapping Annotation Pipeline and a New Corpus of Named Entities in Late Medieval Texts | Vít Novotný et.al. | 2305.16718v1 | null |
2023-05-24 | Quantifying Character Similarity with Vision Transformers | Xinmei Yang et.al. | 2305.14672v1 | link |
2023-05-21 | Measuring Intersectional Biases in Historical Documents | Nadav Borenstein et.al. | 2305.12376v1 | link |
2023-05-19 | XTREME-UP: A User-Centric Scarce-Data Benchmark for Under-Represented Languages | Sebastian Ruder et.al. | 2305.11938v2 | link |
2023-05-18 | TextDiffuser: Diffusion Models as Text Painters | Jingye Chen et.al. | 2305.10855v2 | link |
2023-05-16 | Sequence-to-Sequence Pre-training with Unified Modality Masking for Visual Document Understanding | Shuwei Feng et.al. | 2305.10448v1 | null |
2023-05-16 | Mobile User Interface Element Detection Via Adaptively Prompt Tuning | Zhangxuan Gu et.al. | 2305.09699v1 | link |
2023-05-13 | On the Hidden Mystery of OCR in Large Multimodal Models | Yuliang Liu et.al. | 2305.07895v2 | link |
2023-05-12 | Visual Information Extraction in the Wild: Practical Dataset and End-to-end Solution | Jianfeng Kuang et.al. | 2305.07498v1 | link |
2023-05-11 | Combining OCR Models for Reading Early Modern Printed Books | Mathias Seuret et.al. | 2305.07131v1 | link |
2023-05-09 | E2TIMT: Efficient and Effective Modal Adapter for Text Image Machine Translation | Cong Ma et.al. | 2305.05166v2 | link |
2023-05-04 | Text Reading Order in Uncontrolled Conditions by Sparse Graph Segmentation | Renshen Wang et.al. | 2305.02577v1 | null |
2023-05-03 | Evaluating BERT-based Scientific Relation Classifiers for Scholarly Knowledge Graph Construction on Digital Library Collections | Ming Jiang et.al. | 2305.02291v1 | null |
2023-04-28 | LLaMA-Adapter V2: Parameter-Efficient Visual Instruction Model | Peng Gao et.al. | 2304.15010v1 | link |
2023-04-24 | DocParser: End-to-end OCR-free Information Extraction from Visually Rich Documents | Mohamed Dhouib et.al. | 2304.12484v2 | null |
2023-04-24 | ICDAR 2023 Competition on Reading the Seal Title | Wenwen Yu et.al. | 2304.11966v2 | null |
2023-04-17 | Multimodal Short Video Rumor Detection System Based on Contrastive Learning | Yuxing Yang et.al. | 2304.08401v3 | null |
2023-04-15 | TransDocs: Optical Character Recognition with word to word translation | Abhishek Bamotra et.al. | 2304.07637v1 | link |
2023-04-07 | Linking Representations with Multimodal Contrastive Learning | Abhishek Arora et.al. | 2304.03464v2 | null |
2023-04-07 | Cleansing Jewel: A Neural Spelling Correction Model Built On Google OCR-ed Tibetan Manuscripts | Queenie Luo et.al. | 2304.03427v1 | null |
scene text
Publish Date | Title | Authors | arXiv ID | Code |
---|---|---|---|---|
2023-06-05 | Neuralangelo: High-Fidelity Neural Surface Reconstruction | Zhaoshuo Li et.al. | 2306.03092v1 | null |
2023-06-05 | Brain Diffusion for Visual Exploration: Cortical Discovery using Large Scale Generative Models | Andrew F. Luo et.al. | 2306.03089v1 | null |
2023-06-05 | Machine Learning and Statistical Approaches to Measuring Similarity of Political Parties | Daria Boratyn et.al. | 2306.03079v1 | null |
2023-06-05 | Interactive Editing for Text Summarization | Yujia Xie et.al. | 2306.03067v1 | link |
2023-06-05 | Of Mice and Mates: Automated Classification and Modelling of Mouse Behaviour in Groups using a Single Model across Cages | Michael P. J. Camilleri et.al. | 2306.03066v1 | null |
2023-06-05 | Structured Voronoi Sampling | Afra Amini et.al. | 2306.03061v1 | null |
2023-06-05 | ELEV-VISION: Automated Lowest Floor Elevation Estimation from Segmenting Street View Images | Yu-Hsuan Ho et.al. | 2306.03050v1 | null |
2023-06-05 | Designing Equilibria in Concurrent Games with Social Welfare and Temporal Logic Constraints | Julian Gutierrez et.al. | 2306.03045v1 | null |
2023-06-05 | HeadSculpt: Crafting 3D Head Avatars with Text | Xiao Han et.al. | 2306.03038v1 | null |
2023-06-05 | Tackling Cooperative Incompatibility for Zero-Shot Human-AI Coordination | Yang Li et.al. | 2306.03034v1 | null |
2023-06-05 | Interpretable Alzheimer's Disease Classification Via a Contrastive Diffusion Autoencoder | Ayodeji Ijishakin et.al. | 2306.03022v1 | null |
2023-06-05 | Automating Style Analysis and Visualization With Explainable AI -- Case Studies on Brand Recognition | Yu-hsuan Chen et.al. | 2306.03021v1 | link |
2023-06-05 | Using Sequences of Life-events to Predict Human Lives | Germans Savcisens et.al. | 2306.03009v1 | null |
2023-06-05 | Nonparametric Iterative Machine Teaching | Chen Zhang et.al. | 2306.03007v1 | null |
2023-06-05 | Unveiling the Two-Faced Truth: Disentangling Morphed Identities for Face Morphing Detection | Eduarda Caldeira et.al. | 2306.03002v1 | link |
2023-06-05 | BeyondPixels: A Comprehensive Review of the Evolution of Neural Radiance Fields | AKM Shahariar Azad Rabby et.al. | 2306.03000v1 | null |
2023-06-05 | Long-range UAV Thermal Geo-localization with Satellite Imagery | Jiuhong Xiao et.al. | 2306.02994v1 | link |
2023-06-05 | Second-scale rotational coherence and dipolar interactions in a gas of ultracold polar molecules | Philip D. Gregory et.al. | 2306.02991v1 | null |
2023-06-05 | Integrated Sensing, Computation, and Communication for UAV-assisted Federated Edge Learning | Yao Tang et.al. | 2306.02990v1 | null |
2023-06-05 | Brain tumor segmentation using synthetic MR images -- A comparison of GANs and diffusion models | Muhammad Usman Akbar et.al. | 2306.02986v1 | null |
2023-06-05 | A Term-based Approach for Generating Finite Automata from Interaction Diagrams | Erwan Mahe et.al. | 2306.02983v1 | null |
2023-06-05 | Which Argumentative Aspects of Hate Speech in Social Media can be reliably identified? | Damián Furman et.al. | 2306.02978v1 | link |
2023-06-05 | Best of Both Worlds: Hybrid SNN-ANN Architecture for Event-based Optical Flow Estimation | Shubham Negi et.al. | 2306.02960v1 | null |
2023-06-05 | Complex Preferences for Different Convergent Priors in Discrete Graph Diffusion | Alex M. Tseng et.al. | 2306.02957v1 | null |
2023-06-05 | Explicit Neural Surfaces: Learning Continuous Geometry With Deformation Fields | Thomas Walker et.al. | 2306.02956v1 | null |
2023-06-05 | A Simple and Flexible Modeling for Mental Disorder Detection by Learning from Clinical Questionnaires | Hoyun Song et.al. | 2306.02955v1 | null |
2023-06-05 | Color-aware Deep Temporal Backdrop Duplex Matting System | Hendrik Hachmann et.al. | 2306.02954v1 | null |
2023-06-05 | INDigo: An INN-Guided Probabilistic Diffusion Algorithm for Inverse Problems | Di You et.al. | 2306.02949v1 | null |
2023-06-05 | Continual Learning with Pretrained Backbones by Tuning in the Input Space | Simone Marullo et.al. | 2306.02947v1 | null |
2023-06-05 | Human Spine Motion Capture using Perforated Kinesiology Tape | Hendrik Hachmann et.al. | 2306.02930v1 | link |