OCR-paper-arxiv-daily latest papers

Automated deployment @ 2023-06-07 08:05:21 Asia/Shanghai

Contributions are welcome! Add your topics and keywords to topic.yml. Historical data is available through the storage.
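As a rough illustration of how topics and keywords could drive a daily listing like the ones below, here is a minimal sketch that turns a topic-to-keywords mapping into arXiv API query URLs. The `TOPICS` dictionary and the `build_query_url` helper are hypothetical: the actual schema of topic.yml and the way this project queries arXiv are not shown on this page and may differ. Only the arXiv API endpoint and its `search_query`, `sortBy`, `sortOrder`, and `max_results` parameters are taken from the public arXiv API.

```python
from urllib.parse import urlencode

# Hypothetical topic-to-keywords mapping; the real topic.yml schema may differ.
TOPICS = {
    "OCR": ["OCR", "optical character recognition"],
    "scene text": ["scene text"],
}

# Public arXiv API endpoint.
ARXIV_API = "http://export.arxiv.org/api/query"


def build_query_url(keywords, max_results=30):
    """Build an arXiv API URL listing the newest papers matching any keyword."""
    # Quote each keyword so a multi-word keyword is matched as a phrase,
    # then join the keywords with OR.
    search = " OR ".join(f'all:"{kw}"' for kw in keywords)
    params = {
        "search_query": search,
        "sortBy": "submittedDate",
        "sortOrder": "descending",
        "max_results": max_results,
    }
    return f"{ARXIV_API}?{urlencode(params)}"


if __name__ == "__main__":
    # Print one query URL per topic; no network request is made here.
    for topic, keywords in TOPICS.items():
        print(topic, build_query_url(keywords))
```

The sketch only builds URLs. Fetching and parsing the Atom feed that the API returns is left out, since the page does not describe how the project does it.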

OCR

Publish Date	Title	Authors	PDF	Code
2023-06-05	Transformer-Based UNet with Multi-Headed Cross-Attention Skip Connections to Eliminate Artifacts in Scanned Documents	David Kreuzer et.al.	2306.02815v1	null
2023-06-03	TransDocAnalyser: A Framework for Offline Semi-structured Handwritten Document Analysis in the Legal Domain	Sagar Chakraborty et.al.	2306.02142v1	link
2023-06-02	DocFormerv2: Local Features for Document Understanding	Srikar Appalaraju et.al.	2306.01733v1	null
2023-06-01	Layout and Task Aware Instruction Prompt for Zero-shot Document Image Question Answering	Wenjin Wang et.al.	2306.00526v1	link
2023-05-31	Improving Handwritten OCR with Training Samples Generated by Glyph Conditional Denoising Diffusion Probabilistic Model	Haisong Ding et.al.	2305.19543v1	null
2023-05-30	DuoSearch: A Novel Search Engine for Bulgarian Historical Documents	Angel Beshirov et.al.	2305.19392v1	link
2023-05-29	GlyphControl: Glyph Conditional Control for Visual Text Generation	Yukang Yang et.al.	2305.18259v1	link
2023-05-28	FuseCap: Leveraging Large Language Models to Fuse Visual Data into Enriched Image Captions	Noam Rotstein et.al.	2305.17718v1	link
2023-05-27	Exploring Better Text Image Translation with Multimodal Codebook	Zhibin Lan et.al.	2305.17415v2	link
2023-05-27	Super-Resolution of License Plate Images Using Attention Modules and Sub-Pixel Convolution Layers	Valfride Nascimento et.al.	2305.17313v1	link
2023-05-26	People and Places of Historical Europe: Bootstrapping Annotation Pipeline and a New Corpus of Named Entities in Late Medieval Texts	Vít Novotný et.al.	2305.16718v1	null
2023-05-24	Quantifying Character Similarity with Vision Transformers	Xinmei Yang et.al.	2305.14672v1	link
2023-05-21	Measuring Intersectional Biases in Historical Documents	Nadav Borenstein et.al.	2305.12376v1	link
2023-05-19	XTREME-UP: A User-Centric Scarce-Data Benchmark for Under-Represented Languages	Sebastian Ruder et.al.	2305.11938v2	link
2023-05-18	TextDiffuser: Diffusion Models as Text Painters	Jingye Chen et.al.	2305.10855v2	link
2023-05-16	Sequence-to-Sequence Pre-training with Unified Modality Masking for Visual Document Understanding	Shuwei Feng et.al.	2305.10448v1	null
2023-05-16	Mobile User Interface Element Detection Via Adaptively Prompt Tuning	Zhangxuan Gu et.al.	2305.09699v1	link
2023-05-13	On the Hidden Mystery of OCR in Large Multimodal Models	Yuliang Liu et.al.	2305.07895v2	link
2023-05-12	Visual Information Extraction in the Wild: Practical Dataset and End-to-end Solution	Jianfeng Kuang et.al.	2305.07498v1	link
2023-05-11	Combining OCR Models for Reading Early Modern Printed Books	Mathias Seuret et.al.	2305.07131v1	link
2023-05-09	E2TIMT: Efficient and Effective Modal Adapter for Text Image Machine Translation	Cong Ma et.al.	2305.05166v2	link
2023-05-04	Text Reading Order in Uncontrolled Conditions by Sparse Graph Segmentation	Renshen Wang et.al.	2305.02577v1	null
2023-05-03	Evaluating BERT-based Scientific Relation Classifiers for Scholarly Knowledge Graph Construction on Digital Library Collections	Ming Jiang et.al.	2305.02291v1	null
2023-04-28	LLaMA-Adapter V2: Parameter-Efficient Visual Instruction Model	Peng Gao et.al.	2304.15010v1	link
2023-04-24	DocParser: End-to-end OCR-free Information Extraction from Visually Rich Documents	Mohamed Dhouib et.al.	2304.12484v2	null
2023-04-24	ICDAR 2023 Competition on Reading the Seal Title	Wenwen Yu et.al.	2304.11966v2	null
2023-04-17	Multimodal Short Video Rumor Detection System Based on Contrastive Learning	Yuxing Yang et.al.	2304.08401v3	null
2023-04-15	TransDocs: Optical Character Recognition with word to word translation	Abhishek Bamotra et.al.	2304.07637v1	link
2023-04-07	Linking Representations with Multimodal Contrastive Learning	Abhishek Arora et.al.	2304.03464v2	null
2023-04-07	Cleansing Jewel: A Neural Spelling Correction Model Built On Google OCR-ed Tibetan Manuscripts	Queenie Luo et.al.	2304.03427v1	null

scene text

Publish Date	Title	Authors	PDF	Code
2023-06-05	Neuralangelo: High-Fidelity Neural Surface Reconstruction	Zhaoshuo Li et.al.	2306.03092v1	null
2023-06-05	Brain Diffusion for Visual Exploration: Cortical Discovery using Large Scale Generative Models	Andrew F. Luo et.al.	2306.03089v1	null
2023-06-05	Machine Learning and Statistical Approaches to Measuring Similarity of Political Parties	Daria Boratyn et.al.	2306.03079v1	null
2023-06-05	Interactive Editing for Text Summarization	Yujia Xie et.al.	2306.03067v1	link
2023-06-05	Of Mice and Mates: Automated Classification and Modelling of Mouse Behaviour in Groups using a Single Model across Cages	Michael P. J. Camilleri et.al.	2306.03066v1	null
2023-06-05	Structured Voronoi Sampling	Afra Amini et.al.	2306.03061v1	null
2023-06-05	ELEV-VISION: Automated Lowest Floor Elevation Estimation from Segmenting Street View Images	Yu-Hsuan Ho et.al.	2306.03050v1	null
2023-06-05	Designing Equilibria in Concurrent Games with Social Welfare and Temporal Logic Constraints	Julian Gutierrez et.al.	2306.03045v1	null
2023-06-05	HeadSculpt: Crafting 3D Head Avatars with Text	Xiao Han et.al.	2306.03038v1	null
2023-06-05	Tackling Cooperative Incompatibility for Zero-Shot Human-AI Coordination	Yang Li et.al.	2306.03034v1	null
2023-06-05	Interpretable Alzheimer's Disease Classification Via a Contrastive Diffusion Autoencoder	Ayodeji Ijishakin et.al.	2306.03022v1	null
2023-06-05	Automating Style Analysis and Visualization With Explainable AI -- Case Studies on Brand Recognition	Yu-hsuan Chen et.al.	2306.03021v1	link
2023-06-05	Using Sequences of Life-events to Predict Human Lives	Germans Savcisens et.al.	2306.03009v1	null
2023-06-05	Nonparametric Iterative Machine Teaching	Chen Zhang et.al.	2306.03007v1	null
2023-06-05	Unveiling the Two-Faced Truth: Disentangling Morphed Identities for Face Morphing Detection	Eduarda Caldeira et.al.	2306.03002v1	link
2023-06-05	BeyondPixels: A Comprehensive Review of the Evolution of Neural Radiance Fields	AKM Shahariar Azad Rabby et.al.	2306.03000v1	null
2023-06-05	Long-range UAV Thermal Geo-localization with Satellite Imagery	Jiuhong Xiao et.al.	2306.02994v1	link
2023-06-05	Second-scale rotational coherence and dipolar interactions in a gas of ultracold polar molecules	Philip D. Gregory et.al.	2306.02991v1	null
2023-06-05	Integrated Sensing, Computation, and Communication for UAV-assisted Federated Edge Learning	Yao Tang et.al.	2306.02990v1	null
2023-06-05	Brain tumor segmentation using synthetic MR images -- A comparison of GANs and diffusion models	Muhammad Usman Akbar et.al.	2306.02986v1	null
2023-06-05	A Term-based Approach for Generating Finite Automata from Interaction Diagrams	Erwan Mahe et.al.	2306.02983v1	null
2023-06-05	Which Argumentative Aspects of Hate Speech in Social Media can be reliably identified?	Damián Furman et.al.	2306.02978v1	link
2023-06-05	Best of Both Worlds: Hybrid SNN-ANN Architecture for Event-based Optical Flow Estimation	Shubham Negi et.al.	2306.02960v1	null
2023-06-05	Complex Preferences for Different Convergent Priors in Discrete Graph Diffusion	Alex M. Tseng et.al.	2306.02957v1	null
2023-06-05	Explicit Neural Surfaces: Learning Continuous Geometry With Deformation Fields	Thomas Walker et.al.	2306.02956v1	null
2023-06-05	A Simple and Flexible Modeling for Mental Disorder Detection by Learning from Clinical Questionnaires	Hoyun Song et.al.	2306.02955v1	null
2023-06-05	Color-aware Deep Temporal Backdrop Duplex Matting System	Hendrik Hachmann et.al.	2306.02954v1	null
2023-06-05	INDigo: An INN-Guided Probabilistic Diffusion Algorithm for Inverse Problems	Di You et.al.	2306.02949v1	null
2023-06-05	Continual Learning with Pretrained Backbones by Tuning in the Input Space	Simone Marullo et.al.	2306.02947v1	null
2023-06-05	Human Spine Motion Capture using Perforated Kinesiology Tape	Hendrik Hachmann et.al.	2306.02930v1	link