LayoutLM

language	license
en	mit

Multimodal (text + layout/format + image) pre-training for document AI

Microsoft Document AI | GitHub

Model description

LayoutLM is a simple but effective method for pre-training on text and layout, aimed at document image understanding and information extraction tasks such as form understanding and receipt understanding. LayoutLM achieves state-of-the-art results on multiple datasets. For more details, please refer to our paper:

LayoutLM: Pre-training of Text and Layout for Document Image Understanding. Yiheng Xu, Minghao Li, Lei Cui, Shaohan Huang, Furu Wei, Ming Zhou. KDD 2020.
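Because the model consumes layout alongside text, each word is normally paired with a bounding box. The sketch below shows one common way to prepare those boxes, under the assumption (taken from the Hugging Face Transformers port, not stated in this card) that coordinates are rescaled to integers in the range 0 to 1000 relative to the page size. The helper name `normalize_bbox` and the sample words, boxes, and page size are illustrative only.

```python
from typing import List, Tuple

# (x0, y0, x1, y1) in pixels: top-left and bottom-right corners of a word.
Box = Tuple[int, int, int, int]


def normalize_bbox(box: Box, page_width: int, page_height: int,
                   scale: int = 1000) -> List[int]:
    """Rescale a pixel box to integer coordinates in [0, scale].

    Assumption: the 0-1000 target range follows the convention used by the
    Hugging Face port of LayoutLM; this card does not specify it.
    """
    if page_width <= 0 or page_height <= 0:
        raise ValueError("page dimensions must be positive")

    def clamp(value: int) -> int:
        # Guard against OCR boxes that spill slightly outside the page.
        return max(0, min(scale, value))

    x0, y0, x1, y1 = box
    return [
        clamp(int(scale * x0 / page_width)),
        clamp(int(scale * y0 / page_height)),
        clamp(int(scale * x1 / page_width)),
        clamp(int(scale * y1 / page_height)),
    ]


# Hypothetical OCR output for a page that is 800 x 1000 pixels.
words = ["Invoice", "Total:", "42.00"]
pixel_boxes = [(50, 40, 210, 70), (50, 700, 130, 730), (140, 700, 220, 730)]
layout = [normalize_bbox(b, page_width=800, page_height=1000) for b in pixel_boxes]

for word, box in zip(words, layout):
    print(word, box)
```

In the Hugging Face implementation, boxes of this form are passed as the `bbox` input alongside the token ids, with each word's box repeated for every sub-word token it is split into.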

Training data

We pre-train LayoutLM on the IIT-CDIP Test Collection 1.0 dataset in two settings:

LayoutLM-Base, Uncased (11M documents, 2 epochs): 12-layer, 768-hidden, 12-heads, 113M parameters (This Model)
LayoutLM-Large, Uncased (11M documents, 2 epochs): 24-layer, 1024-hidden, 16-heads, 343M parameters
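As a rough sanity check on the sizes listed above, the parameter count can be estimated from the layer and hidden dimensions. This is a back-of-the-envelope sketch, not the authors' accounting: it assumes a BERT-style encoder with a 30522-token vocabulary, 512 sequence positions, four 2-D position tables (x, y, height, width) of 1024 entries each, a feed-forward width of 4 × hidden, and a pooler layer. None of these values are stated in this card. The number of attention heads does not change the count, so it is not a parameter of the function.

```python
def layoutlm_param_estimate(hidden: int, layers: int,
                            vocab_size: int = 30522,
                            max_positions: int = 512,
                            max_2d_positions: int = 1024,
                            type_vocab_size: int = 2) -> int:
    """Estimate parameters of a BERT-style encoder with 2-D position tables.

    All default sizes are assumptions for illustration, not values taken
    from this model card.
    """
    intermediate = 4 * hidden      # assumed feed-forward width
    layer_norm = 2 * hidden        # scale + bias

    def linear(n_in: int, n_out: int) -> int:
        return n_in * n_out + n_out  # weights + bias

    embeddings = (
        vocab_size * hidden              # word embeddings
        + max_positions * hidden         # 1-D position embeddings
        + 4 * max_2d_positions * hidden  # x, y, height, width embeddings
        + type_vocab_size * hidden       # segment embeddings
        + layer_norm
    )
    attention = 4 * linear(hidden, hidden) + layer_norm  # Q, K, V, output
    feed_forward = (linear(hidden, intermediate)
                    + linear(intermediate, hidden)
                    + layer_norm)
    pooler = linear(hidden, hidden)

    return embeddings + layers * (attention + feed_forward) + pooler


base = layoutlm_param_estimate(hidden=768, layers=12)
large = layoutlm_param_estimate(hidden=1024, layers=24)
print(f"Base:  {base / 1e6:.1f}M")   # 112.6M
print(f"Large: {large / 1e6:.1f}M")
```

Under these assumptions the Base estimate rounds to the 113M listed above. The Large estimate comes out at about 339M, a little under the listed 343M, so the sketch should be read as an approximation and not as a reproduction of the authors' count.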

Citation

If you find LayoutLM useful in your research, please cite the following paper:

@misc{xu2019layoutlm,
    title={LayoutLM: Pre-training of Text and Layout for Document Image Understanding},
    author={Yiheng Xu and Minghao Li and Lei Cui and Shaohan Huang and Furu Wei and Ming Zhou},
    year={2019},
    eprint={1912.13318},
    archivePrefix={arXiv},
    primaryClass={cs.CL}
}
