
Adding Table Transformer models to HuggingFace Transformers #68

Closed
NielsRogge opened this issue Sep 6, 2022 · 35 comments

Comments

@NielsRogge

NielsRogge commented Sep 6, 2022

Hi Table Transformer team :)

As I implemented DETR in 🤗 HuggingFace Transformers a few months ago, it was relatively straightforward to port the 2 checkpoints you released. Here's a notebook that illustrates inference with DETR for table detection and table structure recognition: https://colab.research.google.com/drive/1lLRyBr7WraGdUJm-urUm_utArw6SkoCJ?usp=sharing
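A detail from that notebook worth calling out: DETR outputs boxes as normalized (cx, cy, w, h), which have to be rescaled to pixel coordinates before cropping or drawing. A minimal sketch (the function name is illustrative, not the exact notebook code):

```python
def rescale_bboxes(boxes, img_width, img_height):
    """Convert normalized DETR-style (cx, cy, w, h) boxes to
    absolute (x0, y0, x1, y1) pixel coordinates."""
    out = []
    for cx, cy, w, h in boxes:
        out.append((
            (cx - w / 2) * img_width,
            (cy - h / 2) * img_height,
            (cx + w / 2) * img_width,
            (cy + h / 2) * img_height,
        ))
    return out

# A box centered in a 1000x800 image, covering half of each dimension:
print(rescale_bboxes([(0.5, 0.5, 0.5, 0.5)], 1000, 800))
# → [(250.0, 200.0, 750.0, 600.0)]
```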

As you may or may not know, any model on the HuggingFace hub has its own Github repository. E.g. the DETR-table-detection checkpoint can be found here: https://huggingface.co/nielsr/detr-table-detection. If you check the "files and versions" tab, it includes the weights. The model hub uses git-LFS (large file storage) to use Git with large files such as model weights. This means that any model has its own Git commit history!

A model card can also be added to the repo, which is just a README.

Are you interested in joining the Microsoft organization on the hub, such that we can store all model checkpoints there (rather than under my user name)?

Also, it would be great to add PubTables-1M (and potentially other datasets useful for improving AI on unstructured documents) to the 🤗 hub. Would you be up for that?

Let me know!

Kind regards,

Niels
ML Engineer @ HuggingFace

@NielsRogge
Author

NielsRogge commented Sep 7, 2022

Relevant to #17, #44, #59

@NielsRogge
Author

cc'ing @bsmock for visibility

@light42

light42 commented Sep 8, 2022

@NielsRogge could you add another pipeline to turn the results into DataFrame?

@NielsRogge
Author

Hi,

The model isn't extracting any text from images. It's just an object detector, so it can detect where tables, table rows and columns etc. are in an image. You would need to crop the image based on the bounding box coordinates, after which you can run an OCR model (like TrOCR) to get the actual text.
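A minimal sketch of that crop-then-OCR flow, assuming boxes in absolute pixel coordinates (`run_ocr` is a hypothetical placeholder for whatever OCR model you plug in, e.g. TrOCR or Tesseract):

```python
from PIL import Image

def crop_detections(image, boxes, padding=10):
    """Crop one sub-image per detected bounding box.
    Boxes are absolute (x0, y0, x1, y1) pixel coordinates."""
    crops = []
    for x0, y0, x1, y1 in boxes:
        crop = image.crop((
            max(0, x0 - padding),
            max(0, y0 - padding),
            min(image.width, x1 + padding),
            min(image.height, y1 + padding),
        ))
        crops.append(crop)
    return crops

page = Image.new("RGB", (1000, 800), "white")   # stand-in for a document page
tables = crop_detections(page, [(100, 100, 600, 400)])
print(tables[0].size)   # → (520, 320)
# Each crop can then be passed to an OCR model, e.g.:
# text = run_ocr(tables[0])   # hypothetical OCR helper
```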

@salman-moh

salman-moh commented Sep 8, 2022

Is TrOCR better than Tesseract? I thought pytesseract would be the top option.
Also, is TSR part of your Colab notebook from the MSFT table-transformer team, or just the TD section?

Thanks so much both @NielsRogge & @bsmock.

@bsmock
Collaborator

bsmock commented Sep 8, 2022

Hi Niels,

We're excited to see that you were able to successfully integrate the model into HuggingFace Transformers. Nice work--this is great! We will review what you've done to make sure it's all correct and see how we can get involved. Thanks!

Cheers,
Brandon

@NielsRogge
Author

NielsRogge commented Sep 9, 2022

Is TrOCR better than Tesseract? I thought pytesseract would be the top option.

TrOCR seems better than Tesseract, especially on handwritten text. However, the model only works on single text-line images (unlike Tesseract, which works on entire PDFs).

Also, is TSR part of your Colab notebook from the MSFT table-transformer team, or just the TD section?

The table structure recognition part is also from Microsoft (this repository shared 2 checkpoints: one for table detection and one for table structure recognition).

@salman-moh

Any thoughts on #69 (comment) @NielsRogge ?

@khadkechetan

@NielsRogge This is awesome. However, I guess we have to work on getting the data into CSV. Let me know if there is any reference for that.
Since this is under the MIT license, we can use it commercially. Please confirm.


@vallabh001

Hey @NielsRogge, @bsmock, can you please help me with this?
#70

@bsmock
Collaborator

bsmock commented Sep 16, 2022

@NielsRogge, I joined the Microsoft organization. Do we need to transfer any assets there, or how do you normally organize the assets between your profile and the Microsoft org?

@NielsRogge
Author

NielsRogge commented Sep 22, 2022

Hi @bsmock,

Thanks for joining the Microsoft organization. I've opened a PR to add Table Transformer here: #18920.

Once reviewed, I'll transfer the checkpoints currently hosted under my name to microsoft. You'll have write access, meaning that you can add model cards, update repos etc.

@yellowjs0304

yellowjs0304 commented Oct 6, 2022

Hi @NielsRogge, thank you for sharing this nice work.
I tried it out, but it returned different results compared to yours. I think the model loading failed in some layers.

I don't know the detailed reason, but maybe there's a problem with the transformers installation.
I didn't install your private transformers branch (add_table_transformer).
I just installed the official transformers (v4.22.0.dev0, from pip install transformers).
Do I need to install yours? Is this code not reflected in the official transformers yet?

Of course, the results are bad.

Could you please share any opinion about this?

@salman-moh

Hi @yellowjs0304, I don't see why you get such a prediction; I just ran Niels' entire notebook as-is, and it works.
Please check his code again.

@yellowjs0304

@salman-moh
Hi, thank you for sharing your opinion.
Could you please tell me how you installed transformers? Did you use the command lines in the Jupyter notebook?

@salman-moh

Yes, exactly how Niels did. Btw, I've also created a HuggingFace Space if you want to take a look :) https://huggingface.co/spaces/SalML/TableTransformer2CSV
What it does is: you paste an image with a table, and it will detect it and give you a dataframe that can be downloaded as a CSV.
And yes, it's also based on the 4.22.0.dev0 version. Do not install via pip install transformers.

@yellowjs0304

@salman-moh Hi, thank you for the fast reply... but the problem is not solved yet.
I deleted and made a new conda env, and followed Niels' code too, but the problem still remains. Could you please try it with a new environment and the same notebook file?
I'm not familiar with notebooks, so maybe this isn't the right idea, but I think the command pip install -q ./transformers might be related to the official transformers.

@emigomez

emigomez commented Oct 7, 2022

Hi! And thank you for sharing this nice work.
I think the output of the TSR model does not fill in the individual cell grids of the table. I suppose the solution to find these cells is to obtain them from the row and column bounding boxes; has someone carried out this procedure?
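One way to carry that out, assuming each detected row and column box spans the full table (a sketch, not code from this repo): intersect every row box with every column box to get per-cell boxes.

```python
def cells_from_rows_and_columns(rows, cols):
    """Build cell bounding boxes as intersections of row and column boxes.
    All boxes are (x0, y0, x1, y1); output is one list of cells per row,
    rows sorted top-to-bottom and columns left-to-right."""
    cells = []
    for rx0, ry0, rx1, ry1 in sorted(rows, key=lambda b: b[1]):
        row_cells = []
        for cx0, cy0, cx1, cy1 in sorted(cols, key=lambda b: b[0]):
            # Intersection: x-extent from the column, y-extent from the row
            row_cells.append((max(rx0, cx0), ry0, min(rx1, cx1), ry1))
        cells.append(row_cells)
    return cells

rows = [(0, 0, 100, 20), (0, 20, 100, 40)]
cols = [(0, 0, 50, 40), (50, 0, 100, 40)]
print(cells_from_rows_and_columns(rows, cols))
# → [[(0, 0, 50, 20), (50, 0, 100, 20)], [(0, 20, 50, 40), (50, 20, 100, 40)]]
```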

@salman-moh

Yes, that is what I've done underneath the HF Space; the code can be found here: https://github.com/salman-moh/TableTransformer2CSV/blob/salman-moh-patch-1/app.py or in the HF repo.

@emigomez

Thank you @salman-moh for sharing your work with us!!

Another question is whether it is possible to launch just the TSR without any TD (the input image to the TSR would be the extracted table). I tried to run just the nielsr/detr-table-structure-recognition model, but the output always has a table detected that is usually smaller than the original one. Is it possible to run nielsr/detr-table-structure-recognition just to detect the structure from an input table image?

This is an example where the input is directly the table, and as you can see, nielsr/detr-table-structure-recognition detects a (smaller) table to perform the TSR on.

@salman-moh

This is way off-topic for the issue. Please see some of the issues related to padding; in short, pad all 4 edges of the table and that might help.
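A minimal sketch of that padding step using PIL; the 40-pixel margin and white fill are arbitrary choices, not values from this repo:

```python
from PIL import Image, ImageOps

def pad_table(image, margin=40, color="white"):
    """Add a uniform margin around all four edges of a cropped table image,
    which tends to help the structure recognition model."""
    return ImageOps.expand(image, border=margin, fill=color)

table = Image.new("RGB", (600, 200), "white")  # stand-in for a cropped table
padded = pad_table(table)
print(padded.size)  # → (680, 280)
```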

@emigomez

emigomez commented Oct 11, 2022

@salman-moh thank you for your response. I know that including padding can help with the table detection issues, but the question is whether it is possible to obtain just the table structure of an input table image, without any detection. I mean, is there a parameter or something to run nielsr/detr-table-structure-recognition so that it produces a structure result over the full image, like in the next example?

This is the result of the notebook https://colab.research.google.com/drive/1lLRyBr7WraGdUJm-urUm_utArw6SkoCJ?usp=sharing#scrollTo=HqaYj7M8PeZe for TSR, with the same problem that I explained before.

I don't know if this is the best place to put the issue (sorry for that), but I write it here because I think it refers more to the notebook than to the current repo.

@NielsRogge
Author

NielsRogge commented Oct 18, 2022

Hi folks,

Table Transformer has now officially been added to 🤗 Transformers!

Check the docs here: https://huggingface.co/docs/transformers/main/en/model_doc/table-transformer

Checkpoints are now on the hub, as part of the Microsoft organization.

Demo notebooks can be found here: https://github.com/NielsRogge/Transformers-Tutorials/tree/master/Table%20Transformer
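Following the linked docs, inference now looks roughly like this (a sketch; the blank image is a stand-in for a real document page, and the 0.9 threshold is an arbitrary choice; the checkpoint is downloaded on first run):

```python
import torch
from PIL import Image
from transformers import AutoImageProcessor, TableTransformerForObjectDetection

processor = AutoImageProcessor.from_pretrained("microsoft/table-transformer-detection")
model = TableTransformerForObjectDetection.from_pretrained("microsoft/table-transformer-detection")

image = Image.new("RGB", (800, 600), "white")  # replace with a real document page
inputs = processor(images=image, return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

# Convert raw logits/boxes to thresholded detections in pixel coordinates
target_sizes = torch.tensor([image.size[::-1]])  # (height, width)
results = processor.post_process_object_detection(
    outputs, threshold=0.9, target_sizes=target_sizes
)[0]
for score, label, box in zip(results["scores"], results["labels"], results["boxes"]):
    print(model.config.id2label[label.item()], round(score.item(), 3), box.tolist())
```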

@MartinHaus1993

Hi @NielsRogge,

Thanks for this!! I saw that you have added TableTransformerForObjectDetection on huggingface.

I was trying to load this via

from transformers import TableTransformerForObjectDetection
model = TableTransformerForObjectDetection.from_pretrained("microsoft/table-transformer-detection")

which fails:
ImportError: cannot import name 'TableTransformerForObjectDetection' from 'transformers' (/usr/local/lib/python3.10/site-packages/transformers/__init__.py)

I am using transformers 4.24.0.dev0, same as in the config.json here (https://huggingface.co/microsoft/table-transformer-detection/blob/main/config.json). Is this update not yet on the main transformers repo?

When using either "SalML/DETR-table-detection" or "nielsr/detr-table-detection", I run into the same issue as @yellowjs0304 and no table gets detected. However, when I revert to transformers 4.22.0.dev0, both of them do work, but "microsoft/table-transformer-detection" keeps having the issue with

You are using a model of type table-transformer to instantiate a model of type detr. This is not supported for all configurations of models and can yield errors.
Some weights of the model checkpoint at microsoft/table-transformer-detection were not used when initializing DetrForObjectDetection: ['model.encoder.layernorm.weight', 'model.encoder.layernorm.bias'] 

TL;DR: "SalML/DETR-table-detection" and "nielsr/detr-table-detection" work with transformers 4.22.0.dev0 but not with transformers 4.24.0.dev0; "microsoft/table-transformer-detection" fails with both; and TableTransformerForObjectDetection does not seem to be part of transformers 4.24.0.dev0 yet?

Is there any way this can be fixed?

Thanks!!

@NielsRogge
Author

Hi,

You need to install from the main branch (pip install -q git+https://github.com/huggingface/transformers.git), as the model was just added and is not included in a PyPI release yet.

Also, please don't use nielsr/detr-table-detection anymore, as that one will only work from my add_table_transformer branch.

@MartinHaus1993

Thanks @NielsRogge, I saw that you have just added this to the repo. I can confirm this now works with transformers 4.24.0.dev0 and the new TableTransformerForObjectDetection.

Thanks!

@NielsRogge
Author

Also @bsmock, it would be great to add a mention in the README of this repo, as people will probably find it helpful ;)

@bsmock
Collaborator

bsmock commented Nov 17, 2022

@NielsRogge, finally getting a chance to circle back to this. First thing I'm wondering is what is the best way to put PubTables-1M onto the hub? Would you recommend just adding tar.gz files like we have currently? Or can the HuggingFace hub support such a large object detection dataset in a more database-like way? Any pointers/documentation you have on uploading large image datasets would be great!
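Not an official answer, but for reference the hub's Python client can push a local folder of archives as-is. A sketch, with the function only defined here (calling it needs a write token via `huggingface-cli login`, and the example repo name is hypothetical):

```python
from huggingface_hub import HfApi

def upload_dataset_folder(local_dir, repo_id):
    """Push a local dataset folder (e.g. the existing tar.gz files) to a
    dataset repo on the hub. Requires a prior `huggingface-cli login`.
    Large files are handled by git-LFS / the hub's upload API."""
    api = HfApi()
    api.create_repo(repo_id, repo_type="dataset", exist_ok=True)
    api.upload_folder(folder_path=local_dir, repo_id=repo_id, repo_type="dataset")

# Example call (hypothetical repo name, not an existing one):
# upload_dataset_folder("./pubtables-1m", "microsoft/PubTables-1M")
```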

@YvanKOB

YvanKOB commented May 9, 2023

Hi @NielsRogge, is it possible to take the Microsoft table-transformer pre-trained model and train it on my own tables for a specific task? If yes, how do I do it? I'm lost in the different documentation on HuggingFace and GitHub.
Thanks in advance

@pathikg

pathikg commented Jun 9, 2023

Hi @NielsRogge, is it possible to take the Microsoft table-transformer pre-trained model and train it on my own tables for a specific task? If yes, how do I do it? I'm lost in the different documentation on HuggingFace and GitHub. Thanks in advance

I think this tutorial might help
https://huggingface.co/docs/transformers/tasks/object_detection

  • prepare your dataset similar to the CPPE-5 dataset as shown in the tutorial
  • replace the checkpoint name with microsoft/table-transformer-detection, since Table Transformer is identical to DETR
  • use everything else as it is

@aditya-ihx

@NielsRogge, finally getting a chance to circle back to this. First thing I'm wondering is what is the best way to put PubTables-1M onto the hub? Would you recommend just adding tar.gz files like we have currently? Or can the HuggingFace hub support such a large object detection dataset in a more database-like way? Any pointers/documentation you have on uploading large image datasets would be great!

Have you found a solution to this problem? I am facing the same issue right now.

@NielsRogge
Author

Hi folks, new checkpoints (with updated notebooks and demos) are now available at #158

@mohammadreza-sheykhmousa

Hi @NielsRogge, thanks for your work. To fine-tune the model on table detection, I was wondering if I need to have the annotations in COCO format? I'd appreciate it if you could also let me know about the structure (in terms of directories) of the input data. Should it be:
-data
----training
------images
------annotations
----val
------images
------annotations
----testing
------images
------annotations

@NielsRogge
Author

NielsRogge commented Dec 8, 2023

Hi,

Yes, if you follow my fine-tuning notebook, your data needs to be in the COCO format.

Personally, I'd recommend a tool like Roboflow, which allows you to annotate data and export it into any format you prefer, like COCO.
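For anyone unsure what the COCO detection format entails: it's a single JSON file with images, annotations, and categories lists. A tiny hand-built sketch (file name, ids, and box values are all illustrative):

```python
import json

# A minimal COCO-style detection annotation: one image, one "table" box.
# COCO boxes are (x, y, width, height) in pixels.
coco = {
    "images": [{"id": 1, "file_name": "page_001.png", "width": 1000, "height": 800}],
    "annotations": [{
        "id": 1,
        "image_id": 1,
        "category_id": 0,
        "bbox": [100, 100, 500, 300],
        "area": 500 * 300,
        "iscrowd": 0,
    }],
    "categories": [{"id": 0, "name": "table"}],
}

with open("annotations.json", "w") as f:
    json.dump(coco, f, indent=2)
print(sorted(coco.keys()))  # → ['annotations', 'categories', 'images']
```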

@SohamTolwala

Hi @NielsRogge, I was using the inference notebook just up to the table detection and table cropping part. The issue I had was that the model was also detecting false tables; here is an example of what it detected (a completely blank area). It detects the required tables correctly, but this is an issue I want to tackle. Would fine-tuning resolve this? Please do suggest something; maybe there are parameters I can tweak?
