
Adding Table Transformer models to HuggingFace Transformers #68

Closed
NielsRogge opened this issue Sep 6, 2022 · 35 comments

Comments

@NielsRogge

NielsRogge commented Sep 6, 2022

Hi Table Transformer team :)

As I implemented DETR in 🤗 HuggingFace Transformers a few months ago, it was relatively straightforward to port the 2 checkpoints you released. Here's a notebook that illustrates inference with DETR for table detection and table structure recognition: https://colab.research.google.com/drive/1lLRyBr7WraGdUJm-urUm_utArw6SkoCJ?usp=sharing
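A detail from that notebook worth calling out: DETR outputs boxes as normalized (cx, cy, w, h), which have to be rescaled to pixel coordinates before cropping or drawing. A minimal sketch (the function name is illustrative, not the exact notebook code):

```python
def rescale_bboxes(boxes, img_width, img_height):
    """Convert normalized DETR-style (cx, cy, w, h) boxes to
    absolute (x0, y0, x1, y1) pixel coordinates."""
    out = []
    for cx, cy, w, h in boxes:
        out.append((
            (cx - w / 2) * img_width,
            (cy - h / 2) * img_height,
            (cx + w / 2) * img_width,
            (cy + h / 2) * img_height,
        ))
    return out

# A box centered in a 1000x800 image, covering half of each dimension:
print(rescale_bboxes([(0.5, 0.5, 0.5, 0.5)], 1000, 800))
# → [(250.0, 200.0, 750.0, 600.0)]
```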

As you may or may not know, any model on the HuggingFace hub has its own Github repository. E.g. the DETR-table-detection checkpoint can be found here: https://huggingface.co/nielsr/detr-table-detection. If you check the "files and versions" tab, it includes the weights. The model hub uses git-LFS (large file storage) to use Git with large files such as model weights. This means that any model has its own Git commit history!

A model card can also be added to the repo, which is just a README.

Are you interested in joining the Microsoft organization on the hub, such that we can store all model checkpoints there (rather than under my user name)?

Also, it would be great to add PubTables-1M (and potentially other datasets useful for improving AI on unstructured documents) to the 🤗 hub. Would you be up for that?

Let me know!

Kind regards,

Niels
ML Engineer @ HuggingFace

@NielsRogge
Author

NielsRogge commented Sep 7, 2022

Relevant to #17, #44, #59

@NielsRogge
Author

cc'ing @bsmock for visibility

@light42

light42 commented Sep 8, 2022

@NielsRogge could you add another pipeline to turn the results into DataFrame?

@NielsRogge
Author

Hi,

The model isn't extracting any text from images. It's just an object detector, so it can detect where tables, table rows and columns etc. are in an image. You would need to crop the image based on the bounding box coordinates, after which you can run an OCR model (like TrOCR) to get the actual text.
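A minimal sketch of that crop-then-OCR flow, assuming boxes in absolute pixel coordinates (`run_ocr` is a hypothetical placeholder for whatever OCR model you plug in, e.g. TrOCR or Tesseract):

```python
from PIL import Image

def crop_detections(image, boxes, padding=10):
    """Crop one sub-image per detected bounding box.
    Boxes are absolute (x0, y0, x1, y1) pixel coordinates."""
    crops = []
    for x0, y0, x1, y1 in boxes:
        crop = image.crop((
            max(0, x0 - padding),
            max(0, y0 - padding),
            min(image.width, x1 + padding),
            min(image.height, y1 + padding),
        ))
        crops.append(crop)
    return crops

page = Image.new("RGB", (1000, 800), "white")   # stand-in for a document page
tables = crop_detections(page, [(100, 100, 600, 400)])
print(tables[0].size)   # → (520, 320)
# Each crop can then be passed to an OCR model, e.g.:
# text = run_ocr(tables[0])   # hypothetical OCR helper
```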

@salman-moh

salman-moh commented Sep 8, 2022

Is TrOCR better than Tesseract? I thought pytesseract would be the top option.
Also, is TSR part of your Colab notebook from the MSFT table-transformer team, or just the TD section?

Thanks so much both @NielsRogge & @bsmock.

@bsmock
Collaborator

bsmock commented Sep 8, 2022

Hi Niels,

We're excited to see that you were able to successfully integrate the model into HuggingFace Transformers. Nice work--this is great! We will review what you've done to make sure it's all correct and see how we can get involved. Thanks!

Cheers,
Brandon

@NielsRogge
Author

NielsRogge commented Sep 9, 2022

Is TrOCR better than Tesseract? I thought pytesseract would be the top option.

TrOCR seems better than Tesseract, especially on handwritten text. However, the model only works on single text-line images (unlike Tesseract, which works on entire PDFs).

Also, is TSR part of your Colab notebook from the MSFT table-transformer team, or just the TD section?

The table structure recognition part is also from Microsoft (this repository shared 2 checkpoints: one for table detection and one for table structure recognition).

@salman-moh

Any thoughts on #69 (comment) @NielsRogge ?

@khadkechetan

@NielsRogge This is awesome. However, I guess we have to work on getting the data into CSV. Let me know if there is any reference for that.
Since this is under the MIT license, we can use it commercially. Please confirm.


@vallabh001

Hey @NielsRogge, @bsmock, can you please help me with this?
#70

@bsmock
Collaborator

bsmock commented Sep 16, 2022

@NielsRogge, I joined the Microsoft organization. Do we need to transfer any assets there, or how do you normally organize the assets between your profile and the Microsoft org?

@NielsRogge
Author

NielsRogge commented Sep 22, 2022

Hi @bsmock,

Thanks for joining the Microsoft organization. I've opened a PR to add Table Transformer here: #18920.

Once reviewed, I'll transfer the checkpoints currently hosted under my name to microsoft. You'll have write access, meaning that you can add model cards, update repos etc.

@yellowjs0304

yellowjs0304 commented Oct 6, 2022

Hi @NielsRogge, thank you for sharing this nice work.
I tried it out, but it returned different results compared to yours. I think the model loading failed in some layers.

I don't know the detailed reason, but maybe there's a problem with the transformers installation.
I didn't install your private transformers branch (add_table_transformer).
I just installed the official transformers (v4.22.0.dev0, from pip install transformers).
Do I need to install yours? Is this code not reflected in the official transformers yet?

Of course, the results are bad.

Could you please share any opinion about this?

@salman-moh

Hi @yellowjs0304, I don't see why you get such a prediction; I just ran Niels' entire notebook as-is, and it works.
Please check his code again.

@yellowjs0304

@salman-moh
Hi, thank you for sharing your opinion.
Could you please tell me how you installed transformers? Did you use the command lines in the Jupyter notebook?

@salman-moh

Yes, exactly how Niels did. Btw, I've also created a HuggingFace Space if you want to take a look :) https://huggingface.co/spaces/SalML/TableTransformer2CSV
What it does is: you paste an image with a table, and it will detect it and give you a dataframe that can be downloaded as a CSV.
And yes, it's also based on the 4.22.0.dev0 version. Do not install via pip install transformers.

@yellowjs0304

@salman-moh Hi, thank you for the fast reply... but the problem is not solved yet.
I deleted and made a new conda env, and followed Niels' code too, but the problem still remains. Could you please try it with a new environment and the same notebook file?
I'm not familiar with notebooks, so maybe this isn't the right idea, but I think the command pip install -q ./transformers might be related to the official transformers.

@emigomez

emigomez commented Oct 7, 2022

Hi! And thank you for sharing this nice work.
I think the output of the TSR model does not fill in the individual cell grids of the table. I suppose the solution to find these cells is to obtain them from the row and column bounding boxes; has someone carried out this procedure?
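One way to carry that out, assuming each detected row and column box spans the full table (a sketch, not code from this repo): intersect every row box with every column box to get per-cell boxes.

```python
def cells_from_rows_and_columns(rows, cols):
    """Build cell bounding boxes as intersections of row and column boxes.
    All boxes are (x0, y0, x1, y1); output is one list of cells per row,
    rows sorted top-to-bottom and columns left-to-right."""
    cells = []
    for rx0, ry0, rx1, ry1 in sorted(rows, key=lambda b: b[1]):
        row_cells = []
        for cx0, cy0, cx1, cy1 in sorted(cols, key=lambda b: b[0]):
            # Intersection: x-extent from the column, y-extent from the row
            row_cells.append((max(rx0, cx0), ry0, min(rx1, cx1), ry1))
        cells.append(row_cells)
    return cells

rows = [(0, 0, 100, 20), (0, 20, 100, 40)]
cols = [(0, 0, 50, 40), (50, 0, 100, 40)]
print(cells_from_rows_and_columns(rows, cols))
# → [[(0, 0, 50, 20), (50, 0, 100, 20)], [(0, 20, 50, 40), (50, 20, 100, 40)]]
```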

@salman-moh

Yes, that is what I've done underneath the HF Space; the code can be found here: https://github.com/salman-moh/TableTransformer2CSV/blob/salman-moh-patch-1/app.py or in the HF repo.

@emigomez

Thank you @salman-moh for sharing your work with us!!

Another question is whether it is possible to launch just the TSR without any TD (the input image to the TSR would be the extracted table). I tried to run just the nielsr/detr-table-structure-recognition model, but the output always has a table detected that is usually smaller than the original one. Is it possible to run nielsr/detr-table-structure-recognition just to detect the structure from an input table image?

This is an example where the input is directly the table, and as you can see, nielsr/detr-table-structure-recognition detects a (smaller) table to perform the TSR on.

@salman-moh

This is way off-topic for the issue. Please see some of the issues related to padding; in short, pad all 4 edges of the table and that might help.
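A minimal sketch of that padding step using PIL; the 40-pixel margin and white fill are arbitrary choices, not values from this repo:

```python
from PIL import Image, ImageOps

def pad_table(image, margin=40, color="white"):
    """Add a uniform margin around all four edges of a cropped table image,
    which tends to help the structure recognition model."""
    return ImageOps.expand(image, border=margin, fill=color)

table = Image.new("RGB", (600, 200), "white")  # stand-in for a cropped table
padded = pad_table(table)
print(padded.size)  # → (680, 280)
```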

@emigomez

emigomez commented Oct 11, 2022

@salman-moh thank you for your response. I know that including padding can help with the table detection issues, but the question is whether it is possible to obtain just the table structure of an input table image, without any detection. I mean, is there a parameter or something to run nielsr/detr-table-structure-recognition so that it produces a structure result over the full image, like in the next example?

This is the result of the notebook https://colab.research.google.com/drive/1lLRyBr7WraGdUJm-urUm_utArw6SkoCJ?usp=sharing#scrollTo=HqaYj7M8PeZe for TSR, with the same problem that I explained before.

I don't know if this is the best place to put the issue (sorry for that), but I write it here because I think it refers more to the notebook than to the current repo.

@NielsRogge
Author

NielsRogge commented Oct 18, 2022

Hi folks,

Table Transformer has now officially been added to 🤗 Transformers!

Check the docs here: https://huggingface.co/docs/transformers/main/en/model_doc/table-transformer

Checkpoints are now on the hub, as part of the Microsoft organization.

Demo notebooks can be found here: https://github.com/NielsRogge/Transformers-Tutorials/tree/master/Table%20Transformer
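Following the linked docs, inference now looks roughly like this (a sketch; the blank image is a stand-in for a real document page, and the 0.9 threshold is an arbitrary choice; the checkpoint is downloaded on first run):

```python
import torch
from PIL import Image
from transformers import AutoImageProcessor, TableTransformerForObjectDetection

processor = AutoImageProcessor.from_pretrained("microsoft/table-transformer-detection")
model = TableTransformerForObjectDetection.from_pretrained("microsoft/table-transformer-detection")

image = Image.new("RGB", (800, 600), "white")  # replace with a real document page
inputs = processor(images=image, return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

# Convert raw logits/boxes to thresholded detections in pixel coordinates
target_sizes = torch.tensor([image.size[::-1]])  # (height, width)
results = processor.post_process_object_detection(
    outputs, threshold=0.9, target_sizes=target_sizes
)[0]
for score, label, box in zip(results["scores"], results["labels"], results["boxes"]):
    print(model.config.id2label[label.item()], round(score.item(), 3), box.tolist())
```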

@MartinHaus1993

Hi @NielsRogge,

Thanks for this!! I saw that you have added TableTransformerForObjectDetection on huggingface.

I was trying to load this via

from transformers import TableTransformerForObjectDetection
model = TableTransformerForObjectDetection.from_pretrained("microsoft/table-transformer-detection")

which fails:
ImportError: cannot import name 'TableTransformerForObjectDetection' from 'transformers' (/usr/local/lib/python3.10/site-packages/transformers/__init__.py)

I am using transformers 4.24.0.dev0, same as in the config.json here (https://huggingface.co/microsoft/table-transformer-detection/blob/main/config.json). Is this update not yet on the main transformers repo?

When using either "SalML/DETR-table-detection" or "nielsr/detr-table-detection", I run into the same issue as @yellowjs0304 and no table gets detected. However, when I revert to transformers 4.22.0.dev0, both of them do work, but "microsoft/table-transformer-detection" keeps having the issue with

You are using a model of type table-transformer to instantiate a model of type detr. This is not supported for all configurations of models and can yield errors.
Some weights of the model checkpoint at microsoft/table-transformer-detection were not used when initializing DetrForObjectDetection: ['model.encoder.layernorm.weight', 'model.encoder.layernorm.bias'] 

TL;DR: "SalML/DETR-table-detection" and "nielsr/detr-table-detection" work with transformers 4.22.0.dev0 but not with transformers 4.24.0.dev0; "microsoft/table-transformer-detection" fails with both; and TableTransformerForObjectDetection does not seem to be part of transformers 4.24.0.dev0 yet?

Is there any way this can be fixed?

Thanks!!

@NielsRogge
Author

Hi,

You need to install from the main branch (pip install -q git+https://github.com/huggingface/transformers.git), as the model was just added and is not included in a PyPI release yet.

Also, please don't use nielsr/detr-table-detection anymore, as that one will only work from my add_table_transformer branch.

@MartinHaus1993

Thanks @NielsRogge, I saw that you have just added this to the repo. I can confirm this now works with transformers 4.24.0.dev0 and the new TableTransformerForObjectDetection.

Thanks!

@NielsRogge
Author

Also @bsmock, it would be great to add a mention in the README of this repo, as people will probably find it helpful ;)

@bsmock
Collaborator

bsmock commented Nov 17, 2022

@NielsRogge, finally getting a chance to circle back to this. First thing I'm wondering is what is the best way to put PubTables-1M onto the hub? Would you recommend just adding tar.gz files like we have currently? Or can the HuggingFace hub support such a large object detection dataset in a more database-like way? Any pointers/documentation you have on uploading large image datasets would be great!
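Not an official answer, but for reference the hub's Python client can push a local folder of archives as-is. A sketch, with the function only defined here (calling it needs a write token via `huggingface-cli login`, and the example repo name is hypothetical):

```python
from huggingface_hub import HfApi

def upload_dataset_folder(local_dir, repo_id):
    """Push a local dataset folder (e.g. the existing tar.gz files) to a
    dataset repo on the hub. Requires a prior `huggingface-cli login`.
    Large files are handled by git-LFS / the hub's upload API."""
    api = HfApi()
    api.create_repo(repo_id, repo_type="dataset", exist_ok=True)
    api.upload_folder(folder_path=local_dir, repo_id=repo_id, repo_type="dataset")

# Example call (hypothetical repo name, not an existing one):
# upload_dataset_folder("./pubtables-1m", "microsoft/PubTables-1M")
```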

@YvanKOB

YvanKOB commented May 9, 2023

Hi @NielsRogge, is it possible to take the Microsoft table-transformer pre-trained model and train it on my own tables for a specific task? If yes, how do I do it? I'm lost in the different documentation on HuggingFace and GitHub.
Thanks in advance

@pathikg

pathikg commented Jun 9, 2023

Hi @NielsRogge, is it possible to take the Microsoft table-transformer pre-trained model and train it on my own tables for a specific task? If yes, how do I do it? I'm lost in the different documentation on HuggingFace and GitHub. Thanks in advance

I think this tutorial might help
https://huggingface.co/docs/transformers/tasks/object_detection

  • prepare your dataset similar to the CPPE-5 dataset as shown in the tutorial
  • replace the checkpoint name with microsoft/table-transformer-detection, since Table Transformer is identical to DETR
  • use everything else as it is

@aditya-ihx

@NielsRogge, finally getting a chance to circle back to this. First thing I'm wondering is what is the best way to put PubTables-1M onto the hub? Would you recommend just adding tar.gz files like we have currently? Or can the HuggingFace hub support such a large object detection dataset in a more database-like way? Any pointers/documentation you have on uploading large image datasets would be great!

Have you found a solution to this problem? I am facing the same issue right now.

@NielsRogge
Author

Hi folks, new checkpoints (with updated notebooks and demos) are now available at #158

@mohammadreza-sheykhmousa

Hi @NielsRogge, thanks for your work. To fine-tune the model on table detection, I was wondering if I need to have the annotations in COCO format? I'd appreciate it if you could also let me know about the structure (in terms of directories) of the input data. Should it be:
-data
----training
------images
------annotations
----val
------images
------annotations
----testing
------images
------annotations

@NielsRogge
Author

NielsRogge commented Dec 8, 2023

Hi,

Yes, if you follow my fine-tuning notebook, your data needs to be in the COCO format.

Personally, I'd recommend a tool like Roboflow, which allows you to annotate data and export it into any format you prefer, like COCO.
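For anyone unsure what the COCO detection format entails: it's a single JSON file with images, annotations, and categories lists. A tiny hand-built sketch (file name, ids, and box values are all illustrative):

```python
import json

# A minimal COCO-style detection annotation: one image, one "table" box.
# COCO boxes are (x, y, width, height) in pixels.
coco = {
    "images": [{"id": 1, "file_name": "page_001.png", "width": 1000, "height": 800}],
    "annotations": [{
        "id": 1,
        "image_id": 1,
        "category_id": 0,
        "bbox": [100, 100, 500, 300],
        "area": 500 * 300,
        "iscrowd": 0,
    }],
    "categories": [{"id": 0, "name": "table"}],
}

with open("annotations.json", "w") as f:
    json.dump(coco, f, indent=2)
print(sorted(coco.keys()))  # → ['annotations', 'categories', 'images']
```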

@SohamTolwala

Hi @NielsRogge, I was using the inference notebook just up to the table detection and table cropping part. The issue I had was that the model was also detecting false tables; here is an example of what it detected (a completely blank area). It detects the required tables correctly, but this is an issue I want to tackle. Would fine-tuning resolve this? Please do suggest something; maybe there are parameters I can tweak?
