Adding Table Transformer models to HuggingFace Transformers #68
Comments
cc'ing @bsmock for visibility
@NielsRogge could you add another pipeline to turn the results into a DataFrame?
Hi, The model isn't extracting any text from images. It's just an object detector, so it can detect where tables, table rows, columns, etc. are in an image. You would need to crop the image based on the bounding box coordinates, after which you can run an OCR model (like TrOCR) to get the actual text.
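The cropping step described above can be sketched as follows. This is a minimal sketch, assuming the detection boxes are `(xmin, ymin, xmax, ymax)` pixel coordinates (the format produced by the post-processing shown in the linked notebooks); the function name and the padding margin are illustrative, not part of any library API.

```python
from PIL import Image


def crop_detections(image: Image.Image, boxes, padding: int = 10):
    """Crop each detected table region out of the page image.

    `boxes` holds (xmin, ymin, xmax, ymax) pixel coordinates.
    A small margin is kept around each crop, clamped to the image bounds.
    """
    crops = []
    for xmin, ymin, xmax, ymax in boxes:
        left = max(0, int(xmin) - padding)
        top = max(0, int(ymin) - padding)
        right = min(image.width, int(xmax) + padding)
        bottom = min(image.height, int(ymax) + padding)
        crops.append(image.crop((left, top, right, bottom)))
    return crops

# Each crop can then be fed to an OCR model, e.g. TrOCR
# (microsoft/trocr-base-printed) line by line, or to pytesseract.
```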
Is TrOCR better than Tesseract? I thought pytesseract would be the top option. Thanks so much to both @NielsRogge & @bsmock.
Hi Niels, We're excited to see that you were able to successfully integrate the model into HuggingFace Transformers. Nice work--this is great! We will review what you've done to make sure it's all correct and see how we can get involved. Thanks! Cheers,
TrOCR seems better than Tesseract, especially on handwritten text. However, the model only works on single text-line images (unlike Tesseract, which works on entire PDFs).
The Table Structure Recognition part is also from Microsoft (this repository shared two checkpoints: one for table detection and one for table structure recognition).
Any thoughts on #69 (comment), @NielsRogge?
@NielsRogge This is awesome. However, I guess we have to work on getting the data into CSV. Let me know if there's any reference for this.
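One common way to get from structure-recognition output toward a CSV is to intersect the detected row and column boxes into a grid of cell boxes, which can then be cropped and OCR'd. This is a hedged sketch under that assumption; `cells_from_structure` is a hypothetical helper, and the `(xmin, ymin, xmax, ymax)` box format mirrors the detection outputs discussed above.

```python
import pandas as pd


def cells_from_structure(row_boxes, col_boxes):
    """Intersect row and column boxes into a grid of cell boxes.

    Each box is an (xmin, ymin, xmax, ymax) tuple. Rows are sorted
    top-to-bottom, columns left-to-right; each cell takes its x-extent
    from a column and its y-extent from a row. The resulting DataFrame
    holds cell coordinates, ready for cropping + OCR.
    """
    rows = sorted(row_boxes, key=lambda b: b[1])  # sort by ymin
    cols = sorted(col_boxes, key=lambda b: b[0])  # sort by xmin
    grid = [[(c[0], r[1], c[2], r[3]) for c in cols] for r in rows]
    return pd.DataFrame(grid)
```

After OCR, `DataFrame.to_csv()` gives the CSV file directly.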
Hey @NielsRogge, @bsmock, can you please help me with this?
@NielsRogge, I joined the Microsoft organization. Do we need to transfer any assets there, or how do you normally organize the assets between your profile and the Microsoft org?
Hi @NielsRogge, Thank you for sharing this nice work. I don't know the detailed reason, but maybe there's a problem with the Transformers installation. Of course, the results are bad. Could you please share any opinion about this?
Hi @yellowjs0304, I don't see why you get such a prediction; I just ran Niels' entire notebook as is, and it works:
@salman-moh
Yes, exactly how Niels did it. Btw, I also created a HuggingFace Space if you want to take a look :) https://huggingface.co/spaces/SalML/TableTransformer2CSV
@salman-moh Hi, Thank you for the fast reply... but the problem is not solved yet; it still remains. Could you please try it in a new environment with the same notebook file?
Hi! And thank you for sharing this nice work...
Yes, that is what I've done underneath the HF Space; the code can be found here: https://github.com/salman-moh/TableTransformer2CSV/blob/salman-moh-patch-1/app.py or even from the HF repo.
Thank you @salman-moh for sharing your work with us!! Another question: is it possible to launch just the TSR without any TD (the input image to the TSR being the extracted table)? I tried to run just the nielsr/detr-table-structure-recognition model, but the output always has a detected table, usually smaller than the original one. Is it possible to run nielsr/detr-table-structure-recognition just to detect structure from an input table image? Here is an example where the input is directly the table, and as you can see, nielsr/detr-table-structure-recognition detects a smaller table to perform the TSR on.
This is way off-topic for the issue. Please see some of the issues related to padding; in short, pad all four edges of the table and that might help.
@salman-moh thank you for your response. I know that padding can help with the table detection issues, but the question is whether it is possible to obtain just the table structure of an input table image, without any detection. I mean, is there a parameter or something to run nielsr/detr-table-structure-recognition so that it outputs the structure over the full image, as in the next example? This is the result of the notebook https://colab.research.google.com/drive/1lLRyBr7WraGdUJm-urUm_utArw6SkoCJ?usp=sharing#scrollTo=HqaYj7M8PeZe for TSR, with the same problem I explained before. I don't know if this is the best place to raise the issue (sorry for that), but I write it here because I think it refers more to the notebook than to the current repo.
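The padding workaround suggested above can be done in one call with Pillow. This is a minimal sketch, assuming the structure-recognition checkpoint behaves better when the table crop keeps some white page context around it (as the padding discussion above suggests); the function name and the default margin are illustrative.

```python
from PIL import Image, ImageOps


def pad_table_crop(table_image: Image.Image, margin: int = 40):
    """Add a white margin on all four edges of a cropped table image.

    A tight crop tends to make the structure model re-detect a smaller
    table inside the input; a padded crop usually gives cleaner
    row/column predictions.
    """
    return ImageOps.expand(table_image, border=margin, fill="white")
```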
Hi folks, Table Transformer has now officially been added to 🤗 Transformers! Check the docs here: https://huggingface.co/docs/transformers/main/en/model_doc/table-transformer Checkpoints are now on the hub, as part of the Microsoft organization:
Demo notebooks can be found here: https://github.com/NielsRogge/Transformers-Tutorials/tree/master/Table%20Transformer
Hi @NielsRogge, Thanks for this!! I saw that you have added TableTransformerForObjectDetection on huggingface. I was trying to load this via
which fails: I am using transformers 4.24.0.dev0, same as in the config.json here ( https://huggingface.co/microsoft/table-transformer-detection/blob/main/config.json ). Is this update not yet on the main transformers repo? When using either "SalML/DETR-table-detection" or "nielsr/detr-table-detection", I run into the same issue as @yellowjs0304 and no table gets detected. However, when I revert back to transformers 4.22.0.dev0, both of them do work, but "microsoft/table-transformer-detection" keeps having the issue with
TL;DR: "SalML/DETR-table-detection" and "nielsr/detr-table-detection" work with transformers 4.22.0.dev0 but not with transformers 4.24.0.dev0, "microsoft/table-transformer-detection" fails with both, and TableTransformerForObjectDetection does not seem to be part of transformers 4.24.0.dev0 yet? Is there any way this can be fixed? Thanks!!
Hi, You need to install from the main branch. Also, please don't use
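For reference, installing 🤗 Transformers from the main branch is typically done with a Git install; this is a sketch of the usual command, assuming `pip` and Git are available (note this fetches the development version, not a tagged release):

```shell
pip install -q git+https://github.com/huggingface/transformers.git
```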
Thanks @NielsRogge, saw that you have just added this now on the repo. Can confirm this works now with transformers 4.24.0.dev0 and the new TableTransformerForObjectDetection. Thanks!
Also @bsmock, it would be great to add a mention in the README of this repo, as people probably will find it helpful ;)
@NielsRogge, finally getting a chance to circle back to this. First thing I'm wondering is: what is the best way to put PubTables-1M onto the hub? Would you recommend just adding tar.gz files like we have currently? Or can the HuggingFace hub support such a large object detection dataset in a more database-like way? Any pointers/documentation you have on uploading large image datasets would be great!
Hi @NielsRogge, is it possible to use the Microsoft table-transformer pre-trained model and train it on my own tables for a specific task? If yes, how do I do it? I'm lost among the different documentation on HuggingFace and GitHub.
I think this tutorial might help
|
Have you found a solution to this problem? I am facing the same right now.
Hi folks, new checkpoints (with updated notebooks and demos) are now available at #158 |
Hi @NielsRogge, thanks for your work. To fine-tune the model on table detection, I was wondering if I need to have the annotations in COCO format? I'd appreciate it if you could also let me know about the directory structure of the input data - should it be
Hi, Yes, if you follow my fine-tuning notebook, your data needs to be in the COCO format. Personally I'd recommend a tool like RoboFlow, which lets you annotate data and export it into any format you prefer, like COCO.
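For orientation, a COCO-format detection dataset is a folder of images plus one JSON annotation file. The sketch below writes a minimal such file; the file names, image sizes, and the single `table` category are illustrative assumptions, not taken from the fine-tuning notebook.

```python
import json

# Minimal COCO-style annotation file for table detection.
# COCO bounding boxes are [xmin, ymin, width, height] in pixels.
coco = {
    "images": [
        {"id": 1, "file_name": "page_001.png", "width": 1200, "height": 1600},
    ],
    "annotations": [
        {
            "id": 1,
            "image_id": 1,
            "category_id": 0,
            "bbox": [100, 200, 800, 400],
            "area": 800 * 400,
            "iscrowd": 0,
        },
    ],
    "categories": [{"id": 0, "name": "table"}],
}

with open("custom_train.json", "w") as f:
    json.dump(coco, f)
```

A typical layout is then one annotation JSON per split (train/val), each next to its image folder.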
Hi @NielsRogge, I was using the inference notebook just up to the part of detecting the table and cropping it out. The issue I had was that the model was also detecting false tables; here is an example of what it detected (a completely blank area). It detects the required tables correctly, but this is an issue I want to tackle. Would fine-tuning resolve this? Please suggest any parameters I could tweak.
Hi Table Transformer team :)
As I implemented DETR in 🤗 HuggingFace Transformers a few months ago, it was relatively straightforward to port the two checkpoints you released. Here's a notebook that illustrates inference with DETR for table detection and table structure recognition: https://colab.research.google.com/drive/1lLRyBr7WraGdUJm-urUm_utArw6SkoCJ?usp=sharing
As you may or may not know, any model on the HuggingFace hub has its own GitHub-style repository. E.g. the DETR-table-detection checkpoint can be found here: https://huggingface.co/nielsr/detr-table-detection. If you check the "files and versions" tab, it includes the weights. The model hub uses Git LFS (large file storage) to use Git with large files such as model weights. This means that any model has its own Git commit history!
A model card can also be added to the repo, which is just a README.
Are you interested in joining the Microsoft organization on the hub, such that we can store all model checkpoints there (rather than under my user name)?
Also, it would be great to add PubTables-1M (and potentially other datasets, useful for improving AI on unstructured documents) to the 🤗 hub. Would you be up for that?
Let me know!
Kind regards,
Niels
ML Engineer @ HuggingFace