Training from wit.ai data causes exception in extractor.py #7676

Closed
sciguy14 opened this issue Jan 4, 2021 · 14 comments
Labels
area:rasa-oss 🎡 Anything related to the open source Rasa framework area:rasa-oss/training-data Issues focused around Rasa training data (stories, NLU, domain, etc.) type:bug 🐛 Inconsistencies or issues which will cause an issue or problem for users or implementors.

Comments

@sciguy14

sciguy14 commented Jan 4, 2021

Rasa version: 2.2.2

Rasa SDK version (if used & relevant): 2.2.0

Rasa X version (if used & relevant): None

Python version: 3.8.5

Operating system (windows, osx, ...): WSL (Linux-4.4.0-19041-Microsoft-x86_64-with-glibc2.29)

Issue:
Following the instructions to convert my wit.ai data to Rasa training data results in a Python exception. I followed these instructions: https://rasa.com/docs/rasa/migrate-from/facebook-wit-ai-to-rasa/

My utterances-1.json file is present in the data/ folder as noted in the migration instructions.

Error (including full traceback):

(venv) jeremy@PHOENIX:/mnt/c/wit.ai to Rasa$ rasa train nlu
2021-01-03 21:14:05.237561: W tensorflow/stream_executor/platform/default/dso_loader.cc:59] Could not load dynamic library 'libcudart.so.10.1'; dlerror: libcudart.so.10.1: cannot open shared object file: No such file or directory
2021-01-03 21:14:05.238107: I tensorflow/stream_executor/cuda/cudart_stub.cc:29] Ignore above cudart dlerror if you do not have a GPU set up on your machine.
2021-01-03 21:14:07.937718: W tensorflow/stream_executor/platform/default/dso_loader.cc:59] Could not load dynamic library 'libcuda.so.1'; dlerror: libcuda.so.1: cannot open shared object file: No such file or directory
2021-01-03 21:14:07.938268: W tensorflow/stream_executor/cuda/cuda_driver.cc:312] failed call to cuInit: UNKNOWN ERROR (303)
2021-01-03 21:14:07.938525: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:156] kernel driver does not appear to be running on this host (PHOENIX): /proc/driver/nvidia/version does not exist
The configuration for pipeline was chosen automatically. It was written into the config file at 'config.yml'.
Training NLU model...
2021-01-03 21:14:08 INFO     rasa.shared.nlu.training_data.training_data  - Training data stats:
2021-01-03 21:14:08 INFO     rasa.shared.nlu.training_data.training_data  - Number of intent examples: 1066 (20 distinct intents)

2021-01-03 21:14:08 INFO     rasa.shared.nlu.training_data.training_data  -   Found intents: 'introductions', 'date', 'snooze', 'play_news', 'prev_track', 'weather', 'play_music', 'out_of_scope', 'choose_music', 'curtains', 'stop_audio', 'goodnight', 'greetings', 'pause_audio', 'alarm', 'volume', 'lighting', 'time', 'next_track', 'wikipedia'
2021-01-03 21:14:08 INFO     rasa.shared.nlu.training_data.training_data  - Number of response examples: 0 (0 distinct responses)
2021-01-03 21:14:08 INFO     rasa.shared.nlu.training_data.training_data  - Number of entity examples: 706 (10 distinct entities)
2021-01-03 21:14:08 INFO     rasa.shared.nlu.training_data.training_data  -   Found entity types: 'relative_adjustment', 'wit$location', 'location_1', 'wit$wikipedia_search_query', 'wit$datetime', 'wit$number', 'artist', 'light_type_entity', 'mood', 'song'
2021-01-03 21:14:08 INFO     rasa.shared.nlu.training_data.training_data  -   Found entity roles: 'wikipedia_search_query', 'relative_adjustment', 'number', 'location', 'location_1', 'datetime', 'artist', 'light_type_entity', 'mood', 'song'
2021-01-03 21:14:08 WARNING  rasa.shared.utils.common  - The Entity Roles and Groups feature is currently experimental and might change or be removed in the future 🔬 Please share your feedback on it in the forum (https://forum.rasa.com) to help us make this feature ready for production.
/mnt/c/wit.ai to Rasa/venv/lib/python3.8/site-packages/rasa/shared/utils/io.py:93: UserWarning: Intent 'out_of_scope' has only 1 training examples! Minimum is 2, training may fail.
2021-01-03 21:14:08 INFO     rasa.nlu.model  - Starting to train component WhitespaceTokenizer
2021-01-03 21:14:08 INFO     rasa.nlu.model  - Finished training component.
2021-01-03 21:14:08 INFO     rasa.nlu.model  - Starting to train component RegexFeaturizer
2021-01-03 21:14:08 INFO     rasa.nlu.model  - Finished training component.
2021-01-03 21:14:08 INFO     rasa.nlu.model  - Starting to train component LexicalSyntacticFeaturizer
2021-01-03 21:14:09 INFO     rasa.nlu.model  - Finished training component.
2021-01-03 21:14:09 INFO     rasa.nlu.model  - Starting to train component CountVectorsFeaturizer
2021-01-03 21:14:09 INFO     rasa.nlu.featurizers.sparse_featurizer.count_vectors_featurizer  - 578 vocabulary slots consumed out of 1578 slots configured for text attribute.
2021-01-03 21:14:09 INFO     rasa.nlu.model  - Finished training component.
2021-01-03 21:14:09 INFO     rasa.nlu.model  - Starting to train component CountVectorsFeaturizer
2021-01-03 21:14:09 INFO     rasa.nlu.featurizers.sparse_featurizer.count_vectors_featurizer  - 3866 vocabulary slots consumed out of 5799 slots configured for text attribute.
2021-01-03 21:14:09 INFO     rasa.nlu.model  - Finished training component.
2021-01-03 21:14:09 INFO     rasa.nlu.model  - Starting to train component DIETClassifier
Traceback (most recent call last):
  File "/mnt/c/wit.ai to Rasa/venv/bin/rasa", line 10, in <module>
    sys.exit(main())
  File "/mnt/c/wit.ai to Rasa/venv/lib/python3.8/site-packages/rasa/__main__.py", line 116, in main
    cmdline_arguments.func(cmdline_arguments)
  File "/mnt/c/wit.ai to Rasa/venv/lib/python3.8/site-packages/rasa/cli/train.py", line 195, in train_nlu
    return train_nlu(
  File "/mnt/c/wit.ai to Rasa/venv/lib/python3.8/site-packages/rasa/train.py", line 700, in train_nlu
    return rasa.utils.common.run_in_loop(
  File "/mnt/c/wit.ai to Rasa/venv/lib/python3.8/site-packages/rasa/utils/common.py", line 308, in run_in_loop
    result = loop.run_until_complete(f)
  File "uvloop/loop.pyx", line 1456, in uvloop.loop.Loop.run_until_complete
  File "/mnt/c/wit.ai to Rasa/venv/lib/python3.8/site-packages/rasa/train.py", line 749, in _train_nlu_async
    return await _train_nlu_with_validated_data(
  File "/mnt/c/wit.ai to Rasa/venv/lib/python3.8/site-packages/rasa/train.py", line 811, in _train_nlu_with_validated_data
    await rasa.nlu.train(
  File "/mnt/c/wit.ai to Rasa/venv/lib/python3.8/site-packages/rasa/nlu/train.py", line 116, in train
    interpreter = trainer.train(training_data, **kwargs)
  File "/mnt/c/wit.ai to Rasa/venv/lib/python3.8/site-packages/rasa/nlu/model.py", line 209, in train
    updates = component.train(working_data, self.config, **context)
  File "/mnt/c/wit.ai to Rasa/venv/lib/python3.8/site-packages/rasa/nlu/classifiers/diet_classifier.py", line 803, in train
    self.check_correct_entity_annotations(training_data)
  File "/mnt/c/wit.ai to Rasa/venv/lib/python3.8/site-packages/rasa/nlu/extractors/extractor.py", line 418, in check_correct_entity_annotations
    entities_repr = [
  File "/mnt/c/wit.ai to Rasa/venv/lib/python3.8/site-packages/rasa/nlu/extractors/extractor.py", line 422, in <listcomp>
    entity[ENTITY_ATTRIBUTE_VALUE],
KeyError: 'value'

Command or request that led to error:

rasa train nlu

Content of configuration file (config.yml) (if relevant): N/A

Content of domain file (domain.yml) (if relevant): N/A
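
The last frames of the traceback suggest how this fails: the check builds a tuple per entity and indexes the 'value' key directly. Below is a minimal sketch of that pattern over plain dictionaries. It is not Rasa's code: the helper names `entity_tuples` and `missing_key` and the sample entity (with its text stored under `body`) are assumptions made for illustration.

```python
# Illustrative only: mimics the indexing pattern visible in the traceback,
# not Rasa's actual implementation.
from typing import Any, Dict, List, Tuple

# Assumed shape of an entity that was not fully converted: the matched text
# is still stored under "body" instead of "value".
unconverted_entity: Dict[str, Any] = {
    "entity": "location",
    "start": 11,
    "end": 17,
    "body": "Berlin",
}


def entity_tuples(entities: List[Dict[str, Any]]) -> List[Tuple[int, int, str, str]]:
    """Build (start, end, value, entity) tuples, indexing each key directly."""
    return [(e["start"], e["end"], e["value"], e["entity"]) for e in entities]


def missing_key(entities: List[Dict[str, Any]]) -> str:
    """Return the name of the first missing key, or "" if none is missing."""
    try:
        entity_tuples(entities)
    except KeyError as err:
        return str(err.args[0])
    return ""


print(missing_key([unconverted_entity]))
```

With the sample entity above, the missing key reported is the same one named in the traceback.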

@sciguy14 sciguy14 added area:rasa-oss 🎡 Anything related to the open source Rasa framework type:bug 🐛 Inconsistencies or issues which will cause an issue or problem for users or implementors. labels Jan 4, 2021
@sara-tagger
Collaborator

Thanks for raising this issue, @degiz will get back to you about it soon✨

Please also check out the docs and the forum in case your issue was raised there too 🤗

@degiz
Contributor

degiz commented Jan 22, 2021

cc @RasaHQ/enable-squad

@wochinge
Contributor

@degiz Can you please add it to our inbox in the future?

@sciguy14 Seems there is some issue with converting the entities correctly. Do you want to have a go solving it (I'd be happy to support you with the PR) or should we handle it?

@joejuzl joejuzl added the area:rasa-oss/training-data Issues focused around Rasa training data (stories, NLU, domain, etc.) label Jan 28, 2021
@stale

stale bot commented Jul 21, 2021

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@stale stale bot added the stale label Jul 21, 2021
@wochinge wochinge removed the stale label Jul 26, 2021
@roots-ai

roots-ai commented Sep 18, 2021

@degiz Can you please add it to our inbox in the future?

@sciguy14 Seems there is some issue with converting the entities correctly. Do you want to have a go solving it (I'd be happy to support you with the PR) or should we handle it?

Facing the same issue. Please help. @sara-tagger @wochinge

@wochinge
Contributor

@roots-ai Would you mind creating a PR with the necessary fixes? Unfortunately the team is currently very busy with 3.0 so I don't think we can tackle this one soon ourselves

@roots-ai

@roots-ai Would you mind creating a PR with the necessary fixes? Unfortunately the team is currently very busy with 3.0 so I don't think we can tackle this one soon ourselves
@wochinge
Can you please guide me? Do you have any ideas or suggestions about the possible causes, and how I could go about finding a fix?

@roots-ai

@wochinge ?

@wochinge
Contributor

It seems we are not setting ENTITY_ATTRIBUTE_VALUE here. We'd probably just need to add the line e[ENTITY_ATTRIBUTE_ROLE] = e.pop("body")
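
A rough sketch of the kind of renaming suggested above, written against plain dictionaries rather than Rasa's reader. The traceback reports the missing key as 'value', so this sketch assumes the popped `body` text belongs under `value` rather than under the role key. The helper `convert_wit_entity` and the sample entity are hypothetical, and this is not the change that was eventually merged.

```python
from typing import Any, Dict

# Local stand-in for the constant named in the traceback; the KeyError shows
# that the key being looked up is "value".
ENTITY_ATTRIBUTE_VALUE = "value"


def convert_wit_entity(wit_entity: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of a wit.ai-style entity with "body" renamed to "value".

    Hypothetical helper: assumes the matched text is stored under "body".
    """
    converted = dict(wit_entity)  # copy so the input is left untouched
    if "body" in converted and ENTITY_ATTRIBUTE_VALUE not in converted:
        converted[ENTITY_ATTRIBUTE_VALUE] = converted.pop("body")
    return converted


sample = {"entity": "location", "start": 11, "end": 17, "body": "Berlin"}
print(convert_wit_entity(sample))
```

Copying the dictionary before renaming keeps the original export data unchanged, which makes it easier to compare the entity before and after conversion while debugging.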

@roots-ai

roots-ai commented Oct 7, 2021

@roots-ai Would you mind creating a PR with the necessary fixes? Unfortunately the team is currently very busy with 3.0 so I don't think we can tackle this one soon ourselves

Is there any expected timeline for when the team will be able to fix this?

I installed Rasa from source and made the above change, and now it gives this new error:

/root/miniconda3/lib/python3.8/site-packages/rasa/shared/utils/io.py:97: UserWarning: Entity role 'vent_adjective' has only 1 training examples! The minimum is 2, because of this the training may fail.
/root/miniconda3/lib/python3.8/site-packages/rasa/shared/utils/io.py:97: UserWarning: Entity entity 'which_person' has only 1 training examples! The minimum is 2, because of this the training may fail.
/root/miniconda3/lib/python3.8/site-packages/rasa/shared/utils/io.py:97: UserWarning: Entity role 'which_person' has only 1 training examples! The minimum is 2, because of this the training may fail.
2021-10-07 01:53:03 INFO rasa.nlu.model - Starting to train component WhitespaceTokenizer
2021-10-07 01:53:03 INFO rasa.nlu.model - Finished training component.
2021-10-07 01:53:03 INFO rasa.nlu.model - Starting to train component RegexFeaturizer
2021-10-07 01:53:03 INFO rasa.nlu.model - Finished training component.
2021-10-07 01:53:03 INFO rasa.nlu.model - Starting to train component LexicalSyntacticFeaturizer
2021-10-07 01:53:03 INFO rasa.nlu.model - Finished training component.
2021-10-07 01:53:03 INFO rasa.nlu.model - Starting to train component CountVectorsFeaturizer
2021-10-07 01:53:03 INFO rasa.nlu.featurizers.sparse_featurizer.count_vectors_featurizer - 2014 vocabulary items were created for text attribute.
2021-10-07 01:53:04 INFO rasa.nlu.model - Finished training component.
2021-10-07 01:53:04 INFO rasa.nlu.model - Starting to train component CountVectorsFeaturizer
2021-10-07 01:53:04 INFO rasa.nlu.featurizers.sparse_featurizer.count_vectors_featurizer - 8363 vocabulary items were created for text attribute.
2021-10-07 01:53:05 INFO rasa.nlu.model - Finished training component.
2021-10-07 01:53:05 INFO rasa.nlu.model - Starting to train component DIETClassifier
/root/miniconda3/lib/python3.8/site-packages/rasa/utils/tensorflow/model_data_utils.py:395: VisibleDeprecationWarning: Creating an ndarray from ragged nested sequences (which is a list-or-tuple of lists-or-tuples-or ndarrays with different lengths or shapes) is deprecated. If you meant to do this, you must specify 'dtype=object' when creating the ndarray
  np.array([v[0] for v in values]), number_of_dimensions=3
Traceback (most recent call last):
  File "/root/miniconda3/bin/rasa", line 8, in <module>
    sys.exit(main())
  File "/root/miniconda3/lib/python3.8/site-packages/rasa/__main__.py", line 117, in main
    cmdline_arguments.func(cmdline_arguments)
  File "/root/miniconda3/lib/python3.8/site-packages/rasa/cli/train.py", line 196, in run_nlu_training
    return train_nlu(
  File "/root/miniconda3/lib/python3.8/site-packages/rasa/model_training.py", line 646, in train_nlu
    return rasa.utils.common.run_in_loop(
  File "/root/miniconda3/lib/python3.8/site-packages/rasa/utils/common.py", line 296, in run_in_loop
    result = loop.run_until_complete(f)
  File "uvloop/loop.pyx", line 1456, in uvloop.loop.Loop.run_until_complete
  File "/root/miniconda3/lib/python3.8/site-packages/rasa/model_training.py", line 696, in train_nlu_async
    return await _train_nlu_with_validated_data(
  File "/root/miniconda3/lib/python3.8/site-packages/rasa/model_training.py", line 758, in _train_nlu_with_validated_data
    await rasa.nlu.train.train(
  File "/root/miniconda3/lib/python3.8/site-packages/rasa/nlu/train.py", line 111, in train
    interpreter = trainer.train(training_data, **kwargs)
  File "/root/miniconda3/lib/python3.8/site-packages/rasa/nlu/model.py", line 221, in train
    component.train(working_data, self.config, **context)
  File "/root/miniconda3/lib/python3.8/site-packages/rasa/nlu/classifiers/diet_classifier.py", line 846, in train
    self.check_correct_entity_annotations(training_data)
  File "/root/miniconda3/lib/python3.8/site-packages/rasa/nlu/extractors/extractor.py", line 459, in check_correct_entity_annotations
    entities_repr = [
  File "/root/miniconda3/lib/python3.8/site-packages/rasa/nlu/extractors/extractor.py", line 463, in <listcomp>
    entity[ENTITY_ATTRIBUTE_VALUE],
KeyError: 'value'

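Because the KeyError is again for 'value', one possible way to narrow it down is to check which entities reach training without that key. The sketch below works on plain dictionaries. The function name `find_entities_missing_value` and the example layout (a `text` field plus an `entities` list) are assumptions for illustration, not part of Rasa's API.

```python
from typing import Any, Dict, Iterable, List, Tuple


def find_entities_missing_value(
    examples: Iterable[Dict[str, Any]],
) -> List[Tuple[str, Dict[str, Any]]]:
    """Return (example text, entity) pairs for entities without a "value" key."""
    problems = []
    for example in examples:
        # Treat a missing or null "entities" field as an empty list.
        for entity in example.get("entities") or []:
            if "value" not in entity:
                problems.append((example.get("text", ""), entity))
    return problems


# Hypothetical examples: the first entity is converted, the second is not.
examples = [
    {
        "text": "play Yesterday",
        "entities": [{"entity": "song", "start": 5, "end": 14, "value": "Yesterday"}],
    },
    {
        "text": "weather in Berlin",
        "entities": [{"entity": "location", "start": 11, "end": 17, "body": "Berlin"}],
    },
    {"text": "good night", "entities": None},
]

for text, entity in find_entities_missing_value(examples):
    print(f"{text!r}: entity keys = {sorted(entity)}")
```

If a scan like this still reports entities after a patch, the conversion step is probably not being applied to the data that training actually loads.
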
@wochinge
Contributor

wochinge commented Oct 7, 2021

@roots-ai Sorry, as I said in my previous comment it's currently very busy and I can't provide a reliable timeline

@roots-ai

roots-ai commented Oct 7, 2021

@wochinge Would be good to put this as a disclaimer on the migration from Wit.ai page so that others don't end up wasting their time with RASA.

@wochinge
Contributor

wochinge commented Oct 13, 2021

Since you've asked so nicely I've created a PR with the necessary changes: #9864

@wochinge
Contributor

Cutting a release today/tomorrow for this

6 participants