Add matter_full_text_uri property to Matter definition and impl #162
Comments
What would be appropriate for matter? I will take a look later myself. Just wanted to start the question / conversation.
Great question! Yes, the ingestion model would need to be updated as well to add the same property / attribute. Taking this meeting from Legistar: https://seattle.legistar.com/MeetingDetail.aspx?ID=929921&GUID=3EB77948-2243-425A-9864-8CD868B96048&Options=&Search= and selecting the first council bill (CB 120263), we get to: https://seattle.legistar.com/LegislationDetail.aspx?ID=5448143&GUID=4F8010D6-BEBB-46AF-BE22-F579AD681B68&Options=&Search= I think what we want is really just a link to that page, since it has the full details. But if we wanted to get even more specific, I would say clicking "Reports" and then clicking "Legislation Text" (or really any of the options) gives us more of a "document view" like this: https://seattle.legistar.com/ViewReport.ashx?M=R&N=Text&GID=393&ID=4717976&GUID=660120D3-9C6F-4314-AFC7-A44217E71237&Title=Legislation+Text
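To make the ingestion-model side of this concrete, here is a minimal sketch of what an optional full-text link on a matter could look like. It is only an illustration: `MatterSketch` and `has_usable_full_text_uri` are hypothetical names, the fields shown are assumptions, and none of this is the actual cdp-backend ingestion model or database model.

```python
from dataclasses import dataclass
from typing import Optional
from urllib.parse import urlparse


@dataclass
class MatterSketch:
    """Hypothetical stand-in for an ingestion-model matter (not the cdp-backend class)."""

    name: str
    title: str
    # Proposed addition. Optional so scrapers that cannot find the link still work.
    full_text_uri: Optional[str] = None


def has_usable_full_text_uri(matter: MatterSketch) -> bool:
    """Return True when full_text_uri is set and looks like an http(s) URL."""
    if matter.full_text_uri is None:
        return False
    parsed = urlparse(matter.full_text_uri)
    return parsed.scheme in ("http", "https") and bool(parsed.netloc)


# Example using the Legistar legislation detail page mentioned above.
matter = MatterSketch(
    name="CB 120263",
    title="(title omitted)",
    full_text_uri=(
        "https://seattle.legistar.com/LegislationDetail.aspx"
        "?ID=5448143&GUID=4F8010D6-BEBB-46AF-BE22-F579AD681B68"
    ),
)
print(has_usable_full_text_uri(matter))
```

Keeping the field optional means existing scrapers would not need to change right away, at the cost of the frontend having to handle a missing link.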
This is really a bigger deal because, like... currently we don't even store that info in CDP at all. Here is the corresponding meeting page for that meeting on Seattle staging: http://councildataproject.org/seattle-staging/#/events/f3351cc9822f Notice that the minutes item CB 120263 doesn't have any attachments / documents.
Do we need a separate field for the full text, or could it just be another
For this it's most likely failing due to a connection timeout or an error when making an HTTP request. Since the only validation run on
I guess we could add this as a
Since the full text URI is kinda distinct from other Unless there are very discrete categories that we could classify
Yeah, the benefit to query time is also a major plus.
While looking through the frontend and design work on the legislation tracking project, I realized that we may be missing a crucial piece of information: a link to the actual full matter text.
Currently, Matter is defined as: link. But a matter definitely has full text, and we should store a link to that full text. I propose full_text_uri or some variant of that.
Additionally, while we look into this, it would be good to investigate which MatterFiles make it through the pipeline: https://github.com/CouncilDataProject/cdp-backend/blob/main/cdp_backend/pipeline/event_gather_pipeline.py#L1444
I think the above try-except block may be dropping some MatterFile / MinutesItemFile attachments that would be useful to keep, so we may want to fix it if we do see that behavior.
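One way to investigate which attachments get dropped is to record the reason for each failure instead of silently skipping it. The sketch below is a hedged illustration of that idea, not the code at the linked line: `keep_valid_files` and the `validate` callable are hypothetical, and the real pipeline's validation may behave differently.

```python
import logging
from typing import Callable, Iterable, List, Tuple

log = logging.getLogger(__name__)


def keep_valid_files(
    uris: Iterable[str],
    validate: Callable[[str], bool],
) -> Tuple[List[str], List[Tuple[str, str]]]:
    """Split URIs into kept ones and (uri, reason) pairs for dropped ones.

    `validate` is a hypothetical check that may return False or raise.
    """
    kept: List[str] = []
    dropped: List[Tuple[str, str]] = []
    for uri in uris:
        try:
            if validate(uri):
                kept.append(uri)
            else:
                dropped.append((uri, "validation returned False"))
        except Exception as e:  # e.g. a connection timeout during the request
            dropped.append((uri, f"{type(e).__name__}: {e}"))

    # Surface every dropped attachment so the behavior can be audited later.
    for uri, reason in dropped:
        log.warning("Dropping attachment %s (%s)", uri, reason)
    return kept, dropped
```

Returning the dropped list alongside the kept one makes it easy to count failures per event and to tell timeouts apart from genuinely invalid links.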