Add matter_full_text_uri property to Matter definition and impl #162

Open
evamaxfield opened this issue Feb 9, 2022 · 7 comments
Labels
enhancement (New feature or request), help wanted (Extra attention is needed)

Comments

@evamaxfield
Member

While looking through the frontend and design work on the legislation tracking project, I realized that we may be missing a crucial piece of information: a link to the actual full matter text.

Matter is currently defined here: link. But a matter definitely has full text, and we should store a link to that full text. I propose full_text_uri or some variant of that.

Additionally, while we look into this, it would be good to investigate which MatterFiles make it through the pipeline: https://github.com/CouncilDataProject/cdp-backend/blob/main/cdp_backend/pipeline/event_gather_pipeline.py#L1444

I think the above try-except block may be dropping some MatterFile / MinutesItemFile attachments that would be useful to keep and so we may want to try to fix it if we do see that behavior.
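
One way to check whether attachments are being dropped is to record each failure instead of discarding it silently. The following is a minimal sketch, not the pipeline's actual code: `FileRef`, `filter_files`, and the injectable `exists` check are hypothetical names used only for illustration.

```python
import logging
from dataclasses import dataclass
from typing import Callable, List, Tuple

log = logging.getLogger(__name__)


@dataclass
class FileRef:
    # Hypothetical stand-in for a MatterFile / MinutesItemFile reference.
    name: str
    uri: str


def filter_files(
    files: List[FileRef],
    exists: Callable[[str], bool],
) -> Tuple[List[FileRef], List[Tuple[FileRef, str]]]:
    """Split files into kept and dropped, recording why each was dropped."""
    kept: List[FileRef] = []
    dropped: List[Tuple[FileRef, str]] = []
    for f in files:
        try:
            if exists(f.uri):
                kept.append(f)
            else:
                dropped.append((f, "resource not found"))
        except Exception as e:  # broad on purpose: the goal is to see every failure
            dropped.append((f, f"{type(e).__name__}: {e}"))
            log.warning("Dropping %s (%s): %s", f.name, f.uri, e)
    return kept, dropped
```

Running something like this over a few real events and inspecting the `dropped` list would show whether useful attachments are being lost and for what reason.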

@evamaxfield added the enhancement and help wanted labels Feb 9, 2022
@dphoria
Contributor

dphoria commented Feb 9, 2022

What would be an appropriate full_text_uri for the matter in this example? (I'm presuming the corresponding ingestion_models.Matter must change as well.)
https://seattle.legistar.com/MeetingDetail.aspx?ID=930274&GUID=903D2508-9840-4878-8334-1AEF77335BB8
https://gist.github.com/dphoria/3134769fe44686a82fdca2a55b822397

I will take a look later myself. Just wanted to start the question / conversation.

@evamaxfield
Member Author

Great question! Yes the ingestion model would need to be updated as well to add the same property / attribute.

Taking this meeting from legistar: https://seattle.legistar.com/MeetingDetail.aspx?ID=929921&GUID=3EB77948-2243-425A-9864-8CD868B96048&Options=&Search=

And selecting the first council bill (CB 120263), we get to: https://seattle.legistar.com/LegislationDetail.aspx?ID=5448143&GUID=4F8010D6-BEBB-46AF-BE22-F579AD681B68&Options=&Search=

I think what we really want is just a link to that page (the link above), since it has the full details. But if we wanted to get even more specific, I would say clicking "Reports" and then clicking "Legislation Text" (or really any of the options) gives us more of a "document view" like this: https://seattle.legistar.com/ViewReport.ashx?M=R&N=Text&GID=393&ID=4717976&GUID=660120D3-9C6F-4314-AFC7-A44217E71237&Title=Legislation+Text
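
As a rough sketch of what the proposed attribute could look like on the ingestion side: the `Matter` class below is a simplified, hypothetical stand-in, and the real `ingestion_models.Matter` has other fields and may be defined differently. Only the optional `full_text_uri` attribute reflects the proposal in this issue.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Matter:
    # Simplified, hypothetical stand-in for ingestion_models.Matter;
    # the real model's fields may differ.
    name: str
    # Proposed addition: link to the full text of the matter.
    # Optional so that scrapers that cannot find it still validate.
    full_text_uri: Optional[str] = None


# Example using the Legistar detail page linked in this comment.
cb_120263 = Matter(
    name="CB 120263",
    full_text_uri=(
        "https://seattle.legistar.com/LegislationDetail.aspx"
        "?ID=5448143&GUID=4F8010D6-BEBB-46AF-BE22-F579AD681B68&Options=&Search="
    ),
)
```

Keeping the attribute optional means existing scrapers would keep working without changes, which seems like the safer default.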

@evamaxfield
Member Author

This is really a bigger deal because currently we don't store that info in CDP at all. Here is the corresponding page for that meeting on Seattle staging: http://councildataproject.org/seattle-staging/#/events/f3351cc9822f

Notice that the minutes item CB 120263 doesn't have any attachments / documents.

@isaacna
Collaborator

isaacna commented Feb 10, 2022

Do we need a separate field for the full text, or could it just be another MatterFile? If we want to handle the full text differently in the UI than other MatterFiles, then I'm all for adding full_text_uri, but otherwise I think it could be another MatterFile.

> I think the above try-except block may be dropping some MatterFile / MinutesItemFile attachments that would be useful to keep and so we may want to try to fix it if we do see that behavior.

For this, it's most likely failing due to a connection timeout or an error when making an HTTP request. Since the only validation run on MatterFile is resource_exists, I think it has to be one of these two.
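
To tell those two causes apart, the caught exception could be labeled before the file is dropped. This is a minimal sketch that assumes the check surfaces standard-library `urllib` exceptions; whatever `resource_exists` actually raises may be different, and `classify_failure` is a hypothetical helper, not part of cdp-backend.

```python
import socket
import urllib.error


def classify_failure(exc: Exception) -> str:
    """Label an exception from a URL check as timeout, HTTP error, or other."""
    # HTTPError subclasses URLError, so it must be checked first.
    if isinstance(exc, urllib.error.HTTPError):
        return f"http error {exc.code}"
    if isinstance(exc, (socket.timeout, TimeoutError)):
        return "timeout"
    if isinstance(exc, urllib.error.URLError):
        # Timeouts during connect are often wrapped in URLError.reason.
        if isinstance(exc.reason, (socket.timeout, TimeoutError)):
            return "timeout"
        return "connection error"
    return "other"
```

Counting these labels across a pipeline run would confirm or rule out the timeout explanation.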

@evamaxfield
Member Author

> Do we need a separate field for the full text, or could it just be another MatterFile? If we want to handle the full text differently in the UI than other MatterFiles, then I'm all for adding full_text_uri, but otherwise I think it could be another MatterFile.

I guess we could add this as a MatterFile, but then I feel like we would need to add a type attribute or something similar: something to signify what each MatterFile represents (e.g. a report, an amendment, or the bill text).
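
For comparison, here is a sketch of what that alternative might look like. Everything in it is hypothetical: the `MatterFileType` categories are taken only from the examples in this comment, and the `MatterFile` class is a simplified stand-in rather than the real model.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List, Optional


class MatterFileType(Enum):
    # Hypothetical categories, taken from the examples in this comment.
    REPORT = "report"
    AMENDMENT = "amendment"
    BILL_TEXT = "bill_text"


@dataclass
class MatterFile:
    # Simplified, hypothetical stand-in for the real MatterFile model.
    name: str
    uri: str
    type: Optional[MatterFileType] = None


def find_full_text(files: List[MatterFile]) -> Optional[MatterFile]:
    """Return the first file tagged as the bill text, if any."""
    return next((f for f in files if f.type is MatterFileType.BILL_TEXT), None)
```

Note that finding the full text this way requires loading and scanning a matter's files, whereas an attribute on Matter itself would be available directly.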

@isaacna
Collaborator

isaacna commented Feb 18, 2022

> I guess we could add this as a MatterFile, but then I feel like we would need to add a type attribute or something similar: something to signify what each MatterFile represents (e.g. a report, an amendment, or the bill text).

Since the full text URI is fairly distinct from other MatterFiles, I think we could just add full_text_uri to Matter (this also saves us a query if we want to fetch it for a specific Matter).

Unless there are very discrete categories that we could classify MatterFile into, I don't think we need MatterFile.type; name would be sufficient.

@evamaxfield
Member Author

Yea the benefit to query time is also a major plus.


3 participants