H-2662: Add Scraping glossary page and update Flows > Filters guide page #4431

Merged: 17 commits, May 20, 2024

@vilkinsons (Member) commented May 6, 2024

🌟 What is the purpose of this PR?

  • Adds a glossary page on scraping (web scraping)
  • Removes redundant, old topmatter from various other glossary pages
  • Updates the Guide > Flows > Filters page

@vilkinsons self-assigned this May 6, 2024
@github-actions bot added the `area/apps > hash.ai` (Affects the `hash.ai` informational site (app)) and `area/apps` labels May 6, 2024
@vilkinsons marked this pull request as ready for review May 17, 2024 17:12
@github-actions bot added the `area/infra` (Relates to version control, CI, CD or IaC (area)) label May 20, 2024
Co-authored-by: Ciaran Morinan <37743469+CiaranMn@users.noreply.github.com>
@hashdotai previously approved these changes May 20, 2024
@vilkinsons enabled auto-merge May 20, 2024 14:27
@vilkinsons added this pull request to the merge queue May 20, 2024
@github-merge-queue bot removed this pull request from the merge queue due to failed status checks May 20, 2024
@vilkinsons enabled auto-merge May 20, 2024 21:14

Benchmark results

@rust/graph-benches – Integrations

scaling_read_entity_complete_one_depth

| Function | Value | Mean |
| --- | --- | --- |
| entity_by_id | 50 entities | $$1.52 \mathrm{s} \pm 3.10 \mathrm{ms}\left({\color{red}493 \mathrm{\%}}\right)$$ |
| entity_by_id | 10 entities | $$50.4 \mathrm{ms} \pm 163 \mathrm{μs}\left({\color{red}63.2 \mathrm{\%}}\right)$$ |
| entity_by_id | 1 entities | $$20.9 \mathrm{ms} \pm 80.6 \mathrm{μs}\left({\color{gray}-0.508 \mathrm{\%}}\right)$$ |
| entity_by_id | 25 entities | $$69.9 \mathrm{ms} \pm 285 \mathrm{μs}\left({\color{gray}-0.012 \mathrm{\%}}\right)$$ |
| entity_by_id | 5 entities | $$25.1 \mathrm{ms} \pm 246 \mathrm{μs}\left({\color{gray}1.59 \mathrm{\%}}\right)$$ |

representative_read_entity

| Function | Value | Mean |
| --- | --- | --- |
| entity_by_id | entity type ID: https://blockprotocol.org/@alice/types/entity-type/building/v/1 | $$16.7 \mathrm{ms} \pm 199 \mathrm{μs}\left({\color{gray}-0.268 \mathrm{\%}}\right)$$ |
| entity_by_id | entity type ID: https://blockprotocol.org/@alice/types/entity-type/playlist/v/1 | $$17.3 \mathrm{ms} \pm 204 \mathrm{μs}\left({\color{red}6.35 \mathrm{\%}}\right)$$ |
| entity_by_id | entity type ID: https://blockprotocol.org/@alice/types/entity-type/book/v/1 | $$16.8 \mathrm{ms} \pm 180 \mathrm{μs}\left({\color{gray}-0.558 \mathrm{\%}}\right)$$ |
| entity_by_id | entity type ID: https://blockprotocol.org/@alice/types/entity-type/song/v/1 | $$16.5 \mathrm{ms} \pm 166 \mathrm{μs}\left({\color{gray}0.069 \mathrm{\%}}\right)$$ |
| entity_by_id | entity type ID: https://blockprotocol.org/@alice/types/entity-type/organization/v/1 | $$17.4 \mathrm{ms} \pm 215 \mathrm{μs}\left({\color{gray}3.64 \mathrm{\%}}\right)$$ |
| entity_by_id | entity type ID: https://blockprotocol.org/@alice/types/entity-type/person/v/1 | $$16.7 \mathrm{ms} \pm 196 \mathrm{μs}\left({\color{gray}2.66 \mathrm{\%}}\right)$$ |
| entity_by_id | entity type ID: https://blockprotocol.org/@alice/types/entity-type/uk-address/v/1 | $$16.2 \mathrm{ms} \pm 191 \mathrm{μs}\left({\color{gray}2.32 \mathrm{\%}}\right)$$ |
| entity_by_id | entity type ID: https://blockprotocol.org/@alice/types/entity-type/block/v/1 | $$16.1 \mathrm{ms} \pm 179 \mathrm{μs}\left({\color{red}7.08 \mathrm{\%}}\right)$$ |
| entity_by_id | entity type ID: https://blockprotocol.org/@alice/types/entity-type/page/v/2 | $$17.0 \mathrm{ms} \pm 237 \mathrm{μs}\left({\color{gray}4.26 \mathrm{\%}}\right)$$ |

representative_read_multiple_entities

| Function | Value | Mean |
| --- | --- | --- |
| link_by_source_by_property | depths: DT=255, PT=255, ET=255, E=255 | $$2.00 \mathrm{s} \pm 7.47 \mathrm{ms}\left({\color{gray}0.204 \mathrm{\%}}\right)$$ |
| link_by_source_by_property | depths: DT=2, PT=2, ET=2, E=2 | $$1.06 \mathrm{s} \pm 3.34 \mathrm{ms}\left({\color{gray}0.020 \mathrm{\%}}\right)$$ |
| link_by_source_by_property | depths: DT=0, PT=2, ET=2, E=2 | $$1.04 \mathrm{s} \pm 6.53 \mathrm{ms}\left({\color{gray}-0.665 \mathrm{\%}}\right)$$ |
| link_by_source_by_property | depths: DT=0, PT=0, ET=0, E=2 | $$95.5 \mathrm{ms} \pm 621 \mathrm{μs}\left({\color{gray}-0.900 \mathrm{\%}}\right)$$ |
| link_by_source_by_property | depths: DT=0, PT=0, ET=2, E=2 | $$417 \mathrm{ms} \pm 1.51 \mathrm{ms}\left({\color{gray}-0.510 \mathrm{\%}}\right)$$ |
| link_by_source_by_property | depths: DT=0, PT=0, ET=0, E=0 | $$59.7 \mathrm{ms} \pm 273 \mathrm{μs}\left({\color{gray}-2.009 \mathrm{\%}}\right)$$ |
| entity_by_property | depths: DT=255, PT=255, ET=255, E=255 | $$2.86 \mathrm{s} \pm 11.4 \mathrm{ms}\left({\color{gray}0.353 \mathrm{\%}}\right)$$ |
| entity_by_property | depths: DT=2, PT=2, ET=2, E=2 | $$994 \mathrm{ms} \pm 6.74 \mathrm{ms}\left({\color{gray}2.06 \mathrm{\%}}\right)$$ |
| entity_by_property | depths: DT=0, PT=2, ET=2, E=2 | $$966 \mathrm{ms} \pm 2.55 \mathrm{ms}\left({\color{gray}-2.699 \mathrm{\%}}\right)$$ |
| entity_by_property | depths: DT=0, PT=0, ET=0, E=2 | $$40.2 \mathrm{ms} \pm 316 \mathrm{μs}\left({\color{gray}0.377 \mathrm{\%}}\right)$$ |
| entity_by_property | depths: DT=0, PT=0, ET=2, E=2 | $$356 \mathrm{ms} \pm 1.64 \mathrm{ms}\left({\color{gray}-1.958 \mathrm{\%}}\right)$$ |
| entity_by_property | depths: DT=0, PT=0, ET=0, E=0 | $$36.0 \mathrm{ms} \pm 192 \mathrm{μs}\left({\color{gray}-0.649 \mathrm{\%}}\right)$$ |

representative_read_entity_type

| Function | Value | Mean |
| --- | --- | --- |
| get_entity_type_by_id | Account ID: d4e16033-c281-4cde-aa35-9085bf2e7579 | $$1.37 \mathrm{ms} \pm 9.15 \mathrm{μs}\left({\color{gray}-0.319 \mathrm{\%}}\right)$$ |

scaling_read_entity_linkless

| Function | Value | Mean |
| --- | --- | --- |
| entity_by_id | 1000 entities | $$3.26 \mathrm{ms} \pm 15.0 \mathrm{μs}\left({\color{gray}0.173 \mathrm{\%}}\right)$$ |
| entity_by_id | 100 entities | $$2.52 \mathrm{ms} \pm 9.93 \mathrm{μs}\left({\color{gray}-1.884 \mathrm{\%}}\right)$$ |
| entity_by_id | 10 entities | $$2.41 \mathrm{ms} \pm 11.0 \mathrm{μs}\left({\color{gray}-0.500 \mathrm{\%}}\right)$$ |
| entity_by_id | 10000 entities | $$13.1 \mathrm{ms} \pm 137 \mathrm{μs}\left({\color{gray}-1.421 \mathrm{\%}}\right)$$ |
| entity_by_id | 1 entities | $$2.43 \mathrm{ms} \pm 13.0 \mathrm{μs}\left({\color{gray}-0.227 \mathrm{\%}}\right)$$ |

scaling_read_entity_complete_zero_depth

| Function | Value | Mean |
| --- | --- | --- |
| entity_by_id | 50 entities | $$4.47 \mathrm{ms} \pm 20.6 \mathrm{μs}\left({\color{gray}0.012 \mathrm{\%}}\right)$$ |
| entity_by_id | 10 entities | $$2.70 \mathrm{ms} \pm 21.2 \mathrm{μs}\left({\color{gray}-0.269 \mathrm{\%}}\right)$$ |
| entity_by_id | 1 entities | $$2.47 \mathrm{ms} \pm 16.5 \mathrm{μs}\left({\color{gray}-0.439 \mathrm{\%}}\right)$$ |
| entity_by_id | 25 entities | $$3.29 \mathrm{ms} \pm 89.1 \mathrm{μs}\left({\color{gray}4.01 \mathrm{\%}}\right)$$ |
| entity_by_id | 5 entities | $$2.50 \mathrm{ms} \pm 12.5 \mathrm{μs}\left({\color{gray}-0.516 \mathrm{\%}}\right)$$ |

Labels

  • area/apps > hash.ai: Affects the `hash.ai` informational site (app)
  • area/apps
  • area/infra: Relates to version control, CI, CD or IaC (area)
3 participants