Solr reindex Jun 2019 #9

Open
8 of 10 tasks
cdrini opened this issue Jun 9, 2019 · 0 comments

cdrini commented Jun 9, 2019

  • Step 1: Create a local postgres copy of the database
    • 1a: Create the postgres instance
    • 1b: Populate the postgres instance
  • Step 2: Populate solr
    • 2a: Setup
    • 2b: Insert works & orphaned editions
    • 2c: Insert authors
    • 2d: Insert subjects
  • Step 3: Final Sync
    • 3a: Run solrupdater
| Step | Time taken |
| --- | --- |
| Step 1: Create a local postgres copy of the database | 03:30 |
| -- 1a: Create the postgres instance | 00:01 |
| -- 1b: Populate the postgres instance | 04:27 |
| ---- Downloading the dump | 00:03 |
| ---- Counting Rows | 00:06 |
| ---- Sleeping between chunks | 00:18 |
| ---- Actual import | 02:05 |
| ---- Creating indices | 01:00 |
| Step 2: Populate solr | |
| -- 2a: Setup | 00:05 |
| -- 2b: Insert works & orphaned editions | |
| ---- Offset startup | 00:05 |
| ---- Works/orphans import (6 cores) | 26:45 |
| -- 2c: Insert authors | 11:30 |
| ---- Offset startup | 00:05 |
| ---- Authors import (6 cores) | 11:15 |
| -- 2d: Insert subjects (in parallel w 2b) | 03:34 |
| Step 3: Final Sync | |
| -- 3a: Run solrupdater | |
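
For cross-checking the subtotals in the timing table, here is a minimal sketch that sums substep durations. It assumes the values are in `hh:mm` format, which the issue does not state. The helper names (`parse_duration`, `format_duration`) and the `substeps_1b` dictionary are hypothetical and only mirror the 1b rows above.

```python
def parse_duration(text):
    """Convert an 'hh:mm' string to a number of minutes (assumed format)."""
    hours, minutes = text.split(":")
    return int(hours) * 60 + int(minutes)


def format_duration(total_minutes):
    """Convert a number of minutes back to an 'hh:mm' string."""
    hours, minutes = divmod(total_minutes, 60)
    return f"{hours:02d}:{minutes:02d}"


# Substep timings copied from the 1b rows of the table above.
substeps_1b = {
    "Downloading the dump": "00:03",
    "Counting Rows": "00:06",
    "Sleeping between chunks": "00:18",
    "Actual import": "02:05",
    "Creating indices": "01:00",
}

total_1b = sum(parse_duration(value) for value in substeps_1b.values())
print(format_duration(total_1b))  # prints 03:32
```

The same helpers can be applied to the 2b and 2c substeps; each sum can then be compared against the figure recorded on the parent row.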

Numbers Validation (old solr dump from 8 May 2019)

Type # in postgres # in old solr # in new solr psql diff solr diff
Works 18081999
Orphans 3735145
Authors 6980217
Subjects 0
@cdrini cdrini added this to MVP in Solr Builder Jun 9, 2019