Reduce storage required for indexing - stop writing sp_name, res_type, and sp_updated to hfj_spidx_* tables #5941
…_type and sp_updated columns of HFJ_SPIDX tables nullable
Codecov Report
All modified and coverable lines are covered by tests.

Coverage diff (master vs. #5941):
             master    #5941     +/-
Coverage     83.39%    83.51%    +0.11%
Complexity   26927     27219     +292
Files        1681      1698      +17
Lines        103965    105204    +1239
Branches     13189     13308     +119
Hits         86702     87858     +1156
Misses       11613     11656     +43
Partials     5650      5690      +40
… RES_TYPE after update/load
…-required-for-indexing-tables # Conflicts: # hapi-fhir-jpaserver-base/src/main/java/ca/uhn/fhir/jpa/migrate/tasks/HapiFhirJpaMigrationTasks.java
…eters if IndexMissingFields and optimizeIndexStorage are both enabled
…rect configuration handling
… SP recovery, documentation updates
…-required-for-indexing-tables
…-required-for-indexing-tables
Approved pending various comments.
@@ -135,6 +135,104 @@ protected void init740() {
.toColumn("RES_ID")
.references("HFJ_RESOURCE", "RES_ID");
}

// Allow null values in SP_NAME, RES_TYPE columns for all HFJ_SPIDX_* tables. These are marked as failure
// allowed, since SQL Server won't let us change nullability on columns with indexes pointing to them.
So... what happens on SQL Server? They just can't use this feature?
I added this comment about SQL Server to cover the rare case where the SP_NAME and RES_TYPE columns are included in a custom (client) index.
ca.uhn.fhir.jpa.model.entity.StorageSettings#setIndexMissingFields
* <p>
* The following index may need to be added into the indexed tables such as <code>HFJ_SPIDX_TOKEN</code>
* to improve the search performance while <code>:missing</code> is enabled.
* <code>RES_TYPE, SP_NAME, SP_MISSING</code>
* </p>
*/
public void setIndexMissingFields(IndexEnabledEnum theIndexMissingFields) {
Validate.notNull(theIndexMissingFields, "theIndexMissingFields must not be null");
myIndexMissingFieldsEnabled = theIndexMissingFields;
}
Here is a javadoc where we recommend creating a RES_TYPE, SP_NAME, SP_MISSING index if IndexMissingFields is enabled. If this index was created on a SQL Server DB, the migration won't change the nullability of RES_TYPE and SP_NAME. And yes, in this case the new optimization feature won't work.
Since IndexMissingFields is disabled by default and won't be enabled in most cases (as it impacts write performance and increases DB size), there should be no issues with SQL Server DBs.
Could we just not run the migrations on SQL Server, instead of running them and having them fail? Look into onlyAppliesToPlatforms().
But the RES_TYPE, SP_NAME, SP_MISSING index will not have been created in 99% of cases, whereas skipping the migration would drop SQL Server support for this feature entirely. Do we need that?
}
conditions.add(BinaryCondition.equalTo(getMissingColumn(), generatePlaceholder(theMissing)));

ComboCondition condition = ComboCondition.and(conditions.toArray());
I would like Michael/Ken/James to review this section.
indexedSearchParamOptional.get().getParameterName());
}
}
}
Can you explain exactly what this code is doing? I mean this last method.
This logic recovers the res_type and sp_name values from the search parameter's hash_identity.
If the new optimization feature is enabled, res_type and sp_name are null when a ResourceIndexedSearchParam entity is loaded from the DB, so we reconstruct these values.
At first, this logic was added to allow ResourceIndexedSearchParam.getParamName() to be used in InMemoryResourceMatcher.java. Later, InMemoryResourceMatcher was tweaked to use hash_identity instead of ParamName if the new feature is enabled.
As of now, the recovered res_type and sp_name values are used mostly by tests. It would be possible to switch all tests over to hash_identity instead. However, I think we can leave this logic in place, as the res_type and sp_name values could be helpful during debugging/troubleshooting, and the recovery is a relatively cheap operation performance-wise.
Note: res_type and sp_name values are not recovered if StorageSettings#isIndexOnContainedResources or StorageSettings#isIndexOnContainedResourcesRecursively is enabled.
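The recovery described in this reply can be pictured with a minimal, self-contained sketch. It assumes a registry that maps each hash identity back to its (resource type, parameter name) pair. The class name HashIdentityRecoverySketch, its methods, and the CRC32-based hash are hypothetical stand-ins for illustration only; they are not the project's SearchParamHash or JpaSearchParamCache code.

```java
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;
import java.util.zip.CRC32;

// Hypothetical sketch, not HAPI FHIR code: recover (res_type, sp_name) from a stored hash.
final class HashIdentityRecoverySketch {

    /** A (resource type, search parameter name) pair, e.g. ("Patient", "family"). */
    record ParamKey(String resourceType, String paramName) {}

    // Registry of known search parameters, keyed by their hash identity.
    private final Map<Long, ParamKey> keysByHash = new HashMap<>();

    /** Stand-in hash (assumption); the real project uses its own hashing scheme. */
    static long hashIdentity(String resourceType, String paramName) {
        CRC32 crc = new CRC32();
        crc.update((resourceType + "|" + paramName).getBytes(StandardCharsets.UTF_8));
        return crc.getValue();
    }

    /** Registers a known search parameter so it can later be looked up by hash. */
    void register(String resourceType, String paramName) {
        keysByHash.put(hashIdentity(resourceType, paramName), new ParamKey(resourceType, paramName));
    }

    /** Recovers the pair for a row whose res_type and sp_name were stored as NULL. */
    Optional<ParamKey> recover(long storedHashIdentity) {
        return Optional.ofNullable(keysByHash.get(storedHashIdentity));
    }
}
```

In this sketch an unknown hash yields an empty Optional, so callers have to tolerate rows whose names cannot be recovered.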
…-required-for-indexing-tables
…-required-for-indexing-tables # Conflicts: # hapi-fhir-jpaserver-model/src/test/java/ca/uhn/fhir/jpa/model/entity/ResourceIndexedSearchParamCoordsTest.java # hapi-fhir-jpaserver-model/src/test/java/ca/uhn/fhir/jpa/model/entity/ResourceIndexedSearchParamDateTest.java # hapi-fhir-jpaserver-model/src/test/java/ca/uhn/fhir/jpa/model/entity/ResourceIndexedSearchParamQuantityNormalizedTest.java # hapi-fhir-jpaserver-model/src/test/java/ca/uhn/fhir/jpa/model/entity/ResourceIndexedSearchParamQuantityTest.java # hapi-fhir-jpaserver-model/src/test/java/ca/uhn/fhir/jpa/model/entity/ResourceIndexedSearchParamStringTest.java # hapi-fhir-jpaserver-model/src/test/java/ca/uhn/fhir/jpa/model/entity/ResourceIndexedSearchParamTokenTest.java # hapi-fhir-jpaserver-model/src/test/java/ca/uhn/fhir/jpa/model/entity/ResourceIndexedSearchParamUriTest.java
Migration:
- HFJ_SPIDX tables: the SP_NAME and RES_TYPE columns are allowed to be nullable.
- The migration tasks are marked failureAllowed(), as both SP_NAME and RES_TYPE could be included in custom indexes, and SQL Server won't let us change nullability on columns with indexes pointing to them.

Optimization changes:
- HFJ_SPIDX tables: SP_NAME, RES_TYPE, and SP_UPDATED are nulled if this feature is enabled.
- JpaSearchParamCache is used to recover SP_NAME and RES_TYPE from hash_identity after loading from the DB.
- The HASH_IDENTITY field was added to the BaseResourceIndexedSearchParam entity and removed from all inheritor entities.
- ResourceIndexedSearchParam now uses HASH_IDENTITY instead of sp_name and res_type. This is required so that ResourceIndexedSearchParam objects with and without optimization compare as equal and do not cause unnecessary ResourceIndexedSearchParams updates (we compare the DB version of the entities with search params built in memory).
- DaoSearchParamSynchronizer logic checks whether existing search parameters need to be updated after an isIndexStorageOptimized change.
- The case where both isIncludePartitionInSearchHashes and isIndexStorageOptimized are enabled on the server: isIncludePartitionInSearchHashes is not supported if isIndexStorageOptimized is set to true.
- InMemoryResourceMatcher now uses hashIdentity instead of sp_name to filter SearchParams (only if optimization is enabled).
- hashIdentity is used instead of SP_NAME, RES_TYPE to build a query (only if optimization is enabled). This way the new optimization can work together with an enabled IndexMissingFields setting.
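The equality point in the description (entities with and without optimization must compare as equal) can be illustrated with a minimal sketch. It assumes equality is derived from the hash identity and the indexed value only. The class IndexedParamSketch and its fields are hypothetical and simplified; the actual equals/hashCode of the project's entities is not shown in this conversation.

```java
import java.util.Objects;

// Hypothetical, simplified stand-in for an indexed search parameter row; not HAPI FHIR code.
final class IndexedParamSketch {
    private final String resourceType; // may be null when index storage is optimized
    private final String paramName;    // may be null when index storage is optimized
    private final long hashIdentity;   // always populated
    private final String value;        // the indexed value

    IndexedParamSketch(String resourceType, String paramName, long hashIdentity, String value) {
        this.resourceType = resourceType;
        this.paramName = paramName;
        this.hashIdentity = hashIdentity;
        this.value = value;
    }

    // Equality deliberately ignores resourceType and paramName, so a row loaded with
    // nulled columns still matches the same parameter built in memory.
    @Override
    public boolean equals(Object other) {
        if (this == other) {
            return true;
        }
        if (!(other instanceof IndexedParamSketch)) {
            return false;
        }
        IndexedParamSketch that = (IndexedParamSketch) other;
        return hashIdentity == that.hashIdentity && Objects.equals(value, that.value);
    }

    // Must agree with equals: built from the same two fields.
    @Override
    public int hashCode() {
        return Objects.hash(hashIdentity, value);
    }
}
```

With this shape, an in-memory instance carrying names and a loaded instance carrying nulls are treated as the same index entry, so no update would be triggered for that row.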