Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

EZP-30759: Creating content with terminal escape codes chars inside Searchable field will return 500 error when using Solr #159

Open
wants to merge 2 commits into
base: 1.5
Choose a base branch
from

Conversation

kmadejski
Copy link
Member

Question Answer
JIRA issue EZP-30759
Bug yes
New feature no
Target version 1.5 and higher

This PR fixes the issue described in the mentioned JIRA ticket. The simple solution is to filter out all the terminal escape characters but just for Solr as Legacy storage handles it without any issue.

I'll add type hints on merge up, as 1.5 can be used with eZ Platform v1.x on PHP 5.x.

I'm about to create an issue in Solr's JIRA tracker, but as for now, I'm waiting for their devs to confirm that from their perspective this can be considered as a bug. I'll update the PR and add a comment in the code once the issue is created.

…earchable field will return 500 error when using Solr
Copy link
Contributor

@andrerom andrerom left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 with comments fixed :)

@@ -10,6 +10,7 @@ parameters:
ezpublish.search.solr.field_mapper.location.class: EzSystems\EzPlatformSolrSearchEngine\FieldMapper\LocationFieldMapper\Aggregate
ezpublish.search.solr.field_mapper.boost_factor_provider.class: EzSystems\EzPlatformSolrSearchEngine\FieldMapper\BoostFactorProvider
ezpublish.search.solr.field_mapper.boost_factor_provider.map: []
ez_search_engine_solr.invalid_characters_pattern: "#\\x1b[[][^A-Za-z]*[A-Za-z]#"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Comment with link to Solr issue here so we can track it.

@@ -42,22 +42,30 @@ class BlockDocumentsBaseContentFields extends ContentFieldMapper
*/
protected $sectionHandler;

/**
* @var string
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe something like this (here and on the others):

Suggested change
* @var string
* @var string Pattern with special characters that should be stripped from field value due to Solr not accepting them.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

'regexp pattern' ?

@gggeek
Copy link

gggeek commented Nov 14, 2019

-1. We should quote special chars instead of stripping them out. Esp. if this is done pervasively. Do you prohibit single quotes chars from ever being stored in varchar fields in databases?

@gggeek
Copy link

gggeek commented Nov 14, 2019

ps: I agree about filtering out a very limited set of characters, but not fe. '['.
Also, I'd try to push any stripping to the kernel, so that the data stored in Solr and in the db is as much in sync as possible.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
4 participants