-
Notifications
You must be signed in to change notification settings - Fork 4.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Support Kafka Managed IO #31172
Support Kafka Managed IO #31172
Changes from 6 commits
62a4b2a
b8c5b36
591726b
1a43ed2
8a2acfe
06b9bdb
1404d1c
96030e2
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -17,6 +17,7 @@ | |
*/ | ||
package org.apache.beam.sdk.managed; | ||
|
||
import static org.apache.beam.sdk.managed.ManagedTransformConstants.MAPPINGS; | ||
import static org.apache.beam.vendor.guava.v32_1_2_jre.com.google.common.base.Preconditions.checkArgument; | ||
|
||
import com.google.auto.service.AutoService; | ||
|
@@ -202,7 +203,26 @@ Row getConfigurationRow() { | |
static Row getRowConfig(ManagedConfig config, Schema transformSchema) { | ||
// May return an empty row (perhaps the underlying transform doesn't have any required | ||
// parameters) | ||
return YamlUtils.toBeamRow(config.resolveUnderlyingConfig(), transformSchema, false); | ||
String yamlConfig = config.resolveUnderlyingConfig(); | ||
Map<String, Object> configMap = YamlUtils.yamlStringToMap(yamlConfig); | ||
|
||
// The config Row object will be used to build the underlying SchemaTransform. | ||
// If a mapping for the SchemaTransform exists, we use it to update parameter names and align | ||
// with the underlying config schema | ||
Map<String, String> mapping = MAPPINGS.get(config.getTransformIdentifier()); | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. What happens if the transform identifier isn't found - semes like Iceberg already falls into that case, I'm also worried about typos though if folks don't use the enum (maybe rare) There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Similar to the reasoning above, a mapping exists only if the transform needs one. |
||
if (mapping != null && configMap != null) { | ||
Map<String, Object> remappedConfig = new HashMap<>(); | ||
for (Map.Entry<String, Object> entry : configMap.entrySet()) { | ||
String paramName = entry.getKey(); | ||
if (mapping.containsKey(paramName)) { | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Right now any parameters that aren't recognized are silently dropped. I think we should be at least warning here. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. The mapping contains parameter names that should be updated. If a parameter is not included in the mapping, we assume its original name is correct. But agreed some logging would be helpful. I'll add a log showing the final Row configuration used to build the underlying transform There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. P.S. alternatively we can make it mandatory to include all parameter names in the mapping, regardless if they need to be updated or not. (i'm open to this option but i find it unnecessarily strict) There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Ah got it - I think leaving it is fine, thanks
Sounds good, thank you! |
||
paramName = mapping.get(paramName); | ||
} | ||
remappedConfig.put(paramName, entry.getValue()); | ||
} | ||
configMap = remappedConfig; | ||
} | ||
|
||
return YamlUtils.toBeamRow(configMap, transformSchema, false); | ||
} | ||
|
||
Map<String, SchemaTransformProvider> getAllProviders() { | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would also be good to add a basic Kafka IT (though this is about testing Managed as much as it is Kafka). I'm fine deferring that to a future PR where we can address this together with Iceberg or including it here, up to you