storage: MySQL's MySqlColumnDesc fails to separate concerns #26999

Open
sploiselle opened this issue May 9, 2024 · 0 comments
Labels
C-refactoring Category: replacing or reorganizing code

Comments

@sploiselle
Contributor

MySQL sources store information about the tables' columns in MySqlColumnDesc:

pub struct MySqlColumnDesc {
    /// The name of the column.
    pub name: String,
    /// The intended data type of this column within Materialize
    /// If this is None, the column is intended to be skipped within Materialize
    pub column_type: Option<ColumnType>,
    /// Optional metadata about the column that may be necessary for decoding
    pub meta: Option<MySqlColumnMeta>,
}

Here is the definition of mz_repr::relation::ColumnType:

pub struct ColumnType {
    /// The underlying scalar type (e.g., Int32 or String) of this column.
    pub scalar_type: ScalarType,
    /// Whether this datum can be null.
    #[serde(default = "return_true")]
    pub nullable: bool,
}

MySqlColumnDesc::column_type:

  • Is an Option because it is None when users specify IGNORE COLUMNS on this column.
  • Is capable of expressing all columns in terms of ScalarType because if the MySQL type is unsupported, we force users to specify TEXT COLUMNS.

This fails to separate concerns: for example, MySqlColumnDesc cannot describe the actual upstream table's schema because we apply configuration options to it.
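To make the loss of information concrete, here is a minimal sketch using simplified stand-ins for the real types. `UpstreamType`, `ColumnOption`, and `describe_with_options` are hypothetical names invented for illustration, and `meta` is left out, so this only models the shape of the problem rather than the actual Materialize code:

```rust
// Simplified stand-ins for the real types. Everything other than the names
// MySqlColumnDesc, ColumnType and ScalarType is hypothetical.
#[derive(Debug, Clone, PartialEq)]
pub enum ScalarType {
    Int32,
    String,
}

#[derive(Debug, Clone, PartialEq)]
pub struct ColumnType {
    pub scalar_type: ScalarType,
    pub nullable: bool,
}

// `meta` is omitted to keep the sketch short.
#[derive(Debug, Clone, PartialEq)]
pub struct MySqlColumnDesc {
    pub name: String,
    pub column_type: Option<ColumnType>,
}

/// Hypothetical model of the upstream MySQL column type.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum UpstreamType {
    Int,
    Varchar,
}

/// Hypothetical model of the per-column user option.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum ColumnOption {
    AsIs,   // no option given for this column
    Text,   // column listed in TEXT COLUMNS
    Ignore, // column listed in IGNORE COLUMNS
}

/// Builds the description with the option already applied, which is the
/// coupling this issue describes.
pub fn describe_with_options(
    name: &str,
    upstream: UpstreamType,
    option: ColumnOption,
) -> MySqlColumnDesc {
    let column_type = match (option, upstream) {
        // Ignored columns lose their type entirely.
        (ColumnOption::Ignore, _) => None,
        // TEXT COLUMNS and a native VARCHAR both end up as String.
        (ColumnOption::Text, _) | (ColumnOption::AsIs, UpstreamType::Varchar) => {
            Some(ColumnType { scalar_type: ScalarType::String, nullable: true })
        }
        (ColumnOption::AsIs, UpstreamType::Int) => {
            Some(ColumnType { scalar_type: ScalarType::Int32, nullable: true })
        }
    };
    MySqlColumnDesc { name: name.to_string(), column_type }
}
```

In this model, an INT column listed in TEXT COLUMNS and a plain VARCHAR column produce identical descriptions, and an ignored column keeps no type at all, so the upstream schema cannot be read back from the description alone.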

If we fully separated concerns, purification would be able to describe the tables we want to ingest, and processing the TEXT COLUMNS and IGNORE COLUMNS options could occur in planning. With the current design, that is impossible.
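One possible shape for that split, as a rough sketch and not a proposal for the final API: purification emits a description of the upstream column only, and a planning step applies the options. `UpstreamColumnDesc`, `UpstreamType`, `ColumnOptions`, and `plan_column` are all hypothetical names, and `ScalarType` and `ColumnType` are simplified stand-ins for the real types:

```rust
use std::collections::BTreeSet;

// Simplified stand-ins for the real Materialize types.
#[derive(Debug, Clone, PartialEq)]
pub enum ScalarType {
    Int32,
    String,
}

#[derive(Debug, Clone, PartialEq)]
pub struct ColumnType {
    pub scalar_type: ScalarType,
    pub nullable: bool,
}

/// Hypothetical upstream MySQL types; `Geometry` stands in for any type
/// that has no supported mapping.
#[derive(Debug, Clone, PartialEq)]
pub enum UpstreamType {
    Int,
    Varchar,
    Geometry,
}

/// Hypothetical output of purification: the upstream schema and nothing else.
#[derive(Debug, Clone, PartialEq)]
pub struct UpstreamColumnDesc {
    pub name: String,
    pub upstream_type: UpstreamType,
    pub nullable: bool,
}

/// Hypothetical container for the user-specified options.
#[derive(Debug, Default)]
pub struct ColumnOptions {
    pub text_columns: BTreeSet<String>,
    pub ignore_columns: BTreeSet<String>,
}

/// Planning step: applies the options to an untouched upstream description.
/// Returns Ok(None) for ignored columns and an error for unsupported types
/// that were not listed in TEXT COLUMNS.
pub fn plan_column(
    desc: &UpstreamColumnDesc,
    opts: &ColumnOptions,
) -> Result<Option<ColumnType>, String> {
    if opts.ignore_columns.contains(&desc.name) {
        return Ok(None);
    }
    let scalar_type = if opts.text_columns.contains(&desc.name) {
        ScalarType::String
    } else {
        match desc.upstream_type {
            UpstreamType::Int => ScalarType::Int32,
            UpstreamType::Varchar => ScalarType::String,
            UpstreamType::Geometry => {
                return Err(format!(
                    "column {} has an unsupported type; specify TEXT COLUMNS",
                    desc.name
                ))
            }
        }
    };
    Ok(Some(ColumnType { scalar_type, nullable: desc.nullable }))
}
```

With this shape, the same `UpstreamColumnDesc` can be planned under different options without re-describing the table, and the upstream type stays available after the options are applied.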

@sploiselle sploiselle added the C-refactoring Category: replacing or reorganizing code label May 9, 2024