
db_append_table should not double escape already escaped quotes #270

Open
jd4ds opened this issue Jun 13, 2022 · 4 comments · May be fixed by #271

Comments

jd4ds commented Jun 13, 2022

Hey there,
inserting data into MySQL databases via a local file has been long awaited, and I was very happy to see it arrive. However, I may have found a small issue in how the TSV file is created: {readr} seems to escape quotes that are already escaped, which leads to unwanted results. My suggestion would be to use readr::write_delim(..., escape = "none"), unless the default escaping of readr was chosen on purpose.

Please let me know if you need more information on this!

Cheers,
Janis

escape_testing <- function(escape = "double") {
  file_default_escape <- tempfile(fileext = ".tsv")
  readr::write_delim(
    # `target_conn` is assumed to be an open MariaDB connection (not shown here)
    RMariaDB:::csv_quote(value,
                         warn_factor = TRUE,
                         conn = target_conn),
    file_default_escape,
    quote = "none",
    delim = "\t",
    na = "\\N",
    col_names = FALSE,
    escape = escape
  )
  # read the file back to see what was actually written
  value_default_escape <- readr::read_tsv(file = file_default_escape)
  print(as.data.frame(value_default_escape))
  unlink(file_default_escape)
}

value <-
  data.frame(test = c("test_header", 'Test "quote" escaping.'))

# Escape default value
escape_testing(escape = "double")
#> Rows: 1 Columns: 1
#> -- Column specification --------------------------------------------------------
#> Delimiter: "\t"
#> chr (1): test_header
#> 
#> i Use `spec()` to retrieve the full column specification for this data.
#> i Specify the column types or set `show_col_types = FALSE` to quiet this message.
#>                test_header
#> 1 Test ""quote"" escaping.

# `escape = "none"`
escape_testing(escape = "none")
#> Rows: 1 Columns: 1
#> -- Column specification --------------------------------------------------------
#> Delimiter: "\t"
#> chr (1): test_header
#> 
#> i Use `spec()` to retrieve the full column specification for this data.
#> i Specify the column types or set `show_col_types = FALSE` to quiet this message.
#>              test_header
#> 1 Test "quote" escaping.
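For what it's worth, the core of the behavior can be sketched independently of readr: with quote = "none", escape = "double" doubles every quote character on write, but since no surrounding quotes are emitted, nothing on the read side knows to collapse the doubling back. A minimal sketch (Python used purely for illustration, mirroring the values in the reprex above):

```python
# Minimal sketch of the escaping behavior; Python is used only for illustration.
value = 'Test "quote" escaping.'

# escape = "double": every quote character is doubled on write ...
escaped_double = value.replace('"', '""')

# escape = "none": the value is written verbatim.
escaped_none = value

# ... but with quote = "none" no enclosing quotes are written, so a reader
# has no signal to collapse "" back to ", and the doubled form survives:
print(escaped_double)  # Test ""quote"" escaping.
print(escaped_none)    # Test "quote" escaping.
```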

jd4ds commented Jun 13, 2022

To add on this a little:
readr::write_delim still seems considerably slower than, for instance, data.table::fwrite. Since this function is supposed to focus on speed, would it be an option to consider swapping the function that creates the TSV?

krlmlr (Member) commented Jun 14, 2022

Thanks, happy to review a PR. I don't have a strong opinion regarding the method used to create the TSVs, as long as it's robust.


jd4ds commented Jun 14, 2022

Sure, I can look into that.
But for now, is there any reason to use the default escaping from {readr}? Otherwise I would suggest just adding the escape = "none" argument to the function call as a quick fix for now. (I could open a PR for this as well if requested.)
Thanks!

@jd4ds jd4ds linked a pull request Jun 14, 2022 that will close this issue

jd4ds commented Jun 24, 2022

I tested the changes against all our MySQL database schemas for identity, and they seem to work just fine. I don't really know how DBItest works, and at first glance the issue seems to be with MariaDB, which I don't have access to.
Anything I can do here?
