Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Do not return invalid content type on text files #2474

Merged
merged 1 commit into from
Jan 17, 2021
Merged

Do not return invalid content type on text files #2474

merged 1 commit into from
Jan 17, 2021

Conversation

inkstak
Copy link
Contributor

@inkstak inkstak commented Apr 23, 2020

Fixes #2424

It also supports this edge case:
I have to deal with text files provided by public administrations with inconsistent extensions ( .N000777, .BATI, .txt, missing extensions) alongside known extensions (.csv, .pdf, images, etc.)

Returning a missing content type is quite different from returning an invalid content type.

@inkstak inkstak changed the title Do not returns invalid content type on text files Do not return invalid content type on text files Apr 23, 2020
@n-rodriguez
Copy link

Hi there! Any news?

1 similar comment
@n-rodriguez
Copy link

Hi there! Any news?


if type.nil?
type = ::MiniMime.lookup_by_filename(path).try(:content_type)
type = 'invalid/invalid' unless type.nil? || type.start_with?('text/')

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm curious why you didn't go with your original approach for image/* check only? There are a lot of other options like application/* or audio/* for which the issue will still be relevant

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The actual strategy on carrierwave is giving high confidence in the result of MimeMagic.by_magic:

  • If it returns something : it's valid
  • If it returns nothing : it's invalid

The approach I suggested in #2424 doesn't trust the result and still considers missing types as invalid:

  • If it returns an image : we double-check the type from the filename
  • If it's present but anything other than an image : it's valid
  • If it returns nothing : it's still invalid

The final solution tries to keep confidence in MimeMagic.by_magic but considers the nothing case:

  • If it returns something : it's still valid.
  • If it returns nothing : we double-check the type from the file name:
    • if it's still missing : it's just something unknown so let's return nil
    • if it looks like text : the types don't match but there is no risk in considering it valid
    • if it's anything else : it looks like type spoofing so let's return invalid type

I guess images, audio or video files not detected as such by MimeMagic.by_magic should be considered at risk as they can then be processed by sub-programs, added and played from HTML pages.

I've mixed feelings about application/* because it covers a lot of cases.

@zjeraar
Copy link

zjeraar commented Dec 29, 2020

Any chance of merging this?

@mshibuya mshibuya merged commit 8c6743c into carrierwaveuploader:master Jan 17, 2021
@mshibuya
Copy link
Member

Thank you and sorry for delay.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Bug with content type detection
5 participants