Skip to content
This repository has been archived by the owner on Jul 13, 2023. It is now read-only.

MIME type detection of Paperclip::MediaTypeSpoofDetector doesn't work with old versions of file #2527

Closed
vakuum opened this issue Jan 11, 2018 · 2 comments

Comments

@vakuum
Copy link
Contributor

vakuum commented Jan 11, 2018

In old versions of file a space is used to separate the MIME type from the encoding:

$ file --version
file-4.24

$ file -b --mime test.txt
text/html charset=us-ascii

$ cat /etc/SuSE-release 
SUSE Linux Enterprise Server 11 (x86_64)

In new versions of file a semicolon is used to separate the MIME type from the encoding:

$ file --version
file-5.25

$ file -b --mime test.txt
text/html; charset=us-ascii

$ lsb_release -d
Description:    Ubuntu 16.04.3 LTS

In media_type_spoof_detector.rb the split method is used to separate the MIME type from the encoding:

def type_from_file_command
...
	Paperclip.run("file", "-b --mime :file", :file => @file.path).split(/[:;]\s+/).first
...
end

But this doesn't work for old versions of file because the regexp doesn't match the space correctly:

> 'text/html charset=us-ascii'.split(/[:;]\s+/).first
=> "text/html charset=us-ascii"

> 'text/html; charset=us-ascii'.split(/[:;]\s+/).first
=> "text/html"

Moving the whitespace character into the character class solves the problem:

> 'text/html charset=us-ascii'.split(/[:;\s]+/).first
=> "text/html

> 'text/html; charset=us-ascii'.split(/[:;\s]+/).first
=> "text/html"

Monkey patch

Currently I am using the following monkey patch as a workaround:

module Paperclip
  class MediaTypeSpoofDetector
    private

    def type_from_file_command
      begin
        Paperclip.run('file', '-b --mime-type :file', :file => @file.path)
      rescue Cocaine::CommandLineError
        ''
      end
    end
  end
end
@vakuum
Copy link
Contributor Author

vakuum commented Jan 11, 2018

Fun fact 1

The file_command_content_type_detector.rb contains another implementation of the type_from_file_command method:

def type_from_file_command
  type = begin
    Paperclip.run("file", "-b --mime :file", :file => @filename)
    ...
  end
  ...
  type.split(/[:;\s]+/)[0]
end

The regexp of this implementation works correctly, because the whitespace character is already part of the character class.

Fun fact 2

The original implementation of Paperclip::MediaTypeSpoofDetector used the --mime-type parameter like the monkey patch above:

vakuum added a commit to vakuum/paperclip that referenced this issue Jan 13, 2018
vakuum added a commit to vakuum/paperclip that referenced this issue Jan 13, 2018
vakuum added a commit to vakuum/paperclip that referenced this issue Jan 13, 2018
vakuum added a commit to vakuum/paperclip that referenced this issue Jan 13, 2018
vakuum added a commit to vakuum/paperclip that referenced this issue Jan 13, 2018
vakuum added a commit to vakuum/paperclip that referenced this issue Jan 13, 2018
vakuum added a commit to vakuum/paperclip that referenced this issue Jan 13, 2018
@vakuum
Copy link
Contributor Author

vakuum commented Jan 15, 2018

I created a pull request for this issue: #2528.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants