Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Feature add an option to only copy files if they are different #9389

Closed

Conversation

boussaffawalid
Copy link

@boussaffawalid boussaffawalid commented Aug 5, 2021

Changelog: added new param to self.copy function to only copy files if they are different. Similar to the cmake function copy_if_different
Mainly for performance purpose to speedup imports when reusing the same build directory.

Docs: conan-io/docs#2186

  • I've read the Contributing guide.
  • I've followed the PEP8 style guides for Python code.
  • I've opened another PR in the Conan docs repo to the develop branch, documenting this one.

Note: By default this PR will skip the slower tests and will use a limited set of python versions. Check here how to increase the testing level by writing some tags in the current PR body text.

@CLAassistant
Copy link

CLAassistant commented Aug 5, 2021

CLA assistant check
All committers have signed the CLA.

Copy link
Member

@memsharded memsharded left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The feature is not completely clear.

In the case this will progress, it would need some tests, to be added in the PR before merging.

@@ -51,7 +52,7 @@ def report(self, output):
return report_copied_files(self._copied, output)

def __call__(self, pattern, dst="", src="", keep_path=True, links=False, symlinks=None,
excludes=None, ignore_case=True):
excludes=None, ignore_case=True, if_different=True):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is not clear in which scenarios it will be copying a file and the target already exists. The idea is that the self.copy() copies to an empty package folder, so this shouldn't be an issue.

Why is this being added? Performance?

In any case, it is extremely recommended to not change the current default, as the likelihood of breaking is much much higher. In this case the default should be if_different=False

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I made this draft PR to start a discussion, that is why there are no tests so far.
My understanding is that this function is being used in the conan imports method as well.

Use case:
When doing local development, it is very common to reuse the same build directory. Adding this new param if_different will optimize copying files to local build directory, specially when:

  • large files needs to be copied to the build directory
  • Some files in the build directory are already used by another process (currently the import fail, even there is no need to re import this file as it already exist)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Understood, thanks, yes, it makes sense.

Some hints about the future though:

  • We will make explicit the self.copy(), so maybe this could be added to the imports case only
  • In Conan 2.0 we will probably reduce or integrate the imports case with the deploy case. The general idea is that copying files over and over is indeed inefficient, and with virtualenvs it is not needed in most cases, specially development cases. It will only be needed for final extraction of files for final deployment, but not for regular development.

@@ -237,6 +239,7 @@ def _copy_files(files, src, dst, keep_path, symlinks):
pass
os.symlink(linkto, abs_dst_name) # @UndefinedVariable
else:
shutil.copy2(abs_src_name, abs_dst_name)
if not if_different or not os.path.exists(abs_dst_name) or not filecmp.cmp(abs_src_name, abs_dst_name):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the reason is performance, some data/benchmark about the savings would be great.

@boussaffawalid
Copy link
Author

I've use this benchmarking class

class DemoConan(ConanFile):
    name = 'demo'
    version = '1.0'
    license = ''
    settings = 'os', 'compiler', 'build_type', 'arch'
    #generators = 'cmake_find_package_multi'
    keep_imports = True
    no_copy_source = True

    def requirements(self):
        self.requires.add("cmake/3.20.2")

    def imports(self):
        start = time.time()
        for i in range(100):
            self.copy('*', root_package='cmake', src='bin', dst='bin', if_different=False)
        end = time.time()
        runtime = end - start
        print("100 imports took {} seconds to complete its execution for if_different=False".format(runtime))

        start = time.time()
        for i in range(100):
            self.copy('*', root_package='cmake', src='bin', dst='bin1', if_different=True)
        end = time.time()
        runtime = end - start
        print("100 imports took {} seconds to complete its execution for if_different=True".format(runtime))

    def build(self):
        pass

    def package(self):
        pass

and I got these results

100 imports took 10.610847234725952 seconds to complete its execution for if_different=False
100 imports took 0.14913082122802734 seconds to complete its execution for if_different=True

@boussaffawalid
Copy link
Author

@memsharded could you re-review this PR

@memsharded
Copy link
Member

I am afraid that this PR being red was buried in our notifications and never got reviewed properly.
Now the FileCopier has been deprecated and removed in 2.0, in favor of a copy() method, so I am closing this as outdated. If this functionality is desired there, it would be necessary to report against 2.0. Thanks for your contribution!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants