Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

race condition in FileUtils.copyFile() #47

Open
gregallen opened this issue Oct 10, 2018 · 14 comments
Open

race condition in FileUtils.copyFile() #47

gregallen opened this issue Oct 10, 2018 · 14 comments

Comments

@gregallen
Copy link

in a parallel multi-module maven build, running copy-dependencies with a shared output directory...

more than one module may attempt to copy the file at once. The length check can then fail since it sees the size of the partially copied new file.

suggest using the return value from nio Files copyFile, which is the number of bytes copied. can be safely compared against the size of the source file.

also, you can surely rip out the old pre-nio code and drop java 6 support?

@gregallen
Copy link
Author

also raised in Maven dependencies plugin: https://issues.apache.org/jira/browse/MDEP-633

@michael-o
Copy link
Member

Can you provide a minimal viable example project for this? I'd assume even if you have one shared output dir the parallel builders build disjoint stuff.

@gregallen
Copy link
Author

Just need a parent with two children
Both share an ideally large dependency
Build with -T2 clean dependency:copy
Repeat until fail

@gregallen
Copy link
Author

$ more pom.xml */pom.xml | cat
::::::::::::::
pom.xml
::::::::::::::
<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
    <modelVersion>4.0.0</modelVersion>

    <groupId>org.apache.maven</groupId>
    <artifactId>mdep-633</artifactId>
    <version>1.0-SNAPSHOT</version>
    <packaging>pom</packaging>

    <modules>
        <module>mod1</module>
        <module>mod2</module>
    </modules>

    <build>
        <plugins>
            <plugin>
                <artifactId>maven-dependency-plugin</artifactId>
                <version>3.1.1</version>
                <configuration>
                    <outputDirectory>${project.basedir}/../target</outputDirectory>
                </configuration>
            </plugin>
        </plugins>
    </build>

</project>
::::::::::::::
mod1/pom.xml
::::::::::::::
<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
    <modelVersion>4.0.0</modelVersion>

    <groupId>org.apache.maven</groupId>
    <artifactId>mod1</artifactId>
    <version>1.0-SNAPSHOT</version>

    <parent>
        <groupId>org.apache.maven</groupId>
        <artifactId>mdep-633</artifactId>
        <version>1.0-SNAPSHOT</version>
    </parent>

    <dependencies>
        <dependency>
            <groupId>org.quickfixj</groupId>
            <artifactId>quickfixj-messages-all</artifactId>
            <version>2.1.0</version>
            <classifier>javadoc</classifier>
        </dependency>
    </dependencies>

</project>
::::::::::::::
mod2/pom.xml
::::::::::::::
<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
    <modelVersion>4.0.0</modelVersion>

    <groupId>org.apache.maven</groupId>
    <artifactId>mod2</artifactId>
    <version>1.0-SNAPSHOT</version>

    <parent>
        <groupId>org.apache.maven</groupId>
        <artifactId>mdep-633</artifactId>
        <version>1.0-SNAPSHOT</version>
    </parent>

    <dependencies>
        <dependency>
            <groupId>org.quickfixj</groupId>
            <artifactId>quickfixj-messages-all</artifactId>
            <version>2.1.0</version>
            <classifier>javadoc</classifier>
        </dependency>
    </dependencies>

</project>

@gregallen
Copy link
Author

gregallen commented Dec 7, 2018

nice -19 mvn -T4 clean org.apache.maven.plugins:maven-dependency-plugin:copy-dependencies

[ERROR] Failed to execute goal org.apache.maven.plugins:maven-dependency-plugin:3.1.1:copy-dependencies (default-cli) on project mod1: Error copying artifact from /home/X/.m2/repository/org/quickfixj/quickfixj-messages-all/2.1.0/quickfixj-messages-all-2.1.0-javadoc.jar to /home/X/workspace-gitlab/my_sandbox/mdep633/mod1/../target/quickfixj-messages-all-2.1.0-javadoc.jar: Failed to copy full contents from /home/X/.m2/repository/org/quickfixj/quickfixj-messages-all/2.1.0/quickfixj-messages-all-2.1.0-javadoc.jar to /home/X/workspace-gitlab/my_sandbox/mdep633/mod1/../target/quickfixj-messages-all-2.1.0-javadoc.jar -> [Help 1]

@gregallen
Copy link
Author

Caused by: java.io.IOException: Failed to copy full contents from .m2/repository/org/quickfixj/quickfixj-messages-all/2.1.0/quickfixj-messages-all-2.1.0-javadoc.jar to my_sandbox/mdep633/mod1/../target/quickfixj-messages-all-2.1.0-javadoc.jar
    at org.codehaus.plexus.util.FileUtils.copyFile (FileUtils.java:1090)
    at org.apache.maven.plugins.dependency.AbstractDependencyMojo.copyFile (AbstractDependencyMojo.java:184)
    at org.apache.maven.plugins.dependency.fromDependencies.CopyDependenciesMojo.copyArtifact (CopyDependenciesMojo.java:249)
    at org.apache.maven.plugins.dependency.fromDependencies.CopyDependenciesMojo.doExecute (CopyDependenciesMojo.java:124)
    at org.apache.maven.plugins.dependency.AbstractDependencyMojo.execute (AbstractDependencyMojo.java:143)
    at org.apache.maven.plugin.DefaultBuildPluginManager.executeMojo (DefaultBuildPluginManager.java:137)
    at org.apache.maven.lifecycle.internal.MojoExecutor.execute (MojoExecutor.java:208)
    at org.apache.maven.lifecycle.internal.MojoExecutor.execute (MojoExecutor.java:154)
    at org.apache.maven.lifecycle.internal.MojoExecutor.execute5.jar
 (MojoExecutor.java:146)
    at org.apache.maven.lifecycle.internal.LifecycleModuleBuilder.buildProject (LifecycleModuleBuilder.java:117)
    at org.apache.maven.lifecycle.internal.builder.multithreaded.MultiThreadedBuilder$1.call (MultiThreadedBuilder.java:200)
    at org.apache.maven.lifecycle.internal.builder.multithreaded.MultiThreadedBuilder$1.call (MultiThreadedBuilder.java:196)
    at java.util.concurrent.FutureTask.run (FutureTask.java:266)
    at java.util.concurrent.Executors$RunnableAdapter.call (Executors.java:511)
    at java.util.concurrent.FutureTask.run (FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker (ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run (ThreadPoolExecutor.java:624)
    at java.lang.Thread.run (Thread.java:748)

@gregallen
Copy link
Author

@michael-o please see test case above

@michael-o
Copy link
Member

@gregallen Thanks for the case. This is, of course, a conceptual problem. I guess this case was never considered. How do you want to make it thread-safe? It need some locking mechanism. I do not unterstand how the amount of written bytes will help if two threads are writing the same moment with different regions. Do you have a PR you'd like to propose? I am all ears..

@gregallen
Copy link
Author

@michael-o I suggest that the low-level file writing method should return the total number of bytes written, and that is what you should check against the size of the source file.

Do not attempt to check the size of the destination file, since another racing thread may have already "accidentally" deleted (or truncated) the newly copied file, and be in the middle of re-copying it.

It's not ideal that the file gets copied twice, but not a big deal. Not worth trying to devise some locking approach to avoid that. Instead just simplify the check IMHO

@gregallen
Copy link
Author

Looks like org.codehaus.plexus.util.IOUtil#copy(java.io.InputStream, java.io.OutputStream, int) needs to change to return a long and ripple that return value up...

@gregallen
Copy link
Author

OR, get rid of the check altogether - the javadoc for java.io.OutputStream#write(byte[], int, int) implies that the bytes WILL be written if no exception is thrown, and there is no exceptional code path back to the size check.

@belingueres
Copy link
Contributor

@gregallen Is it an option to refactor the pom files so that each module copy files to its own directory, then merge those directories into one in a later phase? seems to me that the plugin is not designed to concurrently copy the same files into a common directory.

@gregallen
Copy link
Author

no, that would not be possible. There are 150 modules, and 95% of the dependencies are identical. However, there are a few modules that require differing versions of some artifacts, so it is also not possible to do the copy from a single module (since that would only copy one version per artifact). The build runs with 20x parallelism, so the race condition is often seen

@gregallen
Copy link
Author

What is the scenario in which the File.length() check adds value? In the out-of-diskspace case, doesn't an IOException get thrown?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants