Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

THRIFT-5682: Move default constructor and operator== implementation to CPP file #2755

Merged
merged 5 commits into from
Mar 8, 2024

Conversation

tinloaf
Copy link
Contributor

@tinloaf tinloaf commented Feb 8, 2023

Overview

Both the default constructor and operator== implementations reference certain member functions of the class' members. As an example, the default constructor references (i.e., "uses") the default constructors of its members.

If a class contains a std::vector<Foo>, and Foo has only been forward declared (which happens often in Thrift-generated code), this creates undefined behavior: The std::vector specification states that as long as Foo is an incomplete type, it is fine to reference std::vector<Foo>, but not any members (such as its default constructor).

Thus, we must defer our default constructor's implementation (which references the default constructor of std::vector) to a point where Foo is a complete type. That is the case in the .cpp file.

The same holds for operator==.

Example

Take this very simple Thrift file:

struct StructA {
    11:required list<StructB> myBs
}

struct StructB
{
    1:required string someString
}

If I compile this using thrift --gen cpp:no_skeleton -o out ./file.thrift I get a header file that contains the following (full file here ):

 class StructA;
 class StructB;

 class StructA : … {
  public:
     …
     StructA() noexcept {
     }
     …
     std::vector<StructB>  myBs;
     …
 };

 …

 class StructB : … {
     …
 };

In this case, the default constructor for StructA references the default constructor of std::vector<StructB> while StructB is still an incomplete type. This is undefined behavior. It did apparently compile with all big compilers until recently, but with C++20, Clang 15 stops accepting this kind of construct, as you can see here at CompilerExplorer.

Checklist

  • Did you create an Apache Jira ticket? (not required for trivial changes)
  • If a ticket exists: Does your pull request title follow the pattern "THRIFT-NNNN: describe my issue"?
    Note: As far as I can tell, no ticket exists for this.
  • Did you squash your changes to a single commit? (not required, but preferred)
  • Did you do your best to avoid breaking changes? If one was needed, did you label the Jira ticket with "Breaking-Change"?

@Jens-G Jens-G added need JIRA ticket Please create a JIRA ticket to track this issue c++ labels Feb 8, 2023
@Jens-G Jens-G changed the title [TICKET NEEDED] C++ generator: Move default constructor and operator== implementation to CPP file Move default constructor and operator== implementation to CPP file Feb 8, 2023
@tinloaf
Copy link
Contributor Author

tinloaf commented Feb 9, 2023

I don't think that the failing checks are caused by my change. Most of the failing tests are in some other languages, and the C++ tests (which are run with make) run fine on my machine.

@tinloaf
Copy link
Contributor Author

tinloaf commented Feb 9, 2023

Hi @Jens-G , thanks for having a look. How do we proceed with the Jira ticket? Should I contact someone to get a Jira account? If so, whom? The user mailing list did not help. Or are you creating a Jira ticket for this? Thanks!

@Jens-G
Copy link
Member

Jens-G commented Feb 9, 2023

We need some info to create the account for you. Send this to jensg at apache dot com and I can do the rest. Sorry for the inconveniences.

@tinloaf
Copy link
Contributor Author

tinloaf commented Feb 9, 2023

@Jens-G Don't worry, I actually got ahead and contacted a friend who's involved in some other Apache project. I have an account, will create the ticket tomorrow.

@tinloaf tinloaf changed the title Move default constructor and operator== implementation to CPP file THRIFT-5682: Move default constructor and operator== implementation to CPP file Feb 10, 2023
@tinloaf
Copy link
Contributor Author

tinloaf commented Feb 10, 2023

@Jens-G I created a Jira ticket here: https://issues.apache.org/jira/browse/THRIFT-5682

Somehow Jira mangled my code examples, I'll try to clean them up later.

@Jens-G Jens-G removed the need JIRA ticket Please create a JIRA ticket to track this issue label Feb 10, 2023
@tinloaf
Copy link
Contributor Author

tinloaf commented Apr 3, 2023

I would love to move forward with this. I think this is more or less a necessary precondition to use Apache Thrift with C++20 - and even before C++20, the previous behavior is UB. We are using a Thrift version with this patch in production by now. Maybe a review of my proposed changes could be a first step towards a merge?

@Jens-G @cyyever @kashirin-alex @zeshuai007 @dsandbrink you contributed to this file "relatively" recently (although it was potentially several years ago - there is not much movement in this file). Maybe one of you would feel up to the task? Thanks a lot in advance!

@cyyever
Copy link
Contributor

cyyever commented Apr 3, 2023

I think the changes are fine if they pass C++ tests.

@kashirin-alex
Copy link
Contributor

kashirin-alex commented Apr 3, 2023 via email

@kashirin-alex
Copy link
Contributor

kashirin-alex commented Apr 3, 2023 via email

@bsergean
Copy link

bsergean commented Apr 4, 2023

Hi there, for context Apple released a new version of Xcode (which can be auto-updated with the rest of the OS), they might prepare for the Apple Developer Conference and a new Xcode 15 release.

In this version of xcode, this error show up when using C++20 mode, while it wasn't the case before, so a lot of code-base are breaking, pytorch for example had this problem.

@bsergean
Copy link

bsergean commented Apr 4, 2023

ChildService.obj : warning LNK4006: "public: __cdecl apache::thrift::test::ParentService_addString_args::ParentService_addString_args(void)" (??0ParentService_addString_args@test@thrift@apache@@QEAA@XZ) already defined in ParentService.obj; second definition ignored [C:\projects\build\MSVC2017\x64\lib\cpp\test\testgencpp_cob.vcxproj]
ChildService.obj : warning LNK4006: "public: __cdecl 

One of the failed appveyor job has those, double definitions ?
I noticed that the pytorch PR for an equivalent bug had some export macro for some similar issues, but maybe it's not relevant and related to saying an API has changed.

Here it is for reference, if it can gives ideas -> pytorch/pytorch@e7874ee

@tinloaf
Copy link
Contributor Author

tinloaf commented Apr 4, 2023

Hi @bsergean Yeah, I just started noticing these, too, when I tried to build with the warning settings suggested by @kashirin-alex . I have no clue why I did not see these in my earlier tests. I just pushed the fix for that.

@bsergean
Copy link

bsergean commented Apr 4, 2023 via email

@bsergean
Copy link

bsergean commented Apr 4, 2023 via email

@tinloaf
Copy link
Contributor Author

tinloaf commented Apr 5, 2023

Thanks @Jens-G for triggering the CI builds again. I'm not sure whether I interpret the results, especially the failures, correctly:

  • the "Build / lib-go (1.19)" and "Build / lib-go (1.20)" runs fail/are cancelled. There is something about a missing go.sum file in the logs. I'll assume that this is not related to my change?

  • The "Build / cross-test" build is skipped, I assume as a result of lib-go not being built?

  • The AppVeyor build fails. That's the one where I'm not sure whether I am at fault. I see three failed builds, two of them with MSVC, one with MinGW. I do have MSVC on my machine, however I have been developing/building/testing Thrift under Linux via WSL1, using GCC and Clang. Maybe I need to set up an MSVC build. See below for details on the three failed builds. However, if I look at the AppVeyor builds of the latest commit on the master branch, it seems to me that these are the exact same three errors. So, I also assume that my changes have nothing to do with these errors.

Failed AppVeyor Builds

  • For the first AppVeyor build, I don't see any error. These are the last lines in the log:
  Building Custom Rule C:/projects/thrift/lib/cpp/test/CMakeLists.txt
  ProcessorTest.cpp
  EventLog.cpp
  ServerThread.cpp
  Generating Code...
  processor_test.vcxproj -> C:\projects\build\MSVC2017\x64\bin\Release\processor_test.exe
  Command exited with code 1

This is weird. Usually, MSVC does print an error if it fails to compile / link something…

  • For the second build, the log is similar but warns about a library compatibility problem:
  Building Custom Rule C:/projects/thrift/lib/cpp/test/CMakeLists.txt
  ProcessorTest.cpp
  EventLog.cpp
  ServerThread.cpp
  Generating Code...
LINK : warning LNK4098: defaultlib 'LIBCMT' conflicts with use of other libs; use /NODEFAULTLIB:library [C:\projects\build\MSVC2015\x86\lib\cpp\test\processor_test.vcxproj]
  processor_test.vcxproj -> C:\projects\build\MSVC2015\x86\bin\Debug\processor_test.exe
  processor_test.vcxproj -> C:/projects/build/MSVC2015/x86/bin/Debug/processor_test.pdb (Full PDB)
Command exited with code 1
  • The third build (the MinGW one) fails with undefined references to boost::unit_test. I'll just assume that this is not caused by my changes.

@bsergean
Copy link

bsergean commented Apr 5, 2023

@tinloaf / maybe one way to prove your theory would be to make a separate 'no-op' PR, (say that add #include header somewhere to the generated code), and see if those 3 same errors + the golang one still show up.

@tinloaf
Copy link
Contributor Author

tinloaf commented Apr 26, 2023

@bsergean I could do that, but I think this would just run the same tests as are run on all commits on master anyways, and those are failing, too, I guess. Also I think that @Jens-G would have to manually authorize those tests anyway.

However, I think that the failing Go tests have been fixed in 5cdf890. I will rebase this PR on top of the current master and would then expect the Go tests to succeed, but the cross-tests to still fail (as they do on master).

@tinloaf
Copy link
Contributor Author

tinloaf commented Apr 27, 2023

I did the rebase - @Jens-G if you approve the workflows to run I think the go tests should pass now.

@tinloaf
Copy link
Contributor Author

tinloaf commented Jul 19, 2023

Just for clarification @Jens-G - are the failing cross-tests (which I still think are not failing because of this change) blocking this PR? Or is there anything else I can do to help you? Is this a change that's just not wanted in Thrift, in which case we should discuss other ways of getting rid of the UB?

Don't get me wrong, I understand this is FOSS software, everyone is doing as much as they can/want, nobody is paid for their time working on this, and everyone has different priorities. I'm just trying to get this moving again. We are using Thrift with this patch applied in production for a while now, it's working, and it would of course be preferrable not to roll our own patched Thrift version.

@cyyever
Copy link
Contributor

cyyever commented Jul 20, 2023

I think the cross test failures should be fixed first if caused by this PR.

@tinloaf
Copy link
Contributor Author

tinloaf commented Jul 20, 2023

I think the cross test failures should be fixed first if caused by this PR.

The following cross-tests currently fail on the current master (see here):

(kotlin,swift)
(kotlin,java,kotlin)
(kotlin,go,rs)
(swift,java,kotlin)
(rs,swift)
(java,swift)
(rs,go,rs)
(go,java,kotlin)
(rs,java,kotlin)
(java,go,rs)
(java,java,kotlin)

The following cross-tests fail in this PR:

(java,go,rs)
(java,swift)
(kotlin,go,rs)
(kotlin,swift)
(go,java,kotlin)
(rs,java,kotlin)
(rs,go,rs)
(rs,swift)
(swift,java,kotlin)

If I'm not mistaken, the set of faling cross-tests in this PR is a subset of the failing cross-tests on master (which this PR was forked from). Also, none of these cross-tests involve C++, and I only made changes to the C++ generator.

Both leads me to believe that my changes do not contribute to breaking all these cross-tests.

I'm sorry, but I won't have the time (or the knowledge) to fix all these independently broken cross-tests. If that is a precondition for acceptance of this PR, I will have to abandon it (and just keep using a privately patched Thrift version).

@emmenlau
Copy link
Member

This is a nice, and very relevant change. I will try to find some time to look into it. @tinloaf could you kindly rebase your changes on latest master and force push one more time, so we get more up-to-date test results? Thanks a lot!

Both the default constructor and operator== implementations reference
certain member functions of the class' members. As an example, the default
constructor references (i.e., "uses") the default constructors of its
members.

If a class contains a std::vector<Foo>, and Foo has only been *forward*-
declared (which happens often in Thrift-generated code), this creates
undefined behavior: The std::vector specification states that as long as
Foo is an incomplete type, it is fine to reference std::vector<Foo>, but
not any members (such as its default constructor).

Thus, we must defer our default constructor's implementation (which references
the default constructor of std::vector<Foo>) to a point where Foo is a
complete type. That is the case in the .cpp file.

The same holds for operator==.
@tinloaf
Copy link
Contributor Author

tinloaf commented Oct 23, 2023

Hi @emmenlau ,

thanks for reviving this. :)

I just rebased and force-pushed. It seems that after the rebase, I cannot execute the bin/UnitTests executable locally anymore (and I think I could back in February?). The output I see is:

> ./bin/UnitTests
Running 61 test cases...
unknown location(0): fatal error: in "ToStringTest/locale_de_DE_floating_point_to_string": std::runtime_error: locale::facet::_S_create_c_locale name not valid
~/src/thrift/thrift/lib/cpp/test/ToStringTest.cpp(57): last checkpoint: "locale_de_DE_floating_point_to_string" test entry
An error message from getaddrinfo on the console is expected:
Thrift: Mon Oct 23 11:03:28 2023 getaddrinfo() -> -2; Name or service not known

*** 1 failure is detected in the test module "thrift"

Looks like a locale is requested that my system does not have? This may be because I'm working in WSL, but I'm pretty sure it is not caused by my changes, which do not touch anything around locales.

Edit: Okay, this was easy, I just needed to generate the de_DE.UTF-8 locale on my system. Now the UnitTests pass locally.

@tinloaf
Copy link
Contributor Author

tinloaf commented Dec 9, 2023

Hi @emmenlau , is there anything more I can do to help getting this in?

@tinloaf
Copy link
Contributor Author

tinloaf commented Mar 8, 2024

Hi @emmenlau @Jens-G any chance of getting this moving? It's been over a year now, and as I wrote multiple times already: This PR does not break any of the tests. All the tests reported as broken in this PR are broken because they are broken in master (or were broken when I last rebased). If I'm mistaken, please tell me which tests break because of this.

We're using a Thrift version patched with this PR in production for over a year now, since it is otherwise impossible to build Thrift-generated code in C++20 mode, as @bsergean confirmed.

I will gladly rebase the changes one more time, but only if somebody actually takes a look at them then. Just spending the time of rebasing them every couple of months with zero effect seems kind of a waste.

@Jens-G Jens-G merged commit 9bd8f1e into apache:master Mar 8, 2024
1 of 2 checks passed
@Jens-G
Copy link
Member

Jens-G commented Mar 8, 2024

You are right. Quite a bunch of eyes had seen/evaluated this so I guess its good to merge now.

@tinloaf
Copy link
Contributor Author

tinloaf commented Mar 8, 2024

Thanks a lot. :)

@CJCombrink
Copy link
Contributor

I guess https://issues.apache.org/jira/browse/THRIFT-5682 can be closed as well

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
7 participants