Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[clang-tidy] Optimize realpath in readability-identifier-naming #92659

Merged

Conversation

PiotrZSL
Copy link
Member

@PiotrZSL PiotrZSL commented May 18, 2024

@llvmbot
Copy link
Collaborator

llvmbot commented May 18, 2024

@llvm/pr-subscribers-clang-tidy

@llvm/pr-subscribers-clang-tools-extra

Author: Piotr Zegar (PiotrZSL)

Changes
  • Reduce disk IO usage by adding cache to an realpath introduced by #81985
  • Refactor HungarianNotation::getDeclTypeName to remove some string copies and use more safe string API.

Full diff: https://github.com/llvm/llvm-project/pull/92659.diff

2 Files Affected:

  • (modified) clang-tools-extra/clang-tidy/readability/IdentifierNamingCheck.cpp (+35-33)
  • (modified) clang-tools-extra/clang-tidy/readability/IdentifierNamingCheck.h (+2)
diff --git a/clang-tools-extra/clang-tidy/readability/IdentifierNamingCheck.cpp b/clang-tools-extra/clang-tidy/readability/IdentifierNamingCheck.cpp
index c3208392df156..b2ab6e283c079 100644
--- a/clang-tools-extra/clang-tidy/readability/IdentifierNamingCheck.cpp
+++ b/clang-tools-extra/clang-tidy/readability/IdentifierNamingCheck.cpp
@@ -305,27 +305,20 @@ std::string IdentifierNamingCheck::HungarianNotation::getDeclTypeName(
   auto &SM = VD->getASTContext().getSourceManager();
   const char *Begin = SM.getCharacterData(VD->getBeginLoc());
   const char *End = SM.getCharacterData(VD->getEndLoc());
-  intptr_t StrLen = End - Begin;
+  if (End < Begin)
+    return {};
 
   // FIXME: Sometimes the value that returns from ValDecl->getEndLoc()
   // is wrong(out of location of Decl). This causes `StrLen` will be assigned
   // an unexpected large value. Current workaround to find the terminated
   // character instead of the `getEndLoc()` function.
-  const char *EOL = strchr(Begin, '\n');
-  if (!EOL)
-    EOL = Begin + strlen(Begin);
-
-  const char *PosList[] = {strchr(Begin, '='), strchr(Begin, ';'),
-                           strchr(Begin, ','), strchr(Begin, ')'), EOL};
-  for (const auto &Pos : PosList) {
-    if (Pos > Begin)
-      EOL = std::min(EOL, Pos);
-  }
+  llvm::StringRef NameRef(Begin, End - Begin);
+  auto Pos = NameRef.find_first_of("\n\r=;,)");
+  if (Pos != llvm::StringRef::npos)
+    NameRef = NameRef.substr(0, Pos);
 
-  StrLen = EOL - Begin;
-  std::string TypeName;
-  if (StrLen > 0) {
-    std::string Type(Begin, StrLen);
+  if (!NameRef.empty()) {
+    std::string Type(NameRef.begin(), NameRef.end());
 
     static constexpr StringRef Keywords[] = {
         // Constexpr specifiers
@@ -339,66 +332,67 @@ std::string IdentifierNamingCheck::HungarianNotation::getDeclTypeName(
 
     // Remove keywords
     for (StringRef Kw : Keywords) {
-      for (size_t Pos = 0;
-           (Pos = Type.find(Kw.data(), Pos)) != std::string::npos;) {
+      for (size_t Pos = 0; (Pos = Type.find(Kw, Pos)) != std::string::npos;) {
         Type.replace(Pos, Kw.size(), "");
       }
     }
-    TypeName = Type.erase(0, Type.find_first_not_of(' '));
 
     // Remove template parameters
     const size_t Pos = Type.find('<');
     if (Pos != std::string::npos) {
-      TypeName = Type.erase(Pos, Type.size() - Pos);
+      Type.erase(Pos);
     }
 
     // Replace spaces with single space.
     for (size_t Pos = 0; (Pos = Type.find("  ", Pos)) != std::string::npos;
-         Pos += strlen(" ")) {
-      Type.replace(Pos, strlen("  "), " ");
+         Pos += 2U) {
+      Type.replace(Pos, 2U, " ");
     }
 
     // Replace " &" with "&".
     for (size_t Pos = 0; (Pos = Type.find(" &", Pos)) != std::string::npos;
-         Pos += strlen("&")) {
-      Type.replace(Pos, strlen(" &"), "&");
+         Pos += 2U) {
+      Type.replace(Pos, 2U, "&");
     }
 
     // Replace " *" with "* ".
     for (size_t Pos = 0; (Pos = Type.find(" *", Pos)) != std::string::npos;
-         Pos += strlen("*")) {
-      Type.replace(Pos, strlen(" *"), "* ");
+         Pos += 1U) {
+      Type.replace(Pos, 2U, "* ");
     }
 
+    Type.erase(0, Type.find_first_not_of(' '));
+
     // Remove redundant tailing.
     static constexpr StringRef TailsOfMultiWordType[] = {
         " int", " char", " double", " long", " short"};
     bool RedundantRemoved = false;
     for (auto Kw : TailsOfMultiWordType) {
-      size_t Pos = Type.rfind(Kw.data());
+      size_t Pos = Type.rfind(Kw);
       if (Pos != std::string::npos) {
         const size_t PtrCount = getAsteriskCount(Type, ND);
-        Type = Type.substr(0, Pos + Kw.size() + PtrCount);
+        Type.erase(Pos + Kw.size() + PtrCount);
         RedundantRemoved = true;
         break;
       }
     }
 
-    TypeName = Type.erase(0, Type.find_first_not_of(' '));
+    Type.erase(0, Type.find_first_not_of(' '));
     if (!RedundantRemoved) {
       std::size_t FoundSpace = Type.find(' ');
       if (FoundSpace != std::string::npos)
         Type = Type.substr(0, FoundSpace);
     }
 
-    TypeName = Type.erase(0, Type.find_first_not_of(' '));
+    Type.erase(0, Type.find_first_not_of(' '));
 
     QualType QT = VD->getType();
     if (!QT.isNull() && QT->isArrayType())
-      TypeName.append("[]");
+      Type.append("[]");
+    return Type;
   }
 
-  return TypeName;
+  return {};
 }
 
 IdentifierNamingCheck::IdentifierNamingCheck(StringRef Name,
@@ -1414,13 +1408,21 @@ IdentifierNamingCheck::getDiagInfo(const NamingCheckId &ID,
                   }};
 }
 
+StringRef IdentifierNamingCheck::getRealFileName(StringRef FileName) const {
+  auto Iter = RealFileNameCache.try_emplace(FileName);
+  SmallString<256U> &RealFileName = Iter.first->getValue();
+  if (!Iter.second)
+    return RealFileName;
+  llvm::sys::fs::real_path(FileName, RealFileName);
+  return RealFileName;
+}
+
 const IdentifierNamingCheck::FileStyle &
 IdentifierNamingCheck::getStyleForFile(StringRef FileName) const {
   if (!GetConfigPerFile)
     return *MainFileStyle;
 
-  SmallString<128> RealFileName;
-  llvm::sys::fs::real_path(FileName, RealFileName);
+  StringRef RealFileName = getRealFileName(FileName);
   StringRef Parent = llvm::sys::path::parent_path(RealFileName);
   auto Iter = NamingStylesCache.find(Parent);
   if (Iter != NamingStylesCache.end())
diff --git a/clang-tools-extra/clang-tidy/readability/IdentifierNamingCheck.h b/clang-tools-extra/clang-tidy/readability/IdentifierNamingCheck.h
index 27c8e4bc768c4..646ec0eac8dd1 100644
--- a/clang-tools-extra/clang-tidy/readability/IdentifierNamingCheck.h
+++ b/clang-tools-extra/clang-tidy/readability/IdentifierNamingCheck.h
@@ -205,6 +205,7 @@ class IdentifierNamingCheck final : public RenamerClangTidyCheck {
                        const NamingCheckFailure &Failure) const override;
 
   const FileStyle &getStyleForFile(StringRef FileName) const;
+  StringRef getRealFileName(StringRef FileName) const;
 
   /// Find the style kind of a field in an anonymous record.
   StyleKind findStyleKindForAnonField(
@@ -222,6 +223,7 @@ class IdentifierNamingCheck final : public RenamerClangTidyCheck {
   /// Stores the style options as a vector, indexed by the specified \ref
   /// StyleKind, for a given directory.
   mutable llvm::StringMap<FileStyle> NamingStylesCache;
+  mutable llvm::StringMap<SmallString<256U>> RealFileNameCache;
   FileStyle *MainFileStyle;
   ClangTidyContext *Context;
   const bool GetConfigPerFile;

- Reduce disk IO usage by adding cache to an realpath
  introduced by llvm#81985
@PiotrZSL PiotrZSL force-pushed the readability-identifier-naming-realpath-cache branch from 45bce19 to a2ae273 Compare May 21, 2024 18:46
Copy link
Contributor

@5chmidti 5chmidti left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This halves the time clang-tidy takes to run this check on IdentifierNamingCheck.cpp for me :)

Maybe add a release note?

@PiotrZSL
Copy link
Member Author

No need for release notes as this realpath stuff were added in this release anyway. So this is just a fix for degradation.

@PiotrZSL PiotrZSL merged commit c96860a into llvm:main May 28, 2024
4 checks passed
@PiotrZSL PiotrZSL deleted the readability-identifier-naming-realpath-cache branch May 28, 2024 20:09
vg0204 pushed a commit to vg0204/llvm-project that referenced this pull request May 29, 2024
…#92659)

- Reduce disk IO usage by adding cache to an realpath introduced by
llvm#81985
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

readability-identifier-naming reports nothing but does something when no CheckOptions are provided
4 participants