Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refactor collections_dir feature for consistency #6685

Merged
merged 4 commits into from Jan 25, 2018

Conversation

ashmaroli
Copy link
Member

@ashmaroli ashmaroli commented Jan 8, 2018

I believe this removes bugs with the recently released collection_dir feature reported via jekyll/jekyll-feed#202 and confirmed via #6679

Basically with this patch,

  • Site runs as usual if the collections_dir setting is not customized.
  • When the collections_dir setting is customized, Jekyll looks for all collections within that directory only:
    • Posts and drafts at the root will no longer be read / processed — you need to move them inside the collections_dir
    • Relative path of all "documents" will be with-respect-to that directory. e.g. If the directory is named my_collections, then the relative_path for ./my_collections/_puppies/rover.md is simply _puppies/rover.md
    • Ergo, relative_path for a document does not change whether the collection is at the site-root or at the root of the collection_dir

@localheinz
Copy link
Contributor

localheinz commented Jan 8, 2018

@ashmaroli

As a note, what I tripped over initially was that I assumed that a collections directory would be special directory (similar to _data, _includes, etc), so after upgrading from 3.6.2 to 3.7.0, my initial assumption was that with a configuration of

collections_dir: _collections

and a directory layout similar to

.
├── _collections
│   ├── books
│   └── puppies
├── _data
├── _includes
├── _layouts
└── _posts

everything would be neatly arranged, and work fine.

Note:

  • collections directory prefixed with _ (so special directories are all in order)
  • directories for collections within collections directory without _ prefix

Just felt like throwing this in, as an FYI.

@DirtyF DirtyF requested a review from a team January 8, 2018 12:54
Copy link
Member

@Crunch09 Crunch09 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 Looks good, just some small things. Can we close #6679 as this supersedes that one?

Then I should get a zero exit status
And the _site directory should exist
And the "_site/puppies/rover.html" file should exist
And the "_site/post-at-root.html" file should not exist
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I understand it doesn't exist anyway but shouldn't that be _site/2009/03/27/post-at-root.html ?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes.. it should.. we ought to be checking for the expected outcomes.. 👍

And the "_site/puppies/rover-in-gathering.html" file should exist
And the "_site/2009/03/27/post-in-gathering.html" file should exist
And the "_site/2009/03/27/draft-in-gathering.html" file should exist
And the "_site/gathering/2009/03/27/draft-in-gathering.html" file should not exist
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A couple of lines below we check that the _site/gathering directory doesn't exist, so no need to check for a file in a non-existing directory i think.

Given(%r!^I have the following documents? under the "(.*)" collection within the "(.*)" directory:$!) do |label, dir, table|
table.hashes.each do |input_hash|
title = slug(input_hash["title"])
path = File.join(dir, "_#{label}", "#{title}.md")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any reason why you chose .md for documents but .markdown for the post / drafts step above?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

only because that's how our default site gets created from jekyll new .. and I think, an existing step also has something similar set up.. not sure..

# outside that directory.
def not_within_configured_directory?(dir)
collections_dir = site.config["collections_dir"]
!collections_dir.empty? && !dir.start_with?("/#{collections_dir}")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What do you think of renaming this method to within_configured_directory? and then just checking with unless or if ! ?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it can be done.. I kept it negative so that human can readily parse two negative expressions easily in comparison to one negative and the other positive..

    def within_configured_directory?(dir)
      collections_dir = site.config["collections_dir"]
      !collections_dir.empty? && dir.start_with?("/#{collections_dir}")
    end

@ashmaroli
Copy link
Member Author

@localheinz I understand your suggestion completely.
While I agree that the custom collections_dir with a leading underscore would allow greater aesthetics, I'm not in favor of removing the underscore from the directory for collections for reasons below:

  • having the leading underscore will allow greater flexibility when moving the collections en bulk into and outside the custom container (during migration and testing across older and newer versions of Jekyll)
  • if we decide to add support for categories in collections, similar to hows posts have it with e.g. movies/_posts/*.markdown, having the leading underscore in the path would allow writing simpler code and also allow users to mentally scan for their actual collection directory in a file tree

@localheinz
Copy link
Contributor

@ashmaroli

I'm totally fine with _ and without, so all good. Thanks a lot for your efforts towards fixing these issues! 👍

Copy link
Member

@DirtyF DirtyF left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for adding those tests !

Minor concerns:

  1. Even if we document this on the website or in the default _config.yml generated by jekyll new, we should warn people that if a collection_dir is defined they should move _posts and _drafts into _collection_dir. Tests only check for a zero exit code, but it should trigger a warning in those use cases. This could be tackled in a new PR.

  2. Out of curiosity, why do we add a date to the collection documents here?

📦 Happy to release a 3.7.1 patch once the team is onboard with this 😄

@ashmaroli
Copy link
Member Author

why do we add a date to the collection documents here?

Its not required at all, but existing scenarios had them, so I included them here as well.. will make it easier if future developers need to test for some other date-related parameter..

@ashmaroli ashmaroli added this to the v3.7.1 milestone Jan 11, 2018
Then I should get a zero exit status
And the _site directory should exist
And the "_site/puppies/rover.html" file should exist
And the "_site/2009/03/27/post-at-root.html" file should not exist
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Woof, this is going to screw people up. We can't enforce that the _posts directory is in the collections_dir until 4.0, if we even want to.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

its not.. we just released this feature in 3.7.0 .. The sooner we ship this patch, the lesser no.of users gonna have issues..

Then I should get a zero exit status
And the _site directory should exist
And the "_site/puppies/rover.html" file should exist
And the "_site/2009/03/27/draft-at-root.html" file should not exist
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

_drafts can't be enforced to be in collections_dir until 4.0 either.

@relative_directory ||= Pathname.new(directory).relative_path_from(
Pathname.new(site.source)
).to_s
@relative_directory ||= "_#{label}"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be relative to the site source, so if source is . then it should be collections_dir/_label.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Things that're going to break if @relative_directory ||= collections_dir/_label:

if draft?
categories_from_path("_drafts")
else
categories_from_path(collection.relative_directory)
end

def draft?
data["draft"] ||= relative_path.index(collection.relative_directory).nil? &&
collection.label == "posts"
end

def cleaned_relative_path
@cleaned_relative_path ||=
relative_path[0..-extname.length - 1].sub(collection.relative_directory, "")
end

@@ -83,7 +83,7 @@ def draft?
# Returns a String path which represents the relative path
# from the site source to this document
def relative_path
@relative_path ||= Pathutil.new(path).relative_path_from(site.source).to_s
@relative_path ||= Pathutil.new(path).relative_path_from(focal_point).to_s
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why can't this be the site.source? Are you thinking the collections_dir needs to disappear in the output? Hm. Maybe relative_path_from(collection.path)?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

collection does not have an instance method :path

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any reason it can't have that method? I don't like the focal_point method – seems like it's capturing some logic that should be handled somewhere else, like path.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any reason...?

I did not want to introduce a new public_method in a patch-release..

Copy link
Member

@DirtyF DirtyF Jan 21, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well now that we have introduce a potential breaking change in a minor version, we should be above that kind of consideration here. The more collections_dir behavior is well thought, the better.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@parkr Your call.. should I introduce a new public_method in a patch-release..?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think adding Collection#path to fix this bug is a fine solution for a patch release. SemVer makes no assumptions about methods being added, as far as I can tell. Our SemVer is guided by new features (minor bump) and breaking changes (major bump).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would call "enhancing" Collections class with a new public_method as a feature.. just like removing an existing public_method is considered a breaking change..

That said, this is not the place for debating SemVer.. your comment above is clear enough an answer for how to proceed..
Will push a commit soon..

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Small change of plans.. Instead of having Collection#path for each collection in a Site, I decided to compute the path once via Site#collections_path

Static files within a collection inside a custom collections_dir needs
to output to the same path as when the collection is at the root of the
source directory.
@ghost
Copy link

ghost commented Jan 25, 2018

@jekyllbot: merge +bug

@jekyllbot jekyllbot merged commit a6b4ce0 into jekyll:master Jan 25, 2018
jekyllbot added a commit that referenced this pull request Jan 25, 2018
@ashmaroli ashmaroli deleted the fix-collections-dir branch January 29, 2018 16:51
@jekyll jekyll locked and limited conversation to collaborators Jul 12, 2019
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

6 participants