Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Introduce build_lazy/2 #406

Closed
wants to merge 2 commits into from
Closed

Introduce build_lazy/2 #406

wants to merge 2 commits into from

Conversation

germsvel
Copy link
Collaborator

@germsvel germsvel commented Dec 8, 2020

Closes #402, #385, and #373

The problem

Consider this scenario. People want a different email per account (especially if the email factory has a sequence), but they do this:

build_pair(:account, email: build(:email))

The problem is that build/2 is just function calling. So the above is equivalent to this:

email = build(:email)
build_pair(:account, email: email)

In other words, we get one email factory for all of the accounts.

The problem is worse when using it with Ecto. We can image the following scenario:

insert_pair(:account, user: build(:user))

If the user factory has a uniqueness constraint, insert_pair/2 will raise an error because we'll try to insert a user with the same value (even if using a sequence).

Solution

The solution is to delay evaluation of the factory. We do this by introducing a private struct %ExMachina.InstanceTemplate{} to hold the data necessary to build the instance of that factory.

Note that this struct is private and liable to change, so users of the library shouldn't depend on it.

The trick then lies in the build/2 function. We make it a terminal function in that it will evaluate any lazy factories recursively. To do that, we update the build/2 function to evaluate ExMachina.InstanceTemplate structs after building the build/2 factory.

In an early implementation of this feature, I evaluated the lazy factories when we first got the attributes in build/2 (i.e. attrs |> evaluate_lazy_factories() |> Enum.into(%{})). But then I opted for evaluating the lazy factories after the factory is created because that allows us to use build_lazy/2 from within factory definitions, and not just when combined with build/2 or build_* functions.

lib/ex_machina.ex Outdated Show resolved Hide resolved
add :links, {:array, :map}, default: []
add(:article_id, :integer)
add(:author, :map)
add(:links, {:array, :map}, default: [])
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Autoformatter added all the parentheses.

add(:pub_number, :string)
end

create(unique_index(:publishers, [:pub_number]))
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added this uniqueness constraint to properly test the scenario in ecto where insert_pair(:account, publisher: build(:publisher)) would raise a constraint error.

We used to have a struct factory defined as `%{__struct__: FooBar}` but
it was confusing since the factory name was `struct_factory`.

So we change to be `foo_bar_factory` and create a module with a struct
called `FooBar`. This has the benefit of allowing us to assert
`%FooBar{}` in tests -- something we couldn't do with the previous
format.
The problem
------------

People want a different email per account (especially if the email
factory has a sequence), but they do this:

```elixir
build_pair(:account, email: build(:email))
```

The problem is that `build/2` is just function calling. So the above is
equivalent to this:

```elixir
email = build(:email)
build_pair(:account, email: email)
```

In other words, we get on email factory for all of the accounts.

The problem is made worse when using it with Ecto. We can imagine the
following scenario:

```elixir
insert_pair(:account, user: build(:user))
```

If the user factory has a uniqueness constraint, `insert_pair/2` will
raise an error because we'll try to insert a user with the same value
(even if using a sequence).

Solution
--------

The solution is to delay evaluation of the factory. We do this by
introducing a private struct `%ExMachina.InstanceTemplate{}` to hold the
data necessary to build the instance of that factory.

Note that this struct is private and liable to change, so users of the
library shouldn't depend on it.

The trick then lies in the `build/2` function. We make it a terminal
function in that it will evaluate any lazy factories recursively. To do
that, we update the `build/2` function to evaluate
`ExMachina.InstanceTemplate` structs after building the `build/2`
factory.

In an early implementation of this feature, I evaluated the lazy
factories when we first got the attributes in `build/2` (i.e. `attrs |>
evaluate_lazy_factories() |> Enum.into(%{})`). But then I opted for
evaluating the lazy factories after the factory is created because that
allows us to use `build_lazy/2` from within factory definitions, and not
just when combined with `build/2` or `build_*` functions.

How it interacts with "full-control" factories
---------------------------------------------

We evaluate lazy factories before passing the attrs to the factories
that have full control. That means that we can still use `build_lazy`
within `build/2` and `build_*` functions that have a factory with full
control, but we cannot use `build_lazy/2` as part of the definition of a
factory that has full control.

In other words, we can do this:

```elixir
def account_factory(attrs) do
  %{
    user: build(:user),
    ...
  }
end

build(:account, build_lazy(:user))
```

But we cannot do this:

```elixir
def account_factory(attrs) do
  %{
    user: build_lazy(:user),
    ...
  }
end
```

This seems like the best compromise we can make so that people can
continue using factories with full control, but without the ability to
define lazy factories in the definition -- since ExMachina does nothing
after the factory is defined.

Notes on the name `InstanceTemplate`
-----------------------------------

I call it `InstanceTemplate` because it's not exactly an instance of a
factory. I only has the recipe to build a factory in that it has the
factory `name` and the extra attributes. But, evaluating the instance
template twice when the underlying factory has a sequence will not
result in identical factories.

In other words, given the factory:

```elixir
def user_factory do
  %{email: sequence("email")}
end
```

Then evaluating the same build lazy template will not give use the same
sequence:

```elixir
template = build_lazy(:user)

InstanceTemplate.evaluate(template) != InstanceTemplate.evaluate(template)
```

Potential future work
---------------------

This work opens the possibility of passing the built factory as an
argument to `build_lazy` in a factory definition, but we'd have to
modify how it works. I leave that for future work, if indeed that's
something people want.

Here's an example of how it could work:

```elixir
def account_factory do
  %{
    private: true,
    profile: build_lazy(fn account ->
      build(:profile, private: account.private
    end)
  }
end
```
Copy link

@WilHall WilHall left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This implementation makes sense to me, though I am curious what the benefits and drawbacks of this approach are over allowing factory attributes to accept functions? This would remove the need for a separate build_lazy method which would then become:

insert_pair(:account, user: fn -> build(:user) end)

This would be a bit more general purpose, allowing other types of delayed execution when resolving attributes, and would potentially allow for passing context into the anonymous function if we ever needed to do that.

@germsvel
Copy link
Collaborator Author

@WilHall that's an excellent question. I wrote a little bit about it in #402, but I should've mentioned it here.

The anonymous function seems simpler to me, and I like that people could even use anonymous functions to delay execution instead of using build_lazy/2. But I don't know if that would be a breaking change. If people have been using anonymous functions in their factories as an attribute like this:

def user_factory do
  %{
    name: "Some Name",
    creation_callback: fn -> User.user_created!() end}
  }
end

We would all of the sudden evaluate those anonymous functions when building a factory because the anonymous function is now the mechanism by which we delay execution. It's not something I have done in apps, and maybe others don't use ExMachina like that, but I also don't know all the cases for how ExMachina is used in the wild.

What do you think Wil? Is it safe enough to use the anonymous function approach? I really do like it.

@germsvel
Copy link
Collaborator Author

@WilHall I've given more thought to what you said, and it made me realize that perhaps we really can introduce this feature without adding a new function like build_lazy/2. We can just take anonymous functions.

What I really like about the new approach is that we can also include passing the parent factory into the anonymous function so that we can get something like "derived" attributes.

In any case, I'm opening a different PR since it's fairly different.

Closing this in favor of #408

@germsvel germsvel closed this Dec 23, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Proposal: Introduce build_lazy/2
2 participants