add `group_by` method to `AgentSet` for attribute-based grouping #2092

ai-naymul · 2024-03-28T09:10:36Z

Discussion(#2074)

In this pr I have added a group_by method in the AgentSet class, allowing agents to grouped based on some specific attributes.

Signed-off-by: Naymul Islam <naymul504@gmail.com>

for more information, see https://pre-commit.ci

github-actions · 2024-03-28T09:17:00Z

Performance benchmarks:

Model	Size	Init time [95% CI]	Run time [95% CI]
Schelling	small	🔵 -0.0% [-0.3%, +0.3%]	🔵 -1.0% [-1.3%, -0.8%]
Schelling	large	🔵 +0.2% [-0.7%, +1.1%]	🔵 -3.9% [-6.6%, -1.5%]
WolfSheep	small	🔵 -0.4% [-1.3%, +0.5%]	🔵 -0.5% [-0.7%, -0.3%]
WolfSheep	large	🔵 +0.6% [+0.2%, +0.9%]	🔵 +0.9% [-2.0%, +3.6%]
BoidFlockers	small	🔵 +1.8% [+1.1%, +2.5%]	🔵 +1.0% [+0.3%, +1.6%]
BoidFlockers	large	🔵 -3.2% [-6.6%, -0.0%]	🔵 -3.8% [-5.8%, -1.6%]

EwoutH · 2024-03-28T10:01:49Z

Thanks for this PR! It looks like it could fill a nice use case, especially for attributes with discrete values.

On thing I'm thinking about is if we can do something smart to select ranges or thresholds for attributes with continuous values. For example divide agents in to bins based on some attribute.

Curious what the performance of this is compared to (multiple) select() cases.

quaquel · 2024-03-28T10:08:08Z

I also had a quick look. I like the direction. Some quick points:

how to handle non-discrete attributes? The current code is not explicit on this. The obvious direction is to have some kind of binning. Alternatively, there could be some check of the type of the value of the attribute. If this is, for example, a float, an exception or warning might be raised. Either way, this issue need to be handled explicitly.
I would like to see unit tests for this code to ensure we maintain proper coverage.

ai-naymul · 2024-03-28T11:57:48Z

Thanks for this PR! It looks like it could fill a nice use case, especially for attributes with discrete values.

On thing I'm thinking about is if we can do something smart to select ranges or thresholds for attributes with continuous values. For example divide agents in to bins based on some attribute.

Curious what the performance of this is compared to (multiple) select() cases.

Hi,
Thank you for the feedback!

Integrating range or threshold-based grouping for continuous attributes is a great idea. We could potentially extend group_by to accept a binning function or range specifications to categorize these attributes. Regarding performance, initial tests suggest group_by is more efficient for discrete attributes compared to multiple select() calls, especially as the number of attributes increases. I'll explore the binning functionality and conduct further performance comparisons.

ai-naymul · 2024-03-28T12:00:14Z

I also had a quick look. I like the direction. Some quick points:

how to handle non-discrete attributes? The current code is not explicit on this. The obvious direction is to have some kind of binning. Alternatively, there could be some check of the type of the value of the attribute. If this is, for example, a float, an exception or warning might be raised. Either way, this issue need to be handled explicitly.

I would like to see unit tests for this code to ensure we maintain proper coverage.

Addressing non-discrete attributes through binning or type checks is a valid concern. I propose adding an optional binning_function parameter to the method to handle continuous attributes. This function can categorize attributes into discrete bins. If not provided for continuous attributes, we'll issue a warning. I'll also add the necessary unit tests.

Signed-off-by: Naymul Islam <naymul504@gmail.com>

…mul/mesa into feature-agentset-group-by

for more information, see https://pre-commit.ci

rht · 2024-03-29T01:53:19Z

I tried pandas' groupby on float values:

In [4]: data = {"a": np.random.random(4), "b": np.random.random(4)}

In [5]: df = pd.DataFrame(data=data)

In [6]: df
Out[6]:
          a         b
0  0.437550  0.389805
1  0.180405  0.032264
2  0.509254  0.147376
3  0.181636  0.102611
In [9]: for k, v in df.groupby("b"):
   ...:     print(df.groupby("b").get_group(k))
   ...:
          a         b
1  0.180405  0.032264
          a         b
3  0.181636  0.102611
          a         b
2  0.509254  0.147376
         a         b
0  0.43755  0.389805

i.e. they seem to just let you do it even if the group has 1 element each.

quaquel · 2024-04-07T08:38:46Z

I tried pandas' groupby on float values:

I like to get other people's perspective on this as well. Personally, I am in favor of mimicking pandas as much as possible, so just group on the float as normal. This keeps the code nice and simple. If you want to do binning, you can already easily code this up through AgentSet.do and a callable. So

my_agents.do(some_binning_function).group_by("my_newly assigned attribute")

quaquel · 2024-04-07T08:40:11Z

mesa/agent.py

+        Returns:
+            dict: A dictionary where keys are attribute values and values are lists of agents with those attribute values.
+        """
+        grouped_agents = defaultdict(list)


Should this not be a defaultdict(AgentSet)? Similar to the binning discussion, this would mimic pandas in that groupby returns a group identifier and an AgentSet with the agents in that group.

You raise a valid point regarding the return type of the group_by method. Changing the return type to defaultdict(AgentSet) would indeed align more closely with pandas' groupby functionality.

grouped_agents = defaultdict(AgentSet)

when adding agents to the groups:

for agent in self: attr_value = getattr(agent, attr_name, None) if attr_value not in grouped_agents: grouped_agents[attr_value] = AgentSet(model=self.model) grouped_agents[attr_value].add(agent)

Let me know whats your thought on these changes. Looking forward :)

quaquel · 2024-04-07T08:41:29Z

Thanks for this! Just one thought on the binning conversation and a question regarding the appropriate return type. The code itself, and the tests look fine.

ai-naymul · 2024-04-09T06:10:20Z

Thanks for this! Just one thought on the binning conversation and a question regarding the appropriate return type. The code itself, and the tests look fine.

Hi sorry for the late reply.

I think we don't need to change the It group_by method to handle the binning specifically, it will handle continuous attributes directly without enforcing binning.

Here is example usage for binning and grouping :

def binning_function(agent):
agent.binned_attribute = agent.continuous_attribute // 10 * 10

my_agents.do(binning_function)

grouped_agents = my_agents.group_by("binned_attribute")

EwoutH · 2024-04-17T12:39:52Z

@ai-naymul is this PR ready for (final) review? If so, can you mark it as such (by pressing the "Ready to review") button?

Please also update the PR message with a bit of context (including which problem this solves and how it's implemented on a high level).

ai-naymul · 2024-04-19T06:34:20Z

@ai-naymul is this PR ready for (final) review? If so, can you mark it as such (by pressing the "Ready to review") button?

Please also update the PR message with a bit of context (including which problem this solves and how it's implemented on a high level).

Hi sorry for the inconvenience, I was having some health issues and wasn't able to work on this properly as well.

I think need some improvement like need the feedback regarding the return type of the group_by method from @quaquel .

I think it's final for review although. I will add some description about this pr as well.

ai-naymul and others added 2 commits March 28, 2024 15:04

add group_by method to AgentSet for attribute-based grouping

7243dc9

Signed-off-by: Naymul Islam <naymul504@gmail.com>

[pre-commit.ci] auto fixes from pre-commit.com hooks

7478489

for more information, see https://pre-commit.ci

EwoutH requested review from quaquel, rht and Corvince March 28, 2024 10:02

EwoutH added the enhancement Release notes label label Mar 28, 2024

ai-naymul and others added 3 commits March 28, 2024 18:06

add test for agentset-group-by

322bcda

Signed-off-by: Naymul Islam <naymul504@gmail.com>

Merge branch 'feature-agentset-group-by' of https://github.com/ai-nay…

8115a42

…mul/mesa into feature-agentset-group-by

[pre-commit.ci] auto fixes from pre-commit.com hooks

d5fc9b9

for more information, see https://pre-commit.ci

quaquel reviewed Apr 7, 2024

View reviewed changes

ai-naymul marked this pull request as ready for review April 20, 2024 14:55

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

add `group_by` method to `AgentSet` for attribute-based grouping #2092

add `group_by` method to `AgentSet` for attribute-based grouping #2092

ai-naymul commented Mar 28, 2024 •

edited

github-actions bot commented Mar 28, 2024

EwoutH commented Mar 28, 2024

quaquel commented Mar 28, 2024

ai-naymul commented Mar 28, 2024

ai-naymul commented Mar 28, 2024

rht commented Mar 29, 2024

quaquel commented Apr 7, 2024

quaquel Apr 7, 2024

ai-naymul Apr 19, 2024 •

edited

quaquel commented Apr 7, 2024

ai-naymul commented Apr 9, 2024

EwoutH commented Apr 17, 2024 •

edited

ai-naymul commented Apr 19, 2024

add group_by method to AgentSet for attribute-based grouping #2092

Are you sure you want to change the base?

add group_by method to AgentSet for attribute-based grouping #2092

Conversation

ai-naymul commented Mar 28, 2024 • edited

In this pr I have added a group_by method in the AgentSet class, allowing agents to grouped based on some specific attributes.

github-actions bot commented Mar 28, 2024

EwoutH commented Mar 28, 2024

quaquel commented Mar 28, 2024

ai-naymul commented Mar 28, 2024

ai-naymul commented Mar 28, 2024

rht commented Mar 29, 2024

quaquel commented Apr 7, 2024

quaquel Apr 7, 2024

Choose a reason for hiding this comment

ai-naymul Apr 19, 2024 • edited

Choose a reason for hiding this comment

quaquel commented Apr 7, 2024

ai-naymul commented Apr 9, 2024

EwoutH commented Apr 17, 2024 • edited

ai-naymul commented Apr 19, 2024

add `group_by` method to `AgentSet` for attribute-based grouping #2092

add `group_by` method to `AgentSet` for attribute-based grouping #2092

ai-naymul commented Mar 28, 2024 •

edited

ai-naymul Apr 19, 2024 •

edited

EwoutH commented Apr 17, 2024 •

edited