findOne with nested refpath in large array of subdocuments causes out of memory exception

**Do you want to request a *feature* or report a *bug*?**

Bug

**What is the current behavior?**

A findOne() invocation, paired with a dynamic populate() call against a document with an array of subdocuments that contain a nested refPath, 'arrayName.typeField', causes an out of memory exception. This happens in particular when the subdocument array contains a large number of records (e.g. 2100). I don't fully understand the logic, so I will just list my findings.

---

During the Model.populate('arrayName.idField') -> getModelsMapForPopulate() -> _getModelNames() call chain, the returned modelNames array contains 2100 elements, each a string naming the referenced model. Downstream, this causes addModelNamesToMap() to build a map with a very large memory footprint: a single model entry in the map holds a total of 2.1 million records in its allIds property, which seems wrong:

![count undefined](https://user-images.githubusercontent.com/90337173/142239732-c4851d3d-7c12-41eb-bb49-eb6a779a22f0.png)

The arrays in allIds are all copies of each other, which raises the question of why they are all needed. In this scenario, the out of memory exception ultimately occurs during Model.populate('arrayName.idField') -> _done() -> _assign() -> utils.clone(mod.allIds):

![RUNNING](https://user-images.githubusercontent.com/90337173/142239804-2d057ff3-9302-4396-a5b7-280bbbdd82b7.png)

This does not seem right. If we return the modelNames array from _getModelNames() with just 3 elements, 'Model1', 'Model2', and 'Model3', populate('arrayName.idField') seems to produce the correct results without the huge memory usage. I may be missing something, but the current behavior seems like overkill.
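To illustrate the difference in scale, here is a minimal standalone sketch. It is not Mongoose's internal code: the `items` data, the placeholder ids, and the `groupIdsByModel` helper are all hypothetical and only show how one model name per array element compares with one entry per distinct model name.

```javascript
// Hypothetical sample data: 2100 subdocuments spread across three model names.
const modelCycle = ['Model1', 'Model2', 'Model3']
const items = Array.from({ length: 2100 }, (_, i) => ({
  itemId: `id-${i}`, // placeholder string standing in for an ObjectId
  type: modelCycle[i % modelCycle.length]
}))

// One model name per array element, matching the 2100-element modelNames
// array described above.
const perElementNames = items.map((item) => item.type)

// Hypothetical alternative: collect the ids once per distinct model name.
function groupIdsByModel (docs) {
  const idsByModel = new Map()
  for (const { itemId, type } of docs) {
    if (!idsByModel.has(type)) idsByModel.set(type, [])
    idsByModel.get(type).push(itemId)
  }
  return idsByModel
}

const grouped = groupIdsByModel(items)
console.log(perElementNames.length) // 2100
console.log(grouped.size) // 3
console.log(grouped.get('Model1').length) // 700
```

With grouping, the amount of bookkeeping grows with the number of distinct models rather than with the number of subdocuments, which is the behavior I would have expected.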

---

**If the current behavior is a bug, please provide the steps to reproduce.**

Pre-requisite: 1000+ elements exist in the document's 'items' array
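One possible way to generate that many subdocuments is sketched below. The `buildItems` helper and its random ids are hypothetical and only match the shape of the `items` array in the schema that follows; whether the referenced documents must actually exist for the failure to occur is not something I have verified.

```javascript
const crypto = require('crypto')

// Hypothetical helper: build `count` subdocuments shaped like the entries of
// the `items` array in the schema below.
function buildItems (count, modelNames = ['Model1', 'Model2', 'Model3']) {
  return Array.from({ length: count }, (_, i) => ({
    // 12 random bytes rendered as 24 hex characters, the string form of an ObjectId
    itemId: crypto.randomBytes(12).toString('hex'),
    // Cycle through the model names so every model is referenced
    type: modelNames[i % modelNames.length]
  }))
}

const seedItems = buildItems(2100)
console.log(seedItems.length) // 2100
```

The resulting array can be passed as the `items` field when creating the document; Mongoose casts 24-character hex strings to ObjectId on save.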

```
const mongoose = require('mongoose')
const async = require('async')

const SampleSchema = new mongoose.Schema(
  {
    name: String,
    items: [{
      itemId: {
        type: mongoose.Schema.Types.ObjectId,
        required: true,
        refPath: 'items.type'
      },
      type: {
        type: String,
        required: true,
        enum: ['Model1', 'Model2', 'Model3']
      }
    }],
  },
  {
    timestamps: {
      createdAt: 'create_date',
      updatedAt: 'update_date'
    }
  }
)

const SampleModel = mongoose.model('Sample', SampleSchema)

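// `id`, `params`, and `completedCallback` are defined elsewhere in the application
async.waterfall([
  (cb) => {
    SampleModel
      .findOne({
        _id: mongoose.Types.ObjectId(id),
        _organization: mongoose.Types.ObjectId(params.organizationId)
      })
      // populate the refPath field declared in the schema above
      .populate('items.itemId')
      .lean()
      .exec(cb)
  },
  (sample, cb) => {
    // process data here
  }
], completedCallback)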
```

**What is the expected behavior?**

The above code, executed with the pre-requisite number of subdocuments, should complete without exhausting the heap.

**What are the versions of Node.js, Mongoose and MongoDB you are using? Note that "latest" is not a version.**

Node.js: 12.22.6
Mongoose: 6.0.13
MongoDB: 5.0.2


findOne with nested refpath in large array of subdocuments causes out of memory exception #10983


vkarpov15 commented on Dec 19, 2021
