
Memory held by Logrus's internal buffer pool stays high after transient traffic spikes #1124

Closed
edoger opened this issue Mar 31, 2020 · 18 comments · Fixed by #1145

Contributor

edoger commented Mar 31, 2020

In our production environment, when a transient high-traffic spike occurred, we found that the memory occupied by Logrus's internal buffer pool could not be released quickly.

// entry.go 246 - 255
...
	entry.fireHooks()

	buffer = bufferPool.Get().(*bytes.Buffer)
	buffer.Reset()
	defer bufferPool.Put(buffer)
	entry.Buffer = buffer

	entry.write()

	entry.Buffer = nil
...

Objects are not cleaned up before being put back into the pool, so pooled buffers keep referencing their full contents.

@dgsb dgsb pinned this issue May 2, 2020
Collaborator

dgsb commented May 19, 2020

Hello @edoger, that's actually the point of using a sync.Pool here: to relieve pressure on the garbage collector. I'm not sure how efficient this common implementation is, though.

Contributor Author

edoger commented May 20, 2020

@dgsb We tested this in practice: resetting the buffer before putting it back in the pool effectively alleviates this particular problem.

// entry.go
...
	entry.fireHooks()

	buffer = bufferPool.Get().(*bytes.Buffer)
	defer func() {
		buffer.Reset()
		bufferPool.Put(buffer)
	}()
	entry.Buffer = buffer

	entry.write()

	entry.Buffer = nil
...

Collaborator

dgsb commented May 20, 2020

I'm not sure I understand how that would solve the issue. As far as I understand, the Reset method does not release the underlying storage memory.

Contributor Author

edoger commented May 26, 2020

@dgsb After resetting the buffer, the GC can reclaim the occupied memory quickly. If you don't, the memory stays occupied until the buffer is taken out of the pool again. After a large traffic peak, if no subsequent requests arrive (for example, because the service is degraded), the objects in the pool go unused for a long time, and the log content they hold stays in memory the whole time.

Collaborator

dgsb commented May 26, 2020

We can try that, but from what I understand of the bytes.Buffer documentation, I'm not sure the associated memory can be reclaimed by the GC.

go doc bytes.Buffer.Reset
package bytes // import "bytes"

func (b *Buffer) Reset()
    Reset resets the buffer to be empty, but it retains the underlying storage
    for use by future writes. Reset is the same as Truncate(0).

Contributor Author

edoger commented May 26, 2020

@dgsb This code shows that the memory was released by the GC.

package main

import (
	"bytes"
	"fmt"
	"runtime"
)

func main() {
	buf := new(bytes.Buffer)
	max := 1024 * 1024 * 32
	mem := new(runtime.MemStats)
	runtime.ReadMemStats(mem)
	fmt.Println(mem.Alloc)
	for i := 0; i < max; i++ {
		buf.WriteByte('1')
	}

	runtime.ReadMemStats(mem)
	fmt.Println(mem.Alloc)

	buf.Reset()
	runtime.GC()
	runtime.ReadMemStats(mem)
	fmt.Println(mem.Alloc)
}

Output:

123592
51239344
122544

Contributor Author

edoger commented May 26, 2020

This code is similar to the scenario we encountered: the memory was not released quickly, and our memory alert kept firing until the service was restarted. @dgsb

package main

import (
	"bytes"
	"fmt"
	"runtime"
	"sync"
	"time"
)

var pool = &sync.Pool{
	New: func() interface{} {
		return new(bytes.Buffer)
	},
}

func main() {
	max := 1024 * 1024
	mem := new(runtime.MemStats)
	runtime.ReadMemStats(mem)
	fmt.Println(mem.HeapAlloc, mem.Alloc)
	for i := 0; i < 50; i++ {
		buf := new(bytes.Buffer)
		for j := 0; j < max; j++ {
			buf.WriteByte('1')
		}
		pool.Put(buf)
	}
	runtime.ReadMemStats(mem)
	fmt.Println(mem.HeapAlloc, mem.Alloc)
	for {
		runtime.ReadMemStats(mem)
		fmt.Println(mem.HeapAlloc, mem.Alloc)
		time.Sleep(time.Second * 2)
	}
}

Collaborator

dgsb commented May 26, 2020

With your first example, I get the same output on memory usage with or without the Reset() call.
I guess the full object was reclaimed by the GC because it's no longer referenced after the runtime.ReadMemStats call; the Reset may have nothing to do with it.

Contributor Author

edoger commented May 26, 2020

with your first example, I have the same output on memory usage with or without the Reset() call.

Yes, this program is too simple; the compiler or GC may be smart enough to see through it. I added a second program, which is closer to the scenario we actually encountered.

Collaborator

dgsb commented May 26, 2020

Yes, I've tried this one too; adding a Reset or not does not seem to have any impact.

Collaborator

dgsb commented May 26, 2020

I didn't check the implementation of the bytes.Buffer type, but the documentation is quite explicit about keeping the underlying storage for future writes.

We can add the Reset call in the defer function, but I guess that won't solve your problem.

I think the buffer pool is too generic for everybody's usage, and we should provide an interface that allows plugging in a specific free item list management system.

Collaborator

dgsb commented May 27, 2020

@edoger what do you think about providing a way for the library consumer to implement its own free item list?

Would that meet your needs if you have strict memory constraints?

Contributor Author

edoger commented May 27, 2020

@dgsb That may require defining a resettable interface and a buffer pool that can bound the number of objects it holds.

Collaborator

dgsb commented May 27, 2020

I don't think we need to replace the bytes.Buffer object itself, but we could definitely make the bufferPool exchangeable with something else that provides a Put and Get interface.

Depending on your needs, you may even implement something that does not keep track of a free item pool/list at all. You could:

  • just allocate and unreference things
  • or implement a pool with a max number of objects or a max number of buffer bytes
  • whatever you need

But having a single sync.Pool for every use case does not seem flexible enough to me, because it's a tradeoff between memory consumption and CPU usage by the GC, and not everybody needs the cursor at the same place.

Contributor Author

edoger commented May 27, 2020

I didn't check the implementation of the bytes.Buffer type, but the documentation is quite explicit about keeping the underlying storage for future writes.

We can add the Reset call in the defer function, but I guess that won't solve your problem.

I think the buffer pool is too generic for everybody's usage, and we should provide an interface that allows plugging in a specific free item list management system.

The patch we use in our production environment creates a new buffer object every time and discards it directly after use. Compared with the GC overhead, the memory problem deserves more attention.

From what I've read, it is important to do the necessary cleanup before a temporary object is put back into the pool (the buffer's Reset method re-slices the buffer, but the backing array it references is unchanged; this may need further confirmation). Also, since Go 1.12, the lifetime of objects in the pool has been extended.
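
The allocate-and-discard approach described above can be sketched like this (an illustrative toy, not the actual production patch; writeEntry is a made-up stand-in for the real write path):

```go
package main

import (
	"bytes"
	"fmt"
)

// writeEntry formats one log line into a freshly allocated buffer
// and returns the result. The buffer is never pooled, so it becomes
// garbage as soon as this function returns and the GC can reclaim it
// on the next cycle.
func writeEntry(msg string) string {
	buf := new(bytes.Buffer) // fresh allocation every call
	buf.WriteString("level=info msg=")
	buf.WriteString(msg)
	return buf.String()
}

func main() {
	fmt.Println(writeEntry("hello")) // level=info msg=hello
}
```

This trades a little extra allocator and GC work per log entry for the guarantee that no large buffer is retained after a traffic spike.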

Contributor Author

edoger commented May 27, 2020

I don't think we need to replace the bytes.Buffer object itself, but we could definitely make the bufferPool exchangeable with something else that provides a Put and Get interface.

Depending on your needs, you may even implement something that does not keep track of a free item pool/list at all. You could:

  • just allocate and unreference things
  • or implement a pool with a max number of objects or a max number of buffer bytes
  • whatever you need

But having a single sync.Pool for every use case does not seem flexible enough to me, because it's a tradeoff between memory consumption and CPU usage by the GC, and not everybody needs the cursor at the same place.

Can we provide a setting option to control whether to use the buffer pool?

Collaborator

dgsb commented May 27, 2020

Can we provide a setting option to control whether to use the buffer pool?

Indeed. The current buffer pool implementation, with the Reset moved before the Put, would remain the default configuration for the package.

Collaborator

dgsb commented May 28, 2020

@edoger please check #1145; it should let you configure/tune the buffer pool usage.
I've also added a buffer.Reset call before putting the buffer back in the pool when using the default pool, but I don't expect that to solve your issues.

@dgsb dgsb unpinned this issue May 28, 2020