
java.lang.OutOfMemoryError during deserialization #18

Closed
psindrome opened this issue Jun 4, 2012 · 18 comments

@psindrome

The stack trace is included at the bottom. Is this a known issue?

Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
at com.fasterxml.jackson.core.util.TextBuffer._charArray(TextBuffer.java:705)
at com.fasterxml.jackson.core.util.TextBuffer.expand(TextBuffer.java:664)
at com.fasterxml.jackson.core.util.TextBuffer.append(TextBuffer.java:468)
at com.fasterxml.jackson.core.io.SegmentedStringWriter.write(SegmentedStringWriter.java:67)
at com.fasterxml.jackson.core.json.WriterBasedJsonGenerator._flushBuffer(WriterBasedJsonGenerator.java:1799)
at com.fasterxml.jackson.core.json.WriterBasedJsonGenerator._writeString(WriterBasedJsonGenerator.java:975)
at com.fasterxml.jackson.core.json.WriterBasedJsonGenerator.writeString(WriterBasedJsonGenerator.java:436)
at com.fasterxml.jackson.databind.ser.std.StringSerializer.serialize(StringSerializer.java:36)
at com.fasterxml.jackson.databind.ser.std.StringSerializer.serialize(StringSerializer.java:1)
at com.fasterxml.jackson.databind.ser.BeanPropertyWriter.serializeAsField(BeanPropertyWriter.java:464)
at com.fasterxml.jackson.databind.ser.std.BeanSerializerBase.serializeFields(BeanSerializerBase.java:504)
at com.fasterxml.jackson.databind.ser.BeanSerializer.serialize(BeanSerializer.java:117)
at com.fasterxml.jackson.databind.ser.impl.IndexedListSerializer.serializeContents(IndexedListSerializer.java:94)
at com.fasterxml.jackson.databind.ser.impl.IndexedListSerializer.serializeContents(IndexedListSerializer.java:1)
at com.fasterxml.jackson.databind.ser.std.AsArraySerializerBase.serialize(AsArraySerializerBase.java:150)
at com.fasterxml.jackson.databind.ser.BeanPropertyWriter.serializeAsField(BeanPropertyWriter.java:464)
at com.fasterxml.jackson.databind.ser.std.BeanSerializerBase.serializeFields(BeanSerializerBase.java:504)
at com.fasterxml.jackson.databind.ser.BeanSerializer.serialize(BeanSerializer.java:117)
at com.fasterxml.jackson.databind.ser.impl.IndexedListSerializer.serializeContents(IndexedListSerializer.java:94)
at com.fasterxml.jackson.databind.ser.impl.IndexedListSerializer.serializeContents(IndexedListSerializer.java:1)
at com.fasterxml.jackson.databind.ser.std.AsArraySerializerBase.serialize(AsArraySerializerBase.java:150)
at com.fasterxml.jackson.databind.ser.BeanPropertyWriter.serializeAsField(BeanPropertyWriter.java:464)
at com.fasterxml.jackson.databind.ser.std.BeanSerializerBase.serializeFields(BeanSerializerBase.java:504)
at com.fasterxml.jackson.databind.ser.BeanSerializer.serialize(BeanSerializer.java:117)
at com.fasterxml.jackson.databind.ser.impl.IndexedListSerializer.serializeContents(IndexedListSerializer.java:94)
at com.fasterxml.jackson.databind.ser.impl.IndexedListSerializer.serializeContents(IndexedListSerializer.java:1)
at com.fasterxml.jackson.databind.ser.std.AsArraySerializerBase.serialize(AsArraySerializerBase.java:150)
at com.fasterxml.jackson.databind.ser.BeanPropertyWriter.serializeAsField(BeanPropertyWriter.java:464)
at com.fasterxml.jackson.databind.ser.std.BeanSerializerBase.serializeFields(BeanSerializerBase.java:504)
at com.fasterxml.jackson.databind.ser.BeanSerializer.serialize(BeanSerializer.java:117)
at com.fasterxml.jackson.databind.ser.impl.IndexedListSerializer.serializeContents(IndexedListSerializer.java:94)
at com.fasterxml.jackson.databind.ser.impl.IndexedListSerializer.serializeContents(IndexedListSerializer.java:1)

@cowtowncoder
Member

From the stack trace, it looks like you were trying to create a Java String for the JSON, and you don't have enough heap to do that.
So you probably need to give the JVM a bigger heap, or stream the results out -- there usually isn't much need to create an intermediate String or byte[]; you can instead write the content out using an OutputStream.
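A minimal sketch of the streaming approach, using a stand-in model object and a hypothetical output file name; ObjectMapper.writeValue(OutputStream, Object) is the relevant databind call:

import java.io.FileOutputStream;
import java.io.OutputStream;
import java.util.Arrays;

import com.fasterxml.jackson.databind.ObjectMapper;

public class StreamingWriteSketch {
    public static void main(String[] args) throws Exception {
        ObjectMapper mapper = new ObjectMapper();
        Object model = Arrays.asList("stand-in", "for", "the", "real", "object", "graph");

        // Wasteful for large output: buffers the whole JSON document as one in-memory String
        // String json = mapper.writeValueAsString(model);

        // Streaming: content is flushed to the destination as it is generated,
        // so heap usage stays roughly flat regardless of document size
        try (OutputStream out = new FileOutputStream("model.json")) {
            mapper.writeValue(out, model);
        }
    }
}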

@psindrome
Author

The last heap size I tried was about 1200 MB. The model was deserialized from XML, and the error occurs while trying to serialize it to JSON; the target application is a conversion utility. Needing 1200 MB of heap space to convert a model obtained from a 4 MB XML file sounds a bit unsound. Do let me know your thoughts; I have not done further investigation.

I will try the option of writing to an OutputStream, if that is sufficient for our current use. I'd appreciate any further information if available.

@psindrome
Author

Note: the XStream XML parser does the serialization of the same model very efficiently.

@psindrome
Author

There is surely something of interesting proportions here: I got a 2.3 GB file after writing the same model to the output stream.

@cowtowncoder
Member

The fundamental problem is that you are trying to create one huge String here. Why are you doing that? The usual way is to serialize JSON to a file, a network connection, or whatever the destination is. Using an intermediate String is very wasteful and takes up lots of memory, as you have found.

As to size: perhaps you have shared dependencies (an acyclic graph). If so, by default all referenced values are written out in full.
XStream can optionally indicate shared references using explicit reference semantics.

@psindrome
Author

Firstly, the second set of comments does not involve creating a String. The 2.3 GB file resulted from writing the object to the stream; it is not a heap memory issue at that point.

Secondly, from much of the discussion surrounding this API, I had assumed full compatibility: anything XStream produces to represent a model should also be supported by the Jackson API.

Correct me if I am wrong about either of the above.

@cowtowncoder
Member

Ok, with respect to size and the initial code: understood. So the size difference is present in the resulting JSON, and the in-memory building was just exhibiting this odd ballooning of data.

As to the second point: I am not sure where the idea of compatibility between XStream and Jackson came from -- as far as I know, XStream produces a strange kind of JSON, mapped from its XML representation (XStream is an XML library with some JSON conversion provided by Jettison). So, no, the structures are not necessarily going to be similar. Both will be valid JSON, of course.

In fact, the model that XStream uses is pretty different, as it starts with an XML model. Only in later stages does this get serialized as JSON, which results in some additional "non-JSON" things getting added (due to things like the XML distinction between elements and attributes, as well as the lack of an array concept in XML).

From the above I am guessing that the data being serialized does include shared objects (or perhaps even cycles?). If so, that could explain the difference, provided XStream is configured to handle cyclic data structures: it can do this by using ids.
Jackson does not do this by default, although with Jackson 2.0 it is possible to handle object identity as well.
But this does require using @JsonIdentityInfo annotations.
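A minimal sketch of what that looks like, using a hypothetical Node class; @JsonIdentityInfo and ObjectIdGenerators are the standard Jackson 2.0+ annotation types:

import java.util.ArrayList;
import java.util.List;

import com.fasterxml.jackson.annotation.JsonIdentityInfo;
import com.fasterxml.jackson.annotation.ObjectIdGenerators;
import com.fasterxml.jackson.databind.ObjectMapper;

public class IdentityInfoSketch {

    // Each instance gets a generated "@id" the first time it is written;
    // later references serialize as just that id instead of the full object
    @JsonIdentityInfo(generator = ObjectIdGenerators.IntSequenceGenerator.class, property = "@id")
    static class Node {
        public String name;
        public List<Node> links = new ArrayList<>();
    }

    public static void main(String[] args) throws Exception {
        Node a = new Node();
        a.name = "A";
        Node b = new Node();
        b.name = "B";
        a.links.add(b);
        b.links.add(a); // cycle: without identity handling this recurses until it fails

        System.out.println(new ObjectMapper().writeValueAsString(a));
        // prints: {"@id":1,"name":"A","links":[{"@id":2,"name":"B","links":[1]}]}
    }
}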

@psindrome
Author

Hi,

The comments from your last response were helpful in resolving our immediate issues. I will post the details of the resolution in the context of the API and close the issue at the earliest. Thanks for your patience.

Regards,

@cowtowncoder
Member

Ok, good -- glad you managed to get things working, and I would definitely be interested in learning what was done.
That is usually very useful for other users as well, when they encounter similar issues.

@psindrome
Author

I will leave the details after a quick chat with my team; it is the least I can do for the help offered.

@cowtowncoder
Member

I'll close this down to clean up the list before the release; still interested in the details if those can be shared.

@JonasJurczok

@cowtowncoder
Copy link
Member

cowtowncoder commented Apr 23, 2020

lol. That is... surprisingly close hit. :)

(I lived in Denver when I chose my nick for OSS work -- the alternative to "cowtown coding" was "conman consulting", so I could have become "conmanconsult" had the coin flip gone the other way.)

FWIW, such a stack trace might result from someone trying to parse a gigabyte-sized file with an encoding error, where a JSON String value does not end where it should.
Or from someone who simply created a malicious payload.

And there is a somewhat relevant newer issue about possibly adding protection settings: #611
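For context, a sketch of the kind of processing limit that later materialized; this assumes jackson-core 2.15 or newer, where StreamReadConstraints exists, and is not available in the versions discussed above:

import com.fasterxml.jackson.core.JsonFactory;
import com.fasterxml.jackson.core.StreamReadConstraints;
import com.fasterxml.jackson.databind.ObjectMapper;

public class ReadLimitsSketch {
    public static void main(String[] args) throws Exception {
        // Cap the maximum length of a single JSON String value so that a broken
        // or malicious document fails fast instead of exhausting the heap
        JsonFactory factory = JsonFactory.builder()
                .streamReadConstraints(StreamReadConstraints.builder()
                        .maxStringLength(20_000_000)
                        .build())
                .build();
        ObjectMapper mapper = new ObjectMapper(factory);

        // Oversized string values now fail with a constraints exception
        System.out.println(mapper.readTree("{\"ok\":\"small value\"}"));
    }
}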

@JonasJurczok

Man, you passed on conmanconsult? I have to steal that name for the future :D

The problem I have is that the object tree is extremely strangely formed. It is basically a tree, but every object has a reference to a list of all objects that share some dimension with it, and this holds for all objects across multiple dimensions.

So for every object, Jackson finds multiple lists containing all objects of the same dimension.
This is a type of infinite recursion that it simply cannot cope with.

The source file is 50 KB; the serializer crashed at 70 GB of output.

I just skip the whole deserialisation stuff and solve the problem differently.

Thanks for the hint, though.

@cowtowncoder
Member

@JonasJurczok I am not sure I understand the use case at all, but I am also curious as to how the size could explode that radically -- at least with out-of-the-box functionality. Perhaps some interesting custom (de)serializers in the mix?

@JonasJurczok

JonasJurczok commented May 2, 2020

Hey, I totally forgot to respond here apparently..

The problem turned out to be the object structure.

To illustrate picture two objects

Obj A
- Name = A
- attributelist = L-1

Obj B
- Name = B
- attributelist = L-1

Obj L-1
- Name = Attributes
- objects = [A, B]

So for every serialisation of A, Jackson would also serialize B, and for every B it would serialize A.
But I have ~30 objects, each containing 3-5 lists of this kind, with 15-20 objects in each list.
And that permutates to hell and back, apparently.
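To make that shape concrete, a rough sketch with hypothetical class names; by default Jackson re-serializes a shared object at every reference, and a true back-reference cycle (A -> list -> A) fails with an infinite-recursion error, so the back-pointer has to be cut (@JsonIgnore here) or handled via @JsonIdentityInfo as sketched earlier:

import java.util.ArrayList;
import java.util.List;

import com.fasterxml.jackson.annotation.JsonIgnore;
import com.fasterxml.jackson.databind.ObjectMapper;

public class SharedListSketch {

    static class Item {
        public String name;
        // Every item points back at the list holding all items of the same
        // "dimension"; ignoring it is one blunt way to stop the recursion
        @JsonIgnore
        public AttributeList attributes;
    }

    static class AttributeList {
        public String name;
        public List<Item> objects = new ArrayList<>();
    }

    public static void main(String[] args) throws Exception {
        Item a = new Item();
        a.name = "A";
        Item b = new Item();
        b.name = "B";
        AttributeList l1 = new AttributeList();
        l1.name = "Attributes";
        l1.objects.add(a);
        l1.objects.add(b);
        a.attributes = l1;
        b.attributes = l1;

        // Without @JsonIgnore (or identity handling), serializing l1 would recurse
        // l1 -> A -> l1 -> ... and either fail or balloon in output size
        System.out.println(new ObjectMapper().writeValueAsString(l1));
        // prints: {"name":"Attributes","objects":[{"name":"A"},{"name":"B"}]}
    }
}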

I now solved it by not serializing it at all but instead using the debugger and some code to inspect the object. It's not perfect but better than nothing :)

@cowtowncoder
Member

@JonasJurczok ah. Ok, that makes sense. For general-purpose debugging/troubleshooting, use of @JsonIdentityInfo (which can solve cyclic dependency problems by fully serializing each instance only once) probably would not work.

@JonasJurczok

Yeah, it doesn’t :D
