I need some guide: w/r at maximum speed with Tickstorage #911

Open
vargaspazdaniel opened this issue Jul 26, 2021 · 4 comments

@vargaspazdaniel

Arctic Version

'1.80.0'

Arctic Store

TickStore

Platform and version

Windows 10 x64, Intel I7-6700, 32GB RAM.

Description of problem and/or code sample that reproduces the issue

This is not really an issue with Arctic, but a problem on my side. I have a Python script writing tick data for different assets in real time, and writing every tick works fine... The problem is that I'm writing EVERY tick I receive in the way below, and my read speed is VERY LOW:

from arctic import Arctic
from arctic import TICK_STORE

# Connect to the local MongoDB instance and make sure the tick library exists
store = Arctic("localhost")
store.initialize_library("Darwinex_Tick_Data", lib_type=TICK_STORE)

lib_tick_dwx = store["Darwinex_Tick_Data"]

# tick_dataframe holds a single tick (one row) indexed by timestamp
lib_tick_dwx.write("EURUSD", tick_dataframe)

Notice that tick_dataframe is a DataFrame with a single row (the tick), indexed by timestamp and written to MongoDB as one document. Writing the data this way gives me no problems, but after reading some closed threads here I see that the efficient approach is to store at least 100k rows (ticks) in just ONE document.

Any advice on how to do this? Should I keep accumulating ticks in a DataFrame and, once len > 100k, write them as one document, then collect the next 100k and write those, and so on? I'm still a very new user...

How can I merge all the single ticks into one document for faster read operations? Maybe that could be a solution for me.

Any other recommendations? Thanks in advance for reading, and also for this awesome library.
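
A sketch of what I mean, just to make the question concrete (the on_tick callback is hypothetical, lib_tick_dwx is the library from the snippet above, and 100k is simply TickStore's default bucket size):

import pandas as pd

BATCH_SIZE = 100_000   # ticks per write, matching TickStore's default bucket size
buffer = []            # ticks collected since the last write (lost if the process dies)

def on_tick(timestamp, bid, ask):
    # hypothetical callback invoked by the market-data feed for each tick
    buffer.append({"index": timestamp, "bid": bid, "ask": ask})
    if len(buffer) >= BATCH_SIZE:
        # one DataFrame, one write -> TickStore packs the rows into a few documents
        df = pd.DataFrame(buffer).set_index("index")   # index should be timezone-aware
        lib_tick_dwx.write("EURUSD", df)
        buffer.clear()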

@vargaspazdaniel
Author

I already took my tick database, loaded it entirely into a DataFrame and then wrote it back to Mongo (now I have 100k rows per document, the default, instead of 1 tick per document as before). Speed seems to have improved. I'll measure the differences and post them here.
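
For reference, a minimal sketch of that kind of one-off rewrite (reusing the names from my first snippet, not my exact script; TickStore buckets the rows itself on write):

from arctic import Arctic

# One-off migration: read everything back, drop the symbol, rewrite in a single
# call so TickStore can bucket the rows (100k per document by default)
store = Arctic("localhost")
lib = store["Darwinex_Tick_Data"]

df = lib.read("EURUSD")        # whole history as one DataFrame
lib.delete("EURUSD")           # remove the old 1-tick-per-document data
lib.write("EURUSD", df)        # rewritten in 100k-row buckets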

@vargaspazdaniel
Author

vargaspazdaniel commented Jul 28, 2021

Reading speed test

Okay: grouping ticks in chunks of 100k rows (100k ticks = 1 document), I get 166.57 seconds to read 16,063,229 rows x 2 columns, while without grouping (1 tick = 1 document) the same amount of data takes 1108.18 seconds.

Definitely a huge difference... Now I need to think of a way to group 100k ticks before saving them in a document... Any idea? In my case, tick data doesn't arrive fast enough to fill those 100k rows quickly, so I don't know how to keep the ticks safe while the buffer grows to 100k rows: if something goes wrong and the algo goes down, I could lose 99,999 ticks that haven't been written to Mongo.

PS: another important advantage is that the size of the DB has decreased from 1060 MB to 73 MB, insane...

@dominiccooney

Now I need to think of a way to group 100k ticks before saving them in a document... Any idea?

Collect it in a different data store and then copy it to Arctic when you have accumulated enough rows. For example you could collect it in Redis with journaling.
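
A very rough sketch of that pattern (assumes the redis-py client and the library name from your first snippet; adapt it to your feed):

import json
import pandas as pd
import redis

r = redis.Redis()             # enable AOF persistence on the Redis server for the journaling part
KEY = "ticks:EURUSD"          # Redis list used as the staging buffer

def record_tick(timestamp, bid, ask):
    # append every incoming tick so a restart doesn't lose the buffer
    r.rpush(KEY, json.dumps({"index": timestamp.isoformat(), "bid": bid, "ask": ask}))

def flush_to_arctic(lib, symbol="EURUSD", batch_size=100_000):
    # copy a full batch into Arctic, then trim only what was actually written
    if r.llen(KEY) < batch_size:
        return
    raw = r.lrange(KEY, 0, batch_size - 1)
    df = pd.DataFrame([json.loads(x) for x in raw])
    df["index"] = pd.to_datetime(df["index"], utc=True)   # TickStore wants tz-aware timestamps
    lib.write(symbol, df.set_index("index"))
    r.ltrim(KEY, batch_size, -1)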

PS: another important advantage is that the size of the DB has decreased from 1060 MB to 73 MB, insane...

LZ compression works, approximately, by finding backreferences to content it already compressed and emitting a reference to the previous content instead of repeating it. Arctic compresses all the rows in a column. When you only write a single row at a time you are compressing a single value at a time (one row, one column) so there's little context to find repeated content in.
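
You can see the effect with the standalone lz4 package (just an illustration on synthetic data, not Arctic's exact codepath):

import lz4.block
import numpy as np

# Synthetic tick column: prices move in tiny steps, so the byte stream repeats a lot
steps = np.random.choice([-1e-5, 0.0, 1e-5], size=100_000)
prices = np.round(1.10000 + np.cumsum(steps), 5)

one_row = prices[:1].tobytes()    # what the compressor sees when you write 1 tick
chunk = prices.tobytes()          # what it sees for a 100k-row chunk

print(len(lz4.block.compress(one_row)), "bytes from", len(one_row))   # 8 input bytes come back slightly larger
print(len(lz4.block.compress(chunk)), "bytes from", len(chunk))       # noticeably smaller: repeated prices become backreferences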

@vargaspazdaniel
Author

Thanks a lot for your advice!
