New data file size. #697

Closed
RodionNikolaev opened this issue Oct 30, 2018 · 7 comments

@RodionNikolaev

Why is the new data file so big?

Package   Minified   latest.json
0.5.21    407.8 kB   176 kB
0.5.23    1.1 MB     903 kB
@mattjohnsonpint
Contributor

Thanks for pointing that out. It was expected, though I didn't anticipate the degree of increase.

With this release, #308 is fixed. That bug was truncating data before certain dates, due to differences in zdump output on various systems. So prior to this release, the "full" versions of the data files weren't actually quite full.

Additionally, if there's even one difference between zones, then our builder won't combine them with links. As such, the "full" version has lots of lines that look new, but they were simply combined previously because the builder did not know about the earlier data.
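
To illustrate the point about links, here is a minimal sketch of the idea, not the project's actual builder. The record shape (`name`, `untils`, `offsets`), the zone names, and the `combineWithLinks` helper are all hypothetical; the `leader|alias` link strings are only illustrative.

```javascript
// Hypothetical, simplified zone records: NOT moment-timezone's real format.
// Zones whose data is identical are collapsed into one entry plus a link.
function combineWithLinks(zones) {
  const leaders = new Map(); // serialized data -> name of the first zone seen
  const kept = [];
  const links = [];
  for (const zone of zones) {
    const key = JSON.stringify({ untils: zone.untils, offsets: zone.offsets });
    const leader = leaders.get(key);
    if (leader === undefined) {
      leaders.set(key, zone.name);
      kept.push(zone);
    } else {
      links.push(`${leader}|${zone.name}`);
    }
  }
  return { zones: kept, links };
}

// With history truncated, both zones look identical and can be linked.
const truncated = [
  { name: 'A/One', untils: [1500000000000, null], offsets: [-60, -120] },
  { name: 'A/Two', untils: [1500000000000, null], offsets: [-60, -120] },
];

// With earlier history restored, one extra transition keeps them apart.
const full = [
  { name: 'A/One', untils: [-1000000000000, 1500000000000, null], offsets: [-53, -60, -120] },
  { name: 'A/Two', untils: [1500000000000, null], offsets: [-60, -120] },
];

const truncatedResult = combineWithLinks(truncated);
const fullResult = combineWithLinks(full);
console.log(truncatedResult.zones.length, truncatedResult.links);
console.log(fullResult.zones.length, fullResult.links);
```

In this toy example the truncated input yields one zone and one link, while the full input yields two separate zones, which is the kind of growth described above.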

The truncated (2012-2022) files still are about the same size, because they are deliberately truncating data for those date ranges. You can build the files yourself if you want specific truncated behavior.

Sorry if this came as a surprise. It's just the nature of correcting this bug.

@prantlf

prantlf commented Nov 4, 2018

The new data are 8 times (!) larger. I think that applications will need other limited data sets besides the rather short 2012-2022 range. I measured these numbers with the pure JSON (packed) data. Showing non-minified sizes would be too scary ;-)

Full IANA TZ data:  923 KB minified, 33.3 KB gzipped
Data for 1000-2050: 203 KB minified, 24.9 KB gzipped
Data for 1650-2050: 203 KB minified, 24.9 KB gzipped
Data for 1900-2050: 200 KB minified, 23.3 KB gzipped
Data for 1970-2018: 106 KB minified, 13.1 KB gzipped
Data for 2012-2022:  27 KB minified,  6.5 KB gzipped

The huge difference between the full data and 1000-2050 makes me doubt whether the limited data are really correct. Isn't filterLinkPack dropping some "good stuff"? ;-)

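// Utilities that provide filterLinkPack
const tz = require('./moment-timezone-utils')
// Full unpacked data for the 2018g release
const fullData = require('./data/unpacked/2018g.json')
const groupLeaders = require('./tasks/group-leaders.json')
// Keep only 1900-2050, link identical zones, and pack the result
const limitedData = tz.filterLinkPack(fullData, 1900, 2050, groupLeaders)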
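
One way to get "minified" and "gzipped" figures like those listed above is sketched below using Node's built-in `zlib`. This is an assumption about how such numbers could be measured, not necessarily how they were obtained here. The `measureJson` helper and the `sample` object are hypothetical placeholders.

```javascript
const zlib = require('zlib');

// Returns the byte size of `data` as compact JSON, before and after gzip.
function measureJson(data) {
  const minified = JSON.stringify(data);
  return {
    minifiedBytes: Buffer.byteLength(minified, 'utf8'),
    gzippedBytes: zlib.gzipSync(minified).length,
  };
}

// Placeholder object standing in for real data (hypothetical content).
const sample = {
  version: 'example',
  zones: new Array(200).fill('Example/Zone|EX|0|0|'),
  links: [],
};

const sizes = measureJson(sample);
console.log(sizes);
```

In practice one would pass the `limitedData` value from the snippet above instead of `sample`.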

@sbrandwoo

My biggest concern here is that the file size has repeatedly changed in patch releases rather than in minor or major releases. 0.5.13 was 190 KB and 0.5.23 is now 1.4 MB. These are patch releases, so you would expect any application to update to them without a second thought, but the update brings a breaking change in the size of the resulting application.

@yurikuzn

yurikuzn commented Feb 5, 2019

I use a script to build timezone data for custom ranges: https://github.com/yurikuzn/moment-timezone-data-build

The file size is 123 KB for the 1970-2030 range.

@Colkadome

I ended up filtering the dates in the range 1970 - 2030, then replacing the 'indices' and 'untils' data with indexes to an array of values (as a lot of that data is repeated):

https://gist.github.com/Colkadome/7cd3c8111ba13f804908dcb6d06d2dab
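
The idea of replacing repeated values with indexes into a shared array can be sketched roughly as follows. This is one possible reading of the approach and is not taken from the gist. It assumes packed zone strings whose pipe-separated fields put the indices at position 3 and the untils at position 4. The sample strings and the helper names `internFields` and `restoreFields` are hypothetical.

```javascript
// Replace fields 3 (indices) and 4 (untils) of each packed zone string with
// positions in a shared table of unique values.
function internFields(packedZones) {
  const values = [];           // shared table of unique field values
  const positions = new Map(); // field value -> position in `values`
  const intern = (value) => {
    if (!positions.has(value)) {
      positions.set(value, values.length);
      values.push(value);
    }
    return positions.get(value);
  };
  const zones = packedZones.map((packed) => {
    const parts = packed.split('|');
    parts[3] = intern(parts[3]);
    parts[4] = intern(parts[4]);
    return parts;
  });
  return { values, zones };
}

// Reverse the transformation to get the original packed strings back.
function restoreFields({ values, zones }) {
  return zones.map((parts) => {
    const copy = parts.slice();
    copy[3] = values[copy[3]];
    copy[4] = values[copy[4]];
    return copy.join('|');
  });
}

// Made-up packed strings that share the same indices and untils.
const packedZones = [
  'Example/One|AAA BBB|10 20|0101|1abc0 2def0 1abc0|1e5',
  'Example/Two|CCC DDD|30 40|0101|1abc0 2def0 1abc0|2e5',
];

const compact = internFields(packedZones);
const restored = restoreFields(compact);
console.log(compact.values.length, restored.length);
```

The data has to be restored to the original packed form before it is handed to moment-timezone, so the saving applies to the transferred file only.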

@mattjohnsonpint
Contributor

I plan to publish a 1970-2030 file in the next release. See comments in #614. Thanks.

@mattjohnsonpint
Contributor

This is completed in version 0.5.24. Thanks.
