
Why does hexo use this database implementation? #13

Open
MegaJ opened this issue Dec 12, 2017 · 4 comments

MegaJ commented Dec 12, 2017

Hello, this might sound rather naive, but why was a custom db made for hexo? (Maybe it wasn't made for hexo, as it didn't use to be part of hexojs.)

I don't understand why a flat file format was chosen, as opposed to say, an already existing database like mongo or other text storage databases that get more attention. I can see that installations are easier this way especially for users, data saving/retrieval are as simple as the fs api, and perhaps disk reads might be faster than talking to a mongo daemon (I don't know though, I haven't looked at any benchmark data). I can speculate, but I won't know for sure why this decision was made.

The part where we save all of the posts' content into db.json seems redundant when the file itself already holds the data. We could just load the content from the file, rather than read a copy of it from another text file (db.json). Wouldn't we be able to save a lot of space this way?

Thanks for reading and thanks for making hexo!

Contributor

tcrowe commented Dec 12, 2017

Good question. I won't speculate either. (Well, only a little.)

One idea I can think of is that you can index all the assets and access them synchronously. That way it can support most of the template engines people like to use. It just passes them this context:

https://hexo.io/docs/templates.html
https://hexo.io/docs/variables.html

For example a recent posts query:

site.posts.sort('date', -1).limit(10).toArray()

It's possible to do a lot more elaborate querying than that!
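To give a feel for what a chainable, synchronous query like that does, here is a minimal sketch over a plain in-memory array. The `Query` class, its `find` method, and the sample posts are hypothetical stand-ins written for illustration; this is not Warehouse's implementation, and only the `sort('date', -1).limit(n).toArray()` call shape is taken from the example above.

```javascript
// Hypothetical stand-in for a query object; NOT Warehouse's implementation.
class Query {
  constructor(docs) {
    this.docs = docs;
  }

  // Keep documents for which every key in `conditions` matches exactly.
  find(conditions) {
    const keys = Object.keys(conditions);
    return new Query(
      this.docs.filter(doc => keys.every(key => doc[key] === conditions[key]))
    );
  }

  // Sort by one field; order 1 = ascending, -1 = descending.
  sort(field, order = 1) {
    const sorted = [...this.docs].sort((a, b) => {
      if (a[field] < b[field]) return -order;
      if (a[field] > b[field]) return order;
      return 0;
    });
    return new Query(sorted);
  }

  // Keep only the first n documents.
  limit(n) {
    return new Query(this.docs.slice(0, n));
  }

  // Return a plain array copy of the current result set.
  toArray() {
    return [...this.docs];
  }
}

// Sample data, invented for this sketch.
const posts = new Query([
  { title: 'First', date: new Date('2017-01-05'), published: true },
  { title: 'Draft', date: new Date('2017-06-01'), published: false },
  { title: 'Second', date: new Date('2017-03-10'), published: true },
  { title: 'Third', date: new Date('2017-12-12'), published: true },
]);

// Two most recent published posts, newest first.
const recent = posts.find({ published: true }).sort('date', -1).limit(2).toArray();
const recentTitles = recent.map(post => post.title);
console.log(recentTitles);
```

Because every step returns a new query synchronously, a template engine can call a chain like this inline without awaiting anything, which is probably part of the appeal.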

It's a big advantage not to have to install and manage a database server. Another advantage is being able to open a json file and search all the data in the raw form.
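As a rough illustration of searching the raw data, the sketch below writes a tiny sample file and then searches it with nothing but Node's `fs` module and array methods. The `{ meta, models }` layout, the `Post` key, and the field names are assumptions made for this example; check a real db.json before relying on any of them.

```javascript
const fs = require('node:fs');
const os = require('node:os');
const path = require('node:path');

// Write a tiny sample file so the sketch runs on its own.
// The { meta, models } layout below is an assumption for illustration.
const sample = {
  meta: { version: 1 },
  models: {
    Post: [
      { title: 'Hello World', source: '_posts/hello-world.md' },
      { title: 'Why flat files?', source: '_posts/flat-files.md' },
    ],
  },
};
const dir = fs.mkdtempSync(path.join(os.tmpdir(), 'db-sketch-'));
const dbPath = path.join(dir, 'db.json');
fs.writeFileSync(dbPath, JSON.stringify(sample));

// Load the whole file and search it with plain array methods.
const db = JSON.parse(fs.readFileSync(dbPath, 'utf8'));
const matches = db.models.Post.filter(post => /flat/i.test(post.title));
const matchedSources = matches.map(post => post.source);
console.log(matchedSources);

// Clean up the temporary directory.
fs.rmSync(dir, { recursive: true, force: true });
```

No server, driver, or query language is involved, only a file read and a filter, which is the convenience being described here.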

Scaling up is difficult. There are people using it with over 10,000 posts. It would be interesting to hear how that works for them. If we check hexo issues we might be able to find out.

Another json database is https://github.com/louischatriot/nedb which you may have seen. It doesn't have schemas though. The querying is like MongoDB which many libraries like to emulate.

When hexo was made some years ago, was there anything else that could even provide this functionality? Warehouse might be the best one!

What do you think the most common use case is for hexo blogs? 100 posts or less?


kosirm commented Dec 19, 2017

I was also thinking about this, but from another perspective. Maybe it is even more naive, but I'm thinking: how much would it take to upgrade the datastore with an in-memory database (for example LokiJS) for the heaviest operations? How much of a speed improvement would that give hexo?

Contributor

tomap commented Sep 10, 2018

Indeed, I looked at it, and Loki seems to do the job: fast, with persistence, ...
If we were to make the transition, we would probably need to expose a wrapper over Loki.

Contributor

tcrowe commented Jun 26, 2019

I tried LokiJS, nedb, lowdb, and some others. Warehouse has a nicer API. We just need to document it better and support it with examples.
