Skip to content

michael-simons/goodreads

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 
 
 
 
 
 
 
 
 

Repository files navigation

Goodreads

Books

Problemsolver

I'd call those books "Problemsolver". You might not read them from front to back but as a reference for specific problems.

Non-IT

These are non IT specific books, but touch essential things in our business. Either the way we work together or address some common misconceptions, for example sleep being an optional aspect to live:

Work in general

Essays and other articles

Best practices

Architecture

Concepts

Philosophy

My library

all.csv contains an incomplete list of books in my library. The CSV file has 4 columns separated by ;.

Name Description
Author One or more authors, last name, first name separated by &
Title Title of the book
Type R, S, C (Roman (Fiction), Sachbuch (Non-Fiction), Comic)
State R, U (Read, Unread)

Interacting with the CSV file

Using SQLite to query the database

sqlite3 :memory: \
 '.mode csv' \
 '.separator ,' \
 '.import "|curl -s https://raw.githubusercontent.com/michael-simons/goodreads/master/all.csv" books' \
 "SELECT title FROM books WHERE author like '%King%' ORDER by title"

Using DuckDB

DuckDB is an incredible versatile, in-process SQL OLAP database management system and while most likely total overkill for the small dataset, it's fun. Install and start DuckDB:

-- Required to directly import the csv file from Github
INSTALL httpfs;
-- Just query the dataset
SELECT DISTINCT author FROM read_csv('https://raw.githubusercontent.com/michael-simons/goodreads/master/all.csv', header=true, auto_detect=true);
-- Create a table named books
CREATE TABLE books AS SELECT * FROM read_csv('https://raw.githubusercontent.com/michael-simons/goodreads/master/all.csv', header=true, auto_detect=true);
-- Query and manipulate as needed
-- Save the result (overwriting all.csv and sorting it on the way)
COPY (SELECT * FROM books ORDER BY author COLLATE de ASC, title COLLATE de ASC) TO 'all.csv' WITH (header true);

Using Neo4j

I used to run a browseable, interactive list of all books on Heroku using a free Neo4j AuraDB instance, but Heroku stopped offering a free service a while ago. The repository containing the application code (based on Quarkus) is still available: neo4j-aura-quarkus-graphql project. Follow the instruction in the README.

The essential query to import the CSV into Neo4j looks like this

LOAD CSV WITH HEADERS FROM 'https://raw.githubusercontent.com/michael-simons/goodreads/master/all.csv' AS row FIELDTERMINATOR ',' 
MERGE (b:Book {
  title: trim(row.Title)
})
SET b.type = row.Type, b.state = row.State
WITH b, row
UNWIND split(row.Author, '&') AS author
WITH b, split(author, ',') AS author
WITH b, ((trim(coalesce(author[1], '')) + ' ') + trim(author[0])) AS author
MERGE (a:Person {
  name: trim(author)
})
MERGE (a)-[r:WROTE]->(b)
WITH b, a
WITH b, collect(a) AS authors
RETURN id(b) AS id, b.title, b.state, authors

Using xsv

xsv is a powerful tool for manipulating CSV. Here's an example how to get a list of unique authors

curl -s https://raw.githubusercontent.com/michael-simons/goodreads/master/all.csv | \
  xsv select -d "," Author |\
  uniq

Using the webui

If you have JBang installed you can start an admin "UI" like this:

jbang admin.java

Access the page at localhost:8080.

About

A collection of articles, books and quotes that may help to become a better developer

Resources

Stars

Watchers

Forks

Languages