blog

Why PostgreSQL is awesome

I was supposed to post this update on Friday (September 9th), but I forgot, so I decided to post it on Saturday (September 10th), but I forgot again; so then I decided to post it on Sunday (September 11th) and I forgot again; so I'll post it today.

One of the most important (or maybe the only one) features behind the popularity of NoSQL is its ability to store data without a schema, in other words: forget about tables, you store anything you want in JSON. This flexibility comes in really handy when the data you need to store is a bit unpredictable, but still needs to be indexed and searched - normally we can overcome this by doing some complicated workarounds with the schemas, but that is where NoSQL really shines. Where it doesn't shine, however, is where you actually need relational schemas and organizing data in a very cohesive way.

Myself, I've never been a big fan of NoSQL: I love JSON, and I love to store information in JSON, but NoSQL never gave me the confidence of actually being reliable; thankfully, newer databases already support similar features for storing JSON. PostgreSQL accepts the data formats JSON and JSONB, which recognizes and treats JSON objects as actual columns.

For instance, the entry below contains a JSON object with the data of a person called John Doe, 54 years old, that lives in Toronto/Ontario.

TABLE people

 id | doc
----+----------------------------------------------------------------
1   | {
    |   “city”: “Toronto”,
    |   “province”: “Ontario”,
    |   “person”: {
    |     “name”: { “first”: “John”, “last”: “Doe” },
    |     “age”: 54
    |   }
    | }

His first and last name could be retrieved using the following SQL query:

SELECT doc->'person'->'name'->>'first', doc->'person'->'name'->>'last'
FROM people WHERE id=1;

The syntax is fairly simple, and almost self explanatory, only with one detail: the arrow '->' is used to retrieve a value of the object, while '->>' retrieves it as a string.

The nice thing about this feature is that SQL can now be a mix of both worlds, it also means that instead of pulling query results from the database and computing/filtering them in the API, if necessary, this can be done directly in the SQL statement.