SQL Information

Abusing XML and JSON

سه شنبه, ۱۱ خرداد ۱۳۹۵، ۰۸:۱۹ ق.ظ

First off

sql-tricks-slide-226

JSON is just XML with less features and less syntax

Now, everyone knows that XML is awesome. The corollary is thus:

JSON is less awesome

Don’t use JSON.

Now that we’ve settled this, we can safely ignore the ongoing JSON-in-the-database-hype (which most of you will regret in five years anyway), and move on to the final example. How to do XML in the database.

This is what we want to do:

sql-tricks-slide-231

Given the original XML document, we want to parse that document, unnest the comma-separated list of films per actor, and produce a denormalised representation of actors/films in a single relation.

Ready. Set. Go. This is the idea. We have three CTE:

1
2
3
4
5
6
7
8
9
10
11
WITH RECURSIVE
  x(v) AS (SELECT '...'::xml),
  actors(
    actor_id, first_name, last_name, films
  ) AS (...),
  films(
    actor_id, first_name, last_name,
    film_id, film
  ) AS (...)
SELECT *
FROM films

In the first one, we simply parse the XML. Here with PostgreSQL:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
WITH RECURSIVE
  x(v) AS (SELECT '
<actors>
  <actor>
    <first-name>Bud</first-name>
    <last-name>Spencer</last-name>
    <films>God Forgives... I Don’t, Double Trouble, They Call Him Bulldozer</films>
  </actor>
  <actor>
    <first-name>Terence</first-name>
    <last-name>Hill</last-name>
    <films>God Forgives... I Don’t, Double Trouble, Lucky Luke</films>
  </actor>
</actors>'::xml),
  actors(actor_id, first_name, last_name, films) AS (...),
  films(actor_id, first_name, last_name, film_id, film) AS (...)
SELECT *
FROM films

Easy.

Then, we do some XPath magic to extract the individual values from the XML structure and put those into columns:

1
2
3
4
5
6
7
8
9
10
11
12
13
WITH RECURSIVE
  x(v) AS (SELECT '...'::xml),
  actors(actor_id, first_name, last_name, films) AS (
    SELECT
      row_number() OVER (),
      (xpath('//first-name/text()', t.v))[1]::TEXT,
      (xpath('//last-name/text()' , t.v))[1]::TEXT,
      (xpath('//films/text()'     , t.v))[1]::TEXT
    FROM unnest(xpath('//actor', (SELECT v FROM x))) t(v)
  ),
  films(actor_id, first_name, last_name, film_id, film) AS (...)
SELECT *
FROM films

Still easy.

Finally, just a bit of recursive regular expression pattern matching magic, and we’re done!

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
WITH RECURSIVE
  x(v) AS (SELECT '...'::xml),
  actors(actor_id, first_name, last_name, films) AS (...),
  films(actor_id, first_name, last_name, film_id, film) AS (
    SELECT actor_id, first_name, last_name, 1,
      regexp_replace(films, ',.+', '')
    FROM actors
    UNION ALL
    SELECT actor_id, a.first_name, a.last_name, f.film_id + 1,
      regexp_replace(a.films, '.*' || f.film || ', ?(.*?)(,.+)?', '\1')
    FROM films AS f
    JOIN actors AS a USING (actor_id)
    WHERE a.films NOT LIKE '%' || f.film
  )
SELECT *
FROM films

Let’s conclude:


  • سمیرا امیری

نظرات  (۰)

هیچ نظری هنوز ثبت نشده است

ارسال نظر

ارسال نظر آزاد است، اما اگر قبلا در بیان ثبت نام کرده اید می توانید ابتدا وارد شوید.
شما میتوانید از این تگهای html استفاده کنید:
<b> یا <strong>، <em> یا <i>، <u>، <strike> یا <s>، <sup>، <sub>، <blockquote>، <code>، <pre>، <hr>، <br>، <p>، <a href="" title="">، <span style="">، <div align="">
تجدید کد امنیتی