Designing SQL query using pipeSQL
I was reading the following article “Designing Efficient SQL: A Visual Approach” by Jonathan Lewis. It is an excellent article and a must read for SQL developers. In the article, he explains the steps of visual approach to create a SQL which solves a complex report problem.
Here is the problem that he chose to demonstrate:
For example, say you have a schema which defines a fairly simple order-processing system:
customers place orders, orders can have multiple order lines, an order line is for a product,
and products come from suppliers; some products can be substituted by other products.
One day you are asked to report on
“orders placed over the last week by customers based in London, for products supplied by producers
based in Leeds that could have been supplied from an alternative location”.
And as a solution, he created the below given image. It is the outcome of a visual approach:
What if I could create the SQL as described in the above image. Well, it's possible by using pipeSQL.
Here is the pipeSQL solution.
-- product alternatives from suppliers who are not from 'LEEDS' sup2 := Suppliers location != 'LEEDS' |> p2 := Products filter p2.id_supplier = sup2.id |> alt := Alternatives filter alt.id_product_sub = p2.id column id_product ; --Orders placed on last week by customers who are from 'LONDON' c := Customers location = 'LONDON' |> ord := Orders ord.date_placed >= dateadd(day, -7, getdate()), ord.date_placed <= getdate() filter ord.id_customer = c.id column id, date_placed |> orl := Order_lines filter orl.order_id = ord.id column ord.id, ord.date_placed |> p := Products join Supplier sup on p.id_supplier = sup.id, sup.location='LEEDS' filter p.id = orl.id_product column p.id, orl.id id_order, orl.date_placed date_placed ; -- Orders placed over the last week by customers based in London, for products supplied by producers -- based in Leeds that could have been supplied from an alternative location. alt2 := <- alt |> all_ord := <- p filter alt2.id_product = all_ord.id column id_order, id id_product, date_placed ;
I have divided the problem into smaller sections. Now, for me, it is easy to understand what I'm trying to do even after couple of days/weeks.
And the generated query is :
select id_order, id id_product, date_placed from ( select p.id, orl.id id_order, orl.date_placed date_placed from Products p join Supplier sup on ( p.id_supplier = sup.id and sup.location = 'LEEDS' ) join ( select ord.id, ord.date_placed from Order_lines orl join ( select id, date_placed from Orders ord join ( select * from Customers c where location = 'LONDON' ) c on ( ord.id_customer = c.id ) where ord.date_placed >= dateadd(day, -7, getdate()) and ord.date_placed <= getdate() ) ord on ( orl.order_id = ord.id ) ) orl on ( p.id = orl.id_product ) ) all_ord join ( select * from ( select id_product from Alternatives alt join ( select * from Products p2 join ( select * from Suppliers sup2 where location != 'LEEDS' ) sup2 on ( p2.id_supplier = sup2.id ) ) p2 on ( alt.id_product_sub = p2.id ) ) alt2 ) alt2 on ( alt2.id_product = all_ord.id )
How can I make sure this is correct without running via real database engine? or atleast a quick way to test it.
Like for example, quickly create a set of tables with random data, in SQLite, and test it.
I should work on this :)