PipeSQL |>

A lightweight tool and a language to design SQL queries, in bottom-up approach, using pipes and filters

Why pipeSQL?

SQL is ubiquitous. But SQL is hard to use: It’s complicated, confusing, and error prone. In order to have any hope of writing SQL code that you can be sure is accurate, therefore—meaning it does exactly what it’s supposed to do, no more and no less—you must follow some appropriate discipline.

C. J. Date on "SQL and Relational Theory" / preface to first edition

SQL is the standard and declarative language for use with relational databases. In other words, SQL simply says what need to be done and it is upto the database management system (RDBMS) on how it need to be done.

Let's say that we design a SQL query, to create a complex sales report, and submit to the database. First, database query engine will parse the query and convert it to a executions plan. Then it begins to execute and project data according to the plan.

By chance looking at the executions plan, we understand that plan contains operations in a tree structure and the execution order begins from leaf node or from bottom.

What if we could design SQL queries similar to the executions plan?
In other words, designing a SQL query in bottom-up approach. One possible answer is pipeSQL.

Helloworld in pipeSQL

Find all the employees who have lastname Helloworld.

e := Employees 
	e.lastname='HelloWorld';
And the generated SQL is
select  *
from Employees e 
where e.lastname = 'Helloworld'
				
APP Play

Let's look at the following example:

Say, an imaginary company and it's database schema is same as Northwind Database. It is a simple order processing database.
At the end of the sales year, they want a following sales report.

Get top 10 sales amount and the employees who made that sales.

To design the query, I have to access 3 tables and they are: Order Details, Orders, and Employees.

Designing query in PipeSQL

--step 1: od := [Order Details] summary sum( od.UnitPrice * od.Quantity * (1 - od.Discount) ) by OrderID |> ord := Orders filter ord.OrderID = od.OrderID column ord.OrderID, ord.ShippedDate, ord.EmployeeID, od.summary1 SaleAmount |> emp := Employees filter emp.EmployeeID = ord.EmployeeID column emp.Country, emp.LastName, emp.FirstName, ord.OrderID, ord.ShippedDate, ord.SaleAmount sort ord.SaleAmount - top 10;

You can run this query on a live database from here: play

Let's look at the pipeSQL explanation:

Step 1: Fetch sum of sales for each orders
Summarize the [Order Details] table rows by grouping on OrderID and calculate the total sales amount for each orders.
Pipe (|>) the result to the next step.


Step 2: Fetch EmployeeID for each orders
This step is going to receive the step 1 result as an input dataset. We have to specify the filter rule on how to process this input dataset.
Filter rule, filter ord.OrderID = od.OrderID, is actually a SQL JOIN condition (or SQL EXISTS condition) and the begins with keyword filter.
Under this step, [Order Details] and Orders table are joined based on the filter condition.
Keyword column specifies the columns, for example EmployeeID, which needs to be included in the result.
Pipe (|>) the result to the next step.
Step 3: Fetch top 10 max sales employees
This step is going to receive the step 2 result as an input dataset and the filter rule is filter emp.EmployeeID = ord.EmployeeID.
This steps joins Orders and Employees table based on the filter condition. Since we need the max sales amount, we have to sort it in descending order. To do that, we use sort statement which is noted by keyword sort. Here sort ord.SaleAmount - will sort the results on descending and top 10 will only fetch the first 10 rows.
And the pipe chains ends with a ;.

Finally, pipeSQL query will be generated to appropriate SQL dialect, for this example, into Oracle SQL.

select * 
from (
select 
  emp.Country,
  emp.LastName,
  emp.FirstName,
  ord.OrderID,
  ord.ShippedDate,
  ord.SaleAmount
from Employees emp 
join (
  select 
    ord.OrderID,
    ord.ShippedDate,
    ord.EmployeeID,
    od.summary1 SaleAmount
  from Orders ord 
  join (
      select 
        OrderID,
        sum(
        od.UnitPrice * od.Quantity * (1 - od.Discount)
        ) summary1
      from "Order Details" od 
      group by
      OrderID
) od on 
  (
  ord.OrderID = od.OrderID
  )
) ord on 
(
emp.EmployeeID = ord.EmployeeID
)
order by ord.SaleAmount desc
) toplist 
where rownum <= 10

For all available features of PipeSQL, please check the documentation


Copyright © 2014 pipesql.com . All Right Reserved