> On 18 Apr 2022, at 11:56, Pól Ua Laoínecháin <linehanp@xxxxxx> wrote: (…) > All of the code below is available on the fiddle here: > > https://dbfiddle.uk/?rdbms=postgres_13&fiddle=0cc20c9081867131260e6e3550bd08ab (…) > OK, grand, now I wish to perform a RECURSIVE CTE on it. So, I start by > trying something (I thought was) very simple. Obviously, I plan to do > more, but I wanted to get the "mechanics" correct to start with. So, > my query is: > > WITH RECURSIVE cte1 (n, ln) AS > ( > SELECT 1 AS n, string > FROM line Here is your first problem, this will yield a result for each row in your line table, numbering it ‘1’. You seem to have expected just a single result here, but that is something that you need to take care of in your query. This part is called the base case, base step or initial step. > UNION ALL > SELECT n + 1, ln > FROM cte1 > WHERE n < (SELECT COUNT(*) FROM line) And then for each of those rows, it will add all those rows (from the same CTE!) again. This part is called the recursive step. You did add a termination condition here, which indeed manages to terminate, but it does so too late. It seems that you do understand some of the concepts of recursive CTE’s, but you appear to be missing some crucial knowledge. For example, it is actually possible to query multiple trees with a single recursive CTE. It is not limited to a single tree. How many trees the CTE will navigate depends on how you selected the rows in the base case. > ) > SELECT * FROM cte1; > > i.e. have a counter variable and a string from the line table My first question is why you’re using a recursive CTE here? This doesn’t appear to be hierarchical data (such as a tree), unless perhaps you intended to actually traverse the HTML document hierarchy? > > But, then to my horror, the result of this query is > > 1with t(x) as (values( XMLPARSE(DOCUMENT > ('<root><NotificationServiceDetails NotificationNo="0" > AlarmCode="mail" AlarmStartTime="10:00:00" AlarmTime="0" Id ="2" >> <NotificationServiceDetail > Id="2"><Title><![CDATA[aaaaaaaaaaaaa]]></Title><ContentJson><![CDATA[ > 1 <html lang="en"> > 1 <head> > 1 <meta charset="utf-8"/> > 1 more stuff > 1 more stuff > 1 </table> > 1 </body> > 1 </html> > ... > ... snipped for brevity > ... > > 256 rows! <<=== note 256! > > > So, my simple recursive CTE is > > a) not incrementing n and It should be, but only after the first 16 rows from your base case. > b) obviously doing some sort of CROSS JOIN (16^2 = 256). Not quite, it’s selecting the same 16 rows 16 times, incrementing the numbering each time. Effectively it is indeed a Cartesian product. More on that below. > I would be grateful if somebody could explain what's going on here. As > shown in the fiddle, I can do the basic 1 - 10 RCTE and I've done way > more complex ones before, so I'm a bit baffled as to what's going on > here. For a recursive CTE to work, you need two things, and that seems to be where you have misunderstood some things: 1. You need a base case that selects the _initial_ rows to start the recursion from. 2. You need a recursive step, where you usually join the results of your CTE to your table to figure out what the next result should be. (It is very similar to how this works in procedural languages, where you first call a function that initiates the recursion and then that calls a second function (with more parameters usually) that calls itself again and again, with updated input parameters, until some termination condition is reached and the function returns a value.) For the base case, you will want to make sure that you select the correct row(s) from your line table for the initial set. My guess is that it should be the row containing the string '<root>'. For the recursive step, you will want to select the next child row from your line table that is (directly) related to the one(s) you selected in the base case, or is (directly) related to the previous iteration of the recursive case. It is unclear to me what that should be in your case, I don’t understand the purpose of your CTE. Your termination condition, WHERE n < (SELECT COUNT(*) FROM line), isn’t quite doing what you expect either I think. Let’s say that c = (SELECT COUNT(*) FROM line), so c = 16 in your example. - The first iteration of the CTE, the base case, will select 16 lines with n=1. - The second iteration is the first time in the recursive step, and will select all lines from the base case where n < c, or 1 < 16, which is all of them. - The next iteration will select all rows where 2 < 16, then 3 < 16, … until finally in the 16th iteration it stops with the test 16 < 16. So, you managed to make the recursion terminate at least, but probably not how you intended it to work. Something along the lines of: WITH RECURSIVE cte (n, ln) AS ( SELECT 1, string FROM line WHERE string LIKE '%<root>%' UNION ALL SELECT n+1, string FROM cte JOIN line ON iHaveNoClueHere ) SELECT * FROM cte; > Any explanations, references to URLs or other advice gratefully received. There are plenty of explanations about recursive CTE’s in SQL on the Internet. I would probably have a good look at how it’s described in the PG documentation. There are a couple of good articles on the topic by some PG bloggers too. You’ll find plenty of articles related to how to do this in MSSQL or later Oracle versions. Those are totally fine for understanding the principles, and even the syntax is almost entirely the same. For a PG specific explanation, this one looks pretty thorough: https://towardsdatascience.com/recursive-sql-queries-with-postgresql-87e2a453f1b Regards, Alban Hertroys -- If you can't see the forest for the trees, cut the trees and you'll find there is no forest.