Rem 3 Preparation for possible recursion Call
What we want to do
At this point, in many real applications we will likely have most of the rows sorted but will have possibly some rows with similar values in the column used in the initial sort.
We need some way to send arrange that the array as now sorted is resent again sorted by another running copy of the recursion routine over the duplicate rows.
We do not want to over simplify this coding. We did a bit of simplifying: For the sake of explanation of the general coding strategy it was OK to choose just numbers and all Ascending order. But to give the coding some worth in showing typical recursion coding workings, we should allow for the case of multiple duplicate row sections. We will discus that this is the sort of thing, that is to say going “back and forth” or “up and down” in a semi automatic way is what a recursion code is often best at.
There is no single way I know of, to write such a coding. It simply takes some careful thought to arrange that the Call of a new copy is such that on termination of that copy, the way the original copy resumes will allow for the possibility of things such as for “going back down” if, in this case, we need once again to resort multiple rows. Sometimes there are a couple of general characteristics which can help in the design of such a coding…
Main outer loop, and multiple copies of Copys/ “Levels”
Usually we will have a main outer loop, which in our case will be approximately across the entire range of current rows to be sorted. Often, but not always, towards the end, or last half, of the loop, will be the recursion Call, that is to say the code line which causes the current running routine to pause, whilst a new independent copy of the routine starts.
Having this Call part in a loop is probably what usually gives this ability to go “back and forth” between different independent copies of the recursion routine: You see , when a Called recursion routine copy ends, then the previous which had paused restarts, and depending on what your loop is doing, it will possibly cause the Call to be done again, so “you go back down” a copy level, or up a copy level depending on how you like to visualise it. It should be noted that each start and End of a recursion is an independent routine. The variable CopyNumber that we use tells us at “what level” we are, or how “far down” the recursion change of events. In our case that translates in the practice to which column in our list of columns to use , 1 3 2
CopyNo 1 : we are using column 1 to determine the sort order
CopyNo 2 : we are using column 3 to determine the sort order
CopyNo 3 : we are using column 2 to determine the sort order
There will only ever be one copy number 1 started, but that will likely pause a number of times. For further copy number , several may start and End: There can only ever be one running at any one time, but several independent copies could be ran from start to completion. The copy number is an indication of the “level” or in our case the column being used to determine the sort order. ( With hindsight… probably CopyLevel would be a better name for the variable… )
Ending
A typical, but not essential, characteristic of recursion routines is that not much usually goes on after the loop and / or last Call code line area. Further, a typical characteristic is that Ends, especially the final few Ends tend to occur one after the other. It tends to go unnoticed , for example when using debug F8 step code progression mode , since there Are not many code lines there. I find it therefore very demonstrative to have a Message box or Debug.Print code line immediately before the code end, with a message like “Ending a copy number “ & CopyNo & “”
Our specific case: What do we need to do
We already have the array we need for a possible required further sort in the case of duplicated values in the column used in the last bubble sort.
What we need to do is determine any rows with those duplicate values, and then pause the current routine whilst we run another copy of the routine after we have changed the indices in Rs() to just the sub set of rows to be sorted.
I guess there are many ways to do this. I expect I will come back some time and try a few.
The basic strategy to take advantage of recursion in this case will probably usually be the same
Basic recursion using strategy
This is not so difficult as the original sort will mean that rows for duplicated values in the current/ last sort column will be grouped together. So the basic strategy is to loop “down” and whilst we do this noting the indicies for these rows for duplicated values in the current/ last sort column. Once we have such a group use the Call Sub Bubbles(__ recursion starting code line to pause the current routine and start a new copy of the recursion routine to sort these rows which have duplicated values in the current/ last sort column.
Actual working application of the Basic recursion using strategy
I find it very convenient, as well as making debug easier , to “collect things” like row indices in a long string. This is because
it makes it easier to print or message box out what I have, so lends itself nicely to user interfaces to see what is going on,
and
there are very many string manipulating function and methods available
In fact, I already use the variable, strRws , for the range of rows, so it makes sense to use it to build my duplicated rows indicies in this sort of form like " 1 2 3 ". In other words similar I use it similar to how it was used in its first use , where, in this example, it was built like this in the Calling routine, " 1 2 3 4 5 6 ". For its first use it has all the indicies, and now , subsequently it will hold the indices corresponding to duplicate row values in the current/ last column used to define the sort order, in the latest version of arsRef() :
Just to clarify as it is easy to get lost in recursion routines: At this Rem 3 stage, I have most likely , just re ordered the array, arsRef(). It may have some rows that could not be put in any specific order if there were identical values in those rows in the column used in the last/ current sort.
One of the main workings of this section, Rem 3 , will be to obtain a string of the form " 1 2 3 " which represents duplicate row values in the current/ last column used to define the sort order.
( In a more realistic example , I might have several groups of such rows, so would have strRws = " 7 8 9 10 " , strRws = " 17 18 " , strRws = " 72 73 74 75 76 77 78 " , ..etc… Furthermore some of those rows might still need to be sorted using a different column / that is to say, using the next user given column, -- hence the reason for trying a recursion routine which “keeps going further and further” as needed
_ Let strRws = ""
I initialise my variable so that I can use it to build up my new duplicate variable rows list
__For rOuter = Left(Trim(tempStr), InStr(1, Trim(tempStr), " ", vbBinaryCompare) - 1) To …………………………
I begin a main **Outer Loop** for all but the last row, where the actual rows are determined by the upper and lower values in strRws
____ If strRws = "" Or InStr(1, Trim(strRws),…………….
____ I have a condition which should catch the situation of starting looking for a set of duplicates, so this will be at the start, strRws = "" , or if the last loop produced no addition to the string and so is left at a single indicia, meaning that no in between space is present, and consequently this will be 0 , InStr(1, Trim(strRws), " ", vbBinaryCompare). With these conditions met my string will become like " 1 " ( with more realistic data this could be any row number )
____ If Trim(UCase(CStr(arsRef(rOuter, Clm)))) = Trim(UCase(CStr(arsRef(rOuter + 1, Clm)))) …………………
____ This code line looks to see if we have a duplicate at the next row in last/ current sort column. ( This is why we loop in the main outer loop to 1 less than the last row, to prevent an error of array index out of range here )
_____ with the last condition met we add the indicie to the current string, strRws
_____ Let strRws = strRws & rOuter + 1 & " "
____ Else ' without the last condition met
____ without the last condition met, we might have the end of a group of duplicate rows, in which case it would be time to organise a recursion run so
____ If Not InStr(1, Trim(strRws), " ", vbBinaryCompare) = 0 Then
______we this we check for this situation needing a recursion run,
______with the condition met its time to organise recursion run
' Now its time to organise a recursion run
Because of how we built our string, strRws , we have nothing to do other than Call a new copy of the routine, Sub Bubbles( __ , with the appropriate arguments,
After that new copy of the recursion routine Ends, I will come back to just after the Call line and in a normal practical use I might still find another group of rows in this look at the first full row sorted array, so I set strRws=””
______ '+++*** this would be end of loop for most cases
Oh Fuck
In most cases I am finished after I am towards the end of the main Loop here.
But we have one slight problem: there is one small imperfection with our logic: In our logic, the end of a group of duplicates is determined by the two conditions:
_ firstly the next row is not a duplicate, and
_ secondly the current strRws has at least two indicies in it.
The problem comes if we have a group of duplicate rows that include the last row: In such an occurrence we will never reach a point where the next line being not a duplicate causes us to do the first then second check and subsequent Call of a new copy of the routine, Sub Bubbles( __.
To overcome this problem we include a last check which covers such an occurrence an allows a last Call of a new copy of the routine
I think for realistic real life data, this might be one of those situations whereby you ignore the last problem so as to either
use it effectively later to cause some mischief and then demand a high ransom to get correct it, or
or
you might consider arranging your data, such that the problem scenario would not occur. This might be a more efficient solution than having the extra check in every looping
In the next post is a summary of a run of the routines
( Final coding here:
http://www.excelfox.com/forum/showth...ll=1#post11067 )




Reply With Quote
Bookmarks