OEIS/Collatz Story: Difference between revisions

imported>Gfis
save from OEIS/3x+1 Problem
 
imported>Gfis
Playground for 3x+1 Problem
previous version in [[Collatz Streetmap]]
==Introduction==
Collatz sequences are sequences of non-negative integer numbers with a simple construction rule: even elements are halved, and odd elements are multiplied by 3 and then incremented by 1. For decades it has been unknown whether the final cycle 4 - 2 - 1 is always reached for any start value. This problem is the '''Collatz conjecture''', for which the [https://en.wikipedia.org/wiki/Collatz_conjecture English Wikipedia] states:
'''Collatz sequences''' (also called  ''trajectories'') are sequences of integer numbers > 0. For any start value > 0 the elements of the sequence are constructed with two simple rules:
# Even numbers are halved.
# Odd numbers are multiplied by 3 and then incremented by 1.  
For decades it has been unknown whether the final cycle 4 - 2 - 1 is always reached for all start values. This problem is the '''Collatz conjecture''', for which the [https://en.wikipedia.org/wiki/Collatz_conjecture English Wikipedia] states:
: It is also known as the 3n + 1 conjecture, the Ulam conjecture (after Stanisław Ulam), Kakutani's problem (after Shizuo Kakutani), the Thwaites conjecture (after Sir Bryan Thwaites), Hasse's algorithm (after Helmut Hasse), or the Syracuse problem; the sequence of numbers involved is referred to as the hailstone sequence or hailstone numbers (because the values are usually subject to multiple descents and ascents like hailstones in a cloud), or as wondrous numbers.


When we speak of ''numbers'' in this article, we normally mean natural numbers (integers > 0). Zero is mentioned explicitly where it occurs.
Straightforward visualizations of Collatz sequences show no obvious structure. The sequences for the first dozen of start values are rather short, but the sequence for 27 suddenly has 112 elements.  
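The two construction rules can be tried out with a few lines of Python. This is only a minimal sketch; the function name <code>collatz_sequence</code> is chosen here for illustration and does not come from the OEIS or from the references. Lengths are counted in elements, including the start value and the final 1.

```python
def collatz_sequence(start):
    """Return the Collatz sequence from start down to 1 (both inclusive)."""
    if start < 1:
        raise ValueError("start value must be a positive integer")
    seq = [start]
    n = start
    while n != 1:
        # Rule 1: even numbers are halved; rule 2: odd numbers become 3n + 1.
        n = n // 2 if n % 2 == 0 else 3 * n + 1
        seq.append(n)
    return seq


# Compare the neighbouring start values 26 and 27.
print(len(collatz_sequence(26)), len(collatz_sequence(27)))
```

The loop terminates only if the start value actually reaches 1, which is exactly what the conjecture asserts but does not prove.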
===References===
* Jeffrey C. Lagarias, Ed.: ''The Ultimate Challenge: The 3x+1 Problem'', Amer. Math. Soc., 2010, ISBN 978-0-8218-4940-8. [http://www.ams.org/bookpages/mbk-78 MBK78]
* OEIS A070165: [http://oeis.org/A070165/a070165.txt File of first 10K Collatz sequences], ascending start values, with lengths
* Gottfried Helms: ''[http://go.helms-net.de/math/collatz/aboutloop/collatzgraphs.htm The Collatz-Problem]''. A view into some 3x+1-trees and a new fractal graphic representation. Univ. Kassel.
* Manfred Trümper: ''The Collatz Problem in the Light of an Infinite Free Semigroup''. Chinese Journal of Mathematics, Vol. 2014, [http://dx.doi.org/10.1155/2014/756917 Article ID 756917], 21 p.
* Klaus Brennecke: ''[https://de.wikibooks.org/wiki/Collatzfolgen_und_Schachbrett Collatzfolgen und Schachbrett]'', on Wikibooks
==Collatz Graph==
===Collatz graph===
When all Collatz sequences are read backwards, they form the '''Collatz graph''' starting with 1, 2, 4, 8 ... . At each node m > 4 in the graph, the path from the root (4) can be continued
When all Collatz sequences are read backwards, they form the '''Collatz graph''' starting with 1, 2, 4, 8 ... . At each node n > 4 in the graph, the path from the root (4) can be continued
* always to m * 2, and  
* always to n * 2, and  
* to (m - 1) / 3 if m ≡ 1 mod 3.
* sometimes also to (n - 1) / 3
The Collatz conjecture claims that the graph contains all numbers, and that - except for the trivial, leading cycle 1 - 2 - 4 - 1 - 2 - 4 ... - it has the form of a tree (without cycles). We will not consider the trivial cycle, and we start the graph with node 4, the '''root'''.
When n ≡ 0 mod 3, the path will continue with duplications only, since these maintain the divisibility by 3.
Moreover, another trivial type of path starts when m ≡ 0 mod 3. We call such a path  a ''sprout'', and it contains duplications only. A sprout must be added to the graph for any node divisible by 3, therefore we will not consider them for the moment.
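The reverse steps can be sketched in Python as follows. The names <code>graph_children</code>, <code>forward_step</code> and <code>grow</code> are hypothetical. One assumption is added here: the division step is only taken when (n - 1) / 3 is odd, that is when n ≡ 4 mod 6, so that every edge is the reverse of a real Collatz step. The text above only states the weaker condition n ≡ 1 mod 3.

```python
def graph_children(n):
    """Nodes reached from n by one reverse step, leading away from the root."""
    children = [2 * n]                 # duplication is always possible
    # Assumption: require n ≡ 4 mod 6, i.e. n ≡ 1 mod 3 with an odd quotient.
    # n > 4 keeps the trivial cycle 4 - 2 - 1 out of the graph.
    if n > 4 and n % 6 == 4:
        children.append((n - 1) // 3)
    return children


def forward_step(n):
    """One ordinary Collatz step, used to cross-check the reverse steps."""
    return n // 2 if n % 2 == 0 else 3 * n + 1


def grow(depth):
    """Breadth-first levels of the graph, starting at the root 4."""
    levels = [[4]]
    for _ in range(depth):
        levels.append([c for n in levels[-1] for c in graph_children(n)])
    return levels


for level in grow(5):
    print(level)
```

The last level printed contains the node 21, a multiple of 3; from there on only duplications are possible.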
 
===Graph Operations===
The conjecture claims that the graph contains all numbers, and that - except for the leading cycle 1 - 2 - 4 - 1 - 2 - 4 ... - it has the form of a tree without cycles.
Following [http://dx.doi.org/10.1155/2014/756917 Trümper], we use abbreviations for the elementary operations which transform a node (element, number) in the Collatz graph according to the following table (T1):
 
Straightforward visualizations of the Collatz graph show no obvious structure. The sequences for the first dozen of start values seem to be rather harmless, but the sequence for 27 suddenly has 112 elements.
==The 3x+1 Story==
Many years ago there was a big country with a ''capital'' city and infinitely many other, numbered ''locations''. The capital (with number 1) had incorporated locations 2 and 4. There were ''towns'' which had numbers of the form ''6 * n - 2'', and (less interesting) ''villages''.  
===Road net===
The country had an established net of one-way ''roads'' between the locations. Each location ''x'' had between 2 and 4 roads coming from a neighbour location or leading to one, which were named as follows:
{| class="wikitable" style="text-align:center"
{| class="wikitable" style="text-align:center"
!Name   !! Mnemonic  !! Direction    !!  Neighbour location    !! Condition           
!Name     !! Mnemonic  !! Distance to root !!  Mapping            !! Condition           
|-
| d      || down      || -1            ||  m ↦ m / 2          || m ≡ 0 mod 2 
|-
|-
| d      || down      || east -> west ||  y = x / 2              || x ≡ 0 mod 2
| u      || up        || -1            ||  m ↦ 3 * m + 1      || (m ≡ 1 mod 2)             
|-
|-
| u      || up        || east -> west ||  y = 3 * x + 1         || (none)             
| s := ud || spike    || -2            ||  m ↦ (3 * m + 1) / 2 || m ≡ 1 mod 2
|-
|-
| δ|| divide    || west -> east ||  y = (x - 1) / 3       || x ≡ 1 mod 3     
| δ || divide    || +1            ||  m ↦ (m - 1) / 3     || m ≡ 1 mod 3     
|-
|-
| µ|| multiply  || west -> east ||  y = x * 2             || (none)
| µ || multiply  || +1            ||  m ↦ m * 2           || (none)
|-
| σ := δµ|| squeeze || +2 ||  m ↦ ((m - 1) / 3) * 2 || m ≡ 1 mod 3
|}
|}
Some villages (those numbered x &#x2261; 0 mod 3) were lined up on a straight path leading to the far east. Other villages (numbered x = 2<sup>n</sup>) lay on a straight path directly leading to the capital.
We will mainly be interested in the reverse mappings (denoted with greek letters) which move away from the root of the graph.
===The Problem===
===3-by-2 Replacement===
Over time, people in many locations found out that they could travel to the capital using this road net, but they were not happy with it. While a path of 11 roads had to be used to travel from village 26 to the capital, people of village 27 had to use a path of 112 roads to get there. There were rumors of distant villages which were not connected to the road net at all. They had the same 2-4 roads, but these would form a cycle.
The &sigma; operation, applied to numbers of the form 6 * m - 2, has an interesting property:
(6 * (3 * n) - 2) &sigma; = 4 * 3 * n - 2 =  6 * (2 * n) - 2
In other words, as long as m contains a factor 3, the &sigma; operation maintains the form 6 * x - 2, and it  replaces the factor 3 by 2 (it "squeezes" a 3 into a 2). In the opposite direction, the s operation replaces a factor 2 in m by 3.
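The replacement property is easy to check numerically. The following sketch assumes nothing beyond the table entry for &sigma;; the function name <code>sigma</code> is chosen here for illustration.

```python
def sigma(m):
    """Squeeze step: m -> ((m - 1) / 3) * 2, only defined for m ≡ 1 mod 3."""
    if m % 3 != 1:
        raise ValueError("sigma needs m ≡ 1 mod 3")
    return (m - 1) // 3 * 2


# 160 = 6 * 27 - 2 contains three factors 3, so sigma can be applied three
# times while the form 6 * x - 2 is kept: 27 -> 18 -> 12 -> 8.
chain = [160]
for _ in range(3):
    chain.append(sigma(chain[-1]))
print(chain)                              # values 6 * x - 2
print([(v + 2) // 6 for v in chain])      # the corresponding x
```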
<!--
=== Trivial paths===
There are two types of paths whose descriptions are very simple:
(n = 2<sup>k</sup>) ddd ... d 8 d 4 d 2 d 1  - powers of 2
(n &#x2261; 0 mod 3) uuu ... u (n * 2<sup>k</sup>) ... - multiples of 3
===Kernels===
By the ''kernel'' of a number n = 6 * m - 2 we denote the "2-3-free" factor of m, that is the factor which remains when all powers of 2 and 3 have been removed from m.
* The kernel is not affected by &sigma; and s operations.
-->
===Motivation: Patterns in sequences with the same length===
A closer look at the Collatz sequences shows a lot of pairs of adjacent start values which have the same sequence length, for example (from [https://oeis.org/A070165 OEIS A070165]):
142/104: 142 d  71 u 214 d 107 u 322 d 161 u 484 d  242 d 121 u 364 ] 182, 91, ... 4, 2, 1
143/104: 143 u 430 d 215 u 646 d 323 u 970 d 485 u 1456 d 728 d 364 ] 182, 91, ... 4, 2, 1
            +1  *6+4    +1  *6+4    +1  *6+4    +1  *6+4  *6+2    +0    +0 ...
The third line tells how the second line could be computed from the first.
Walking from right to left, the step pattern is:
&delta; &micro; &micro; &delta; &micro; &delta; &micro; &delta; &micro;
&micro; &micro; &delta; &micro; &delta; &micro; &delta; &micro; &delta;
The alternating pattern of operations can be continued to the left with 4 additional pairs of steps:
  q? u [ 62 d  31 u  94 d  47 u 142 d ...
126 d [ 63 u 190 d  95 u 286 d 143 u ...
        +1  *6+4    +1  *6+4   +1 
The pattern stops here since there is no number q such that q * 3 + 1 = 62.  
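The example can be reproduced with a short Python sketch. The helper <code>collatz_sequence</code> is a hypothetical name. The code only checks the offsets shown in the third line of the example and the common entry point 364; it does not claim anything about other pairs of start values.

```python
def collatz_sequence(start):
    """Return the Collatz sequence from start down to 1 (both inclusive)."""
    seq = [start]
    while seq[-1] != 1:
        n = seq[-1]
        seq.append(n // 2 if n % 2 == 0 else 3 * n + 1)
    return seq


a = collatz_sequence(142)
b = collatz_sequence(143)

# First element of the upper sequence that also occurs in the lower one.
common = set(b)
merge = next(x for x in a if x in common)

# Offsets of the third line: +1 and *6+4 alternate, then *6+2, then +0.
offsets_ok = (
    all(b[k] == a[k] + 1 for k in (0, 2, 4, 6))
    and all(b[k] == 6 * a[k] + 4 for k in (1, 3, 5, 7))
    and b[8] == 6 * a[8] + 2
)
print(merge, a.index(merge), b.index(merge), len(a) == len(b), offsets_ok)
```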


A better transportation network was highly desired, and there were even prizes set up for a solution. Many proposals were made, but none was accepted.
==Segment Construction==
===The Railway Proposal===
These patterns lead us to the construction of special subsets of paths in the Collatz graph which we call ''segments''. They lead away from the root, and they always start with a node m &#x2261; -2 mod 6. Then they split and follow two subpaths in a prescribed sequence of operations. The segment construction process is stopped when the next node in one of the two subpaths becomes divisible by 3, or when a &delta; operation is no longer possible. We assemble the segments as rows of an infinite array <nowiki>C[i,j]</nowiki>, the so-called ''segment directory''.
It was known that often two locations were connected to the capital by the same number of roads, for example 142 and 143 both needed a path of 104 roads, and starting at town 364, they could even use the same path.
: Informally, and in the two examples above, we consider the terms between the square brackets. For the moment, we only take those which are &#x2261; 4 mod 6 (these give the "compressed" segments; below there are also "detailed" segments where we take all terms). We start at the right and with the lower line, and we interleave the terms &#x2261; 4 mod 6 of the two lines to get a segment.
 
The columns in one row i of the array C are constructed as described in the following table (T2):
A retired engineer had the idea to make use of this fact, and he came up with a rather strange proposal. Each town ''x'' should start to build a double-track ''railway'' alongside the road net, but only special types of roads would be followed.
{| class="wikitable" style="text-align:left"
====Railway construction rules====
!Column j                !! Operation                !! Formula                 !! Condition            !! Sequence         
* For the ''northern track'':
|-
** &micro;&micro; leading to town y = 4 * x, then
| 1 ||                                               || 6 * i - 2               ||                      || 4, 10, 16, 22, 28, ...
** &delta; (if present) &micro; leading to town z (which, compared with y, has one factor 3 replaced by 2),
|-
** additional &delta;&micro; paths visiting other towns as long as &delta; is possible.
| 2 || <nowiki>C[i,1]</nowiki> &micro;&micro;        || 24 * (i - 1)   + 16      ||                      || 16, 40, 64, 88, 112, ...
* For the ''southern track'':
|-
** &delta;, which is always possible out of a town,
| 3 || <nowiki>C[i,1]</nowiki> &delta;&micro;&micro;  || 24 * (i - 1) / 3 +  4    || i &#x2261; 1 mod 3  || 4, 28, 52, 76, 100, ...
** &micro;&micro; leading to
|-
*** a village &#x2261; 0 mod 3, in which case the construction stops, otherwise
| 4 || <nowiki>C[i,2]</nowiki> &sigma;                || 48 * (i - 1) / 3 + 10    || i &#x2261; 1 mod 3  || 10, 58, 106, 154, ...
*** a town z,
|-
** additional &delta;&micro; paths visiting other towns as long as &delta; is possible.
| 5 || <nowiki>C[i,3]</nowiki> &sigma;               || 48 * (i - 7) / 9 + 34    || i &#x2261; 7 mod 9  || 34, 82, 130, 178, ...  
It was obvious that such railways could be started from every town, and that there would be only a finite number of &delta;&micro; paths (until all factors of 3 were exhausted).  
====Railway directory====
Therefore the engineer could attach a complete '''[http://www.teherba.org/fasces/oeis/collatz/roads.html railway directory]''' to his proposal. There, any double-track railway started at the blue town in column 1, followed in the same row by the northern track. The southern track was shown in the next row. All towns were highlighted.
 
The proposal also included a number of claims which should convince people that the railway net would more reliably transport them from any location to the capital.
====Connectivity for villages====
 
====Street construction rules====
The following table shows the rules for the construction of the first 9 columns <nowiki>S[n,1..9]</nowiki> of row n (n = 1, 2, 3 ...) in the street directory:
{| class="wikitable" style="text-align:left"                                                    
|-     
!Column!! Steps !! Expression !! Formula               !! Condition            !! Coverage
|-                                                                                                                                  
| 1||     ||     || 6n-2                             ||                      || 4,10,16,22 mod 24
|-  
| 2|| d    || (6n-2-1)/3 || 2n-1                        ||                      || all odd numbers
|-                                                                           
| 3|| m    || (6n-2)*2 || 12n-4                        ||                      || 8,20 mod 24
|-                                                                           
| 4|| dm  || ((6n-2-1)/3)*2 || 4n-2                    ||                      || 2,6,10,14,18,22 mod 24
|-                                                                            
| 5|| mm  || (6n-2)*2*2 || 24n-8                      ||                      || 16 mod 24
|-                                                                               
| 6|| dmm  || ((6n-2-1)/3)*2*2 || 8n-4                  ||                      || 4,12,20 mod 24
|-                                                                               
| 7|| mmd  || ((6n-2)*2*2-1)/3 || 8n-3                  ||                      || 5,13,21 mod 24
|-
| 8|| dmmd || (((6n-2-1)/3)*2*2-1)/3 || (8n-5)/3        || n &#x2261; 1 mod 3  || (1,9,17,25 ...)
|-  
| 9|| mmdm || (((6n-2)*2*2-1)/3)*2  || 16n-6            || n &#x2261; 1 mod 3  || (10,58,106,154, ...)
|}
The first 6 columns of the table cover the odd numbers and all numbers &#x2261; 2,4,6,8,10,12,14,16,18,20,22 mod 24.
 
It is ''not shown'' so far that all multiples of 24 are contained in the table.
<!-- ???
All odd multiples of 3 are contained in column 2. All multiples of 24 can be reached by duplicating them 3 times (3 m-steps).
-->
====Highlighted numbers====
The numbers of the form 6p-2 were highlighted in the example above. They have the special property that, when p &gt; 0 and p &#x2261; 0 mod 3, a dm-step yields a number of the same form, but with one factor 3 in p replaced by 2:
6(3q)-2 dm ((6(3q)-2)-1)/3*2 = (6q-1)*2 = 6(2q)-2
This implies that a dm-step decreases any number by about one third.
====Street lengths &gt; 7====
* Columns 4(k+1)+1 result by dm-steps from columns 4k+1 for k=1,2,... (and so do columns 4(k+1)+2 result from columns 4k+2). Sequences of dm-steps decrease the numbers. Therefore the lengths of all streets are finite.
* Column 5 is 24n-8, and the lengths depend on the power of 3 contained in that n.
<!--
* The street lengths show a repeating pattern for the start values mod 54. The fixed lengths 3, 4, 5 can probably be explained from the street construction rule.
{| class="wikitable" style="text-align:center"
| 4 mod 54
| 10 mod 54
| 16 mod 54
| 22 mod 54
| 28 mod 54
| 34 mod 54
| 40 mod 54
| 46 mod 54
| 52 mod 54
|-
|-
| 5
| 6 || <nowiki>C[i,2]</nowiki> &sigma;&sigma;        || 96 * (i - 7) / 9 + 70    || i &#x2261; 7 mod 9 || 70, 166, 262, 358, ...
| 3
| 3
| 4
| 3
| 3
| n
| 3
| 3
|}
-->
* At the starting values 4, 40, 364, 3280, 29524 ([http://oeis.org/A191681 OEIS A191681]) the street lengths  have high values 5, 9, 13, 17, 21 which did not occur before. Those starting values are (9<sup>n+1</sup> - 1) / 2, or 4 * Sum(9<sup>i</sup>, i=0..n).
====Coverage====
The elements of the streets are strongly interconnected, and the table "obviously" shows all positive integers which are not multiples of 24:
{| class="wikitable"
| r<sub>1</sub> &#x2261; 4 mod 6
| style="text-align:right" | &#x2261; 4,10,16,22 mod 24
|-
|-
| r<sub>2</sub> &#x2261; 1 mod 2
| 7 || <nowiki>C[i,3]</nowiki> &sigma;&sigma;        || 96 * (i - 7) / 27 + 22  || i &#x2261; 7 mod 27  || 22, 118, 214, 310, ...
| all odd numbers
|-
|-
| r<sub>3</sub> &#x2261; 8 mod 12
| 8 || <nowiki>C[i,2]</nowiki> &sigma;&sigma;&sigma; || 192 * (i - 7) / 27  + 46 || i &#x2261; 7  mod 27 || 46, 238, 430, 622, ...
| style="text-align:right" | &#x2261; 8,20 mod 24
|-
|-
| r<sub>4</sub> &#x2261; 2 mod 4
| 9 || <nowiki>C[i,3]</nowiki> &sigma;&sigma;&sigma; || 192 * (i - 61) / 81 + 142|| i &#x2261; 61 mod 81 || 142, 334, ...       
| style="text-align:right" | &#x2261; 2,6,10,14,18,22 mod 24
|-
|-
| r<sub>5</sub> &#x2261; 16 mod 24
| ... || ... || ... || ... || ...
| style="text-align:right" | &#x2261; 16 mod 24
|-
|-
| r<sub>6</sub> &#x2261; 4 mod 8
| style="text-align:right" | &#x2261; 4,12,20 mod 24
|}
|}
The first column(s) <nowiki>C[i,1]</nowiki> will be denoted as the '''left side''' of the segment (or of the whole directory), while the columns <nowiki>C[i,j], j &gt; 1</nowiki> will be the '''right part'''. The first few lines of the segment directory are the following:


 
<table style="border-collapse: collapse; ">
So if we can show that we reach all start values &#x2261; 4 mod 6, we have a proof that all positive integers are reached.
<tr>
 
<td style="text-align:center"> </td>
Starting with 4, it seems possible that a continuous expansion of all numbers &#x2261; 4 mod 6 into streets would finally yield all streets up to some start value. Experiments show that there are limits for the numbers involved. Streets above the ''clamp'' value are not necessary in order to obtain all streets below and including the ''start'' value:
<td style="text-align:center">1</td>
{| class="wikitable"
<td style="text-align:center">2</td>
! start value
<td style="text-align:center">3</td>
! clamp value
<td style="text-align:center">4</td>
|- style="text-align:right"
<td style="text-align:center">5</td>
| 4  || 4
<td style="text-align:center">6</td>
|- style="text-align:right"
<td style="text-align:center">7</td>
| 40 || 76
<td style="text-align:center">8</td>
|- style="text-align:right"
<td style="text-align:center">9</td>
| 364 || 2308
<td style="text-align:center">10</td>
|- style="text-align:right"
<td style="text-align:center">11</td>
| 3280 || 143248
<td style="text-align:center">...</td>
|}
<td style="text-align:center">2*j</td>
==Subset table S==
<td style="text-align:center">2*j+1</td>
We may build a derived table from the table of streets. We take columns r<sub>0</sub> and r<sub>5</sub> ff., and therein we keep only the highlighted entries (those which are &#x2261; 4 mod 6), add 2 to them and divide them by 6. The resulting subset table S starts as follows:
</tr>
  s0  s1  s2  s3  s4  s5  s6  s7  s8  ...
<tr>
  n  len 
<td style="border:1px solid gray;text-align:right" >&nbsp;&nbsp;i&nbsp;&nbsp;</td>
  1   3    3    1    2
<td style="border:1px solid gray;text-align:right" >6*i&#8209;2</td>
  2   1    7
<td style="border:1px solid gray;text-align:right" >&micro;&micro;</td>
  3   1  11
<td style="border:1px solid gray;text-align:right" >&delta;&micro;&micro;</td>
  4   3  15    5   10
<td style="border:1px solid gray;text-align:right" >&micro;&micro;&sigma;</td>
  5   1  19
<td style="border:1px solid gray;text-align:right" >&delta;&micro;&micro;&sigma;</td>
  6   1  23
<td style="border:1px solid gray;text-align:right" >&micro;&micro;&sigma;&sigma;</td>
  7   7  27    9  18    6  12    4    8
<td style="border:1px solid gray;text-align:right" >&delta;&micro;&micro;&sigma;&sigma;</td>
  8  1   31
<td style="border:1px solid gray;text-align:right" >&micro;&micro;&sigma;<sup>3</sup></td>
  9  1   35
<td style="border:1px solid gray;text-align:right" >&delta;&micro;&micro;&sigma;<sup>3</sup></td>
  10  3  39  13  26
<td style="border:1px solid gray;text-align:right" >&micro;&micro;&sigma;<sup>4</sup></td>
  11  1   43
<td style="border:1px solid gray;text-align:right" >&delta;&micro;&micro;&sigma;<sup>4</sup></td>
12  1   47
<td style="border:1px solid gray;text-align:right" >...</td>
  13  3  51  17  34
<td style="border:1px solid gray;text-align:right" >&micro;&micro;&sigma;<sup>j-1</sup></td>
14  1   55
<td style="border:1px solid gray;text-align:right" >&delta;&micro;&micro;&sigma;<sup>j-1</sup></td>
  15  1   59
</tr>
16  5  63  21  42  14  28
<tr><td>&nbsp;&nbsp;1&nbsp;&nbsp;</td><td style="border:1px solid gray;text-align:right" >&nbsp; 4&nbsp;</td><td style="border:1px solid gray;text-align:right" >&nbsp;16&nbsp;</td><td style="border:1px solid gray;text-align:right"  title="1.0:0" id="4" class="d4 bor seg">&nbsp;4&nbsp;</td><td style="border:1px solid gray;text-align:right" title="1.1:0" id="10" class="d4 bor seg">&nbsp;10&nbsp;</td></tr>
...
<tr><td>&nbsp;&nbsp;2&nbsp;&nbsp;</td><td style="border:1px solid gray;text-align:right" >&nbsp;10&nbsp;</td><td style="border:1px solid gray;text-align:right" >&nbsp;40&nbsp;</td></tr>
This table can be described by simple rules which are hopefully provable from the construction rule for the streets:  
<tr><td>&nbsp;&nbsp;3&nbsp;&nbsp;</td><td style="border:1px solid gray;text-align:right" >&nbsp;16&nbsp;</td><td style="border:1px solid gray;text-align:right" >&nbsp;64&nbsp;</td></tr>
* s<sub>2</sub> is always s<sub>0</sub> * 4 - 1.  
<tr><td>&nbsp;&nbsp;4&nbsp;&nbsp;</td><td style="border:1px solid gray;text-align:right" >&nbsp;22&nbsp;</td><td style="border:1px solid gray;text-align:right" >&nbsp;88&nbsp;</td><td style="border:1px solid gray;text-align:right"  title="5.0:0" id="28" class="d4 bor seg">&nbsp;28&nbsp;</td><td style="border:1px solid gray;text-align:right"  title="5.1:0" id="58" class="d4 bor seg">&nbsp;58&nbsp;</td></tr>
* When s<sub>2</sub> &#x2261; 0 mod 3, the following columns s<sub>3</sub>, s<sub>4</sub> ... are obtained by an alternating sequence of steps
<tr><td>&nbsp;&nbsp;5&nbsp;&nbsp;</td><td style="border:1px solid gray;text-align:right" >&nbsp;28&nbsp;</td><td style="border:1px solid gray;text-align:right" >&nbsp;112&nbsp;</td></tr>
** s<sub>i+1</sub> = s<sub>i</sub> / 3 and
<tr><td>&nbsp;&nbsp;6&nbsp;&nbsp;</td><td style="border:1px solid gray;text-align:right" >&nbsp;34&nbsp;</td><td style="border:1px solid gray;text-align:right" >&nbsp;136&nbsp;</td></tr>
** s<sub>i+2</sub> = s<sub>i+1</sub> * 2,
<tr><td>&nbsp;&nbsp;7&nbsp;&nbsp;</td><td style="border:1px solid gray;text-align:right" >&nbsp;40&nbsp;</td><td style="border:1px solid gray;text-align:right" >&nbsp;160&nbsp;</td><td style="border:1px solid gray;text-align:right"  title="1.0:2" id="52" class="d4 bor seg">&nbsp;52&nbsp;</td><td style="border:1px solid gray;text-align:right"  title="1.1:2" id="106" class="d4 bor seg">&nbsp;106&nbsp;</td><td style="border:1px solid gray;text-align:right" title="1.1:1" id="34" class="d4 bor seg">&nbsp;34&nbsp;</td><td style="border:1px solid gray;text-align:right" title="1.2:1" id="70" class="d4 bor seg">&nbsp;70&nbsp;</td><td style="border:1px solid gray;text-align:right" title="1.2:0" id="22" class="d4 bor seg">&nbsp;22&nbsp;</td><td style="border:1px solid gray;text-align:right" title="1.3:0" id="46" class="d4 bor seg">&nbsp;46&nbsp;</td></tr>
** until all factors 3 in s<sub>2</sub> are replaced by factors 2.  
</table>
 
There is a more elaborate '''[http://www.teherba.org/fasces/oeis/collatz/comp.html segment directory] with 5000 rows'''.
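The rows of the compressed segment directory can be regenerated with a small Python sketch. The function names <code>sigma</code> and <code>compressed_segment</code> are hypothetical, and the stopping rule used here (stop as soon as the next node is no longer of the form 6 * n - 2) is an interpretation of the construction described above, not code taken from the linked directory.

```python
def sigma(m):
    """Squeeze step: m -> ((m - 1) / 3) * 2, used only for m ≡ 4 mod 6."""
    return (m - 1) // 3 * 2


def compressed_segment(i):
    """Row i of the segment directory: left side followed by the right part."""
    left = 6 * i - 2
    north = 4 * left                  # C[i,2] = C[i,1] with two duplications
    south = 4 * ((left - 1) // 3)     # C[i,3] = C[i,1] divided, then duplicated twice
    row = [left, north]
    # Interleave both subpaths while the nodes keep the form 6 * n - 2.
    while south % 6 == 4:
        row.append(south)
        north = sigma(north)
        if north % 6 != 4:
            break
        row.append(north)
        south = sigma(south)
    return row


for i in range(1, 8):
    print(i, compressed_segment(i))
```

The printed rows can be compared with the first seven rows of the table above.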
===Does S contain all positive integers?===
===Properties of the segment directory===
The answer is yes. As above, we can regard the increments in successive columns:
We make a number of claims for segments:
{| class="wikitable"
* (C1) All nodes in the segment directory are of the form 6 * n - 2.
| s<sub>s</sub> &#x2261; 3 mod 4
: This follows from the formula for columns <nowiki>C[i,1..3]</nowiki>, and for any higher column numbers from the 3-by-2 replacement property of the &sigma; operation.
| style="text-align:right" | half of the odd numbers
* (C2) All segments have a finite length.
: At some point the &sigma; operations will have replaced all factors 3 by 2.
* (C3) All nodes in the right part of a segment have the form 6 * (3<sup>n</sup> * 2<sup>m</sup> * f) - 2 with the same "3-2-free" factor f.
: This follows from the operations for columns <nowiki>C[i,1..3]</nowiki>, and from the fact that the &sigma; operation maintains this property.
* (C4) All nodes in the right part of a particular segment are
** different among themselves, and
** different from the left side of that segment (except for the first segment for the root 4).
: For <nowiki>C[i,1..2]</nowiki> we see that the values modulo 24 are different. For the remaining columns, we see that the exponents of the factors 2 and 3 are different. They are shifted by the &sigma; operations, but they alternate, for example (in the segment with left part 40):
160 = 6 * (3<sup>3</sup> * 2<sup>0</sup> * 1) - 2
  52 = 6 * (3<sup>2</sup> * 2<sup>0</sup> * 1) - 2
106 = 6 * (3<sup>2</sup> * 2<sup>1</sup> * 1) - 2
  34 = 6 * (3<sup>1</sup> * 2<sup>1</sup> * 1) - 2
  70 = 6 * (3<sup>1</sup> * 2<sup>2</sup> * 1) - 2
  22 = 6 * (3<sup>0</sup> * 2<sup>2</sup> * 1) - 2
  46 = 6 * (3<sup>0</sup> * 2<sup>3</sup> * 1) - 2
* (C5) There is no cycle in a segment (except for the first segment for the root 4).
===Segment Lengths===
The segment directory is obviously very structured. The lengths of the compressed segments follow the pattern
4 2 2 4 2 2 L<sub>1</sub> 2 2 4 2 2 4 2 2 L<sub>2</sub> 2 2 4 2 2 ...
with two ''fixed lengths'' 2 and 4 and some ''variable lengths'' L<sub>1</sub>, L<sub>2</sub> ... &gt; 4. For the left parts 4, 40, 364, 3280, 29524 ([http://oeis.org/A191681 OEIS A191681]), the segment lengths have high values 4, 8, 12, 16, 20 which did not occur before. Those left parts are (9<sup>n+1</sup> - 1) / 2, or 4 * Sum(9<sup>i</sup>, i = 0..n).
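The length pattern can be checked with the following Python sketch. It rebuilds the compressed segments with a hypothetical helper <code>compressed_segment</code> (an interpretation of the construction rules, not the code behind the linked directory) and counts the left side as part of the length.

```python
def sigma(m):
    """Squeeze step: m -> ((m - 1) / 3) * 2, used only for m ≡ 4 mod 6."""
    return (m - 1) // 3 * 2


def compressed_segment(i):
    """Row i of the segment directory, left side included."""
    left = 6 * i - 2
    north, south = 4 * left, 4 * ((left - 1) // 3)
    row = [left, north]
    while south % 6 == 4:
        row.append(south)
        north = sigma(north)
        if north % 6 != 4:
            break
        row.append(north)
        south = sigma(south)
    return row


# Lengths of the first 15 rows.
first_lengths = [len(compressed_segment(i)) for i in range(1, 16)]

# Left parts (9^(n+1) - 1) / 2 and the lengths of their segments.
left_parts = [(9 ** (n + 1) - 1) // 2 for n in range(5)]
record_lengths = [len(compressed_segment((left + 2) // 6)) for left in left_parts]

print(first_lengths)
print(left_parts, record_lengths)
```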
===Coverage of the Right Part===
We now examine the modular conditions which result from the segment construction table in order to find out how the numbers of the form 6 * n - 2 are covered by the right part of the segment directory, as shown in the following table (T3):
{| class="wikitable" style="text-align:left"
!Columns j  !! Covered !! Remaining         
|-
| 2-3 ||   4, 16 mod 24 || 10, 22, 34, 46 mod 48
|-
|-
| s<sub>3</sub> &#x2261; 1 mod 4
| 4-5 ||  10, 34 mod 48 || 22, 46, 70, 94 mod 96
| style="text-align:right" | other half of odd numbers
|-
|-
| s<sub>4</sub> &#x2261; 2 mod 8
| 6-7 ||  70, 22 mod 96 || 46, 94, 142, 190 mod 192
| style="text-align:right" | &#x2261; 2,10 mod 16
|-
|-
| s<sub>5</sub> &#x2261; 6 mod 8
| 8-9 ||  46, 142 mod 192 || 94, 190, 286, 382 mod 384
| style="text-align:right" | &#x2261; 6,14 mod 16
|-
|-
| s<sub>6</sub> &#x2261; 12 mod 16
| ... ||  ... || ...
| style="text-align:right" | &#x2261; 12 mod 16
|}
We can always exclude the first and the third element remaining so far by looking in the next two columns of segments with sufficient length.
* (C6) There is no limit on the length of a segment.
: We only need to take a segment which, in its right part, has a factor of 3 with a sufficiently high power, and the &sigma; operations will stretch out the segment accordingly.
Therefore we can continue the modulus table above indefinitely, which leads us to the claim:
* (C7) '''All numbers of the form 6 * n - 2 occur exactly once''' in the right part of the segment directory, and once as a left side. There is a bijective mapping between the left sides and the elements of the right parts.
: The sequences defined by the columns in the right part all have different modulus conditions. Therefore they are all disjoint.
==Forest directory==
We construct a ''forest directory'' F which initially is a copy of the segment directory C. F lists all the small trees with two branches which are represented by the right parts in the segment directory. These trees are ''labelled'' by the left sides.
Then we start a ''gedankenexperiment'' analogous to [https://en.wikipedia.org/wiki/Hilbert%27s_paradox_of_the_Grand_Hotel Hilbert's hotel]. We consider simultaneously all rows i &gt; 1 (omitting the root segment) in F which fulfill some modularity condition (the ''source'' row in F), and we ''attach'' (identify, connect) them to their unique occurrence in the right part of F (''target'' row and column).
* (C8) The attachment process does not create any new cycle (in addition to the one in the root segment).
: Let tree t1 with label n1 and right part R1 be attached to node n1 in the right part R2 of the unique tree t2 which is labelled by n2. t1 and t2 are disjoint trees, therefore the result of such a single attachment step is a tree again (t2', still labelled by n2).
: A technical implementation of the process (which is impossible for infinite sets) would perhaps replace the target node by a pointer to the source tree.
We ''sieve'' the trees in F: Whenever we attach t1 to n1 in t2, we remove the row for n1 resp. t1 in F.  The attachment rules are shown in the following table '''(T4)''':
{| class="wikitable" style="text-align:left"
|-
|-
| s<sub>7</sub> &#x2261; 4 mod 16
!Source Row i        !! First left<br>sides  !! Target<br>Row !! Target<br>Column      !! Remaining Rows                !! Fraction
| style="text-align:right" | &#x2261; 4 mod 16
|-                     
|i &#x2261; 3 mod 4  || 16, 40, 64, 88 ...    || ((i -  3) /  4) * 3<sup>0</sup> + 1 || 2||i &#x2261; 0, 1, 2 mod 4      ||3/4
|-                                                           
|i &#x2261; 1 mod 4  || (4), 28, 52, 76 ...  ||((i -  1) /  4) * 3<sup>1</sup> + 1 || 3||i &#x2261; 0, 2, 4, 6 mod 8    ||1/2
|-                                                           
|i &#x2261; 2 mod 8  || 10, 58, 106, 154 ...  || ((i -  2) /  8) * 3<sup>1</sup> + 1 || 4||i &#x2261; 0, 4, 6 mod 8      ||3/8
|-                                                           
|i &#x2261; 6 mod 8  || 34, 82, 130, 178 ...  || ((i -  6) /  8) * 3<sup>2</sup> + 7 || 5||i &#x2261; 0, 4, 8, 12 mod 16  ||1/4
|-                                                           
|i &#x2261; 12 mod 16|| 70, 166, 262, 358 ... || ((i - 12) / 16) * 3<sup>2</sup> + 7 || 6||i &#x2261; 0, 4, 8 mod 16      ||3/16
|-                                                           
|i &#x2261; 4  mod 16|| 22, 118, 214, 310 ... || ((i -  4) / 16) * 3<sup>3</sup> + 7 || 7||i &#x2261; 0, 8, 16, 24 mod 32 ||1/8
|-                                                           
|i &#x2261; 8  mod 32|| 46, 238, 430, 622 ... || ((i -  8) / 32) * 3<sup>3</sup> + 7 || 8||i &#x2261; 0, 16, 24 mod 32    ||3/32
|-                                                           
|i &#x2261; 24 mod 32|| 142, 334 ...          || ((i - 24) / 32) * 3<sup>4</sup> + 61|| 9||i &#x2261; 0, 16, 32, 48 mod 64||1/16
|-                                                            
|i &#x2261; 48 mod 64|| 286, 670 ...          || ((i - 48) / 64) * 3<sup>4</sup> + 61||10||i &#x2261; 0, 16, 32 mod 64    ||3/64
|-                                                           
|i &#x2261; 16 mod 64|| 94, 478 ...          || ((i - 16) / 64) * 3<sup>5</sup> + 61||11||i &#x2261; 0, 32, 64, 96 mod 128 ||1/32
|-                     
| ...                ||  ...                              ||  ... || ...                      || ... || ...
|-                     
|}
The further rows of this table follow the same pattern: the residues modulo 2<sup>k</sup> in the first column are 3 * 2<sup>k-2</sup> and 1 * 2<sup>k-2</sup> in an alternating sequence. The additive constants in the target row formulas are the indexes of the variable length segments with left parts (4), 40, 364, 3280, 29524 ([http://oeis.org/A191681 OEIS A191681]) mentioned above. They are repeated 4 times since the corresponding lengths "jump" by 4.
It should be noted that it does not matter in which order the single attachment steps are performed.
===Tree connectivity===
* (C9) The tree of any source row with arbitrarily large left side n1 will eventually be attached to another tree which contains n1 in its node set.
: Similar to the arguments for coverage of segments, we have to apply the rules from table T4 one after the other up to a sufficiently high row (corresponding to a sufficiently long variable segment).
* (C9a) In the end, all subtrees will be attached to the root segment.
: Suppose there is a set U of subtrees left which are not connected to the root segment. We consider the subtree t in U  with the smallest label. We know that it should have been attached - to where? Either to the root segment, or to another tree in U. ''Is this  already a contradiction?''
:If not, then in both cases the number of trees is reduced by 1. We repeat the argument until there is only one subtree t left in U. The label of t is not contained in t's node set, so it must be contained in the node set of the tree already attached to the root segment, and that is the node where t must also be attached.
: And if U is infinite? Then the two trees attached to the root segment would have a finite number of nodes. Then they have leaves which have no tree attached to them. But by T4 we could always determine the subtree which should have been attached, so we have a contradiction.
We call the final tree resulting from the sieving process the '''compressed tree'''.
==The Collatz Tree==
* (C10) The compressed tree is a subgraph of the Collatz graph.
: The edges of the compressed tree carry combined operations &micro;&micro;, &delta;&micro;&micro; and &sigma; = &delta;&micro;.
So far, numbers of the form x &#x2261; 0, 1, 2, 3, 5 mod 6 are missing from the compressed tree.
 
We insert intermediate nodes into the compressed tree by applying operations on the left parts of the segments as shown in the following table (T5):
{| class="wikitable" style="text-align:left"
|-
! Operation            !! Condition            !! Resulting Nodes !! Remaining Nodes
|-
|&delta;              ||                      || 2 * i - 1      || i &#x2261; 0, 2, 6, 8 mod 12
|-
|&micro;              ||                       || 12 * i - 4      || i &#x2261; 0, 2, 6 mod 12
|-
|&delta;&micro;        || i &#x2261; 1, 2 mod 3 || 4 * i - 2      || i &#x2261; 0, 12 mod 24
|-
|&delta;&micro;&micro; || i &#x2261; 2 mod 3    || 8 * i - 4      || i &#x2261; 0 mod 24
|-
|&delta;&micro;&micro;&micro; || i &#x2261; 2 mod 3 || 16 * i - 8  || (none)
|}
This shows that the columns s<sub>4</sub> ... s<sub>7</sub> contain all numbers &#x2261; 2,4,6,10,12,14 mod 16, but those &#x2261; 0,8 mod 16 are missing so far. The ones &#x2261; 8 mod 16 show up in s<sub>8</sub> resp. s<sub>9</sub>, half of the multiples of 16 are in s<sub>10</sub> resp. s<sub>11</sub> but &#x2261; 0,32 mod 64 are missing, etc.  
The first three rows in T5 take care of the intermediate nodes at the beginning of the segment construction with columns 1, 2, 3. Rows 4 and 5 generate the sprouts (starting at multiples of 3) which are not contained in the segment directory. 
 
We call such a construction a ''detailed segment'' (in contrast to the ''compressed segments'' described above).
:: A '''[http://www.teherba.org/fasces/oeis/collatz/rails.html detailed segment directory]''' can be created by the same [https://github.com/gfis/fasces/blob/master/oeis/collatz/collatz_rails.pl Perl program]. In that directory, the two subpaths of a segment are shown in two lines. Only the highlighted nodes are unique.


Since s<sub>2</sub> contains arbitrarily high powers of 3, S has rows of arbitrary length, and for the missing multiples of powers of 2 the exponents can be driven above all limits.
* (C11) The connectivity of the compressed tree remains unaffected by the insertions.
: Thus S contains all positive integers.
* (C12) With the insertions of T5, the compressed tree covers the whole Collatz graph.
===Can S be generated starting at 1?===
* (C13) '''The Collatz graph is a tree''' (except for the trivial cycle 4-2-1).
We ask for an iterative process which starts with the row of S for index 1:
  1:    3    1    2
Then, all additional rows for the elements obtained so far are generated:
  2:    7
  3:  11
These rows are also expanded:
  7:  27    9  18    6  12    4    8
11:  43
Since we want to cover all indexes, we would first generate the rows for lower indexes. This process fills all rows up to s<sub>0</sub> = 13 rather quickly, but the first 27 completely filled rows involve start numbers s<sub>0</sub> up to 1539, and for the first 4831 rows, start values up to 4076811 are involved.

Revision as of 09:25, 7 November 2018

Introduction

Collatz sequences (also called trajectories) are sequences of integer numbers > 0. For any start value > 0 the elements of the sequence are constructed with two simple rules:

  1. Even numbers are halved.
  2. Odd numbers are multiplied by 3 and then incremented by 1.

For decades it has been unknown whether the final cycle 4 - 2 - 1 is always reached for all start values. This problem is the Collatz conjecture, for which the English Wikipedia states:

It is also known as the 3n + 1 conjecture, the Ulam conjecture (after Stanisław Ulam), Kakutani's problem (after Shizuo Kakutani), the Thwaites conjecture (after Sir Bryan Thwaites), Hasse's algorithm (after Helmut Hasse), or the Syracuse problem; the sequence of numbers involved is referred to as the hailstone sequence or hailstone numbers (because the values are usually subject to multiple descents and ascents like hailstones in a cloud), or as wondrous numbers.

Straightforward visualizations of Collatz sequences show no obvious structure. The sequences for the first dozen of start values are rather short, but the sequence for 27 suddenly has 112 elements.
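
The two construction rules can be tried out with a few lines of Python. This is only a minimal sketch; the function name trajectory is an assumption made for illustration and does not come from the article or from the Perl program mentioned later.

```python
def trajectory(start):
    """Return the Collatz sequence from start down to 1 (both included)."""
    seq = [start]
    while seq[-1] != 1:
        m = seq[-1]
        # Rule 1: even numbers are halved; rule 2: odd numbers become 3 * m + 1.
        seq.append(m // 2 if m % 2 == 0 else 3 * m + 1)
    return seq


if __name__ == "__main__":
    for start in (6, 7, 27):
        print(start, len(trajectory(start)))
```

For the start value 27 the list has 112 elements, in agreement with the remark above.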

References

  • Jeffrey C. Lagarias, Ed.: The Ultimate Challenge: The 3x+1 Problem, Amer. Math. Soc., 2010, ISBN 978-0-8218-4940-8. MBK78
  • OEIS A070165: File of first 10K Collatz sequences, ascending start values, with lengths
  • Manfred Trümper: The Collatz Problem in the Light of an Infinite Free Semigroup. Chinese Journal of Mathematics, Vol. 2014, Article ID 756917, 21 p.

Collatz Graph

When all Collatz sequences are read backwards, they form the Collatz graph starting with 1, 2, 4, 8 ... . At each node m > 4 in the graph, the path from the root (4) can be continued

  • always to m * 2, and
  • to (m - 1) / 3 if m ≡ 1 mod 3.

The Collatz conjecture claims that the graph contains all numbers, and that - except for the trivial, leading cycle 1 - 2 - 4 - 1 - 2 - 4 ... - it has the form of a tree (without cycles). We will not consider the trivial cycle, and we start the graph with node 4, the root. Moreover, another trivial type of path starts when m ≡ 0 mod 3. We call such a path a sprout, and it contains duplications only. A sprout must be added to the graph for any node divisible by 3, therefore we will not consider them for the moment.
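
The backward reading of the sequences can be sketched as a breadth-first walk from the root 4. The names predecessors and grow are hypothetical. One deliberate assumption: the sketch tests m ≡ 4 mod 6 instead of m ≡ 1 mod 3, so that (m - 1) / 3 is odd and really maps back to m under the forward rule 3 * x + 1.

```python
def predecessors(m):
    """Nodes that reach m in one forward Collatz step."""
    result = [2 * m]                  # m * 2 is always possible
    if m % 6 == 4:                    # m ≡ 1 mod 3 and (m - 1) / 3 is odd
        result.append((m - 1) // 3)
    return result


def grow(root=4, depth=5):
    """Levels of the reversed graph, starting at the root (trivial cycle skipped)."""
    levels = [[root]]
    for _ in range(depth):
        nxt = []
        for m in levels[-1]:
            # Dropping 1 keeps the walk out of the cycle 4 - 2 - 1.
            nxt.extend(p for p in predecessors(m) if p != 1)
        levels.append(nxt)
    return levels


if __name__ == "__main__":
    for level in grow():
        print(level)
```

The node 3 in the last level is divisible by 3, so only doublings follow from it; this is a sprout in the sense described above.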

Graph Operations

Following Trümper, we use abbreviations for the elementary operations which transform a node (element, number) in the Collatz graph according to the following table (T1):

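{| class="wikitable" style="text-align:left"
|-
! Name !! Mnemonic !! Distance to root !! Mapping !! Condition
|-
| d || down || -1 || m ↦ m / 2 || m ≡ 0 mod 2
|-
| u || up || -1 || m ↦ 3 * m + 1 || (m ≡ 1 mod 2)
|-
| s := ud || spike || -2 || m ↦ (3 * m + 1) / 2 || m ≡ 1 mod 2
|-
| δ || divide || +1 || m ↦ (m - 1) / 3 || m ≡ 1 mod 3
|-
| µ || multiply || +1 || m ↦ m * 2 || (none)
|-
| σ := δµ || squeeze || +2 || m ↦ ((m - 1) / 3) * 2 || m ≡ 1 mod 3
|}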

We will mainly be interested in the reverse mappings (denoted with greek letters) which move away from the root of the graph.
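
The mappings of table T1 translate directly into small functions. The following is a sketch only; the Python names (d, u, s, delta, mu, sigma) are chosen here for illustration, and the conditions of the table are enforced with assertions.

```python
def d(m):
    """down: m -> m / 2, for even m."""
    assert m % 2 == 0
    return m // 2


def u(m):
    """up: m -> 3 * m + 1, for odd m."""
    assert m % 2 == 1
    return 3 * m + 1


def s(m):
    """spike: u followed by d, for odd m."""
    return d(u(m))


def delta(m):
    """divide: m -> (m - 1) / 3, for m ≡ 1 mod 3."""
    assert m % 3 == 1
    return (m - 1) // 3


def mu(m):
    """multiply: m -> m * 2, no condition."""
    return 2 * m


def sigma(m):
    """squeeze: delta followed by mu, for m ≡ 1 mod 3."""
    return mu(delta(m))


if __name__ == "__main__":
    print(sigma(16), s(7), u(delta(40)), d(mu(9)))
```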

3-by-2 Replacement

The σ operation, applied to numbers of the form 6 * m - 2, has an interesting property:

(6 * (3 * n) - 2) σ = 4 * 3 * n - 2 =  6 * (2 * n) - 2

In other words, as long as m contains a factor 3, the σ operation maintains the form 6 * x - 2, and it replaces the factor 3 by 2 (it "squeezes" a 3 into a 2). In the opposite direction, the s operation replaces a factor 2 in m by 3.
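
The replacement property can be checked numerically. In this sketch sigma is written out from its mapping in T1, and squeeze_chain is a hypothetical helper that repeats σ as long as the result keeps the form 6 * x - 2.

```python
def sigma(m):
    """Squeeze: m -> ((m - 1) / 3) * 2, defined for m ≡ 1 mod 3."""
    return (m - 1) // 3 * 2


def squeeze_chain(m):
    """Apply sigma repeatedly while the value stays of the form 6 * x - 2."""
    chain = [m]
    while sigma(chain[-1]) % 6 == 4:
        chain.append(sigma(chain[-1]))
    return chain


if __name__ == "__main__":
    # 160 = 6 * 27 - 2: three factors 3 are replaced by factors 2, one per step.
    print(squeeze_chain(160))
```

Starting from 160 = 6 * 27 - 2 the chain is 160, 106, 70, 46, that is 6 * x - 2 for x = 27, 18, 12, 8.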

Motivation: Patterns in sequences with the same length

A closer look at the Collatz sequences shows a lot of pairs of adjacent start values which have the same sequence length, for example (from OEIS A070165):

142/104: 142 d  71 u 214 d 107 u 322 d 161 u 484 d  242 d 121 u 364 ] 182, 91, ... 4, 2, 1
143/104: 143 u 430 d 215 u 646 d 323 u 970 d 485 u 1456 d 728 d 364 ] 182, 91, ... 4, 2, 1
           +1  *6+4    +1  *6+4    +1  *6+4    +1   *6+4  *6+2    +0    +0 ...

The third line tells how the second line could be computed from the first. Walking from right to left, the step pattern is:

δ µ µ δ µ δ µ δ µ 
µ µ δ µ δ µ δ µ δ

The alternating pattern of operations can be continued to the left with 4 additional pairs of steps:

 q? u [ 62 d  31 u  94 d  47 u 142 d ...
126 d [ 63 u 190 d  95 u 286 d 143 u ...
        +1  *6+4    +1  *6+4    +1  

The pattern stops here since there is no number q such that q * 3 + 1 = 62.
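
The pair 142/143 can be reproduced with a short sketch. The helper names trajectory and first_common are assumptions for illustration; the sketch only confirms the equal lengths and the meeting point of the two sequences.

```python
def trajectory(start):
    """Collatz sequence from start down to 1 (both included)."""
    seq = [start]
    while seq[-1] != 1:
        m = seq[-1]
        seq.append(m // 2 if m % 2 == 0 else 3 * m + 1)
    return seq


def first_common(a, b):
    """First element of sequence a that also occurs in sequence b."""
    members = set(b)
    return next(x for x in a if x in members)


if __name__ == "__main__":
    a, b = trajectory(142), trajectory(143)
    meet = first_common(a, b)
    print(len(a), len(b), meet, a.index(meet), b.index(meet))
```

Both sequences have 104 elements and meet at 364 after 9 steps each, which is the position of the closing bracket in the listing above.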

Segment Construction

These patterns lead us to the construction of special subsets of paths in the Collatz graph which we call segments. They lead away from the root, and they always start with a node m ≡ -2 mod 6. Then they split and follow two subpaths in a prescribed sequence of operations. The segment construction process is stopped when the next node in one of the two subpaths becomes divisible by 3, or when a δ operation is no longer possible. We assemble the segments as rows of an infinite array C[i,j], the so-called segment directory.

Informally, and in the two examples above, we consider the terms between the square brackets. For the moment, we only take those which are ≡ 4 mod 6 (for "compressed" segments; below there are also "detailed" segments where we take all). We start at the right and with the lower line, and we interleave the terms ≡ 4 mod 6 of the two lines to get a segment.

The columns in one row i of the array C are constructed as described in the following table (T2):

{| class="wikitable" style="text-align:left"
|-
! Column j !! Operation !! Formula !! Condition !! Sequence
|-
| 1 || || 6 * i - 2 || || 4, 10, 16, 22, 28, ...
|-
| 2 || C[i,1] µµ || 24 * (i - 1) + 16 || || 16, 40, 64, 88, 112, ...
|-
| 3 || C[i,1] δµµ || 24 * (i - 1) / 3 + 4 || i ≡ 1 mod 3 || 4, 28, 52, 76, 100, ...
|-
| 4 || C[i,2] σ || 48 * (i - 1) / 3 + 10 || i ≡ 1 mod 3 || 10, 58, 106, 154, ...
|-
| 5 || C[i,3] σ || 48 * (i - 7) / 9 + 34 || i ≡ 7 mod 9 || 34, 82, 130, 178, ...
|-
| 6 || C[i,2] σσ || 96 * (i - 7) / 9 + 70 || i ≡ 7 mod 9 || 70, 166, 262, 358, ...
|-
| 7 || C[i,3] σσ || 96 * (i - 7) / 27 + 22 || i ≡ 7 mod 27 || 22, 118, 214, 310, ...
|-
| 8 || C[i,2] σσσ || 192 * (i - 7) / 27 + 46 || i ≡ 7 mod 27 || 46, 238, 430, 622, ...
|-
| 9 || C[i,3] σσσ || 192 * (i - 61) / 81 + 142 || i ≡ 61 mod 81 || 142, 334, ...
|-
| ... || ... || ... || ... || ...
|}

The first column(s) C[i,1] will be denoted as the left side of the segment (or of the whole directory), while the columns C[i,j], j > 1 will be the right part. The first few lines of the segment directory are the following:

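{| class="wikitable" style="text-align:right"
|-
!   !! 1 !! 2 !! 3 !! 4 !! 5 !! 6 !! 7 !! 8 !! 9 !! 10 !! 11 !! ... !! 2*j !! 2*j+1
|-
! i !! 6*i-2 !! µµ !! δµµ !! µµσ !! δµµσ !! µµσσ !! δµµσσ !! µµσ<sup>3</sup> !! δµµσ<sup>3</sup> !! µµσ<sup>4</sup> !! δµµσ<sup>4</sup> !! ... !! µµσ<sup>j-1</sup> !! δµµσ<sup>j-1</sup>
|-
| 1 || 4 || 16 || 4 || 10
|-
| 2 || 10 || 40
|-
| 3 || 16 || 64
|-
| 4 || 22 || 88 || 28 || 58
|-
| 5 || 28 || 112
|-
| 6 || 34 || 136
|-
| 7 || 40 || 160 || 52 || 106 || 34 || 70 || 22 || 46
|}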

There is a more elaborate segment directory with 5000 rows.
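
A row of the compressed segment directory can be generated by following the two subpaths in turn. The sketch below is one possible reading of table T2, not the author's Perl program; the function names are assumptions. A node is kept only while it is ≡ 4 mod 6.

```python
def sigma(m):
    """Squeeze: m -> ((m - 1) / 3) * 2."""
    return (m - 1) // 3 * 2


def segment(i):
    """Row i of the compressed segment directory, left side included."""
    left = 6 * i - 2
    a = 4 * left                 # column 2: left side, then µµ
    b = (left - 1) // 3 * 4      # column 3: left side, then δµµ
    row = [left, a]
    while b % 6 == 4:            # the lower subpath still has the form 6 * x - 2
        row.append(b)
        a = sigma(a)
        if a % 6 != 4:
            break
        row.append(a)
        b = sigma(b)
    return row


if __name__ == "__main__":
    for i in range(1, 8):
        print(i, segment(i))
```

The printed rows 1 to 7 agree with the first lines of the directory shown above.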

Properties of the segment directory

We make a number of claims for segments:

  • (C1) All nodes in the segment directory are of the form 6 * n - 2.
This follows from the formula for columns C[i,1..3], and for any higher column numbers from the 3-by-2 replacement property of the σ operation.
  • (C2) All segments have a finite length.
At some point the σ operations will have replaced all factors 3 by 2.
  • (C3) All nodes in the right part of a segment have the form 6 * (3<sup>n</sup> * 2<sup>m</sup> * f) - 2 with the same "3-2-free" factor f.
This follows from the operations for columns C[i,1..3], and from the fact that the σ operation maintains this property.
  • (C4) All nodes in the right part of a particular segment are
    • different among themselves, and
    • different from the left side of that segment (except for the first segment for the root 4).
For C[i,1..2] we see that the values modulo 24 are different. For the remaining columns, we see that the exponents of the factors 2 and 3 are different. They are shifted by the σ operations, but they alternate, for example (in the segment with left part 40):
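160 = 6 * (3<sup>3</sup> * 2<sup>0</sup> * 1) - 2
 52 = 6 * (3<sup>2</sup> * 2<sup>0</sup> * 1) - 2
106 = 6 * (3<sup>2</sup> * 2<sup>1</sup> * 1) - 2
 34 = 6 * (3<sup>1</sup> * 2<sup>1</sup> * 1) - 2
 70 = 6 * (3<sup>1</sup> * 2<sup>2</sup> * 1) - 2
 22 = 6 * (3<sup>0</sup> * 2<sup>2</sup> * 1) - 2
 46 = 6 * (3<sup>0</sup> * 2<sup>3</sup> * 1) - 2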
  • (C5) There is no cycle in a segment (except for the first segment for the root 4).
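
The decomposition used in (C3) and (C4) can be computed mechanically. The function decompose is a hypothetical helper; it returns the exponent of 3, the exponent of 2 and the remaining factor f for a node of the form 6 * x - 2.

```python
def decompose(node):
    """Write node as 6 * (3**e3 * 2**e2 * f) - 2 and return (e3, e2, f)."""
    assert (node + 2) % 6 == 0, "node must have the form 6 * x - 2"
    x = (node + 2) // 6
    e3 = e2 = 0
    while x % 3 == 0:
        x //= 3
        e3 += 1
    while x % 2 == 0:
        x //= 2
        e2 += 1
    return e3, e2, x


if __name__ == "__main__":
    # Right part of the segment with left part 40.
    for node in (160, 52, 106, 34, 70, 22, 46):
        print(node, decompose(node))
```

For the segment with left part 40 the exponent pairs are all different and the factor f is 1 throughout.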

Segment Lengths

The segment directory is obviously very structured. The lengths of the compressed segments follow the pattern

4 2 2 4 2 2 L1 2 2 4 2 2 4 2 2 L2 2 2 4 2 2 ...

with two fixed lengths 2 and 4 and some variable lengths L1, L2 ... > 4. For the left parts 4, 40, 364, 3280, 29524 (OEIS A191681), the segment lengths have high values 4, 8, 12, 16, 20 which did not occur before. Those left parts are (9<sup>n+1</sup> - 1) / 2, or 4 * Sum(9<sup>i</sup>, i = 0..n).
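
The length pattern can be recomputed from the construction. This sketch repeats a hypothetical segment function (one reading of table T2) so that it runs on its own; lengths count the left side as well.

```python
def sigma(m):
    return (m - 1) // 3 * 2


def segment(i):
    """Row i of the compressed segment directory, left side included."""
    left = 6 * i - 2
    a, b = 4 * left, (left - 1) // 3 * 4
    row = [left, a]
    while b % 6 == 4:
        row.append(b)
        a = sigma(a)
        if a % 6 != 4:
            break
        row.append(a)
        b = sigma(b)
    return row


def record_left_parts(count):
    """Left parts (9**(n+1) - 1) / 2 for n = 0 .. count - 1."""
    return [(9 ** (n + 1) - 1) // 2 for n in range(count)]


if __name__ == "__main__":
    print([len(segment(i)) for i in range(1, 17)])
    for left in record_left_parts(5):
        i = (left + 2) // 6          # row index of this left part
        print(left, i, len(segment(i)))
```

In the first 16 rows the variable lengths are L1 = 8 (row 7) and L2 = 6 (row 16).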

Coverage of the Right Part

We now examine the modular conditions which result from the segment construction table in order to find out how the numbers of the form 6 * n - 2 are covered by the right part of the segment directory, as shown in the following table (T3):

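{| class="wikitable" style="text-align:left"
|-
! Columns j !! Covered !! Remaining
|-
| 2-3 || 4, 16 mod 24 || 10, 22, 34, 46 mod 48
|-
| 4-5 || 10, 34 mod 48 || 22, 46, 70, 94 mod 96
|-
| 6-7 || 70, 22 mod 96 || 46, 94, 142, 190 mod 192
|-
| 8-9 || 46, 142 mod 192 || 94, 190, 286, 382 mod 384
|-
| ... || ... || ...
|}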

We can always exclude the first and the third element remaining so far by looking in the next two columns of segments with sufficient length.

  • (C6) There is no limit on the length of a segment.
We only need to take a segment which, in its right part, has a factor of 3 with a sufficiently high power, and the σ operations will stretch out the segment accordingly.

Therefore we can continue the modulus table above indefinitely, which leads us to the claim:

  • (C7) All numbers of the form 6 * n - 2 occur exactly once in the right part of the segment directory, and once as a left side. There is a bijective mapping between the left sides and the elements of the right parts.
The sequences defined by the columns in the right part all have different modulus conditions. Therefore they are all disjoint.
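
Claim (C7) can at least be tested on a finite part of the directory. The following sketch is a numerical check, not a proof. The bounds of 600 rows and node values up to 1000 are assumptions chosen so that every node up to 1000 has its row inside the generated range; segment is again a hypothetical reading of table T2.

```python
from collections import Counter


def sigma(m):
    return (m - 1) // 3 * 2


def segment(i):
    """Row i of the compressed segment directory, left side included."""
    left = 6 * i - 2
    a, b = 4 * left, (left - 1) // 3 * 4
    row = [left, a]
    while b % 6 == 4:
        row.append(b)
        a = sigma(a)
        if a % 6 != 4:
            break
        row.append(a)
        b = sigma(b)
    return row


def right_part_counts(rows):
    """How often each node occurs in the right parts of rows 1 .. rows."""
    counts = Counter()
    for i in range(1, rows + 1):
        counts.update(segment(i)[1:])    # skip the left side
    return counts


if __name__ == "__main__":
    counts = right_part_counts(600)
    missing = [v for v in range(4, 1001, 6) if counts[v] != 1]
    print("nodes up to 1000 not covered exactly once:", missing)
```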

Forest directory

We construct a forest directory F which initially is a copy of the segment directory C. F lists all the small trees with two branches which are represented by the right parts in the segment directory. These trees are labelled by the left sides. Then we start a gedankenexperiment analogous to Hilbert's hotel. We consider simultaneously all rows i > 1 (omitting the root segment) in F which fulfill some modularity condition (the source row in F), and we attach (identify, connect) them to their unique occurrence in the right part of F (target row and column).

  • (C8) The attachment process does not create any new cycle (in addition to the one in the root segment).
Let tree t1 with label n1 and right part R1 be attached to node n1 in the right part R2 of the unique tree t2 which is labelled by n2. t1 and t2 are disjoint trees, therefore the result of such a single attachment step is a tree again (t2', still labelled by n2).
A technical implementation of the process (which is impossible for infinite sets) would perhaps replace the target node by a pointer to the source tree.

We sieve the trees in F: Whenever we attach t1 to n1 in t2, we remove the row for n1 resp. t1 in F. The attachment rules are shown in the following table (T4):

{| class="wikitable" style="text-align:left"
|-
! Source Row i !! First left sides !! Target Row !! Target Column !! Remaining Rows !! Fraction
|-
| i ≡ 3 mod 4 || 16, 40, 64, 88 ... || ((i - 3) / 4) * 3<sup>0</sup> + 1 || 2 || i ≡ 0, 1, 2 mod 4 || 3/4
|-
| i ≡ 1 mod 4 || (4), 28, 52, 76 ... || ((i - 1) / 4) * 3<sup>1</sup> + 1 || 3 || i ≡ 0, 2, 4, 6 mod 8 || 1/2
|-
| i ≡ 2 mod 8 || 10, 58, 106, 154 ... || ((i - 2) / 8) * 3<sup>1</sup> + 1 || 4 || i ≡ 0, 4, 6 mod 8 || 3/8
|-
| i ≡ 6 mod 8 || 34, 82, 130, 178 ... || ((i - 6) / 8) * 3<sup>2</sup> + 7 || 5 || i ≡ 0, 4, 8, 12 mod 16 || 1/4
|-
| i ≡ 12 mod 16 || 70, 166, 262, 358 ... || ((i - 12) / 16) * 3<sup>2</sup> + 7 || 6 || i ≡ 0, 4, 8 mod 16 || 3/16
|-
| i ≡ 4 mod 16 || 22, 118, 214, 310 ... || ((i - 4) / 16) * 3<sup>3</sup> + 7 || 7 || i ≡ 0, 8, 16, 24 mod 32 || 1/8
|-
| i ≡ 8 mod 32 || 46, 238, 430, 622 ... || ((i - 8) / 32) * 3<sup>3</sup> + 7 || 8 || i ≡ 0, 16, 24 mod 32 || 3/32
|-
| i ≡ 24 mod 32 || 142, 334 ... || ((i - 24) / 32) * 3<sup>4</sup> + 61 || 9 || i ≡ 0, 16, 32, 48 mod 64 || 1/16
|-
| i ≡ 48 mod 64 || 286, 670 ... || ((i - 48) / 64) * 3<sup>4</sup> + 61 || 10 || i ≡ 0, 16, 32 mod 64 || 3/64
|-
| i ≡ 16 mod 64 || 94, 478 ... || ((i - 16) / 64) * 3<sup>5</sup> + 61 || 11 || i ≡ 0, 32, 64, 96 mod 128 || 1/32
|-
| ... || ... || ... || ... || ... || ...
|}

The further rows of this table follow the same pattern: the residues modulo 2<sup>k</sup> in the first column are 3 * 2<sup>k-2</sup> and 1 * 2<sup>k-2</sup> in an alternating sequence. The additive constants in the target row formulas are the indexes of the variable length segments with left parts (4), 40, 364, 3280, 29524 (OEIS A191681) mentioned above. They are repeated 4 times since the corresponding lengths "jump" by 4. It should be noted that it does not matter in which order the single attachment steps are performed.
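
The ten rules listed in T4 can be checked against the segment construction. In this sketch the tuple layout of RULES and the names target and segment are assumptions; only the rules actually printed in the table are encoded, so rows with i ≡ 0 mod 32 are left without a target.

```python
# (residue, modulus, exponent of 3, additive constant, target column), from T4
RULES = [
    (3, 4, 0, 1, 2), (1, 4, 1, 1, 3), (2, 8, 1, 1, 4), (6, 8, 2, 7, 5),
    (12, 16, 2, 7, 6), (4, 16, 3, 7, 7), (8, 32, 3, 7, 8),
    (24, 32, 4, 61, 9), (48, 64, 4, 61, 10), (16, 64, 5, 61, 11),
]


def sigma(m):
    return (m - 1) // 3 * 2


def segment(i):
    """Row i of the compressed segment directory, left side included."""
    left = 6 * i - 2
    a, b = 4 * left, (left - 1) // 3 * 4
    row = [left, a]
    while b % 6 == 4:
        row.append(b)
        a = sigma(a)
        if a % 6 != 4:
            break
        row.append(a)
        b = sigma(b)
    return row


def target(i):
    """Target (row, column) for source row i, or None if no listed rule applies."""
    for residue, modulus, exponent, constant, column in RULES:
        if i % modulus == residue:
            return (i - residue) // modulus * 3 ** exponent + constant, column
    return None


if __name__ == "__main__":
    for i in (3, 5, 10, 14, 12, 4, 8, 24, 48, 16):
        row, column = target(i)
        print(i, 6 * i - 2, row, column, segment(row)[column - 1])
```

For each source row, the node found at the target position equals the left side 6 * i - 2 of that source row.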

Tree connectivity

  • (C9) The tree of any source row with arbitrarily large left side n1 will eventually be attached to another tree which contains n1 in its node set.
Similar to the arguments for coverage of segments, we have to apply the rules from table T4 one after the other up to a sufficiently high row (corresponding to a sufficiently long variable segment).
  • (C9a) In the end, all subtrees will be attached to the root segment.
Suppose there is a set U of subtrees left which are not connected to the root segment. We consider the subtree t in U with the smallest label. We know that it should have been attached - to where? Either to the root segment, or to another tree in U. Is this already a contradiction?
If not, then in both cases the number of trees is reduced by 1. We repeat the argument until there is only one subtree t left in U. The label of t is not contained in t's node set, so it must be contained in the node set of the tree already attached to the root segment, and that is the node where t must also be attached.
And if U is infinite? Then the two trees attached to the root segment would have a finite number of nodes. Then they have leaves which have no tree attached to them. But by T4 we could always determine the subtree which should have been attached, so we have a contradiction.

We call the final tree resulting from the sieving process the compressed tree.
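
For a finite range of rows, the chain of attachments can be followed down to the root segment. This is an illustration under assumptions, not part of the argument: parent_row is a hypothetical helper that finds the row whose right part contains the left side 6 * i - 2 by undoing the σ steps (replacing factors 2 by 3) and then inverting the formulas of columns 2 and 3.

```python
def parent_row(i):
    """Row whose right part contains the node 6 * i - 2."""
    n = i                        # the node is 6 * n - 2
    while n % 2 == 0:
        n = n // 2 * 3           # undo one sigma: a factor 2 becomes a factor 3
    if n % 4 == 3:
        return (n + 1) // 4      # column 2: 24 * t - 8 = 6 * n - 2
    return (3 * n + 1) // 4      # column 3: 8 * t - 4 = 6 * n - 2


def chain_to_root(i, limit=10_000):
    """Rows visited from row i until the root row 1 is reached."""
    chain = [i]
    while chain[-1] != 1:
        if len(chain) > limit:
            raise RuntimeError("no root found within the step limit")
        chain.append(parent_row(chain[-1]))
    return chain


if __name__ == "__main__":
    for i in (2, 4, 7, 24):
        print(i, chain_to_root(i))
```

The row numbers along a chain may rise before they fall, for example 4, 7, 2, 1 for the row with left side 22.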

The Collatz Tree

  • (C10) The compressed tree is a subgraph of the Collatz graph.
The edges of the compressed tree carry combined operations µµ, δµµ and σ = δµ.

So far, numbers of the form x ≡ 0, 1, 2, 3, 5 mod 6 are missing from the compressed tree.

We insert intermediate nodes into the compressed tree by applying operations on the left parts of the segments as shown in the following table (T5):

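{| class="wikitable" style="text-align:left"
|-
! Operation !! Condition !! Resulting Nodes !! Remaining Nodes
|-
| δ || || 2 * i - 1 || i ≡ 0, 2, 6, 8 mod 12
|-
| µ || || 12 * i - 4 || i ≡ 0, 2, 6 mod 12
|-
| δµ || i ≡ 1, 2 mod 3 || 4 * i - 2 || i ≡ 0, 12 mod 24
|-
| δµµ || i ≡ 2 mod 3 || 8 * i - 4 || i ≡ 0 mod 24
|-
| δµµµ || i ≡ 2 mod 3 || 16 * i - 8 || (none)
|}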

The first three rows in T5 take care of the intermediate nodes at the beginning of the segment construction with columns 1, 2, 3. Rows 4 and 5 generate the sprouts (starting at multiples of 3) which are not contained in the segment directory.

We call such a construction a detailed segment (in contrast to the compressed segments described above).

A detailed segment directory can be created by the same Perl program. In that directory, the two subpaths of a segment are shown in two lines. Only the highlighted nodes are unique.
  • (C11) The connectivity of the compressed tree remains unaffected by the insertions.
  • (C12) With the insertions of T5, the compressed tree covers the whole Collatz graph.
  • (C13) The Collatz graph is a tree (except for the trivial cycle 4-2-1).