OEIS/Collatz Story: Difference between revisions

From tehowiki
Jump to navigation Jump to search
imported>Gfis
save 6
imported>Gfis
No edit summary
Line 1: Line 1:
==Abstract==
Small, finite trees with two branches are constructed with the operations defined by Collatz for his 3x+1 problem, These trees are connected to form bigger graphs in an iterative process. It is shown that this process finally builds a single graph which is a tree except for one cycle at the root. This graph is then embedded into the Collatz graph, and it is thereby shown that the latter is also a tree except for the cycle 4-2-1.
==Introduction==
==Introduction==
'''Collatz sequences''' (also called  ''trajectories'') are sequences of integer numbers > 0. For any start value > 0 the elements of the sequence are constructed with two simple rules:
'''Collatz sequences''' (also called  ''trajectories'') are sequences of integer numbers > 0. For any start value > 0 the elements of the sequence are constructed with two simple rules:
Line 4: Line 6:
# Odd numbers are multiplied by 3 and then incremented by 1.
# Odd numbers are multiplied by 3 and then incremented by 1.
Since decades it is unknown whether the final cyle 4 - 2 - 1 is always reached for all start values. This problem is the '''Collatz conjecture''', for which the [https://en.wikipedia.org/wiki/Collatz_conjecture english Wikipedia] states:
Since decades it is unknown whether the final cyle 4 - 2 - 1 is always reached for all start values. This problem is the '''Collatz conjecture''', for which the [https://en.wikipedia.org/wiki/Collatz_conjecture english Wikipedia] states:
: It is also known as the 3n + 1 conjecture, the Ulam conjecture (after Stanisław Ulam), Kakutani's problem (after Shizuo Kakutani), the Thwaites conjecture (after Sir Bryan Thwaites), Hasse's algorithm (after Helmut Hasse), or the Syracuse problem; the sequence of numbers involved is referred to as the hailstone sequence or hailstone numbers (because the values are usually subject to multiple descents and ascents like hailstones in a cloud), or as wondrous numbers.
: It is also known as the 3n + 1 conjecture, the Ulam conjecture (after Stanisław Ulam), Kakutani's problem (after Shizuo Kakutani), the Thwaites conjecture (after Sir Bryan Thwaites), Hasse's algorithm (after Helmut Hasse), or the Syracuse problem; the sequence of numbers involved is referred to as the hailstone sequence or hailstone numbers (because the values are usually subject to multiple descents and ascents like hailstones in a cloud), or as wondrous numbers.
 
Simple visualizations of Collatz sequences show no obvious structure. The sequences for the first dozen of start values are rather short, but the sequence for 27 suddenly has 112 elements.
<p align="right">Da sieht man den Wald vor lauter B&auml;men nicht.<br />German proverb:<br />"There you cannot see the wood for the trees."
</p>


Straightforward visualizations of Collatz sequences show no obvious structure. The sequences for the first dozen of start values are rather short, but the sequence for 27 suddenly has 112 elements.
===References===
===References===
* Jeffry C. Lagarias, Ed.: ''The Ultimate Challenge: The 3x+1 Problem'', Amer. Math. Soc., 2010, ISBN 978-8218-4940-8. [http://www.ams.org/bookpages/mbk-78 MBK78]
* Jeffry C. Lagarias, Ed.: ''The Ultimate Challenge: The 3x+1 Problem'', Amer. Math. Soc., 2010, ISBN 978-8218-4940-8. [http://www.ams.org/bookpages/mbk-78 MBK78]
* OEIS A07165: [http://oeis.org/A070165/a070165.txt  File of first 10K Collatz sequences], ascending start values, with lengths
* OEIS A07165: [http://oeis.org/A070165/a070165.txt  File of first 10K Collatz sequences], ascending start values, with lengths
* Manfred Trümper: ''The Collatz Problem in the Light of an Infinite Free Semigroup''. Chinese Journal of Mathematics, Vol. 2014, [http://dx.doi.org/10.1155/2014/756917 Article ID 756917], 21 p.
* Manfred Tr&uuml;mper: ''The Collatz Problem in the Light of an Infinite Free Semigroup''. Chinese Journal of Mathematics, Vol. 2014, [http://dx.doi.org/10.1155/2014/756917 Article ID 756917], 21 p.
 
==Collatz Graph==
==Collatz Graph==
When all Collatz sequences are read backwards, they form the '''Collatz graph''' starting with 1, 2, 4, 8 ... . At each node m > 4 in the graph, the path from the root (4) can be continued
When all Collatz sequences are read backwards, they form the '''Collatz graph''' starting with 1, 2, 4, 8 ... . At each node m > 4 in the graph, the path from the root (4) can be continued
* always to m * 2, and
* always to m * 2, and
* to (m - 1) / 3 if m &#x2261; 1 mod 3.
* to (m - 1) / 3 if m &#x2261; 1 mod 3.
The Collatz conjecture claims that the graphs contains all numbers, and that - except for the trivial, leading cycle 1 - 2 - 4 - 1 - 2 - 4 ... - it has the form of a tree (without cycles). We will not consider the trivial cycle, and we start the graph with node 4, the '''root'''.
 
Moreover, another trivial type of path starts when m &#x2261; 0 mod 3. We call such a path a ''sprout'', and it contains duplications only. A sprout must be added to the graph for any node divisible by 3, therefore we will not consider them for the moment.
The Collatz conjecture claims that the graphs contains all numbers, and that - except for the leading cycle 1 - 2 - 4 - 1 - 2 - 4 ... - it has the form of a tree (without cycles). We will not consider the leading cycle, and we start the graph with node 4, the '''root'''.
Moreover, another trivial type of path starts when m &#x2261; 0 mod 3. We call such a path a ''sprout'', and it contains duplications only. Sprout must be added to the graph for any node divisible by 3, therefore we will not consider them for the moment.
 
===Graph Operations===
===Graph Operations===
Following [http://dx.doi.org/10.1155/2014/756917 Trümper], we use abbreviations for the elementary operations which transform a node (element, number) in the Collatz graph according to the following table (T1):
Following [http://dx.doi.org/10.1155/2014/756917 Tr&uuml;mper], we use abbreviations for the elementary operations which transform a node (element, number) in the Collatz graph according to the following table (T1):
{| class="wikitable" style="text-align:center"
{| class="wikitable" style="text-align:center"
!Name    !! Mnemonic  !! Distance to root !!  Mapping            !! Condition
!Name    !! Mnemonic  !! Distance to root !!  Mapping            !! Condition
Line 39: Line 47:
  (6 * (3 * n) - 2) &sigma; = 4 * 3 * n - 2 =  6 * (2 * n) - 2
  (6 * (3 * n) - 2) &sigma; = 4 * 3 * n - 2 =  6 * (2 * n) - 2
In other words, as long as m contains a factor 3, the &sigma; operation maintains the form 6 * x - 2, and it  replaces the factor 3 by 2 (it "squeezes" a 3 into a 2). In the opposite direction, the s operation replaces a factor 2 in m by 3.
In other words, as long as m contains a factor 3, the &sigma; operation maintains the form 6 * x - 2, and it  replaces the factor 3 by 2 (it "squeezes" a 3 into a 2). In the opposite direction, the s operation replaces a factor 2 in m by 3.
<!--
<!--
=== Trivial paths===
=== Trivial paths===
Line 48: Line 57:
* The kernel is not affected by &sigma; and s operations.
* The kernel is not affected by &sigma; and s operations.
-->
-->
===Motivation: Patterns in sequences with the same length===
===Motivation: Patterns in sequences with the same length===
A closer look at the Collatz sequences shows a lot of pairs of adjacent start values which have the same sequence length, for example (from [https://oeis.org/A070165 OEIS A070165]):
A closer look at the Collatz sequences shows a lot of pairs of adjacent start values which have the same sequence length, for example (from [https://oeis.org/A070165 OEIS A070165]):
Line 63: Line 73:
The pattern stops here since there is no number q such that q * 3 + 1 = 62.
The pattern stops here since there is no number q such that q * 3 + 1 = 62.


==Segment Construction==
==Segments==
These patterns lead us to the construction of special subsets of paths in the Collatz graph which we call ''segments''. They lead away from the root, and they always start with a node m &#x2261; -2 mod 6. Then they split and follow two subpaths in a prescribed sequence of operations. The segment construction process is stopped when the next node in one of the two subpaths becomes divisible by 3, resp. when a &delta; operation is no more possible. We assemble the segments as rows of an infinite array <nowiki>C[i,j]</nowiki>, the so-called ''segment directory''.
These patterns lead us to the construction of special subsets of paths in the Collatz graph which we call ''segments''. They lead away from the root, and they always start with a node m &#x2261; -2 mod 6. Then they split and follow two subpaths in a prescribed sequence of operations. The segment construction process is stopped when the next node in one of the two subpaths becomes divisible by 3, resp. when a &delta; operation is no more possible.  
 
===Segment Directory Construction===
We list the segments as rows of an infinite array <nowiki>C[i,j]</nowiki>, the so-called ''segment directory''.
: Informally, and in the two examples above, we consider the terms betweeen the square brackets. For the moment, we only take those which are which are &#x2261; 4 mod 6 (for "compressed" segments, below there are also "detailed" segments where we take all). We start at the right and with the lower line, and we interleave the terms &#x2261; 4 mod 6 of the two lines to get a segment.
: Informally, and in the two examples above, we consider the terms betweeen the square brackets. For the moment, we only take those which are which are &#x2261; 4 mod 6 (for "compressed" segments, below there are also "detailed" segments where we take all). We start at the right and with the lower line, and we interleave the terms &#x2261; 4 mod 6 of the two lines to get a segment.
The columns in one row i of the array C are constructed as described in the following table (T2):
 
The following table '''(T2)''' tells how the columns in one row i of the array C must be constructed if the condition is fulfilled:  
{| class="wikitable" style="text-align:left"
{| class="wikitable" style="text-align:left"
!Column j                 !! Operation               !! Formula                  !! Condition            !! Sequence
!Column                !! Operation                 !! Formula                  !! Condition            !! Sequence
|-
|-
| 1 ||                                                || 6 * i - 2                ||                      || 4, 10, 16, 22, 28, ...
| 1 ||                                                || 6 * i - 2                ||                      || 4, 10, 16, 22, 28, ...
|-
|-
| 2 || <nowiki>C[i,1]</nowiki> &micro;&micro;        || 24 * (i - 1)   + 16     ||                      || 16, 40, 64, 88, 112, ...
| 2 || <nowiki>C[i,1]</nowiki> &micro;&micro;        || 24 * (i - 1) / 1    + 16||                      || 16, 40, 64, 88, 112, ...
|-                                                                           
| 3 || <nowiki>C[i,1]</nowiki> &delta;&micro;&micro;  || 24 * (i - 1) / 3    +  4|| i &#x2261; 1 mod 3  ||  4, 28, 52, 76, 100, ...
|-                                                                           
| 4 || <nowiki>C[i,2]</nowiki> &sigma;                || 48 * (i - 1) / 3    + 10|| i &#x2261; 1 mod 3  || 10, 58, 106, 134, ...
|-                                                                           
| 5 || <nowiki>C[i,3]</nowiki> &sigma;                || 48 * (i - 7) / 9    + 34|| i &#x2261; 7 mod 9  || 34, 82, 130, 178, ...
|-                                                                           
| 6 || <nowiki>C[i,2]</nowiki> &sigma;&sigma;        || 96 * (i - 7) / 9    + 70|| i &#x2261; 7 mod 9  || 70, 166, 262, 358, ...
|-                                                                           
| 7 || <nowiki>C[i,3]</nowiki> &sigma;&sigma;        || 96 * (i - 7) / 27    + 22|| i &#x2261; 7 mod 27  || 22, 118, 214, 310, ...
|-                                                                           
| 8 || <nowiki>C[i,2]</nowiki> &sigma;&sigma;&sigma;  || 192 * (i - 7) / 27  + 46|| i &#x2261; 7  mod 27 || 46, 238, 430, 622, ...
|-
|-
| 3 || <nowiki>C[i,1]</nowiki> &delta;&micro;&micro;  || 24 * (i - 1) / 3 + 4    || i &#x2261; 1 mod || 4, 28, 52, 76, 100, ...
| 9 || <nowiki>C[i,3]</nowiki> &sigma;&sigma;&sigma;  || 192 * (i - 61) / 81 + 142|| i &#x2261; 61 mod 81 || 142, 334, ...
|-
|-
| 4 || <nowiki>C[i,2]</nowiki> &sigma;                || 48 * (i - 1) / 3 + 10    || i &#x2261; 1 mod 3  || 10, 58, 106, 134, ...
|...|| ... || ... || ... || ...
|-
|-
| 5 || <nowiki>C[i,3]</nowiki> &sigma;               || 48 * (i - 7) / 9 + 34    || i &#x2261; 7 mod 9  || 34, 82, 130, 178, ...
|...|| <nowiki>C[i,2]</nowiki> &sigma;<sup>j-1</sup> || 3 * 2<sup>j+2</sup> * (i - m) / 3<sup>k</sup> + x || i &#x2261; m mod 3<sup>k</sup> || x, ...
|-
| 6 || <nowiki>C[i,2]</nowiki> &sigma;&sigma;        || 96 * (i - 7) / 9 + 70    || i &#x2261; 7 mod 9  || 70, 166, 262, 358, ...
|-
| 7 || <nowiki>C[i,3]</nowiki> &sigma;&sigma;        || 96 * (i - 7) / 27 + 22  || i &#x2261; 7 mod 27  || 22, 118, 214, 310, ...
|-
| 8 || <nowiki>C[i,2]</nowiki> &sigma;&sigma;&sigma;  || 192 * (i - 7) / 27  + 46 || i &#x2261; mod 27 || 46, 238, 430, 622, ...
|-
| 9 || <nowiki>C[i,3]</nowiki> &sigma;&sigma;&sigma;  || 192 * (i - 61) / 81 + 142|| i &#x2261; 61 mod 81 || 142, 334, ...
|-
| ... || ... || ... || ... || ...
|-
|-
|}
|}
The first column(s) <nowiki>C[i,1]</nowiki> will be denoted as the '''left side''' of the segment (or of the whole directory), while the columns <nowiki>C[i,j], j &gt; 1</nowiki> will be the '''right part'''. The first few lines of the segment directory are the following:
The first column(s) ''<nowiki>C[i,1]</nowiki>'' will be denoted as the '''left side''' of the segment (or of the whole directory), while the columns ''<nowiki>C[i,j], j &gt; 1</nowiki>'' are the '''right part'''.  
 
In general, the rows of T2 must be filled as follows:
* ''j'' and ''k'' are increased in every second row. 
* The residues ''m'' of ''3<sup>k</sup>'' in the condition column are are the indexes 1, 7, 61, 547, 4921 ... ([http://oeis.org/A066443 OEIS A066443]) of the variable length segments with left sides (4), 40, 364, 3280, 29524 ([http://oeis.org/A191681 OEIS A191681]). They increase (''jump'') after 4 rows.
* The additive constants ''x'' in the formula are the values appearing first in columns 2-4 (in segment 1), 5-8 (in segment 7), 9-12 (in segment 61) etc.
 
The first few lines of the segment directory are the following:


<table style="border-collapse: collapse; ">
<table style="border-collapse: collapse; ">
Line 137: Line 160:
</table>
</table>
There is a more elaborated '''[http://www.teherba.org/fasces/oeis/collatz/comp.html segment directory] with 5000 rows'''.
There is a more elaborated '''[http://www.teherba.org/fasces/oeis/collatz/comp.html segment directory] with 5000 rows'''.
===Properties of the segment directory===
 
We make a number of claims for segments:
====Properties of the Segment Directory====
* (C1) All nodes in the segment directory are of the form 6 * n - 2.
We make a number of claims for the segment directory C:
:: This follows from the formula for columns <nowiki>C[i,1..3]</nowiki>, and for any higher column numbers from the 3-by-2 replacement property of the &sigma; operation.
* (C1) All nodes in the segment directory are of the form ''6 * n - 2''.
:: This follows from the formula for columns ''<nowiki>C[i,1..3]</nowiki>'', and for any higher column numbers from the 3-by-2 replacement property of the &sigma; operation.
* (C2) All segments have a finite length.
* (C2) All segments have a finite length.
:: At some point the &sigma; operations will have replaced all factors 3 by 2.
:: At some point the &sigma; operations will have replaced all factors 3 by 2.
* (C3) All nodes in the right part of a segment have the form 6 * (3<sup>n</sup> * 2<sup>m</sup> * f) - 2 with the same "3-2-free" factor f.
* (C3) All nodes in the right part of a segment have the form ''6 * (3<sup>n</sup> * 2<sup>m</sup> * f) - 2'' with the same "3-2-free" factor ''f''.
:: This follows from the operations for columns <nowiki>C[i,1..3]</nowiki>, and from the fact that the &sigma; operation maintains this property.
:: This follows from the operations for columns <nowiki>C[i,1..3]</nowiki>, and from the fact that the &sigma; operation maintains this property.
* (C4) All nodes in the right part of a particular segment are
* (C4) All nodes in the right part of a particular segment are
** different among themselves, and
** different among themselves, and
** different from the left side of that segment (except for the first segment for the root 4).
** different from the left side of that segment (except for the first segment for the root 4).
:: For <nowiki>C[i,1..2]</nowiki> we see that the values modulo 24 are different. For the remaining columns, we see that the exponents of the factors 2 and 3 are different. They are shifted by the &sigma; operations, but they alternate, for example (in the segment with left part 40):
:: For ''<nowiki>C[i,1..2]</nowiki>'' we see that the values modulo 24 are different. For the remaining columns, we see that the exponents of the factors 2 and 3 are different. They are shifted by the &sigma; operations, but they alternate, for example (in the segment with left part 40):
  160 = 6 * (3<sup>3</sup> * 2<sup>0</sup> * 1) - 2
  160 = 6 * (3<sup>3</sup> * 2<sup>0</sup> * 1) - 2
   52 = 6 * (3<sup>2</sup> * 2<sup>0</sup> * 1) - 2
   52 = 6 * (3<sup>2</sup> * 2<sup>0</sup> * 1) - 2
Line 157: Line 181:
   46 = 6 * (3<sup>0</sup> * 2<sup>3</sup> * 1) - 2
   46 = 6 * (3<sup>0</sup> * 2<sup>3</sup> * 1) - 2
* (C5) There is no cycle in a segment (except for the first segment for the root 4).
* (C5) There is no cycle in a segment (except for the first segment for the root 4).
===Segment Lengths===
 
The segment directory is obviously very structured. The lengths of the compressed segments follow the pattern
====Segment Lengths====
Oviously the segment directory is very structured. The lengths of the compressed segments follow the pattern
  4 2 2 4 2 2 L<sub>1</sub> 2 2 4 2 2 4 2 2 L<sub>2</sub> 2 2 4 2 2 ...
  4 2 2 4 2 2 L<sub>1</sub> 2 2 4 2 2 4 2 2 L<sub>2</sub> 2 2 4 2 2 ...
with two ''fixed lengths'' 2 and 4 and some ''variable lengths'' L<sub>1</sub>, L<sub>2</sub> ... &gt; 4. For the left parts 4, 40, 364, 3280, 29524 ([http://oeis.org/A191681 OEIS A191681]), the segment lengths have high values 4, 8, 12, 16, 20 which did not occur before. Those left parts are (9<sup>n+1</sup> - 1) / 2, or 4 * Sum(9<sup>i</sup>, i = 0..n).
with two fixed lengths 2 and 4 and some variable lengths ''L<sub>1</sub>, L<sub>2</sub> ... &gt; 4''. For the left parts 4, 40, 364, 3280, 29524 ([http://oeis.org/A191681 OEIS A191681]), the segment lengths have high values 4, 8, 12, 16, 20 which did not occur before. Those left parts are ''(9<sup>n+1</sup> - 1) / 2'', or ''4 * Sum(9<sup>i</sup>, i = 0..n)''.
===Coverage of the Right Part===
 
We now examine the modular conditions which result from the segment construction table in order to find out how the numbers of the form 6 * n - 2 are covered by the right part of the segment directory, as shown in the following table (T3):
====Coverage of the Right Part====
We now examine the modular conditions which result from the segment construction table (T2) in order to find out how the numbers of the form ''6 * n - 2'' are covered by the right part of the segment directory. The following table (T3) shows the result:
{| class="wikitable" style="text-align:left"
{| class="wikitable" style="text-align:left"
!Columns j   !! Covered !! Remaining
!Columns j !! Covered         !! Remaining
|-
| 2-3 ||  4, 16 mod 24 || 10, 22, 34, 46 mod 48
|-
| 3-4 ||  10, 34 mod 48 || 22, 46, 70, 94 mod 96
|-
| 5-6 ||  70, 22 mod 96 || 46, 94, 142, 190 mod 192
|-
| 7-8 ||  46, 142 mod 192 || 94, 190, 286, 382 mod 384
|-
|-
| ... ||  ... || ...
| 2-3      ||  4, 16 mod 24  || 10, 22, 34, 46 mod 48
|-       
| 3-4      ||  10, 34 mod 48  || 22, 46, 70, 94 mod 96
|-       
| 5-6      ||  70, 22 mod 96  || 46, 94, 142, 190 mod 192
|-       
| 7-8      ||  46, 142 mod 192|| 94, 190, 286, 382 mod 384
|-       
| ...     ||  ...           || ...
|}
|}
We can always exclude the first and the third element remaining so far by looking in the next two columns of segments with sufficient length.
We can always exclude the first and the third element remaining so far by looking in the next two columns of segments with sufficient length.
Line 180: Line 206:
:: We only need to take a segment which, in its right part, has a factor of 3 with a sufficiently high power, and the &sigma; operations will stretch out the segment accordingly.
:: We only need to take a segment which, in its right part, has a factor of 3 with a sufficiently high power, and the &sigma; operations will stretch out the segment accordingly.
Therefore we can continue the modulus table above indefinitely, which leads us to the claim:
Therefore we can continue the modulus table above indefinitely, which leads us to the claim:
* '''(C7)''' All numbers of the form 6 * n - 2 occur exactly once in the right part of the segment directory, and once as a left side. There is a bijective mapping between the left sides and the elements of the right parts.
* '''(C7)''' All numbers of the form ''6 * n - 2'' occur exactly once in the right part of the segment directory, and once as a left side. There is a bijective mapping between the left sides and the elements of the right parts.
:: The sequences defined by the columns in the right part all have different modulus conditions. Therefore they are all disjoint. The left sides are disjoint by construction.
:: The sequences defined by the columns in the right part all have different modulus conditions. Therefore they are all disjoint. The left sides are disjoint by construction.
==Attachment Directory==
So far - by the segments - we possess an infinite set of small - cyclefree - trees with disjoint nodes and two branches. We now want to connect them to a single big tree which will later become the ''backbone'' of the Collatz tree.


Parallel to the segment directory we maintain a so-called ''attachment directory'' A which only contains a flag telling
==Segment Tree==
* whether the tree corresponding to segment was already attached to the tree of some other segment, or
So far we possess the segment directory C which represents the root segment and an infinite set of small trees with disjoint nodes and two branches. We know that the segments represent trees, and that their right parts are all disjoint and different from the left side.
* whether it is still unattached.
 
Initially all trees are unattached.
We now want to ''attach'' (or ''connect'') the segments to other graphs until we get a single big graph which will later become the ''backbone'' of the Collatz graph. Ideally the attachment process should maintain the tree property of the graphs all the time.
:The verb ''attach'' emphasizes the direction of the operation better than the verb ''connect''.


===Tree attachment rules===
=== Attachment Directory Construction===
We attach a set of source rows (segments, trees) in C to appropriate target rows and columns by starting a ''gedankenexperiment'' analogous to [https://en.wikipedia.org/wiki/Hilbert%27s_paradox_of_the_Grand_Hotel Hilbert's hotel].
Parallel to the segment directory we maintain the ''attachment directory'' A which, for any source segment in C:
# tells whether the tree corresponding to the segment was already attached to the graph represented by some other segment, and if so, 
# tells the target row and column numbers ''i, j'' in the segment directory C where the source segment was attached.
Initially all segments are unattached.


We operate on A as follows. Considering simultaneously all source rows ''i &gt; 1'' - omitting the root segment - which are so far unattached, and which fulfill some modularity condition (the ''source'' row set),   we  ''attach'' (or ''identify, connect'') their trees in parallel to the unique occurrences of their left side in the right part of C (''target'' rows and target column). Afterwards, we set the flag of those source rows/segments in A to ''attached'', and we do not try to attach those segments anymore.
===Branch Levels===
In general, when dealing with the 3x+1 problem, it seems difficult to introduce a ''measure'', that is a numerically ordered property of some object related to the Collatz graph. This would be desireable in order to conduct a proof by induction, infinite descent, leading a minimal element to a contradiction etc.


The following table '''(T4)''' shows the unique position where source rows must be attached to target rows and columns:
Here we use the ''branch level'', that is the column index ''j'' of the unique position  ''<nowiki>C[i, j]</nowiki>'' in a segment where a second segment should be attached.
 
===Attachment rules===
The following table '''(T4)''' tells the computation rules for the target position, depending on the modularity condition of the source row index ''i''. We identify and denote these attachment rules by the target column number respectively the branch level.
{| class="wikitable" style="text-align:left"
{| class="wikitable" style="text-align:left"
|-
|-
!Source<br>rows ''i'' !!First source<br>rows !! Target<br>rows||Target<br>column          !!First<br>target rows!!New<br>pos.!! Remaining<br>rows         !! Remaining<br>Fraction
!Branch<br>Level!!Source<br>rows ''i''!!First source<br>rows !! Target<br>rows!!First<br>target rows !!New<br>pos.!! Remaining<br>rows         !! Remaining<br>Fraction
|-                                                                                           
|-                                                                                           
|i &#x2261; 3 mod 4  ||3, 7, 11, 15 ...    ||3<sup>0</sup> * (i -  3) /  4 +  1|| '''2'''||1, 2, 3, 4 ...      || &lt; ||i &#x2261; 0, 1, 2 mod 4        ||3/4
| '''1'''||  (unused rule)    ||                    ||                                  ||                    ||      ||i &#x2261; 0, 1, 2, 3 mod 4    ||2/2
|-                                                                                         
| '''2'''||i &#x2261; 3 mod 4  ||3, 7, 11, 15 ...    ||3<sup>0</sup> * (i -  3) /  4 +  1||1, 2, 3, 4 ...      || &lt; ||i &#x2261; 0, 1, 2 mod 4        ||3/4
|-                                                                                         
|-                                                                                         
|i &#x2261; 1 mod 4  ||(1), 5, 9, 13 ...    ||3<sup>1</sup> * (i -  1) /  4 +  1|| '''3'''||(1), 4, 7, 10 ...    || &lt;(=) ||i &#x2261; 0, 2, 4, 6 mod 8  ||1/2
| '''3'''||i &#x2261; 1 mod 4  ||(1), 5, 9, 13 ...    ||3<sup>1</sup> * (i -  1) /  4 +  1||1, 4, 7, 10 ...    || &lt;(=) ||i &#x2261; 0, 2, 4, 6 mod 8  ||2/4
|-                                                                                         
|-                                                                                         
|i &#x2261; 2 mod 8  ||2, 10, 18, 26 ...    ||3<sup>1</sup> * (i -  2) /  8 +  1|| '''4'''||1, 4, 7, 10 ...      || &lt; ||i &#x2261; 0, 4, 6 mod 8        ||3/8
| '''4'''||i &#x2261; 2 mod 8  ||2, 10, 18, 26 ...    ||3<sup>1</sup> * (i -  2) /  8 +  1||1, 4, 7, 10 ...      || &lt; ||i &#x2261; 0, 4, 6 mod 8        ||3/8
|-                                                                                         
|-                                                                                         
|i &#x2261; 6 mod 8  ||6, 14, 22, 30 ...    ||3<sup>2</sup> * (i -  6) /  8 +  7|| '''5'''||7, 16, 25, 34 ...    || &gt; ||i &#x2261; 0, 4, 8, 12 mod 16  ||1/4
| '''5'''||i &#x2261; 6 mod 8  ||6, 14, 22, 30 ...    ||3<sup>2</sup> * (i -  6) /  8 +  7||7, 16, 25, 34 ...    || &gt; ||i &#x2261; 0, 4, 8, 12 mod 16  ||2/8
|-                                                                                         
|-                                                                                         
|i &#x2261; 12 mod 16||12, 28, 44, 60 ...  ||3<sup>2</sup> * (i - 12) / 16 +  7|| '''6'''||7, 16, 25, 34 ...    || &lt; ||i &#x2261; 0, 4, 8 mod 16      ||3/16
| '''6'''||i &#x2261; 12 mod 16||12, 28, 44, 60 ...  ||3<sup>2</sup> * (i - 12) / 16 +  7||7, 16, 25, 34 ...    || &lt; ||i &#x2261; 0, 4, 8 mod 16      ||3/16
|-                                                                                         
|-                                                                                         
|i &#x2261; 4  mod 16||4, 20, 36, 52 ...    ||3<sup>3</sup> * (i -  4) / 16 +  7|| '''7'''||7, 34, 61, 88 ...    || &gt; ||i &#x2261; 0, 8, 16, 24 mod 32  ||1/8
| '''7'''||i &#x2261; 4  mod 16||4, 20, 36, 52 ...    ||3<sup>3</sup> * (i -  4) / 16 +  7||7, 34, 61, 88 ...    || &gt; ||i &#x2261; 0, 8, 16, 24 mod 32  ||2/16
|-                                                                                         
|-                                                                                         
|i &#x2261; 8  mod 32||8, 40, 72, 104 ...  ||3<sup>3</sup> * (i -  8) / 32 +  7|| '''8'''||7, 34, 61, 88 ...    || &lt; ||i &#x2261; 0, 16, 24 mod 32    ||3/32
| '''8'''||i &#x2261; 8  mod 32||8, 40, 72, 104 ...  ||3<sup>3</sup> * (i -  8) / 32 +  7||7, 34, 61, 88 ...    || &lt; ||i &#x2261; 0, 16, 24 mod 32    ||3/32
|-                                                                                     
|-                                                                                     
|i &#x2261; 24 mod 32||24, 56, 88, 120 ...  ||3<sup>4</sup> * (i - 24) / 32 + 61|| '''9'''||61, 142, 223, 304 ...|| &gt; ||i &#x2261; 0, 16, 32, 48 mod 64 ||1/16
| '''9'''||i &#x2261; 24 mod 32||24, 56, 88, 120 ...  ||3<sup>4</sup> * (i - 24) / 32 + 61||61, 142, 223, 304 ...|| &gt; ||i &#x2261; 0, 16, 32, 48 mod 64 ||2/32
|-                                                                                     
|-                                                                                     
|i &#x2261; 48 mod 64||48, 112, 176, 240 ...||3<sup>4</sup> * (i - 48) / 64 + 61||'''10'''||61, 142, 223, 304 ...|| &gt; ||i &#x2261; 0, 16, 32 mod 64    ||3/64
|'''10'''||i &#x2261; 48 mod 64||48, 112, 176, 240 ...||3<sup>4</sup> * (i - 48) / 64 + 61||61, 142, 223, 304 ...|| &gt; ||i &#x2261; 0, 16, 32 mod 64    ||3/64
|-                                                                                     
|-                                                                                     
|i &#x2261; 16 mod 64||16, 80, 144, 208 ... ||3<sup>5</sup> * (i - 16) / 64 + 61||'''11'''||61, 304, 547, 790 ...|| &gt; ||i &#x2261; 0, 32, 64, 96 mod 128||1/32
|'''11'''||i &#x2261; 16 mod 64||16, 80, 144, 208 ... ||3<sup>5</sup> * (i - 16) / 64 + 61||61, 304, 547, 790 ...|| &gt; ||i &#x2261; 0, 32, 64, 96 mod 128||2/64
|-                                                                                   
|...    ||...                ||...                  ||...                              ||...                  || ...  ||...                            ||...
|-                                                                                       
|-                                                                                       
| ...               || ...                || ...                             ||         || ...                 || &gt; || ...                           || ...
|...     ||i &#x2261; {3, 1} * 2<sup>j-2</sup> mod 2<sup>j</sup>||{3, 1} * 2<sup>j-2</sup> ...||3<sup>k</sup> * (i - {1, 3} * 2<sup>j-2</sup>) / 2<sup>j</sup> + m|| m ...|| &gt; || ... mod 2<sup>k+2</sup>  || {3, 2}/2<sup>j</sup>
|-
|-
|}
|}
It should be obvious how the next rows of this infinite table must be filled:  
It should be obvious how the rows of this infinite table must be filled in general:  
* The residues of 2<sup>k</sup> in the first column are 3 * 2<sup>k-2</sup>, 1 * 2<sup>k-2</sup> in an alternating sequence.  
* ''j'' and ''m'' are increased in every second row. 
* The additive constants in the second column are the indexes of the variable length segments with left parts (4), 40, 364, 3280, 29524 ([http://oeis.org/A191681 OEIS A191681]) mentioned above. They are repeated 4 times since the corresponding lengths "jump" by 4.
* The residues of ''2<sup>k</sup>'' in the source row column are ''3 * 2<sup>k-2</sup>, 1 * 2<sup>k-2</sup>'' with the pattern 3 1 1 3 3 1 1 3 3 ... for the factor. (This are also the values to be subtracted from ''i'' in the target row colum.)
* (C8) The odd source rows are covered by rules 2 and 3. All even source rows are covered by rules 4 and following.
* The additive constants ''n'' in the target row column are the indexes 1, 7, 61, 547, 4921 ... ([http://oeis.org/A066443 OEIS A066443]: a(n) = a(n-1) * 9 - 2) of the variable length segments with left sides (4), 40, 364, 3280, 29524 ([http://oeis.org/A191681 OEIS A191681]) mentioned above. They increase after 4 rows.
===Moving up or down===
There are three groups of attachment rules:
* Rules 2, 3 and 4 attach to a row with a lower index.
* Rules 5-8 attach to higher or lower indexes.
* Rules 9 and above attach to higher indexes.
:: This can be seen from the powers of 2 and 3 in the source and target row columns. Starting at rule 9, we have ''3<sup>k</sup &gt; 2<sup>k+2</sup'' for ''k &gt;= 4''.
 
===Properties of the Attachment Directory===
For the attachment directory A we note respectively claim:
* (A1) The source rows met by the conditions in the rules are all disjoint.
* (A2) The odd source rows are covered by rules 2 and 3. All even source rows are covered by rules 4 and following.
* (A3) The construction is such that the target column always exists in the target rows.
:: Table (T4) is derived from (T2) which has similiar modularity conditions.
* (A4) The target column (or rule number or branch level) depends on the modularity condition for ''i'' alone, but not on the value of ''i''.
:: This can be shown by the graph operations (&delta / &micro; / &sigma;) which are tied to the columns.
* (A5) It does not matter in which order the attachment rules are applied.
:: The rules may well ''hit'' the same target rows, but they always do so in different columns. It does not matter whether the target row is already attached.
 
===Attachment Process===
We will now use the rules of T4 to reduce the set of unattached segments in C in an iterative process. (A5) ensures that we can apply the rules in any order. The final goal is to show that all segments are attached, and that only the root segment could not be attached to a different segment.
 
We operate on A as follows. Considering simultaneously a set of source rows ''i &gt; 1'' (i.e. omitting the root segment) in C - which fulfill some modularity condition (the ''source'' row set), and which are so far unattached,  we attach their segments in parallel to the unique occurrences of their left side in the right part of C (''target row'' set and ''target column'').
:These operations on A involve infinite sets. They are similiar to the ''gedankenexperiment'' of [https://en.wikipedia.org/wiki/Hilbert%27s_paradox_of_the_Grand_Hotel Hilbert's hotel].
 
We first apply all rules 3 and above. That leaves us only with possibly unattached rows of the form ''i &#x2261; 1 mod 3''.
 
We then want to attach all even source rows.
 
By (G7) we have he reduction is rather easy. We first apply all rules 3 and above.


We identify and denote the attachment rules with the - bold -  target column number. The construction is such that the target column always exists.
* (C9) It does not matter in which order the single attachment steps are performed.
::Even if, during the application of one rule, a source row becomes a target row or vice versa, the target row and column is still found, and the attachment of the segment/tree can be performed.


* (C10) The target column depends on the modularity condition for ''i'' alone, but not on ''i''.
:: This is provable from the graph operations (&delta, &micro;, &sigma;) which are tied to the columns.
===No Cycles===
* '''(C11)''' The attachment process does not create any new cycle (in addition to the one in the root segment).
:: Let a segment/tree ''t<sub>1</sub>'' with left side ''n<sub>1</sub>'' and right part ''R<sub>1</sub>'' be attached to node ''n<sub>1</sub>'' in the right part ''R<sub>2</sub>'' of the unique segment/tree ''t<sub>2</sub>'' which has the left side by ''n<sub>2</sub>''. ''t<sub>1</sub>'' and ''t<sub>2</sub>'' are disjoint trees by (C4), therefore the result of such a single attachment step is a tree again (''u<sub>2</sub>'', still with left side ''n<sub>2</sub>'').
===Reduction of T4===
We will now use the rules of T4 to reduce the set of unattached segments in A. <!--We note that most of the rules attach each source row to a target row with a higher number (new position "&gt;" = ''move up''), while a few initial rules - 2, 3, 4, 6, 8 - ''move down'' (new position "&lt;").
:This property depends on the exponents of 2 and 3 in the rules.-->
====Contraction of pairs====
====Contraction of pairs====
The target rows for the rules 5 and 6 are the same (and also for 7 and 8, 9 and 10, etc.) We may safely apply rules 5, 7, 9 and all following odd rules. We are left with the source rows for which the rules 2, 3, 4 and all following even numbers are applicable.
The target rows for the rules 5 and 6 are the same (and also for 7 and 8, 9 and 10, etc.) We may safely apply rules 5, 7, 9 and the following odd rules (all ''moving up''). Afterwards we are left only with the source rows for which the rules 2, 3, 4 and all following rules with even numbers are applicable.
====Rules &gt; 3 attach to rows &#x2261; 1 mod 3====
====Rules &gt; 3 attach to rows &#x2261; 1 mod 3====
We see that the target rows for the rules with numbers &gt; 3 are all &#x2261; 1 mod 3. We may safely apply all these rules (with rule 4 we do not attach source row 1), and thereafter we are left with the odd source rows &#x2261; 1, 3 mod 4 only, for which rules 2, 3 are applicable. Both rules move down.
We see that the target rows for the rules with numbers &gt; 3 are all &#x2261; 1 mod 3. We may safely apply all these rules (with rule 4 we do not attach source row 1), and thereafter we are left with the odd source rows &#x2261; 1, 3 mod 4 only, for which rules 2, 3 are applicable. Both rules move down.
====Rule 3====
====Rule 3====
====Rule 2====
====Rule 2====
===No Cycles===
* '''(C13)''' The attachment process does not create any new cycle (in addition to the one in the root segment).
:: Let a segment/tree ''t<sub>1</sub>'' with left side ''n<sub>1</sub>'' and right part ''R<sub>1</sub>'' be attached to node ''n<sub>1</sub>'' in the right part ''R<sub>2</sub>'' of the unique segment/tree ''t<sub>2</sub>'' which has the left side by ''n<sub>2</sub>''. ''t<sub>1</sub>'' and ''t<sub>2</sub>'' are disjoint trees by (C4), therefore the result of such a single attachment step is a tree again (''u<sub>2</sub>'', still with left side ''n<sub>2</sub>'').
==The Collatz Tree==
==The Collatz Tree==
* (C11) The remaining single tree is a subgraph of the Collatz graph.
* (C11) The remaining single tree is a subgraph of the Collatz graph.

Revision as of 21:37, 9 November 2018

Abstract

Small, finite trees with two branches are constructed with the operations defined by Collatz for his 3x+1 problem, These trees are connected to form bigger graphs in an iterative process. It is shown that this process finally builds a single graph which is a tree except for one cycle at the root. This graph is then embedded into the Collatz graph, and it is thereby shown that the latter is also a tree except for the cycle 4-2-1.

Introduction

Collatz sequences (also called trajectories) are sequences of integer numbers > 0. For any start value > 0 the elements of the sequence are constructed with two simple rules:

  1. Even numbers are halved.
  2. Odd numbers are multiplied by 3 and then incremented by 1.

Since decades it is unknown whether the final cyle 4 - 2 - 1 is always reached for all start values. This problem is the Collatz conjecture, for which the english Wikipedia states:

It is also known as the 3n + 1 conjecture, the Ulam conjecture (after Stanisław Ulam), Kakutani's problem (after Shizuo Kakutani), the Thwaites conjecture (after Sir Bryan Thwaites), Hasse's algorithm (after Helmut Hasse), or the Syracuse problem; the sequence of numbers involved is referred to as the hailstone sequence or hailstone numbers (because the values are usually subject to multiple descents and ascents like hailstones in a cloud), or as wondrous numbers.

Simple visualizations of Collatz sequences show no obvious structure. The sequences for the first dozen of start values are rather short, but the sequence for 27 suddenly has 112 elements.

Da sieht man den Wald vor lauter Bämen nicht.
German proverb:
"There you cannot see the wood for the trees."

References

  • Jeffry C. Lagarias, Ed.: The Ultimate Challenge: The 3x+1 Problem, Amer. Math. Soc., 2010, ISBN 978-8218-4940-8. MBK78
  • OEIS A07165: File of first 10K Collatz sequences, ascending start values, with lengths
  • Manfred Trümper: The Collatz Problem in the Light of an Infinite Free Semigroup. Chinese Journal of Mathematics, Vol. 2014, Article ID 756917, 21 p.

Collatz Graph

When all Collatz sequences are read backwards, they form the Collatz graph starting with 1, 2, 4, 8 ... . At each node m > 4 in the graph, the path from the root (4) can be continued

  • always to m * 2, and
  • to (m - 1) / 3 if m ≡ 1 mod 3.

The Collatz conjecture claims that the graphs contains all numbers, and that - except for the leading cycle 1 - 2 - 4 - 1 - 2 - 4 ... - it has the form of a tree (without cycles). We will not consider the leading cycle, and we start the graph with node 4, the root. Moreover, another trivial type of path starts when m ≡ 0 mod 3. We call such a path a sprout, and it contains duplications only. Sprout must be added to the graph for any node divisible by 3, therefore we will not consider them for the moment.

Graph Operations

Following Trümper, we use abbreviations for the elementary operations which transform a node (element, number) in the Collatz graph according to the following table (T1):

Name Mnemonic Distance to root Mapping Condition
d down -1 m ↦ m / 2 m ≡ 0 mod 2
u up -1 m ↦ 3 * m + 1 (m ≡ 1 mod 2)
s := ud spike -2 m ↦ (3 * m + 1) / 2) m ≡ 1 mod 2
δ divide +1 m ↦ (m - 1) / 3 m ≡ 1 mod 3
µ multiply +1 m ↦ m * 2 (none)
σ := δµ squeeze +2 m ↦ ((m - 1) / 3) * 2 m ≡ 1 mod 3

We will mainly be interested in the reverse mappings (denoted with greek letters) which move away from the root of the graph.

3-by-2 Replacement

The σ operation, applied to numbers of the form 6 * m - 2, has an interesting property:

(6 * (3 * n) - 2) σ = 4 * 3 * n - 2 =  6 * (2 * n) - 2

In other words, as long as m contains a factor 3, the σ operation maintains the form 6 * x - 2, and it replaces the factor 3 by 2 (it "squeezes" a 3 into a 2). In the opposite direction, the s operation replaces a factor 2 in m by 3.


Motivation: Patterns in sequences with the same length

A closer look at the Collatz sequences shows a lot of pairs of adjacent start values which have the same sequence length, for example (from OEIS A070165):

142/104: 142 d  71 u 214 d 107 u 322 d 161 u 484 d  242 d 121 u 364 ] 182, 91, ... 4, 2, 1
143/104: 143 u 430 d 215 u 646 d 323 u 970 d 485 u 1456 d 728 d 364 ] 182, 91, ... 4, 2, 1
           +1  *6+4    +1  *6+4    +1  *6+4    +1   *6+4  *6+2    +0    +0 ...

The third line tells how the second line could be computed from the first. Walking from right to left, the step pattern is:

δ µ µ δ µ δ µ δ µ
µ µ δ µ δ µ δ µ δ

The alternating pattern of operations can be continued to the left with 4 additional pairs of steps:

 q? u [ 62 d  31 u  94 d  47 u 142 d ...
126 d [ 63 u 190 d  95 u 286 d 143 u ...
        +1  *6+4    +1  *6+4    +1

The pattern stops here since there is no number q such that q * 3 + 1 = 62.

Segments

These patterns lead us to the construction of special subsets of paths in the Collatz graph which we call segments. They lead away from the root, and they always start with a node m ≡ -2 mod 6. Then they split and follow two subpaths in a prescribed sequence of operations. The segment construction process is stopped when the next node in one of the two subpaths becomes divisible by 3, resp. when a δ operation is no more possible.

Segment Directory Construction

We list the segments as rows of an infinite array C[i,j], the so-called segment directory.

Informally, and in the two examples above, we consider the terms betweeen the square brackets. For the moment, we only take those which are which are ≡ 4 mod 6 (for "compressed" segments, below there are also "detailed" segments where we take all). We start at the right and with the lower line, and we interleave the terms ≡ 4 mod 6 of the two lines to get a segment.

The following table (T2) tells how the columns in one row i of the array C must be constructed if the condition is fulfilled:

Column Operation Formula Condition Sequence
1 6 * i - 2 4, 10, 16, 22, 28, ...
2 C[i,1] µµ 24 * (i - 1) / 1 + 16 16, 40, 64, 88, 112, ...
3 C[i,1] δµµ 24 * (i - 1) / 3 + 4 i ≡ 1 mod 3 4, 28, 52, 76, 100, ...
4 C[i,2] σ 48 * (i - 1) / 3 + 10 i ≡ 1 mod 3 10, 58, 106, 134, ...
5 C[i,3] σ 48 * (i - 7) / 9 + 34 i ≡ 7 mod 9 34, 82, 130, 178, ...
6 C[i,2] σσ 96 * (i - 7) / 9 + 70 i ≡ 7 mod 9 70, 166, 262, 358, ...
7 C[i,3] σσ 96 * (i - 7) / 27 + 22 i ≡ 7 mod 27 22, 118, 214, 310, ...
8 C[i,2] σσσ 192 * (i - 7) / 27 + 46 i ≡ 7 mod 27 46, 238, 430, 622, ...
9 C[i,3] σσσ 192 * (i - 61) / 81 + 142 i ≡ 61 mod 81 142, 334, ...
... ... ... ... ...
... C[i,2] σj-1 3 * 2j+2 * (i - m) / 3k + x i ≡ m mod 3k x, ...

The first column(s) C[i,1] will be denoted as the left side of the segment (or of the whole directory), while the columns C[i,j], j > 1 are the right part.

In general, the rows of T2 must be filled as follows:

  • j and k are increased in every second row.
  • The residues m of 3k in the condition column are are the indexes 1, 7, 61, 547, 4921 ... (OEIS A066443) of the variable length segments with left sides (4), 40, 364, 3280, 29524 (OEIS A191681). They increase (jump) after 4 rows.
  • The additive constants x in the formula are the values appearing first in columns 2-4 (in segment 1), 5-8 (in segment 7), 9-12 (in segment 61) etc.

The first few lines of the segment directory are the following:

1 2 3 4 5 6 7 8 9 10 11 ... 2*j 2*j+1
  i   6*i‑2 µµ δµµ µµσ δµµσ µµσσ δµµσσ µµσ3 δµµσ3 µµσ4 δµµσ4 ... µµσj-1 δµµσj-1
  1    4  16  4  10 
  2   10  40 
  3   16  64 
  4   22  88  28  58 
  5   28  112 
  6   34  136 
  7   40  160  52  106  34  70  22  46 

There is a more elaborated segment directory with 5000 rows.

Properties of the Segment Directory

We make a number of claims for the segment directory C:

  • (C1) All nodes in the segment directory are of the form 6 * n - 2.
This follows from the formula for columns C[i,1..3], and for any higher column numbers from the 3-by-2 replacement property of the σ operation.
  • (C2) All segments have a finite length.
At some point the σ operations will have replaced all factors 3 by 2.
  • (C3) All nodes in the right part of a segment have the form 6 * (3n * 2m * f) - 2 with the same "3-2-free" factor f.
This follows from the operations for columns C[i,1..3], and from the fact that the σ operation maintains this property.
  • (C4) All nodes in the right part of a particular segment are
    • different among themselves, and
    • different from the left side of that segment (except for the first segment for the root 4).
For C[i,1..2] we see that the values modulo 24 are different. For the remaining columns, we see that the exponents of the factors 2 and 3 are different. They are shifted by the σ operations, but they alternate, for example (in the segment with left part 40):
160 = 6 * (33 * 20 * 1) - 2
 52 = 6 * (32 * 20 * 1) - 2
106 = 6 * (32 * 21 * 1) - 2
 34 = 6 * (31 * 21 * 1) - 2
 70 = 6 * (31 * 22 * 1) - 2
 22 = 6 * (30 * 22 * 1) - 2
 46 = 6 * (30 * 23 * 1) - 2
  • (C5) There is no cycle in a segment (except for the first segment for the root 4).

Segment Lengths

Oviously the segment directory is very structured. The lengths of the compressed segments follow the pattern

4 2 2 4 2 2 L1 2 2 4 2 2 4 2 2 L2 2 2 4 2 2 ...

with two fixed lengths 2 and 4 and some variable lengths L1, L2 ... > 4. For the left parts 4, 40, 364, 3280, 29524 (OEIS A191681), the segment lengths have high values 4, 8, 12, 16, 20 which did not occur before. Those left parts are (9n+1 - 1) / 2, or 4 * Sum(9i, i = 0..n).

Coverage of the Right Part

We now examine the modular conditions which result from the segment construction table (T2) in order to find out how the numbers of the form 6 * n - 2 are covered by the right part of the segment directory. The following table (T3) shows the result:

Columns j Covered Remaining
2-3 4, 16 mod 24 10, 22, 34, 46 mod 48
3-4 10, 34 mod 48 22, 46, 70, 94 mod 96
5-6 70, 22 mod 96 46, 94, 142, 190 mod 192
7-8 46, 142 mod 192 94, 190, 286, 382 mod 384
... ... ...

We can always exclude the first and the third element remaining so far by looking in the next two columns of segments with sufficient length.

  • (C6) There is no limit on the length of a segment.
We only need to take a segment which, in its right part, has a factor of 3 with a sufficiently high power, and the σ operations will stretch out the segment accordingly.

Therefore we can continue the modulus table above indefinitely, which leads us to the claim:

  • (C7) All numbers of the form 6 * n - 2 occur exactly once in the right part of the segment directory, and once as a left side. There is a bijective mapping between the left sides and the elements of the right parts.
The sequences defined by the columns in the right part all have different modulus conditions. Therefore they are all disjoint. The left sides are disjoint by construction.

Segment Tree

So far we possess the segment directory C which represents the root segment and an infinite set of small trees with disjoint nodes and two branches. We know that the segments represent trees, and that their right parts are all disjoint and different from the left side.

We now want to attach (or connect) the segments to other graphs until we get a single big graph which will later become the backbone of the Collatz graph. Ideally the attachment process should maintain the tree property of the graphs all the time.

The verb attach emphasizes the direction of the operation better than the verb connect.

Attachment Directory Construction

Parallel to the segment directory we maintain the attachment directory A which, for any source segment in C:

  1. tells whether the tree corresponding to the segment was already attached to the graph represented by some other segment, and if so,
  2. tells the target row and column numbers i, j in the segment directory C where the source segment was attached.

Initially all segments are unattached.

Branch Levels

In general, when dealing with the 3x+1 problem, it seems difficult to introduce a measure, that is a numerically ordered property of some object related to the Collatz graph. This would be desireable in order to conduct a proof by induction, infinite descent, leading a minimal element to a contradiction etc.

Here we use the branch level, that is the column index j of the unique position C[i, j] in a segment where a second segment should be attached.

Attachment rules

The following table (T4) tells the computation rules for the target position, depending on the modularity condition of the source row index i. We identify and denote these attachment rules by the target column number respectively the branch level.

Branch
Level
Source
rows i
First source
rows
Target
rows
First
target rows
New
pos.
Remaining
rows
Remaining
Fraction
1 (unused rule) i ≡ 0, 1, 2, 3 mod 4 2/2
2 i ≡ 3 mod 4 3, 7, 11, 15 ... 30 * (i - 3) / 4 + 1 1, 2, 3, 4 ... < i ≡ 0, 1, 2 mod 4 3/4
3 i ≡ 1 mod 4 (1), 5, 9, 13 ... 31 * (i - 1) / 4 + 1 1, 4, 7, 10 ... <(=) i ≡ 0, 2, 4, 6 mod 8 2/4
4 i ≡ 2 mod 8 2, 10, 18, 26 ... 31 * (i - 2) / 8 + 1 1, 4, 7, 10 ... < i ≡ 0, 4, 6 mod 8 3/8
5 i ≡ 6 mod 8 6, 14, 22, 30 ... 32 * (i - 6) / 8 + 7 7, 16, 25, 34 ... > i ≡ 0, 4, 8, 12 mod 16 2/8
6 i ≡ 12 mod 16 12, 28, 44, 60 ... 32 * (i - 12) / 16 + 7 7, 16, 25, 34 ... < i ≡ 0, 4, 8 mod 16 3/16
7 i ≡ 4 mod 16 4, 20, 36, 52 ... 33 * (i - 4) / 16 + 7 7, 34, 61, 88 ... > i ≡ 0, 8, 16, 24 mod 32 2/16
8 i ≡ 8 mod 32 8, 40, 72, 104 ... 33 * (i - 8) / 32 + 7 7, 34, 61, 88 ... < i ≡ 0, 16, 24 mod 32 3/32
9 i ≡ 24 mod 32 24, 56, 88, 120 ... 34 * (i - 24) / 32 + 61 61, 142, 223, 304 ... > i ≡ 0, 16, 32, 48 mod 64 2/32
10 i ≡ 48 mod 64 48, 112, 176, 240 ... 34 * (i - 48) / 64 + 61 61, 142, 223, 304 ... > i ≡ 0, 16, 32 mod 64 3/64
11 i ≡ 16 mod 64 16, 80, 144, 208 ... 35 * (i - 16) / 64 + 61 61, 304, 547, 790 ... > i ≡ 0, 32, 64, 96 mod 128 2/64
... ... ... ... ... ... ... ...
... i ≡ {3, 1} * 2j-2 mod 2j {3, 1} * 2j-2 ... 3k * (i - {1, 3} * 2j-2) / 2j + m m ... > ... mod 2k+2 {3, 2}/2j

It should be obvious how the rows of this infinite table must be filled in general:

  • j and m are increased in every second row.
  • The residues of 2k in the source row column are 3 * 2k-2, 1 * 2k-2 with the pattern 3 1 1 3 3 1 1 3 3 ... for the factor. (This are also the values to be subtracted from i in the target row colum.)
  • The additive constants n in the target row column are the indexes 1, 7, 61, 547, 4921 ... (OEIS A066443: a(n) = a(n-1) * 9 - 2) of the variable length segments with left sides (4), 40, 364, 3280, 29524 (OEIS A191681) mentioned above. They increase after 4 rows.

Moving up or down

There are three groups of attachment rules:

  • Rules 2, 3 and 4 attach to a row with a lower index.
  • Rules 5-8 attach to higher or lower indexes.
  • Rules 9 and above attach to higher indexes.
This can be seen from the powers of 2 and 3 in the source and target row columns. Starting at rule 9, we have 3k</sup > 2k+2</sup for k >= 4.

Properties of the Attachment Directory

For the attachment directory A we note respectively claim:

  • (A1) The source rows met by the conditions in the rules are all disjoint.
  • (A2) The odd source rows are covered by rules 2 and 3. All even source rows are covered by rules 4 and following.
  • (A3) The construction is such that the target column always exists in the target rows.
Table (T4) is derived from (T2) which has similiar modularity conditions.
  • (A4) The target column (or rule number or branch level) depends on the modularity condition for i alone, but not on the value of i.
This can be shown by the graph operations (&delta / µ / σ) which are tied to the columns.
  • (A5) It does not matter in which order the attachment rules are applied.
The rules may well hit the same target rows, but they always do so in different columns. It does not matter whether the target row is already attached.

Attachment Process

We will now use the rules of T4 to reduce the set of unattached segments in C in an iterative process. (A5) ensures that we can apply the rules in any order. The final goal is to show that all segments are attached, and that only the root segment could not be attached to a different segment.

We operate on A as follows. Considering simultaneously a set of source rows i > 1 (i.e. omitting the root segment) in C - which fulfill some modularity condition (the source row set), and which are so far unattached, we attach their segments in parallel to the unique occurrences of their left side in the right part of C (target row set and target column).

These operations on A involve infinite sets. They are similiar to the gedankenexperiment of Hilbert's hotel.

We first apply all rules 3 and above. That leaves us only with possibly unattached rows of the form i ≡ 1 mod 3.

We then want to attach all even source rows.

By (G7) we have he reduction is rather easy. We first apply all rules 3 and above.


Contraction of pairs

The target rows for the rules 5 and 6 are the same (and also for 7 and 8, 9 and 10, etc.) We may safely apply rules 5, 7, 9 and the following odd rules (all moving up). Afterwards we are left only with the source rows for which the rules 2, 3, 4 and all following rules with even numbers are applicable.

Rules > 3 attach to rows ≡ 1 mod 3

We see that the target rows for the rules with numbers > 3 are all ≡ 1 mod 3. We may safely apply all these rules (with rule 4 we do not attach source row 1), and thereafter we are left with the odd source rows ≡ 1, 3 mod 4 only, for which rules 2, 3 are applicable. Both rules move down.

Rule 3

Rule 2

No Cycles

  • (C13) The attachment process does not create any new cycle (in addition to the one in the root segment).
Let a segment/tree t1 with left side n1 and right part R1 be attached to node n1 in the right part R2 of the unique segment/tree t2 which has the left side by n2. t1 and t2 are disjoint trees by (C4), therefore the result of such a single attachment step is a tree again (u2, still with left side n2).

The Collatz Tree

  • (C11) The remaining single tree is a subgraph of the Collatz graph.
The edges of the compressed tree carry combined operations µµ, δµµ and σ = δµ.

So far, numbers of the form x ≡ 0, 1, 2, 3, 5 mod 6 are missing from the compressed tree.

We insert intermediate nodes into the compressed tree by applying operations on the left parts of the segments as shown in the following table (T5):

Operation Condition Resulting Nodes Remaining Nodes
δ 2 * i - 1 i ≡ 0, 2, 6, 8 mod 12
µ 12 * i - 4 i ≡ 0, 2, 6 mod 12
δµ i ≡ 1, 2 mod 3 4 * i - 2 i ≡ 0, 12 mod 24
δµµ i ≡ 2 mod 3 8 * i - 4 i ≡ 0 mod 24
δµµµ i ≡ 2 mod 3 16 * i - 8 (none)

The first three rows in T5 care for the intermediate nodes at the beginning of the segment construction with columns 1, 2, 3. Rows 4 and 5 generate the sprouts (starting at multiples of 3) which are not contained in the segment directory.

We call such a construction a detailed segment (in contrast to the compressed segments described above).

A detailed segment directory can be created by the same Perl program. In that directory, the two subpaths of a segment are shown in two lines. Only the highlighted nodes are unique.
  • (C11) The connectivity of the compressed tree remains unaffected by the insertions.
  • (C12) With the insertions of T5, the compressed tree covers the whole Collatz graph.
  • (C13) The Collatz graph is a tree (except for the trivial cycle 4-2-1).