Unfoldable Self-Avoiding Walks Christophe Guyeux - Luigi Marangio Femto-ST Institute, Université de Bourgogne Franche-Comté 06/11/2017 Christophe Guyeux - Luigi Marangio Unfoldable Self-Avoiding Walks 06/11/2017 1 / 26
Contents 1 Introduction 2 Self-Avoiding Walks 3 The Pivot Algorithm 4 Foldable Self-Avoiding Walks 5 Conclusion Christophe Guyeux - Luigi Marangio Unfoldable Self-Avoiding Walks 06/11/2017 2 / 26
The Protein Folding Process Proteins, polymers formed by dierent kinds of amino acids, fold to form a specic three-dimensional shape; this geometric pattern (native state) denes the majority of functionality within an organism. Contrary to the mapping from DNA to the amino acids sequence, the complex folding of this last sequence still remains not well-understood. Christophe Guyeux - Luigi Marangio Unfoldable Self-Avoiding Walks 06/11/2017 3 / 26
The Protein Folding Process Proteins, polymers formed by dierent kinds of amino acids, fold to form a specic three-dimensional shape; this geometric pattern (native state) denes the majority of functionality within an organism. Contrary to the mapping from DNA to the amino acids sequence, the complex folding of this last sequence still remains not well-understood. Christophe Guyeux - Luigi Marangio Unfoldable Self-Avoiding Walks 06/11/2017 3 / 26
The 2D HP model Hydrophilic-hydrophobic 2D square lattice model (Lau and Dill, 1989): A protein conformation is a self-avoiding walk (SAW) on a Z 2 lattice (low resolution model); its free energy E must be minimal; E depends on contacts between hydrophobic amino acids that are not contiguous in the primary structure. Hydrophobic interactions dominate protein folding: Protein core freeing up energy is formed by hydrophobic amino acids; hydrophilic a.a. tend to move in the outer surface. Christophe Guyeux - Luigi Marangio Unfoldable Self-Avoiding Walks 06/11/2017 4 / 26
The 2D HP model Hydrophilic-hydrophobic 2D square lattice model (Lau and Dill, 1989): A protein conformation is a self-avoiding walk (SAW) on a Z 2 lattice (low resolution model); its free energy E must be minimal; E depends on contacts between hydrophobic amino acids that are not contiguous in the primary structure. Hydrophobic interactions dominate protein folding: Protein core freeing up energy is formed by hydrophobic amino acids; hydrophilic a.a. tend to move in the outer surface. Christophe Guyeux - Luigi Marangio Unfoldable Self-Avoiding Walks 06/11/2017 4 / 26
The 2D HP model Hydrophilic-hydrophobic 2D square lattice model (Lau and Dill, 1989): A protein conformation is a self-avoiding walk (SAW) on a Z 2 lattice (low resolution model); its free energy E must be minimal; E depends on contacts between hydrophobic amino acids that are not contiguous in the primary structure. Hydrophobic interactions dominate protein folding: Protein core freeing up energy is formed by hydrophobic amino acids; hydrophilic a.a. tend to move in the outer surface. Christophe Guyeux - Luigi Marangio Unfoldable Self-Avoiding Walks 06/11/2017 4 / 26
The 2D HP model The goal is to map this straight line (black squares are hydrophobic residues): in this latter, having more black neighbours: Christophe Guyeux - Luigi Marangio Unfoldable Self-Avoiding Walks 06/11/2017 5 / 26
Resolving the PSP problem P. Crescenzi etc. On the complexity of protein folding (1998) The Protein Structure Prediction (PSP) problem is a NP-complete one, thus the optimal conformation(s) cannot be found exactly for large n's. Prediction strategies These conformations are thus predicted using articial intelligence tools, usually 1 start by predicting the 2D backbone, 2 then rene the obtained conformation in a 3D shape. At least two strategies for 2D backbone prediction: Method 1: iterating ±90 pivot moves on the straight line. Method 2: stretching 1 amino acid until obtaining an n-steps conformation. Christophe Guyeux - Luigi Marangio Unfoldable Self-Avoiding Walks 06/11/2017 6 / 26
Resolving the PSP problem P. Crescenzi etc. On the complexity of protein folding (1998) The Protein Structure Prediction (PSP) problem is a NP-complete one, thus the optimal conformation(s) cannot be found exactly for large n's. Prediction strategies These conformations are thus predicted using articial intelligence tools, usually 1 start by predicting the 2D backbone, 2 then rene the obtained conformation in a 3D shape. At least two strategies for 2D backbone prediction: Method 1: iterating ±90 pivot moves on the straight line. Method 2: stretching 1 amino acid until obtaining an n-steps conformation. Christophe Guyeux - Luigi Marangio Unfoldable Self-Avoiding Walks 06/11/2017 6 / 26
Resolving the PSP problem P. Crescenzi etc. On the complexity of protein folding (1998) The Protein Structure Prediction (PSP) problem is a NP-complete one, thus the optimal conformation(s) cannot be found exactly for large n's. Prediction strategies These conformations are thus predicted using articial intelligence tools, usually 1 start by predicting the 2D backbone, 2 then rene the obtained conformation in a 3D shape. At least two strategies for 2D backbone prediction: Method 1: iterating ±90 pivot moves on the straight line. Method 2: stretching 1 amino acid until obtaining an n-steps conformation. Christophe Guyeux - Luigi Marangio Unfoldable Self-Avoiding Walks 06/11/2017 6 / 26
Various method for solving PSP: Christophe Guyeux - Luigi Marangio Unfoldable Self-Avoiding Walks 06/11/2017 7 / 26
Self-Avoiding Walks Denition (Self-Avoiding walk) Let d > 1. A n-step self-avoiding walk (SAW) from x Z d to y Z d is a map w : [0, n] Z d such that: w(0) = x and w(n) = y, w(i + 1) w(i) = 1, i, j [0, n], i j w(i) w(j) (self-avoiding property). Christophe Guyeux - Luigi Marangio Unfoldable Self-Avoiding Walks 06/11/2017 8 / 26
Self-avoiding walk encoding An absolute encoding for Saws: Movement Encoding Forward 0 Down 1 Backward 2 Up 3 Figure: Absolute encoding: 00011123322101 Christophe Guyeux - Luigi Marangio Unfoldable Self-Avoiding Walks 06/11/2017 9 / 26
What we (do not) know about Saws Let c n denote the number of n-step self-avoiding walks; when m-step SAWs are concatenated to n-step SAWs, we found all (m + n)-saws and other walks having intersections, thus Proposition m, n N, c m+n c m c n i.e., {logc n } is a sub-additive sequence. Connective constant Fekete's Lemma lim n c 1 n n = µ The exact value of µ is only known for the hexagonal lattice, where it is equal to 2 + 2 Christophe Guyeux - Luigi Marangio Unfoldable Self-Avoiding Walks 06/11/2017 10 / 26
What we (do not) know about Saws Let c n denote the number of n-step self-avoiding walks; when m-step SAWs are concatenated to n-step SAWs, we found all (m + n)-saws and other walks having intersections, thus Proposition m, n N, c m+n c m c n i.e., {logc n } is a sub-additive sequence. Connective constant Fekete's Lemma lim n c 1 n n = µ The exact value of µ is only known for the hexagonal lattice, where it is equal to 2 + 2 Christophe Guyeux - Luigi Marangio Unfoldable Self-Avoiding Walks 06/11/2017 10 / 26
What we (do not) know about Saws Let c n denote the number of n-step self-avoiding walks; when m-step SAWs are concatenated to n-step SAWs, we found all (m + n)-saws and other walks having intersections, thus Proposition m, n N, c m+n c m c n i.e., {logc n } is a sub-additive sequence. Connective constant Fekete's Lemma lim n c 1 n n = µ The exact value of µ is only known for the hexagonal lattice, where it is equal to 2 + 2 Christophe Guyeux - Luigi Marangio Unfoldable Self-Avoiding Walks 06/11/2017 10 / 26
A formula for Saws Numerical simulations give us a formula for c n : Conjecture c n µ n n 11 23 However such a formula is useless in this form: we do not know the value of µ; numerical simulations are not a mathematical proof. Finding a mathematical proof of such a formula, means to understand the behavour of SAW's, and thus the behaviour of natural structures that can be formalized with this kind of walks (such as proteins). Christophe Guyeux - Luigi Marangio Unfoldable Self-Avoiding Walks 06/11/2017 11 / 26
A formula for Saws Numerical simulations give us a formula for c n : Conjecture c n µ n n 11 23 However such a formula is useless in this form: we do not know the value of µ; numerical simulations are not a mathematical proof. Finding a mathematical proof of such a formula, means to understand the behavour of SAW's, and thus the behaviour of natural structures that can be formalized with this kind of walks (such as proteins). Christophe Guyeux - Luigi Marangio Unfoldable Self-Avoiding Walks 06/11/2017 11 / 26
Pivot move of ±90 The anticlockwise fold function is the function f : Z/4Z Z/4Z f (x) = x 1 (4) Christophe Guyeux - Luigi Marangio Unfoldable Self-Avoiding Walks 06/11/2017 12 / 26
The Pivot Algorithm The pivot algorithm is a dynamic Monte Carlo algorithm: generates SAWs (ω 0,..., ω N ) in a canonical ensemble (xed number of steps N), one endpoint xed at the origin (ω 0 = 0) and the other endpoint free. Randomly choose a pivot point k along the walk; randomly choose an element of the symmetry group of the lattice; then apply the symmetry-group element to ω k+1... ω N using ω k as a pivot i.e., as the temporary origin. Christophe Guyeux - Luigi Marangio Unfoldable Self-Avoiding Walks 06/11/2017 13 / 26
The Pivot Algorithm The pivot algorithm is a dynamic Monte Carlo algorithm: generates SAWs (ω 0,..., ω N ) in a canonical ensemble (xed number of steps N), one endpoint xed at the origin (ω 0 = 0) and the other endpoint free. Randomly choose a pivot point k along the walk; randomly choose an element of the symmetry group of the lattice; then apply the symmetry-group element to ω k+1... ω N using ω k as a pivot i.e., as the temporary origin. Christophe Guyeux - Luigi Marangio Unfoldable Self-Avoiding Walks 06/11/2017 13 / 26
The Pivot Algorithm The pivot algorithm is a dynamic Monte Carlo algorithm: generates SAWs (ω 0,..., ω N ) in a canonical ensemble (xed number of steps N), one endpoint xed at the origin (ω 0 = 0) and the other endpoint free. Randomly choose a pivot point k along the walk; randomly choose an element of the symmetry group of the lattice; then apply the symmetry-group element to ω k+1... ω N using ω k as a pivot i.e., as the temporary origin. Christophe Guyeux - Luigi Marangio Unfoldable Self-Avoiding Walks 06/11/2017 13 / 26
The Pivot Algorithm The pivot algorithm is a dynamic Monte Carlo algorithm: generates SAWs (ω 0,..., ω N ) in a canonical ensemble (xed number of steps N), one endpoint xed at the origin (ω 0 = 0) and the other endpoint free. Randomly choose a pivot point k along the walk; randomly choose an element of the symmetry group of the lattice; then apply the symmetry-group element to ω k+1... ω N using ω k as a pivot i.e., as the temporary origin. Christophe Guyeux - Luigi Marangio Unfoldable Self-Avoiding Walks 06/11/2017 13 / 26
Consider, for example, the case d = 2. Then the symmetry group of Z 2 is the dihedral group D 4, which has eight elements: identity (1) ±90 rotations (2) 180 rotation (1) axis reections (2) diagonal reections (2) Theorem (Madras and Sokal (1988)) The pivot algorithm is ergodic for self-avoiding walks on Z d provided that all axis reections, and: either all ±90 rotations; or all diagonal reections, are given with non zero probability. Christophe Guyeux - Luigi Marangio Unfoldable Self-Avoiding Walks 06/11/2017 14 / 26
Consider, for example, the case d = 2. Then the symmetry group of Z 2 is the dihedral group D 4, which has eight elements: identity (1) ±90 rotations (2) 180 rotation (1) axis reections (2) diagonal reections (2) Theorem (Madras and Sokal (1988)) The pivot algorithm is ergodic for self-avoiding walks on Z d provided that all axis reections, and: either all ±90 rotations; or all diagonal reections, are given with non zero probability. Christophe Guyeux - Luigi Marangio Unfoldable Self-Avoiding Walks 06/11/2017 14 / 26
Madras and Sokal example Ergodicity is lost when considering single ±90 pivot moves Christophe Guyeux - Luigi Marangio Unfoldable Self-Avoiding Walks 06/11/2017 15 / 26
Introducing the foldable SAWs Christophe Guyeux - Luigi Marangio Unfoldable Self-Avoiding Walks 06/11/2017 16 / 26
A graph structure for SAWs Dene the graph σ n as follows: its vertices are the n-step self-avoiding walks, described in absolute encoding; there is an edge between two vertices s i, s j if and only if s j can be obtained by one pivot move of ±90 on s i. Christophe Guyeux - Luigi Marangio Unfoldable Self-Avoiding Walks 06/11/2017 17 / 26
A graph structure for SAWs Dene the graph σ n as follows: its vertices are the n-step self-avoiding walks, described in absolute encoding; there is an edge between two vertices s i, s j if and only if s j can be obtained by one pivot move of ±90 on s i. Christophe Guyeux - Luigi Marangio Unfoldable Self-Avoiding Walks 06/11/2017 17 / 26
Examples of σ n for n = 2, 3 Christophe Guyeux - Luigi Marangio Unfoldable Self-Avoiding Walks 06/11/2017 18 / 26
Method 1 vs Method 2 Let us consider an equivalence relation on σ n w 1 Rw 2 w 1 is in the same connected component that w 2 Denition The subset of the foldable SAWs, fsaw n, is the connected component of the straight line 00... 00, in σ n. fsaw n σ n It is a trivial consequence of Madras example; usually this fact is not known by bioinformaticians; method 1 and method 2 do not produce the same set of conformations Christophe Guyeux - Luigi Marangio Unfoldable Self-Avoiding Walks 06/11/2017 19 / 26
Method 1 vs Method 2 Let us consider an equivalence relation on σ n w 1 Rw 2 w 1 is in the same connected component that w 2 Denition The subset of the foldable SAWs, fsaw n, is the connected component of the straight line 00... 00, in σ n. fsaw n σ n It is a trivial consequence of Madras example; usually this fact is not known by bioinformaticians; method 1 and method 2 do not produce the same set of conformations Christophe Guyeux - Luigi Marangio Unfoldable Self-Avoiding Walks 06/11/2017 19 / 26
Method 1 vs Method 2 Let us consider an equivalence relation on σ n w 1 Rw 2 w 1 is in the same connected component that w 2 Denition The subset of the foldable SAWs, fsaw n, is the connected component of the straight line 00... 00, in σ n. fsaw n σ n It is a trivial consequence of Madras example; usually this fact is not known by bioinformaticians; method 1 and method 2 do not produce the same set of conformations Christophe Guyeux - Luigi Marangio Unfoldable Self-Avoiding Walks 06/11/2017 19 / 26
Other subset of SAWs fsaw n, the equivalence class of the n-step straight walk; fsaw (n, k) is the set of equivalence classes of size k in σ n ; USAW n = fsaw (n, 1), that is, the set of all unfoldable walks; Madras' example belongs in USAW 223. f 1 SAW n is the complement of USAW n in σ n. This is the set of SAWs on which we can apply at least one pivot move of ±90. Christophe Guyeux - Luigi Marangio Unfoldable Self-Avoiding Walks 06/11/2017 20 / 26
Current smallest (107-step) USAW Christophe Guyeux - Luigi Marangio Unfoldable Self-Avoiding Walks 06/11/2017 21 / 26
{n : USAW n } = Christophe Guyeux - Luigi Marangio Unfoldable Self-Avoiding Walks 06/11/2017 22 / 26
A dierent subset of SAWs: fsaw s Denition fsaw n : walks in fsaw n such that no intersection of vertex or edge during the transformation of one SAW to another occurs. Christophe Guyeux - Luigi Marangio Unfoldable Self-Avoiding Walks 06/11/2017 23 / 26
A dierent subset of SAWs: prudent SAWs Denition A self-avoiding walk is prudent if it never takes a step pointing towards a vertex it has already visited. It is equivalent to require that the new step avoids the past convex hull of the walk. Prudent fsaws Christophe Guyeux - Luigi Marangio Unfoldable Self-Avoiding Walks 06/11/2017 24 / 26
A dierent subset of SAWs: prudent SAWs Denition A self-avoiding walk is prudent if it never takes a step pointing towards a vertex it has already visited. It is equivalent to require that the new step avoids the past convex hull of the walk. Prudent fsaws Christophe Guyeux - Luigi Marangio Unfoldable Self-Avoiding Walks 06/11/2017 24 / 26
Some open question We know that for n 14 fsaw n = σ n and that fsaw 107 σ 107...we can do better? The PSP problem still remains NP-complete in fsaw n? Is there an unfoldable walk in Z 3? For any dimension d, do we have the existence of n N such that fsaw d n σ d n? fsaw 2 2 and fsaw 2 3 are Hamiltonian graphs, but they are not Eulerian. What about fsaw n k? It seems reasonable to restrict the PSP problem to fsaws; what if we restrict the problem to the prudent SAWs? Can we produce a more ecient algorithm then Pivot algorithm, to produce prudent SAWs? Christophe Guyeux - Luigi Marangio Unfoldable Self-Avoiding Walks 06/11/2017 25 / 26
Some open question We know that for n 14 fsaw n = σ n and that fsaw 107 σ 107...we can do better? The PSP problem still remains NP-complete in fsaw n? Is there an unfoldable walk in Z 3? For any dimension d, do we have the existence of n N such that fsaw d n σ d n? fsaw 2 2 and fsaw 2 3 are Hamiltonian graphs, but they are not Eulerian. What about fsaw n k? It seems reasonable to restrict the PSP problem to fsaws; what if we restrict the problem to the prudent SAWs? Can we produce a more ecient algorithm then Pivot algorithm, to produce prudent SAWs? Christophe Guyeux - Luigi Marangio Unfoldable Self-Avoiding Walks 06/11/2017 25 / 26
Some open question We know that for n 14 fsaw n = σ n and that fsaw 107 σ 107...we can do better? The PSP problem still remains NP-complete in fsaw n? Is there an unfoldable walk in Z 3? For any dimension d, do we have the existence of n N such that fsaw d n σ d n? fsaw 2 2 and fsaw 2 3 are Hamiltonian graphs, but they are not Eulerian. What about fsaw n k? It seems reasonable to restrict the PSP problem to fsaws; what if we restrict the problem to the prudent SAWs? Can we produce a more ecient algorithm then Pivot algorithm, to produce prudent SAWs? Christophe Guyeux - Luigi Marangio Unfoldable Self-Avoiding Walks 06/11/2017 25 / 26
Some open question We know that for n 14 fsaw n = σ n and that fsaw 107 σ 107...we can do better? The PSP problem still remains NP-complete in fsaw n? Is there an unfoldable walk in Z 3? For any dimension d, do we have the existence of n N such that fsaw d n σ d n? fsaw 2 2 and fsaw 2 3 are Hamiltonian graphs, but they are not Eulerian. What about fsaw n k? It seems reasonable to restrict the PSP problem to fsaws; what if we restrict the problem to the prudent SAWs? Can we produce a more ecient algorithm then Pivot algorithm, to produce prudent SAWs? Christophe Guyeux - Luigi Marangio Unfoldable Self-Avoiding Walks 06/11/2017 25 / 26
Some open question We know that for n 14 fsaw n = σ n and that fsaw 107 σ 107...we can do better? The PSP problem still remains NP-complete in fsaw n? Is there an unfoldable walk in Z 3? For any dimension d, do we have the existence of n N such that fsaw d n σ d n? fsaw 2 2 and fsaw 2 3 are Hamiltonian graphs, but they are not Eulerian. What about fsaw n k? It seems reasonable to restrict the PSP problem to fsaws; what if we restrict the problem to the prudent SAWs? Can we produce a more ecient algorithm then Pivot algorithm, to produce prudent SAWs? Christophe Guyeux - Luigi Marangio Unfoldable Self-Avoiding Walks 06/11/2017 25 / 26
Thank you! Christophe Guyeux - Luigi Marangio Unfoldable Self-Avoiding Walks 06/11/2017 26 / 26