10.1. Basics, Graphs, and Rules

10.1.1. Graph Interface

Graph objects have a full interface to access individual vertices and edges. The attributes of vertices and edges can be accessed both in their raw string form, and as their chemical counterpart (if they have one).

g = graphDFS("[R]{x}C([O-])CC=O")
print("|V| =", g.numVertices)
print("|E| =", g.numEdges)
for v in g.vertices:
   print("v%d: label='%s'" % (v.id, v.stringLabel), end="")
   print("\tas molecule: atomId=%d, charge=%d" % (v.atomId, v.charge), end="")
   print("\tis oxygen?", v.atomId == AtomIds.Oxygen)
   print("\td(v) =", v.degree)
   for e in v.incidentEdges:
      print("\tneighbour:", e.target.id)
for e in g.edges:
   print("(v%d, v%d): label='%s'" % (e.source.id, e.target.id, e.stringLabel), end="")
   # An edge with a non-chemical label (here "x") has no valid bond type,
   # so every access to e.bondType is kept inside the try block:
   try:
      bondType = e.bondType
      bt = str(bondType)
      isDouble = bondType == BondType.Double
   except LogicError:
      bt = "Invalid"
      isDouble = False
   print("\tas molecule: bondType=%s" % bt, end="")
   print("\tis double bond?", isDouble)

10.1.2. Formose Grammar

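This example defines the graph grammar modelling the formose chemistry: two input molecules (formaldehyde and glycolaldehyde) and two reaction rules (keto-enol isomerization and aldol addition), each rule loaded in both the forward and the inverse direction.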

formaldehyde = smiles("C=O", name="Formaldehyde")
glycolaldehyde = smiles("OCC=O", name="Glycolaldehyde")
ketoEnolGML = """rule [
   ruleID "Keto-enol isomerization"
   left [
      edge [ source 1 target 4 label "-" ]
      edge [ source 1 target 2 label "-" ]
      edge [ source 2 target 3 label "=" ]
   ]
   context [
      node [ id 1 label "C" ]
      node [ id 2 label "C" ]
      node [ id 3 label "O" ]
      node [ id 4 label "H" ]
   ]
   right [
      edge [ source 1 target 2 label "=" ]
      edge [ source 2 target 3 label "-" ]
      edge [ source 3 target 4 label "-" ]
   ]
]"""
ketoEnol_F = ruleGMLString(ketoEnolGML)
ketoEnol_B = ruleGMLString(ketoEnolGML, invert=True)
aldolAddGML = """rule [
   ruleID "Aldol Addition"
   left [
      edge [ source 1 target 2 label "=" ]
      edge [ source 2 target 3 label "-" ]
      edge [ source 3 target 4 label "-" ]
      edge [ source 5 target 6 label "=" ]
   ]
   context [
      node [ id 1 label "C" ]
      node [ id 2 label "C" ]
      node [ id 3 label "O" ]
      node [ id 4 label "H" ]
      node [ id 5 label "O" ]
      node [ id 6 label "C" ]
   ]
   right [
      edge [ source 1 target 2 label "-" ]
      edge [ source 2 target 3 label "=" ]
      edge [ source 5 target 6 label "-" ]

      edge [ source 4 target 5 label "-" ]
      edge [ source 6 target 1 label "-" ]
   ]
]"""
aldolAdd_F = ruleGMLString(aldolAddGML)
aldolAdd_B = ruleGMLString(aldolAddGML, invert=True)
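
To make the effect of invert=True more concrete, the following is a minimal pure-Python sketch of the idea, not MØD's implementation. It assumes a rule can be reduced to its left and right edge lists (the context is omitted because it is unchanged), and the names keto_enol, invert, and apply_rule are hypothetical helpers introduced only for this illustration. Inverting swaps the two sides, so applying the forward rule and then the inverse rule at the same match restores the original edges.

```python
# Keto-enol isomerization reduced to its changing edges: (source, target, label).
keto_enol = {
    "left":  [(1, 4, "-"), (1, 2, "-"), (2, 3, "=")],
    "right": [(1, 2, "="), (2, 3, "-"), (3, 4, "-")],
}

def invert(rule):
    """Return the inverse rule: the left and right sides are swapped."""
    return {"left": rule["right"], "right": rule["left"]}

def apply_rule(rule, edges, match):
    """Rewrite an edge dictionary {frozenset({u, v}): label}.

    'match' maps rule node ids to graph vertex ids.
    """
    result = dict(edges)
    for s, t, label in rule["left"]:
        key = frozenset((match[s], match[t]))
        if result.get(key) != label:
            raise ValueError("left side does not match at %r" % (sorted(key),))
        del result[key]
    for s, t, label in rule["right"]:
        result[frozenset((match[s], match[t]))] = label
    return result

# A keto fragment: C10-C11, C11=O12, C10-H13, plus one unrelated edge C10-H14.
keto = {
    frozenset((10, 13)): "-",
    frozenset((10, 11)): "-",
    frozenset((11, 12)): "=",
    frozenset((10, 14)): "-",
}
match = {1: 10, 2: 11, 3: 12, 4: 13}

enol = apply_rule(keto_enol, keto, match)
back = apply_rule(invert(keto_enol), enol, match)
print("C=C after forward step?", enol[frozenset((10, 11))] == "=")
print("round trip restores keto form?", back == keto)
```

The unrelated edge is untouched in both directions, which mirrors how a rule only rewrites the part of the graph that it matches.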

10.1.3. Printing Graphs/Molecules

The visualisation of graphs can be “prettified” using special printing options. The changes can make the graphs look like normal molecule visualisations.

# Our test graph, representing the molecule caffeine:
g = smiles('Cn1cnc2c1c(=O)n(c(=O)n2C)C')
# Make an object to hold our settings:
p = GraphPrinter()
# First try visualising without any prettifications:
p.disableAll()
g.print(p)
# Now make chemical edges look like bonds, and put colour on atoms.
# Also put the "charge" part of vertex labels in superscript:
p.edgesAsBonds = True
p.raiseCharges = True
p.withColour = True
g.print(p)
# We can also "collapse" normal hydrogen atoms into the neighbours,
# and just show a count:
p.collapseHydrogens = True
g.print(p)
# And finally we can make "internal" carbon atoms simple lines:
p.simpleCarbons = True
g.print(p)
# There are also options for adding indices to the vertices,
# and for modifying the rendering of labels and edges:
p2 = GraphPrinter()
p2.disableAll()
p2.withTexttt = True
p2.thick = True
p2.withIndex = True
# We can actually print two different versions at the same time:
g.print(p2, p)

10.1.4. Including Files

We can include other files (à la C/C++) to separate functionality.

include("050_formoseGrammar.py")
post.summarySection("Input Graphs")
for a in inputGraphs:
   a.print()
post.summarySection("Input Rules")
for a in inputRules:
   a.print()

10.1.5. Graph Loading

Molecules are encoded as attributed graphs. They can be loaded from SMILES strings, and in general any graph can be loaded from a GML specification, or from the SMILES-like format GraphDFS.

# Load a graph from a SMILES string (only for molecule graphs):
ethanol1 = smiles("CCO", name="Ethanol1")
# Load a graph from a SMILES-like format, called "GraphDFS", but for general graphs:
ethanol2 = graphDFS("[C]([H])([H])([H])[C]([H])([H])[O][H]", name="Ethanol2")
# The GraphDFS format also supports implicit hydrogens:
ethanol3 = graphDFS("CCO", name="Ethanol3")
# The basic graph format is GML:
ethanol4 = graphGMLString("""graph [
   node [ id 0 label "C" ]
   node [ id 1 label "C" ]
   node [ id 2 label "O" ]
   node [ id 3 label "H" ]
   node [ id 4 label "H" ]
   node [ id 5 label "H" ]
   node [ id 6 label "H" ]
   node [ id 7 label "H" ]
   node [ id 8 label "H" ]
   edge [ source 1 target 0 label "-" ]
   edge [ source 2 target 1 label "-" ]
   edge [ source 3 target 0 label "-" ]
   edge [ source 4 target 0 label "-" ]
   edge [ source 5 target 0 label "-" ]
   edge [ source 6 target 1 label "-" ]
   edge [ source 7 target 1 label "-" ]
   edge [ source 8 target 2 label "-" ]
]""", name="Ethanol4")
# They really are all loading the same graph into different objects:
assert ethanol1.isomorphism(ethanol2) == 1
assert ethanol1.isomorphism(ethanol3) == 1
assert ethanol1.isomorphism(ethanol4) == 1
# and they can be visualised:
ethanol1.print()
# All loaded graphs are added to a list 'inputGraphs':
for g in inputGraphs:
   g.print()
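
The four ethanol loaders agree because "CCO" leaves the hydrogens implicit, while the GML version lists all six of them. The sketch below shows, under a deliberately simplified valence model, how such implicit hydrogen counts can be derived from the heavy-atom skeleton. It is an illustration only and not MØD's actual rules: the tables DEFAULT_VALENCE and BOND_ORDER and the function implicit_hydrogens are assumptions made for this example, and charges and aromaticity are ignored.

```python
# Simplified default valences and bond orders (assumptions for this sketch).
DEFAULT_VALENCE = {"C": 4, "N": 3, "O": 2}
BOND_ORDER = {"-": 1, "=": 2, "#": 3}

def implicit_hydrogens(atoms, bonds):
    """Return the number of hydrogens needed to saturate each heavy atom.

    atoms: list of element symbols, indexed by vertex id.
    bonds: list of (source, target, label) between heavy atoms.
    """
    used = [0] * len(atoms)
    for source, target, label in bonds:
        used[source] += BOND_ORDER[label]
        used[target] += BOND_ORDER[label]
    return [DEFAULT_VALENCE[atom] - n for atom, n in zip(atoms, used)]

# The heavy-atom skeleton of ethanol, using the same ids as the GML listing.
atoms = ["C", "C", "O"]
bonds = [(1, 0, "-"), (2, 1, "-")]
counts = implicit_hydrogens(atoms, bonds)
print(counts)       # [3, 2, 1]
print(sum(counts))  # 6
```

The counts match the GML specification above, where vertex 0 carries three hydrogens (ids 3 to 5), vertex 1 carries two (ids 6 and 7), and the oxygen carries one (id 8).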

10.1.6. Hello World

These examples use the Python 3 interface for the software. After each run a PDF summary is compiled. The content can be specified via the Python script.

# Normal printing to the terminal:
print("Hello world")
# Make some headers in the summary:
post.summaryChapter("Hello")
post.summarySection("World")
# Load a molecule from a SMILES string:
mol = smiles("Cn1cnc2c1c(=O)n(c(=O)n2C)C", name="Caffeine")
# Put a visualisation of the molecule in the summary:
mol.print()

10.1.7. Graph Morphisms

Graph objects have methods for finding morphisms with the VF2 algorithms for isomorphism and monomorphism. We can therefore easily detect isomorphic graphs, count automorphisms, and search for substructures.

mol1 = smiles("CC(C)CO")
mol2 = smiles("C(CC)CO")
# Check if the graphs are isomorphic. By default the search stops
# after the first match, so the result is either 0 or 1:
isomorphic = mol1.isomorphism(mol2) == 1
print("Isomorphic?", isomorphic)
# Find the number of automorphisms in the graph,
# by explicitly enumerating all of them:
numAutomorphisms = mol1.isomorphism(mol1, maxNumMatches=2**30)
print("|Aut(G)| =", numAutomorphisms)
# Let's count the number of methyl groups:
methyl = smiles("[CH3]")
# The symmetry of the group itself should not be counted,
# so find the size of the automorphism group of methyl.
numAutMethyl = methyl.isomorphism(methyl, maxNumMatches=2**30)
print("|Aut(methyl)| =", numAutMethyl)
# Now find the number of methyl matches,
numMono = methyl.monomorphism(mol1, maxNumMatches=2**30)
print("#monomorphisms =", numMono)
# and divide by the symmetries of methyl (integer division, as the result is a count).
print("#methyl groups =", numMono // numAutMethyl)
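
The division by the automorphism count can be checked by hand on this small case. The sketch below is a brute-force counter written in plain Python, not the VF2 algorithm and not part of MØD. The helpers make_graph and count_monomorphisms are hypothetical names introduced for this illustration, and edge labels are ignored because every bond involved is a single bond.

```python
from itertools import permutations

def make_graph(labels, edge_pairs):
    """A graph is a pair: (list of vertex labels, set of undirected edges)."""
    return list(labels), {frozenset(pair) for pair in edge_pairs}

def count_monomorphisms(pattern, host):
    """Count injective maps from pattern to host that keep labels and edges."""
    p_labels, p_edges = pattern
    h_labels, h_edges = host
    count = 0
    # Each tuple 'image' assigns pattern vertex i to host vertex image[i].
    for image in permutations(range(len(h_labels)), len(p_labels)):
        labels_ok = all(p_labels[i] == h_labels[image[i]]
                        for i in range(len(p_labels)))
        edges_ok = all(frozenset(image[v] for v in edge) in h_edges
                       for edge in p_edges)
        if labels_ok and edges_ok:
            count += 1
    return count

# Methyl: one carbon with three hydrogens.
methyl = make_graph("CHHH", [(0, 1), (0, 2), (0, 3)])
# Isobutanol, CC(C)CO, with all hydrogens explicit (vertices 5 to 14).
isobutanol = make_graph(
    "CCCCO" + "H" * 10,
    [(0, 1), (1, 2), (1, 3), (3, 4),
     (0, 5), (0, 6), (0, 7), (1, 8), (2, 9), (2, 10), (2, 11),
     (3, 12), (3, 13), (4, 14)],
)

num_aut_methyl = count_monomorphisms(methyl, methyl)
num_mono = count_monomorphisms(methyl, isobutanol)
print("|Aut(methyl)| =", num_aut_methyl)               # 6
print("#monomorphisms =", num_mono)                    # 12
print("#methyl groups =", num_mono // num_aut_methyl)  # 2
```

Each of the two methyl groups is found once per ordering of its three hydrogens, which is why the raw match count is six times the number of groups.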

10.1.8. Rule Loading

Rules must be specified in GML format.

# A rule (L <- K -> R) is specified by three graph fragments:
# left, context, and right
destroyVertex = ruleGMLString("""rule [
   left [
      node [ id 1 label "A" ]
   ]
]""")
createVertex = ruleGMLString("""rule [
   right [
      node [ id 1 label "A" ]
   ]
]""")
identity = ruleGMLString("""rule [
   context [
      node [ id 1 label "A" ]
   ]
]""")
# A vertex/edge can change label:
labelChange = ruleGMLString("""rule [
   left [
      node [ id 1 label "A" ]
      edge [ source 1 target 2 label "A" ]
   ]
   # GML can have Python-style line comments too
   context [
      node [ id 2 label "Q" ]
   ]
   right [
      node [ id 1 label "B" ]
      edge [ source 1 target 2 label "B" ]
   ]
]""")
# A chemical rule should probably not destroy and create vertices:
ketoEnol = ruleGMLString("""rule [
   left [
      edge [ source 1 target 4 label "-" ]
      edge [ source 1 target 2 label "-" ]
      edge [ source 2 target 3 label "=" ]
   ]
   context [
      node [ id 1 label "C" ]
      node [ id 2 label "C" ]
      node [ id 3 label "O" ]
      node [ id 4 label "H" ]
   ]
   right [
      edge [ source 1 target 2 label "=" ]
      edge [ source 2 target 3 label "-" ]
      edge [ source 3 target 4 label "-" ]
   ]
]""")
# Rules can be printed, but label-changing edges are not visualised in K:
ketoEnol.print()
# Rules can also be printed with custom options, as for graphs:
p1 = GraphPrinter()
p2 = GraphPrinter()
p1.disableAll()
p1.withTexttt = True
p1.withIndex = True
p2.setReactionDefault()
for r in inputRules:
   r.print(p1, p2)
# Be careful with printing options and non-existing implicit hydrogens:
p1.disableAll()
p1.edgesAsBonds = True
p2.setReactionDefault()
p2.simpleCarbons = True # !!
ketoEnol.print(p1, p2)
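
As a rough illustration of how the three fragments relate to the span L <- K -> R, the sketch below assembles the vertex sets of L, K, and R from plain dictionaries. It assumes the usual reading of the format: L is left plus context, R is right plus context, and a vertex listed in both left and right (a label change) is preserved and therefore belongs to K. The function rule_span is a hypothetical helper for this example and handles vertices only.

```python
def rule_span(left=None, context=None, right=None):
    """Build (L, K, R) from fragments given as {vertex id: label}.

    L and R map ids to labels; K is the set of preserved vertex ids.
    """
    left, context, right = left or {}, context or {}, right or {}
    L = {**context, **left}
    R = {**context, **right}
    # Preserved: everything in the context, plus label-changing vertices.
    K = set(context) | (set(left) & set(right))
    return L, K, R

destroy_vertex = rule_span(left={1: "A"})
create_vertex = rule_span(right={1: "A"})
identity_rule = rule_span(context={1: "A"})
label_change = rule_span(left={1: "A"}, context={2: "Q"}, right={1: "B"})

for name, (L, K, R) in [("destroyVertex", destroy_vertex),
                        ("createVertex", create_vertex),
                        ("identity", identity_rule),
                        ("labelChange", label_change)]:
    print(name, "L =", L, "K =", sorted(K), "R =", R)
```

In the label-changing rule, vertex 1 appears in L with label "A" and in R with label "B" while remaining in K, whereas the destroyed and created vertices appear on one side only.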

10.1.9. Rule Morphisms

Rule objects, like graph objects, have methods for finding morphisms with the VF2 algorithms for isomorphism and monomorphism. We can therefore easily detect isomorphic rules, and decide if one rule is at least as specific/general as another.

# A rule with no extra context:
small = ruleGMLString("""rule [
   ruleID "Small"
   left [
      node [ id 1 label "H" ]
      node [ id 2 label "O" ]
      edge [ source 1 target 2 label "-" ]
   ]
   right [
      node [ id 1 label "H+" ]
      node [ id 2 label "O-" ]
   ]
]""")
# The same rule, with a bit of context:
large = ruleGMLString("""rule [
   ruleID "Large"
   left [
      node [ id 1 label "H" ]
      node [ id 2 label "O" ]
      edge [ source 1 target 2 label "-" ]
   ]
   context [
      node [ id 3 label "C" ]
      edge [ source 2 target 3 label "-" ]
   ]
   right [
      node [ id 1 label "H+" ]
      node [ id 2 label "O-" ]
   ]
]""")
isomorphic = small.isomorphism(large) == 1
print("Isomorphic?", isomorphic)
atLeastAsGeneral = small.monomorphism(large) == 1
print("At least as general?", atLeastAsGeneral)