10.1. Basics, Graphs, and Rules
10.1.1. Hello World
These examples use the Python 3 interface of the software. After each run, a PDF summary is compiled; its content is specified from the Python script.
# Normal printing to the terminal:
print("Hello world")
# Make some headers in the summary:
post.summaryChapter("Hello")
post.summarySection("World")
# Load a molecule from a SMILES string:
mol = smiles("Cn1cnc2c1c(=O)n(c(=O)n2C)C", name="Caffeine")
# Put a visualisation of the molecule in the summary:
mol.print()
10.1.2. Graph Loading
Molecules are encoded as attributed graphs. They can be loaded from SMILES strings, and in general any graph can be loaded from a GML specification, or from the SMILES-like format GraphDFS.
# Load a graph from a SMILES string (only for molecule graphs):
ethanol1 = smiles("CCO", name="Ethanol1")
# Load a graph from a SMILES-like format, called "GraphDFS", but for general graphs:
ethanol2 = graphDFS("[C]([H])([H])([H])[C]([H])([H])[O][H]", name="Ethanol2")
# The GraphDFS format also supports implicit hydrogens:
ethanol3 = graphDFS("CCO", name="Ethanol3")
# The basic graph format is GML:
ethanol4 = graphGMLString("""graph [
    node [ id 0 label "C" ]
    node [ id 1 label "C" ]
    node [ id 2 label "O" ]
    node [ id 3 label "H" ]
    node [ id 4 label "H" ]
    node [ id 5 label "H" ]
    node [ id 6 label "H" ]
    node [ id 7 label "H" ]
    node [ id 8 label "H" ]
    edge [ source 1 target 0 label "-" ]
    edge [ source 2 target 1 label "-" ]
    edge [ source 3 target 0 label "-" ]
    edge [ source 4 target 0 label "-" ]
    edge [ source 5 target 0 label "-" ]
    edge [ source 6 target 1 label "-" ]
    edge [ source 7 target 1 label "-" ]
    edge [ source 8 target 2 label "-" ]
]""", name="Ethanol4")
# They really are all loading the same graph into different objects:
assert ethanol1.isomorphism(ethanol2) == 1
assert ethanol1.isomorphism(ethanol3) == 1
assert ethanol1.isomorphism(ethanol4) == 1
# and they can be visualised:
ethanol1.print()
# All loaded graphs are added to a list 'inputGraphs':
for g in inputGraphs:
    g.print()
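To make the structure of the GML input concrete, here is a minimal sketch in plain Python that reads the flat node/edge entries of the ethanol example and tallies vertex degrees. The helper `parse_simple_gml` and its regular expressions are hypothetical and written only for this illustration; they are not the parser used by the software and handle nothing beyond this simple layout.

```python
import re

# Hypothetical helper for illustration; not the software's GML parser.
# It only understands flat node/edge entries like those in the example above.
NODE_RE = re.compile(r'node\s*\[\s*id\s+(\d+)\s+label\s+"([^"]*)"\s*\]')
EDGE_RE = re.compile(
    r'edge\s*\[\s*source\s+(\d+)\s+target\s+(\d+)\s+label\s+"([^"]*)"\s*\]')


def parse_simple_gml(text):
    """Return (labels, edges) for a flat GML graph description."""
    labels = {int(i): label for i, label in NODE_RE.findall(text)}
    edges = [(int(s), int(t), label) for s, t, label in EDGE_RE.findall(text)]
    for s, t, _ in edges:
        if s not in labels or t not in labels:
            raise ValueError("edge (%d, %d) uses an undeclared node" % (s, t))
    return labels, edges


ETHANOL_GML = """graph [
    node [ id 0 label "C" ]
    node [ id 1 label "C" ]
    node [ id 2 label "O" ]
    node [ id 3 label "H" ]
    node [ id 4 label "H" ]
    node [ id 5 label "H" ]
    node [ id 6 label "H" ]
    node [ id 7 label "H" ]
    node [ id 8 label "H" ]
    edge [ source 1 target 0 label "-" ]
    edge [ source 2 target 1 label "-" ]
    edge [ source 3 target 0 label "-" ]
    edge [ source 4 target 0 label "-" ]
    edge [ source 5 target 0 label "-" ]
    edge [ source 6 target 1 label "-" ]
    edge [ source 7 target 1 label "-" ]
    edge [ source 8 target 2 label "-" ]
]"""

labels, edges = parse_simple_gml(ETHANOL_GML)

# Count how many edges touch each vertex.
degree = {v: 0 for v in labels}
for s, t, _ in edges:
    degree[s] += 1
    degree[t] += 1

# Label and degree of every non-hydrogen vertex.
heavy = {v: (labels[v], degree[v]) for v in labels if labels[v] != "H"}
print(len(labels), "vertices,", len(edges), "edges")
print(heavy)
```

The degrees of the two carbons and the oxygen match the usual valences, which is a quick sanity check that a hand-written GML molecule has no missing or duplicated bonds.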
10.1.3. Printing Graphs/Molecules
The visualisation of graphs can be “prettified” using special printing options. The changes can make the graphs look like normal molecule visualisations.
# Our test graph, representing the molecule caffeine:
g = smiles('Cn1cnc2c1c(=O)n(c(=O)n2C)C')
# Make an object to hold our settings:
p = GraphPrinter()
# First try visualising without any prettifications:
p.disableAll()
g.print(p)
# Now make chemical edges look like bonds, and put colour on atoms.
# Also put the "charge" part of vertex labels in superscript:
p.edgesAsBonds = True
p.raiseCharges = True
p.withColour = True
g.print(p)
# We can also "collapse" normal hydrogen atoms into the neighbours,
# and just show a count:
p.collapseHydrogens = True
g.print(p)
# And finally we can make "internal" carbon atoms simple lines:
p.simpleCarbons = True
g.print(p)
# There are also options for adding indices to the vertices,
# and for modifying the rendering of labels and edges:
p2 = GraphPrinter()
p2.disableAll()
p2.withTexttt = True
p2.thick = True
p2.withIndex = True
# We can actually print two different versions at the same time:
g.print(p2, p)
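The idea behind collapsing hydrogens can be sketched in plain Python: every ordinary hydrogen with a single non-hydrogen neighbour is dropped from the drawing, and the neighbour's label gets a hydrogen count instead. The function `collapse_hydrogens` below is a hypothetical illustration under that assumption; the exact rules the software applies when rendering may differ.

```python
def collapse_hydrogens(labels, edges):
    """Fold plain hydrogens into their neighbour's display label.

    labels maps vertex id -> label, edges is a list of (source, target).
    Illustration only; not the rendering code of the software.
    """
    neighbours = {v: [] for v in labels}
    for s, t in edges:
        neighbours[s].append(t)
        neighbours[t].append(s)

    # A hydrogen is collapsed if it has exactly one neighbour,
    # and that neighbour is not itself a hydrogen.
    collapsed = set()
    for v, label in labels.items():
        if label == "H" and len(neighbours[v]) == 1:
            if labels[neighbours[v][0]] != "H":
                collapsed.add(v)

    display = {}
    for v, label in labels.items():
        if v in collapsed:
            continue
        count = sum(1 for u in neighbours[v] if u in collapsed)
        if count == 0:
            display[v] = label
        elif count == 1:
            display[v] = label + "H"
        else:
            display[v] = "%sH%d" % (label, count)

    kept_edges = [(s, t) for s, t in edges
                  if s not in collapsed and t not in collapsed]
    return display, kept_edges


# Ethanol with explicit hydrogens: C(0) C(1) O(2) and six H (3..8).
ethanol_labels = {0: "C", 1: "C", 2: "O",
                  3: "H", 4: "H", 5: "H", 6: "H", 7: "H", 8: "H"}
ethanol_edges = [(1, 0), (2, 1), (3, 0), (4, 0), (5, 0),
                 (6, 1), (7, 1), (8, 2)]

display, kept_edges = collapse_hydrogens(ethanol_labels, ethanol_edges)
print(display)     # {0: 'CH3', 1: 'CH2', 2: 'OH'}
print(kept_edges)  # [(1, 0), (2, 1)]
```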
10.1.4. Graph Interface
Graph objects have a full interface to access individual vertices and edges. The attributes of vertices and edges can be accessed both in their raw string form, and as their chemical counterpart (if they have one).
g = graphDFS("[R]{x}C([O-])CC=O")
print("|V| =", g.numVertices)
print("|E| =", g.numEdges)
for v in g.vertices:
    print("v%d: label='%s'" % (v.id, v.stringLabel), end="")
    print("\tas molecule: atomId=%d, charge=%d" % (v.atomId, v.charge), end="")
    print("\tis oxygen?", v.atomId == AtomIds.Oxygen)
    print("\td(v) =", v.degree)
    for e in v.incidentEdges:
        print("\tneighbour:", e.target.id)
for e in g.edges:
    print("(v%d, v%d): label='%s'" % (e.source.id, e.target.id, e.stringLabel), end="")
    # An edge such as the one labelled "x" may have no valid bond type,
    # so every access to bondType stays inside the try block:
    try:
        bondType = e.bondType
        bt = str(bondType)
        isDouble = bondType == BondType.Double
    except LogicError:
        bt = "Invalid"
        isDouble = False
    print("\tas molecule: bondType=%s" % bt, end="")
    print("\tis double bond?", isDouble)
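The split between a raw string label and its chemical counterpart can be sketched as follows. The function `decode_label`, the tiny element table, and the restriction to a single trailing `+` or `-` are all assumptions made for this illustration; the label grammar actually accepted by the software is richer and is not reproduced here.

```python
import re

# Deliberately tiny element table, for illustration only.
KNOWN_ELEMENTS = {"H", "C", "N", "O", "P", "S"}
# An element symbol, optionally followed by a single '+' or '-'.
LABEL_RE = re.compile(r"^([A-Z][a-z]?)([+-]?)$")
CHARGE_OF_SIGN = {"": 0, "+": 1, "-": -1}


def decode_label(label):
    """Return (symbol, charge), or None if the label has no chemical reading.

    Hypothetical helper; not the label decoder of the software.
    """
    match = LABEL_RE.match(label)
    if match is None or match.group(1) not in KNOWN_ELEMENTS:
        return None
    symbol, sign = match.groups()
    return symbol, CHARGE_OF_SIGN[sign]


for raw in ["R", "C", "O-", "O"]:
    print("label=%r -> %r" % (raw, decode_label(raw)))
```

In this sketch a label like "R" simply has no chemical counterpart, which mirrors why the example above has to be prepared for vertices and edges that are not atoms or bonds.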
10.1.5. Graph Morphisms
Graph objects have methods for finding morphisms with the VF2 algorithms for isomorphism and monomorphism. We can therefore easily detect isomorphic graphs, count automorphisms, and search for substructures.
mol1 = smiles("CC(C)CO")
mol2 = smiles("C(CC)CO")
# Check whether the graphs are isomorphic; by default the search
# stops after the first match, so the result is 0 or 1:
isomorphic = mol1.isomorphism(mol2) == 1
print("Isomorphic?", isomorphic)
# Find the number of automorphisms in the graph,
# by explicitly enumerating all of them:
numAutomorphisms = mol1.isomorphism(mol1, maxNumMatches=2**30)
print("|Aut(G)| =", numAutomorphisms)
# Let's count the number of methyl groups:
methyl = smiles("[CH3]")
# The symmetry of the group itself should not be counted,
# so find the size of the automorphism group of methyl.
numAutMethyl = methyl.isomorphism(methyl, maxNumMatches=2**30)
print("|Aut(methyl)| =", numAutMethyl)
# Now find the number of methyl matches,
numMono = methyl.monomorphism(mol1, maxNumMatches=2**30)
print("#monomorphisms =", numMono)
# and divide by the symmetries of methyl (integer division, as this is a count).
print("#methyl groups =", numMono // numAutMethyl)
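The counting argument above (matches divided by the symmetries of the pattern) can be checked independently with a brute-force sketch in plain Python. It enumerates every injective vertex map instead of using VF2, so it is only usable for tiny patterns; the helpers `with_hydrogens` and `count_monomorphisms` are hypothetical and exist only for this illustration.

```python
from itertools import permutations


def with_hydrogens(heavy_labels, heavy_bonds, hydrogen_counts):
    """Build (labels, bonds) with explicit hydrogens and single bonds only."""
    labels = list(heavy_labels)
    bonds = {frozenset(b): "-" for b in heavy_bonds}
    for atom, n in enumerate(hydrogen_counts):
        for _ in range(n):
            labels.append("H")
            bonds[frozenset((atom, len(labels) - 1))] = "-"
    return labels, bonds


def count_monomorphisms(pattern, host):
    """Count injective maps that preserve vertex labels and labelled edges.

    Brute force over all injective maps; a stand-in for VF2, not VF2 itself.
    """
    p_labels, p_bonds = pattern
    h_labels, h_bonds = host
    count = 0
    for image in permutations(range(len(h_labels)), len(p_labels)):
        if any(p_labels[v] != h_labels[image[v]] for v in range(len(p_labels))):
            continue
        if all(h_bonds.get(frozenset(image[v] for v in bond)) == label
               for bond, label in p_bonds.items()):
            count += 1
    return count


# Isobutanol, CC(C)CO: heavy atoms C0, C1, C2, C3, O4 with 3, 1, 3, 2, 1 hydrogens.
isobutanol = with_hydrogens("CCCCO", [(0, 1), (1, 2), (1, 3), (3, 4)],
                            [3, 1, 3, 2, 1])
methyl = with_hydrogens("C", [], [3])

num_mono = count_monomorphisms(methyl, isobutanol)
# A pattern mapped onto itself: every injective match is an automorphism.
num_aut_methyl = count_monomorphisms(methyl, methyl)
print(num_mono, num_aut_methyl, num_mono // num_aut_methyl)  # prints: 12 6 2
```

Each of the two methyl carbons is matched in 3! = 6 ways, one per ordering of its hydrogens, which is exactly the overcounting that the division by the automorphism count removes.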
10.1.6. Rule Loading
Rules must be specified in GML format.
# A rule (L <- K -> R) is specified by three graph fragments:
# left, context, and right
destroyVertex = ruleGMLString("""rule [
    left [
        node [ id 1 label "A" ]
    ]
]""")
createVertex = ruleGMLString("""rule [
    right [
        node [ id 1 label "A" ]
    ]
]""")
identity = ruleGMLString("""rule [
    context [
        node [ id 1 label "A" ]
    ]
]""")
# A vertex/edge can change label:
labelChange = ruleGMLString("""rule [
    left [
        node [ id 1 label "A" ]
        edge [ source 1 target 2 label "A" ]
    ]
    # GML can have Python-style line comments too
    context [
        node [ id 2 label "Q" ]
    ]
    right [
        node [ id 1 label "B" ]
        edge [ source 1 target 2 label "B" ]
    ]
]""")
# A chemical rule should probably not destroy and create vertices:
ketoEnol = ruleGMLString("""rule [
    left [
        edge [ source 1 target 4 label "-" ]
        edge [ source 1 target 2 label "-" ]
        edge [ source 2 target 3 label "=" ]
    ]
    context [
        node [ id 1 label "C" ]
        node [ id 2 label "C" ]
        node [ id 3 label "O" ]
        node [ id 4 label "H" ]
    ]
    right [
        edge [ source 1 target 2 label "=" ]
        edge [ source 2 target 3 label "-" ]
        edge [ source 3 target 4 label "-" ]
    ]
]""")
# Rules can be printed, but label changing edges are not visualised in K:
ketoEnol.print()
# And with custom options, like graphs:
p1 = GraphPrinter()
p2 = GraphPrinter()
p1.disableAll()
p1.withTexttt = True
p1.withIndex = True
p2.setReactionDefault()
for r in inputRules:
    r.print(p1, p2)
# Be careful with printing options and non-existent implicit hydrogens:
p1.disableAll()
p1.edgesAsBonds = True
p2.setReactionDefault()
p2.simpleCarbons = True # !!
ketoEnol.print(p1, p2)
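How the three fragments combine can be sketched in plain Python, under the assumption that the full left side L is `left` together with `context`, and the full right side R is `right` together with `context`. The function `classify_vertices` is a hypothetical helper for this illustration and looks at vertices only; it is not part of the software's interface.

```python
def classify_vertices(left, context, right):
    """Sort vertex ids by what the rule does to them.

    Each argument maps vertex id -> label for one section of the rule.
    Assumes L = left + context and R = right + context.
    """
    L = {**context, **left}
    R = {**context, **right}
    shared = set(L) & set(R)
    return {
        "destroyed": sorted(set(L) - set(R)),
        "created": sorted(set(R) - set(L)),
        "relabelled": sorted(v for v in shared if L[v] != R[v]),
        "unchanged": sorted(v for v in shared if L[v] == R[v]),
    }


# The vertex parts of the first four rules in the example above.
destroy_vertex = classify_vertices(left={1: "A"}, context={}, right={})
create_vertex = classify_vertices(left={}, context={}, right={1: "A"})
identity_rule = classify_vertices(left={}, context={1: "A"}, right={})
label_change = classify_vertices(left={1: "A"}, context={2: "Q"}, right={1: "B"})

print(destroy_vertex["destroyed"])   # [1]
print(create_vertex["created"])      # [1]
print(identity_rule["unchanged"])    # [1]
print(label_change["relabelled"], label_change["unchanged"])  # [1] [2]
```

A rule whose vertices all sit in `context`, like the keto-enol rule, has empty "destroyed" and "created" lists in this sketch, which is the property the comment about chemical rules asks for.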
10.1.7. Rule Morphisms
Rule objects, like graph objects, have methods for finding morphisms with the VF2 algorithms for isomorphism and monomorphism. We can therefore easily detect isomorphic rules, and decide if one rule is at least as specific/general as another.
# A rule with no extra context:
small = ruleGMLString("""rule [
    ruleID "Small"
    left [
        node [ id 1 label "H" ]
        node [ id 2 label "O" ]
        edge [ source 1 target 2 label "-" ]
    ]
    right [
        node [ id 1 label "H+" ]
        node [ id 2 label "O-" ]
    ]
]""")
# The same rule, with a bit of context:
large = ruleGMLString("""rule [
    ruleID "Large"
    left [
        node [ id 1 label "H" ]
        node [ id 2 label "O" ]
        edge [ source 1 target 2 label "-" ]
    ]
    context [
        node [ id 3 label "C" ]
        edge [ source 2 target 3 label "-" ]
    ]
    right [
        node [ id 1 label "H+" ]
        node [ id 2 label "O-" ]
    ]
]""")
isomorphic = small.isomorphism(large) == 1
print("Isomorphic?", isomorphic)
atLeastAsGeneral = small.monomorphism(large) == 1
print("At least as general?", atLeastAsGeneral)
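10.1.8. Formose Grammar
This graph grammar models the formose chemistry. It consists of two input molecules, formaldehyde and glycolaldehyde, and two rules, keto-enol isomerization and aldol addition, each loaded in both its forward and its inverted direction.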
formaldehyde = smiles("C=O", name="Formaldehyde")
glycolaldehyde = smiles("OCC=O", name="Glycolaldehyde")
ketoEnolGML = """rule [
    ruleID "Keto-enol isomerization"
    left [
        edge [ source 1 target 4 label "-" ]
        edge [ source 1 target 2 label "-" ]
        edge [ source 2 target 3 label "=" ]
    ]
    context [
        node [ id 1 label "C" ]
        node [ id 2 label "C" ]
        node [ id 3 label "O" ]
        node [ id 4 label "H" ]
    ]
    right [
        edge [ source 1 target 2 label "=" ]
        edge [ source 2 target 3 label "-" ]
        edge [ source 3 target 4 label "-" ]
    ]
]"""
ketoEnol_F = ruleGMLString(ketoEnolGML)
ketoEnol_B = ruleGMLString(ketoEnolGML, invert=True)
aldolAddGML = """rule [
    ruleID "Aldol Addition"
    left [
        edge [ source 1 target 2 label "=" ]
        edge [ source 2 target 3 label "-" ]
        edge [ source 3 target 4 label "-" ]
        edge [ source 5 target 6 label "=" ]
    ]
    context [
        node [ id 1 label "C" ]
        node [ id 2 label "C" ]
        node [ id 3 label "O" ]
        node [ id 4 label "H" ]
        node [ id 5 label "O" ]
        node [ id 6 label "C" ]
    ]
    right [
        edge [ source 1 target 2 label "-" ]
        edge [ source 2 target 3 label "=" ]
        edge [ source 5 target 6 label "-" ]

        edge [ source 4 target 5 label "-" ]
        edge [ source 6 target 1 label "-" ]
    ]
]"""
aldolAdd_F = ruleGMLString(aldolAddGML)
aldolAdd_B = ruleGMLString(aldolAddGML, invert=True)
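Loading the same GML once as written and once with `invert=True` gives the forward and backward direction of a reaction. Conceptually, inverting a rule swaps its left and right fragments while the context stays put. The sketch below shows that idea on a plain dictionary holding the edges of the keto-enol rule; `invert_rule` is a hypothetical helper for illustration and says nothing about how the software implements or names the inverted rule.

```python
def invert_rule(rule):
    """Return a copy of the rule with its 'left' and 'right' sections swapped."""
    return {
        "left": list(rule["right"]),
        "context": list(rule["context"]),
        "right": list(rule["left"]),
    }


# Edges are (source, target, label); nodes are (id, label).
keto_enol = {
    "left": [(1, 4, "-"), (1, 2, "-"), (2, 3, "=")],
    "context": [(1, "C"), (2, "C"), (3, "O"), (4, "H")],
    "right": [(1, 2, "="), (2, 3, "-"), (3, 4, "-")],
}

enol_keto = invert_rule(keto_enol)
# The backward rule consumes the O-H bond that the forward rule creates.
print((3, 4, "-") in enol_keto["left"])
# Inverting twice gives back the original rule.
print(invert_rule(enol_keto) == keto_enol)
```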
10.1.9. Including Files
We can include other files (à la C/C++) to separate functionality.
include("050_formoseGrammar.py")
post.summarySection("Input Graphs")
for a in inputGraphs:
    a.print()
post.summarySection("Input Rules")
for a in inputRules:
    a.print()