References

fgconfig

class fgutils.fgconfig.FGConfig(name: str | None = None, pattern: str | None = None, parser: Parser | None = None, group_atoms: list[int] | None = None, anti_pattern: str | list[str] = [], depth: int | None = None, len_exclude_nodes: list[str] = ['R'])

Functional group configuration class.

Parameters:
  • name – The name of the functional group.

  • pattern – The structural description of the functional group.

  • parser – (optional) A parser to use to convert the pattern into a structure.

  • group_atoms – (optional) A list of indices indicating which nodes in the pattern belong to the functional group. A pattern might have some wildcard nodes attached that are required to match but do not belong to the group. (Default = all nodes)

  • anti_pattern – (optional) An anti-pattern or a list of anti-patterns that must not be matched. (Default = [])

  • depth – (optional) The maximal depth to check the patterns. (Default = max(pattern, anti_pattern))

  • len_exclude_nodes – (optional) Node types that should be excluded from the pattern length. (Default = ["R"], the wildcard pattern)

property pattern_len: int

The number of nodes of the functional group structure. Nodes specified in len_exclude_nodes are not included.
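The counting rule can be sketched without the library. The helper below is illustrative only: the function name and the list of node symbols are assumptions, not part of fgutils, and the real property works on the parsed pattern graph.

```python
def pattern_len_sketch(symbols, len_exclude_nodes=("R",)):
    """Count pattern nodes, skipping excluded node types (illustrative helper)."""
    excluded = set(len_exclude_nodes)
    return sum(1 for symbol in symbols if symbol not in excluded)


# Hypothetical node symbols of an ester-like pattern with two wildcard nodes.
ester_symbols = ["R", "C", "O", "O", "R"]

ester_len = pattern_len_sketch(ester_symbols)     # wildcard nodes are skipped
full_len = pattern_len_sketch(ester_symbols, [])  # nothing is excluded
```

With the default exclusion list only the three non-wildcard nodes are counted.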

class fgutils.fgconfig.FGConfigProvider(config: FGConfig | list[dict] | list[FGConfig] | None = None, mapper: PermutationMapper | None = None)

Provider for functional group configs.

Parameters:
  • config – A FGConfig object or a list of config objects. The configurations can also be passed as dictionaries.

  • mapper – (optional) A PermutationMapper to use.

get_by_name(name: str) FGConfig

Get the functional group config by name.

Returns:

Returns the FGConfig instance.

get_tree() list[FGTreeNode]

Get the functional groups hierarchically organized in a tree. Functional groups are ordered based on their structure. A group is another group's child if its structure is more specific, i.e., the parent structure is a subgraph of the child. A child can have multiple parents and a parent can have multiple children.

Returns:

Returns the list of root groups.
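The parent/child rule can be illustrated with a small stand-in. The sketch below is not the fgutils implementation: it replaces the subgraph test with a strict-subset test on toy feature sets, and the class name, function name, and feature labels are all assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass(eq=False)
class TreeNodeSketch:
    name: str
    features: frozenset
    parents: list = field(default_factory=list)
    children: list = field(default_factory=list)


def build_tree_sketch(groups):
    """Link each group to its most specific generalizations; return the roots."""
    nodes = [TreeNodeSketch(name, frozenset(feats)) for name, feats in groups.items()]
    for child in nodes:
        # Every strictly more general group is an ancestor of the child.
        ancestors = [p for p in nodes if p.features < child.features]
        for parent in ancestors:
            # A direct parent has no other ancestor lying between it and the child.
            if not any(parent.features < other.features for other in ancestors):
                parent.children.append(child)
                child.parents.append(parent)
    return [node for node in nodes if not node.parents]


# Toy feature sets standing in for structures (assumption: subset ~ subgraph).
toy_groups = {
    "carbonyl": {"C=O"},
    "ether": {"C-O-C"},
    "ester": {"C=O", "C-O-C"},
    "ketone": {"C=O", "C-C"},
}
roots = build_tree_sketch(toy_groups)
```

In this toy data the ester node has two parents (carbonyl and ether), and carbonyl has two children, which mirrors the many-to-many relation described above.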

utils

fgutils.utils.split_its(graph: Graph) tuple[Graph, Graph]

Split an ITS graph into reactant graph G and product graph H.

Parameters:

graph – ITS graph to split up.

Returns:

Tuple of two graphs (G, H).
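The idea behind the split can be sketched with plain dictionaries. The edge encoding used here is an assumption for illustration: each ITS edge stores a pair (bond order in G, bond order in H), with None when the bond does not exist on that side. fgutils.utils.split_its operates on graph objects and may use a different representation.

```python
def split_its_sketch(nodes, its_edges):
    """Split a toy ITS edge map into reactant (G) and product (H) edge maps."""
    g_edges = {edge: bond[0] for edge, bond in its_edges.items() if bond[0] is not None}
    h_edges = {edge: bond[1] for edge, bond in its_edges.items() if bond[1] is not None}
    # Both sides keep the full node set; only the bonds differ.
    return (set(nodes), g_edges), (set(nodes), h_edges)


# Toy ITS: the 0-1 bond goes from double to single, the 1-2 bond is formed.
toy_nodes = [0, 1, 2]
toy_its_edges = {(0, 1): (2, 1), (1, 2): (None, 1)}

(g_nodes, g_edges), (h_nodes, h_edges) = split_its_sketch(toy_nodes, toy_its_edges)
```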

query

class fgutils.query.FGQuery(mapper: PermutationMapper | None = None, config: FGConfig | list[FGConfig] | FGConfigProvider | None = None, require_implicit_hydrogen: bool = True)

Class to get functional groups from a molecule.

Parameters:
  • mapper – (optional) The permutation mapper to use.

  • config – (optional) A functional group config, a list of configs, or a functional group config provider. If not specified the default functional group collection will be used.

  • require_implicit_hydrogen – Flag to specify if implicit hydrogens are required for the query. This usually depends on the provided FGConfigs. If the configured functional group patterns do not require hydrogens this can be set to false. (Default = True)

get(value) list[tuple[str, list[int]]]

Get all functional groups from a molecule. The query returns two functional groups for acetylsalicylic acid:

>>> smiles = "O=C(C)Oc1ccccc1C(=O)O" # acetylsalicylic acid
>>> query = FGQuery()
>>> query.get(smiles)
[('ester', [0, 1, 3]), ('carboxylic_acid', [10, 11, 12])]
Parameters:

value – This is either a graph or SMILES as string.

Returns:

Returns a list of tuples. The first element in a tuple is the functional group name and the second element is a list of node indices that belong to this functional group (functional_group_name, [idx_1, idx_2, ...]).
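A short sketch of how the returned list might be post-processed. It uses the documented acetylsalicylic acid result as literal data instead of calling the library, and the two lookup tables are illustrative names, not part of fgutils.

```python
from collections import defaultdict

# Result copied from the documented example for acetylsalicylic acid.
result = [("ester", [0, 1, 3]), ("carboxylic_acid", [10, 11, 12])]

atoms_by_group = defaultdict(list)  # group name -> list of matches (index lists)
groups_by_atom = defaultdict(list)  # node index -> names of groups containing it

for name, indices in result:
    atoms_by_group[name].append(indices)
    for idx in indices:
        groups_by_atom[idx].append(name)
```

Keeping a list of matches per name allows for a molecule that contains the same functional group more than once.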

parse

class fgutils.parse.Parser(use_multigraph=False, init_aam=False, verbose=False)

Class to convert a SMILES-like graph description into a NetworkX graph.

Example for parsing acetic acid:

>>> parser = Parser()
>>> g = parser("CC(O)=O")
>>> print(g)
Graph with 4 nodes and 3 edges
Parameters:
  • use_multigraph – Flag to specify if the resulting graph object should be of type networkx.MultiGraph or networkx.Graph. The difference is that a MultiGraph can have more than one edge between two nodes. For parsing molecule-like graphs this is not necessary because bond types are encoded as edge labels. (Default = False)

  • verbose – Flag to print information during parsing. (Default = False)

parse(pattern: str, idx_offset: int = 0)

Method to parse a SMILES-like graph pattern.

Parameters:
  • pattern – The pattern to convert into a graph. The pattern is a tree-like description of the graph that closely follows the SMILES notation.

  • idx_offset – The index offset argument provides the starting value for the consecutive node numbering. (Default = 0)

Returns:

Returns the converted graph object.

proxy

class fgutils.proxy.GraphSampler(unique=False)

Base class for sampling ProxyGraphs.

Parameters:

unique – If set to true each graph is only returned once. (Default = False)

sample(graphs: list[ProxyGraph], group_name=None) list[ProxyGraph] | None

Method to retrieve a new sample from a list of graphs. If unique is set to true the first graph that was not yet returned is selected. None is returned if all graphs have been selected. If unique is false all graphs are returned each time the function is called.

Parameters:
  • graphs – A list of graphs to sample from.

  • group_name – (optional) The name of the group the graphs belong to. It can be omitted if the sampler does not need it.

Returns:

Returns one or more graphs from the list or None if sampling should stop.
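The documented behaviour can be sketched as a small stateful class. This is a minimal sketch, not the GraphSampler source: the class name is hypothetical, plain strings stand in for ProxyGraph objects, and tracking returned graphs by (group_name, position) is an assumption.

```python
class UniqueSamplerSketch:
    """Illustrative sampler that mimics the documented unique/non-unique rule."""

    def __init__(self, unique=False):
        self.unique = unique
        self._returned = set()  # (group_name, position) pairs already handed out

    def sample(self, graphs, group_name=None):
        if not self.unique:
            # Non-unique mode: every call returns all graphs.
            return list(graphs)
        for position, graph in enumerate(graphs):
            key = (group_name, position)
            if key not in self._returned:
                self._returned.add(key)
                return [graph]
        # All graphs were returned once: signal that sampling should stop.
        return None


sampler = UniqueSamplerSketch(unique=True)
first = sampler.sample(["C", "O"], group_name="g")
second = sampler.sample(["C", "O"], group_name="g")
third = sampler.sample(["C", "O"], group_name="g")
```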

class fgutils.proxy.MolProxy(core: str | list[str] | ProxyGroup, groups: ProxyGroup | list[ProxyGroup] | dict[str, ProxyGroup], parser: Parser | None = None)

Proxy to generate molecules.

Parameters:
  • core – A pattern string or ProxyGroup representing the core graph. For example a specific functional group.

  • groups – A list of groups to expand the core graph with.

  • parser – (optional) The parser to convert patterns into structures.

class fgutils.proxy.Proxy(core: str | list[str] | ProxyGroup, groups: ProxyGroup | list[ProxyGroup] | dict[str, ProxyGroup], enable_aam: bool = True, parser: Parser | None = None)

Proxy is a generator class. It extends a specific core graph by a set of subgraphs (groups). This class implements the iterator interface so it can be used in a for loop to generate samples:

>>> proxy = Proxy("C{g}", ProxyGroup("g", ["C", "O", "N"]))
>>> for graph in proxy:
...     print([d["symbol"] for n, d in graph.nodes(data=True)])
['C', 'C']
['C', 'O']
['C', 'N']
Parameters:
  • core – A pattern string or ProxyGroup representing the core graph. For example a specific functional group or a reaction center.

  • groups – A list of groups to expand the core graph with.

  • enable_aam – Flag to specify if the ‘aam’ label is set in the result graph. (Default = True)

  • parser – (optional) The parser to convert patterns into structures.

static from_dict(config: dict) Proxy

Load Proxy from dict config. The expected JSON format is:

{
    "core": <pattern>,
    "groups": <groups> # for expected format take a
    # look at ProxyGroup.from_dict()
}
Parameters:

config – The configuration dictionary, e.g., loaded from a JSON file.

Returns:

The instantiated Proxy.

get_next()

Get the next sample.

Returns:

A generated graph.

property groups: list[ProxyGroup]

The list of ProxyGroups. This can be set with values of the following type: ProxyGroup | list[ProxyGroup] | dict[str, ProxyGroup]

class fgutils.proxy.ProxyGraph(pattern: str, anchor: list[int] = [0], name: str | None = None, **kwargs)

ProxyGraph is essentially a subgraph used to expand molecules. If the node that is replaced by the pattern has more edges than the pattern has anchors, the last anchor will be used multiple times. In the default case the first node in the pattern will connect to all the neighboring nodes of the replaced node.

Parameters:
  • pattern – String representation of the graph.

  • anchor – A list of indices in the pattern that are used to connect to the parent graph. (Default = [0])

  • name – A name for the graph. This is just for visualization and debugging.

  • kwargs – Keyword arguments are used as graph properties. Specify whatever you need.
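The anchor rule from the description can be sketched as follows. The helper name is hypothetical, and pairing neighbors with anchors in list order is an assumption; only the reuse of the last anchor comes from the description above.

```python
def assign_anchors_sketch(neighbors, anchor=(0,)):
    """Pair each neighbor of the replaced node with an anchor index of the pattern."""
    anchors = list(anchor)
    pairs = []
    for position, neighbor in enumerate(neighbors):
        # Once the anchors run out, the last anchor is reused.
        anchor_idx = anchors[min(position, len(anchors) - 1)]
        pairs.append((neighbor, anchor_idx))
    return pairs


# Default case: the first pattern node connects to every neighbor.
default_pairs = assign_anchors_sketch(["a", "b", "c"])
# Two anchors, three neighbors: the last anchor (2) is used twice.
two_anchor_pairs = assign_anchors_sketch(["a", "b", "c"], anchor=[0, 2])
```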

class fgutils.proxy.ProxyGroup(name, graphs: str | list[str] | ProxyGraph | list[ProxyGraph], sampler=None, unique=False)

ProxyGroup is a collection of patterns that can be substituted for a labeled node in a graph. The node label is the name of the group from which one of the patterns is taken as the replacement.

Parameters:
  • name – The name of the group.

  • graphs – A pattern, a subgraph, a list of subgraphs, or a list of graph descriptions. The patterns are converted to ProxyGraphs with one anchor at index 0. Use ProxyGraph objects if you need more control over how subgraphs are instantiated.

  • sampler – (optional) An object or a function to retrieve individual graphs from the list. The expected function interface is: func(list[ProxyGraph]) -> list[ProxyGraph]. Implement the __call__ method if you use a class. The function can have an optional keyword argument group_name.

  • unique – Argument to specify if graphs can be returned multiple times. This only takes effect if sampler is not set. (Default = False)

static from_dict(config: dict) dict[str, ProxyGroup]

Load ProxyGroups from dict config. The expected JSON format is:

{
    "group_name_1": <pattern>,
    "group_name_2": [<pattern>],
    "group_name_3": <group_config> # for expected format take a
    # look at ProxyGroup.from_dict_single()
}
Parameters:

config – The configuration dictionary, e.g., loaded from a JSON file.

Returns:

Returns a mapping dictionary of ProxyGroups where the key is the group name and the value is the ProxyGroup object.

static from_dict_single(name: str, config: dict) ProxyGroup

Load a single ProxyGroup object from configuration. The expected JSON format is one of the following:

{
    # short form
    "graphs": <pattern>,
    "graphs": [<pattern>],
    "graphs": {"pattern": <pattern>, "anchor": list[int]},
    # complete config
    "graphs": [{
            "pattern": <pattern>,
            "anchor": list[int],
            <any_key>: <any_value>
        }],

    # <pattern> is the SMILES-like graph description of type str
}

It’s possible to specify additional properties on graphs. These can be used for example by custom samplers to implement some logic or dependencies between groups.

Parameters:
  • name – The name of the ProxyGroup.

  • config – The configuration dictionary, e.g., loaded from a JSON file.

Returns:

The instantiated ProxyGroup.
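The short forms listed above can all be reduced to the complete form. The function below is a sketch of that normalization, not the code behind from_dict_single: its name is hypothetical, and the default anchor [0] is taken from the ProxyGroup description.

```python
def normalize_graphs_sketch(graphs):
    """Convert any accepted "graphs" value into a list of complete graph configs."""
    if isinstance(graphs, (str, dict)):
        graphs = [graphs]  # a single pattern or a single config
    normalized = []
    for entry in graphs:
        if isinstance(entry, str):
            entry = {"pattern": entry}
        item = dict(entry)               # copy, keeping any additional properties
        item.setdefault("anchor", [0])   # assumption: one anchor at index 0 by default
        normalized.append(item)
    return normalized


short_form = normalize_graphs_sketch("CC")
mixed_form = normalize_graphs_sketch(["C", {"pattern": "C=O", "anchor": [0, 1], "weight": 2}])
```

The extra key "weight" is a made-up property that shows how additional graph properties pass through unchanged.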

property graphs: list[ProxyGraph]

The list of ProxyGraphs. This can be set with values of the following type: str | list[str] | ProxyGraph | list[ProxyGraph]

sample_graphs() list[ProxyGraph] | None

Sample graphs. This method uses the sampler to select a list of graphs.

Returns:

A list of ProxyGraph objects or None if there is nothing more to sample.

class fgutils.proxy.ReactionProxy(core: str | list[str] | ProxyGroup, groups: ProxyGroup | list[ProxyGroup] | dict[str, ProxyGroup], enable_aam: bool = True, parser: Parser | None = None)

Proxy to generate reactions.

Parameters:
  • core – A pattern string or ProxyGroup representing the core graph. For example a specific functional group or a reaction center.

  • groups – A list of groups to expand the core graph with.

  • enable_aam – Flag to specify if the ‘aam’ label is set in the result graph. (Default = True)

  • parser – (optional) The parser to convert patterns into structures.

get_next()

Generate a new reaction sample. The reaction proxy returns two graphs G and H. G is the reactant graph and H is the product graph.

Returns:

A tuple of two graphs (G, H) representing the reaction G → H.

fgutils.proxy.build_graphs(core: ProxyGraph, groups: dict[str, ProxyGroup], parser: Parser)

Replace labeled nodes in the core graph with groups. For each labeled node the respective group is used to replace the node by the specified subgraphs.

Parameters:
  • core – The parent graph with labeled nodes.

  • groups – A dictionary of groups to replace the labeled nodes in the core graph with. The dictionary keys must be the group names.

  • parser – The parser that is used to convert graph patterns into graphs.

Returns:

Returns a list of graphs with replaced nodes.

fgutils.proxy.build_group_tree(core: ProxyGroup, groups: ProxyGroup | list[ProxyGroup], parser=None) Graph

Constructs a tree of all possible graph instantiations. The number of leaf nodes in this tree is the number of possible samples.

Parameters:
  • core – The ProxyGroup that serves as core group.

  • groups – A list of groups to replace labeled nodes.

  • parser – (optional) A parser to use for conversion from pattern to structures.

Returns:

Returns the construction tree as nx.Graph object.
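The relation between leaf nodes and possible samples can be sketched in a simplified setting. The code below substitutes patterns at the string level, which is not how the library works (it replaces nodes in graphs through anchors). It also assumes that each label occurs exactly once in the core and that group patterns contain no further labels. The function name and the two-label core are hypothetical.

```python
from itertools import product
from math import prod


def enumerate_patterns_sketch(core, groups):
    """List every pattern obtained by filling each {label} in core with one group pattern."""
    names = list(groups)
    return [
        core.format(**dict(zip(names, choice)))
        for choice in product(*(groups[name] for name in names))
    ]


# One labeled node with three options, as in the Proxy example.
single = enumerate_patterns_sketch("C{g}", {"g": ["C", "O", "N"]})

# Two labeled nodes: the number of samples is the product of the group sizes.
two_groups = {"a": ["C", "N"], "b": ["O", "S", "Cl"]}
double = enumerate_patterns_sketch("{a}C{b}", two_groups)
expected_count = prod(len(patterns) for patterns in two_groups.values())
```

Under these assumptions the count of enumerated patterns equals the number of leaf nodes a construction tree would have.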

fgutils.proxy.replace_next_node(graph, groups: dict[str, ProxyGroup], parser: Parser)

Replace the next labeled node in graph with the respective group.

Parameters:
  • graph – The graph where a node should be replaced by a subgraph.

  • groups – A mapping dictionary of groups to replace the labeled nodes in the parent with. The dictionary keys must be the group name.

  • parser – The parser to use to convert the pattern into structure.

Returns:

Returns a list of new graphs where the first labeled node is replaced. None is returned if no replaceable labeled node is left.

fgutils.proxy.replace_node(graph, node, replacement_graph: ProxyGraph, parser: Parser)

Replace node in graph with replacement_graph converted by the parser.

Parameters:
  • graph – The graph where a node should be replaced by a subgraph.

  • node – The node to replace in graph.

  • replacement_graph – The subgraph that is inserted instead of the node.

  • parser – The parser to convert the pattern into structure.

Returns:

Returns a new graph with node replaced by replacement_graph.

proxy_collection

class fgutils.proxy_collection.diels_alder_proxy.DielsAlderProxy(enable_aam=True, neg_sample=False)

A proxy for the generation of Diels-Alder reaction samples and counter-samples. The proxy returns two graphs G and H as a tuple. G is the reactant graph and H is the product graph. For a comprehensive description of the proxy configuration see the section Diels-Alder Reaction Proxy.

Parameters:
  • enable_aam – Flag to specify if the aam label is set in the result graphs. (Default = True)

  • neg_sample – If set to true the proxy will exclusively generate negative samples, i.e., reactions where a Diels-Alder graph transformation rule is theoretically applicable but the reaction will never happen in reality. (Default = False)