Mismatch in Oxygen Atom Mapping Between Reactants and Product in Indigo Automapping #2719

Open
RamziWeslati opened this issue Dec 31, 2024 · 2 comments

Comments

@RamziWeslati

Summary
Automapping of a reaction using the Indigo library appears to mismatch oxygen atom labels (9 and 12) between the reactants and product.

Steps to Reproduce

from indigo import Indigo

class IndigoMapper:
    def __init__(self):
        self.mapper = Indigo()
        # Both timeouts are given in milliseconds.
        self.mapper.setOption("timeout", 2000)
        self.mapper.setOption("aam-timeout", 2000)

    def map_one(self, smiles: str) -> str:
        mapped_reaction = self.mapper.loadReaction(smiles)
        # "discard" drops any existing atom mapping and maps from scratch.
        mapped_reaction.automap("discard")
        return mapped_reaction.smiles()

product = "COc1ccccc1OCC(O)CN1CCN(CC(=O)Nc2c(C)cccc2C)CC1"
reactants = ["COc1ccccc1OC(CO)CN1CCN(CC(=O)Nc2c(C)cccc2C)CC1"]

mapper = IndigoMapper()
reaction_smiles = f"{'.'.join(reactants)}>>{product}"
mapped_reaction_smiles = mapper.map_one(reaction_smiles)

print("Mapped Reaction:", mapped_reaction_smiles)

Actual behavior
After automapping the reaction with automap("discard"), the oxygen atoms labeled as 9 and 12 in the reactants and product appear mismatched. Specifically:

  • Oxygen 9 in the reactant maps to 12 in the product.
  • Oxygen 12 in the reactant maps to 9 in the product.

This behavior can be visualized in the attached molecule images.
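
Besides the images, the map numbers can also be read directly from the mapped reaction SMILES. The following is a minimal sketch that uses only the standard library; the helper name `map_numbers_for_element` and the toy reaction string are hypothetical and are not the actual Indigo output for the reaction above.

```python
import re
from typing import List

# Matches a bracket atom that carries a map number, e.g. [OH:9], [cH:4], [13CH3:1].
# Group 1 is the element symbol, group 2 is the map number.
MAPPED_ATOM = re.compile(r"\[\d*([A-Z][a-z]?|[a-z])[^\]:]*:(\d+)\]")


def map_numbers_for_element(side_smiles: str, element: str) -> List[int]:
    """Return the map numbers of all mapped atoms of one element, in SMILES order."""
    return [
        int(number)
        for symbol, number in MAPPED_ATOM.findall(side_smiles)
        if symbol.upper() == element.upper()
    ]


# Hypothetical toy reaction, NOT the output of the reproduction script above.
toy = "[OH:1][CH2:2][CH2:3][OH:4]>>[OH:4][CH2:2][CH2:3][OH:1]"
reactant_side, _, product_side = toy.split(">")

reactant_oxygens = map_numbers_for_element(reactant_side, "O")
product_oxygens = map_numbers_for_element(product_side, "O")
print("reactant O map numbers:", reactant_oxygens)
print("product O map numbers:", product_oxygens)
```

This only lists which map numbers sit on oxygen atoms on each side; it says nothing about which neighbours those atoms are bonded to.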

[images: mapped reactant structure and mapped product structure]

Expected behavior
Oxygen atoms in the reactants and product should retain consistent mapping based on chemical equivalence. Specifically:

  • Reactant oxygen labeled 9 should map to product oxygen 9.
  • Reactant oxygen labeled 12 should map to product oxygen 12.

Environment details:

  • Indigo Version: 1.25.0.0-g0b3363e57-x86_64-darwin-appleclang-15000100
  • Python Version: 3.9.13 [Clang 12.0.0]
  • OS: macOS Sonoma 14.7.1

Attachments

  1. Image of the mapped product structure.
  2. Image of the mapped reactant structure.

Additional context
This issue could potentially impact downstream applications that rely on accurate atom mappings for reaction modeling. Could you confirm if this is expected behavior or a bug in the mapping logic?

cc @ben-ikt

@AlexanderSavelyev
Collaborator

AlexanderSavelyev commented Jan 6, 2025

TL;DR: this is expected behavior for the atom mapping.
TL;DR 2: are you sure that breaking the carbon chain takes less energy than detaching the hydroxyl group? Is your expected behavior actually what happens in real chemistry?

Yes, this is expected behavior for atom mapping based on MCSS algorithms. The MCS (maximum common substructure) approach tries to find the largest substructure present in both the reactant and the product, then "removes" it and continues to look for further maximal parts until all atoms are matched. In this case it finds this substructure first:

[image: first common substructure found]

Then this one

[image: second common substructure found]

So basically, the reaction just disconnects one hydroxyl group and reconnects it at another position. One should not look at the atom numbers (they are just taken from the input order) but at the bond connections.
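
As a rough illustration of looking at bond connections rather than numbers: if each side of a mapped reaction is written as a list of bonds between map numbers, the broken and formed bonds fall out as set differences. This is a sketch with hypothetical toy data (four mapped atoms, where atom 4 stands in for a migrating group); `bond_changes` is an illustrative helper and not part of Indigo.

```python
def bond_changes(reactant_bonds, product_bonds):
    """Compare two bond lists keyed by atom map numbers.

    Each bond is a pair of map numbers; the order inside a pair is ignored.
    Returns (broken, formed) as sorted lists of sorted pairs.
    """
    reactant_set = {frozenset(bond) for bond in reactant_bonds}
    product_set = {frozenset(bond) for bond in product_bonds}
    broken = sorted(tuple(sorted(bond)) for bond in reactant_set - product_set)
    formed = sorted(tuple(sorted(bond)) for bond in product_set - reactant_set)
    return broken, formed


# Hypothetical toy data: atom 4 is attached to atom 2 in the reactant
# and to atom 3 in the product; the 1-2-3 chain is unchanged.
reactant = [(1, 2), (2, 3), (2, 4)]
product = [(2, 1), (3, 2), (3, 4)]

broken, formed = bond_changes(reactant, product)
print("broken bonds:", broken)  # [(2, 4)]
print("formed bonds:", formed)  # [(3, 4)]
```

Read this way, a mapping describes which bonds change, independently of how the atoms happen to be numbered.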

Yes, the MCSS algorithm can be imperfect in some cases, but in this case I don't understand how you expect the 9th oxygen to match 12. Does it mean that a carbon in between should be removed? I am not sure which is more plausible (in terms of atom electronegativities, covalent bond stabilities, etc.) unless you know exactly how this reaction proceeds.

@AlexanderSavelyev
Collaborator

slightly updated comment above
