Re: Unambiguously describing molecules... and more
|From:||H. S. Teoh <hsteoh@...>|
|Date:||Sunday, March 18, 2007, 22:44|
On Sun, Mar 18, 2007 at 03:29:12PM -0700, Leon Lin wrote:
Ahhh, I see. That would be hard... macro-structure is very hard to
capture in something that aims to describe every last detail. A more
specialized system, perhaps one aimed specifically at describing
proteins, would work better, I think. That way, irrelevant details such
as repeating the precise structure of common amino acids all the time
can be abstracted away in more convenient units. The human brain can
only actively keep track of so many things simultaneously; you need to
chunk off repetitive parts so that they're easier to handle.
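To make the chunking idea concrete, here is a rough sketch, not an established notation: a peptide is written as a short list of residue codes, and the full atom-level SMILES is generated only on demand. The RESIDUES table and the peptide_to_smiles helper are hypothetical names made up for this illustration, the table covers only three amino acids, and stereochemistry is ignored.

```python
# Hypothetical residue table (an assumption for illustration): each entry is
# the SMILES fragment for one amino acid residue in backbone order
# N, C-alpha (with its side chain in parentheses), C(=O).
# Stereochemistry is deliberately left out to keep the sketch short.
RESIDUES = {
    "Gly": "NCC(=O)",
    "Ala": "NC(C)C(=O)",
    "Ser": "NC(CO)C(=O)",
}


def peptide_to_smiles(sequence):
    """Expand a list of three-letter residue codes into one SMILES string."""
    backbone = "".join(RESIDUES[code] for code in sequence)
    # The final O completes the free carboxylic acid at the C-terminus.
    return backbone + "O"


tripeptide = ["Gly", "Ala", "Ser"]
print("-".join(tripeptide))           # the chunked, human-sized description
print(peptide_to_smiles(tripeptide))  # the fully expanded description
```

The chunked form grows by one short token per residue, while the expanded form repeats the same backbone atoms every time, which is exactly the detail a reader would want abstracted away.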
> "H. S. Teoh" <hsteoh@...> wrote: On Fri, Mar 16, 2007 at 08:21:40PM -0400, Leon Lin wrote:
> >> Can you come up with anything better than SMILES (see
> >> http://www.daylight.com/dayhtml/doc/theory/theory.smiles.html)?
> >It looks pretty solid to me already. What disadvantages do you see in it
> >that might need improvement?
> Perhaps this is more of an impossibility than a linguistic idea, but
> can anyone think of an efficient way to describe the angles at which
> the bonds are? This is probably not important or necessary except
> maybe with proteins, which twist and bend even though they could be
> stretched out into a long sequence of amino acids.
True. But then, you're trying to represent a 3D structure in what is
essentially a 1D medium. This will be non-trivial at best.
> Going back to the graph thing, if I'm not mistaken, SMILES cannot tell
> if something is on the inside or outside of something else. For
> example, if we had a buckyball (see
> with some attachments, you
> would have no idea whether these appendages were on the inside or
> outside of the "ball". This difference affects how the molecule
> interacts with other things.
> Also, if the language is a human conlang, then readability is an
> issue. A "shortcut" option could be available for describing
> crystals or other things that have a repeated pattern. For example,
> cubane could be described as follows:
> Cubane is a molecule with 8 carbon atoms in the structure of a cube
> (with an atom at each vertex). Each carbon atom is bonded to one
> hydrogen atom.
> In SMILES, cubane is C12C3C4C1C5C4C3C25. When in the midst of other
> hydrocarbons, one could mistake this for another asymmetrical,
> branching, molecule whose structure is difficult to remember. How
> complicated would a description of a crystal be? Surely there is a
> way to describe symmetrical chemical structures? I am not
> criticizing SMILES because I don't think it was created for humans
> to read.[...]
So your system will need some way of abstracting away repeated units,
and have a consistent system for describing the 3D geometry of the
structure formed by these units. I don't know if anybody has done
something like this, but it would surely be interesting! I'm very
interested in consistent systems of describing geometric structures
(esp. in >2 dimensions).
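As a small illustration of how much regularity is hidden in the cubane string quoted above, here is a minimal sketch that reads a tiny subset of SMILES (one-letter atoms and single-digit ring closures only, no branches, no bond symbols) and recovers the bond graph. The function names smiles_to_bonds and degrees are my own, and this is nowhere near a full SMILES parser.

```python
def smiles_to_bonds(smiles):
    """Parse a tiny SMILES subset: one-letter atoms, single-digit ring closures."""
    atoms = []        # atom symbols, indexed by order of appearance
    bonds = set()     # pairs (i, j) with i < j
    open_rings = {}   # ring-closure digit -> index of the atom that opened it
    for ch in smiles:
        if ch.isalpha():
            atoms.append(ch)
            if len(atoms) > 1:
                # Without branches, each atom bonds to the one written before it.
                bonds.add((len(atoms) - 2, len(atoms) - 1))
        elif ch.isdigit():
            if not atoms:
                raise ValueError("ring-closure digit before any atom")
            current = len(atoms) - 1
            if ch in open_rings:
                # Second occurrence of the digit: close the ring with a bond.
                bonds.add((open_rings.pop(ch), current))
            else:
                open_rings[ch] = current
        else:
            raise ValueError(f"unsupported character: {ch!r}")
    if open_rings:
        raise ValueError("unclosed ring-closure digits")
    return atoms, bonds


def degrees(atoms, bonds):
    """Count how many bonds touch each atom."""
    counts = [0] * len(atoms)
    for i, j in bonds:
        counts[i] += 1
        counts[j] += 1
    return counts


atoms, bonds = smiles_to_bonds("C12C3C4C1C5C4C3C25")
print(len(atoms), len(bonds), degrees(atoms, bonds))
```

For the cubane string this gives 8 carbons, 12 bonds, and three carbon neighbours for every atom, which leaves room for one hydrogen each and matches the vertex and edge counts of a cube. Those counts alone do not prove the graph is a cube, but they show the symmetry is in the string even though a human reader cannot see it.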
We are in class, we are supposed to be learning, we have a teacher... Is
it too much that I expect him to teach me??? -- RL