doug.molineux.blog

4/9/2025

Graph Neural Networks (GNNs): Revolutionizing AI for Molecular Structures

Imagine trying to teach a computer to understand a molecule. You could describe it as a collection of atoms with specific properties, but that misses something fundamental: the connections between those atoms. This network of connections—this graph—is precisely what makes molecules behave the way they do. Graph Neural Networks (GNNs) are a family of deep learning architectures specifically designed to process data that can be represented as graphs, making them uniquely suited for understanding molecular structures and revolutionizing modern drug discovery.

What Are Graph Neural Networks?

At their simplest, GNNs are neural networks that can directly operate on graph-structured data. Unlike traditional neural networks that expect data in regular formats like grids (images) or sequences (text), GNNs work with interconnected nodes and edges. In a molecular context, atoms become nodes and chemical bonds become edges.
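To make the "atoms as nodes, bonds as edges" idea concrete, here is a minimal sketch in plain Python. It encodes the heavy atoms of ethanol as a neighbor list. The variable names and the choice to ignore hydrogens are illustrative assumptions, not a fixed convention.

```python
# Ethanol (CH3-CH2-OH), heavy atoms only: C0 - C1 - O2.
atoms = ["C", "C", "O"]          # node labels; the list index is the node id
bonds = [(0, 1), (1, 2)]         # undirected edges as (atom_i, atom_j)

def build_adjacency(num_atoms, bonds):
    """Return a neighbor list: adjacency[i] holds the atoms bonded to atom i."""
    adjacency = {i: [] for i in range(num_atoms)}
    for i, j in bonds:
        adjacency[i].append(j)   # a bond is undirected, so record it
        adjacency[j].append(i)   # in both directions
    return adjacency

adjacency = build_adjacency(len(atoms), bonds)
for i, neighbors in adjacency.items():
    print(f"{atoms[i]}{i} is bonded to", [f"{atoms[j]}{j}" for j in neighbors])
```

Everything a GNN does with a molecule starts from a structure like this one, usually with feature vectors attached to each node and edge.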

Traditional machine learning approaches to molecular analysis often relied on manually engineered "fingerprints" or descriptors that attempted to capture molecular properties. These methods essentially flattened the rich structural information into vectors, losing critical spatial and connectivity information in the process. GNNs, by contrast, preserve and leverage this structural information directly.

The core insight behind GNNs is a process called message passing. In this paradigm, each node (atom) in the graph iteratively gathers information from its neighbors, updates its own representation based on this information, and then passes on its updated knowledge. After k rounds of message passing, each node's representation contains information about every node within k bonds of it, so with enough rounds information can spread across the entire graph.
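A small sketch can show how far information travels per round. Instead of feature vectors, each node here just tracks the set of atoms it has heard from. The five-atom chain and the function name are assumptions made for illustration.

```python
# Chain of five atoms: 0 - 1 - 2 - 3 - 4 (for example, the carbons of pentane).
adjacency = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}

def receptive_fields(adjacency, rounds):
    """Track which atoms each node has heard from after each round."""
    known = {v: {v} for v in adjacency}      # round 0: each node knows only itself
    history = [known]
    for _ in range(rounds):
        # Every node merges what its neighbors knew in the previous round.
        known = {
            v: known[v].union(*(known[u] for u in adjacency[v]))
            for v in adjacency
        }
        history.append(known)
    return history

history = receptive_fields(adjacency, rounds=4)
for k, known in enumerate(history):
    print(f"round {k}: atom 0 has information from atoms {sorted(known[0])}")
```

Atom 0 gains one more bond of reach per round, which is why the number of message passing layers controls how much of the molecule each atom can "see".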

How GNNs Work: The Technical Details

While the concept is elegant, the implementation involves several sophisticated components working together:

  1. Node Feature Initialization: Each atom in a molecule starts with a set of features that might include atomic number, hybridization state, formal charge, and more. These features form the initial representation of each node.

  2. Edge Feature Encoding: Similarly, bonds between atoms have properties like bond type (single, double, triple), conjugation, and stereochemistry that form edge features.

  3. Message Passing Layers: The heart of a GNN architecture consists of these specialized layers where each node aggregates information from its neighbors. Mathematically, this often looks like:

    h_v^(k+1) = Update(h_v^k, Aggregate({h_u^k : u ∈ N(v)}))

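    Where h_v^k is the feature vector for node v at iteration k and N(v) is the set of v's neighbors. Aggregate must give the same result for any ordering of the neighbors (sum, mean, and max are common choices), and Update is typically a learnable function such as a small neural network.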

  4. Readout Function: After several iterations of message passing, a readout function combines node-level representations into a single graph-level representation, which can then be used for prediction tasks.

The architecture can be trained end-to-end using standard deep learning techniques, typically supervised learning with backpropagation.
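The four components above can be sketched end to end in a few lines of dependency-free Python. This is a toy under stated assumptions: the two-dimensional features are made up, Aggregate is a plain sum, and Update uses two fixed scalar weights where a real model would use learned weight matrices.

```python
# Toy molecule: three heavy atoms in a chain, 0 - 1 - 2.
adjacency = {0: [1], 1: [0, 2], 2: [1]}

# Initial node features h_v^0 (hypothetical values, not real descriptors).
h0 = {0: [0.6, 0.0], 1: [0.6, 0.0], 2: [0.8, 0.0]}

def aggregate(neighbor_states, dim):
    """Sum aggregation: the order of the neighbors does not matter."""
    return [sum(state[d] for state in neighbor_states) for d in range(dim)]

def update(h_self, h_agg, w_self, w_agg):
    """Update(h, m) = ReLU(w_self * h + w_agg * m) with scalar weights."""
    return [max(0.0, w_self * s + w_agg * m) for s, m in zip(h_self, h_agg)]

def message_passing(h, adjacency, rounds, w_self=0.5, w_agg=0.5):
    """Apply h_v^(k+1) = Update(h_v^k, Aggregate({h_u^k})) for several rounds."""
    dim = len(next(iter(h.values())))
    for _ in range(rounds):
        h = {
            v: update(h[v], aggregate([h[u] for u in adjacency[v]], dim), w_self, w_agg)
            for v in adjacency
        }
    return h

def readout(h):
    """Sum readout: one graph-level vector built from all node vectors."""
    dim = len(next(iter(h.values())))
    return [sum(state[d] for state in h.values()) for d in range(dim)]

h2 = message_passing(h0, adjacency, rounds=2)
graph_vector = readout(h2)
print("graph-level representation:", graph_vector)
```

Because both the aggregation and the readout are sums, renumbering the atoms leaves the graph-level vector unchanged, which is the property that lets one model handle molecules of any size and atom ordering. In training, the weights would be adjusted by backpropagation against a label such as measured solubility.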

Popular GNN Architectures

The field has rapidly evolved with several specialized architectures:

Graph Convolutional Networks (GCNs), introduced by Kipf and Welling, use a first-order approximation of spectral graph convolutions. The result is a simple layer-wise propagation rule that scales to large graphs.
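As a rough illustration, the GCN propagation rule from Kipf and Welling, H' = ReLU(D^-1/2 (A + I) D^-1/2 H W), can be written with nested lists. The helper names and the tiny example graph are assumptions for this sketch; a real implementation would use sparse tensor operations.

```python
import math

def matmul(a, b):
    """Plain nested-list matrix product."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]

def gcn_layer(adjacency_matrix, features, weight):
    """One GCN step: ReLU(D^-1/2 (A + I) D^-1/2 H W)."""
    n = len(adjacency_matrix)
    # Add self-loops so each node keeps part of its own features.
    a_hat = [[adjacency_matrix[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
             for i in range(n)]
    degree = [sum(row) for row in a_hat]
    # Symmetric normalization keeps high-degree nodes from dominating.
    norm = [[a_hat[i][j] / math.sqrt(degree[i] * degree[j]) for j in range(n)]
            for i in range(n)]
    hidden = matmul(matmul(norm, features), weight)
    return [[max(0.0, x) for x in row] for row in hidden]

# Chain 0 - 1 - 2 with a one-dimensional feature that is "on" only at atom 0.
A = [[0.0, 1.0, 0.0],
     [1.0, 0.0, 1.0],
     [0.0, 1.0, 0.0]]
H = [[1.0], [0.0], [0.0]]
W = [[1.0]]          # hypothetical fixed weight; learned in a real model
out = gcn_layer(A, H, W)
print(out)
```

After one layer the signal has reached atom 1 but not atom 2, which matches the one-hop-per-layer behavior of message passing.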

Graph Attention Networks (GATs) introduced by Veličković et al. incorporate attention mechanisms that allow nodes to weigh the importance of different neighbors, providing more flexibility in message passing.
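The attention idea can be sketched for a single node as follows. Scores follow the GAT form LeakyReLU(a · [h_i || h_j]) and are normalized with a softmax over the neighbors. As a simplification, the shared linear map W from the paper is treated as the identity, and the attention vector `a` is a hand-picked, hypothetical value rather than a learned one.

```python
import math

def leaky_relu(x, slope=0.2):
    return x if x > 0 else slope * x

def attention_weights(h_center, h_neighbors, a):
    """Softmax-normalized attention of one node over its neighbors."""
    scores = [
        leaky_relu(sum(w * x for w, x in zip(a, h_center + h_j)))  # a . [h_i || h_j]
        for h_j in h_neighbors
    ]
    max_score = max(scores)                 # subtract the max for numerical stability
    exps = [math.exp(s - max_score) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attend(h_center, h_neighbors, a):
    """Weighted sum of neighbor features using the attention weights."""
    alphas = attention_weights(h_center, h_neighbors, a)
    dim = len(h_center)
    return [sum(alpha * h_j[d] for alpha, h_j in zip(alphas, h_neighbors))
            for d in range(dim)]

h_center = [1.0, 0.0]
h_neighbors = [[1.0, 0.0], [0.0, 1.0], [0.0, 2.0]]
a = [0.0, 0.0, 0.0, 1.0]   # hypothetical: scores depend only on a neighbor's 2nd feature
alphas = attention_weights(h_center, h_neighbors, a)
new_state = attend(h_center, h_neighbors, a)
print("attention weights:", alphas)
```

Neighbors with a larger second feature receive more weight, so the node's new state is pulled toward them instead of being a uniform average.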

Message Passing Neural Networks (MPNNs) formalized by Gilmer et al. at Google provide a general framework encompassing many GNN variants, with special focus on quantum chemistry applications.

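Directed Message Passing Neural Networks (D-MPNNs), introduced by Yang et al. (2019) and implemented in Chemprop, pass messages along directed bonds rather than between atoms. A message travelling from atom u to atom v never draws on the message that just arrived from v, which stops information from echoing back and forth across a single bond. They should not be confused with directional message passing (DimeNet, Klicpera et al.), which adds angular information between bonds.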
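One way to picture message passing on directed bonds is the sketch below. Each undirected bond becomes two directed ones, and the state on u->v is updated only from bonds w->u with w different from v. The scalar states and the additive update are simplifying assumptions; published models use learned vectors and weight matrices.

```python
# Chain of three atoms 0 - 1 - 2, expanded into directed bonds.
bonds = [(0, 1), (1, 2)]
directed = [(u, v) for u, v in bonds] + [(v, u) for u, v in bonds]

def incoming_for(edge, directed_edges):
    """Directed bonds w->u that feed the message on u->v, excluding v->u."""
    u, v = edge
    return [(w, x) for (w, x) in directed_edges if x == u and w != v]

def step(messages, directed_edges, initial):
    """One update: bond state = its initial state + sum of incoming bond states."""
    return {
        e: initial[e] + sum(messages[i] for i in incoming_for(e, directed_edges))
        for e in directed_edges
    }

initial = {e: 1.0 for e in directed}     # hypothetical scalar bond states
messages = step(initial, directed, initial)
for e in directed:
    print(e, "receives from", incoming_for(e, directed), "->", messages[e])
```

The bond 1->2 hears from 0->1 but not from 2->1, so a message never bounces straight back to the atom it just came from.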

GNNs in Drug Discovery: A Perfect Match

The application of GNNs to drug discovery feels almost inevitable in retrospect. Molecules are naturally represented as graphs, and many of the properties we care about in drug development—binding affinity, solubility, toxicity—are fundamentally determined by molecular structure.

Property Prediction

One of the most direct applications of GNNs is predicting molecular properties. Traditional approaches like Quantitative Structure-Activity Relationship (QSAR) modeling have been used for decades, and GNNs have shown competitive and often superior performance on many public benchmarks.

For example, Kearnes et al. showed that molecular graph convolutions perform comparably to the best fingerprint-based methods on bioactivity and toxicity benchmarks, without relying on hand-engineered descriptors. More recently, the Therapeutics Data Commons has established public leaderboards on which GNN-based models and fingerprint-based baselines are compared directly.

Virtual Screening

Virtual screening involves computationally evaluating large libraries of compounds to identify those most likely to bind to a target protein. GNNs are well suited to this task because a trained model can score a molecule far faster than a laboratory assay can, and open-source libraries such as DeepChem provide GNN models that can be applied to screening workflows.

A fascinating example comes from Stokes et al., who used a directed message passing neural network to identify halicin, a compound with antibacterial activity against several drug-resistant pathogens. The model was trained on roughly 2,300 molecules measured for growth inhibition of Escherichia coli, picked out halicin from a drug repurposing library of several thousand compounds, and was then used to screen more than 100 million molecules from the ZINC15 database for further candidates.
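Stripped of the chemistry, a screening run is "score everything, keep the best". The sketch below shows that loop. The `toy_model` function and its scores are placeholders invented for illustration and stand in for a trained GNN; they are not real predictions.

```python
import heapq

def screen(library, score_fn, top_k):
    """Score every compound and return the top_k (score, smiles) pairs."""
    scored = ((score_fn(smiles), smiles) for smiles in library)
    # nlargest keeps only top_k items in memory, which matters for huge libraries.
    return heapq.nlargest(top_k, scored)

# Placeholder scores standing in for model output (made up for this example).
fake_scores = {"CCO": 0.12, "c1ccccc1": 0.40, "CC(=O)O": 0.05, "CCN": 0.33}

def toy_model(smiles):
    return fake_scores[smiles]

hits = screen(list(fake_scores), toy_model, top_k=2)
for score, smiles in hits:
    print(f"{smiles}: predicted activity {score:.2f}")
```

Because the library is consumed as a stream, the same pattern works whether it holds four molecules or a hundred million. The top-ranked hits would then go on to experimental testing.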

De Novo Molecular Design

Perhaps the most ambitious application is using GNNs not just to evaluate existing molecules, but to create entirely new ones with desired properties. This typically involves combining GNNs with generative models.

One approach, demonstrated by Jin et al. in their Junction Tree Variational Autoencoder, uses a GNN to encode molecular graphs into a latent space, and then decodes points from this space back into valid molecules. By navigating this latent space, researchers can generate molecules with specific desired properties.

Another approach combines GNNs with reinforcement learning, where the model is rewarded for generating molecules with desired properties. You et al. demonstrated this with their Graph Convolutional Policy Network, which learned to build molecules atom-by-atom while optimizing for specific properties.

Real-World Impact and Industry Adoption

The pharmaceutical industry has enthusiastically embraced GNNs, with numerous startups and established companies integrating them into their discovery pipelines:

Relay Therapeutics uses GNNs as part of their platform to understand protein motion and its implications for drug binding, helping them discover novel cancer therapeutics.

Atomwise pioneered the use of deep learning for virtual screening and has screened billions of compounds across hundreds of protein targets, with several candidates now in preclinical development.

Insilico Medicine applies generative deep learning to molecular design in its Chemistry42 platform, and its earlier generative work produced candidate inhibitors of the challenging target DDR1 kinase within weeks.

Exscientia made history with the first AI-designed drug to enter clinical trials, using graph-based approaches as part of their pharmacology-driven AI platform.

Challenges and Future Directions

Despite their impressive success, GNNs in drug discovery face several challenges:

Limited training data remains a significant obstacle. While public datasets like ChEMBL contain millions of data points, they are sparse across the vast chemical space and often noisy or inconsistent.

Interpretability is crucial in the pharmaceutical context where understanding the "why" behind predictions can guide medicinal chemistry optimization. Efforts to make GNNs more interpretable include attention visualization and attribution methods.

Multi-scale integration represents the frontier of GNN research in drug discovery. Molecules don't exist in isolation but interact with proteins, which themselves exist within cellular contexts. Developing GNNs that can seamlessly traverse these scales is an active area of research.

3D structures contain critical information about molecular conformations that 2D graph representations might miss. Recent work on geometric deep learning and 3D-aware GNNs, such as SchNet and DimeNet, aims to address this by incorporating distance and angular information.
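To give a feel for what "distance and angular information" means in practice, this sketch computes an interatomic distance, an angle between two bonds, and a Gaussian radial basis expansion of the distance, the kind of smooth distance encoding SchNet-style models use. The coordinates, basis centers, and `gamma` value are illustrative assumptions, not a real molecular geometry.

```python
import math

def distance(p, q):
    """Euclidean distance between two 3D points."""
    return math.dist(p, q)

def angle(p, center, q):
    """Angle p-center-q in degrees."""
    v1 = [a - c for a, c in zip(p, center)]
    v2 = [b - c for b, c in zip(q, center)]
    dot = sum(x * y for x, y in zip(v1, v2))
    cos_theta = dot / (math.hypot(*v1) * math.hypot(*v2))
    # Clamp to [-1, 1] to guard against rounding error before acos.
    return math.degrees(math.acos(max(-1.0, min(1.0, cos_theta))))

def gaussian_rbf(d, centers, gamma=10.0):
    """Expand a scalar distance into features exp(-gamma * (d - mu)^2)."""
    return [math.exp(-gamma * (d - mu) ** 2) for mu in centers]

# Illustrative coordinates: a central atom with two neighbors at right angles.
center = (0.0, 0.0, 0.0)
atom_a = (1.0, 0.0, 0.0)
atom_b = (0.0, 1.0, 0.0)

d = distance(center, atom_a)
theta = angle(atom_a, center, atom_b)
features = gaussian_rbf(d, centers=[0.0, 1.0, 2.0])
print(d, theta, features)
```

A 2D molecular graph only says that these atoms are bonded. Features like these tell the model how far apart they are and how the bonds are oriented.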

Getting Started with GNNs for Drug Discovery

For software engineers interested in exploring this field, several excellent resources and frameworks exist:

PyTorch Geometric provides an extensive collection of GNN implementations and utilities for working with graph data, including ready-made molecular datasets such as QM9 and the MoleculeNet collection.

DeepChem, an open-source platform for deep learning in drug discovery, offers pre-built GNN architectures and molecular featurization tools, along with comprehensive tutorials.

RDKit serves as the backbone for molecular manipulation and representation in most GNN-based drug discovery pipelines.

TorchDrug focuses specifically on drug discovery applications of deep learning, with built-in support for various GNN architectures.

A typical workflow might involve:

  1. Preparing molecular data using RDKit
  2. Featurizing molecules as graphs with atom and bond features
  3. Implementing or selecting a GNN architecture
  4. Training the model on a property prediction task
  5. Evaluating performance against traditional methods
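Step 2 of this workflow, featurizing a molecule as a graph, can be sketched without any dependencies. In a real pipeline the atoms and bonds would come from parsing a structure with RDKit. Here they are written by hand, and the tiny element vocabulary and feature layout are assumptions chosen for brevity.

```python
ELEMENTS = ["C", "N", "O"]                   # hypothetical, deliberately tiny vocabulary
BOND_TYPES = ["single", "double", "triple"]

def one_hot(value, choices):
    return [1.0 if value == c else 0.0 for c in choices]

def featurize(atoms, bonds):
    """Turn a molecule into (node_features, edge_index, edge_features).

    atoms: list of (element, formal_charge)
    bonds: list of (atom_i, atom_j, bond_type)
    """
    node_features = [one_hot(el, ELEMENTS) + [float(charge)] for el, charge in atoms]
    edge_index, edge_features = [], []
    for i, j, bond_type in bonds:
        for src, dst in ((i, j), (j, i)):    # store each bond in both directions
            edge_index.append((src, dst))
            edge_features.append(one_hot(bond_type, BOND_TYPES))
    return node_features, edge_index, edge_features

# Acetic acid, heavy atoms only: CH3-C(=O)-OH
atoms = [("C", 0), ("C", 0), ("O", 0), ("O", 0)]
bonds = [(0, 1, "single"), (1, 2, "double"), (1, 3, "single")]

x, edge_index, edge_attr = featurize(atoms, bonds)
print("node features:", x)
print("edges:", edge_index)
```

The three outputs map onto what most GNN libraries expect: a node feature matrix, a list of directed edges, and one feature vector per edge.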

For those without extensive chemistry background, starting with property prediction tasks using datasets from MoleculeNet is a good entry point.

Conclusion

Graph Neural Networks represent a paradigm shift in how we approach computational drug discovery. By preserving and leveraging the inherent graph structure of molecules, they enable more accurate predictions, more efficient screening, and more creative design of potential therapeutics.

The synergy between GNNs and drug discovery is particularly powerful because it combines rigorous chemical knowledge with modern deep learning approaches. Rather than replacing domain expertise, GNNs amplify it by automating the exploration of chemical space and providing insights that might otherwise remain hidden.

As we look to the future, the integration of GNNs with other emerging technologies—federated learning for privacy-preserving collaboration, active learning for experiment design, and quantum computing for enhanced simulation—promises to further accelerate the pace of discovery.

For software engineers entering this space, the opportunity to impact human health through the application of cutting-edge AI is profound. The molecules designed today with the help of GNNs could become the life-saving treatments of tomorrow, making this field not just intellectually fascinating but deeply meaningful.

References

  1. Kipf, T. N., & Welling, M. (2016). Semi-supervised classification with graph convolutional networks. arXiv:1609.02907.

  2. Veličković, P., Cucurull, G., Casanova, A., Romero, A., Liò, P., & Bengio, Y. (2017). Graph attention networks. arXiv:1710.10903.

  3. Gilmer, J., Schoenholz, S. S., Riley, P. F., Vinyals, O., & Dahl, G. E. (2017). Neural message passing for quantum chemistry. arXiv:1704.01212.

  4. Kearnes, S., McCloskey, K., Berndl, M., Pande, V., & Riley, P. (2016). Molecular graph convolutions: moving beyond fingerprints. arXiv:1603.00856.

  5. Stokes, J. M., Yang, K., Swanson, K., Jin, W., Cubillos-Ruiz, A., Donghia, N. M., ... & Collins, J. J. (2020). A deep learning approach to antibiotic discovery. Cell, 180(4), 688-702. https://www.cell.com/cell/fulltext/S0092-8674(20)30102-1

  6. Jin, W., Barzilay, R., & Jaakkola, T. (2018). Junction tree variational autoencoder for molecular graph generation. arXiv:1802.04364.

  7. You, J., Liu, B., Ying, R., Pande, V., & Leskovec, J. (2018). Graph convolutional policy network for goal-directed molecular graph generation. arXiv:1806.02473.

  8. Schütt, K. T., Kindermans, P. J., Sauceda, H. E., Chmiela, S., Tkatchenko, A., & Müller, K. R. (2017). SchNet: A continuous-filter convolutional neural network for modeling quantum interactions. arXiv:1706.08566.

  9. Klicpera, J., Groß, J., & Günnemann, S. (2020). Directional message passing for molecular graphs. arXiv:2003.03123.
