QSAR (Quantitative Structure–Activity Relationships) has been applied for decades to develop relationships between the physicochemical properties of chemical substances and their biological activities, and thereby to obtain reliable mathematical and statistical models for predicting the activities of new chemical entities. The major aim of QSAR studies is to reduce the number of compounds synthesized during drug development, a notoriously long and costly process, hence the desire to improve its efficiency from a drug discovery perspective. After Hansch proposed the QSAR concept, engineering molecular descriptors to build accurate models for the prediction of various properties became the standard approach to QSAR modelling. Researchers have proposed numerous descriptors to represent molecular 2D and 3D structures, aiming to correlate these descriptors with predicted endpoints. Approaches to generating representations from the graph representation of a molecule include graph kernels and, perhaps most importantly in the present context, ECFP (Extended Connectivity Circular Fingerprints). Once a descriptor set has been defined, various modelling methods, including linear mapping methods such as linear regression and partial least squares, and non-linear methods such as support vector machines and random forests, are applied to build models. Recently, deep neural network methods have become the latest weapon in the cheminformatician's arsenal for QSAR. Over the past decade, deep learning has become a staple in the machine learning toolbox of many fields and research areas. Notably, AI has shown incredible growth in the pharmaceutical area in recent years, and is now used not just for bioactivity and physical–chemical property prediction, but also for de novo design, image analysis, and synthesis prediction, to name a few.

This rapid growth is due in part to the substantial increase in available biochemical data thanks to the rise of techniques such as High Throughput Screening (HTS) and parallel synthesis, and also to the recent surge in parallel computational power that can feasibly be attained by harnessing General Purpose computing on Graphics Processing Units (GPGPU). Efforts have also been made to enable neural networks to perform representation learning, i.e. to let the network learn descriptors itself instead of relying on predefined molecular descriptors. Among these approaches, the graph convolutional network (GCN) is gaining popularity, and various architectures have been proposed in the data science community. The first Graph Neural Network (GNN) was put forward by Gori et al. This work was later expanded upon by Micheli and Scarselli et al. in 2005, presenting an architecture for learning node representations using recurrent neural networks capable of acting on directed, undirected, labelled, and cyclic graphs. In 2013, the Graph Convolutional Network (GCN) was presented by Bruna et al., using the principles of spectral graph theory. Many other forms of GNN have been presented since then, including, but not limited to, Graph Attention Networks, Graph Autoencoders, and Graph Spatial–Temporal Networks. In GCNs and some other forms of GNN, information is propagated through a graph in a manner similar to how conventional convolutional neural networks (CNNs) treat grid data (e.g. images). However, whilst graph-based deep learning shares some connection with CNNs with respect to the local connectivity of the component data, CNNs exploit the properties of regular connectivity, shift-invariance, and compositionality to achieve their noteworthy performance. In order to cope with the irregularity of graph data, alternative approaches must be designed, most notably to circumvent the issue of irregular non-Euclidean data and to be invariant to the graph representation.
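The classical descriptor-based QSAR workflow mentioned earlier (compute descriptors, then fit a statistical model) can be illustrated with a minimal sketch. The descriptor values and activities below are made up for illustration, and a simple one-variable least-squares fit stands in for the richer models (PLS, SVM, random forests) named in the text.

```python
# Hypothetical sketch of the descriptor-then-model QSAR workflow.
# Descriptor values and activities are invented toy data, not from any study.

def fit_linear(xs, ys):
    """Ordinary least-squares fit of y = a*x + b for a single descriptor."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Toy descriptor (e.g. a computed logP-like value) vs. measured activity.
descriptor = [1.0, 2.0, 3.0, 4.0]
activity = [2.1, 3.9, 6.1, 7.9]

a, b = fit_linear(descriptor, activity)
print(round(a, 2), round(b, 2))  # fitted slope and intercept
```

In practice the descriptor vector would come from a cheminformatics toolkit (e.g. ECFP bit vectors) and the model from a standard machine learning library; the point here is only the two-stage structure of the pipeline.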
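The GCN-style propagation described in this section, where each node updates its representation by aggregating information from its neighbours, can be sketched in a few lines. This is an illustrative toy implementation (mean aggregation over the node and its neighbours, followed by a linear transform and ReLU), not the spectral formulation of Bruna et al.; the function and variable names are my own.

```python
# Illustrative sketch of one GCN-style propagation step:
# H' = ReLU( D^-1 (A + I) H W ), written with plain Python lists for clarity.

def gcn_layer(adj, feats, weight):
    """adj: n x n adjacency (0/1); feats: n x f features; weight: f x f_out."""
    n, f = len(adj), len(feats[0])
    out = []
    for i in range(n):
        # Aggregate node i's own features with those of its neighbours (mean).
        neigh = [j for j in range(n) if adj[i][j] or j == i]
        agg = [sum(feats[j][k] for j in neigh) / len(neigh) for k in range(f)]
        # Linear transform followed by ReLU non-linearity.
        out.append([max(0.0, sum(agg[k] * weight[k][c] for k in range(f)))
                    for c in range(len(weight[0]))])
    return out

# A three-node path graph 0-1-2 with scalar node features.
A = [[0, 1, 0],
     [1, 0, 1],
     [0, 1, 0]]
H = [[1.0], [2.0], [3.0]]
W = [[1.0]]  # identity transform, to keep the aggregation visible

print(gcn_layer(A, H, W))  # each node's feature becomes its neighbourhood mean
```

After one step the middle node, connected to both endpoints, holds the mean of all three features; stacking such layers is what lets information propagate across the graph, in loose analogy to stacked convolutions on a grid.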