Machine Learning in Biology and Medicine

Open Access. Published: August 27, 2019. DOI: https://doi.org/10.1016/j.yamp.2019.07.010

      Key points

      • Understanding complex biological processes in a cell is critical to improve the understanding of the biology of most diseases and will help to optimally intervene to mitigate the disease.
      • The state of the cells involved in the disease can be characterized using multimodal imaging and next-generation sequencing technologies.
      • Commonly, the phenotype information of the disease is in clinical records as unstructured text.
      • Machine learning is a technology that enables us to understand the relationship between the state of the cells and the phenotype of the disease.
      • In this review, the authors introduce common machine learning paradigms and illustrate how they can be used to advance biology and medicine.

      Introduction

      Humans, like other multicellular organisms, are made up of many cells, the fundamental units of life. The cells in a complex multicellular organism are organized into tissues, which are groups of similar cells that work together on a specific task. Organs are structures made up of 2 or more tissues organized to carry out a particular function. The cells in a tissue have specific functions, and the molecular processes in the cells are regulated to achieve this functionality. A disease condition happens when the regulatory process breaks down [
      • Davies K.E.
      • Nowak K.J.
      Molecular mechanisms of muscular dystrophies: old and new players.
      ,
      • Heineke J.
      • Molkentin J.D.
      Regulation of cardiac hypertrophy by intracellular signalling pathways.
      ,
      • Guggino W.B.
      • Stanton B.A.
      New insights into cystic fibrosis: molecular switches that regulate CFTR.
      ,
      • Kudlow B.A.
      • Kennedy B.K.
      • Monnat Jr., R.J.
      Werner and Hutchinson-Gilford progeria syndromes: mechanistic basis of human progeroid diseases.
      ,
      • Muoio D.M.
      • Newgard C.B.
      Molecular and metabolic mechanisms of insulin resistance and β-cell failure in type 2 diabetes.
      ,
      • Golemis E.A.
      • Scheet P.
      • Beck T.N.
      • et al.
      Molecular mechanisms of the preventable causes of cancer in the United States.
      ]. The breakdown in the regulatory mechanisms can happen in several ways, including single nucleotide mutations, insertions and deletions, copy number changes, epigenetic modifications, and changes in the chromatin conformations [
      • Byun J.A.
      • Melacini G.
      NMR methods to dissect the molecular mechanisms of disease-related mutations (DRMs): understanding how DRMs remodel functional free energy landscapes.
      ]. It is important to identify the alterations responsible for the disruption of these biochemical processes before an optimal intervention can be made.
      Recent advances in next-generation sequencing (NGS) technologies and multimodal imaging enable us to comprehensively characterize the genomic, epigenetic, and protein profiles of normal and diseased tissues [
      • Carter T.C.
      • He M.M.
      Challenges of identifying clinically actionable genetic variants for precision medicine.
      ,
      • Collins F.S.
      • Varmus H.
      A new initiative on precision medicine.
      ,
      • Kitchen R.R.
      • Rozowsky J.S.
      • Gerstein M.B.
      • et al.
      Decoding neuroproteomics: integrating the genome, translatome and functional anatomy.
      ]. Together these molecular profiles comprise the genotype of the disease. Clinical practices and research leveraging electronic health records (EHRs) are widely implemented [
      • Overby C.L.
      • Pathak J.
      • Gottesman O.
      • et al.
      A collaborative approach to developing an electronic health record phenotyping algorithm for drug-induced liver injury.
      ,
      • Gottesman O.
      • Kuivaniemi H.
      • Tromp G.
      • et al.
      The Electronic Medical Records and Genomics (eMERGE) Network: past, present, and future.
      ] and therefore provide a comprehensive characterization of the phenotype of the disease [
      • Rasmussen-Torvik L.J.
      • Stallings S.C.
      • Gordon A.S.
      • et al.
      Design and anticipated outcomes of the eMERGE-PGx project: a multicenter pilot for preemptive pharmacogenomics in electronic health record systems.
      ].
      Using machine learning (ML) algorithms, we can integrate large-scale heterogeneous data (eg, clinical, imaging, and genomic data), allowing researchers to better understand the genetic basis of disease and identify the optimal therapeutic approach. Several research programs have been designed to take advantage of this approach, as exemplified by the All of Us research program, which aims to collect genome sequence data from 1 million participants as a critical component of a precision medicine research platform [
      • Collins F.S.
      • Varmus H.
      A new initiative on precision medicine.
      ,
      • Zheng R.
      • Li M.
      • Liang Z.
      • et al.
      SinNLRR: a robust subspace clustering method for cell type detection by nonnegative and low rank representation.
      ].
      Given such massive datasets, one could in principle integrate molecular markers and other clinical correlates of disease to devise rules for identifying the appropriate clinical decision. However, as the number of scenarios increases, defining rules that accurately address all scenarios becomes very demanding, inefficient, and inaccurate. ML approaches, on the other hand, do not depend on rules defined by human experts. Instead, they process data in raw form and are programmed to learn these rules themselves. Usually, the tasks are not well-defined and a concise relationship between input and output may not be known. Instead, several examples of input/output pairs are presented to the algorithm. The computational method then adjusts its parameters to provide the best approximation of the relationship implicit in the input/output examples. This approach has been successfully applied in solving important clinical challenges, such as detection of retinopathy [
      • Brown J.M.
      • Campbell J.P.
      • Beers A.
      • et al.
      Automated diagnosis of plus disease in retinopathy of prematurity using deep convolutional neural networks.
      ], melanoma [
      • Haenssle H.A.
      • Fink C.
      • Schneiderbauer R.
      • et al.
      Man against machine: diagnostic performance of a deep learning convolutional neural network for dermoscopic melanoma recognition in comparison to 58 dermatologists.
      ], and Parkinson disease [
      • Zhan A.
      • Mohan S.
      • Tarolli C.
      • et al.
      Using smartphones and machine learning to quantify Parkinson disease severity: the mobile Parkinson disease score.
      ], and in predicting readmission risk [
      • Benuzillo J.
      • Caine W.
      • Evans R.S.
      • et al.
      Predicting readmission risk shortly after admission for CABG surgery.
      ].
      In this review, we first introduce the major categories of ML and their applications in biology and medicine. We then discuss deep learning and text mining as separate topics, along with the significant impact that these methods have recently had on medical research. Finally, we review some of the successful applications of ML to complex problems in biology and medicine.

      Machine learning paradigm

      There are 3 basic types of learning paradigms in ML: supervised learning, unsupervised learning, and reinforcement learning (RL).

      Supervised Learning

      The supervised learning method can be used to design a predictor using samples with known class labels (training data) to predict the class of a new sample. For example, the gene expression profile of a tumor sample can be used to predict the prognosis of the patient using a predictor built from gene expression profiles of tumor samples with known patient prognosis [
      • Cuzick J.
      • Swanson G.P.
      • Fisher G.
      • et al.
      Prognostic value of an RNA expression signature derived from cell cycle proliferation genes in patients with prostate cancer: a retrospective study.
      ]. The supervised learning process is called classification when the output labels are categorical (eg, good vs poor prognosis) and regression when the output is continuous (eg, number of months survived). Linear and logistic regressions are examples of supervised learning algorithms. The learned predictor (to be precise, the prediction function) could be a straight line or a very complex curve in a very high dimensional space, depending on the nature of the data provided. Several hundred different classification algorithms have been developed [
      • Fernández-Delgado M.
      • Cernadas E.
      • Barro S.E.
      Do we need hundreds of classifiers to solve real world classification problems?.
      ], as the No Free Lunch theorem guarantees that there is no such thing as an optimal classification algorithm that works for all datasets and clinical decisions to be modeled [
      • Wolpert D.H.
      • Macready W.G.
      No free lunch theorems for optimization.
      ].
      One important question that arises in supervised learning is the performance of the classification model in classifying a new sample. Empirical learning of classifiers attempts to infer a prediction model from only a finite set of samples. Therefore, a complicated model can learn not only the real patterns but also the noise in the training data to the extent that it negatively impacts the performance of the model on new data, a phenomenon called overfitting. When this happens, the average prediction error estimated during training could be much lower than the error on unknown samples. Regularization is a way to balance the classification error and the complexity of the model, to favor simpler over unnecessarily complicated models, thus imposing Occam's razor on the solution.
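      The trade-off between fit and model complexity can be illustrated with a minimal sketch. It assumes ridge (L2) regularization on a single-weight linear model, which is only one of many regularization schemes and is not named in the text above; the data values and the function name `fit_ridge_1d` are hypothetical.

```python
def fit_ridge_1d(xs, ys, lam):
    """Minimize sum((y - w*x)**2) + lam * w**2 for a single weight w (no intercept)."""
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    # Setting the derivative with respect to w to zero gives w = sxy / (sxx + lam).
    return sxy / (sxx + lam)


# Hypothetical noisy measurements that roughly follow y = x.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [1.2, 1.9, 3.2, 3.8]

w_plain = fit_ridge_1d(xs, ys, lam=0.0)    # ordinary least squares
w_ridge = fit_ridge_1d(xs, ys, lam=10.0)   # penalized fit, pulled toward zero

print(f"lam=0:  w = {w_plain:.3f}")
print(f"lam=10: w = {w_ridge:.3f}")
```

      A larger penalty `lam` shrinks the weight toward zero, trading some training error for a simpler model. In practice the penalty strength is itself a hyperparameter that has to be tuned.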
      Cross-validation is a strategy to test the model's ability to predict new data that was not used in estimating it, in order to flag problems like overfitting or selection bias and to give an insight on how the model will generalize to an independent dataset. Typically, this involves partitioning the data into mutually exclusive subsets, developing the classifier model on one subset and validating the analysis on the remaining subset. To reduce variability, multiple rounds of cross-validation are performed using different partitions, and the validation results are combined over the rounds to give an estimate of the model's predictive performance [
      • Molinaro A.M.
      • Simon R.
      • Pfeiffer R.M.
      Prediction error estimation: a comparison of resampling methods.
      ,
      • Varma S.
      • Simon R.
      Bias in error estimation when using cross-validation for model selection.
      ].
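      The partition-train-validate cycle described above can be sketched as follows. This is a minimal illustration under stated assumptions: the classifier is a simple nearest-centroid rule on a single feature, and the expression values, labels, and function names are hypothetical rather than taken from any cited study.

```python
import random


def k_fold_indices(n_samples, k, seed=0):
    """Shuffle sample indices and split them into k mutually exclusive folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def fit_centroids(xs, labels):
    """Nearest-centroid 'training': the mean feature value of each class."""
    centroids = {}
    for label in set(labels):
        values = [x for x, y in zip(xs, labels) if y == label]
        centroids[label] = sum(values) / len(values)
    return centroids


def predict(centroids, x):
    """Assign x to the class whose centroid is closest."""
    return min(centroids, key=lambda label: abs(x - centroids[label]))


def cross_validated_accuracy(xs, labels, k=5):
    """Average held-out accuracy over k rounds, each with a different validation fold."""
    fold_accuracies = []
    for held_out in k_fold_indices(len(xs), k):
        held = set(held_out)
        train_idx = [i for i in range(len(xs)) if i not in held]
        model = fit_centroids([xs[i] for i in train_idx],
                              [labels[i] for i in train_idx])
        correct = sum(predict(model, xs[i]) == labels[i] for i in held_out)
        fold_accuracies.append(correct / len(held_out))
    return sum(fold_accuracies) / k


# Hypothetical one-gene expression values for two prognosis groups.
xs = [1.0 + 0.1 * i for i in range(10)] + [3.0 + 0.1 * i for i in range(10)]
labels = ["good"] * 10 + ["poor"] * 10

cv_accuracy = cross_validated_accuracy(xs, labels, k=5)
print(f"5-fold cross-validated accuracy: {cv_accuracy:.2f}")
```

      Each sample is used for validation exactly once and never contributes to the model that predicts it, which is what makes the averaged accuracy an estimate of performance on new data.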
      Often, the cross-validation procedure is also used to tune the hyperparameters (parameters that govern learning of the predictor) of the classification model. In that case, one risks optimistically biasing the error estimate. One way to overcome this problem is to use nested cross-validation, in which an inner cross-validation loop is used to tune the hyperparameters and select the best model, and an outer cross-validation loop is used to evaluate the prediction performance of the model selected by the inner loop on data that played no part in the tuning [
      • Cawley G.C.
      • Talbot N.L.C.
      On over-fitting in model selection and subsequent selection bias in performance evaluation.
      ].
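      A nested cross-validation loop might look like the sketch below. It is an illustration under stated assumptions, not the procedure of the cited work: the model is a k-nearest-neighbor classifier on one feature, the hyperparameter being tuned is the number of neighbors, and the data and function names are hypothetical.

```python
import random


def make_folds(indices, k, seed=0):
    """Shuffle the given indices and split them into k disjoint folds."""
    shuffled = list(indices)
    random.Random(seed).shuffle(shuffled)
    return [shuffled[i::k] for i in range(k)]


def knn_predict(train, x, n_neighbors):
    """Majority vote among the n_neighbors closest training samples (1-D feature)."""
    nearest = sorted(train, key=lambda pair: abs(pair[0] - x))[:n_neighbors]
    votes = [label for _, label in nearest]
    return max(set(votes), key=votes.count)


def accuracy(train, test, n_neighbors):
    hits = sum(knn_predict(train, x, n_neighbors) == y for x, y in test)
    return hits / len(test)


def cv_accuracy(data, n_neighbors, k):
    """Plain k-fold cross-validation for one hyperparameter value."""
    scores = []
    for fold in make_folds(range(len(data)), k):
        held = set(fold)
        train = [data[i] for i in range(len(data)) if i not in held]
        test = [data[i] for i in fold]
        scores.append(accuracy(train, test, n_neighbors))
    return sum(scores) / len(scores)


def nested_cv(data, candidates=(1, 3, 5), outer_k=5, inner_k=3):
    outer_scores = []
    for fold in make_folds(range(len(data)), outer_k, seed=1):
        held = set(fold)
        outer_train = [data[i] for i in range(len(data)) if i not in held]
        outer_test = [data[i] for i in fold]
        # Inner loop: choose the hyperparameter using the outer-training data only.
        best = max(candidates, key=lambda c: cv_accuracy(outer_train, c, inner_k))
        # Outer loop: score the tuned model on data the tuning never saw.
        outer_scores.append(accuracy(outer_train, outer_test, best))
    return sum(outer_scores) / len(outer_scores)


# Hypothetical (feature, class) pairs for two well-separated classes.
data = [(1.0 + 0.1 * i, 0) for i in range(10)] + [(3.0 + 0.1 * i, 1) for i in range(10)]
nested_estimate = nested_cv(data)
print(f"Nested cross-validation accuracy: {nested_estimate:.2f}")
```

      The key design point is that the outer test fold is never visible to the inner loop, so the reported accuracy is not inflated by the hyperparameter search.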
      Supervised learning methods have been successfully used in the design of prognostic and predictive biomarkers. A variety of factors influence a patient’s clinical outcome, including intrinsic characteristics of the patient, the disease, and the effects of any treatments that the patient receives. The treatment-independent characteristics are called prognostic factors, whereas the biomarkers that can identify the response to a therapeutic regimen are called predictive biomarkers. Prognostic biomarkers identify the patients who are to be treated aggressively, and predictive biomarkers help to identify the optimal therapy for the patient. In addition, prognostic markers can also reveal the biology of the disease process. In cancer, these outcomes are usually death, recurrence of disease, or both. For such studies, distinct study phases have been proposed and usually begin with exploratory studies to identify promising prognostic markers and move toward larger, preferably prospective, confirmatory studies [
      • Ransohoff D.F.
      How to improve reliability and efficiency of research about molecular markers: roles of phases, guidelines, and study design.
      ]. Phase 1 studies are early exploratory analyses to generate hypotheses and to identify potential markers for further investigation. Phase 2 studies continue the exploratory investigation and assess the relationship between marker and prognosis. Phase 3 studies are large, confirmatory studies that state prior hypotheses and should certainly be protocol driven to ensure the highest level of evidence [
      • Kattan M.W.
      Judging new markers by their ability to improve predictive accuracy.
      ].
      The method used to derive the prognostic markers can have a strong influence on the results and on the interpretation of the study. Successful ML methods used in the design of prognostic markers include regression models and classification and regression trees [
      • Zlobec I.
      • Steele R.
      • Nigam N.
      • et al.
      A predictive model of rectal tumor response to preoperative radiotherapy using classification and regression tree methods.
      ]. Other approaches, for example, artificial neural networks (ANN), are sometimes used for the analysis of marker studies [
      • Zafeiris D.
      • Rutella S.
      • Ball G.R.
      An artificial neural network integrated pipeline for biomarker discovery using Alzheimer's disease as a case study.
      ,
      • Bertolaccini L.
      • Solli P.
      • Pardolesi A.
      • et al.
      An overview of the use of artificial neural networks in lung cancer research.
      ]. An exemplary application of a supervised learning method is the 70-gene-panel–based predictor, which accurately identifies low-risk patients with early-stage breast cancer, thereby sparing them the side effects of unnecessary chemotherapy [
      • Cardoso F.
      • van't Veer L.J.
      • Bogaerts J.
      • et al.
      70-gene signature as an aid to treatment decisions in early-stage breast cancer.
      ].

      Unsupervised Learning

      Unsupervised learning is a class of algorithms used to draw inferences on underlying organization and associations among the samples without relevant labels. Cluster analysis, association rule mining, and dimensionality reduction methods are all part of the unsupervised analysis.
      Cluster analysis is a method that groups the samples into distinct groups such that the samples in the same group are more similar to each other than samples in other groups. There are several clustering algorithms available with different pros and cons. The most commonly used clustering algorithms in biomedical research are the K-means, hierarchical clustering, and self-organizing maps [
      • D'Haeseleer P.
      How does gene expression clustering work?.
      ].
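      As an illustration of how one of these algorithms groups samples, the sketch below implements the standard K-means iteration (Lloyd's algorithm) in plain Python. The sample coordinates and the choice of initial centroids are hypothetical; real analyses typically use many features and repeated random initializations.

```python
def kmeans(points, centroids, n_iterations=10):
    """Lloyd's algorithm: alternate between assigning points and updating centroids."""
    clusters = [[] for _ in centroids]
    for _ in range(n_iterations):
        # Assignment step: each point joins the cluster of its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            distances = [sum((a - b) ** 2 for a, b in zip(p, c)) for c in centroids]
            clusters[distances.index(min(distances))].append(p)
        # Update step: move each centroid to the mean of its cluster
        # (an empty cluster keeps its previous centroid).
        centroids = [
            tuple(sum(coord) / len(cluster) for coord in zip(*cluster)) if cluster else c
            for cluster, c in zip(clusters, centroids)
        ]
    return centroids, clusters


# Hypothetical two-feature profiles for six samples forming two groups.
points = [(1.0, 1.0), (1.2, 0.8), (0.8, 1.1), (5.0, 5.0), (5.2, 4.9), (4.8, 5.1)]
centroids, clusters = kmeans(points, centroids=[points[0], points[3]])

for centroid, cluster in zip(centroids, clusters):
    print(centroid, cluster)
```

      The number of clusters is fixed in advance by the number of initial centroids, which is one reason the interpretation of the resulting groups depends on the analyst's choices.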
      It is often assumed that the clusters identified are truly meaningful in the sense that they correspond to a real underlying grouping. However, in reality, there is no agreed definition of what a true cluster is, and the interpretation strongly depends on the context and aim of the research [
      • Hennig C.
      What are the true clusters?.
      ].
      Association rule learning is an unsupervised analysis method for discovering interesting relations among features characterizing samples based on a chosen measure of interestingness [
      • Hipp J.
      • Güntzer U.
      • Nakhaeizadeh G.
      Algorithms for association rule mining—a general survey and comparison.
      ]. The final goal is to help machines mimic the human brain’s feature extraction and association capabilities from uncategorized data. Association rule learning algorithms have been successfully used to detect multimorbidity and comorbidity in old age [
      • Held F.P.
      • Blyth F.
      • Gnjidic D.
      • et al.
      Association rules analysis of comorbidity and multimorbidity: the Concord Health and Aging in Men Project.
      ].
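      Commonly used measures of interestingness for a rule "A implies B" are support, confidence, and lift. The sketch below computes them for a toy set of patient records; the records and condition names are hypothetical and are not drawn from the cited study.

```python
def support(transactions, itemset):
    """Fraction of records that contain every item in the itemset."""
    itemset = set(itemset)
    return sum(itemset <= record for record in transactions) / len(transactions)


def confidence(transactions, antecedent, consequent):
    """Estimated P(consequent | antecedent)."""
    both = set(antecedent) | set(consequent)
    return support(transactions, both) / support(transactions, antecedent)


def lift(transactions, antecedent, consequent):
    """Confidence relative to the baseline frequency of the consequent."""
    return confidence(transactions, antecedent, consequent) / support(transactions, consequent)


# Hypothetical diagnosis sets, one per patient.
patients = [
    {"hypertension", "diabetes"},
    {"hypertension", "diabetes", "arthritis"},
    {"hypertension"},
    {"arthritis"},
    {"diabetes", "hypertension"},
]

rule_support = support(patients, {"hypertension", "diabetes"})
rule_confidence = confidence(patients, {"hypertension"}, {"diabetes"})
rule_lift = lift(patients, {"hypertension"}, {"diabetes"})

print(f"support={rule_support:.2f} confidence={rule_confidence:.2f} lift={rule_lift:.2f}")
```

      A lift above 1 indicates that the two conditions co-occur more often than would be expected if they were independent. Rule mining algorithms search for itemsets whose support and confidence exceed chosen thresholds.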
      Unsupervised learning has been most widely used to identify homogeneous subgroups of biological samples. In most diseases, the clinical outcome and therapeutic effectiveness are heterogeneous; it is therefore important to identify homogeneous subgroups of patients in order to find the biological determinants of the disease and of the effectiveness of therapy. This approach has been very successful in the treatment of patients with breast cancer [
      • Perou C.M.
      • Sørlie T.
      • Eisen M.B.
      • et al.
      Molecular portraits of human breast tumours.
      ,
      • Sorlie T.
      • Perou C.M.
      • Tibshirani R.
      • et al.
      Gene expression patterns of breast carcinomas distinguish tumor subclasses with clinical implications.
      ,
      • Dai X.
      • Li T.
      • Bai Z.
      • et al.
      Breast cancer intrinsic subtype classification, clinical use and future trends.
      ], but there is a growing realization that such an approach would be beneficial for all cancer types [
      • Cancer Genome Atlas Research Network
      Comprehensive genomic characterization defines human glioblastoma genes and core pathways.
      ,
      • Tothill R.W.
      • Tinker A.V.
      • George J.
      • et al.
      Novel molecular subtypes of serous and endometrioid ovarian cancer linked to clinical outcome.
      ,
      • Cancer Genome Atlas Network
      Comprehensive molecular characterization of human colon and rectal cancer.
      ,
      • Cancer Genome Atlas Research Network
      Comprehensive genomic characterization of squamous cell lung cancers.
      ].
      Another application of unsupervised learning algorithms is in the analysis of single-cell RNA-seq data to identify novel cell types in a tissue [
      • Kiselev V.Y.
      • Kirschner K.
      • Schaub M.T.
      • et al.
      SC3: consensus clustering of single-cell RNA-seq data.
      ,
      • Xu C.
      • Su Z.
      Identification of cell types from single-cell transcriptomes using a novel clustering method.
      ,
      • Butler A.
      • Hoffman P.
      • Smibert P.
      • et al.
      Integrating single-cell transcriptomic data across different conditions, technologies, and species.
      ].

      Reinforcement Learning

      RL is an approach that is categorized between supervised and unsupervised learning. It does not rely on a set of labeled training data and so is not a supervised learning algorithm, but it is also not unsupervised learning because it relies on the reward that the learning algorithm needs to maximize. The learning approach is to find the right actions to take in different situations to maximize the reward.
      RL methods have been successfully applied to optimizing antiretroviral therapy for human immunodeficiency virus [
      • Parbhoo S.
      • Bogojeska J.
      • Zazzi M.
      • et al.
      Combining kernel and model based learning for HIV therapy selection.
      ] and in determining the best approach to managing sepsis [
      • Komorowski M.
      • Celi L.A.
      • Badawi O.
      • et al.
      The artificial intelligence clinician learns optimal treatment strategies for sepsis in intensive care.
      ]. Unlike the supervised learning method that produces one-time predictions, the decision made by the RL algorithm affects both the patient’s future health and the patient’s future treatment option.
      Many treatment options are now available for a given disease condition, and determining the best treatment policy for a particular patient is challenging for clinicians. In theory, this problem can be addressed using RL algorithms. In a typical RL setting, one would evaluate a policy by simply having the agent make decisions and then computing the average reward from the outcomes. Such an approach would be unethical here because it would require experimenting on human subjects while training the RL algorithm. Therefore, it is necessary to learn from historical observational data. In the RL literature, evaluating a policy from such data is referred to as “off-policy evaluation,” and there are many RL algorithms that can learn the optimal policy effectively in this context [
      • Gottesman O.
      • Johansson F.
      • Meier J.
      • et al.
      Evaluating reinforcement learning algorithms in observational health settings.
      ]. Despite this challenge, RL algorithms are being used to solve major challenges in health care [
      • Liu Y.
      • Logan B.
      • Liu N.
      • et al.
      Deep reinforcement learning for dynamic treatment regimes on medical registry data.
      ].
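      One standard way to estimate the value of a new policy from logged observational data is importance sampling, sketched below for the simplest one-decision case. This is an assumption-laden illustration rather than the method of the cited studies: the states, actions, rewards, and the logging probabilities of the historical (behavior) policy are all hypothetical. In sequential settings, the probability ratios are multiplied along each patient trajectory.

```python
def importance_sampling_value(logged, target_policy):
    """Ordinary importance-sampling estimate of the target policy's mean reward.

    Each logged record is (state, action, reward, behavior_prob), where
    behavior_prob is the probability with which the historical policy chose
    the recorded action in that state.
    """
    total = 0.0
    for state, action, reward, behavior_prob in logged:
        # Reweight each outcome by how much more (or less) likely the
        # target policy is to take the logged action.
        weight = target_policy(state, action) / behavior_prob
        total += weight * reward
    return total / len(logged)


def target_policy(state, action):
    """Hypothetical deterministic policy: high dose only for severe cases."""
    preferred = "high_dose" if state == "severe" else "low_dose"
    return 1.0 if action == preferred else 0.0


# Hypothetical historical records collected under a 50/50 random choice of dose.
logged = [
    ("severe", "high_dose", 1.0, 0.5),
    ("severe", "low_dose", 0.0, 0.5),
    ("mild", "low_dose", 1.0, 0.5),
    ("mild", "high_dose", 0.0, 0.5),
]

behavior_value = sum(reward for _, _, reward, _ in logged) / len(logged)
target_value = importance_sampling_value(logged, target_policy)
print(f"historical policy: {behavior_value:.2f}, evaluated policy: {target_value:.2f}")
```

      The estimate is only as reliable as the logged probabilities and the coverage of the historical data. Actions that the historical policy rarely took yield large weights and high variance.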

      Text mining for inference from literature

      The ever-increasing number of scientific publications in the MEDLINE database, which holds millions of references to life sciences-related articles, is a rich resource from which scientists can garner relevant information. However, getting access to the most relevant information can be tedious because of the lack of intelligent search tools. For example, we have the ability to identify the genomic alterations associated with cancer using sequencing technology at reasonable costs. However, identifying the clinically relevant mutations from the list can be challenging. There are databases of manually curated lists of disease-related and therapeutically targetable genomic alterations, but these are time consuming to construct and often incomplete [
      • Patterson S.E.
      • Statz C.M.
      • Yin T.
      • et al.
      Utility of the JAX Clinical Knowledgebase in capture and assessment of complex genomic cancer data.
      ].
      ML–based natural language processing (NLP) techniques can be used to extract structured information from unstructured text [
      • Roden D.M.
      • Denny J.C.
      Integrating electronic health record genotype and phenotype datasets to transform patient care.
      ], which can then be curated and categorized based on related terms or keywords [
      • Sebastiani F.
      Machine learning in automated text categorization.
      ]. These structured data can then be used as features for ML algorithms to predict disease progression and drug-drug interactions [
      • Zeng Z.
      • Li X.
      • Espino S.
      • et al.
      Contralateral breast cancer event detection using nature language processing.
      ].
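      The step from unstructured text to features can be sketched with a simple bag-of-words encoding. This is a deliberately minimal illustration: the clinical note, the vocabulary, and the function names are hypothetical, and real clinical NLP systems also handle negation, abbreviations, and context, which this sketch does not.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def bag_of_words(text, vocabulary):
    """Count how often each vocabulary term occurs in the text."""
    counts = Counter(tokenize(text))
    return [counts[term] for term in vocabulary]


# Hypothetical clinical note and vocabulary of terms of interest.
note = "Patient reports chest pain. No fever. Pain worsens on exertion."
vocabulary = ["pain", "fever", "exertion", "cough"]

features = bag_of_words(note, vocabulary)
print(dict(zip(vocabulary, features)))
```

      Note that "fever" is counted even though the note says "No fever". This is exactly the kind of error that motivates more sophisticated NLP methods before such features are passed to a predictor.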
      There are several applications that use NLP techniques in health care. For example, DeepPhe is an open source implementation of an NLP method for obtaining phenotype-related information of patients with cancer from EHRs [
      • Savova G.K.
      • Tseytlin E.
      • Finan S.
      • et al.
      DeepPhe: a natural language processing system for extracting cancer phenotypes from clinical records.
      ]. Another NLP-based approach demonstrated superior performance in the identification of a contralateral breast cancer event from clinical reports [
      • Zeng Z.
      • Li X.
      • Espino S.
      • et al.
      Contralateral breast cancer event detection using nature language processing.
      ]. An NLP-based approach is also used to organize and classify cancer-related literature into biologically meaningful categories that can help scientists retrieve the desired information efficiently [
      • Baker S.
      • Ali I.
      • Silins I.
      • et al.
      Cancer Hallmarks Analytics Tool (CHAT): a text mining approach to organize and evaluate scientific literature on cancer.
      ].
      However, once an ML or artificial intelligence (AI) system trained on an initial dataset has been deployed, the underlying datasets continue to be updated frequently because of advances in technology. Efforts are under way to produce datasets (eg, Dream Challenge [
      • Angermueller C.
      • Pärnamaa T.
      • Parts L.
      • et al.
      Deep learning for computational biology.
      ]) to establish standards for benchmarking, testing, and evaluating the performances of various cancer-related algorithms by overcoming the issues of data sharing, security, and standardization.

      Deep learning

      Deep learning, based on deep neural networks (DNNs), is another ML technique that has gained significant momentum in recent years. DNNs are a class of artificial neural networks (ANNs) with a large number of hidden layers and specific activation functions.
      The basic building block of an ANN is a neuron, which receives a set of values as input and produces a single output based on them. Most commonly, the neuron computes a linear combination of its inputs, each scaled by a corresponding weight, and passes the result through a nonlinear function. A set of such neurons forms a layer, and several such layers are interposed between the input layer and the output layer. The layers between the input and output layers are called hidden layers, and a network with many hidden layers is referred to as a DNN.
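      The neuron and layer structure described above can be written out in a few lines. The sketch below assumes a sigmoid nonlinearity and adds a bias term to each neuron; both are common choices but are assumptions not specified in the text, and the weights and inputs are hypothetical.

```python
import math


def neuron(inputs, weights, bias):
    """Weighted sum of the inputs passed through a sigmoid nonlinearity."""
    z = sum(w * x for w, x in zip(weights, inputs)) + bias
    return 1.0 / (1.0 + math.exp(-z))


def layer(inputs, weight_rows, biases):
    """A layer is a set of neurons that all read the same inputs."""
    return [neuron(inputs, weights, bias) for weights, bias in zip(weight_rows, biases)]


def forward(inputs, layers):
    """Feed the output of each layer into the next one."""
    activations = inputs
    for weight_rows, biases in layers:
        activations = layer(activations, weight_rows, biases)
    return activations


# Hypothetical network: 3 inputs -> 2 hidden neurons -> 1 output neuron.
network = [
    ([[0.2, -0.4, 0.1], [0.7, 0.3, -0.5]], [0.0, 0.1]),   # hidden layer
    ([[1.5, -2.0]], [0.05]),                              # output layer
]

output = forward([1.0, 0.5, -1.0], network)
print(output)
```

      Training consists of adjusting the weights and biases so that the network output matches the desired output on the training examples. Stacking more hidden layers in the `network` list gives a deeper network.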
      DNNs have become one of the most used ML methods owing to improvements in computing speed combined with their data-driven approach, and a lack of the need for feature engineering. They have in particular been very successful at image classification, speech recognition, and NLP [
      • LeCun Y.
      • Bengio Y.
      • Hinton G.
      Deep learning.
      ,
      • Goodfellow I.J.
      • Erhan D.
      • Carrier P.L.
      • et al.
      Challenges in representation learning: a report on three machine learning contests.
      ,
      • Bengio Y.
      • Lee H.
      Editorial introduction to the neural networks special issue on deep learning of representations.
      ].
      Convolutional neural networks (CNNs), a type of DNN, have gained popularity in computer vision because of their ability to summarize local image features in each layer and feed them to the next layer for further abstraction. The basic idea of a CNN is that features in certain data types (such as images) are hierarchical on different length scales. For example, on a small scale, an image is constituted of lines and edges with different contrast. On an intermediate scale, it contains texture structure (eg, furry vs smooth). A higher length scale may correspond to an eye or an ear, and at the largest scales, objects such as faces could be recognized. A CNN is structured such that each neuron is specialized in some specific feature depending on its depth in the network. To achieve this, neurons within layers are assumed to have a spatial organization in relation to each other (eg, 2-dimensional or 3-dimensional). The weights of each neuron are set locally by connecting it to the corresponding neuron in the previous layer and its neighboring neurons. By sharing the same local weights across all neurons within each layer, this approach dramatically reduces the number of model parameters and helps to reduce overfitting.
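      The effect of local connectivity and shared weights can be seen in a small sketch of a single convolutional filter. The image, the 2 × 2 edge-detecting kernel, and the function name are hypothetical; as in most deep learning libraries, the operation shown is technically a cross-correlation (the kernel is not flipped).

```python
def conv2d_valid(image, kernel):
    """Slide one shared kernel over the image (no padding, stride 1)."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[i + di][j + dj] * kernel[di][dj]
                for di in range(kh) for dj in range(kw))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


# Hypothetical 4 x 4 image with a vertical edge between columns 1 and 2.
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]
# Hypothetical kernel that responds to a dark-to-bright transition from left to right.
kernel = [
    [-1, 1],
    [-1, 1],
]

feature_map = conv2d_valid(image, kernel)
print(feature_map)

# Parameter counts for producing the same 3 x 3 output from the 4 x 4 input.
shared_weights = len(kernel) * len(kernel[0])             # one kernel reused everywhere
dense_weights = (4 * 4) * (len(feature_map) * len(feature_map[0]))
print(shared_weights, dense_weights)
```

      The feature map is nonzero only where the edge lies, and the same four weights are reused at every position, whereas a fully connected layer producing the same output size would need a separate weight for every input-output pair.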
      CNNs are suitable for processing visual and other 2-dimensional data and have demonstrated superior results in both image and speech applications [
      • LeCun Y.
      • Boser B.
      • Denker J.S.
      • et al.
      Backpropagation applied to handwritten zip code recognition.
      ,
      • He K.
      • Zhang X.
      • Ren S.
      • et al.
      Deep residual learning for image recognition. IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2016.
      ,
      • Tizhoosh H.R.
      • Pantanowitz L.
      Artificial intelligence and digital pathology: challenges and opportunities.
      ]. They can be trained with standard backpropagation and are easier to train than other regular, deep, feed-forward neural networks. In addition to imaging applications in health care, CNNs are also used to detect mutations by converting sequencing data to images [
      • Sahraeian S.M.E.
      • Liu R.
      • Lau B.
      • et al.
      Deep convolutional neural networks for accurate somatic mutation detection.
      ,
      • Poplin R.
      • Chang P.C.
      • Alexander D.
      • et al.
      A universal SNP and small-indel variant caller using deep neural networks.
      ].
      DNNs tend to have many layers with a large number of parameters. As a rule of thumb, the number of training cases should be about 10 times the Vapnik-Chervonenkis (VC) dimension of the hypothesis set. For a DNN, the VC dimension is approximately equal to the number of weights. Therefore, one should have roughly 10 times as many training cases as weights [
      • Abu-Mostafa Y.S.
      • Magdon-Ismail M.
      • Lin H.-T.
      Learning from data.
      ]. A sample size smaller than this can result in overfitting, so the model will not generalize well. Given the cost of labeling data, several procedures have been developed to mitigate this issue. The most common is regularization that forces some parameters of the model to zero. Dropout is a regularization technique for neural networks that helps to reduce interdependent learning among the neurons. Using network architectures that are aware of the features in the data also can reduce the number of parameters. For example, in CNNs, parameter space is significantly reduced by using the same parameters for all the neurons within the same layer. Other ways to compensate for the dearth of data include data augmentation (eg, image rotation, normalization, reflection, and distortion), attention-based classification, multi-instance learning, and weakly supervised learning.
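      To make the rule of thumb concrete, the sketch below counts the weights of a small fully connected network and multiplies by 10. The layer sizes are hypothetical, and counting bias terms as weights is an assumption; the result is a rough guide, not a guarantee of generalization.

```python
def count_weights(layer_sizes, include_biases=True):
    """Number of trainable parameters in a fully connected network."""
    total = 0
    for n_in, n_out in zip(layer_sizes, layer_sizes[1:]):
        total += n_in * n_out          # one weight per connection
        if include_biases:
            total += n_out             # one bias per neuron
    return total


# Hypothetical network: 100 input features, two hidden layers, 2 output classes.
layer_sizes = [100, 50, 10, 2]

n_weights = count_weights(layer_sizes)
suggested_training_cases = 10 * n_weights

print(f"weights: {n_weights}, suggested training cases: {suggested_training_cases}")
```

      Even this small network would call for tens of thousands of labeled samples under the rule, which illustrates why regularization, weight sharing, and data augmentation matter so much in biomedical applications, where labeled data are scarce.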
      One major drawback of using DNNs is their “black-box” nature, that is, a lack of clarity about why the network made a given decision. Even when a network learns some task very well, it is not easy to understand how the network has reached its conclusions. In recent years, considerable effort has gone into exploring the procedure through which the network learns. One approach is to convert the weights of the neural network into an image format for visualization that may yield insights into the logic of the decision. Another approach is to find a set of inputs that maximize the output of specific neurons. By repeating such an approach, it has been shown that certain neurons in the network become specialized to respond to specific inputs [
      • Yosinski J.
      • Clune J.
      • Nguyen A.
      • et al.
      Understanding neural networks through deep visualization.
      ].
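      The idea of finding an input that maximizes a neuron's output can be sketched with gradient ascent on the input of a single sigmoid neuron. This is a toy illustration under stated assumptions, not the visualization method of the cited work: the weights are hypothetical, and the input is kept at unit length so that the search has a well-defined answer.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def normalize(vector):
    norm = math.sqrt(sum(a * a for a in vector))
    return [a / norm for a in vector]


def maximize_activation(weights, start, step_size=0.5, n_steps=200):
    """Gradient ascent on the input of one sigmoid neuron, keeping the input at unit length."""
    x = normalize(start)
    for _ in range(n_steps):
        s = sigmoid(sum(w * a for w, a in zip(weights, x)))
        # d(sigmoid(w . x)) / dx = s * (1 - s) * w
        gradient = [s * (1.0 - s) * w for w in weights]
        x = normalize([a + step_size * g for a, g in zip(x, gradient)])
    return x


weights = [0.5, -1.0, 2.0]   # hypothetical trained weights of one neuron
preferred_input = maximize_activation(weights, start=[1.0, 0.0, 0.0])
print([round(a, 3) for a in preferred_input])
```

      For a single neuron the preferred input simply aligns with the weight vector. In a deep network the same procedure is applied through all layers by backpropagating to the input, which can reveal the patterns to which a neuron has become specialized.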
      There have been significant advances in the use of DNNs for clinical applications. For example, deep learning was used to predict the outcome of anticancer drug combinations directly from gene expression data in cell line experiments [
      • Preuer K.
      • Lewis R.P.
      • Hochreiter S.
      • et al.
      DeepSynergy: predicting anti-cancer drug synergy with deep learning.
      ]. CNNs have also been extensively used for image analysis in clinical settings. For example, they have been used for classification of skin lesions into malignant and benign and have achieved dermatologist-level accuracies [
      • Esteva A.
      • Kuprel B.
      • Novoa R.A.
      • et al.
      Dermatologist-level classification of skin cancer with deep neural networks.
      ]. In another work, Khosravi and colleagues [
      • Khosravi P.
      • Kazemi E.
      • Zhan Q.
      • et al.
      Deep learning enables robust assessment and selection of human blastocysts after in vitro fertilization.
      ] applied a CNN framework to blastocyst images from in vitro fertilization cases and outperformed individual embryologists in predicting embryo quality for successful pregnancy.

      Lifelong learning

      The classic ML paradigm starts with a labeled dataset to which a learning algorithm is applied to produce a model, without considering any previously learned knowledge. This approach may need a large number of training examples, depending on the complexity of the learning algorithm, and can be quite limiting, especially in the clinical setting, where data points are continuously being acquired. This motivates lifelong ML, or continuous learning, which tries to mimic “human learning” by accumulating knowledge over time and reusing it for new tasks. Ideally, such a system should also be able to discover new tasks and learn on the job in open environments in a self-supervised manner. One could argue that without the lifelong learning capability, ML systems will probably never be truly intelligent [
      • Parisi G.I.
      • Kemker R.
      • Part J.L.
      • et al.
      Continual lifelong learning with neural networks: a review.
      ].
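      One small ingredient of continuous learning, updating a model as each new data point arrives instead of retraining from scratch, can be sketched as below. This is only an illustration under stated assumptions: the class name `OnlineLogisticRegression`, the learning rate, and the toy data stream are hypothetical, and the sketch does not address the harder parts of lifelong learning such as discovering new tasks or avoiding the forgetting of earlier ones.

```python
import math


class OnlineLogisticRegression:
    """Logistic regression updated one example at a time, so the model
    keeps learning as new data points arrive."""

    def __init__(self, n_features, lr=0.5):
        self.w = [0.0] * n_features
        self.b = 0.0
        self.lr = lr

    def predict_proba(self, x):
        z = sum(wi * xi for wi, xi in zip(self.w, x)) + self.b
        return 1.0 / (1.0 + math.exp(-z))

    def update(self, x, y):
        """Single stochastic-gradient step on the log loss for (x, y)."""
        error = self.predict_proba(x) - y
        self.w = [wi - self.lr * error * xi for wi, xi in zip(self.w, x)]
        self.b -= self.lr * error


model = OnlineLogisticRegression(n_features=1)
# Toy stream: one feature, label 1 when the feature is positive.
stream = [([-2.0], 0), ([2.0], 1), ([-1.0], 0), ([1.0], 1)] * 50
for x, y in stream:
    model.update(x, y)  # the model is usable after every single update
print(model.predict_proba([1.5]), model.predict_proba([-1.5]))
```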

      Caveats in the application of machine learning

      ML depends on the quality, objectivity, and size of the datasets used to generate the model, and a lack of truly random or comprehensive data can result in bias. Eliminating harmful biases is essential because ML is often applied to decisions with serious implications, such as diagnostics and therapeutic interventions in medical environments. The data should be representative of the different races, genders, backgrounds, and cultures that could be adversely affected. Scientists developing the algorithms should shape data samples in a way that minimizes bias. Left unchecked, feeding biased data to self-learning systems can lead to unintended and sometimes dangerous outcomes. An example of bias in an ML application is Correctional Offender Management Profiling for Alternative Sanctions (COMPAS), an ML system that informs criminal sentencing decisions by predicting which people are likely to reoffend. An analysis of its risk scores found that Black defendants who did not reoffend were substantially more likely than white defendants to be classified as high risk, a racial bias that makes its use unjust and unethical [
      • Larson J.
      • Mattu S.
      • Kirchner L.
      • et al.
      How we analyzed the COMPAS recidivism algorithm.
      ]. Such biases can be an issue even without the involvement of ML algorithms, as exemplified by the gender bias in the diagnosis and treatment of coronary heart disease [
      • Beery T.A.
      Gender bias in the diagnosis and treatment of coronary artery disease.
      ], but ML approaches trained on such datasets may exacerbate the problem, particularly because policymaking tends to lag behind innovation.
      Modern ML algorithms require far more data for learning because of the large number of parameters in their models. As algorithms make inferences based on these big datasets, there is a higher likelihood that sensitive information is unintentionally included. In addition, sensitive information about individuals can be predicted from the shared information, thereby threatening individual privacy. The current solution is to limit access to data, which inevitably thwarts innovation in the field; a better solution needs to be found. For bias in ML systems, a variety of approaches have been proposed, including algorithms designed to detect [
      • Chen I.
      • Johansson F.D.
      • Sontag D.
      Why is my classifier discriminatory?
      ] and mitigate biases [

      • Amini A.
      • Soleimany A.
      • Schwarting W.
      • et al.
      Uncovering and mitigating algorithmic bias through learned latent structure.

      ].
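      A very simple way to look for bias in a trained classifier is to compare its error rates across groups. The sketch below computes the false-positive rate per group and the gap between groups. It is a generic audit under stated assumptions, not the method of either work cited above; the function name `false_positive_rates`, the group labels, and the toy predictions are hypothetical.

```python
from collections import defaultdict


def false_positive_rates(records):
    """records: iterable of (group, y_true, y_pred) with binary labels.
    Returns the false-positive rate per group: the fraction of true
    negatives that the model flagged as positive."""
    false_pos = defaultdict(int)
    negatives = defaultdict(int)
    for group, y_true, y_pred in records:
        if y_true == 0:
            negatives[group] += 1
            false_pos[group] += int(y_pred == 1)
    return {g: false_pos[g] / negatives[g] for g in negatives}


# Hypothetical toy predictions for two groups, "A" and "B".
records = [
    ("A", 0, 1), ("A", 0, 0), ("A", 1, 1), ("A", 0, 0),
    ("B", 0, 1), ("B", 0, 1), ("B", 1, 1), ("B", 0, 0),
]
rates = false_positive_rates(records)
gap = max(rates.values()) - min(rates.values())
print(rates, gap)
```

      A large gap indicates that the cost of the model's mistakes falls unevenly across groups. Which error rate matters most (false positives, false negatives, or calibration) depends on the clinical decision being supported.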

      Summary

      The impact of ML in biology and medicine is accelerating and is already generating actionable insights (Table 1). Personalized predictive health care is the future of medicine, and ML will play an integral part in achieving this goal. The increased use of such systems will generate more data that can be used to build better models, and with continuous learning algorithms these models will improve at a much faster pace. The future of biology and medicine looks very promising with deep profiling of patients and advances in machine learning.
      Table 1. Machine Learning Paradigms, Algorithms, and Their Applications

      Learning Category | ML Method | Characteristics | Scope of Applications | References
      Supervised learning | K-Nearest Neighbor | Does not involve training (lazy learning) | Image classification, predicting the molecular subtype of cancers | Li et al, 2012
      Supervised learning | Naïve Bayes | Assumes features are independent (conditioned on class membership) | Cancer type predictions | Banu & Thirumalaikolundusubramanian, 2018
      Supervised learning | Decision Trees | Provides interpretable rules of classification | Prognostic markers, predictive markers | Geurts et al, 2009
      Supervised learning | Support Vector Machines | Linear and nonlinear classification. Maximum-margin criteria provide robust generalization ability | Prognostic markers, predictive markers | Wu et al, 2017
      Supervised learning | Neural Networks | Capable of learning complex problems with little fine-tuning; computationally intensive; tends to be difficult to interpret | Cancer risk prediction, identifying new chemotypes | Mueller et al, 2012
      Supervised learning | Deep Learning | Essentially, neural networks with many hidden layers | Ideal for computer vision applications (pathology images) and text mining (EHR) | Angermueller et al, 2016; Tang et al, 2019; Webb, 2018
      Unsupervised learning | K-means clustering | Partitions the observations into k predefined clusters, in which each observation belongs to the cluster with the nearest mean | RNA-seq analysis, sequence clustering, image cytometry | Nugent & Meila, 2010
      Unsupervised learning | Hierarchical clustering | Provides a hierarchical organization of samples and clusters that enables better visualization of the structure in the data | Very popular in the biological domain because of its excellent visualization capabilities | Ronan et al, 2016
      Unsupervised learning | Spectral clustering | Global and local structure of the similarities among samples determines the clustering | Single-cell RNA-sequencing data analysis | Zheng et al, 2019; Kiselev et al, 2017
      Reinforcement learning (RL) | Q-learning | Finds a policy that maximizes the expected value of the total reward over all successive steps starting from the current state | Behavioral ecology | Frankenhuis et al, 2019
      Reinforcement learning (RL) | Temporal difference | Learns by bootstrapping from the current estimate of the value function | Models for learning in biological systems | Neftci & Averbeck, 2019
      Others | Generative Adversarial Networks | Two neural networks contest with each other to generate new data with the same statistics as the training set | Understanding the organization of biological systems; generating realistic datasets | Wang et al, 2018
      Others | Text mining | A set of techniques that models and structures the information content of textual sources for obtaining information | Extracting information from EHRs | Ohno-Machado et al, 2013
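      To make one row of Table 1 concrete, the K-means description (each observation belongs to the cluster with the nearest mean) can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions: the function names, the fixed number of iterations, the toy points, and the hand-picked starting centroids are all hypothetical, and real analyses would normally use an established library with multiple random restarts.

```python
def squared_distance(p, q):
    return sum((a - b) ** 2 for a, b in zip(p, q))


def nearest(point, centroids):
    """Index of the centroid closest to `point`."""
    return min(range(len(centroids)),
               key=lambda i: squared_distance(point, centroids[i]))


def kmeans(points, centroids, n_iter=10):
    """Lloyd's algorithm: alternate between assigning each point to its
    nearest centroid and moving each centroid to the mean of its points."""
    centroids = [tuple(c) for c in centroids]
    for _ in range(n_iter):
        clusters = [[] for _ in centroids]
        for p in points:
            clusters[nearest(p, centroids)].append(p)
        centroids = [
            tuple(sum(dim) / len(members) for dim in zip(*members)) if members else old
            for members, old in zip(clusters, centroids)  # keep empty clusters in place
        ]
    labels = [nearest(p, centroids) for p in points]
    return centroids, labels


# Two well-separated toy groups and hand-picked starting centroids.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
centroids, labels = kmeans(points, centroids=[(0, 0), (10, 10)])
print(centroids, labels)
```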

      References

        • Davies K.E.
        • Nowak K.J.
        Molecular mechanisms of muscular dystrophies: old and new players.
        Nat Rev Mol Cell Biol. 2006; 7: 762-773
        • Heineke J.
        • Molkentin J.D.
        Regulation of cardiac hypertrophy by intracellular signalling pathways.
        Nat Rev Mol Cell Biol. 2006; 7: 589-600
        • Guggino W.B.
        • Stanton B.A.
        New insights into cystic fibrosis: molecular switches that regulate CFTR.
        Nat Rev Mol Cell Biol. 2006; 7: 426-436
        • Kudlow B.A.
        • Kennedy B.K.
        • Monnat Jr., R.J.
        Werner and Hutchinson-Gilford progeria syndromes: mechanistic basis of human progeroid diseases.
        Nat Rev Mol Cell Biol. 2007; 8: 394-404
        • Muoio D.M.
        • Newgard C.B.
        Molecular and metabolic mechanisms of insulin resistance and β-cell failure in type 2 diabetes.
        Nat Rev Mol Cell Biol. 2008; 9: 193-205
        • Golemis E.A.
        • Scheet P.
        • Beck T.N.
        • et al.
        Molecular mechanisms of the preventable causes of cancer in the United States.
        Genes Dev. 2018; 32: 868-902
        • Byun J.A.
        • Melacini G.
        NMR methods to dissect the molecular mechanisms of disease-related mutations (DRMs): understanding how DRMs remodel functional free energy landscapes.
        Methods. 2018; 148: 19-27
        • Carter T.C.
        • He M.M.
        Challenges of identifying clinically actionable genetic variants for precision medicine.
        J Healthc Eng. 2016; 2016
        • Collins F.S.
        • Varmus H.
        A new initiative on precision medicine.
        N Engl J Med. 2015; 372: 793-795
        • Kitchen R.R.
        • Rozowsky J.S.
        • Gerstein M.B.
        • et al.
        Decoding neuroproteomics: integrating the genome, translatome and functional anatomy.
        Nat Neurosci. 2014; 17: 1491-1499
        • Overby C.L.
        • Pathak J.
        • Gottesman O.
        • et al.
        A collaborative approach to developing an electronic health record phenotyping algorithm for drug-induced liver injury.
        J Am Med Inform Assoc. 2013; 20: e243-e252
        • Gottesman O.
        • Kuivaniemi H.
        • Tromp G.
        • et al.
        The Electronic Medical Records and Genomics (eMERGE) Network: past, present, and future.
        Genet Med. 2013; 15: 761-771
        • Rasmussen-Torvik L.J.
        • Stallings S.C.
        • Gordon A.S.
        • et al.
        Design and anticipated outcomes of the eMERGE-PGx project: a multicenter pilot for preemptive pharmacogenomics in electronic health record systems.
        Clin Pharmacol Ther. 2014; 96: 482-489
        • Zheng R.
        • Li M.
        • Liang Z.
        • et al.
        SinNLRR: a robust subspace clustering method for cell type detection by nonnegative and low rank representation.
        Bioinformatics. 2019; ([pii:btz139])
        • Brown J.M.
        • Campbell J.P.
        • Beers A.
        • et al.
        Automated diagnosis of plus disease in retinopathy of prematurity using deep convolutional neural networks.
        JAMA Ophthalmol. 2018; 136: 803-810
        • Haenssle H.A.
        • Fink C.
        • Schneiderbauer R.
        • et al.
        Man against machine: diagnostic performance of a deep learning convolutional neural network for dermoscopic melanoma recognition in comparison to 58 dermatologists.
        Ann Oncol. 2018; 29: 1836-1842
        • Zhan A.
        • Mohan S.
        • Tarolli C.
        • et al.
        Using smartphones and machine learning to quantify Parkinson disease severity: the mobile Parkinson disease score.
        JAMA Neurol. 2018; 75: 876-880
        • Benuzillo J.
        • Caine W.
        • Evans R.S.
        • et al.
        Predicting readmission risk shortly after admission for CABG surgery.
        J Card Surg. 2018; 33: 163-170
        • Cuzick J.
        • Swanson G.P.
        • Fisher G.
        • et al.
        Prognostic value of an RNA expression signature derived from cell cycle proliferation genes in patients with prostate cancer: a retrospective study.
        Lancet Oncol. 2011; 12: 245-255
        • Fernández-Delgado M.
        • Cernadas E.
        • Barro S.E.
        Do we need hundreds of classifiers to solve real world classification problems?.
        J Mach Learn Res. 2014; 15: 3133-3181
        • Wolpert D.H.
        • Macready W.G.
        No free lunch theorems for optimization.
        IEEE Trans Evol Comput. 1997; 1: 67-82
        • Molinaro A.M.
        • Simon R.
        • Pfeiffer R.M.
        Prediction error estimation: a comparison of resampling methods.
        Bioinformatics. 2005; 21: 3301-3307
        • Varma S.
        • Simon R.
        Bias in error estimation when using cross-validation for model selection.
        BMC Bioinformatics. 2006; 7: 91
        • Cawley G.C.
        • Talbot N.L.C.
        On over-fitting in model selection and subsequent selection bias in performance evaluation.
        J Mach Learn Res. 2010; 11: 2079-2107
        • Ransohoff D.F.
        How to improve reliability and efficiency of research about molecular markers: roles of phases, guidelines, and study design.
        J Clin Epidemiol. 2007; 60: 1205-1219
        • Kattan M.W.
        Judging new markers by their ability to improve predictive accuracy.
        J Natl Cancer Inst. 2003; 95: 634-635
        • Zlobec I.
        • Steele R.
        • Nigam N.
        • et al.
        A predictive model of rectal tumor response to preoperative radiotherapy using classification and regression tree methods.
        Clin Cancer Res. 2005; 11: 5440-5443
        • Zafeiris D.
        • Rutella S.
        • Ball G.R.
        An artificial neural network integrated pipeline for biomarker discovery using Alzheimer's disease as a case study.
        Comput Struct Biotechnol J. 2018; 16: 77-87
        • Bertolaccini L.
        • Solli P.
        • Pardolesi A.
        • et al.
        An overview of the use of artificial neural networks in lung cancer research.
        J Thorac Dis. 2017; 9: 924-931
        • Cardoso F.
        • van't Veer L.J.
        • Bogaerts J.
        • et al.
        70-gene signature as an aid to treatment decisions in early-stage breast cancer.
        N Engl J Med. 2016; 375: 717-729
        • D'Haeseleer P.
        How does gene expression clustering work?.
        Nat Biotechnol. 2005; 23: 1499-1501
        • Hennig C.
        What are the true clusters?.
        Pattern Recognit Lett. 2015; 64: 53-62
        • Hipp J.
        • Güntzer U.
        • Nakhaeizadeh G.
        Algorithms for association rule mining—a general survey and comparison.
        SIGKDD Explor. 2000; 2: 58-64
        • Held F.P.
        • Blyth F.
        • Gnjidic D.
        • et al.
        Association rules analysis of comorbidity and multimorbidity: the Concord Health and Aging in Men Project.
        J Gerontol A Biol Sci Med Sci. 2016; 71: 625-631
        • Perou C.M.
        • Sørlie T.
        • Eisen M.B.
        • et al.
        Molecular portraits of human breast tumours.
        Nature. 2000; 406: 747-752
        • Sorlie T.
        • Perou C.M.
        • Tibshirani R.
        • et al.
        Gene expression patterns of breast carcinomas distinguish tumor subclasses with clinical implications.
        Proc Natl Acad Sci U S A. 2001; 98: 10869-10874
        • Dai X.
        • Li T.
        • Bai Z.
        • et al.
        Breast cancer intrinsic subtype classification, clinical use and future trends.
        Am J Cancer Res. 2015; 5: 2929-2943
        • Cancer Genome Atlas Research Network
        Comprehensive genomic characterization defines human glioblastoma genes and core pathways.
        Nature. 2008; 455: 1061-1068
        • Tothill R.W.
        • Tinker A.V.
        • George J.
        • et al.
        Novel molecular subtypes of serous and endometrioid ovarian cancer linked to clinical outcome.
        Clin Cancer Res. 2008; 14: 5198-5208
        • Cancer Genome Atlas Network
        Comprehensive molecular characterization of human colon and rectal cancer.
        Nature. 2012; 487: 330-337
        • Cancer Genome Atlas Research Network
        Comprehensive genomic characterization of squamous cell lung cancers.
        Nature. 2012; 489: 519-525
        • Kiselev V.Y.
        • Kirschner K.
        • Schaub M.T.
        • et al.
        SC3: consensus clustering of single-cell RNA-seq data.
        Nat Methods. 2017; 14: 483-486
        • Xu C.
        • Su Z.
        Identification of cell types from single-cell transcriptomes using a novel clustering method.
        Bioinformatics. 2015; 31: 1974-1980
        • Butler A.
        • Hoffman P.
        • Smibert P.
        • et al.
        Integrating single-cell transcriptomic data across different conditions, technologies, and species.
        Nat Biotechnol. 2018; 36: 411-420
        • Parbhoo S.
        • Bogojeska J.
        • Zazzi M.
        • et al.
        Combining kernel and model based learning for HIV therapy selection.
        AMIA Jt Summits Transl Sci Proc. 2017; 2017: 239-248
        • Komorowski M.
        • Celi L.A.
        • Badawi O.
        • et al.
        The artificial intelligence clinician learns optimal treatment strategies for sepsis in intensive care.
        Nat Med. 2018; 24: 1716-1720
        • Gottesman O.
        • Johansson F.
        • Meier J.
        • et al.
        Evaluating reinforcement learning algorithms in observational health settings.
        arXiv. 2018
        • Liu Y.
        • Logan B.
        • Liu N.
        • et al.
        Deep reinforcement learning for dynamic treatment regimes on medical registry data.
        Healthc Inform. 2017; 2017: 380-385
        • Patterson S.E.
        • Statz C.M.
        • Yin T.
        • et al.
        Utility of the JAX Clinical Knowledgebase in capture and assessment of complex genomic cancer data.
        NPJ Precis Oncol. 2019; 3: 2
        • Roden D.M.
        • Denny J.C.
        Integrating electronic health record genotype and phenotype datasets to transform patient care.
        Clin Pharmacol Ther. 2016; 99: 298-305
        • Sebastiani F.
        Machine learning in automated text categorization.
        ACM Comput Surv. 2002; 34: 1-47
        • Zeng Z.
        • Li X.
        • Espino S.
        • et al.
        Contralateral breast cancer event detection using nature language processing.
        AMIA Annu Symp Proc. 2018; 2017: 1885-1892
        • Savova G.K.
        • Tseytlin E.
        • Finan S.
        • et al.
        DeepPhe: a natural language processing system for extracting cancer phenotypes from clinical records.
        Cancer Res. 2017; 77: e115-e118
        • Baker S.
        • Ali I.
        • Silins I.
        • et al.
        Cancer Hallmarks Analytics Tool (CHAT): a text mining approach to organize and evaluate scientific literature on cancer.
        Bioinformatics. 2017; 33: 3973-3981
        • Angermueller C.
        • Pärnamaa T.
        • Parts L.
        • et al.
        Deep learning for computational biology.
        Mol Syst Biol. 2016; 12: 878
        • LeCun Y.
        • Bengio Y.
        • Hinton G.
        Deep learning.
        Nature. 2015; 521: 436-444
        • Goodfellow I.J.
        • Erhan D.
        • Carrier P.L.
        • et al.
        Challenges in representation learning: a report on three machine learning contests.
        Neural Netw. 2015; 64: 59-63
        • Bengio Y.
        • Lee H.
        Editorial introduction to the neural networks special issue on deep learning of representations.
        Neural Netw. 2015; 64: 1-3
        • LeCun Y.
        • Boser B.
        • Denker J.S.
        • et al.
        Backpropagation applied to handwritten zip code recognition.
        Neural Comput. 1989; 1: 541-551
        • He K.
        • Zhang X.
        • Ren S.
        • et al.
        Deep residual learning for image recognition.
        IEEE Conference on Computer Vision and Pattern Recognition (CVPR). 2016

        • Tizhoosh H.R.
        • Pantanowitz L.
        Artificial intelligence and digital pathology: challenges and opportunities.
        J Pathol Inform. 2018; 9: 38
        • Sahraeian S.M.E.
        • Liu R.
        • Lau B.
        • et al.
        Deep convolutional neural networks for accurate somatic mutation detection.
        Nat Commun. 2019; 10: 1041
        • Poplin R.
        • Chang P.C.
        • Alexander D.
        • et al.
        A universal SNP and small-indel variant caller using deep neural networks.
        Nat Biotechnol. 2018; 36: 983-987
        • Abu-Mostafa Y.S.
        • Magdon-Ismail M.
        • Lin H.-T.
        Learning from data.
        2012
        • Yosinski J.
        • Clune J.
        • Nguyen A.
        • et al.
        Understanding neural networks through deep visualization.
        arXiv. 2015
        • Preuer K.
        • Lewis R.P.
        • Hochreiter S.
        • et al.
        DeepSynergy: predicting anti-cancer drug synergy with deep learning.
        Bioinformatics. 2018; 34: 1538-1546
        • Esteva A.
        • Kuprel B.
        • Novoa R.A.
        • et al.
        Dermatologist-level classification of skin cancer with deep neural networks.
        Nature. 2017; 542: 115-118
        • Khosravi P.
        • Kazemi E.
        • Zhan Q.
        • et al.
        Deep learning enables robust assessment and selection of human blastocysts after in vitro fertilization.
        NPJ Digit Med. 2019; 2
        • Parisi G.I.
        • Kemker R.
        • Part J.L.
        • et al.
        Continual lifelong learning with neural networks: a review.
        arXiv. 2019
        • Larson J.
        • Mattu S.
        • Kirchner L.
        • et al.
        How we analyzed the COMPAS recidivism algorithm.
        2016 (Available at:)
        • Beery T.A.
        Gender bias in the diagnosis and treatment of coronary artery disease.
        Heart Lung. 1995; 24: 427-435
        • Chen I.
        • Johansson F.D.
        • Sontag D.
        Why is my classifier discriminatory?
        Advances in Neural Information Processing Systems 31, 2018: 3543-3554
        • Amini A.
        • Soleimany A.
        • Schwarting W.
        • et al.
        Uncovering and mitigating algorithmic bias through learned latent structure.
        AAAI/ACM Conference on Artificial Intelligence, Ethics and Society. Honolulu, Hawaii. 2019

        • Li C.
        • Zhang S.
        • Zhang H.
        • et al.
        Using the K-nearest neighbor algorithm for the classification of lymph node metastasis in gastric cancer.
        Comput Math Methods Med. 2012; 2012: 876545
        • Banu A.B.
        • Thirumalaikolundusubramanian P.
        Comparison of Bayes classifiers for breast cancer classification.
        Asian Pac J Cancer Prev. 2018; 19: 2917-2920
        • Geurts P.
        • Irrthum A.
        • Wehenkel L.
        Supervised learning with decision tree-based methods in computational and systems biology.
        Mol Biosyst. 2009; 5: 1593-1605
        • Wu T.
        • Wang Y.
        • Jiang R.
        • et al.
        A pathways-based prediction model for classifying breast cancer subtypes.
        Oncotarget. 2017; 8: 58809-58822
        • Mueller R.
        • Dawson E.S.
        • Meiler J.
        • et al.
        Discovery of 2-(2-benzoxazoyl amino)-4-aryl-5-cyanopyrimidine as negative allosteric modulators (NAMs) of metabotropic glutamate receptor 5 (mGlu(5)): from an artificial neural network virtual screen to an in vivo tool compound.
        ChemMedChem. 2012; 7: 406-414
        • Tang B.
        • Pan Z.
        • Yin K.
        • et al.
        Recent advances of deep learning in bioinformatics and computational biology.
        Front Genet. 2019; 10: 214
        • Webb S.
        Deep learning for biology.
        Nature. 2018; 554: 555-557
        • Nugent R.
        • Meila M.
        An overview of clustering applied to molecular biology.
        Methods Mol Biol. 2010; 620: 369-404
        • Ronan T.
        • Qi Z.
        • Naegle K.M.
        Avoiding common pitfalls when clustering biological data.
        Sci Signal. 2016; 9: re6
        • Frankenhuis W.E.
        • Panchanathan K.
        • Barto A.G.
        Enriching behavioral ecology with reinforcement learning methods.
        Behav Processes. 2019; 161: 94-100
        • Neftci E.O.
        • Averbeck B.B.
        Reinforcement learning in artificial and biological systems.
        Nat Mach Intell. 2019; 1: 133-143
        • Wang X.
        • Dizaji K.G.
        • Huang H.
        Conditional generative adversarial network for gene expression inference.
        Bioinformatics. 2018; 34: 1603-1611
        • Ohno-Machado L.
        • Nadkarni P.
        • Johnson K.
        Natural language processing: algorithms and tools to extract computable information from EHRs and from the biomedical literature.
        J Am Med Inform Assoc. 2013; 20: 805