<aside> 📜
© 2026 Denis Jacob Machado. This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.
</aside>
This page compiles key concepts and questions on the contemporary debate around AI and machine learning applied to biological and biomedical research. Questions are organized from broader foundational concepts to more specific applications.
These are nested concepts. Artificial intelligence (AI) is the broadest category — it refers to any system designed to perform tasks that typically require human intelligence, such as reasoning or decision-making. Machine learning (ML) is a subset of AI in which systems learn from data rather than being explicitly programmed with rules. Deep learning (DL) is a subset of machine learning that uses artificial neural networks with multiple layers to process information — this is where the term "deep" comes from. Deep learning typically handles complex, unstructured data such as images and text, whereas traditional machine learning often works well with structured, tabular data.
An artificial neuron is the basic computational unit of a neural network, loosely inspired by biological neurons. It receives multiple inputs, multiplies each by a numerical weight, sums the weighted values (a dot product), and passes the result through an activation function, which introduces nonlinearity and determines how strongly the neuron "fires" or activates. This output can then serve as input to other neurons in subsequent layers.
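The weighted-sum-plus-activation idea above can be sketched in a few lines of plain Python. The weights, bias, and inputs below are arbitrary illustrative values, and sigmoid is just one common choice of activation:

```python
import math

def neuron(inputs, weights, bias):
    """A single artificial neuron: weighted sum of inputs plus a bias,
    passed through a sigmoid activation function."""
    z = sum(x * w for x, w in zip(inputs, weights)) + bias
    return 1.0 / (1.0 + math.exp(-z))  # sigmoid squashes the output into (0, 1)

# Three inputs with illustrative weights and bias (arbitrary values)
output = neuron([0.5, -1.0, 2.0], [0.8, 0.2, -0.5], bias=0.1)
print(round(output, 3))  # a value between 0 and 1
```

A network layer is simply many such neurons applied to the same inputs, each with its own weights, with their outputs feeding the next layer.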
Generative models are typically deep learning models that learn the underlying statistical distribution of training data and then produce new outputs that resemble that data. Rather than classifying or predicting existing data points, they generate entirely new examples — such as protein 3D structures, DNA sequences, synthetic medical images, or novel text. Examples include GANs (Generative Adversarial Networks) and models like ProtGPT2.
A Bayesian network is a probabilistic graphical model represented as a directed acyclic graph (DAG). Nodes represent variables, directed edges encode probabilistic dependencies between them, and each node can be derived from one or more parent nodes. The network uses Bayes' theorem to update beliefs about variables given new evidence, enabling principled inference under uncertainty. Bayesian networks are not deep learning models — they belong to probabilistic machine learning and can be used in both supervised and unsupervised modes depending on the task.
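The belief-updating step can be made concrete with the smallest possible network, a single edge Disease → Test. The probabilities below are assumed for illustration, not real clinical figures:

```python
# Minimal two-node Bayesian network: Disease -> Test.
# Illustrative probabilities (assumed values, not real clinical data).
p_disease = 0.01          # prior P(D)
p_pos_given_d = 0.95      # sensitivity, P(+ | D)
p_pos_given_not_d = 0.05  # false-positive rate, P(+ | not D)

# Bayes' theorem: P(D | +) = P(+ | D) * P(D) / P(+)
p_pos = p_pos_given_d * p_disease + p_pos_given_not_d * (1 - p_disease)
p_d_given_pos = p_pos_given_d * p_disease / p_pos
print(round(p_d_given_pos, 3))  # posterior belief after a positive test
```

Even with a sensitive test, the posterior stays modest because the prior is low, which is exactly the kind of principled uncertainty accounting Bayesian networks provide at scale.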
AI-guided probabilistic inference refers to the combination of AI or machine learning techniques with probabilistic reasoning frameworks. Rather than running standard Bayesian inference alone, AI methods — such as neural networks — guide or accelerate the exploration of the probability space to improve inference. This approach offers more predictive power than Bayesian networks alone, while retaining the ability to quantify uncertainty. It is more computationally demanding and generally less interpretable than pure Bayesian networks.
LSTMs (Long Short-Term Memory networks) are a type of recurrent neural network designed to capture long-range dependencies in sequential data by processing sequences step by step. They are useful for genomic sequence modeling, chromatin state prediction, and temporal clinical data analysis. Large language models (LLMs), by contrast, use transformer architectures that process entire sequences in parallel via attention mechanisms, which makes them easier to scale and generally more capable on natural language tasks. LLMs operate on tokens, discrete text chunks, as their basic units of input and output, whereas LSTMs consume one sequence element at a time, such as a single character or nucleotide, carrying information forward through a hidden state.
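The "step by step" recurrence can be sketched with a bare-bones recurrent cell. This is a plain RNN update with made-up scalar weights; a real LSTM adds input, forget, and output gates on top of this basic pattern, but the sequential carrying-forward of state is the same:

```python
import math

def rnn_step(h, x, w_h, w_x, b):
    """One recurrent step: compute a new hidden state from the previous
    hidden state and the current input. Real LSTMs add gating on top."""
    return math.tanh(w_h * h + w_x * x + b)

# Process a sequence one element at a time, carrying the state forward
sequence = [0.2, 0.5, -0.1, 0.9]
h = 0.0  # initial hidden state
for x in sequence:
    h = rnn_step(h, x, w_h=0.5, w_x=1.0, b=0.0)  # illustrative weights
print(round(h, 3))  # final hidden state summarizes the whole sequence
```

The contrast with transformers is visible in the loop itself: each step depends on the previous one, so the computation cannot be parallelized across sequence positions.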
A token is a chunk of text — a word, subword, or character — that a model processes as a discrete unit. Transformers break input text into tokens and then process all tokens simultaneously using attention mechanisms, allowing each token to "attend" to every other token in the sequence. This enables the model to capture context and long-range relationships efficiently. Token limits define the maximum amount of text a model can process at once (the context window).
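A toy tokenizer makes the idea tangible. Production LLMs use subword schemes such as byte-pair encoding rather than whitespace splitting, and the context-window size below is an arbitrary illustrative number:

```python
# Toy tokenizer: split text into word tokens. Real LLM tokenizers use
# subword schemes (e.g., BPE), but the idea of discrete units is the same.
def tokenize(text):
    return text.lower().split()

text = "Transformers process all tokens simultaneously"
tokens = tokenize(text)
print(tokens)      # ['transformers', 'process', 'all', 'tokens', 'simultaneously']
print(len(tokens))  # 5 tokens

# A context window caps how many tokens the model can attend to at once
CONTEXT_WINDOW = 4  # illustrative limit, far smaller than real models
truncated = tokens[:CONTEXT_WINDOW]
print(truncated)   # anything beyond the window is dropped
```

Because attention compares every token with every other token within the window, the context-window limit is what bounds how much text the model can consider in a single pass.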