Adversarial Training
Adversarial training involves training two AI models against each other. One model, called the generator, is trained to generate biased data. The other model, called the discriminator, is trained to identify biased data. By training the two models against each other, the discriminator learns to identify biased data even when it is disguised. This can then be used to identify biased-data driven behavior in other AI models.
Here are some examples of how adversarial training is being used to identify biased-data driven behavior in AI:
- Google AI is using adversarial training to develop AI models that are more fair and unbiased.
- Microsoft Azure is using adversarial training to develop AI models that are more robust to bias.
- IBM Watson is using adversarial training to develop AI models that can be used to detect and mitigate bias in other AI models.
Adversarial training is a powerful technique that can be used to identify biased-data driven behavior in AI. As adversarial training continues to develop, we can expect to see even more innovative and effective ways to use it to improve the fairness and reliability of AI models.
It is important to note that adversarial training is not a perfect solution. It is possible for AI models to be trained to evade adversarial training. However, adversarial training is still a valuable tool for identifying biased-data driven behavior in AI.
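As a rough illustration of the discriminator idea, the sketch below trains a small network to predict a protected attribute from a model’s internal representations; if it succeeds well above chance, those representations encode group information, which is one signal of bias. This is a minimal, hypothetical example in PyTorch using synthetic stand-in data, not any vendor’s actual tooling.

```python
# Hypothetical sketch: train a discriminator to detect whether a model's
# internal representations encode a protected attribute. All names and
# shapes are illustrative assumptions.
import torch
import torch.nn as nn

torch.manual_seed(0)

# Stand-ins for embeddings from a model under audit, plus labels for a
# protected attribute (e.g., 0/1 group membership). Synthetic here.
reps = torch.randn(256, 16)
protected = torch.randint(0, 2, (256, 1)).float()

discriminator = nn.Sequential(nn.Linear(16, 8), nn.ReLU(), nn.Linear(8, 1))
opt = torch.optim.Adam(discriminator.parameters(), lr=1e-2)
loss_fn = nn.BCEWithLogitsLoss()

for step in range(200):
    opt.zero_grad()
    loss = loss_fn(discriminator(reps), protected)
    loss.backward()
    opt.step()

# If the discriminator predicts the protected attribute much better than
# chance, the representations carry group information: a bias signal.
with torch.no_grad():
    acc = ((discriminator(reps) > 0) == protected.bool()).float().mean()
print(f"discriminator accuracy: {acc:.2f}  (chance is ~0.50)")
```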
AI — Artificial Intelligence
AI is the intelligence of machines or software, as opposed to the intelligence of humans or animals. It is also the field of study in computer science that develops and studies intelligent machines. “AI” may also refer to the machines themselves.
AGI — Artificial General Intelligence
AGI is a hypothetical type of intelligent agent that can learn to accomplish any intellectual task that human beings or animals can perform. AGI is a primary goal of some artificial intelligence research and of companies such as OpenAI, DeepMind, and Anthropic. However, it is still a subject of ongoing debate among researchers and experts whether AGI could be developed within years or decades, or whether it might take a century or longer. SoftBank CEO Masayoshi Son has predicted that AGI will be realized within ten years.
Algorithm
An algorithm is a finite sequence of well-defined, computer-implementable instructions that are used to solve a class of problems or to perform a computation. It is a step-by-step procedure for solving a problem or accomplishing some end. The term “algorithm” is commonly used nowadays for the set of rules a machine (and especially a computer) follows to achieve a particular goal. The word “algorithm” comes from the name of the 9th-century Persian mathematician Muḥammad ibn Mūsā al-Khwārizmī, who did important work in the fields of algebra and numeric systems.
“An algorithm is a set of instructions for solving a problem or accomplishing a task. One common example of an algorithm is a recipe, which consists of specific instructions for preparing a dish or meal.” (Investopedia)
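As a concrete illustration, here is one of the oldest known algorithms, Euclid’s method for the greatest common divisor, written in Python:

```python
# A classic algorithm: Euclid's method for the greatest common divisor.
# Each step is well-defined and the procedure always terminates.
def gcd(a: int, b: int) -> int:
    while b != 0:
        a, b = b, a % b
    return a

print(gcd(48, 18))  # -> 6
```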
Alignment
AI alignment is a subfield of AI safety research that aims to ensure artificial intelligence systems achieve desired outcomes. It is concerned with steering AI systems towards humans’ intended goals, preferences, or ethical principles. An AI system is considered aligned if it advances the intended objectives. A misaligned AI system pursues some objectives, but not the intended ones.
AI alignment research is crucial because it can be challenging for AI designers to align an AI system with human values and preferences. It can be difficult for them to specify the full range of desired and undesired behavior. To avoid this difficulty, they typically use simpler proxy goals, such as gaining human approval. But that approach can create loopholes, overlook necessary constraints, or reward the AI system for merely appearing aligned. Misaligned AI systems can malfunction or cause harm. They may find loopholes that allow them to accomplish their proxy goals efficiently but in unintended, sometimes harmful ways (reward hacking). They may also develop unwanted instrumental strategies, such as seeking power or survival, because such strategies help them achieve their given goals. Furthermore, they may develop undesirable emergent goals that may be hard to detect before the system is deployed, when it faces new situations and data distributions.
ANN — Artificial Neural Network
An Artificial Neural Network (ANN) is a computational model that is inspired by the way biological neural networks work. It is a branch of machine learning models that are built using principles of neuronal organization discovered by connectionism in the biological neural networks constituting animal brains. ANNs are composed of a large number of interconnected processing nodes, or neurons, that can learn to recognize patterns of input data. They are used to recognize patterns, cluster data, and make predictions.
ANNs are based on a collection of connected units or nodes called artificial neurons, which loosely model the neurons in a biological brain. Each connection, like the synapses in a biological brain, can transmit a signal to other neurons. An artificial neuron receives signals then processes them and can signal neurons connected to it. The “signal” at a connection is a real number, and the output of each neuron is computed by some non-linear function of the sum of its inputs. The connections are called edges. Neurons and edges typically have a weight that adjusts as learning proceeds. The weight increases or decreases the strength of the signal at a connection. Neurons may have a threshold such that a signal is sent only if the aggregate signal crosses that threshold. Typically, neurons are aggregated into layers. Different layers may perform different transformations on their inputs. Signals travel from the first layer (the input layer), to the last layer (the output layer), possibly after traversing the layers multiple times.
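The following minimal sketch (Python with NumPy) shows the computation described above for a single layer: each neuron outputs a non-linear function of the weighted sum of its inputs. The layer sizes and the sigmoid activation are illustrative choices, not a standard.

```python
# Minimal sketch of one artificial neuron layer: a weighted sum of inputs
# passed through a non-linear activation.
import numpy as np

rng = np.random.default_rng(0)

x = rng.normal(size=3)          # input signals
W = rng.normal(size=(4, 3))     # edge weights: 3 inputs -> 4 neurons
b = np.zeros(4)                 # per-neuron bias (threshold shift)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

hidden = sigmoid(W @ x + b)     # each neuron: non-linear function of weighted sum
print(hidden)                   # outputs that would feed the next layer
```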
Bard
Bard is a chatbot developed by Google that uses conversational generative artificial intelligence to communicate with users in natural language. It is based on the LaMDA family of large language models (LLMs) and was later upgraded to the PaLM LLM. Google has also announced plans to combine Bard with its voice Assistant to help users stay on top of what’s most important.
Bing Chat
Bing Chat is a large language model chatbot developed by Microsoft. It is integrated into the Microsoft Bing search engine and Edge web browser, and can also be accessed through the Bing app on mobile devices. Bing Chat can be used to answer questions, generate different creative text formats, and perform other tasks in a conversational way. Bing Chat is powered by a number of advanced AI technologies, including a massive dataset of text and code, a transformer-based neural network architecture, and a variety of machine learning techniques. This allows Bing Chat to understand and respond to a wide range of prompts and questions, including those that are open-ended, challenging, or strange.
Burstiness
Burstiness is a measurement of variation in sentence structure and length. It is a metric that can be used to detect AI-generated content. AI writing tends to display low burstiness, while human writing tends to have higher burstiness. In combination with perplexity, burstiness is generally what AI detectors look for in order to label a text as AI-generated.
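The exact metrics used by commercial AI detectors are proprietary, but a simple proxy for burstiness can be sketched as the variation in sentence lengths. The function below is an illustrative assumption, not a published standard:

```python
# A rough burstiness proxy: the coefficient of variation of sentence
# lengths (std dev over mean). Higher values = more varied sentences.
import re
import statistics

def burstiness(text: str) -> float:
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    lengths = [len(s.split()) for s in sentences]
    if len(lengths) < 2:
        return 0.0
    return statistics.stdev(lengths) / statistics.mean(lengths)

print(burstiness("Short one. Then a much longer, winding sentence follows it. Tiny."))
```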
ChatGPT
ChatGPT is a large language model-based chatbot developed by OpenAI. It is capable of generating human-like text based on prompts from users and can answer follow-up questions, admit its mistakes, challenge incorrect premises, and reject inappropriate requests. ChatGPT is a sibling model to InstructGPT, which is trained to follow an instruction in a prompt and provide a detailed response. ChatGPT has popularized AI to a degree that was unanticipated, attracting millions of users per day.
According to the latest available data, **ChatGPT** has over **100 million users** and sees approximately **1.8 billion visitors per month**. The tool gained **one million users within five days of launch**, setting the record for the fastest-growing user base in history for a consumer application. OpenAI predicts that ChatGPT’s revenue will reach **$200 million by the end of 2023** and **$1 billion by the end of 2024**.
The “GPT” stands for “Generative Pre-trained Transformer,” a type of Large Language Model (LLM) used to power generative AI applications.
CNN — Convolutional Neural Network
Convolutional Neural Network (CNN) is a type of artificial neural network that is commonly used in image and video recognition, recommender systems, and natural language processing. CNNs are designed to automatically and adaptively learn spatial hierarchies of features from input data. They are composed of multiple layers of interconnected processing nodes, which are loosely modeled after the organization of neurons in the visual cortex of the brain. The layers of a CNN consist of convolutional layers, pooling layers, and fully connected layers. Convolutional layers apply a set of filters to the input data to extract features that are relevant to the task at hand. Pooling layers downsample the output of convolutional layers to reduce the dimensionality of the feature maps. Fully connected layers are used to classify the input data based on the extracted features.
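A minimal PyTorch sketch of the three layer types named above (the channel counts and image size are arbitrary illustrative choices):

```python
# A tiny convolutional network: convolution extracts features, pooling
# downsamples the feature maps, and a fully connected head classifies.
import torch
import torch.nn as nn

model = nn.Sequential(
    nn.Conv2d(1, 8, kernel_size=3, padding=1),  # convolution: feature extraction
    nn.ReLU(),
    nn.MaxPool2d(2),                            # pooling: 28x28 -> 14x14
    nn.Flatten(),
    nn.Linear(8 * 14 * 14, 10),                 # fully connected: 10-class output
)

images = torch.randn(4, 1, 28, 28)              # batch of 4 grayscale 28x28 images
print(model(images).shape)                      # -> torch.Size([4, 10])
```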
DALL‑E
DALL‑E is an AI system developed by OpenAI that can create realistic images and art from a description in natural language. It is a text-to-image model that uses deep learning methodologies to generate digital images from natural language descriptions, called “prompts”. DALL‑E 2 is the latest version of the system, which generates more realistic and accurate images with 4x greater resolution than its predecessor. It can create original, realistic images and art from a text description by combining concepts, attributes, and styles. DALL‑E 2 is available in beta and can be accessed by signing up on the OpenAI website.
Deep Learning
Deep Learning is a subset of machine learning that involves the use of artificial neural networks with three or more layers to simulate the behavior of the human brain. It is a type of machine learning that can learn from large amounts of data and make predictions with high accuracy. Deep learning algorithms can process unstructured data such as text, images, and audio, and automate feature extraction, reducing the dependency on human experts.
Deep learning models are capable of different types of learning, including supervised learning, unsupervised learning, and reinforcement learning. In supervised learning, labeled datasets are used to categorize or make predictions, while unsupervised learning detects patterns in the data and clusters them by any distinguishing characteristics. Reinforcement learning involves training an agent to interact with an environment and learn from feedback in the form of rewards or penalties.
Emergent Behavior
Emergent behavior in AI refers to the phenomenon where a complex system of artificial intelligence exhibits properties or behaviors that its individual components do not possess on their own. The behavior of the system as a whole emerges from the interactions between its parts. Emergent behavior is often unpredictable and can be difficult to understand or control. It is a common feature of many natural and artificial systems, including social networks, ecosystems, and artificial intelligence.
Emergent behavior in AI can be observed in many different contexts. For example, large language models like ChatGPT have started to display startling, unpredictable behaviors that were never discussed in any literature. Recent investigations have revealed that large language models can produce hundreds of “emergent” abilities — tasks that big models can complete that smaller models can’t, many of which seem to have little to do with analyzing text. They range from multiplication to generating executable computer code to decoding movies based on emojis. Researchers are racing not only to identify additional emergent abilities but also to figure out why and how they occur at all — in essence, to try to predict unpredictability.
Emergent behavior is an important concept in many fields, including biology, physics, computer science, and social science. It has applications in fields such as robotics, where researchers are exploring ways to create robots that can exhibit emergent behavior to achieve complex tasks.
ESG — Environmental, Social, Governance (corporate)
Environmental, Social, and Governance (ESG) refers to a set of standards used by socially conscious investors to screen potential investments. ESG investing is based on a company’s behavior and policies regarding environmental protection, social responsibility, and corporate governance. Environmental criteria consider how a company safeguards the environment, including corporate policies addressing climate change. Social criteria examine how it manages relationships with employees, suppliers, customers, and the communities where it operates. Governance deals with a company’s leadership, executive pay, audits, internal controls, and shareholder rights. ESG investing can help portfolios avoid holding companies engaged in risky or unethical practices. Many mutual funds, brokerage firms, and robo-advisors now offer investment products that employ ESG principles.
ESG investing is sometimes referred to as sustainable investing, responsible investing, impact investing, or socially responsible investing (SRI). To assess a company based on ESG criteria, investors look at a broad range of behaviors and policies.
Federated Learning
Federated learning is a machine learning technique that enables organizations to train AI models on decentralized data without the need to centralize or share that data. It is also known as collaborative learning. In traditional centralized machine learning techniques, local datasets are merged into one training session. In contrast, federated learning trains an algorithm via multiple independent sessions, each using its own dataset. This approach enables multiple actors to build a common, robust machine learning model without sharing data, thus addressing critical issues such as data privacy, data security, data access rights, and access to heterogeneous data.
Federated learning is particularly useful in industries such as defense, telecommunications, Internet of Things, and pharmaceuticals. It has several benefits, such as preserving data privacy and security while still allowing for the development of robust machine learning models. However, it also leaves open questions, such as when (or whether) federated learning is preferable to pooled-data learning, how trustworthy the participating devices are, and what impact malicious actors may have on the learned model.
The general principle of federated learning consists of training local models on local data samples and exchanging parameters (e.g., the weights and biases of a deep neural network) between these local nodes at some frequency to generate a global model shared by all nodes. The main difference between federated learning and distributed learning lies in the assumptions made about the properties of the local datasets: distributed learning aims at parallelizing computing power, whereas federated learning originally aims at training on heterogeneous datasets.
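A toy sketch of this principle in Python: each node runs gradient steps on its own private data, and only the resulting parameters are averaged into a global model. This illustrates federated averaging in miniature and is not any particular framework’s API.

```python
# Federated averaging in miniature: nodes train locally on private data;
# only parameters (not data) are exchanged and averaged.
import numpy as np

rng = np.random.default_rng(0)

def local_update(weights, X, y, lr=0.1, steps=20):
    """Plain least-squares gradient steps on one node's private data."""
    w = weights.copy()
    for _ in range(steps):
        grad = 2 * X.T @ (X @ w - y) / len(y)
        w -= lr * grad
    return w

# Three nodes, each with a private dataset that never leaves the node.
true_w = np.array([1.5, -2.0])
datasets = []
for _ in range(3):
    X = rng.normal(size=(50, 2))
    datasets.append((X, X @ true_w + 0.1 * rng.normal(size=50)))

global_w = np.zeros(2)
for round_ in range(10):
    local_ws = [local_update(global_w, X, y) for X, y in datasets]
    global_w = np.mean(local_ws, axis=0)   # exchange parameters, not data

print(global_w)  # converges close to [1.5, -2.0]
```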
GAN — Generative Adversarial Network
Generative Adversarial Networks (GANs) are a class of machine learning frameworks that use two neural networks, a generator and a discriminator, to generate new data that is similar to the training data. The generator network learns to create new data samples that are similar to the training data, while the discriminator network learns to distinguish between the generated samples and the real training data. The two networks are trained together in a process called adversarial training, where the generator tries to generate realistic samples that can fool the discriminator, and the discriminator tries to correctly identify whether a sample is real or generated. GANs have been used for various applications such as image generation, video generation, and voice generation.
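A compact PyTorch sketch of the adversarial loop described above, with a generator learning to imitate a simple one-dimensional “real” distribution (all sizes and hyperparameters are illustrative):

```python
# A minimal GAN: G maps noise to fake samples; D scores real vs. fake.
# They are trained against each other in alternating steps.
import torch
import torch.nn as nn

torch.manual_seed(0)
G = nn.Sequential(nn.Linear(4, 16), nn.ReLU(), nn.Linear(16, 1))
D = nn.Sequential(nn.Linear(1, 16), nn.ReLU(), nn.Linear(16, 1))
opt_g = torch.optim.Adam(G.parameters(), lr=1e-3)
opt_d = torch.optim.Adam(D.parameters(), lr=1e-3)
bce = nn.BCEWithLogitsLoss()

for step in range(500):
    real = torch.randn(64, 1) * 0.5 + 2.0          # "training data": N(2, 0.5)
    fake = G(torch.randn(64, 4))

    # Discriminator: label real samples 1, generated samples 0.
    opt_d.zero_grad()
    d_loss = bce(D(real), torch.ones(64, 1)) + bce(D(fake.detach()), torch.zeros(64, 1))
    d_loss.backward()
    opt_d.step()

    # Generator: try to make the discriminator call fakes real.
    opt_g.zero_grad()
    g_loss = bce(D(G(torch.randn(64, 4))), torch.ones(64, 1))
    g_loss.backward()
    opt_g.step()

print(fake.mean().item())  # should drift toward 2.0 as G matches the real data
```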
GAI — Generative AI
From ChatGPT:
I’m called a “generative AI tool” because I have the ability to generate human-like text based on the prompts and instructions given to me. “Generative” in this context refers to the process of creating something new, in this case, text, rather than simply providing pre-existing responses or answers. Unlike traditional rule-based systems or simpler forms of AI, which rely on predefined patterns and responses, I use a deep learning model known as a Transformer architecture to understand and generate text in a way that simulates human-like understanding and creativity.
This generative ability enables me to compose coherent and contextually appropriate responses, which can be used for a wide range of tasks, including writing assistance, conversation, content creation, and more. It allows for a level of flexibility and adaptability that’s not typically seen in more static forms of AI or chatbots.
Hallucination
In the field of artificial intelligence (AI), hallucination or artificial hallucination is a confident response by an AI that does not seem to be justified by its training data. It is a phenomenon where a generative AI model generates inaccurate information as if it were correct. Such phenomena are termed “hallucinations”, in loose analogy with the phenomenon of hallucination in human psychology. However, one key difference is that human hallucination is usually associated with false percepts, but an AI hallucination is associated with the category of unjustified responses or beliefs.
Horizontal Enabler
A horizontal enabler is a term used to describe a technology that can be used across multiple industries and applications. In the context of machine learning, a horizontal enabler would be a tool or platform that can be used to develop machine learning models across different domains and applications. For example, cloud computing platforms like Amazon Web Services (AWS)and Microsoft Azure provide machine learning services that can be used by businesses in various industries to develop machine learning models for their specific use cases. Another example of a horizontal enabler in machine learning is federated learning, which is a technique that allows multiple parties to collaborate on building machine learning models without sharing their data with each other.
KNN — K‑Nearest Neighbor
K‑Nearest Neighbors (KNN) is a type of machine learning algorithm that is used for classification and regression analysis. KNN is a non-parametric algorithm, which means that it does not make any assumptions about the underlying data distribution. Instead, it classifies new data points based on the proximity to the training data. The algorithm works by finding the K nearest neighbors to a new data point in the training data and assigning the class of the majority of those neighbors to the new data point. The value of K is a hyperparameter that can be tuned to optimize the performance of the algorithm. KNN has been applied to various fields such as image recognition, text classification, and recommendation systems.
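A bare-bones illustration in Python: prediction is a majority vote among the K closest training points, with no training phase beyond storing the data.

```python
# Minimal KNN classifier: find the K nearest training points by Euclidean
# distance and return the majority class among them.
import numpy as np
from collections import Counter

def knn_predict(X_train, y_train, x_new, k=3):
    dists = np.linalg.norm(X_train - x_new, axis=1)  # distance to every point
    nearest = np.argsort(dists)[:k]                  # indices of K closest
    return Counter(y_train[nearest]).most_common(1)[0][0]

X = np.array([[0.0, 0.0], [0.1, 0.2], [0.9, 1.0], [1.0, 0.8]])
y = np.array(["blue", "blue", "red", "red"])
print(knn_predict(X, y, np.array([0.8, 0.9])))  # -> "red"
```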
LLM — Large Language Model
A large language model (LLM) is a type of language model that is characterized by its large size. It is enabled by AI accelerators, which are able to process vast amounts of text data, mostly scraped from the Internet. LLMs are used in natural language processing (NLP) tasks such as language translation, question answering, and text generation. Not to be confused with the Master of Laws (LL.M.), an advanced postgraduate academic degree in law, pursued by those who already hold an undergraduate academic law degree, a professional law degree, or an undergraduate degree in a related subject; in most jurisdictions, the LL.M. is the advanced professional degree for those usually already admitted into legal practice.
ML — Machine Learning
Machine learning is a subfield of artificial intelligence that involves the use of algorithms and statistical models to enable computer systems to learn and adapt without being explicitly programmed. In other words, machine learning is a type of artificial intelligence that allows machines to learn from data and improve their performance over time. It is used in a wide range of applications, including image recognition, natural language processing, and predictive analytics.
According to the Oxford Dictionary, machine learning is “the use and development of computer systems that are able to learn and adapt without following explicit instructions, by using algorithms and statistical models to analyze and draw inferences from patterns in data”. Machine learning is often used interchangeably with artificial intelligence, but the two terms are distinct. While artificial intelligence refers to the general attempt to create machines capable of human-like cognitive abilities, machine learning specifically refers to the use of algorithms and data sets to do so.
MLops — Machine Learning Operations
MLOps, or Machine Learning Operations, is a paradigm that aims to deploy and maintain machine learning models in production reliably and efficiently. It is an engineering practice at the intersection of three contributing disciplines: machine learning, software engineering (especially DevOps), and data engineering. MLOps seeks to increase automation and improve the quality of production models, while also focusing on business and regulatory requirements. It applies to the entire lifecycle, from integrating with model generation (software development lifecycle, continuous integration / continuous delivery), orchestration, and deployment, to health, diagnostics, governance, and business metrics.
Midjourney
Midjourney is a generative artificial intelligence (AI) program that creates images from text descriptions. It is one of the most advanced AI art generators available, and is known for its ability to produce stunningly realistic and imaginative images. Midjourney is currently in beta testing, and is not yet available to the public. However, there are a number of ways to get access to the platform, including through a waitlist or through an invitation from a current user. To use Midjourney, users simply type in a text prompt describing the image they would like to create. For example, a user might type in the prompt “a painting of a cat sitting on a windowsill, looking out at a rainy city street.” Midjourney will then generate a number of images based on the prompt, which the user can then select from or modify.
NLP — Natural Language Processing
Natural Language Processing (NLP) is a field of computer science that focuses on the interaction between computers and humans in natural language. It is a subfield of artificial intelligence that deals with the processing of human language data. NLP combines computational linguistics, machine learning, and deep learning models to enable computers to process human language in the form of text or voice data and to understand its full meaning, complete with the speaker or writer’s intent and sentiment. NLP drives computer programs that translate text from one language to another, respond to spoken commands, and summarize large volumes of text rapidly—even in real time. Some of the tasks that NLP can perform include speech recognition, part-of-speech tagging, and word sense disambiguation. NLP is used in a variety of applications such as digital assistants, chatbots, and machine translation systems.
PCA — Principal Component Analysis
Principal Component Analysis (PCA) is a statistical technique for reducing the dimensionality of a dataset. It is used to transform a large set of variables into a smaller one that still contains most of the information in the large set. PCA is accomplished by linearly transforming the data into a new coordinate system where (most of) the variation in the data can be described with fewer dimensions than the initial data. The first principal component of a set of variables, presumed to be jointly normally distributed, is the derived variable formed as a linear combination of the original variables that explains the most variance. The second principal component explains the most variance in what is left once the effect of the first component is removed, and we may proceed through iterations until all the variance is explained. PCA is most commonly used when many of the variables are highly correlated with each other and it is desirable to reduce their number to an independent set. PCA has applications in many fields such as population genetics, microbiome studies, and atmospheric science.
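The procedure can be sketched in a few lines of NumPy: center the data, take the eigenvectors of the covariance matrix, and project onto the components that explain the most variance.

```python
# PCA from scratch: center, eigendecompose the covariance matrix, and
# project onto the top-k components (those with the largest variance).
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))
X[:, 1] = X[:, 0] * 2 + 0.1 * rng.normal(size=100)   # two correlated columns

Xc = X - X.mean(axis=0)                   # 1. center each variable
cov = np.cov(Xc, rowvar=False)            # 2. covariance matrix
eigvals, eigvecs = np.linalg.eigh(cov)    # 3. eigendecomposition (ascending)
order = np.argsort(eigvals)[::-1]         # sort by explained variance, descending

k = 2
components = eigvecs[:, order[:k]]
X_reduced = Xc @ components               # 4. project: 5 dims -> 2 dims
print(X_reduced.shape)                    # (100, 2)
print(eigvals[order] / eigvals.sum())     # fraction of variance per component
```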
Paraphrasing Tool
A paraphrasing tool is a software that can rewrite or rephrase a sentence without changing its meaning. It is used to substitute specific words, phrases, sentences, or even whole paragraphs with alternate versions to create a slightly different variant. Paraphrasing tools are designed to help users avoid plagiarism and improve their writing skills by providing them with a way to rephrase text in their own words.
There are many paraphrasing tools available online, including QuillBot AI, Ref-n-Write, Check-Plagiarism, SpinBot, and Scribbr. These tools use advanced AI technology to rephrase text, essays, and articles in various styles and dialects. They offer features such as synonym replacement, sentence restructuring, and vocabulary enhancement to help users create unique content that is free of plagiarism.
Perplexity
In the context of artificial intelligence (AI), perplexity is an important measurement for determining how good a language model is at predicting the next word in a sequence given its previous words. It is used to compare probability models and may be used to evaluate the quality of the model’s predictions by evaluating the inverse probability of the test set, normalized by the number of words, or by calculating the average number of bits required to encode a single word through cross-entropy. It is used along with “burstiness” to detect AI produced content.
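As a worked illustration, perplexity can be computed as the exponentiated average negative log-probability a model assigns to the words that actually occurred (the probabilities below are made up for the example):

```python
# Perplexity from cross-entropy: lower means the model is less "surprised"
# by the actual next words.
import math

# Hypothetical per-word probabilities a language model assigned to the
# words that actually occurred in a test sentence.
word_probs = [0.2, 0.5, 0.05, 0.3]

cross_entropy = -sum(math.log(p) for p in word_probs) / len(word_probs)
perplexity = math.exp(cross_entropy)
print(perplexity)  # ~5.1: roughly as uncertain as choosing among ~5 words
```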
Prompt
In the context of natural language processing, a prompt is the text used to initiate a conversation with an AI chatbot. The prompt can be a question, a statement, or any other text used to start the conversation. The specificity of a prompt can produce widely varying results, and writing effective, complex prompts is a valuable skill when working with natural-language AI models and chatbots.
Quantization
Quantization is the process of reducing the storage precision of the parameters (weights) of an LLM in order to shrink the model and save memory, the main bottleneck of LLMs. There are different methods, but the concept is always the same: compressing the values of the weights into smaller bit sizes so they take up less space/memory.
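A toy NumPy sketch of the idea, mapping float32 weights to 8-bit integers with a stored scale factor (real quantization schemes are more sophisticated, with per-channel scales, outlier handling, and so on):

```python
# Toy 8-bit quantization: map each float32 weight to one of 256 integer
# levels, keeping a scale factor to approximately recover it later.
import numpy as np

rng = np.random.default_rng(0)
weights = rng.normal(size=1000).astype(np.float32)   # 4 bytes per weight

scale = np.abs(weights).max() / 127.0                # fit the range into int8
q = np.round(weights / scale).astype(np.int8)        # 1 byte per weight
dequant = q.astype(np.float32) * scale               # approximate recovery

print(weights.nbytes, "->", q.nbytes, "bytes")       # 4000 -> 1000
print(np.abs(weights - dequant).max())               # small rounding error
```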
RF — Random Forest
Random Forest (RF) is a machine learning algorithm that is used for classification and regression analysis. It is an ensemble learning method that combines multiple decision trees to improve the accuracy and robustness of the model. Each decision tree in the forest is trained on a random subset of the training data and a random subset of the features. The output of the RF model is the average (in regression) or majority vote (in classification) of the outputs of the individual trees. RF has been applied to various fields such as image classification, text classification, and bioinformatics.
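A short scikit-learn example on synthetic data (dataset shape and hyperparameters are arbitrary illustrative choices):

```python
# A random forest classifier: an ensemble of decision trees, each trained
# on a bootstrap sample, whose majority vote classifies each input.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=500, n_features=10, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

forest = RandomForestClassifier(n_estimators=100, random_state=0)
forest.fit(X_tr, y_tr)
print(forest.score(X_te, y_te))   # majority-vote accuracy on held-out data
```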
RLHF — Reinforcement learning from human feedback
Reinforcement learning from human feedback (RLHF) is a technique that trains an AI agent to make decisions by receiving feedback from humans. It is a subfield of artificial intelligence that combines the power of human guidance with machine learning algorithms. RLHF involves training a “reward model” directly from human feedback and uses the model as a reward function to optimize an agent’s policy using reinforcement learning. RLHF has been used in various applications such as chatbots, recommendation systems, and game playing. It has shown promising results in improving the performance of AI agents by incorporating human feedback into the training process. However, there are still many challenges to overcome, such as designing effective reward functions and ensuring that the feedback provided by humans is accurate and consistent.
RNN — Recurrent Neural Network
Recurrent Neural Network (RNN) is a type of artificial neural network that is commonly used in natural language processing, speech recognition, and other sequence-based tasks. RNNs are designed to process sequential data by maintaining an internal state or memory that allows them to capture temporal dependencies between inputs. They are composed of a series of interconnected processing nodes, which are arranged in a directed cycle. Each node receives input from the previous node and produces output that is fed to the next node in the sequence. The output of each node is also fed back into the network as input to the next time step, allowing the network to maintain a memory of previous inputs. RNNs can be trained using backpropagation through time (BPTT), which is an extension of backpropagation that allows gradients to flow through the network over multiple time steps.
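The core recurrence can be sketched in a few lines of NumPy; the hidden state carries a memory of earlier inputs forward through the sequence (all sizes are illustrative):

```python
# The core recurrence of a simple RNN: the new hidden state depends on
# both the current input and the previous state.
import numpy as np

rng = np.random.default_rng(0)
W_xh = rng.normal(scale=0.1, size=(8, 3))    # input -> hidden weights
W_hh = rng.normal(scale=0.1, size=(8, 8))    # hidden -> hidden (the recurrence)
b_h = np.zeros(8)

h = np.zeros(8)                              # initial memory
sequence = rng.normal(size=(5, 3))           # 5 time steps, 3 features each
for x_t in sequence:
    h = np.tanh(W_xh @ x_t + W_hh @ h + b_h) # depends on input AND the past

print(h)  # final hidden state summarizes the whole sequence
```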
SVM — Support Vector Machine
Support Vector Machine (SVM) is a type of machine learning algorithm that is used for classification and regression analysis. SVMs are based on the concept of finding the best hyperplane that separates the data into different classes. The hyperplane is chosen such that it maximizes the margin between the two classes. SVMs can be used for both linear and non-linear classification tasks by using different kernel functions. SVMs have been applied to various fields such as image classification, text classification, and bioinformatics.
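A short scikit-learn example on a classic non-linearly separable toy dataset (the RBF kernel and parameters are illustrative choices):

```python
# An SVM with an RBF kernel: the kernel choice enables a non-linear
# maximum-margin decision boundary.
from sklearn.datasets import make_moons
from sklearn.svm import SVC

X, y = make_moons(n_samples=200, noise=0.2, random_state=0)

clf = SVC(kernel="rbf", C=1.0)
clf.fit(X, y)
print(clf.score(X, y))        # training accuracy
print(len(clf.support_))      # number of support vectors found
```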
Turing test
The Turing Test is a test of a machine’s ability to exhibit intelligent behavior equivalent to, or indistinguishable from, that of a human. It was proposed by Alan Turing in 1950 and is named after him. The test involves a human evaluator who judges natural language conversations between a human and a machine designed to generate human-like responses. If the evaluator cannot reliably tell the machine from the human, the machine is said to have passed the test. The test results do not depend on the machine’s ability to give correct answers to questions, only on how closely its answers resemble those a human would give.