AI Glossary

Adversarial Training

Adversarial training involves training two AI models against each other. One model, called the generator, is trained to generate biased data. The other model, called the discriminator, is trained to identify biased data. By training the two models against each other, the discriminator learns to identify biased data even when it is disguised. It can then be used to identify biased-data-driven behavior in other AI models.

Here are some examples of how adversarial training is being used to identify biased-data-driven behavior in AI:

    • Google AI is using adversarial training to develop AI models that are more fair and unbiased.
    • Microsoft Azure is using adversarial training to develop AI models that are more robust to bias.
    • IBM Watson is using adversarial training to develop AI models that can be used to detect and mitigate bias in other AI models.

Adversarial training is a powerful technique that can be used to identify biased-data-driven behavior in AI. As adversarial training continues to develop, we can expect to see even more innovative and effective ways to use it to improve the fairness and reliability of AI models.

It is important to note that adversarial training is not a perfect solution: it is possible for AI models to be trained to evade it. However, adversarial training is still a valuable tool for identifying biased-data-driven behavior in AI.

AI — Artificial Intelligence

AI is the intelligence of machines or software, as opposed to the intelligence of humans or animals. It is also the field of study in computer science that develops and studies intelligent machines. “AI” may also refer to the machines themselves.

AGI — Artificial General Intelligence

AGI is a hypothetical type of intelligent agent that can learn to accomplish any intellectual task that human beings or animals can perform. AGI is a primary goal of some artificial intelligence research and of companies such as OpenAI, DeepMind, and Anthropic. However, it is still a subject of ongoing debate among researchers and experts whether AGI development is possible in years or decades, or whether it might take a century or longer. SoftBank CEO Masayoshi Son has predicted that AGI will be realized within ten years.

Algorithm

An algorithm is a finite sequence of well-defined, computer-implementable instructions that are used to solve a class of problems or to perform a computation. It is a step-by-step procedure for solving a problem or accomplishing some end. The term “algorithm” is commonly used nowadays for the set of rules a machine (and especially a computer) follows to achieve a particular goal. The word “algorithm” comes from the name of the 9th-century Persian mathematician Muhammad ibn Musa al-Khwarizmi, who did important work in the fields of algebra and numeric systems.

“An algorithm is a set of instructions for solving a problem or accomplishing a task. One common example of an algorithm is a recipe, which consists of specific instructions for preparing a dish or meal.” (Investopedia)
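
As a concrete illustration, here is a minimal sketch in Python of one of the oldest known algorithms, Euclid’s method for finding the greatest common divisor of two integers (the function name and inputs are just for illustration):

    def gcd(a, b):
        # Repeatedly replace the pair (a, b) with (b, a mod b);
        # when b reaches 0, a holds the greatest common divisor.
        while b != 0:
            a, b = b, a % b
        return a

    print(gcd(48, 36))  # prints 12

Like a recipe, it is a finite list of unambiguous steps that is guaranteed to terminate with the desired result.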

Alignment

AI alignment is a subfield of AI safety research that aims to ensure artificial intelligence systems achieve desired outcomes. It is concerned with steering AI systems towards humans’ intended goals, preferences, or ethical principles. An AI system is considered aligned if it advances the intended objectives. A misaligned AI system pursues some objectives, but not the intended ones.

AI alignment research is crucial because it can be challenging for AI designers to align an AI system with human values and preferences. It can be difficult for them to specify the full range of desired and undesired behavior, so they typically use simpler proxy goals, such as gaining human approval. But that approach can create loopholes, overlook necessary constraints, or reward the AI system for merely appearing aligned. Misaligned AI systems can malfunction or cause harm. They may find loopholes that allow them to accomplish their proxy goals efficiently but in unintended, sometimes harmful ways (reward hacking). They may also develop unwanted instrumental strategies, such as seeking power or survival, because such strategies help them achieve their given goals. Furthermore, they may develop undesirable emergent goals that can be hard to detect before the system is deployed and faces new situations and data distributions.
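
As a toy illustration of the proxy-goal problem (a made-up example, not drawn from any real system), suppose a designer scores answers by length as a crude proxy for helpfulness; an optimizer maximizing the proxy happily picks a long but useless answer:

    # Hypothetical example: answer length used as a proxy for helpfulness.
    candidates = {
        "Paris.": 1.0,                                      # short, correct, helpful
        "The answer could be many different things, " * 5: 0.0,  # long, useless
    }

    def proxy_reward(answer):
        # The proxy only measures length, not actual helpfulness.
        return len(answer)

    best = max(candidates, key=proxy_reward)
    print(best)              # the long, useless answer wins
    print(candidates[best])  # its true helpfulness score: 0.0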

ANN — Artificial Neural Network

An Artificial Neural Network (ANN) is a computational model inspired by the way biological neural networks work. ANNs are a branch of machine learning models built using principles of neuronal organization discovered by connectionism in the biological neural networks constituting animal brains. They are composed of a large number of interconnected processing nodes, or neurons, that can learn to recognize patterns in input data. They are used to recognize patterns, cluster data, and make predictions.

ANNs are based on a collection of connected units or nodes called artificial neurons, which loosely model the neurons in a biological brain. Each connection, like the synapses in a biological brain, can transmit a signal to other neurons. An artificial neuron receives signals, processes them, and can signal neurons connected to it. The “signal” at a connection is a real number, and the output of each neuron is computed by some non-linear function of the sum of its inputs. The connections are called edges. Neurons and edges typically have a weight that adjusts as learning proceeds. The weight increases or decreases the strength of the signal at a connection. Neurons may have a threshold such that a signal is sent only if the aggregate signal crosses that threshold. Typically, neurons are aggregated into layers. Different layers may perform different transformations on their inputs. Signals travel from the first layer (the input layer) to the last layer (the output layer), possibly after traversing the layers multiple times.
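
A minimal sketch of a single artificial neuron in Python with NumPy, assuming a sigmoid non-linearity; the weights and inputs below are made up for illustration:

    import numpy as np

    def neuron(inputs, weights, bias):
        # Weighted sum of incoming signals, passed through a non-linear function.
        z = np.dot(weights, inputs) + bias
        return 1.0 / (1.0 + np.exp(-z))   # sigmoid activation

    x = np.array([0.5, -1.2, 3.0])   # incoming "signals"
    w = np.array([0.4, 0.1, -0.6])   # connection weights (adjusted during learning)
    print(neuron(x, w, bias=0.2))    # the neuron's output signal

A full network is just many such units wired together in layers, with the weights adjusted during training.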

Bard

Bard is a chatbot developed by Google that uses conversational generative artificial intelligence to communicate with users in natural language. It is designed to help users stay on top of what’s most important by combining the voice assistant with AI. The chatbot is based on the LaMDA family of large language models (LLMs) and later the PaLM LLM.

Bing Chat

Bing Chat is a large language model chatbot developed by Microsoft. It is integrated into the Microsoft Bing search engine and Edge web browser, and can also be accessed through the Bing app on mobile devices. Bing Chat can be used to answer questions, generate different creative text formats, and perform other tasks in a conversational way. It is powered by a number of advanced AI technologies, including a massive dataset of text and code, a transformer-based neural network architecture, and a variety of machine learning techniques. This allows Bing Chat to understand and respond to a wide range of prompts and questions, including those that are open-ended, challenging, or strange.

Burstiness

Burstiness is a measurement of variation in sentence structure and length. It is a metric that can be used to detect AI-generated content. AI writing tends to display low levels of burstiness, while human writing tends to have higher burstiness. In combination with perplexity, burstiness is generally what AI detectors look for in order to label a text as AI-generated.
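
There is no single standard formula, but a minimal sketch in Python might measure burstiness as the spread of sentence lengths across a text (splitting crudely on periods):

    import statistics

    def burstiness(text):
        # Word counts per sentence; more variation = "burstier" writing.
        lengths = [len(s.split()) for s in text.split(".") if s.strip()]
        return statistics.stdev(lengths) if len(lengths) > 1 else 0.0

    human = "Short one. Then a much longer, winding sentence follows it. Tiny."
    ai = "This is a sentence. This is a sentence. This is a sentence."
    print(burstiness(human), burstiness(ai))  # the human-like text scores higher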

ChatGPT

ChatGPT is a large language model-based chatbot developed by OpenAI. It is capable of generating human-like text based on prompts from users and can answer follow-up questions, admit its mistakes, challenge incorrect premises, and reject inappropriate requests. ChatGPT is a sibling model to InstructGPT, which is trained to follow an instruction in a prompt and provide a detailed response. ChatGPT has popularized AI in a way that was unanticipated, with millions of users per day.

According to the latest available data, ChatGPT has over 100 million users and sees approximately 1.8 billion visitors per month. The tool set a record for the fastest-growing user base in history for a consumer application, gaining one million users in just five days after launch. OpenAI predicts that ChatGPT’s revenue will reach $200 million by the end of 2023 and $1 billion by the end of 2024.

The “GPT” stands for “Generative Pre-trained Transformer,” a type of Large Language Model (LLM) used to power generative AI applications.

CNN — Convolutional Neural Network

A Convolutional Neural Network (CNN) is a type of artificial neural network that is commonly used in image and video recognition, recommender systems, and natural language processing. CNNs are designed to automatically and adaptively learn spatial hierarchies of features from input data. They are composed of multiple layers of interconnected processing nodes, which are loosely modeled after the organization of neurons in the visual cortex of the brain. The layers of a CNN consist of convolutional layers, pooling layers, and fully connected layers. Convolutional layers apply a set of filters to the input data to extract features that are relevant to the task at hand. Pooling layers downsample the output of convolutional layers to reduce the dimensionality of the feature maps. Fully connected layers are used to classify the input data based on the extracted features.
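
A minimal sketch of the convolution → pooling → fully-connected pattern in PyTorch; the layer sizes are arbitrary assumptions chosen for 28×28 grayscale inputs:

    import torch
    import torch.nn as nn

    class TinyCNN(nn.Module):
        def __init__(self):
            super().__init__()
            self.conv = nn.Conv2d(1, 8, kernel_size=3, padding=1)  # feature extraction
            self.pool = nn.MaxPool2d(2)                            # downsampling
            self.fc = nn.Linear(8 * 14 * 14, 10)                   # classification

        def forward(self, x):
            x = self.pool(torch.relu(self.conv(x)))  # convolve, activate, pool
            return self.fc(x.flatten(1))             # classify flattened features

    logits = TinyCNN()(torch.randn(1, 1, 28, 28))  # one fake 28x28 image
    print(logits.shape)  # torch.Size([1, 10]) -- one score per class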

DALL‑E

DALL-E is an AI system developed by OpenAI that can create realistic images and art from a description in natural language. It is a text-to-image model that uses deep learning methodologies to generate digital images from natural language descriptions, called “prompts”. DALL-E 2 is the latest version of the system, which generates more realistic and accurate images with 4x greater resolution than its predecessor. It can create original, realistic images and art from a text description by combining concepts, attributes, and styles. DALL-E 2 is available in beta and can be accessed by signing up on the OpenAI website.

Deep Learning

Deep Learning is a subset of machine learning that involves the use of artificial neural networks with three or more layers to simulate the behavior of the human brain. It is a type of machine learning that can learn from large amounts of data and make predictions with high accuracy. Deep learning algorithms can process unstructured data such as text, images, and audio, and automate feature extraction, reducing the dependency on human experts.

Deep learning models are capable of different types of learning, including supervised learning, unsupervised learning, and reinforcement learning. In supervised learning, labeled datasets are used to categorize or make predictions, while unsupervised learning detects patterns in the data and clusters them by any distinguishing characteristics. Reinforcement learning involves training an agent to interact with an environment and learn from feedback in the form of rewards or penalties.
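
A minimal sketch of what “deep” means in practice: a PyTorch network with several stacked layers (the sizes here are arbitrary; real models differ mainly in scale and architecture):

    import torch
    import torch.nn as nn

    # Three or more learned layers stacked between input and output
    # is what makes a network "deep".
    model = nn.Sequential(
        nn.Linear(16, 64), nn.ReLU(),
        nn.Linear(64, 64), nn.ReLU(),
        nn.Linear(64, 3),
    )
    print(model(torch.randn(5, 16)).shape)  # torch.Size([5, 3])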

Emergent Behavior

Emergent behavior in AI refers to the phenomenon where a complex system of artificial intelligence exhibits properties or behaviors that its individual components do not possess on their own. The behavior of the system as a whole emerges from the interactions between its parts. Emergent behavior is often unpredictable and can be difficult to understand or control. It is a common feature of many natural and artificial systems, including social networks, ecosystems, and artificial intelligence.

Emergent behavior in AI can be observed in many different contexts. For example, large language models like ChatGPT have started to display startling, unpredictable behaviors that were never discussed in any literature. Recent investigations have revealed that large language models can produce hundreds of “emergent” abilities: tasks that big models can complete that smaller models can’t, many of which seem to have little to do with analyzing text. They range from multiplication to generating executable computer code to decoding movies based on emojis. Researchers are racing not only to identify additional emergent abilities but also to figure out why and how they occur at all; in essence, to try to predict unpredictability.

Emergent behavior is an important concept in many fields, including biology, physics, computer science, and social science. It has applications in fields such as robotics, where researchers are exploring ways to create robots that can exhibit emergent behavior to achieve complex tasks.

ESG — Environmental, Social, Governance (corporate)

Environmental, Social, and Governance (ESG) refers to a set of standards used by socially conscious investors to screen potential investments. ESG investing is based on a company’s behavior and policies regarding environmental protection, social responsibility, and corporate governance. Environmental criteria consider how a company safeguards the environment, including corporate policies addressing climate change. Social criteria examine how it manages relationships with employees, suppliers, customers, and the communities where it operates. Governance deals with a company’s leadership, executive pay, audits, internal controls, and shareholder rights. ESG investing can help portfolios avoid holding companies engaged in risky or unethical practices. Many mutual funds, brokerage firms, and robo-advisors now offer investment products that employ ESG principles.

ESG investing is sometimes referred to as sustainable investing, responsible investing, impact investing, or socially responsible investing (SRI). To assess a company based on ESG criteria, investors look at a broad range of behaviors and policies.

Federated Learning

Federated learning is a machine learning technique that enables organizations to train AI models on decentralized data without the need to centralize or share that data. It is also known as collaborative learning. In traditional centralized machine learning techniques, local datasets are merged into one training session. In contrast, federated learning trains an algorithm via multiple independent sessions, each using its own dataset. This approach enables multiple actors to build a common, robust machine learning model without sharing data, thus addressing critical issues such as data privacy, data security, data access rights, and access to heterogeneous data.

Federated learning is particularly useful in industries such as defense, telecommunications, the Internet of Things, and pharmaceuticals. It has several benefits, such as preserving data privacy and security while still allowing for the development of robust machine learning models. However, it also leaves open questions, such as when or whether federated learning is preferable to pooled-data learning, the trustworthiness of the participating devices, and the impact of malicious actors on the learned model.

The general principle of federated learning consists of training local models on local data samples and exchanging parameters (e.g., the weights and biases of a deep neural network) between these local nodes at some frequency to generate a global model shared by all nodes. The main difference between federated learning and distributed learning lies in the assumptions made about the properties of the local datasets: distributed learning aims at parallelizing computing power, whereas federated learning originally aims at training on heterogeneous datasets.
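
A minimal sketch of the weight-exchange step (federated averaging) in Python with NumPy, assuming each node returns its locally trained weights; the node count and weight shapes are illustrative:

    import numpy as np

    def federated_average(local_weights):
        # Each node trains on its own private data and sends only its model
        # parameters; the server averages them into a shared global model.
        return sum(local_weights) / len(local_weights)

    # Stand-ins for weights trained locally on three private datasets.
    node_a = np.array([0.2, 1.1, -0.5])
    node_b = np.array([0.4, 0.9, -0.3])
    node_c = np.array([0.3, 1.0, -0.4])
    print(federated_average([node_a, node_b, node_c]))  # [ 0.3  1.  -0.4]

The raw data never leaves each node; only the parameters are exchanged.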

GAN — Generative Adversarial Network

Generative Adversarial Networks (GANs) are a class of machine learning frameworks that use two neural networks, a generator and a discriminator, to generate new data that is similar to the training data. The generator network learns to create new data samples that are similar to the training data, while the discriminator network learns to distinguish between the generated samples and the real training data. The two networks are trained together in a process called adversarial training, where the generator tries to generate realistic samples that can fool the discriminator, and the discriminator tries to correctly identify whether a sample is real or generated. GANs have been used for various applications such as image generation, video generation, and voice generation.
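
A minimal sketch of this adversarial loop in PyTorch, with tiny made-up networks and one-dimensional “data” so it stays readable; the shapes, learning rates, and target distribution are illustrative assumptions, not a production recipe:

    import torch
    import torch.nn as nn

    G = nn.Sequential(nn.Linear(4, 16), nn.ReLU(), nn.Linear(16, 1))  # generator
    D = nn.Sequential(nn.Linear(1, 16), nn.ReLU(), nn.Linear(16, 1))  # discriminator
    opt_g = torch.optim.Adam(G.parameters(), lr=1e-3)
    opt_d = torch.optim.Adam(D.parameters(), lr=1e-3)
    loss = nn.BCEWithLogitsLoss()

    for step in range(200):
        real = torch.randn(32, 1) * 0.5 + 2.0   # "real" data: centered at 2.0
        fake = G(torch.randn(32, 4))            # generated samples from noise

        # Discriminator: learn to label real as 1, generated as 0.
        d_loss = loss(D(real), torch.ones(32, 1)) + \
                 loss(D(fake.detach()), torch.zeros(32, 1))
        opt_d.zero_grad(); d_loss.backward(); opt_d.step()

        # Generator: try to make the discriminator label its samples as real.
        g_loss = loss(D(fake), torch.ones(32, 1))
        opt_g.zero_grad(); g_loss.backward(); opt_g.step()

    print(G(torch.randn(5, 4)).mean())  # should drift toward 2.0 as G improves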

GAI — Generative AI

From ChatGPT:
I’m called a “generative AI tool” because I have the ability to generate human-like text based on the prompts and instructions given to me. “Generative” in this context refers to the process of creating something new, in this case, text, rather than simply providing pre-existing responses or answers. Unlike traditional rule-based systems or simpler forms of AI, which rely on predefined patterns and responses, I use a deep learning model known as a Transformer architecture to understand and generate text in a way that simulates human-like understanding and creativity.

This generative ability enables me to compose coherent and contextually appropriate responses, which can be used for a wide range of tasks, including writing assistance, conversation, content creation, and more. It allows for a level of flexibility and adaptability that’s not typically seen in more static forms of AI or chatbots.

Hallucination

In the field of artificial intelligence (AI), hallucination or artificial hallucination is a confident response by an AI that does not seem to be justified by its training data. It is a phenomenon where a generative AI model generates inaccurate information as if it were correct. Such phenomena are termed “hallucinations”, in loose analogy with the phenomenon of hallucination in human psychology. However, one key difference is that human hallucination is usually associated with false percepts, but an AI hallucination is associated with the category of unjustified responses or beliefs.

Horizontal Enabler

A horizontal enabler is a term used to describe a technology that can be used across multiple industries and applications. In the context of machine learning, a horizontal enabler would be a tool or platform that can be used to develop machine learning models across different domains and applications. For example, cloud computing platforms like Amazon Web Services (AWS) and Microsoft Azure provide machine learning services that can be used by businesses in various industries to develop machine learning models for their specific use cases. Another example of a horizontal enabler in machine learning is federated learning, a technique that allows multiple parties to collaborate on building machine learning models without sharing their data with each other.

KNN — K‑Nearest Neighbor

K-Nearest Neighbors (KNN) is a type of machine learning algorithm that is used for classification and regression analysis. KNN is a non-parametric algorithm, which means that it does not make any assumptions about the underlying data distribution. Instead, it classifies new data points based on their proximity to the training data. The algorithm works by finding the K nearest neighbors to a new data point in the training data and assigning the class of the majority of those neighbors to the new data point. The value of K is a hyperparameter that can be tuned to optimize the performance of the algorithm. KNN has been applied to various fields such as image recognition, text classification, and recommendation systems.
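
A minimal sketch of the core idea in Python with NumPy, using a toy 2-D dataset and k=3 (all values made up for illustration):

    import numpy as np

    def knn_predict(X_train, y_train, x_new, k=3):
        # Distance from the new point to every training point.
        dists = np.linalg.norm(X_train - x_new, axis=1)
        # Classes of the k nearest neighbors; the majority vote wins.
        nearest = y_train[np.argsort(dists)[:k]]
        return np.bincount(nearest).argmax()

    X = np.array([[1, 1], [1, 2], [2, 1], [8, 8], [8, 9], [9, 8]])
    y = np.array([0, 0, 0, 1, 1, 1])
    print(knn_predict(X, y, np.array([2, 2])))  # 0 (closest to the first cluster)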

LLM — Large Language Model

A Large Language Model (LLM) is a type of language model that is characterized by its large size. It is enabled by AI accelerators, which are able to process vast amounts of text data, mostly scraped from the Internet. LLMs are used in natural language processing (NLP) tasks such as language translation, question answering, and text generation. (Not to be confused with the Master of Laws (LL.M.), an advanced postgraduate academic degree in law pursued by those who already hold an undergraduate academic law degree, a professional law degree, or an undergraduate degree in a related subject; in most jurisdictions, the LL.M. is the advanced professional degree for those usually already admitted into legal practice.)
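
As a usage illustration (not part of the definition), here is a minimal sketch of running a small pretrained language model for text generation with the Hugging Face transformers library, assuming the library is installed and the gpt2 weights can be downloaded:

    from transformers import pipeline

    # Downloads a small pretrained language model on first use.
    generator = pipeline("text-generation", model="gpt2")
    out = generator("Large language models are", max_new_tokens=20)
    print(out[0]["generated_text"])  # the prompt plus the model's continuation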

ML — Machine Learning

Machine learning is a subfield of artificial intelligence that involves the use of algorithms and statistical models to enable computer systems to learn and adapt without being explicitly programmed. In other words, machine learning is a type of artificial intelligence that allows machines to learn from data and improve their performance over time. It is used in a wide range of applications, including image recognition, natural language processing, and predictive analytics.

According to the Oxford Dictionary, machine learning is “the use and development of computer systems that are able to learn and adapt without following explicit instructions, by using algorithms and statistical models to analyze and draw inferences from patterns in data”. Machine learning is often used interchangeably with artificial intelligence, but the two terms are distinct. While artificial intelligence refers to the general attempt to create machines capable of human-like cognitive abilities, machine learning specifically refers to the use of algorithms and data sets to do so.

MLOps — Machine Learning Operations

MLOps, or Machine Learning Operations, is a paradigm that aims to deploy and maintain machine learning models in production reliably and efficiently. It is the set of practices at the intersection of machine learning, DevOps, and data engineering: an engineering practice that leverages those three contributing disciplines. MLOps seeks to increase automation and improve the quality of production models, while also focusing on business and regulatory requirements. It applies to the entire lifecycle, from integrating with model generation (software development lifecycle, continuous integration / continuous delivery), orchestration, and deployment, to health, diagnostics, governance, and business metrics.

Midjourney

Midjourney AI is a generative artificial intelligence program that creates images from text descriptions. It is one of the most advanced AI art generators available, and is known for its ability to produce stunningly realistic and imaginative images. Midjourney is currently in beta testing and is not yet available to the public; however, there are a number of ways to get access to the platform, including through a waitlist or through an invitation from a current user. To use Midjourney, users simply type in a text prompt describing the image they would like to create. For example, a user might type in the prompt “a painting of a cat sitting on a windowsill, looking out at a rainy city street.” Midjourney will then generate a number of images based on the prompt, which the user can then select from or modify.

NLP — Natural Language Processing

Natural Language Processing (NLP) is a field of computer science that focuses on the interaction between computers and humans in natural language. It is a subfield of artificial intelligence that deals with the processing of human language data. NLP combines computational linguistics, machine learning, and deep learning models to enable computers to process human language in the form of text or voice data and to understand its full meaning, complete with the speaker or writer’s intent and sentiment. NLP drives computer programs that translate text from one language to another, respond to spoken commands, and summarize large volumes of text rapidly, even in real time. Some of the tasks that NLP can perform include speech recognition, part-of-speech tagging, and word sense disambiguation. NLP is used in a variety of applications such as digital assistants, chatbots, and machine translation systems.

PCA — Principal Component Analysis

Principal Component Analysis (PCA) is a statistical technique for reducing the dimensionality of a dataset. It is used to transform a large set of variables into a smaller one that still contains most of the information in the large set. PCA is accomplished by linearly transforming the data into a new coordinate system where (most of) the variation in the data can be described with fewer dimensions than the initial data. The first principal component of a set of variables, presumed to be jointly normally distributed, is the derived variable formed as a linear combination of the original variables that explains the most variance. The second principal component explains the most variance in what is left once the effect of the first component is removed, and we may proceed through iterations until all the variance is explained. PCA is most commonly used when many of the variables are highly correlated with each other and it is desirable to reduce their number to an independent set. PCA has applications in many fields such as population genetics, microbiome studies, and atmospheric science.
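
A minimal sketch of PCA in Python with NumPy, via an eigendecomposition of the covariance matrix; the data here is made up, and libraries such as scikit-learn wrap the same idea:

    import numpy as np

    rng = np.random.default_rng(0)
    X = rng.normal(size=(200, 3))
    X[:, 2] = X[:, 0] * 0.9 + rng.normal(scale=0.1, size=200)  # correlated column

    Xc = X - X.mean(axis=0)                   # center the data
    cov = np.cov(Xc, rowvar=False)            # 3x3 covariance matrix
    eigvals, eigvecs = np.linalg.eigh(cov)    # eigenvalues in ascending order
    top2 = eigvecs[:, ::-1][:, :2]            # directions of greatest variance
    reduced = Xc @ top2                       # project 3-D data down to 2-D
    print(reduced.shape)                      # (200, 2)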

Paraphrasing Tool

A paraphrasing tool is software that can rewrite or rephrase a sentence without changing its meaning. It is used to substitute specific words, phrases, sentences, or even whole paragraphs with alternate versions to create a slightly different variant. Paraphrasing tools are designed to help users avoid plagiarism and improve their writing skills by providing them with a way to rephrase text in their own words.

There are many paraphrasing tools available online, including QuillBot AI, Ref-n-Write, Check-Plagiarism, SpinBot, and Scribbr. These tools use advanced AI technology to rephrase text, essays, and articles in various styles and dialects. They offer features such as synonym replacement, sentence restructuring, and vocabulary enhancement to help users create unique content that is free of plagiarism.

Perplexity

In the context of artificial intelligence (AI), perplexity is an important measurement for determining how good a language model is at predicting the next word in a sequence given its previous words. It is used to compare probability models and may be used to evaluate the quality of a model’s predictions by evaluating the inverse probability of the test set, normalized by the number of words, or by calculating the average number of bits required to encode a single word through cross-entropy. It is used along with “burstiness” to detect AI-produced content.
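
A minimal sketch of the calculation in Python: given a model’s probability for each word in a held-out text, perplexity is the exponentiated average negative log-probability (the probabilities below are made up):

    import math

    # Model-assigned probability of each successive word in a test text.
    probs = [0.25, 0.10, 0.50, 0.05]

    # Perplexity = exp of the average negative log-probability per word.
    ppl = math.exp(-sum(math.log(p) for p in probs) / len(probs))
    print(ppl)  # about 6.32; lower means the model predicts the text better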

Prompt

In the context of natural language processing, a prompt is the text used to initiate a conversation with an AI chatbot. The prompt can be a question, a statement, or any other text that starts the conversation. The specificity of prompts can produce a wide variety of results, and writing complex prompts is a valuable skill when working with natural language model AI chatbots.

Quantization

Quantization is the process of reducing the storage precision of the parameters (weights) of an LLM in order to downsize the model and save memory, the main bottleneck of LLMs. There are different methods, but the concept is always the same: compressing the values of the weights into smaller bit sizes so they take up less space and memory. See Microsoft Paper (here).
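
A minimal sketch of 8-bit affine quantization of one weight tensor in Python with NumPy; real schemes (per-channel scales, 4-bit formats, etc.) are more elaborate:

    import numpy as np

    w = np.random.randn(4, 4).astype(np.float32)   # 32-bit weights

    # Map the float range onto 256 integer levels (one scale per tensor).
    scale = (w.max() - w.min()) / 255.0
    zero_point = w.min()
    q = np.round((w - zero_point) / scale).astype(np.uint8)   # stored in 1 byte

    w_restored = q.astype(np.float32) * scale + zero_point    # dequantize
    print(np.abs(w - w_restored).max())  # small rounding error, 4x less memory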

RF — Random Forest

Random Forest (RF) is a machine learning algorithm that is used for classification and regression analysis. It is an ensemble learning method that combines multiple decision trees to improve the accuracy and robustness of the model. Each decision tree in the forest is trained on a random subset of the training data and a random subset of the features. The output of the RF model is the average (in regression) or majority vote (in classification) of the outputs of the individual trees. RF has been applied to various fields such as image classification, text classification, and bioinformatics.
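
A minimal sketch using scikit-learn with a built-in toy dataset (assuming scikit-learn is installed):

    from sklearn.datasets import load_iris
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.model_selection import train_test_split

    X, y = load_iris(return_X_y=True)
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

    # 100 decision trees, each trained on a random subset of rows and features;
    # the forest's prediction is their majority vote.
    clf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X_tr, y_tr)
    print(clf.score(X_te, y_te))  # test-set accuracy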

RLHF — Reinforcement Learning from Human Feedback

Reinforcement learning from human feedback (RLHF) is a technique that trains an AI agent to make decisions by receiving feedback from humans. It is a subfield of artificial intelligence that combines the power of human guidance with machine learning algorithms. RLHF involves training a “reward model” directly from human feedback and using that model as a reward function to optimize an agent’s policy using reinforcement learning. RLHF has been used in various applications such as chatbots, recommendation systems, and game playing. It has shown promising results in improving the performance of AI agents by incorporating human feedback into the training process. However, there are still many challenges to overcome, such as designing effective reward functions and ensuring that the feedback provided by humans is accurate and consistent.
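
A minimal sketch, in PyTorch, of just the reward-modeling step under simplifying assumptions: the response “features” below are random stand-ins, and each training pair holds a human-preferred response and a rejected one. The model is pushed to score preferred responses higher; the learned scorer would then serve as the reward function for reinforcement learning:

    import torch
    import torch.nn as nn

    reward_model = nn.Sequential(nn.Linear(8, 32), nn.ReLU(), nn.Linear(32, 1))
    opt = torch.optim.Adam(reward_model.parameters(), lr=1e-3)

    # Stand-in features for (human-preferred, rejected) response pairs.
    preferred = torch.randn(64, 8) + 0.5
    rejected = torch.randn(64, 8) - 0.5

    for _ in range(100):
        # Pairwise preference loss: push preferred scores above rejected ones.
        margin = reward_model(preferred) - reward_model(rejected)
        loss = -nn.functional.logsigmoid(margin).mean()
        opt.zero_grad(); loss.backward(); opt.step()

    print(loss.item())  # shrinks as the model learns the human preference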

RNN — Recurrent Neural Network

A Recurrent Neural Network (RNN) is a type of artificial neural network that is commonly used in natural language processing, speech recognition, and other sequence-based tasks. RNNs are designed to process sequential data by maintaining an internal state, or memory, that allows them to capture temporal dependencies between inputs. They are composed of a series of interconnected processing nodes arranged in a directed cycle. Each node receives input from the previous node and produces output that is fed to the next node in the sequence. The output of each node is also fed back into the network as input at the next time step, allowing the network to maintain a memory of previous inputs. RNNs can be trained using “backpropagation through time” (BPTT), an extension of backpropagation that allows gradients to flow through the network over multiple time steps.
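
A minimal sketch of one recurrent step in Python with NumPy: the hidden state carries memory from previous inputs forward through the sequence (the weights are random stand-ins, not trained):

    import numpy as np

    rng = np.random.default_rng(0)
    W_x = rng.normal(scale=0.1, size=(4, 3))   # input-to-hidden weights
    W_h = rng.normal(scale=0.1, size=(4, 4))   # hidden-to-hidden weights
    b = np.zeros(4)

    h = np.zeros(4)                            # internal state ("memory")
    for x in [np.array([1., 0., 0.]), np.array([0., 1., 0.]), np.array([0., 0., 1.])]:
        # The new state depends on the current input AND the previous state.
        h = np.tanh(W_x @ x + W_h @ h + b)
    print(h)  # the final state summarizes the whole sequence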

SVM — Support Vector Machine

A Support Vector Machine (SVM) is a type of machine learning algorithm that is used for classification and regression analysis. SVMs are based on the concept of finding the best hyperplane that separates the data into different classes. The hyperplane is chosen such that it maximizes the margin between the two classes. SVMs can be used for both linear and non-linear classification tasks by using different kernel functions. SVMs have been applied to various fields such as image classification, text classification, and bioinformatics.
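
A minimal sketch with scikit-learn, using a non-linear (RBF) kernel on a toy dataset (assuming scikit-learn is installed):

    from sklearn.datasets import make_moons
    from sklearn.svm import SVC

    X, y = make_moons(noise=0.1, random_state=0)   # two interleaved classes

    # The RBF kernel lets the SVM find a non-linear separating boundary
    # that maximizes the margin between the two classes.
    clf = SVC(kernel="rbf", C=1.0).fit(X, y)
    print(clf.score(X, y))  # training accuracy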

Turing Test

The Turing Test is a test of a machine’s ability to exhibit intelligent behavior equivalent to, or indistinguishable from, that of a human. It was proposed by Alan Turing in 1950 and is named after him. The test involves a human evaluator who judges natural language conversations between a human and a machine designed to generate human-like responses. If the evaluator cannot reliably tell the machine from the human, the machine is said to have passed the test. The test results do not depend on the machine’s ability to give correct answers to questions, only on how closely its answers resemble those a human would give.