This post aims to give an insight into the basics of quantum computing (QC) to anyone interested. The focus is primarily on presenting the most important points in a clear and comprehensible way without much prior knowledge. Even though the sections build on each other, the topic markers at the beginning of each one should help with looking up a particular concept. Afterwards, a short review of import topics in artificial intelligence (AI) and machine learning (ML) is presented. This is done in order to finally take a closer look at the intersection of classical AI and QC: Quantum Machine Learning. Also, other promising applications of QC will be discussed.

This post will also be expanded in the future to look at certain concepts and techniques for quantum computing in further detail. The necessary mathematical background for that part (in terms of vector calculus, probability theory and complex algebra) is assumed to some extent. These updates are designed for people who are particularly interested in the topic or who start working in this field in hope of reducing potentially long research time a bit.

Quantum Computing

classical and quantum-mechanical light switch

A classical bit can be in the state 0 or 1 at any given time. Contrary, a qubit can be in a superposition of both states which allows for a simultaneous calculation with both values.

Classical Bit vs. Qubit

Classical computers work with bits. Any information displayed on monitors is translated into 0’s and 1’s by the computer. Figuratively speaking, the values 0 and 1 could be interpreted as “lights off” and “lights on”, respectively. At any point in time the light is in one of these two states. A qubit, on the other hand, refers to a quantum mechanical bit. It also has the attributes 0 and 1 but they are written in the quantum mechanical notion as |0〉 and |1〉. This is referred to as the braket- or Dirac-notation and is commonly used when dealing with quantum states. And this is exactly what is present: A qubit is a quantum-mechanical system with 2 states. What exactly the advantages of such a system are will be discussed in the following sections.


Superposition

In quantum physics there are principles which we do not know in our classical world. For example, some readers may be familiar with the thought experiment of Schrödinger’s cat, in which a presumably dead cat in a box is both dead and alive as long as no one looks into it. Without diving deeper into this thought experiment, we will look at the underlying phenomenon: Superposition. Superposition is the property of any quantum mechanical system that its possible states all overlap until the system is being observed or measured. This is linked to one of the fundamental statements of quantum mechanics that the state of a system is determined by the measurement itself. For lack of a better term, one could say that before measuring it a quantum mechanical system is in all possible states simultaneously.

The ability to generate a superposition is, along with entanglement (see below), the most important aspect when talking about the potential of quantum computing. In fact, a qubit can be manipulated to exist in the two states |0〉 and |1〉 simultaneously. When performing a calculation with such a qubit on a quantum computer, the two states are processed in parallel and both results, based on the input |0〉 or |1〉, are obtained (encoded as a state on the quantum computer). This phenomenon is called quantum parallelism and becomes particularly powerful when looking at a system with multiple qubits.

Superposition in Quantum Registers

Looking at a system consisting of two qubits, a superposition can be generated just as good as with one qubit. Such a quantum register, as a system of multiple qubits is called, can thus exist simultaneously in the four states |00〉, |01〉, |10〉 and |11〉. In fact, with an additional qubit all previously possible states can be generated in combination with the extra |0〉 and |1〉, respectively. Consequently, the number of possible states doubles with every qubit that gets added to the register. Thus, the growth of possible states  is exponential and the meaning of this is revealed for larger registers. As a small example: On earth there are approximately 7.8 billion people. In order to build a quantum computer which is able to process the same number of states, only 33 qubits are necessary. This capacity is why quantum computing is considered a breakthrough technology.

When transitioning from a classical bit to a qubit, an entire state space is ‘inflated’ in the form of a sphere. Besides the pure  and  there is an infinite amount of possible states on the surface  of the Bloch sphere.

Bloch Sphere

But what exactly should you think of when talking or reading about qubits? As mentioned in the beginning, they are quantum-mechanical systems consisting of two states but their technical realization depend tremendously on the chosen architecture (see e.g. here for a comparison of different architectures). For example, companies like Google or IBM work with qubits based on superconductivity. Another approach, among others, is used by IonQ which uses charged atoms (ions) in combinations with lasers to realize qubits. However, qubits can always, independent of their individual architecture and thus their actual realization, be visualized with the help of the Bloch sphere.

With quantum-mechanical bits there is a much larger space of possible states compared to a classical bit. One can imagine that during the transition to a qubit between the two endpoints 0 and 1 a whole state space is spanned, which then exists in the form of a sphere. A qubit can then take all possible states on the surface of this Bloch sphere. Also, the following is worth mentioning: The closer the point on surface is to |0⟩, i.e. the north pole, the more likely it is to get a 0 as a result when measuring the qubit. The same is true for the |1⟩ state, or the south pole. Different points on the surface of the Bloch sphere correspond to different probabilities for the measurement of a 0 or a 1. It is still interesting that on the same “latitude” of the Bloch sphere the probability for the two states remains the same. Here, only the mathematical description of the state changes, which is quite important for quantum algorithms. If, on the other hand, the state of the qubit is exactly on one of the poles, the probability for measuring 0, or 1 respectively, is 100%.

Measurement

But what exactly does it mean to “measure” a qubit? As described at the beginning, quantum mechanical systems are in all possible states until they are observed or measured. Until a measurement is performed, a quantum system is, so to speak, a “black box” (a closed system where the insides are unknown). Even if the state of the quantum system is not known, it can still be influenced from the outside without losing destroying the superposition. While there is no classical analogy for superposition, it is not difficult to find an example of a black box in that regard: Even if one does not know the contents of a cooking pot, one can still heat it on a stove or add spices to influence the contents. If one also has the information about the insides of the pot at the beginning, one can imagine (with sufficient prior culinary knowledge) what happened while cooking it. By looking into the pot and tasting its contents, certainty is achieved.

This brings us to one of the most important statements of this post: A computation on a quantum computer is nothing more than the controlled modification of probabilities. As a simplified picture, one can imagine that an operation on a quantum computer shifts the points on the Bloch spheres of several qubits, thus changing the probabilities of measuring 0 or 1 for each qubit. Like this, one generates different probabilities for different states of the register, namely, a probability distribution. A single measurement, whose technical implementation depends strongly on the chosen qubit architecture, then resembles a random experiment. Because of the importance of this term in connection with quantum computing, it shall be considered again in more detail by means of a small example:

When rolling two normal dice, a random experiment is performed. In this case, the most probable result is a 7, since most value combinations of the dice result in this value (1 and 6, 2 and 5, 3 and 4). Rolling a 2, on the other hand, is very unlikely, since both dice must show a 1. It can be calculated that the probability of rolling a 7 with two dice is about 17%, while the probability of rolling a 2 is about 3%. So when rolling two dice 100 times and writing down the results, one expects  a 7 in about 17 cases, whereas a 2 should occur in about 3 rolls. Similar to a quantum computer, rolling the dice (measuring) once does not give any information about the probability distribution of the numbers on the dice (states). However, if one performs the same random experiment more often, the frequency of the results behaves according to the probability distribution.

Even if entangled qubits are locally separated from each other, they form a closed quantum system (indicated by the connecting line), which instantly gets affected as a whole when, for example, one qubit is measured

Entanglement

In addition to superposition another quantum mechanical phenomenon is important when working with quantum computers. Quantum mechanical systems can be influenced in such a way that the individual states of different qubits are unknown but depend on each other when being measured. This phenomenon, which, for instance, plays an important role in the quantum teleportation protocol, is called entanglement and can also be explained using a simplified small example:

Light particles (photons) have a property called “polarization” which essentially describes the direction of rotation of the particle. Photons can be either left- or right-handed in terms of polarization. If one creates a pair of photons from a common source, this can be done in such a way that the polarization must be exactly opposite to each other. Thus, if one measures that one of the photons is left-handed, one automatically knows that the other photon must be right-handed. Here it is important that both photons were in a superposition of left- and right-handedness before the measurement. Namely, there were no hidden or inaccessible information during the creation of the photon pair which would indicate the properties when measuring them (for the interested reader the keyword is hidden variables, see e.g. here for a video). In fact, neither photons property is fixed until the other one is measured.

For quantum computing, however, the ability to entangle two qubits is not just a nice feature, but essential for unlocking its true potential. As mentioned before, the power of quantum computing lies in the fact that with just a few qubits, one can span an exponentially large computational space. The only problem here is that without entanglement, not the whole space is accessible for computation. A quantum computer should be able to generate a superposition of all possible combinations of states of the qubits, but this is only possible by allowing entanglement.

Current Situation

After this introduction to the basic concepts of quantum computing, a few more words about the current state of the art: Considering that today’s largest quantum computers operate with around 100 qubits, it is natural to ask why, besides Google’s “Quantum Supremacy” experiment in 2019 (demonstrating a task that a quantum computer could solve much faster than a classical computer) with 53 qubits, there have not been many more media breakthroughs in the field. The quick answer is that this quantum superiority was demonstrated for a very specific task that was ideal for today’s quantum computers. This is because today’s quantum computers are still very error-prone and can only be operated stably for short periods of time. The good answer, however, is that technological advances and steady increases in the number of usable qubits will ensure that most likely in the very near future there will be relevant problems which a quantum computer can solve faster and/or better than any classical computer.

Artificial Intelligence / Machine Learning

Nowadays, when talking about current AI applications, what is meant are always forms of weak AI. Contrary, strong AI refers to an artificial intelligence which excels in multiple disciplines like complex logical thinking-patterns or human-like communication abilities. Right now it only appears in sci-fi movies and cannot yet be realized. A weak AI, however, can handle tasks in a very limited scope. This does not mean that the tasks can be solved worse by a weak AI than by a human. In most cases, AI works even much better than a human.

By limited scope, we simply mean that while weak AI can be constructed for a variety of tasks, conceptionally different tasks also have to be approached differently. A voice assistant, such as the ones preinstalled on many of our smartphones, can recognize what is said very well within certain limits, but it could not easily be extended to recognize obstacles for autonomous driving.

The capabilities of the AIs presented above are based on individual models that generate an output based on a corresponding input. For example, a model could be designed to recognize a cat on an input image. This task is very simple for us humans in the vast majority of cases. The AI model, on the other hand, could receive as input the specifications of the colors (RGB values) of the individual pixels as a long list of numbers. This input could then be used to generate a single number between 0 and 1 that indicates how likely the model thinks that a cat can be seen in the image.

Simplified example: The individual pixels of an image can be translated into gray levels between 0 (black) and 1 (white) as numbers for a model that was developed, for example, to recognize cats in images.

Machine Learning Methods

The construction of such a model is the central topic in Machine Learning (ML). Therein, models can be divided into different categories based on their general learning scheme. In a supervised learning model, for example, there is an intensive learning process in which the model (to stay with the example) is shown photos with and without cats with the according information. In this way, the model learns to recognize features of cats in pictures, and then, after the learning process is complete, it is able to evaluate whether a cat can be seen in new, unfamiliar pictures. The term “supervised” learning comes from the fact that the model receives already labeled data, i.e. data that has been provided by another entity (e.g. a human) with information about its content.

An unsupervised learning model, on the other hand, tries to recognize structures or patterns in given data sets without providing any additional labels as input. For example, such a model may be shown a large number of images of human faces during a training process. Based on these images, the model is supposed to learn what human faces look like and how certain features in the image depend on each other in order to be able to generate realistic images of faces by itself. Such artificially generated images can be seen on the site thispersondoesnotexist.com and impressively demonstrates the potential of machine learning.

Illustration of very simple tasks for a super- and unsupervised learning model: Fort he separation oft wo distinct groups (called classes) a supervised learning model (left) needs sample data points, whose group affiliation is specified. This allows the model to find a dividing line (dashed) and also to classify new, unknown data into the groups (classification task). A task for an unsupervised learning model could be to find accumulation points within data sets and to divide them (based on criteria like maximum distance between different points) into groups (clustering task).

Another approach is reinforcement learning, which is similar to an efficient trial-and-error approach. Within such models there is a so-called agent, which can perform different actions in given situations. Based on an implemented reward function, a sequence of actions can then be evaluated as either good or bad, whereupon the agent adjusts his sequence of actions. By trying many different possibilities, the reward should be maximized and thus an optimal course of action found. Reinforcement learning models can be used in robotics, for example, to teach a robot to stand or walk.

Reinforcement learning is often used in game situations. For example, the agent (Mario) is asked to perform a sequence of actions (moving, jumping) in order to maximize the reward (distance traveled and/or time survived). (Image from towardsdatascience.com)

Deep Learning

With the advancements made in technological development in recent years, another field in Machine Learning has become much more present, which is called Deep Learning. Deep Learning is based on a special architecture of the model, which is called an (artificial) neural network. The name is inspired by the biochemical processes of the brain, in which neurons are interconnected and can communicate with each other through these connections. Depending on how much communication takes place between two neurons, the stronger the connection between them. Based on the communication or an input signal, neurons can also be activated. The higher the activation of a neuron, the more important its role in the transmission of the data involved.

Mathematically, there is nothing more behind neural networks than a large set of numbers arranged in matrices (weights between neurons) and vectors (activations). Due to the extremely high degree of complexity of today’s deep learning models, the matrices and vectors involved are accordingly high-dimensional, which just means that very a huge amount of weights and activations contribute to the model. And this is exactly where quantum computers come into play.

Graphical representation an imaginary neural network. Circles correspond to neurons whose activation is represented by brightness. Lines represent the strength of the connection between neurons, with both “positive”/blue, and “negative”/red connections allowed. The strength of the connection is represented by the thickness of the line.

Quantum Machine Learning and other potentials of QC

When talking about the potentials of quantum computers, there are several fields to be mentioned: In chemistry, the simulation of molecules is a great challenge, since the description of the atoms and electrons involved must be a quantum mechanical one. Classical simulations quickly reach their limits here, since the corresponding properties are difficult to describe even for smaller molecules due to the enormously high number of possible states. The description of such effects are much easier for a quantum computer, since qubits follow the identical physical laws. Therefore, it is also expected that the first industrial successes of quantum computers will be achieved in the field of chemistry.

3D-model of a methylamine molecule (structural formula CH3NH2). The dark gray sphere represents a carbon atom (C), the blue sphere represents a nitrogen atom (N) and the light gray spheres represent hydrogen atoms (H). Created with Avogadro.

Illustration of a small network. Lines between points could correspond to a length or time of the route. When including more nodes, finding an optimal route between two points becomes much more complex.

Since quantum computers are able to process an exponentially large number of states due to superposition, successes in combinatorial optimization are also expected. This topic area deals, for example, with questions of how to find an optimal route to a given destination or in what order to send deliveries to different addresses. In general, solutions are sought to problems that require a finite (but mostly large) set of points and connections between them to describe them. For such problems there is an exponentially large number of solutions, which scales badly when processed with a classical computer. This is due to the fact that a large number of new possibilities/routes/etc. are added when just one more point with associated connections is added to the problem.

A third field where quantum computing is expected to have a major impact in the future is machine learning. As described in the section on the introduction of quantum computing, quantum computers inherently operate, with a sufficiently large number of qubits,  in high-dimensional spaces which are generated exploiting superposition and entanglement. As was also mentioned in the last section, neural networks, on which a lot of today’s modern AI applications are based, also operate with matrices and vectors in high-dimensional spaces. However, with classical computing resources, the scaling of these spaces is limited. Thus, the use of quantum computing allows for more complex models requiring fewer resources than a classical approach. Furthermore, another point is that the probabilistic nature of quantum systems can be advantageous for certain machine learning methods. Here, probabilistic means that a prediction is made with a certain probability and is not absolutely predetermined (i.e., deterministic). Taking for example the field of reinforcement learning, one would mostly like to achieve a certain variance in the executed actions, which could very naturally be achieved by a quantum computer. To what extent this aspect can actually be exploited, remains largely unexplored.

Nevertheless, it can be assumed that for the foreseeable future quantum computers will work in tandem with classical computers: Calculations of classical machines will be supported or improved by using quantum computers. This is referred to as hybrid quantum computing when classical and quantum mechanical computers are allowed to work closely together. This is also how the near future looks in the case of quantum machine learning. What this term actually refers to most of the time is quantum-enhanced machine learning. This means that certain subroutines of a machine learning model are to be evaluated on a quantum computer, since either the calculations are to be executed more efficiently or the results obtained are to be qualitatively better.

Hybrid quantum computing scheme. The measurement results are processed using a classical computer. Depending on the task and previous results, the classical computer creates input for the quantum hardware.

Apart from these supporting ones, a special approach to quantum machine learning shall be discussed here, which is in focus due to its currently already very good feasibility: In so-called variational models, there are classical parameters (adjustable numbers) that are given as input to the quantum computer. These parameters correspond, if we stay with the image of the Bloch sphere, to rotations of the state of a qubit on the surface of the sphere. The idea with variational models is then to classically optimize these parameters based on the obtained measurement results. So, again, you have a hybrid quantum computing model, where after the classical training process (i.e., optimizing the involved parameters), you are left with quantum hardware only. Such variational models are currently studied quite intensively at the moment in order to be able to determine possible advantages in the field of QML in the near future

Introduction to Quantum Computing,
Artificial Intelligence and Quantum Machine Learning