The Fuss about Quantum Convolutional Neural Networks
Written by Charles Yuan. A discussion of the paper “Quantum convolutional neural network for classical data classification”.
Introduction
“First I will take the skies from the human race! Military aircraft, civilian planes, space capsules, rockets, ICBMs. On my order, every country in the world will become nailed to the ground. Humanity will lose its best form of transportation, and the skies will be thrust a hundred years into the past!” [1] A bold proclamation of world peace by none other than a crazed, white-haired arms dealer. Incidentally, it also happened to be my first taste of the potential of quantum computers. Now, I’m sure we’ve all heard of quantum computers before, so much so that they can be placed in the bucket of over-hyped concepts along with artificial intelligence and nuclear fusion. Science fiction also misconstrues and over-amplifies their abilities, imagining machines that effortlessly break all encryption and plunge the world into anarchy, handing the reins to whoever owns the quantum computer. Unfortunately, real life isn’t nearly as interesting. Given the current limitations of quantum computers and the existence of quantum and post-quantum encryption, no world-conquering will be possible with quantum computers alone, for better or for worse. However, that’s not to say we can’t have some fun with them in the meantime. Ever heard of quantum neural networks?
Basics of Quantum Computing
For those who’ve heard of quantum computers, you’ve probably also heard of something called qubits. Whereas conventional computers use bits (1’s and 0’s), quantum computers use qubits as their basic units of computation. A qubit can exist in a superposition of the states |0⟩ and |1⟩, so a single-qubit state is a normalized two-dimensional complex vector |ψ⟩ = α|0⟩ + β|1⟩, where |α|² + |β|² = 1 [2]. A multi-qubit system can then be represented as a tensor product of n single qubits, which exists as a superposition of 2ⁿ basis states from |00…0⟩ to |11…1⟩ [2]. Quantum entanglement is a correlation between the qubits in such a system, and quantum gates arranged in a quantum circuit manipulate these superposed, entangled states to perform computations [2]. Quantum gates are unitary operators that map one qubit state to another [2]. We’ll discuss the specifics in the next section.
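To make the math concrete, here’s a tiny NumPy sketch (my own illustration, not from the paper) of a normalized single-qubit state and the tensor product that builds a two-qubit system:

```python
import numpy as np

# A single-qubit state |psi> = alpha|0> + beta|1> with |alpha|^2 + |beta|^2 = 1;
# here alpha = beta = 1/sqrt(2), an equal superposition.
alpha = beta = 1 / np.sqrt(2)
psi = np.array([alpha, beta], dtype=complex)
print(np.isclose(np.sum(np.abs(psi) ** 2), 1.0))  # True: the state is normalized

# A two-qubit system is the tensor (Kronecker) product of single-qubit states,
# a superposition over 2^2 = 4 basis states |00>, |01>, |10>, |11>.
two_qubit = np.kron(psi, psi)
print(two_qubit)  # four equal amplitudes of 0.5
```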
Confused? Well, all you really need to know is that qubits give quantum computers a system on which quantum gates can perform numerical tasks. These gates leverage the principles of quantum superposition and entanglement, and for certain problems, such as prime factorization via Shor’s algorithm, they promise enormous speedups over the best known classical algorithms. Prime factorization matters because much of modern public-key cryptography (such as RSA) relies on the fact that multiplying two large primes is easy, while recovering those primes from their product is computationally hard. Remember when I said that multi-qubit systems represent a superposition of 2ⁿ basis states? A regular computer can only be in ONE of the 2ⁿ states, represented by a sequence of 1’s and 0’s. A quantum computer, however, can hold a superposition over ALL 2ⁿ states at once, and entanglement between the qubits is what makes that superposition more powerful than any classical shortcut. Thus, quantum computers can, in principle, pull off crazy feats, like putting the achievement of world peace in the hands of an arms dealer.
Variational Quantum Circuits and QNNs
So how does this all relate to neural networks and machine learning? So far, it seems like all I’ve mentioned is a bunch of complicated math and quantum physics without any context for how it gets used. Well, quantum computing is a research field of its own, and anyone who wants to develop quantum neural networks will need to understand much more than what I’ve explained. Concepts such as entanglement and superposition come from quantum physics, a famously difficult subject in its own right. If you wish to learn more about these topics and the math behind them, I suggest some independent reading. For brevity and clarity, though, I’ll keep things simple from here on out.
The fundamental algorithm behind quantum neural networks is the variational quantum algorithm (VQA) [2]. A VQA runs on quantum circuits known as variational quantum circuits (VQCs). Quantum circuits, by the way, are simply sequences of quantum gates used for quantum computing, analogous to the logic circuits used in classical computation. VQCs are built from rotation operator gates, which are unitary operators with tunable angles [2]. Without delving into the specifics of unitary operators, the key point is that VQCs, like neural networks, possess free parameters. VQAs were originally developed for universal function approximation, which, coincidentally, is exactly what neural networks are used for. The free parameters of the rotation gates can therefore be trained to perform numerical tasks such as approximation, optimization, and classification [2]. Sound familiar?
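As a concrete (and entirely illustrative) example, here’s a minimal two-qubit VQC sketched in Qiskit; the circuit layout, parameter names, and angle values are my own, not taken from the paper:

```python
from qiskit import QuantumCircuit
from qiskit.circuit import Parameter
from qiskit.quantum_info import Statevector

# Two trainable free parameters, analogous to the weights of a neural network.
theta1, theta2 = Parameter("θ1"), Parameter("θ2")

vqc = QuantumCircuit(2)
vqc.ry(theta1, 0)  # parameterized rotation gate on qubit 0
vqc.ry(theta2, 1)  # parameterized rotation gate on qubit 1
vqc.cx(0, 1)       # entangling gate

# Bind trial values; a classical optimizer would iteratively update these.
bound = vqc.assign_parameters({theta1: 0.3, theta2: 1.2})
state = Statevector.from_instruction(bound)
print(state.probabilities())  # output distribution over |00>, |01>, |10>, |11>
```

In a full VQA, a classical optimizer repeatedly adjusts θ1 and θ2 to minimize a loss computed from measurement outcomes, just as gradient descent adjusts a neural network’s weights.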
This led to the application of VQAs in machine learning: the artificial-neural-network component of an existing model is replaced with a VQC [2]. The result is a quantum neural network, or QNN. However, there is one last thing to discuss before the model itself. Since we are feeding classical data into a quantum neural network, we must first encode the input correctly.
Quantum Data Encoding
Existing machine learning techniques already transform input data X using feature maps, φ : X → X′, which often makes the data easier to work with. In quantum computing, a quantum feature map transforms input data into a Hilbert space, φ : X → H [2]. Without delving too deeply into it, just know that a Hilbert space is the kind of vector space whose vectors represent the states of a physical system in quantum mechanics [4]. This matters for quantum computing because our data will now be represented by vectors in that Hilbert space.
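To make this concrete, here’s a sketch of one simple quantum feature map, angle (qubit) encoding, assuming Qiskit; the input values are made up for illustration:

```python
from qiskit import QuantumCircuit

x = [0.4, 1.1, 2.5]  # hypothetical classical input features

# Each classical feature sets the rotation angle of one qubit, mapping x to a
# state |ψ(x)⟩ in a 2³-dimensional Hilbert space: φ : X → H.
feature_map = QuantumCircuit(len(x))
for qubit, value in enumerate(x):
    feature_map.ry(value, qubit)
```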
So how exactly do we implement this quantum feature map? There are actually four techniques: amplitude encoding, qubit encoding, dense qubit encoding, and hybrid encoding [3]. However, for the sake of simplicity (and keeping your mental states intact), I’ll only briefly describe the first one. Amplitude encoding stores the input data in the probability amplitudes of a quantum state. Thanks to the previously discussed principle of quantum superposition, an n-dimensional input fits into the amplitudes of just log₂(n) qubits, so the qubit count, and with it the size of the QNN, scales logarithmically rather than linearly with the input [3]. For those familiar with big-O notation, O(log n) grows far more slowly than O(n), which is what makes amplitude encoding so powerful. Essentially, amplitude encoding allows exponentially larger inputs to be processed without dramatically scaling up the network. This is part of what makes quantum neural networks so promising.
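Here’s a minimal NumPy sketch of the idea (my own illustration): four input values become the amplitudes of a two-qubit state, since log₂(4) = 2.

```python
import numpy as np

x = np.array([0.2, 0.5, 0.1, 0.7])  # hypothetical 4-dimensional input

# Normalize so the squared amplitudes sum to 1, as a valid quantum state requires.
amplitudes = x / np.linalg.norm(x)

# amplitudes[i] is the coefficient of basis state |i⟩, from |00⟩ to |11⟩;
# a 1024-dimensional input would need only log₂(1024) = 10 qubits this way.
print(np.sum(amplitudes ** 2))  # 1.0
```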
Quantum Convolutional Neural Networks
So finally we arrive at the main topic: quantum convolutional neural networks. Those familiar with conventional CNNs will know that they are composed of three types of layers: convolutional, pooling, and fully connected. In a QCNN, the convolutional and pooling layers are realized by the aforementioned parameterized VQCs, and there is no need for a quantum equivalent of the fully connected layer, because the QCNN directly outputs a probability from the final measured state [3]. These “quantum layers” are known as ansatz, but you can think of them as parameterized quantum-circuit templates [3]. For both the convolutional and pooling circuits, there are a variety of two-qubit quantum circuits to choose from. For the convolutional circuits, the authors of the paper, Hur et al., tested nine different candidates, illustrated in the original paper.
Drawing primarily upon the work of a separate paper by Sim et al., the authors adapted four-qubit parameterized quantum circuits to operate on two qubits [3][5]. They selected these circuits for their entangling capability, simplicity of construction, and functional similarity to conventional convolutional layers [3]. For the pooling layers, a simple two-qubit circuit with two free parameters was chosen [3]. The pooling layer applies two controlled rotations, Rz(θ₁) and Rx(θ₂), to perform the quantum equivalent of dimensionality reduction, discarding one qubit of each pair. Finally, the output of the network is the probability of measuring the last remaining qubit in the |1⟩ state.
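Putting the pieces together, here’s a hedged Qiskit sketch of one convolution-plus-pooling step. The convolutional block is merely in the spirit of the circuits tested, not a reproduction of any of the authors’ nine ansatz, and the parameter names are my own:

```python
from qiskit import QuantumCircuit
from qiskit.circuit import Parameter

t1, t2 = Parameter("t1"), Parameter("t2")  # convolution angles
p1, p2 = Parameter("p1"), Parameter("p2")  # pooling angles θ1, θ2

# An illustrative two-qubit convolutional block: rotations plus an entangling gate.
conv = QuantumCircuit(2, name="conv")
conv.ry(t1, 0)
conv.ry(t2, 1)
conv.cx(0, 1)

# The two-parameter pooling block: controlled rotations from the qubit being
# discarded onto the qubit that survives to the next layer.
pool = QuantumCircuit(2, name="pool")
pool.crz(p1, 0, 1)  # controlled-Rz(θ1)
pool.crx(p2, 0, 1)  # controlled-Rx(θ2)

# One QCNN "layer" on two qubits: convolve, then pool qubit 0 into qubit 1.
qcnn = QuantumCircuit(2)
qcnn.compose(conv, qubits=[0, 1], inplace=True)
qcnn.compose(pool, qubits=[0, 1], inplace=True)
qcnn.measure_all()  # P(surviving qubit = |1⟩) is read off as the class probability
```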
Results
So how do QCNNs fare when actually tested on benchmark datasets? Using MNIST and Fashion-MNIST, the authors compared the performance of the nine convolutional ansatz against each other, and then pitted the best performers against conventional CNNs [3]. The datasets consist of 28x28-pixel images, and the task was binary classification (hence the choice of two-qubit circuits) [3]. Training used a batch size of 25, a learning rate of 0.01, and 200 iterations, chosen for simplicity and speed [3]. The results for the different ansatz are tabulated in the paper.
The results reveal that all the ansatz perform reasonably well, with those possessing more free parameters achieving higher scores. This is all well and good, but the most important question is: how do they compare to conventional CNNs? Do they perform better or worse? To answer that, the authors constructed four different CNNs with 26, 34, 44, and 56 free parameters, roughly matching the 40–50 free parameters of the QCNN ansatz [3]. Tested on input sizes of 8 and 16, the QCNN models actually performed equal to or better than their CNN counterparts on both MNIST and Fashion-MNIST [3].
Conclusion
So then, you might be wondering, why aren’t quantum neural networks more widespread? Why do we still use conventional algorithms when quantum machine learning holds so much promise? Well, for one, not everyone has access to a quantum computer, and you can’t simply purchase one at your local Best Buy. Although quantum programming has become more democratized through open-source libraries like Qiskit and Cirq, access to cutting-edge quantum hardware is still out of reach for the general public. Another key detail is the limited size of the CNNs used in the authors’ experiments: a few dozen free parameters each. Compare that to conventional neural networks, which routinely possess millions of trainable parameters, and you can understand why the CNNs’ performance was unremarkable; the comparison says little about full-scale networks. Lastly, quantum computing is hard. Anyone with a basic knowledge of linear algebra, statistics, and partial derivatives can implement a neural network from scratch, but quantum computing is built on quantum physics, which is a whole different ballgame. There are numerous concepts from quantum physics, quantum computing, and systems analysis that I did not cover, and quite frankly, they would take entire textbooks to teach properly.
So then, should we simply give up on quantum neural networks? Of course not! Quantum machine learning is still a young field of research, limited by today’s quantum computing hardware. In the future, when IBM or Google provide users with access to a 1000-qubit quantum computer, who knows what kinds of quantum neural networks could be developed? Until then, plenty of work remains in both machine learning and quantum computing without needing to merge the two. Nevertheless, the next time you wanna seem smart in front of your friends, just explain the basics of QCNNs and watch as their minds shut down.