Researchers from Multiverse Computing have reported a quantum enhancement of a large language model on IBM hardware, using a hybrid quantum-classical scheme that runs on a 156-qubit Heron processor.

The authors described the experiment as the first "end-to-end quantum enhancement" of an LLM on a superconducting processor for autoregressive text generation.

The tests used Meta's Llama 3.1 8B. The base model was not fine-tuned: its parameters were frozen, and quantum adapters, called Cayley-parameterized unitary adapters (CUA), were added. The adapters were first trained classically and then integrated into the hybrid quantum-classical scheme.
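The article does not describe how the adapters are built. As a rough illustration only, and assuming the name refers to the standard Cayley transform, the sketch below maps a skew-Hermitian matrix A to U = (I - A)(I + A)^-1, which is always unitary. The 2x2 size, the helper names, and the values in A are hypothetical and are not taken from the researchers' work.

```python
# Cayley transform on a 2x2 complex matrix, standard library only.
# Illustrative sketch: sizes, names, and values are assumptions.

def matmul(x, y):
    """Multiply two 2x2 matrices stored as nested lists."""
    return [[sum(x[i][k] * y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def inverse(m):
    """Invert a 2x2 matrix using the closed-form adjugate formula."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def dagger(m):
    """Conjugate transpose of a 2x2 matrix."""
    return [[complex(m[j][i]).conjugate() for j in range(2)] for i in range(2)]

def cayley(a):
    """Map a skew-Hermitian matrix A to the unitary U = (I - A)(I + A)^-1."""
    eye = [[1, 0], [0, 1]]
    minus = [[eye[i][j] - a[i][j] for j in range(2)] for i in range(2)]
    plus = [[eye[i][j] + a[i][j] for j in range(2)] for i in range(2)]
    return matmul(minus, inverse(plus))

# Hypothetical trainable values; A must satisfy dagger(A) == -A.
A = [[0.3j, 0.5 + 0.2j],
     [-0.5 + 0.2j, -0.1j]]

U = cayley(A)
UU = matmul(dagger(U), U)  # should be the identity, up to rounding

for row in UU:
    print([round(abs(v), 6) for v in row])
```

The appeal of such a parameterization is that any values placed in A yield a valid unitary, so a classical optimizer can train the entries freely while the result remains expressible as a quantum operation.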

The experiment was conducted on IBM Quantum System Two, an architecture designed for hybrid quantum systems, using the 156-qubit Heron chip.

The hybrid version reduced the perplexity of Llama 3.1 8B by 1.4%. This was achieved by adding approximately 6,000 parameters, about 0.000075% of the model's parameter count.
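For readers unfamiliar with the metric, perplexity is the exponential of the average negative log-probability a model assigns to each correct next token, so lower is better. The sketch below computes it for a made-up sequence and checks the reported parameter fraction. The token probabilities are hypothetical, and the 8,000,000,000 figure is the nominal "8B" size, assumed here for the arithmetic.

```python
import math

def perplexity(token_log_probs):
    """Perplexity = exp(mean negative log-probability per token)."""
    return math.exp(-sum(token_log_probs) / len(token_log_probs))

# Hypothetical probabilities a model assigned to the correct next tokens.
base_probs = [0.25, 0.5, 0.125, 0.5]
base_ppl = perplexity([math.log(p) for p in base_probs])

# What a 1.4% relative reduction would look like on that toy value.
improved_ppl = base_ppl * (1 - 0.014)

# Reported adapter size relative to the nominal model size (assumed 8e9).
added_params = 6_000
model_params = 8_000_000_000
fraction_percent = added_params / model_params * 100

print(f"toy base perplexity:     {base_ppl:.4f}")
print(f"after a 1.4% reduction:  {improved_ppl:.4f}")
print(f"added parameters:        {fraction_percent:.6f}% of the model")
```

The last line reproduces the 0.000075% figure: 6,000 divided by 8 billion is 7.5e-7, or 0.000075 percent.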

During the demonstration, the quantum-enhanced Llama correctly answered questions about astronomy and biology that the base version struggled with (for example, whether all gas giant planets have rings).

According to lead author Borja Aispuru, this work serves as a proof of concept. The quantum blocks enabled more accurate predictions of the next token in the text at minimal computational cost.

The team aims to achieve further reductions in perplexity and improvements in accuracy with fewer parameters than fully classical approaches require.

As a reminder, in May, shares of quantum companies rose following the U.S. Department of Commerce's announcement of $2 billion in funding for American firms under the CHIPS R&D program.