Thursday, November 21, 2024


TruEra has released a free tool to test LLM apps for hallucinations.

TruEra, a vendor that provides tools for testing, debugging, and monitoring machine learning (ML) models, today announced the release of TruLens, open-source software designed for testing applications built on large language models (LLMs) such as the GPT series.

TruLens, which is available for free today, gives organisations a quick and easy way to review and iterate on their LLM applications, reducing the likelihood of hallucination and bias at the production stage.

Currently, only a few vendors provide tools that address this aspect of LLM app development, even as enterprises across industries continue to investigate the potential of generative AI for various use cases.

Why is TruLens used in LLM applications?

LLMs are popular, but when it comes to developing apps based on these models, businesses must go through a time-consuming experimentation phase that includes human-driven response scoring. Once the first version of an app is created, teams must manually test and assess its responses, tweak prompts, hyperparameters, and models, and then re-test until a satisfactory result is obtained.

This process is slow and difficult to scale.

TruEra is addressing this issue with TruLens by offering a programmatic approach to evaluation known as “feedback functions.” According to the company, a feedback function evaluates the quality and efficacy of an LLM application’s output by analysing both the text generated by the LLM and the response’s metadata.
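To make the idea concrete, here is a minimal sketch of what a feedback function could look like in plain Python. It is not the TruLens API: the `LLMResponse` container, the `conciseness` and `latency_ok` functions, and the `latency_s` metadata key are all hypothetical names chosen for illustration. One function scores the generated text, the other scores only the metadata.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict


@dataclass
class LLMResponse:
    """Hypothetical container for one LLM call: the text plus its metadata."""
    prompt: str
    text: str
    metadata: Dict[str, float] = field(default_factory=dict)


# Assumption for this sketch: a feedback function maps a response to a score in [0, 1].
FeedbackFunction = Callable[[LLMResponse], float]


def conciseness(response: LLMResponse, max_words: int = 50) -> float:
    """Text-based check: 1.0 up to max_words, falling linearly to 0.0 at twice that."""
    n_words = len(response.text.split())
    if n_words <= max_words:
        return 1.0
    return max(0.0, 1.0 - (n_words - max_words) / max_words)


def latency_ok(response: LLMResponse, budget_s: float = 2.0) -> float:
    """Metadata-based check: 1.0 if the call stayed within the latency budget."""
    return 1.0 if response.metadata.get("latency_s", 0.0) <= budget_s else 0.0


def evaluate(response: LLMResponse,
             functions: Dict[str, FeedbackFunction]) -> Dict[str, float]:
    """Run every feedback function on one response and collect the scores."""
    return {name: fn(response) for name, fn in functions.items()}
```

Calling `evaluate(response, {"conciseness": conciseness, "latency_ok": latency_ok})` returns one score per function, which replaces a round of manual response scoring with a repeatable check.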

“Consider it a way to track and evaluate direct and indirect feedback on the performance and quality of your LLM app. This enables developers to create credible and powerful LLM apps more quickly. You can use it for a wide range of LLM use cases, such as chatbot question answering, information retrieval, and so on,” Anupam Datta, TruEra’s cofounder, president, and chief scientist, told VentureBeat.

TruLens can be integrated into the development process with a few lines of code. Once it’s up and running, users can design their own feedback functions tailored to specific use cases, or they can rely on the built-in options.
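As a rough illustration of what "a few lines of code" might mean in practice, the sketch below wraps an app function with a decorator that scores every response. This is an assumed design written in plain Python, not the TruLens interface: `with_feedback`, `not_empty`, `keyword_overlap`, and `toy_app` are hypothetical, and `toy_app` returns a fixed string in place of a real LLM call.

```python
from functools import wraps
from typing import Callable, Dict, List

# Assumption for this sketch: a feedback function takes (prompt, response) and returns a score.
Feedback = Callable[[str, str], float]


def with_feedback(feedbacks: Dict[str, Feedback], log: List[dict]):
    """Decorator that scores each response and appends the result to `log`."""
    def decorator(app: Callable[[str], str]) -> Callable[[str], str]:
        @wraps(app)
        def wrapped(prompt: str) -> str:
            response = app(prompt)
            scores = {name: fn(prompt, response) for name, fn in feedbacks.items()}
            log.append({"prompt": prompt, "response": response, "scores": scores})
            return response
        return wrapped
    return decorator


def not_empty(prompt: str, response: str) -> float:
    """Generic check that the app returned some text."""
    return 1.0 if response.strip() else 0.0


def keyword_overlap(prompt: str, response: str) -> float:
    """Custom check: share of longer prompt words that reappear in the response."""
    keywords = {w.lower().strip("?.,!") for w in prompt.split() if len(w) > 3}
    if not keywords:
        return 1.0
    answer_words = {w.lower().strip("?.,!") for w in response.split()}
    return len(keywords & answer_words) / len(keywords)


records: List[dict] = []


@with_feedback({"not_empty": not_empty, "keyword_overlap": keyword_overlap}, records)
def toy_app(prompt: str) -> str:
    """Stand-in for a real LLM call; always returns the same answer."""
    return "TruLens is open-source software for testing LLM applications."
```

Each call to `toy_app` now leaves a scored record in `records`, so two versions of a prompt or model can be compared by their logged scores rather than by eye.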

Currently, the software includes feedback functions that assess truthfulness, question-answering relevance, harmful or toxic language, user sentiment, language mismatch, response verbosity, and fairness and bias. It also reports how often the application calls (pings) an LLM, providing a convenient way to track usage costs.
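The usage-tracking idea can be sketched as a simple counter. The example below is an assumption-laden illustration rather than how TruLens does it: `UsageTracker` is a hypothetical class, tokens are estimated by splitting on whitespace (real tokenizers count differently), and the flat price per 1,000 tokens is a made-up parameter.

```python
from dataclasses import dataclass


@dataclass
class UsageTracker:
    """Hypothetical tracker that counts LLM calls and estimates spend."""
    price_per_1k_tokens: float  # assumed flat price in USD, for illustration only
    calls: int = 0
    tokens: int = 0

    def record(self, prompt: str, response: str) -> None:
        """Register one LLM call with a crude whitespace-based token estimate."""
        self.calls += 1
        self.tokens += len(prompt.split()) + len(response.split())

    @property
    def estimated_cost(self) -> float:
        """Estimated spend so far under the assumed flat token price."""
        return self.tokens / 1000 * self.price_per_1k_tokens
```

Calling `record` once per LLM request keeps a running total, which makes it easy to see whether a new prompt or chain design increases the number of calls per user query.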

“This also assists you in determining how to build the best version of the app at the lowest possible ongoing cost. Every ping adds up,” Datta observed.

Other offerings for LLM applications

While testing LLM-driven apps for performance and response accuracy is critical, only a few players have introduced ways to address it. Datadog’s OpenAI model monitoring integration, Arize’s Phoenix solution, and Israel-based Mona Labs’ recently debuted generative AI monitoring solution are among them.

According to TruEra, TruLens is best employed during the development phase of an LLM app.

“This is actually the phase that most companies are in right now: they’re experimenting with development and have a real need for tools to help them iterate faster and zero in on application versions that are both effective at their tasks and risk-free. Of course, you can use it on both development and production models,” Datta explained.

According to a survey conducted by Accenture, 98% of executives worldwide believe that AI foundation models will play a major part in their organisations’ strategies over the next three to five years. This suggests that enterprise demand for products like TruLens will grow in the near future.
