Exploring AI-Driven Tools for Research in Humanistic Buddhism: A Comparative Evaluation

This is a guest post by William Chong. For more information about the author, see the end of this post.

Introduction

Our research aims to streamline the transition from data to insights in Humanistic Buddhism research by using AI tools to integrate extensive bibliographies and primary sources. We set out to identify the most promising techniques for making research more efficient and precise. In doing so, we seek to bridge the gap between traditional scholarship and modern technology and to support a more effective approach to knowledge generation in the field of Humanistic Buddhism.


We start by sharing our initial inquiry, along with some early reflections. This inquiry was meant to examine how well generative AI tools can assist research, but it is currently limited because the tools were not tested against a common set of questions that would characterize each of them. Future explorations should design standard methods for determining the suitability of new tools for research as they emerge.
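One way such a standard could look is a shared question set that every tool answers and that is scored the same way each time. The Python sketch below is illustrative only: the `BenchmarkQuestion` class, the `must_mention` field, and the phrase-matching score are assumptions of this sketch and were not part of the evaluation reported here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BenchmarkQuestion:
    """One shared question (hypothetical structure, not from the evaluation)."""
    prompt: str
    must_mention: tuple[str, ...]  # phrases a complete answer should contain


def score_answer(question, answer):
    """Return the fraction of required phrases found in the answer."""
    if not question.must_mention:
        return 1.0
    text = answer.casefold()
    hits = sum(1 for phrase in question.must_mention if phrase.casefold() in text)
    return hits / len(question.must_mention)


def score_tool(questions, answers):
    """Average score over the shared questions.

    `answers` maps each prompt to the tool's reply; a missing reply scores 0.
    """
    if not questions:
        return 0.0
    total = sum(score_answer(q, answers.get(q.prompt, "")) for q in questions)
    return total / len(questions)
```

Running the same question list against each tool would give comparable numbers, although simple phrase matching is a crude proxy for completeness and would need human review alongside it.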

Methodology

Our methodology revolves around the systematic evaluation of three distinct types of tools tailored for Humanistic Buddhism research: 

  1. txyz.ai, a GPT-based tool explicitly designed for research purposes; 
  2. ChatGPT Plus’ personalized GPTs, offering customizable AI models to suit specific research needs; and 
  3. customgpt.ai, equipped with Retrieval-Augmented Generation (RAG) capabilities.

In assessing these tools, we prioritize four key criteria: 

  1. Ability to ingest large amounts of data;
  2. No Hallucinations;
  3. Accuracy quoting from text; and,
  4. Completeness, ensuring comprehensive coverage of relevant information without overlooking critical details.

Through rigorous testing and analysis, we aim to discern the strengths and limitations of each tool, facilitating informed decisions regarding their suitability for advancing research endeavors in the realm of Humanistic Buddhism.
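The third criterion, accuracy in quoting, can in principle be checked mechanically: a quote returned by a tool should appear verbatim in the source text. The following is a minimal Python sketch of such a check, assuming the source is available as plain text. The function names are hypothetical, and the normalization steps (Unicode NFKC, whitespace collapsing, case folding) are choices made for this sketch rather than steps taken in our testing.

```python
import re
import unicodedata


def normalize(text):
    """Apply NFKC normalization, collapse whitespace, and fold case."""
    text = unicodedata.normalize("NFKC", text)
    return re.sub(r"\s+", " ", text).strip().casefold()


def quote_is_verbatim(quote, source):
    """Return True if the quote occurs in the source after normalization.

    An empty or whitespace-only quote is never considered verified.
    """
    normalized_quote = normalize(quote)
    return bool(normalized_quote) and normalized_quote in normalize(source)
```

A check like this only confirms that the words exist in the source. It does not confirm that the quote is attributed to the right page or chapter, so manual verification is still needed.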

Tool Evaluation: An Overview

| Criteria | txyz.ai | GPT Plus | customgpt.ai |
| --- | --- | --- | --- |
| Ability to Ingest Large Amounts of Data | ❌ (Hard limit on the number of pages for upload) | ✔️ (In the form of CSV) | ✔️ (Multiple formats such as PDFs and text, but not CSVs) |
| No Hallucination | ✔️ (Defaults to looking for external sources if unable to answer from uploaded resource) | ✔️ (Possible to prevent using custom instructions) | ✔️ (Possible to prevent using custom instructions) |
| Accuracy Quoting from Text | ✔️ (Demonstrates accuracy in quoting text, especially citations) | ✔️ (Accurately retrieves mentions and quotes from provided data) | ✔️ (Provides accurate quotes from indexed data) |
| Completeness (Doesn’t Miss Information) | ❌ (Lacks continuity in chat interactions, leading to missed information) | ? (Dependent on code deployed during analysis) | ❌ (May miss entire chapters or present less relevant information) |

Findings and Analysis

Our evaluation of txyz.ai, ChatGPT Plus’ personalized GPTs, and customGPT.ai has yielded nuanced insights into their performance within the realm of Humanistic Buddhism research. Notably, txyz.ai exhibited limitations in handling extensive data and maintaining continuity in chat interactions, while demonstrating strengths in accurate quoting from text. ChatGPT Plus showcased promising capabilities in ingesting large amounts of data through CSV files and preventing hallucination with custom instructions. Conversely, customGPT.ai revealed potential issues with missing information and less relevant output, despite its versatility in handling various data formats. These findings underscore the need for a critical appraisal of AI-driven tools’ suitability for Humanistic Buddhism research and highlight areas for further refinement and exploration.

Evaluation of txyz.ai

Ability to Ingest Large Amounts of Data:

Txyz.ai exhibited limitations in data ingestion, primarily due to a hard limit on the number of pages (100 pages for PDF) for upload. This constraint restricted the tool’s capability to handle extensive datasets effectively, requiring users to pare down their bibliographies to fit within the specified parameters.

No Hallucination:

In addressing the criterion of hallucination, txyz.ai implemented a cautious approach by defaulting to external sources when unable to provide answers from the uploaded resource. While this strategy helped mitigate the risk of hallucination, it occasionally led to responses that were less directly relevant to the user’s query.

Accuracy of Quotation:

During testing, txyz.ai demonstrated commendable accuracy in quoting text, particularly citations, from provided data. The tool effectively retrieved relevant quotes and mentions, showcasing its ability to pinpoint specific information within texts.

Completeness:

One of the notable limitations of txyz.ai was its lack of continuity in chat interactions, which occasionally resulted in missed information. Despite its proficiency in quoting text accurately, the tool struggled to maintain coherence in conversations, leading to disjointed exchanges and overlooking certain inquiries.

Illustrative Example:

In evaluating txyz.ai, we encountered challenges in its handling of extensive bibliographies. We circumvented the 100-page limit by streamlining the bibliography: we removed unnecessary fields and reduced the font size. Although the upload succeeded, txyz.ai struggled to give relevant answers about the PDF itself, and prompts had to reference the uploaded resource explicitly to keep the tool from drawing on its own knowledge base. The tool also showed limited continuity in chat interactions, with responses lacking context and coherence, particularly for sequential questions. Despite these challenges, txyz.ai was strong at processing the bibliographies at the end of publications, and its user-friendly interface facilitated quick reference checks, which makes it a suitable candidate for single-PDF interactions.
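Removing unnecessary fields before export can be scripted when the bibliography is available in a tabular form. The sketch below assumes a CSV export, which is an assumption of this example and not a description of how the PDF was actually prepared. The `keep_fields` helper and the column names used with it are hypothetical.

```python
import csv
import io


def keep_fields(csv_text, fields):
    """Return a CSV string that contains only the listed columns, in order."""
    reader = csv.DictReader(io.StringIO(csv_text))
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=fields, lineterminator="\n")
    writer.writeheader()
    for row in reader:
        # Missing or empty cells are written as empty strings.
        writer.writerow({name: row.get(name) or "" for name in fields})
    return out.getvalue()
```

For example, calling `keep_fields(text, ["author", "title", "year"])` on an export that also has abstract and keyword columns would drop those columns, which shortens the document that is later converted to PDF.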

Evaluation of GPT Plus’ Personal GPT

Ability to Ingest Large Amounts of Data:

GPT Plus offers promising capabilities in ingesting large amounts of data, particularly through the use of CSV files. This format allows for efficient processing of extensive datasets, enabling researchers to leverage the tool’s functionalities for comprehensive analyses. This capacity awaits further testing to determine its full potential.

No Hallucination:

GPT Plus addresses the criterion of hallucination by providing users with the ability to prevent hallucinations through custom instructions. By allowing users to input specific directives, the tool enhances accuracy and reliability in generating responses, minimizing the risk of hallucinatory outputs.

Accuracy of Quotation:

During testing, GPT Plus demonstrated commendable accuracy in quoting text, effectively retrieving mentions and quotes from provided data. The tool exhibited proficiency in pinpointing specific information within texts, showcasing its ability to generate accurate and contextually relevant quotations.

Completeness:

The completeness of responses generated by GPT Plus is contingent upon the code deployed during analysis, with outcomes varying based on the applied methodology. Researchers may need to validate the generated content to ascertain its coherence and relevance, thereby optimizing the tool’s utility for information retrieval and analysis tasks.

Illustrative Example:

During the evaluation of GPT Plus, we built a GPT with the tool and assessed its performance with a bibliography provided in CSV format. We encountered challenges stemming from the tool’s reliance on code to analyze CSVs, particularly when the language of the data and the language of the query differed. For instance, GPT Plus struggled to translate phrases like “Three Acts of Goodness” (the English equivalent of 三好) into the corresponding terms in the data, which hindered its ability to locate relevant material. We then streamlined the bibliographic structure and subsequently tested the tool by querying it for mentions of, and specific sentences related to, “三好.” Its ability to generate accurate quotes showed promise for extracting relevant information from diverse datasets. GPT Plus also generated chapter headings and corresponding URLs, facilitating access to specific sections of the bibliography. These findings underscore the need for further refinement and optimization to enhance GPT Plus’ suitability for complex research tasks in the field of Humanistic Buddhism.
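The cross-language mismatch described above can be illustrated with a small Python sketch. It assumes the generated analysis code performs a plain substring search over one CSV column, which is a guess about typical behavior and not something we inspected in every case. The alias table, the function names, and the `title` column are hypothetical. The idea is to expand a query with known equivalents before searching, so that an English query can still match Chinese data.

```python
import csv
import io

# Hypothetical alias table; a real one would be curated by the researcher.
TERM_ALIASES = {
    "three acts of goodness": ["三好"],
}


def expand_query(query):
    """Return the query plus any known equivalents, in either direction."""
    key = query.casefold()
    terms = {query}
    for english, aliases in TERM_ALIASES.items():
        group = [english, *aliases]
        if key in (term.casefold() for term in group):
            terms.update(group)
    return sorted(terms)


def search_bibliography(csv_text, query, field="title"):
    """Return rows whose `field` contains the query or one of its aliases."""
    terms = [term.casefold() for term in expand_query(query)]
    reader = csv.DictReader(io.StringIO(csv_text))
    return [
        row for row in reader
        if any(term in (row.get(field) or "").casefold() for term in terms)
    ]
```

Without the alias step, a search for “Three Acts of Goodness” over Chinese titles returns nothing even when relevant entries exist, which matches the kind of gap we observed. Custom instructions that list such equivalents are one possible way to give the tool the same information.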

Evaluation of customGPT.ai

Ability to Ingest Large Amounts of Data:

CustomGPT.ai offers flexibility in ingesting data by supporting multiple formats such as PDFs and text, with a maximum of 5000 files per GPT. However, during testing, we encountered limitations regarding the ingestion of CSV files, which are commonly used for structured data.

No Hallucination:

CustomGPT.ai provides users with the option to prevent hallucination by implementing custom instructions. During testing, we observed that the tool effectively adhered to specified guidelines, reducing the risk of generating inaccurate or irrelevant responses. 

Accuracy of Quotation:

CustomGPT.ai provided accurate quotes from indexed data, although it was hesitant to provide them.

Completeness:

CustomGPT.ai exhibited limitations in completeness, particularly concerning the potential for missing entire chapters or presenting less relevant information. While the tool effectively retrieved specific quotes and mentions, it occasionally overlooked broader contextual information or failed to capture comprehensive insights from the dataset.

Illustrative Example:

CustomGPT.ai demonstrates robust capabilities in ingesting and indexing large volumes of data, enabling effective similarity analysis through the RAG method. Users can prompt the GPT to generate citations from uploaded data while keeping responses within the confines of the provided information. However, completeness is a significant limitation: the tool failed to locate the uploaded preface, and the document’s chapter on “Three Acts of Goodness” was erroneously ranked third. Both cases point to a need for better relevance ranking and more comprehensive indexing to optimize research outcomes.
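To make the ranking issue concrete, the sketch below shows the general shape of similarity-based retrieval in Python. It is a toy stand-in: it uses bag-of-words cosine similarity instead of learned embeddings, and it does not describe customGPT.ai's internals, which we did not have access to. All function names and the sample chunks are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Split text into lowercase word tokens."""
    return re.findall(r"\w+", text.casefold())


def cosine_similarity(a, b):
    """Cosine similarity between two token-count vectors."""
    dot = sum(a[token] * b[token] for token in a.keys() & b.keys())
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


def rank_chunks(query, chunks, top_k=3):
    """Return the top_k (name, score) pairs, most similar first."""
    query_vec = Counter(tokenize(query))
    scored = [
        (cosine_similarity(query_vec, Counter(tokenize(text))), name)
        for name, text in chunks.items()
    ]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))  # ties broken by name
    return [(name, score) for score, name in scored[:top_k]]
```

In a pipeline of this shape, only the top-ranked chunks are passed to the language model, so a relevant chapter that scores below the cutoff, or a preface that shares few words with the query, never reaches the answer. Word-level tokenization like the one here also handles unsegmented Chinese text poorly, which is one reason real systems rely on embeddings.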

Conclusion and Recommendations

In summary, the choice of tool for extracting relevant data from bibliographies or primary sources depends on specific needs and preferences. For single PDFs under 100 pages, txyz.ai offers a suitable solution. However, for more extensive datasets like bibliographies, both GPT Plus and customgpt.ai are viable options. Considering completeness, GPT Plus stands out, provided that the data is in CSV format, custom instructions are meticulously crafted and tested for accuracy, and the researcher verifies the data analysis method for potential gaps in capturing word forms and synonyms. Additionally, leveraging a RAG method through customgpt.ai may unveil pertinent sources that traditional CSV-data analysis might miss, further enriching the research process and ensuring a comprehensive exploration of available data and sources.

___

William Chong conducted research at NUS Singapore before cofounding D’Linkup Pte. Ltd., a company dedicated to collaborating with researchers and research institutions on data projects. Currently, William is engaged in an exploration of applying generative AI to Humanistic Buddhism resources, both academic and religious, in collaboration with the Nan Tien Institute.

Acknowledgment: The completion of this article was made possible through the generous funding provided by the Hsing Yun Educational Foundation Australia. Their support has been invaluable in facilitating the research and development efforts that have contributed to this work. We extend our sincere gratitude for their commitment to advancing education and research in the field of Humanistic Buddhism studies.
