Opinion
The AI that's popular these days, the large language model (LLM), tends to act as a search-engine summarizer: it finds stuff and then summarizes it in a folksy writing style. Industry thinkers are asking why AI can create cutesy artwork but cannot summarize business expenses.
So, I've come up with two questions that challenge AI systems on more than one dimension of knowledge. If answered correctly, they would provide actionable information. So far, every AI I tested has failed to answer accurately.
Q1. Which financial institution in Canada offers the highest interest rate?
From this sentence, AI has to understand the following parameters:
- Context: The wording would tell humans that the interest rate is on savings, not on loans.
- Up-to-date data: Interest rates fluctuate, so last week's rate may not be today's rate.
- Discernment: In Canada, savings are held in flexible savings accounts or in inflexible GICs (guaranteed investment certificates).
- Geography: Not all financial institutions are allowed to offer services to all parts of Canada.
- Marketing: Some high interest rates are offered for a limited duration to attract customers.
- Contracts: Some high interest rates come with limitations, such as a one-week delay on withdrawing funds.
- Financial: Interest rates might be lower than stated due to the cost of bank service fees.
- Specific: Report the one best rate.
Needless to say (but I will say it anyhow), AI flunked all eight aspects of this useful question. The image above is typical: long-winded and short on specifics. Standard Web sites like ratehub remain superior, as this AI response admitted.
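To make the parameters for Q1 concrete, here is a minimal sketch of the filter-and-rank step a correct answer would need. Everything in it is an assumption for illustration: the `Offer` fields, the institution names, the rates, the fees, and the dates are hypothetical and do not come from any real data feed.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class Offer:
    """One hypothetical rate listing; every field is an illustrative assumption."""
    institution: str
    product: str                # "savings" or "gic" (Context, Discernment)
    rate_pct: float             # advertised annual rate, in percent
    as_of: date                 # when the rate was observed (Up-to-date data)
    provinces: frozenset        # where the offer is available (Geography)
    promo: bool                 # limited-duration teaser rate (Marketing)
    withdrawal_delay_days: int  # contractual delay on withdrawals (Contracts)
    annual_fee: float           # service fees in dollars per year (Financial)


def effective_rate(offer: Offer, balance: float) -> float:
    """Advertised rate minus annual fees expressed as a percent of the balance."""
    return offer.rate_pct - 100.0 * offer.annual_fee / balance


def best_offer(offers, province, balance, today, max_age_days=1) -> Optional[Offer]:
    """Return the single best flexible savings offer (Specific), or None."""
    candidates = [
        o for o in offers
        if o.product == "savings"
        and (today - o.as_of).days <= max_age_days
        and province in o.provinces
        and not o.promo
        and o.withdrawal_delay_days == 0
    ]
    return max(candidates, key=lambda o: effective_rate(o, balance), default=None)


# Made-up sample data; the date is arbitrary.
today = date(2025, 1, 15)
offers = [
    Offer("Institution A", "savings", 5.00, today, frozenset({"ON"}), True, 0, 0.0),
    Offer("Institution B", "savings", 4.00, today, frozenset({"BC", "ON"}), False, 0, 60.0),
    Offer("Institution C", "savings", 3.50, today, frozenset({"BC"}), False, 0, 0.0),
    Offer("Institution D", "gic", 4.50, today, frozenset({"BC"}), False, 365, 0.0),
]
best = best_offer(offers, "BC", 10_000.0, today)
print(best.institution if best else "no qualifying offer")
```

In this made-up data, the highest advertised rate is a promotion, the next is a GIC, and the third loses to a lower advertised rate once fees are subtracted. The ranking itself is the easy part; the hard part is obtaining current, complete, and trustworthy values for each field.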
Q2. Which books did Ralph Grabowski write in 1991?
This question has the following parameters:
- Date range: The data is limited to the year 1991.
- History: The era is pre-Internet.
- Identity: The Ralph Grabowski in question is a technical writer, not one of the other Ralph Grabowskis, living or dead.
- Plural: One or more books were written.
- Context: Knowing the difference between 'write' and 'publish.' (*)
- Tasks: Distinguishing between roles; I have been a sole author, a co-author, a contributing author (wrote one chapter), and a technical editor. I have written books, updated my books and the books of others, typeset books, and copy-edited and technical-edited books.
- Format: Books can be printed on paper or issued as e-books.
- Publisher: Books can be published by a company or be self-published.
- ISBNs: Some of my books were given the same title by different publishers, such as Using AutoCAD (Que, 1991) and a different Using AutoCAD (Delmar, in later years); ISBNs distinguish between them.
- Re-issues: Some of my books that were originally printed by publishers I later re-issued as e-books that I distribute myself.
Google's Gemini returning partial results (I wrote four books in 1991 that were published)
All AI engines got the author parameter correct: they returned names of books written by me. The lists are, however, always incomplete (not all of my 1991 books are listed), wrong (books I wrote in other years are listed), or incorrect by implication (I completed "Learn AutoCAD in a Day" in 1991).
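The parameters for Q2 amount to a record structure plus a filter, sketched below under stated assumptions. The `BookRecord` fields, the author identifiers, the titles, and the ISBN strings are all hypothetical placeholders; none of them is a real bibliographic entry.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class BookRecord:
    """One hypothetical catalogue entry; all values below are placeholders."""
    author_id: str                 # distinguishes namesakes (Identity)
    title: str
    isbn: str                      # separates same-title books (ISBNs)
    role: str                      # e.g. "sole author", "technical editor" (Tasks)
    year_written: int              # Date range, Context: written...
    year_published: Optional[int]  # ...versus published; None if never printed


def books_written(records, author_id, year, roles=("sole author", "co-author")):
    """Books this specific person wrote (in an authoring role) in the given year."""
    return [
        r for r in records
        if r.author_id == author_id
        and r.year_written == year
        and r.role in roles
    ]


# Placeholder records only, chosen to exercise each parameter.
records = [
    BookRecord("writer-1", "Example Title", "isbn-0001", "sole author", 1991, 1991),
    BookRecord("writer-1", "Example Title", "isbn-0002", "sole author", 1995, 1996),
    BookRecord("writer-1", "Unprinted Draft", "", "sole author", 1991, None),
    BookRecord("writer-1", "Edited Volume", "isbn-0003", "technical editor", 1991, 1991),
    BookRecord("writer-2", "Namesake's Book", "isbn-0004", "sole author", 1991, 1991),
]

written = books_written(records, "writer-1", 1991)
published = [r for r in written if r.year_published is not None]
print([r.title for r in written])
print([r.title for r in published])
```

Filtering on `year_written` rather than `year_published` is what separates "write" from "publish", and keying on `author_id` and `isbn` rather than on name and title is what keeps namesakes and same-title books apart.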
---
(*) Write means a book was written, but not necessarily finished. Publish means the book was written, printed, and distributed. I wrote a few books that were never printed.
Hockey Canada Rule 4.14, Delayed Penalties, is often hard for refs to understand. At a recent clinic, we had our refs try to provide the correct answer for a rather complex delayed-penalties situation. One of my adult children offered to see whether AI could provide the correct answer, given the Hockey Canada rule and the penalties that were called. Interestingly, like many of the refs, it did not understand the rule well enough to give the correct answer. Everything that is needed to interpret the rule correctly is included in the rule itself; the problem arises in understanding how to apply it.
Apart from the above, I, too, have noticed that AI tends to use a lot more words than are necessary.
Posted by: Dairobi Paul | Apr 05, 2025 at 04:35 PM