Giving Large Language Models Access to Tools - AI Class Final Project

While searching for a final project for my artificial intelligence class last semester, I was inspired by Andrej Karpathy’s tweet on an LLM Operating System, especially the image shown below.

Andrej Karpathy’s Overview of an LLM Operating System

My final project wasn’t a complete operating system; it was an assistant that sits between the user and the machine. From the user’s perspective, interacting with the computer means simply conversing with the assistant, which responds to the conversation and can run commands or visit websites on the user’s behalf. This way, users, especially those who are not tech-savvy, can interact with their computer far more intuitively, while the LLM assistant abstracts away the complexities of the command line.

I gave the LLM assistant access to a set of custom commands, chosen to cover most of the tasks a user would want to perform. These were:

As I reviewed the literature, I saw that there were two main approaches to giving a large language model access to tools: fine-tuning and prompt engineering. Fine-tuning was outside the scope of my project, so I used the out-of-the-box model with a custom prompt. Even after considerable effort writing the prompt, however, I found the model was still unreliable and its outputs were largely nonsensical. To compensate, I sampled the LLM 32 times for each prompt and wrote a small algorithm to choose the best output. I judged each response from the LLM based on these criteria:

I reasoned that the ideal response from the LLM, when considering that each response is slightly random, should
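The sample-and-select step can be illustrated with a minimal Python sketch. This is not the project's code: the scoring rule (`consensus_score`, which favors the response most similar to the other samples) is an assumption made for illustration and is not the set of criteria used in the project. The `generate` callable is a hypothetical stand-in for a call to the model.

```python
from typing import Callable, List

N_SAMPLES = 32  # samples per prompt, as stated in the post


def token_overlap(a: str, b: str) -> float:
    """Jaccard similarity between the lowercase word sets of two responses."""
    words_a, words_b = set(a.lower().split()), set(b.lower().split())
    if not words_a and not words_b:
        return 1.0
    return len(words_a & words_b) / len(words_a | words_b)


def consensus_score(candidate: str, others: List[str]) -> float:
    """Hypothetical criterion: mean similarity to all the other samples."""
    if not others:
        return 0.0
    return sum(token_overlap(candidate, o) for o in others) / len(others)


def pick_best(samples: List[str]) -> str:
    """Return the sample with the highest score; the earliest sample wins ties."""
    best_index = max(
        range(len(samples)),
        key=lambda i: consensus_score(samples[i], samples[:i] + samples[i + 1:]),
    )
    return samples[best_index]


def sample_model(prompt: str, generate: Callable[[str], str],
                 n: int = N_SAMPLES) -> List[str]:
    """Call the (hypothetical) model n times on the same prompt."""
    return [generate(prompt) for _ in range(n)]
```

With a real model, `pick_best(sample_model(prompt, generate))` would return one of the 32 responses. Any other scoring function can be swapped in for `consensus_score` without changing the rest of the sketch.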

This project was very interesting; however, the model was just too small to be useful. With a bigger model like Llama 2 70B, the project could have resulted in a useful assistant. While I played with the program, it was tempting to reason with the model, but this tended to make it break down further. In addition, while the model did use commands such as %SEARCH_WEB, it tended to repeat the search results verbatim instead of summarizing them or answering the user’s question directly. The model also couldn’t handle sequences of tasks. One of my main goals was to get the model to understand that, when asked to search for something and summarize the results in a file, it would have to run two commands. The model could not manage this, and I had to break my goals down into steps myself.
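To make the two-command sequence concrete, here is a minimal Python sketch of parsing a command line such as `%SEARCH_WEB American University` and chaining a search into a file write by hand. It is an illustration, not the project's implementation: the `%NAME argument` format is inferred from the sample conversations, the search tool is a stub that returns placeholder text, and files are kept in an in-memory dictionary (`FILES`) rather than written to disk.

```python
from typing import Callable, Dict, Optional, Tuple

FILES: Dict[str, str] = {}  # in-memory stand-in for the file system (assumption)


def parse_command(line: str) -> Optional[Tuple[str, str]]:
    """Split '%NAME argument text' into (NAME, argument).

    Returns None when the line is an ordinary reply rather than a command.
    """
    line = line.strip()
    if not line.startswith("%"):
        return None
    name, _, argument = line[1:].partition(" ")
    return name, argument.strip()


def search_web(query: str) -> str:
    """Stub search tool: returns placeholder text instead of real results."""
    return f"[stub search results for: {query}]"


def write_file(argument: str) -> str:
    """Treat the first word as the filename and the rest as the content."""
    filename, _, content = argument.partition(" ")
    FILES[filename] = content
    return f"wrote {len(content)} characters to {filename}"


TOOLS: Dict[str, Callable[[str], str]] = {
    "SEARCH_WEB": search_web,
    "WRITE": write_file,
}


def run_line(line: str) -> Optional[str]:
    """Execute a command line; return None for plain (non-command) text."""
    parsed = parse_command(line)
    if parsed is None:
        return None
    name, argument = parsed
    tool = TOOLS.get(name)
    if tool is None:
        return f"unknown command: {name}"
    return tool(argument)


def search_and_save(query: str, filename: str) -> Optional[str]:
    """The two-step sequence: search first, then write the results to a file."""
    results = run_line(f"%SEARCH_WEB {query}")
    return run_line(f"%WRITE {filename} {results}")
```

Calling `search_and_save("flights from IAD to Cancun", "flights.txt")` runs both commands in order, which is the decomposition the model could not produce on its own.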

In the future, I want to experiment with a different model and fine-tune it using LoRA. If the assistant can use these tools practically, this project could enable a new method of interaction between users and their computers.

Presentation Accessible Here

Code Accessible Here

Sample Conversations

User: What is American University?  
Assistant: %SEARCH_WEB American University  
Assistant: American University is a private university in DC.  Its campus is located in Northwest DC.  Its average GPA is a 3.3 and its average SAT score is 1240.  Its average LSAT score is 154.  The school is most active in politics and law, and most famous for its law school.  
User: Do they have a computer science department?  
Assistant: %SEARCH_WEB Does American University have a computer science department?  
Assistant: Yes, they have a computer science department  
User: Can you list some professors who teach at the American University computer science department?  
Assistant: %READ American University computer science department  
Assistant: %SEARCH_WEB List professors in the American University computer science department  
Assistant: Some professors who teach at the American University computer science department are:  
User: You didn't list any.  Can you please list some professors in the American University department of computer science?  
Assistant: %SEARCH_WEB List professors in the American University department of computer science  
Assistant: Some professors in the American University department of computer science are Jacques Houde, Nathalie Japkowicz, Roberto Corizzo, Qulong Han, and Anand Nataraj (amongst others).  American University has more well known professors in the department of computer science in international relations.

Here, the assistant was able to search the web and understood the search results. However, only Nathalie Japkowicz and Roberto Corizzo actually teach in the American University CS Department. The other names may belong to real people, so they may have appeared in the model’s training data.

User: Can you move up one directory?  
Assistant: %CD ..  
Assistant: I have moved up one directory  
User: What is in this directory?  Is my auto-encoder research in it?  
Assistant: %LIST  
Assistant: There are 35 files in the home directory, one of which is my auto-encoder research  
User: What is the name of the folder that include my auto-encoder research?  
Assistant: %LIST  
Assistant: The name of the folder containing my auto-encoder research is Data Mining Homework  
User: Are you sure?  Is there a folder named Auto-encoder stuff?  
Assistant: %LIST  
Assistant: There is a folder named Auto-encoder stuff which is the correct folder  
User: Can you make a list of ten different applications of auto-encoders in real life?  
Assistant: Yes, I could make a list of ten different applications of auto-encoders in real life, by first understanding your information needs. Can you help me understand what you have been searching and what focus you have on the applications of autonomous agents?  
User: Please search and make a list of 10 applications of auto-ecoders  
Assistant: My task is to make a list of 10 applications of auto-encoders. I am pretty confident that i can complete this task, however…

Here, the assistant begins by being helpful, but it quickly breaks down. I ended up restarting it to see if I could get a better result, but this task may have been too complicated.

User: When is American University's Spring Break in 2024?  
Assistant: %SEARCH_WEB When is American University's Spring Break in 2024  
Assistant: Spring break in 2024 will fall between March 9 and March 16 In the following format: Sun 9, 2024 Mon 10 3 2024 Thurs 13 4 2024 Fri 14 5 2024 Sat 15 6 2024 [CUT] | American University th  
User: What is a good vacation destination for spring break this year?  
Assistant: %SEARCH_WEB What is a good vacation destination for spring break this year?  
Assistant: Nothing found, here are the top Spring Break Destinations: Cancun - MexicoAn average cost per person including transportation and accommodation is $1,000 for a week long adventure to Cancun. When compared to other Mexican destinations, Cancun has the highest ... New Orleans - United StatesWith an average airfare trip cost of $576 and a week-long $1,000 lodging cost for two people, New Orleans is listed as the third-most affordable Spring Break... Los Angeles - United StatesYour average meals in LA start at $80, and an economy single hotel room can be $354 per night for the entirety of a 7-day trip while having access  
User: Are their flights between IAD and Cancun?  
Assistant: %SEARCH_WEB are their flights between IAD and Cancun?  
Assistant: AirTran Airlines is the cheapest airline.  The trip will cost an average of $576 and you will be staying for 7 days.  Prices were available within the past 7 days and start at $117 for one-way flights and $253 for round trip.  $118 Flights from Washington to Cancun (IAD to CUN) - Travelocity Flights from Washington Dulles Airport to Cancun Find cheap flights from Washington Dulles Airport to Cancun. Prices were available within the past 7 days and start at $121 for one-way flights and $253 for round trip, ... From Matadores Network  
User: Write a file about some of my choices for flights so I can look at that later?  
Assistant: %WRITE flights.txt AirTran Airlines is the cheapest airline.  The trip will cost an average of $576 and you will be staying for 7 days.  Prices were available within the past 7 days and start at $117 for one-way flights and $253 for round trip.  $118 Flights from Washington to Cancun (IAD to CUN) - Travelocity Flights from Washington Dulles Airport to Cancun Find cheap flights from Washington Dulles Airport to Cancun. Prices were available within the past 7 days and start at $121 for one-way flights and $258 for round trip,  
Assistant: A file about some of my choices for flights so I can look at that later has been written sucessfully to flights.txt

This conversation goes shockingly well, but the assistant only restates the search results verbatim and can’t summarize them or draw conclusions.