MineLlama

Llama with Retrieval-Augmented Generation
as a Decision Maker in Minecraft

Abstract

MineLlama is a novel localized Large Language Model (LLM) framework designed to enhance decision-making in the sandbox game Minecraft, without relying on external APIs or extensive datasets.
MineLlama operates on a two-tier system:

  • a planning level that generates a sequence of interdependent subgoals to achieve a given objective;
  • an executing level that determines and implements the immediate actions based on the current state and the subgoals.
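
The two levels above can be illustrated with a minimal sketch. It is not MineLlama's actual implementation: the dependency table `REQUIRES`, the functions `plan_subgoals` and `next_action`, and the inventory format are all hypothetical, and a plain topological sort stands in for the LLM-based planner. It only shows how interdependent subgoals can be ordered and how the next action can be chosen from the current state.

```python
from graphlib import TopologicalSorter

# Hypothetical dependency table (an assumption for illustration only):
# each item maps to the set of items that must be obtained first.
REQUIRES = {
    "log": set(),
    "planks": {"log"},
    "stick": {"planks"},
    "crafting_table": {"planks"},
    "wooden_pickaxe": {"planks", "stick", "crafting_table"},
}


def plan_subgoals(goal, requires):
    """Planning level: list the goal and its prerequisites, dependencies first."""
    needed = {}
    stack = [goal]
    while stack:
        item = stack.pop()
        if item not in needed:
            needed[item] = set(requires.get(item, ()))
            stack.extend(needed[item])
    # TopologicalSorter takes a node -> predecessors mapping and
    # yields every predecessor before the nodes that depend on it.
    return list(TopologicalSorter(needed).static_order())


def next_action(subgoals, inventory):
    """Executing level: choose the first subgoal the current state does not satisfy."""
    for item in subgoals:
        if inventory.get(item, 0) == 0:
            return f"obtain {item}"
    return None  # every subgoal is already satisfied


plan = plan_subgoals("wooden_pickaxe", REQUIRES)
print(plan)
print(next_action(plan, {"log": 1}))
```

In this sketch the executing level re-reads the inventory on every call, so the plan does not need to be regenerated each time the state changes.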


Relationship graph generation


The framework of the executing module

Images and Videos

Below are the results of several representative tasks achieved by MineLlama.


Case 1: Cooking beef


Case 2: Crafting a stone pickaxe


Case 3: Crafting a wooden pickaxe

Team (Kyoto University)

  • Dr. Shiyao Ding (丁 世堯)
  • Prof. Takayuki Ito (伊藤 孝行)
  • Mr. Ryo Totake (藤武 稜)
  • Mr. Kyota Tamano (玉野 恭多)
  • Mr. Taiki Watanabe (渡邊 大起)
  • Mr. Taiyo Honda (本田 大洋)

© 2024 MineLlama