Implementation of AIXI in Python

AIXI is a mathematical formalism for Artificial General Intelligence (strong AI): a theoretical agent that can, in principle, learn any intellectual task. It is a reinforcement learning agent that interacts with an unknown, stochastic, but computable environment. The interaction proceeds in time steps from 1 to m, where m is the lifespan of the agent.

During its lifespan, the agent chooses actions and executes them in the environment. In return, the environment responds with a “percept,” which consists of an observation and a reward. Percepts are distributed according to a conditional probability that depends on the history of actions, observations, and rewards. We can therefore treat the environment as a probability distribution over observations and rewards, conditioned on the agent’s entire history.
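The interaction loop described above can be sketched in a few lines of Python. This is only an illustration: `BiasedCoinEnvironment`, `run_lifespan`, and `majority_policy` are hypothetical names that do not come from pyaixi, and the toy environment ignores the history, whereas a general environment may condition its percepts on everything that has happened so far.

```python
import random


class BiasedCoinEnvironment:
    """Toy stand-in environment (hypothetical, not part of pyaixi)."""

    def __init__(self, p_heads=0.7, seed=0):
        self.p_heads = p_heads
        self.rng = random.Random(seed)

    def step(self, action):
        """Return a percept (observation, reward) for the agent's guess."""
        observation = 1 if self.rng.random() < self.p_heads else 0
        reward = 1 if action == observation else 0
        return observation, reward


def run_lifespan(env, choose_action, m):
    """Interact for time steps 1..m and return the full history."""
    history = []  # list of (action, observation, reward) triples
    for _ in range(m):
        action = choose_action(history)
        observation, reward = env.step(action)
        history.append((action, observation, reward))
    return history


def majority_policy(history):
    """Guess the observation seen most often so far (ties go to 1)."""
    ones = sum(obs for _, obs, _ in history)
    return 1 if 2 * ones >= len(history) else 0


history = run_lifespan(BiasedCoinEnvironment(), majority_policy, m=100)
total_reward = sum(reward for _, _, reward in history)
```

The policy receives the whole history at every step, which mirrors the way the agent’s choices may depend on all past actions, observations, and rewards.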

The AIXI model itself is not computable. However, there is a C++ implementation known as MC-AIXI-CTW, which is a computable approximation of the universally optimal AIXI agent.

MC-AIXI-CTW works on the same principle as AIXI: it is based on actions, observations, and rewards, and its only purpose is to maximize reward. The agent’s action selection rests on two components. The first is a model of the environment, with which the agent tries to predict the probability of any outcome. The second is an algorithm that estimates the reward each action is expected to yield.
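To make the two components concrete, here is a deliberately simplified, one-step sketch. It is not the algorithm used by MC-AIXI-CTW: the model is a single Krichevsky-Trofimov estimator over a binary observation stream rather than a full context tree, and the reward estimate is plain Monte Carlo sampling over one step rather than a search tree. The names `KTModel`, `estimate_reward`, and `select_action` are hypothetical.

```python
import random


class KTModel:
    """Krichevsky-Trofimov estimator for a stream of bits (the model)."""

    def __init__(self):
        self.counts = [0, 0]  # number of 0s and 1s seen so far

    def update(self, bit):
        self.counts[bit] += 1

    def prob(self, bit):
        """Predicted probability that the next observation equals `bit`."""
        zeros, ones = self.counts
        return (self.counts[bit] + 0.5) / (zeros + ones + 1)


def estimate_reward(model, action, simulations, rng):
    """Monte Carlo estimate of the expected reward for guessing `action`."""
    total = 0
    for _ in range(simulations):
        sampled = 1 if rng.random() < model.prob(1) else 0
        total += 1 if sampled == action else 0  # reward 1 for a correct guess
    return total / simulations


def select_action(model, simulations=200, rng=None):
    """Pick the action with the highest estimated reward."""
    rng = rng or random.Random(0)
    return max((0, 1), key=lambda a: estimate_reward(model, a, simulations, rng))


model = KTModel()
for bit in [1, 1, 0, 1, 1, 1, 0, 1]:
    model.update(bit)

best = select_action(model)
```

After six ones and two zeros, the model predicts a one with probability 6.5/9, so sampling from it favors guessing 1.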

pyaixi: Python Implementation Of MC-AIXI-CTW

pyaixi is a Python implementation of MC-AIXI-CTW (Monte Carlo-AIXI-Context Tree Weighting), an approximation of AIXI that is capable of general learning. It lets users without a C++ background work with the algorithm and makes it easier to use MC-AIXI-CTW in other Python-based Artificial Intelligence projects. Having the algorithm available in a language with simpler syntax also speeds up the prototyping of further AIXI approximations.

Getting Started With aixi python – pyaixi

pyaixi allows you to configure environments yourself. You can also use the example environments shipped with the library to understand how an environment is set up, which parameters it takes, and how an agent is run in it.

To try an example environment, you need to follow a few steps, which differ slightly depending on your operating system. Let’s go through them one by one.

Running Example On Windows

To run the example on Windows, run the command below from the base directory of the pyaixi package. The same applies to every example discussed here.

The command is as follows:

python aixi.py -v conf\rock_paper_scissors_fast.conf

Running Example On Linux/Unix/Mac

Run the command from the base directory of the pyaixi package, using the terminal of your operating system.

The command is given below:

python aixi.py -v conf/rock_paper_scissors_fast.conf

The code runs best on PyPy, an alternative to CPython that uses a just-in-time compiler and is often much faster. You can run the code on PyPy with the command below.

Use the executable that matches the version of PyPy you have installed, for example, version 3.

pypy-c3 aixi.py -v conf/rock_paper_scissors_fast.conf
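The Windows and Linux/Unix/Mac commands above differ only in the path separator. If you start the script from another Python program, a small helper can build the command portably. This is a sketch: `build_command` is a hypothetical helper, not part of pyaixi, and it only assembles the argument list shown above.

```python
import os
import sys


def build_command(conf_name, verbose=True, interpreter=None):
    """Build the argument list for running aixi.py with a configuration file."""
    # os.path.join picks the right separator ("\\" on Windows, "/" elsewhere).
    conf_path = os.path.join("conf", conf_name)
    command = [interpreter or sys.executable, "aixi.py"]
    if verbose:
        command.append("-v")
    command.append(conf_path)
    return command


command = build_command("rock_paper_scissors_fast.conf")
```

The resulting list can be passed to `subprocess.run(command, check=True)` from the base directory of the package. The `interpreter` argument lets you substitute the name of your PyPy executable for the current interpreter.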

You can also switch to a different environment by choosing one of the example environments in the ‘examples’ directory.

coin_flip	A simulation of a biased coin flip
extended_tiger	An extended version of the Tiger-or-Gold door choice problem.
kuhn_poker	A simplified, zero-sum version of poker.
maze	A two-dimensional maze.
rock_paper_scissors	Rock Paper Scissors.
tic_tac_toe	Tic Tac Toe
tiger	A choice between two doors. One door hides gold; the other, a tiger.

Example environments in the ‘examples’ directory of pyaixi, a Python AIXI implementation.

To use one of these environments, replace ‘rock_paper_scissors_fast.conf’ in the command with the corresponding configuration file from the ‘conf’ directory.

The example we ran carries out 500 interactions between the agent and the environment. In these interactions, the agent tries out random actions that are permitted in that environment and learns from the observations and rewards the environment returns.

For this example, an average reward greater than 1 means that the agent is winning.
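The threshold of 1 makes sense if a draw is worth exactly 1, with a loss worth less and a win worth more. The sketch below assumes rewards of 0 for a loss, 1 for a draw, and 2 for a win. These values are an assumption made for illustration, so check the environment’s source for the actual ones.

```python
# Assumed reward values (not confirmed by the article).
LOSS, DRAW, WIN = 0, 1, 2


def average_reward(rewards):
    """Mean reward per interaction."""
    return sum(rewards) / len(rewards)


rewards = [WIN, DRAW, LOSS, WIN, WIN, DRAW]
avg = average_reward(rewards)
is_winning = avg > 1
```

With three wins, two draws, and one loss, the total reward is 8 over 6 interactions, so the average is above 1 and the agent is ahead.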

Understanding Script Usage In aixi python – pyaixi

The command-line options of the script are listed below. You can use them as a template for running your own agent in an environment that you have configured yourself.

Command	Args
python aixi.py	[-a | --agent <agent module name>]
	[-d | --explore-decay <exploration decay value, between 0 and 1>]
	[-e | --environment <environment module name>]
	[-h | --agent-horizon <search horizon>]
	[-l | --learning-period <cycle count>]
	[-m | --mc-simulations <number of simulations to run each step>]
	[-o | --option <extra option name>=<value>]
	[-p | --profile]
	[-r | --terminate-age <number of cycles before stopping the run>]
	[-t | --ct-depth <maximum depth of predicting context tree>]
	[-x | --exploration <exploration factor, greater than 0>]
	[-v | --verbose]
	[<environment configuration file name to load>]
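To see how these options fit together, here is a sketch that mirrors the option table with Python’s `argparse`. It is not pyaixi’s own parser. It assumes the long option names take the usual `--` prefix, and the value types and the option `example-key=1` are assumptions made for illustration. Because `-h` is used for the search horizon, the automatic help flag has to be switched off.

```python
import argparse


def build_parser():
    """Parser mirroring the option table (illustrative, not pyaixi's own)."""
    # add_help=False frees "-h", which the table uses for the agent horizon.
    parser = argparse.ArgumentParser(prog="aixi.py", add_help=False)
    parser.add_argument("-a", "--agent")
    parser.add_argument("-d", "--explore-decay", type=float)
    parser.add_argument("-e", "--environment")
    parser.add_argument("-h", "--agent-horizon", type=int)
    parser.add_argument("-l", "--learning-period", type=int)
    parser.add_argument("-m", "--mc-simulations", type=int)
    parser.add_argument("-o", "--option", action="append", default=[])
    parser.add_argument("-p", "--profile", action="store_true")
    parser.add_argument("-r", "--terminate-age", type=int)
    parser.add_argument("-t", "--ct-depth", type=int)
    parser.add_argument("-x", "--exploration", type=float)
    parser.add_argument("-v", "--verbose", action="store_true")
    parser.add_argument("config_file", nargs="?")
    return parser


args = build_parser().parse_args(
    ["-v", "-m", "300", "-o", "example-key=1", "conf/rock_paper_scissors_fast.conf"]
)
```

Every option is optional, which matches the square brackets in the table. Anything not given on the command line stays unset here and can then fall back to the configuration file.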

Adding New Environment In pyaixi

A new environment should inherit from the same base class as all the environments in the ‘environment’ directory. The base class is given by ‘.Environment’. The new environment must also provide the methods listed in the base class.

In addition, you need to create a new configuration file for that environment and set the ‘environment’ key to your environment’s name.
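The pattern looks roughly like the sketch below. The `Environment` class here is a stand-in written for this example, not the package’s real base class. Its method names, the `MatchingPennies` environment, and the `seed` option are all hypothetical, so consult the base class in the package for the methods you actually need to provide.

```python
import random


class Environment:
    """Stand-in base class (hypothetical, the real one defines its own methods)."""

    def __init__(self, options=None):
        self.options = options or {}
        self.observation = None
        self.reward = None

    def perform_action(self, action):
        raise NotImplementedError


class MatchingPennies(Environment):
    """Toy environment: reward 1 if the action matches a random bit."""

    valid_actions = (0, 1)

    def __init__(self, options=None):
        super().__init__(options)
        self.rng = random.Random(int(self.options.get("seed", 0)))

    def perform_action(self, action):
        if action not in self.valid_actions:
            raise ValueError("invalid action: %r" % (action,))
        self.observation = self.rng.randint(0, 1)
        self.reward = 1 if action == self.observation else 0
        return self.observation, self.reward


# Options as they might be read from a configuration file (assumed layout).
options = {"environment": "matching_pennies", "seed": "42"}
env = MatchingPennies(options)
percept = env.perform_action(1)
```

The subclass only fills in environment-specific behavior, while shared state such as the latest observation and reward lives in the base class.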

Adding New Agent in pyaixi

pyaixi provides only a single class of agent, which you can find in the ‘agent’ directory:

mc_aixi_ctw - an agent implementing the Monte Carlo-AIXI-Context Tree Weighting algorithm.

This agent uses a prediction algorithm that can be found in the ‘prediction’ directory:

ctw_context_tree - an implementation of Context Tree Weighting context trees.

The search policy used by this agent can be found in the ‘search’ directory:

monte_carlo_search_tree - an implementation of Monte Carlo search trees.

A new agent should inherit from the base class ‘.Agent’ and provide the methods listed in the base class so that it can interact with the currently configured environment.

The package uses the agent ‘mc_aixi_ctw’ by default. To use your own agent, set the ‘agent’ key in the configuration file to the Python module name of your agent.

Alternatively, you can override the value in the configuration file by using the ‘-a’/‘--agent’ option on the command line.
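The order of precedence described above (command line first, then the configuration file, then the default) can be sketched as follows. `resolve_agent_name` is a hypothetical helper and `my_agent` a hypothetical module name. The configuration is represented as a plain dictionary for simplicity.

```python
DEFAULT_AGENT = "mc_aixi_ctw"


def resolve_agent_name(config, cli_agent=None):
    """Return the agent module name to load.

    The command-line value wins over the configuration file's 'agent' key,
    which in turn wins over the package default.
    """
    if cli_agent:
        return cli_agent
    return config.get("agent", DEFAULT_AGENT)


from_default = resolve_agent_name({})
from_config = resolve_agent_name({"agent": "my_agent"})
from_cli = resolve_agent_name({"agent": "my_agent"}, cli_agent="other_agent")
```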

FAQs

Is the project updated to the latest Python?

No, it has not been updated for the latest Python version. According to setup.py, it is compatible with Python 3.4.

Is the project compatible with the latest Python 3 version?

Yes, it is expected to work with the latest Python 3 version, but it has only been tested with Python 3.4.

Conclusion

In this article, we have learned about AIXI, a powerful theoretical model of artificial intelligence, and how its implementation in Python opens new possibilities in AI and machine learning. We have also seen how you can use pyaixi, the Python implementation of MC-AIXI-CTW, in your own AI and ML projects.
