@yuichirominato 2019.02.22更新 358views

[game] tic tac toe on ising model.


Introduction

We just solved the sudoku problem so next step we are going to tic tac toe.

It looks that tic tac toe needs another strategy.

Stragtegy

It requires prediction of gaming. We want to predict some step forward but this time we adopt just one step prediction for tic tac toe problem.

First step

Ising machine. He just predict one step so, this time he select just one square from 9.

q0 q1 q2
q3 q4 q5
q6 q7 q8

Select 1 from 9

from blueqat.opt import Opt
a = Opt().add("(q0+q1+q2+q3+q4+q5+q6+q7+q8-1)^2",N=9).run()
print(np.reshape(a,(3,3)))

He selected the middle left square.

[[0 0 0]
 [o 0 0]
 [0 0 0]]

Human

Of course I choose the center one.

[[0 0 0]
 [o x 0]
 [0 0 0]]

Ising

Next step he select the square which is the most high cost strategy to finally win the game. Now we think about vertical or horizontal line.

a = Opt().add("10*(q0+q1+q2+q3+q4+q5+q6+q7+q8-2)^2-(q0+q1+q2)^2-(q3+q4+q5)^2-(q6+q7+q8)^2-(q0+q3+q6)^2-(q1+q4+q7)^2-(q2+q5+q8)^2",N=9).add(np.diag([0,0,0,-100,100,0,0,0,0])).run()
print(np.reshape(a,(3,3)))

He selected the left bottom square.

[[0 0 0]
 [o x 0]
 [o 0 0]]

Human

I selected the square not to let him win.

[[x 0 0]
 [o x 0]
 [o 0 0]]

Ising

This time he needs some strategy to block my turn. He just calculate the highest value of “my” strategy.

a = Opt().add("10*(q0+q1+q2+q3+q4+q5+q6+q7+q8-3)^2-(q0+q4+q8)^2-(q2+q4+q6)^2+q1*0+q3*0+q5*0+q7*0",N=9).add(np.diag([-100,0,0,100,-100,0,100,0,0])).run()
print(np.reshape(a,(3,3)))

He blocked me.

[[x 0 0]
 [o x 0]
 [o 0 o]]

Human

Of course I blocked him.

[[x 0 0]
 [o x 0]
 [o x o]]

Ising

Also the same strategy he will block me.

[[x 0 0]
 [o x 0]
 [o x o]]
 a = Opt().add("10*(q0+q1+q2+q3+q4+q5+q6+q7+q8-4)^2-(q0+q1+q2)^2-(q3+q4+q5)^2-(q6+q7+q8)^2-(q0+q3+q6)^2-(q1+q4+q7)^2-(q2+q5+q8)^2",N=9).add(np.diag([-100,0,0,100,-100,0,100,-100,100])).run()
print(np.reshape(a,(3,3)))

Easy for him.

[[x o 0]
 [o x 0]
 [o x o]]

And human and ising finally.

It’s draw.

[[x o x]
 [o x o]
 [o x o]]

The hardest point for me is to make the cost function of lines. I just used two cost function. One for the vertical and horizontal evaluation and another for the slating line.

The hyper parameters are easily found. The code and strategy is scattering. I try to refactoring it in near future.


Back To Top