Netflix × ZONe “Imawa no Kuni no Anata” (You in Borderland)

Game Development Using a Behavior Generation Engine Based on Reinforcement Learning



In December 2020, Netflix released the Japanese science fiction suspense-thriller drama “Alice in Borderland” (Japanese: 今際の国のアリス), based on the original series of the same name by Haro Aso. To celebrate its release, Qosmo created a new browser game, “You in Borderland” (Japanese: 今際の国のアナタ), in collaboration with the energy drink “ZONe”. The game uses AI to recreate the imaginary world of the drama, in which the characters are engaged in a life-and-death struggle, and asks users, “What would you do if you were to participate?”


In the story of “Alice in Borderland”, Ryohei Arisu (Alice), played by Kento Yamazaki, and others participate in death games where their lives are at stake. This project features one of those death games, “Game of Tag” (Japanese: おにごっこ), and focuses on the psychology of how users would behave if they took part in the same game as a character from the story. In this game the player is not in control. Instead, an AI that acts as the player’s alter ego follows its “instincts” and advances through the story by trial and error. The player answers some questions that appear at the beginning of the game, and parameters are assigned to the character according to the answers. After that, the player cannot control the character directly: the character takes various actions on its own based on the assigned parameters, and the story unfolds until it reaches the goal.

Behavior tree overview in Unreal Engine 4
A behavior tree controls a character’s behavior according to its situation by combining different nodes.


Game AI

State machines and behavior trees have long been used for AI characters in games. We used a behavior tree for this project because it is flexible and easy to customize, which made it the best choice for implementing the specific character AI in “Game of Tag”.
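To make the idea of combining nodes concrete, here is a minimal behavior tree sketch in Python. It is only an illustration under stated assumptions: the node classes, the "demon" chaser logic, and the blackboard key player_visible are all hypothetical and are not taken from the project's actual implementation.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Any, Callable, Dict, List


class Status(Enum):
    SUCCESS = "success"
    FAILURE = "failure"


Blackboard = Dict[str, Any]  # shared facts about the character's situation


@dataclass
class Condition:
    """Leaf node: succeeds when a predicate on the blackboard holds."""
    check: Callable[[Blackboard], bool]

    def tick(self, bb: Blackboard) -> Status:
        return Status.SUCCESS if self.check(bb) else Status.FAILURE


@dataclass
class Action:
    """Leaf node: records the chosen action name and always succeeds."""
    name: str

    def tick(self, bb: Blackboard) -> Status:
        bb.setdefault("log", []).append(self.name)
        return Status.SUCCESS


@dataclass
class Sequence:
    """Composite node: runs children in order, fails at the first failure."""
    children: List[Any]

    def tick(self, bb: Blackboard) -> Status:
        for child in self.children:
            if child.tick(bb) is Status.FAILURE:
                return Status.FAILURE
        return Status.SUCCESS


@dataclass
class Selector:
    """Composite node: tries children in order, succeeds at the first success."""
    children: List[Any]

    def tick(self, bb: Blackboard) -> Status:
        for child in self.children:
            if child.tick(bb) is Status.SUCCESS:
                return Status.SUCCESS
        return Status.FAILURE


# Hypothetical chaser logic: chase the player if visible, otherwise patrol.
demon_tree = Selector([
    Sequence([
        Condition(lambda bb: bb.get("player_visible", False)),
        Action("chase_player"),
    ]),
    Action("patrol"),
])

blackboard = {"player_visible": True}
demon_tree.tick(blackboard)
print(blackboard["log"])  # ['chase_player']
```

Each new situation needs another hand-written branch, which is manageable for a fixed character but grows quickly when behavior must vary with many parameters.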

Reinforcement Learning

The answers to a personality test determine the player parameters, so the number of possible parameter combinations is huge. It is almost impossible to create behavior trees by hand for every combination of parameters, so we set out to implement the player AI using reinforcement learning. Reinforcement learning is a machine learning method in which an agent gradually adapts to its environment through trial and error.
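The trial-and-error loop can be illustrated with a very small example. The sketch below uses tabular Q-learning on a hypothetical six-cell corridor, chosen only because it fits in a few lines. It is not the algorithm or environment used in the project, which the text does not specify, and all constants are assumptions.

```python
import random

N_STATES = 6        # corridor cells 0..5; cell 5 is the goal (hypothetical)
ACTIONS = (-1, +1)  # action 0 steps left, action 1 steps right
ALPHA, GAMMA, EPSILON = 0.5, 0.9, 0.2  # learning rate, discount, exploration


def train(episodes: int = 200, seed: int = 0):
    """Learn action values by repeatedly walking the corridor."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state = 0
        while state != N_STATES - 1:
            # Trial: mostly act greedily, sometimes explore at random.
            if rng.random() < EPSILON:
                action = rng.randrange(2)
            else:
                action = max((0, 1), key=lambda i: q[state][i])
            nxt = min(max(state + ACTIONS[action], 0), N_STATES - 1)
            # Small step cost, larger reward for reaching the goal.
            reward = 1.0 if nxt == N_STATES - 1 else -0.01
            # Error: move the estimate toward the observed outcome.
            target = reward + GAMMA * max(q[nxt])
            q[state][action] += ALPHA * (target - q[state][action])
            state = nxt
    return q


q_table = train()
greedy_policy = [
    max((0, 1), key=lambda i: q_table[s][i]) for s in range(N_STATES - 1)
]
print(greedy_policy)  # [1, 1, 1, 1, 1]: always step toward the goal
```

No one tells the agent that moving right is correct. It discovers this from the rewards alone, which is the property that makes the approach attractive when rules cannot be written by hand.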

How Reinforcement Learning Works

Depending on the results of the personality test, five parameters are assigned to the player: Agility, Memory, Sensitivity, Friendliness and Strength. The system has to generate many different patterns of behavior from these five parameters. For example, a player with a high “Agility” value can move a long distance at once, and a player with a high “Friendliness” value can associate with allies. Our big challenge was to see whether we could generate such a variety of behaviors using reinforcement learning.
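The following sketch shows one way the five parameters could be represented and why hand-authoring a behavior per combination is impractical. It assumes each parameter takes one of five integer levels (1 to 5). It also assumes the parameters are passed to the agent as extra observation values, which is a common way to condition a single policy on such settings but is not confirmed by the text. The class and method names are hypothetical.

```python
from dataclasses import astuple, dataclass
from itertools import product
from typing import List

LEVELS = (1, 2, 3, 4, 5)  # assumption: five levels per parameter


@dataclass(frozen=True)
class PlayerParams:
    agility: int
    memory: int
    sensitivity: int
    friendliness: int
    strength: int

    def as_observation(self) -> List[float]:
        """Scale each level to [0, 1] so it can be appended to an observation."""
        low, high = LEVELS[0], LEVELS[-1]
        return [(value - low) / (high - low) for value in astuple(self)]


# Every possible character under the five-level assumption.
all_combinations = [PlayerParams(*levels) for levels in product(LEVELS, repeat=5)]
print(len(all_combinations))  # 3125

fast_runner = PlayerParams(agility=5, memory=1, sensitivity=3,
                           friendliness=1, strength=1)
print(fast_runner.as_observation())  # [1.0, 0.0, 0.5, 0.0, 0.0]
```

Under these assumptions there are 5^5 = 3125 distinct characters, far too many to design one behavior tree for each.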

Game × Reinforcement Learning

Rather than applying reinforcement learning to an already finished game environment, we needed to develop the game and run reinforcement learning in parallel. We therefore adopted ML-Agents, provided by Unity, as the implementation environment, which let us train character agents inside the game environment developed in Unity. With Unity for game development and ML-Agents as the reinforcement learning library, we could quickly iterate on a cycle of game balancing and training. This cycle speed was essential because the use of reinforcement learning in games is still largely undeveloped.

Debugging screen during training in Unity

In the training phase, the player character learns to “go as fast as possible,” “encounter as few demons as possible,” and “open as many doors as possible,” following the storyline of the drama. The results of the personality test are incorporated into the player character as parameters, and the character takes actions optimized for the combination of these values. The five parameters (Agility, Memory, Sensitivity, Friendliness and Strength) are each graded on five levels, and each has a different effect. “Agility” affects how far the character can move at once, “Memory” affects how long it can remember the goal location after an ally tells it, “Sensitivity” affects its field of vision, “Friendliness” affects how often its allies appear, and “Strength” affects its stamina. As a result, if the user has a high “Sensitivity” rating, the character can detect approaching demons sooner. We designed the game so that the personality test produces characters with a variety of characteristics.
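The three learning objectives can be read as terms of a per-step reward. The sketch below is one hypothetical way to express them. The event fields, the goal bonus, and all weights are assumptions made for illustration and do not come from the project.

```python
from dataclasses import dataclass


@dataclass
class StepEvents:
    """What happened to the character during one simulation step."""
    reached_goal: bool = False
    demons_encountered: int = 0
    doors_opened: int = 0


# Hypothetical weights; real values would need tuning alongside game balance.
TIME_PENALTY = -0.01   # "go as fast as possible"
DEMON_PENALTY = -1.0   # "encounter as few demons as possible"
DOOR_BONUS = 0.2       # "open as many doors as possible"
GOAL_BONUS = 5.0       # assumed bonus for finishing the game


def step_reward(events: StepEvents) -> float:
    """Combine the three objectives into a single scalar reward."""
    reward = TIME_PENALTY
    reward += DEMON_PENALTY * events.demons_encountered
    reward += DOOR_BONUS * events.doors_opened
    if events.reached_goal:
        reward += GOAL_BONUS
    return reward


quiet_step = step_reward(StepEvents())
door_step = step_reward(StepEvents(doors_opened=1))
caught_step = step_reward(StepEvents(demons_encountered=1))
```

With weights like these, a step that opens a door scores higher than an uneventful step, and a step that runs into a demon scores much lower. How the five parameters change what the character can do (movement range, field of vision, and so on) would be handled by the game environment rather than by the reward itself.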

The actual game screen

There are not many use cases of reinforcement learning in game development, but it has great potential. We will continue to use reinforcement learning for procedural design and modeling, agent-based simulation, and a wide range of other fields. For more technical details, please check Qosmo Lab.



  • Planning / Production

    Dentsu+The Dentsu Tec+Qosmo+salvo

  • Creative Director

    Noriaki Onoe (Dentsu Inc.)

  • Planner

    Yuma Kishi (Dentsu Inc.), Yasumasa Mizuno (Dentsu Inc.), Hayato Namiki (Dentsu Inc.)

  • Director

    Hirokazu Tamai, Natsue Kokubo (salvo)

  • Producer

    Takayuki Kurihara (Dentsu Tec Inc.), Fumi Kawachi (Dentsu Tec Inc.), Daisuke Takahashi

  • Project Manager

    Sakiko Yasue (Qosmo, Inc.), Hidehiro Ando (salvo), Shiori Yamazaki (salvo)

  • Technical Director

    Nao Tokui (Qosmo, Inc.)

  • Machine learning / Game Development

    Ryosuke Nakajima (Qosmo, Inc.)

  • Programmer / Researcher

    Bogdan Teleaga (Qosmo, Inc.)

  • Sound Effect

    Fumihito Uekusa

  • Account Executive

    Kazuki Yamamoto (Dentsu Inc.), Yusuke Mizukoshi (Dentsu Inc.), Kazuki Sakurai (Dentsu Inc.)

  • Frontend Engineer

    Harumu Ikeyama (salvo), Hiroyuki Mikami (salvo), Yoshihito Tanaka (salvo)

  • Server-side Engineer

    Yusuke Kawakami (KKcraft Inc.)
