Reinforcement Learning for the Space Turtle
## WHAT IS IT?
This model demonstrates the movement of a turtle to a target patch in a 3D environment according to a policy learned by reinforcement learning.
## HOW IT WORKS
The turtle moves within a 3 x 3 x 3 cube of patches. The target patch is the one with maximal coordinates, i.e. (3 3 3). The patch the turtle starts from can be chosen at random by pressing the 'Set Start Patch' button. The trajectory follows a policy computed with the ReinforcementLearning package in R.
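The policy lookup described above can be sketched as a small reporter. This is a minimal illustration, not the model's own code: the name policy-action is hypothetical, and it assumes that each policy row has the form [state-label action] and that characters 6, 8 and 10 of the state label hold the x, y and z digits, which is what the model's lookup code implies.

```netlogo
;; Hypothetical reporter (not part of the original model).
;; Assumption: every row of the policy is [state-label action], and
;; characters 6, 8 and 10 of state-label are the x, y and z digits.
to-report policy-action [policy x y z]
  ;; keep only the rows whose state label matches the given coordinates
  let matches filter [ row ->
    ((read-from-string item 6 item 0 row) = x) and
    ((read-from-string item 8 item 0 row) = y) and
    ((read-from-string item 10 item 0 row) = z)
  ] policy
  ;; report a placeholder when no row matches
  if empty? matches [ report "none" ]
  ;; the action is the second column of the first matching row
  report item 1 first matches
end
```

With the policy already loaded into model-policy, a call such as `policy-action model-policy 1 1 1` would report the move stored for patch (1 1 1).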
## HOW TO USE IT
The 'Setup' button generates the world (i.e. the cube and the turtle). The turtle's initial coordinates are (1 1 1), and it can start moving from there. Other starting coordinates can be set at random by pressing the 'Set Start Patch' button (after pressing 'Setup'). The 'Go' button/procedure moves the turtle to the target patch.
The 'Move' monitor shows the direction of the current move. When the turtle reaches the target patch, the 'Status' monitor displays "Win". The number of moves is shown on its own monitor.
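The move names shown on the 'Move' monitor can be read as unit steps along the axes. The sketch below is a hypothetical helper (move-offset is not part of the model) that restates, as [dx dy dz] offsets, what the model's movement code appears to do when the turtle keeps its default heading of 90 and pitch of 0.

```netlogo
;; Hypothetical reporter (not part of the original model).
;; Maps a move name to the [dx dy dz] step it produces, assuming the
;; turtle's resting orientation is heading 90, pitch 0.
to-report move-offset [move]
  if move = "forward"  [ report [ 1  0  0] ]  ;; fd 1 at heading 90
  if move = "backward" [ report [-1  0  0] ]  ;; bk 1 at heading 90
  if move = "left"     [ report [ 0  1  0] ]  ;; fd 1 at heading 0
  if move = "right"    [ report [ 0 -1  0] ]  ;; fd 1 at heading 180
  if move = "up"       [ report [ 0  0  1] ]  ;; fd 1 at pitch 90
  if move = "down"     [ report [ 0  0 -1] ]  ;; fd 1 at pitch -90
  report [0 0 0]                              ;; unknown move: no step
end
```

Read this way, reaching (3 3 3) requires only "forward", "left" and "up" steps from any start patch inside the cube.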
## THINGS TO NOTICE
Both the NetLogo model file and the .csv file with the policy should be located in the same directory. The path to the .csv file is hard-coded in the NetLogo code (in the play-model procedure) and must be changed to match the actual location of the file on your computer.
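One possible way to avoid editing an absolute path is to load the policy through a relative file name. The sketch below is an assumption-laden illustration rather than the author's code: load-policy is a hypothetical procedure, and it assumes that NetLogo's current directory is the folder the model was opened from, so that "Policy_m.csv" resolves next to the model file.

```netlogo
extensions [csv]

globals [model-policy]

;; Hypothetical procedure (not part of the original model).
;; Assumption: the current directory is the model's own folder, so the
;; relative file name below points at the .csv stored beside the model.
to load-policy
  let policy-file "Policy_m.csv"
  ifelse file-exists? policy-file
    [ set model-policy csv:from-file policy-file ]
    [ user-message (word "Cannot find " policy-file " in the model's folder.") ]
end
```

Calling such a procedure once, for example at the end of setup, would also avoid rereading the file on every move, which the current play-model procedure does.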
## THINGS TO TRY
For a better view of the turtle's position and trajectory, the 3D View can be orbited or moved.
## NETLOGO FEATURES
This is a 3D model driven by a policy generated by reinforcement learning, using the ReinforcementLearning R package. The '.csv' file with the policy should be located in the same directory as the downloaded NetLogo model.
## CREDITS AND REFERENCES
This model was developed by Victor Iapascurta, MD. At the time of development he was in the Department of Anesthesia and Intensive Care at the University of Medicine and Pharmacy in Chisinau, Moldova / ICU at City Emergency Hospital in Chisinau. Please email any questions or comments to viapascurta@yahoo.com
The model was created in NetLogo 6.2.0, Wilensky, U. (1999). NetLogo. http://ccl.northwestern.edu/netlogo/. Center for Connected Learning and Computer-Based Modeling, Northwestern University, Evanston, IL.
extensions [csv]

globals [
  curr-patch
  start-patch
  status
  move-dir
  px
  py
  pz
  model-policy   ;; reinforcement learning model policy
  selected-move  ;; a particular move selected from the policy
]

to setup
  clear-all
  file-close-all
  crt-world
  draw-cube
  draw-axes
  crt-turtle
  set status "Playing"
  set-start-patch
  set selected-move []
  set model-policy []
  set move-dir 0
  set px 0
  set py 0
  set pz 0
  orbit-right 5
  orbit-down 90
  reset-ticks
end

to draw-axes
  create-turtles 1 [
    set shape "line"
    set heading 90
    set color red
    set size world-width
    stamp
    die
  ]
  create-turtles 1 [
    set shape "line"
    set color orange
    set heading 0
    set size world-height
    stamp
    die
  ]
  create-turtles 1 [
    set shape "line"
    set pitch 90
    set color blue
    set size world-depth
    stamp
    die
  ]
  ask patch max-pxcor 0 0 [ set plabel "x-axis" set plabel-color red ]
  ask patch 0 max-pycor 0 [ set plabel "y-axis" set plabel-color orange ]
  ask patch 0 0 max-pzcor [ set plabel "z-axis" set plabel-color blue ]
end

to draw-cube
  ask patches [
    if (pxcor >= 1 and pxcor <= max-pxcor and
        pycor >= 1 and pycor <= max-pycor and
        pzcor >= 1 and pzcor <= max-pzcor)
    [ set pcolor [0 255 0 20] ]
  ]
end

to crt-turtle
  crt 1 [
    set color yellow
    setxyz 1 1 1
    set heading 90
    set pitch 0
    set roll 0
  ]
end

to crt-world
  resize-world 0 3 0 3 0 3
end

to go
  ifelse (status = "Win")
    [ stop ]
    [ play-model
      tick ]
end

to set-start-patch ;; the patch to start with; coordinates are random
  set px one-of [1 2 3]
  set py one-of [1 2 3]
  set pz one-of [1 2 3]
  set start-patch patch px py pz
  set px 0
  set py 0
  set pz 0
end

to move-to-start
  ask turtles [ move-to start-patch ]
end

to play-model ; the procedure of moving the turtle according to the RL model
  ask turtles [
    set curr-patch patch-here
    ;; loading model policy from a separate .csv file, located in the same folder
    set model-policy csv:from-file "C:/Users/Victor/Desktop/RL/NetLogo_models/Final_RL_S_turtle/Policy_m.csv" ;; "~/path to .csv file/Policy_m.csv"
    ;; iterating over the model policy and selecting the optimal move
    foreach model-policy [ i ->
      if xcor = read-from-string item 6 item 0 i and
         ycor = read-from-string item 8 item 0 i and
         zcor = read-from-string item 10 item 0 i
      ;; selected-move is the particular move selected out of the policy
      [ set selected-move lput i selected-move ]
    ]
    pen-down
    if item 1 item 0 selected-move = "forward" [
      set move-dir "forward"
      fd 1
    ]
    if item 1 item 0 selected-move = "backward" [
      set move-dir "backward"
      bk 1
    ]
    if item 1 item 0 selected-move = "right" [
      set move-dir "right"
      set heading 180
      fd 1
      set heading 90
    ]
    if item 1 item 0 selected-move = "left" [
      set move-dir "left"
      set heading 0
      fd 1
      set heading 90
    ]
    if item 1 item 0 selected-move = "up" [
      set move-dir "up"
      set pitch 90
      fd 1
      set pitch 0
    ]
    if item 1 item 0 selected-move = "down" [
      set move-dir "down"
      set pitch -90
      fd 1
      set pitch 0
    ]
    set selected-move []
    if (patch-here = patch 3 3 3) [
      set status "Win"
      stop
    ]
  ]
end
There is only one version of this model, created almost 4 years ago by Victor Iapascurta.
Attached files
File | Type | Description | Last updated
---|---|---|---
Policy_m.csv | data | Model Policy | almost 4 years ago, by Victor Iapascurta
Reinforcement Learning for the Space Turtle.png | preview | Preview for 'Reinforcement Learning for the Space Turtle' | almost 4 years ago, by Victor Iapascurta
This model does not have any ancestors.
This model does not have any descendants.