#46
[appleeaters]RE: Benchmarking
In real testing, however, teleport deaths and bad teleport locations happen more often than not. This is why my robot scores anywhere from 300 to 20000 over a few games (the low scores are normally caused by teleport deaths).
Thus, the results of a run depend heavily on the teleport location. If outlier results (like the 16-20k games) are going to skew the overall score, that score is no longer a good way of determining how good a robot is; rather, it tells us how lucky the robot is.
|
#47
[appleeaters]RE: Benchmarking
Fidian,
My point wasn't directed so much at your benchmarking script as it was at the judging script. I am routinely getting a standard deviation that is larger than the mean when I run the tests. This indicates that there can be no confidence in the accuracy of the mean. To everyone's point thus far, the mean score is more a measure of luck... or lack thereof.
-Jeff
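To make the "standard deviation larger than the mean" observation concrete, here is a minimal Python sketch. The score list is hypothetical (a few early teleport deaths plus one lucky game), and `summarize` is just an illustrative helper name, not part of any benchmarking script from this thread.

```python
import statistics

def summarize(scores):
    """Return the mean, the sample standard deviation, and their ratio."""
    mean = statistics.mean(scores)
    stdev = statistics.stdev(scores)  # sample standard deviation (n - 1 denominator)
    return mean, stdev, stdev / mean

# Hypothetical scores: mostly early teleport deaths plus one lucky game.
scores = [300, 450, 600, 800, 1200, 20000]
mean, stdev, cv = summarize(scores)
print(f"mean={mean:.1f} stdev={stdev:.1f} stdev/mean={cv:.2f}")
```

A ratio above 1 means a single lucky or unlucky game can move the average more than a typical game contributes to it.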
|
#48
[appleeaters]RE: Benchmarking
With my new algorithm, for game #59, I'm getting:
14 apples, 971 robots, 103 moves = 10904 points in 259.3 seconds
In case you're wondering about the time: it's running on a rather busy 800 MHz PIII laptop (so I turned off the time limit). Is your 3810 point game with a suicide function?
|
#49
[appleeaters]RE: Benchmarking
Here is my latest bot on map 59.
The player ate 14 apple(s), killed 991 robot(s), and made 107 move(s). Score: 11096. Finished in 43.003628015518 seconds.
As you can see, it only took 43 seconds, so my suicide function was never called.
|
#50
[appleeaters]RE: Benchmarking
Here are my results for map 59.
The player ate 13 apple(s), killed 1166 robot(s), and made 115 move(s). Score: 12730. Finished in 40.24 seconds.
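For anyone comparing the map 59 numbers above: the three posted results are all consistent with a score of 100 per apple, 10 per robot, minus 2 per move. That formula is inferred only from the figures posted in this thread (the official scoring rules are not quoted here), so treat the Python sketch below as an assumption that happens to fit the data.

```python
def score(apples, robots, moves):
    # Inferred from the posted results, NOT taken from the contest rules.
    return 100 * apples + 10 * robots - 2 * moves

# (apples, robots, moves, posted score) for the three map 59 results above.
posted = [
    (14, 971, 103, 10904),
    (14, 991, 107, 11096),
    (13, 1166, 115, 12730),
]
for apples, robots, moves, expected in posted:
    assert score(apples, robots, moves) == expected
print("inferred formula matches all three posted results")
```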
|
#51
[appleeaters]RE: Benchmarking
Hi all
I worked on a new AI that is more complex, more accurate, and more time-consuming ... but its results are no better, so I'm back with my first one.
All tests with Tyler's robotslib, with no time limit (P4C 2.4GHz):
map 59: 16 apples, 1332 robots, 118 moves = 14684 points in 57.6 seconds
$tests=100:
Sum: 594 apl, 23569 rob, 6719 mov = 281652 pts, 665.9 sec, 0 fail
Avg: 5.9 apl, 235.7 rob, 67.2 mov = 2816.5 pts, 6.7 sec, 0% fail
StD: 10.83 apl, 322.11 rob, 92.73 mov = 3528.89 pts, 9.71 sec
$tests=1000:
Sum: 5144 apl, 223788 rob, 64305 mov = 2623670 pts, 6523.5 sec, 0 fail
Avg: 5.1 apl, 223.8 rob, 64.3 mov = 2623.7 pts, 6.5 sec, 0% fail
StD: 12.22 apl, 300.25 rob, 100.01 mov = 3310.89 pts, 9.2 sec
Evolution of stats during the 1000 tests (Avg):
100: 5.9 apl, 235.7 rob, 67.2 mov = 2816.5 pts, 6.7 sec
200: 4.6 apl, 212.8 rob, 56.7 mov = 2473.8 pts, 6.3 sec
300: 5.3 apl, 215.2 rob, 61 mov = 2561.1 pts, 6.5 sec
400: 5.4 apl, 229.1 rob, 63.1 mov = 2701.8 pts, 6.6 sec
500: 5.1 apl, 229.9 rob, 62.6 mov = 2687.4 pts, 6.6 sec
600: 5.2 apl, 228.8 rob, 63.4 mov = 2676.5 pts, 6.6 sec
700: 5.1 apl, 222.8 rob, 63.9 mov = 2611.4 pts, 6.5 sec
800: 5.3 apl, 224.2 rob, 65.3 mov = 2637.8 pts, 6.5 sec
900: 5.2 apl, 220.4 rob, 64.4 mov = 2596.1 pts, 6.5 sec
1k: 5.1 apl, 223.8 rob, 64.3 mov = 2623.7 pts, 6.5 sec
My AI's code is ~200 lines long.
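The Avg and StD figures above can be turned into a rough uncertainty range for the mean score. The Python sketch below uses the standard error (StD divided by the square root of the number of tests) and a 1.96 multiplier for an approximate 95% interval. It assumes independent runs and a normal approximation, which is only rough for score distributions as skewed as these; `mean_interval` is an illustrative name.

```python
import math

def mean_interval(avg, std, n, z=1.96):
    """Approximate interval for the true mean: avg +/- z * std / sqrt(n)."""
    se = std / math.sqrt(n)  # standard error of the mean
    return avg - z * se, avg + z * se

# (number of tests, Avg pts, StD pts) as posted above.
for n, avg, std in [(100, 2816.5, 3528.89), (1000, 2623.7, 3310.89)]:
    lo, hi = mean_interval(avg, std, n)
    print(f"n={n}: mean is roughly between {lo:.0f} and {hi:.0f} pts")
```

With 100 tests the range is about 700 points either side of the average; with 1000 tests it narrows to about 200 points either side.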
|
#52
[appleeaters]RE: Benchmarking
Getting over 280,000 points from just 200 lines of code should certainly merit some kind of prize ... "Most Elegant Solution"
By the way... a hearty thanks to the guy(s?) who posted the benchmarking tool and the visualization tool... I've been able to use them to catch my 'bot in the act of doing dumb things.
|
#53
[appleeaters]RE: Benchmarking
I'm getting 11942 on #59 now.
|
|
#54
[appleeaters]RE: Benchmarking
It seems my AI was lucky with the first 100 tests. Has anyone performed more than 100 tests? Could you post your results?
I also ran 1000 tests with the official robotslib (with 60s limit and suicide, sorry):
Sum: 5249 apl, 227515 rob, 66288 mov = 2667474 pts, 6489.7 sec, 0 fail
Avg: 5.2 apl, 226.4 rob, 66 mov = 2654.2 pts, 6.5 sec, 0% fail
StD: 13.07 apl, 283.31 rob, 101.01 mov = 3126.46 pts, 8.06 sec
I think 100 tests really isn't enough to judge an AI.
|
#55
[appleeaters]RE: Benchmarking
Sum: 588 apl, 22453 rob, 6569 mov = 270192 pts, 3559.4 sec, 0 fail
Avg: 5.9 apl, 224.5 rob, 65.7 mov = 2701.9 pts, 35.6 sec, 0% fail
StD: 13.17 apl, 298.07 rob, 107.1 mov = 3305.8 pts, 34.12 sec
If I dial in the parameters to excel on the benchmark set, it doesn't do as well on other sets of 100... this configuration is a compromise... Did we ever find out if we can submit more than one entry? I think it would be more challenging if only one entry was allowed.
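One way to catch parameters that only excel on the benchmark set is to tune on one group of maps and then check the winner on a separate group it never saw. The Python sketch below illustrates the idea with made-up scores; the configuration names, map ids, and `pick_config` helper are all hypothetical.

```python
import statistics

def pick_config(scores_by_config, tune_ids, holdout_ids):
    """Pick the config with the best mean on tune_ids; also report its holdout mean."""
    def mean_on(config, ids):
        return statistics.mean(scores_by_config[config][i] for i in ids)

    best = max(scores_by_config, key=lambda c: mean_on(c, tune_ids))
    return best, mean_on(best, tune_ids), mean_on(best, holdout_ids)

# Hypothetical per-map scores for two parameter settings.
scores_by_config = {
    "aggressive": {1: 9000, 2: 400, 3: 300, 4: 500},
    "balanced": {1: 3000, 2: 2500, 3: 2800, 4: 2600},
}
best, tuned, held_out = pick_config(scores_by_config, tune_ids=[1, 2], holdout_ids=[3, 4])
print(f"{best}: {tuned} pts on tuning maps, {held_out} pts on held-out maps")
```

A large drop from the tuning maps to the held-out maps suggests the parameters were fitted to luck on the tuning set rather than to the game itself.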
|
#56
[appleeaters]RE: Benchmarking
It is legal to submit multiple entries.
|
|
#57
[appleeaters]RE: Benchmarking
But it is not allowed to submit the same script twice as two entries, is it?! That would be very unfair. Please don't allow such double entries.
Mario S.
|
#58
[appleeaters]RE: Benchmarking
It is not legal to submit identical entries.
|
|
#59
[appleeaters]RE: Benchmarking
Sorry if this sounds stupid, but what is the purpose of identical submissions? They will certainly be ranked the same, so what does it matter? What is the policy for multiple submissions? I guess if 2 submissions from the same person take positions 2 and 3, the one ranking 3rd would be dismissed... is that right?
I ask because my 2 entries ARE identical, except that I changed some constants/parameters!
|
#60
[appleeaters]RE: Benchmarking
I have the same question. I submitted two scripts, but just changed some parameters. Other than that, they are the same.
|
| Viewing: Codewalkers Forums > PHP Contests > Older Contests > [appleeaters]Benchmarking |