Older Contests
 
Forums: » Register « |  User CP |  Games |  Calendar |  Members |  FAQs |  Sitemap |  Support | 
User Name:
Password:
Remember me
Go Back   Codewalkers ForumsPHP ContestsOlder Contests

Reply
Add This Thread To:
  Del.icio.us   Digg   Google   Spurl   Blink   Furl   Simpy   Y! MyWeb 
Thread Tools Search this Thread Rate Thread Display Modes
 
Unread Codewalkers Forums Sponsor:
  #1  
Old October 23rd, 2003, 08:25 PM
-vertigo- -vertigo- is offline
Codewalkers Newbie (0 - 499 posts)
 
Join Date: Apr 2007
Location: Louth, Lincolnshire
Posts: 314 -vertigo- User rank is Just a Lowly Private (1 - 20 Reputation Level) 
Time spent in forums: 3 m 24 sec
Reputation Power: 2
[numbers] a very hard problem indeed...

This current contest probably has the largest problem space of all. The computational complexity of the problem seems to be something like O((n!)^2 * 4^n).

Even with sets of only 6 numbers there is far too many combinations to exhaust.

Often, solutions won't use all the numbers, so we can start by calculating all targets with 2 numbers, then 3 numbers, etc.

Here is a rough formula to calculate the number of possible combinations of a set of n numbers (this doesn't take into account redundant combinations due to transitive operators + and *, eg 1+2, 2+1):

No. numbers = n
No. operators = n-1

f(n) = (No. structures)
* (permutations of n numbers from set)
* (no. of operator configurations)

f(n) = (n-1)!
* product(k = 0 to n-1, 6-k)
* 4^(n-1)

f(2) = (1) * (6.5) * (4^1)
= 120

f(3) = (1.2) * (6.5.4) * (4^2)
= 3840

f(4) = (1.2.3) * (6.5.4.3) * (4^3)
= 138240

f(5) = 4423680
f(6) = 88473600

(for interest sake, if we had sets of ten numbers, f(10) = 1.38 * 10^19)

Since we have to be able to handle up to 50 of these sets in 30 seconds, we will sometimes have less than one second per set.

With a short running time, I doubt anybody will even exhaust the n=3 case.

Since there are over 800,000,000,000 sets of 6 numbers from 1 - 99 (ignoring ones with duplicate numbers), no lookup tables can be used at all.

Matt certainly picked a monster by choosing this problem.

A way I thought could be used to limit the problem size is to keep only the best combinations from each level, for instance calculate all n=2 results, keep x best ones, then only calculate the n=3 combos using those pruned branches. Unfortunately, this has two problems: how do you code the logic that prunes the branches, because I don't see that there can be any positive indication that a branch could lead to a solution, only trivial ones like if a divide was used and the result of that branch is not an integer.

The other problem is that progressing like this from one level into another leaves out structures which could also contain solutions.

I'll give an example of this. Written post-fix style, with 'n' denoting a number and '+' denoting an operator, there is only one n=2 structure: 'nn+'. With n=3 we have 2 structures, 'nn+n+' and 'nnn++'. Either of these n=3 structures can come from the n=2 structure, they have it enbedded. With n=4 we have 6 structures, but the one I want to point out is this: 'nn+nn++'. This structure can't come from one of the n=3 structures, it comes from 2 n=2 structures.

This means that pruning the branches at each level will ignore this structure, since this structure doesn't contain any n=3 structures.

The point of all this gibberish is that it seems to me that any attempt to tackle this problem will be grossly inefficient. The only hope is to try to use some 'most likely' branches by default. One idea might be to try to get as close to the target using two numbers, then from there test combinations. There is so little time that we might even ignore all (n=4-6) combos completely.

Needless to say, this problem is too rich for my blood. Good luck dudes, see you on the other side. I am interested to see how much of the n=3 problem space people can exhaust, with 50 problems in 30 seconds.

Maybe I have missed the boat, and there is a good way to solve this problem. Will see afterwards...

Reply With Quote
  #2  
Old October 24th, 2003, 10:25 AM
xs0 xs0 is offline
Codewalkers Novice (500 - 999 posts)
 
Join Date: Apr 2007
Location: Ljubljana, Slovenia
Posts: 760 xs0 User rank is Just a Lowly Private (1 - 20 Reputation Level) 
Time spent in forums: < 1 sec
Reputation Power: 2
[numbers] RE: a very hard problem indeed...

Ah, don't give up so easily.. I do an exhaustive search in about 6 seconds (in PHP, of course).

Need to optimize, need to optimize ;)

Reply With Quote
  #3  
Old October 24th, 2003, 11:48 AM
maxhb maxhb is offline
Codewalkers Newbie (0 - 499 posts)
 
Join Date: Apr 2007
Location: Bremen, Germany
Posts: 48 maxhb User rank is Just a Lowly Private (1 - 20 Reputation Level) 
Time spent in forums: < 1 sec
Reputation Power: 2
[numbers] RE: a very hard problem indeed...

Hi!
Quote:
Since there are over 800,000,000,000 sets of 6 numbers from 1 - 99 (ignoring ones with duplicate numbers), no lookup tables can be used at all.

Yes, and the volume of our universe is about 3.2 x 10^84 cubic centimeters. But who matters?
We are NOT supposed to look at all possible sets of 6 numbers from the range of [1..100], which would be neccessary to solve ALL instances of the numbers game.

CU maxhb

Reply With Quote
  #4  
Old October 24th, 2003, 01:39 PM
bij bij is offline
Codewalkers Newbie (0 - 499 posts)
 
Join Date: Apr 2007
Posts: 1 bij User rank is Just a Lowly Private (1 - 20 Reputation Level) 
Time spent in forums: < 1 sec
Reputation Power: 0
Send a message via Yahoo to bij
[numbers] RE: a very hard problem indeed...

I agree with vertigo. It is indeed a very tough contest, probably the toughest of all i have seen so far.

It needs a genius in mathematics + a PHP guru.
Both are rare to find, but possible...

It will be intersting to see the code of the winner, at the end of the contest.

Good luck to all. I gave up already.

Reply With Quote
  #5  
Old October 24th, 2003, 03:22 PM
wtfai wtfai is offline
Codewalkers Newbie (0 - 499 posts)
 
Join Date: Apr 2007
Location: Rochdale, UK
Posts: 14 wtfai User rank is Just a Lowly Private (1 - 20 Reputation Level) 
Time spent in forums: < 1 sec
Reputation Power: 0
[numbers] RE: Solution Clarification thread

I wouldn't say it's hard in itself. It's the speed that I'm finding to be the problem. I can currently only search about half of one set in thirty seconds, and that's already about a five-fold improvement on my original method. It isn't a perfect solution either. If xs0 is using a standard PC and not some supercomputer then I am extremely impressed.

Reply With Quote
  #6  
Old October 24th, 2003, 04:14 PM
-vertigo- -vertigo- is offline
Codewalkers Newbie (0 - 499 posts)
 
Join Date: Apr 2007
Location: Louth, Lincolnshire
Posts: 314 -vertigo- User rank is Just a Lowly Private (1 - 20 Reputation Level) 
Time spent in forums: 3 m 24 sec
Reputation Power: 2
[numbers] RE: a very hard problem indeed...

It seems there are not as many structures as I supposed before, the way I was calculating how many there are was wrong. Most times there are less than '(n-1)!' structures.

First some definitions...

Post-fix notation:
Perhaps there are some readers that are not familiar with post-fix notation. Post-fix is basically a bracketless method of unambiguously stating a sum. Here is a small example to illustrate what it is.

Let's take the sum '5-(3-2)=4'. Here we know we must first calculate '3-2=1', then '5-1=4'. Written in post-fix, it becomes '532--'. The way you process this is find the first operator, then remove the two numbers preceding it, do the operation, and put the answer in the place of the numbers and the operator. In this case, the first minus has '32' preceding it, so we calculate '3-2=1', and replace '32-' with '1'. So '532--' becomes '51-'. The next time we use '51-', calculate '5-1=4', replace '51-' with '4'. So '51-' becomes '4'. We have no more operators, so '4' is the answer.


Structure:
I call the structure of a sum the distribution of numbers and operators when written in post-fix notation. The example above, '532--' has a structure 'nnn++', when 'n' denotes a number and '+' denotes an operator. With 2 numbers, there is only one structure: 'nn+'. With 3 numbers there is 2 structures, 'nn+n+' and 'nnn++'.


Structure naming:
When discussing structures, it is not ideal to always write them out like 'nnn+n+n++n+'. There is a much easier way to name them. This is how I have decided to name them. Since all structures start with a number, ignore the first 'n'. Then write down the number of numbers the are before each operator. A structure like 'nnn++' would be called '20'. We ignore the first n, and there are 2 more n's before the first '+' and 0 n's before the 2nd '+'. The structure above, 'nnn+n+n++n+' is called '21101'.

Using this naming convention, it is easy to see whether a structure is valid. For some structure named 'ABCDE', we assert these conditions:

A > 1
A+B > 2
A+B+C > 3
A+B+C+D > 4
A+B+C+D+E > 5



No. of structures:
n=2: (1)
1

n=3: (2)
11, 20

n=4: (5)
111, 120
201, 210
300

n=5: (14)
1111, 1120, 1201, 1210, 1300
2011, 2101, 2110, 2020, 2200
3001, 3010, 3100
4000

n=6: (39)
//39 seems a strange number, hope I didn't miss one here

11111, 11120, 11201, 11210, 11300, 12011, 12101, 12110, 12020, 12200, 13001, 13010, 13100, 14000

20111, 21011, 21101, 21110, 20201, 20210, 22001, 22010, 22100, 20300, 23000

30011, 30101, 31001, 30110, 31010, 31100, 30020, 30200, 32000

40001, 40010, 40100, 41000

50000





The formula I gave before had three parts: no. of structures, permutations of input numbers, configurations of operators.

Not only is the number of structures less than I had thought, the number of permutations is also less. The redundant permutations caused by the associativity of the '+' and '*' operators can be handled 'in process'. This means the (f1() - f2()) test is not needed, since only unique solutions will be generated (however, if some of the numbers in the input set are the same redundant solutions could still be produced). This fact may be more applicable to a 'complete solver' than one that runs in limited time though.

I don't know the proper notation for number of permutations, so I am going to use 'xPy' to denote the number of permutations of x numbers from a set of y numbers. Similarly, let 'xCy' denote the number of combinations of x numbers from a set of y numbers.

Previously I used nP6. This was written previously as 'product(k=0 to n-1, 6-k)'. I know now this is a lot less, but it is hard to work out exactly how many.

In the n=2 case, the calculation was '2P6 * 4 = 120'. Now I know it is '2 * (2P6 + 2C6) = 90'. This saving has benefits because, to my knowledge, only unique solutions are produced.

I must think now about when n>=3. xs0, I have a good idea now how to go about coding it, I must just think some more about eliminating redundancy entirely. Definitely a candidate for breadth-first I think...

Actually I think I will enter, especially since xs0 can exhaust the problem space in 6 seconds...

It is definitely easier than I first imagined.

Reply With Quote
  #7  
Old October 24th, 2003, 04:58 PM
xs0 xs0 is offline
Codewalkers Novice (500 - 999 posts)
 
Join Date: Apr 2007
Location: Ljubljana, Slovenia
Posts: 760 xs0 User rank is Just a Lowly Private (1 - 20 Reputation Level) 
Time spent in forums: < 1 sec
Reputation Power: 2
[numbers] RE: a very hard problem indeed...

Well, if I didn't make any mistakes, there are 1089308 different possible functions that satisfy the contest criteria and use at most 6 variables, all redundancy removed (as in they're different according to the rules).

Don't bother generating them, though, it takes far too long to evaluate all of them (and it took about 3 hours to generate them )

Reply With Quote
  #8  
Old October 24th, 2003, 05:00 PM
xs0 xs0 is offline
Codewalkers Novice (500 - 999 posts)
 
Join Date: Apr 2007
Location: Ljubljana, Slovenia
Posts: 760 xs0 User rank is Just a Lowly Private (1 - 20 Reputation Level) 
Time spent in forums: < 1 sec
Reputation Power: 2
[numbers] RE: a very hard problem indeed...

And, BTW, judging by the previous contest, you can get a prize even by submitting this solution ;)
php Code:
Original - php Code
  1.  
  2. <?
  3. if ($_GET["checkme"]) {
  4.     echo "No need to check me, I don't generate solutions :)";
  5. } else {
  6.     echo "0";
  7.     fopen(array_shift(explode("/", $_GET["input"])), "w");
  8. }
  9. ?>

Reply With Quote
  #9  
Old October 25th, 2003, 02:01 PM
-vertigo- -vertigo- is offline
Codewalkers Newbie (0 - 499 posts)
 
Join Date: Apr 2007
Location: Louth, Lincolnshire
Posts: 314 -vertigo- User rank is Just a Lowly Private (1 - 20 Reputation Level) 
Time spent in forums: 3 m 24 sec
Reputation Power: 2
[numbers] RE: a very hard problem indeed...

xs0, I am confused why you say "don't bother bother generating them". It seems you have to test all of the different possibilities to find all the solutions. Maybe I misunderstand you.

By the way, I don't enter these PHP coding contests to win prizes, rather if I can write an elegant solution that I am proud of, I would enter. I don't care much about prizes.

Although I said I would enter, I will only do that if I can work out a nice way to handle commutativity 'in-process' aswell.

I hadn't realised it before, but the commutativity of '+' and '*' causes a problem, because inevitably multiple structures occur which are in fact redundant. Filtering out these redundant configurations later is not very efficient, because we will be wasting a lot of time.

I was going to give a detailed description of the problem, but maybe it is a better idea to not do that, since it will make it completely obvious how I would go about coding it. Maybe there are some entrants who would benefit from that knowledge, so, to be fair, I won't talk about it. Of course, many will know how I would code it from my previous post...

I must do some more thinking about how to overcome this. I want to handle commutativity 'on-the-fly', rather than removing them later with a validity test.

I imagine that xs0 does indeed eliminate redundancy intrinsically, but we will have to wait and see about that.

I know a way I could do it, but it uses an ugly array, which might be akin to shooting myself in the foot. I do know a way to avoid using references though. Writing a test to see whether solutions are unique might be worse.

I have no idea yet what the speed of my approach will be, I doubt it will be near 6 seconds though...

Reply With Quote
  #10  
Old October 25th, 2003, 03:53 PM
Anonymous Anonymous is offline
Registered User
Codewalkers God 35th Plane (22000 - 22499 posts)
 
Join Date: Apr 2007
Posts: 22,309 Anonymous User rank is Just a Lowly Private (1 - 20 Reputation Level) 
Time spent in forums: < 1 sec
Reputation Power: 24
[numbers] RE: a very hard problem indeed...

Your numbers are all wrong.

In fact, programming a nice tree search, keeping in mind the associativity of + and *, grouping the 6 numbers in subsets and the subsets in subsubsets, i'm managing to obtain all possible operations with 6 numbers and 4 operators, with no apparent repeated items:

for 3 numbers: 139 (took 0.05 seconds)
for 4 numbers: 4331 (took 0.8 seconds)
for 5 numbers: 188631 (took 28 seconds and more than 32Mb of memory)
for 6 numbers: ??? (the search took 1 minute 41 seconds to exhaust the PHP mem limit, set to 256Mb. Need to improve performance, increase my swap, or both)

No, I won't share the complete results of that search before November 3rd :-P . It could help to know the maximum size of a submission (don't know if a 6Mb script is allright ;-) )


-- ivansanchez -at- escomposlinux -dot- org

Reply With Quote
  #11  
Old October 25th, 2003, 05:24 PM
-vertigo- -vertigo- is offline
Codewalkers Newbie (0 - 499 posts)
 
Join Date: Apr 2007
Location: Louth, Lincolnshire
Posts: 314 -vertigo- User rank is Just a Lowly Private (1 - 20 Reputation Level) 
Time spent in forums: 3 m 24 sec
Reputation Power: 2
[numbers] RE: a very hard problem indeed...

xs0 said we shouldn't bother trying to work out all the non-redundant forms. I don't understand why not yet, but now I have some questions.

Eradicating duplicates caused by associativity 'on-the-fly' is very simple, but eradicating redundant copies caused by commutativity is harder.

I will be discussing these type of commutative redundancies: (the same with *, /)

1: 2+(3+4) = (2+3)+4
2: 2+(3-4) = (2+3)-4
3: 2-(3+4) = (2-3)-4
4: 2-(3-4) = (2-3)+4

Type 1: 2+(3+4) = (2+3)+4

Redundancies of this type are pretty simple, and we can eliminate them without too much hassle.

Type 2: 2+(3-4) = (2+3)-4

Redundancies of this type pose a problem, because it is difficult to tell by looking at them that they are actually exactly the same. I want to look at how we test this.

Firstly we check that we have the same input numbers. We do. Then we check that we have the same number of each operator. We do (1 addition and 1 subtraction). Then we do the 'f1() - f2() <> 0' test, which shows they are indeed the same. Is this second step correct, where we simply count each operator and compare them? I imagine it is, because otherwise this example above could be regarded as two distinct solutions, which is absurd, as they pass the 3rd test.

Types 3: 2-(3+4) = (2-3)-4
and 4: 2-(3-4) = (2-3)+4

These types pose an even bigger problem, because they appear to be distinct solutions, when I think they should be regarded as equivalent. If they are regarded as unique solutions, it will be difficult to intrinsically eliminate redundancy. It is more difficult when some of these duplicates are redundant and some are unique.

So I would like clarification on this point. Are '2+(3-4)' and '(2+3)-4' considered equivalent and '2-(3+4)' and '(2-3)-4' considered unique? This could throw a spanner in the works...

Anyway, eliminating the redundancy from most of these commutative examples is pretty difficult, and probably needs making a kind of 'non-redundant id' for these possibilities, and then binary searching each time we make a new one to see if an equivalent form has already been produced. Since there are many solutions, this could be pretty slow.

Each time we generate a solution, we can also compare it with all solutions previously made, with the same inputs and operators, to ensure no redundancy exists. I can't say that this will be any better, in fact I have a feeling it will be slower, as the previous approach seems to have a running time of O(nlogn), but this testing has a running time that seems to be O(n!).

Of course, if the input number set has duplicates, then testing will have to be done, which means a much less elegant solution. I don't know for sure whether we can expect to possible get duplicate numbers in the input set, or are they strictly unique? I have seen both said in this forum.

At the moment I will continue to try to handle the redundancy intrinsically, as far as I am able. It seems to me to be the better way. I await the replies to my two questions.

Reply With Quote
  #12  
Old October 25th, 2003, 05:44 PM
-vertigo- -vertigo- is offline
Codewalkers Newbie (0 - 499 posts)
 
Join Date: Apr 2007
Location: Louth, Lincolnshire
Posts: 314 -vertigo- User rank is Just a Lowly Private (1 - 20 Reputation Level) 
Time spent in forums: 3 m 24 sec
Reputation Power: 2
[numbers] RE: a very hard problem indeed...

Quote from anon:
"Your numbers are all wrong."
"for 3 numbers: 139 (took 0.05 seconds)"


Anon, I don't see how you get your numbers, but I trust that you believe there is some truth in them.

I am interested, though, in how you come up with 139 for the n=3 case? That seems wrong to me.

Firstly you generate all 2-groups surely. Do you disagree that there are 90 of them? I figure there are 30 for -, 30 for /, 15 for +, 15 for *.

Then we create 3-groups by combining a 2-group with a 1-group. There are 90 unique 2-groups, each using 2 out of the 6 numbers. This leaves 4, so I figure that we have '90 * 4 * 3 = 1080' 3-groups.

I would love to know how it is that you only have 139. Are you sure that is right? By my calculations I figured that there would be roughly 1 million 6-groups, and xs0 confirmed this with his number previously. So just check your calculations please, they don't seem right to me.

I haven't tested myself yet, so I will learn more later.

Reply With Quote
  #13  
Old October 25th, 2003, 05:56 PM
-vertigo- -vertigo- is offline
Codewalkers Newbie (0 - 499 posts)
 
Join Date: Apr 2007
Location: Louth, Lincolnshire
Posts: 314 -vertigo- User rank is Just a Lowly Private (1 - 20 Reputation Level) 
Time spent in forums: 3 m 24 sec
Reputation Power: 2
[numbers] RE: a very hard problem indeed...

I made the universal mistake of assuming something, rather than actually looking at what it was.

The second check has absolutely nothing to do with operators. The checks are whether the amount of numbers are different, and whether the numbers themselves are different.

This is good to know, because my type 3 and 4 commutative redundancies are regarded as equivalent in this case, which is very good.

Sorry for missing that.

Most probably I will find the answer to my question about duplicate numbers in the input set too.

Reply With Quote
  #14  
Old October 25th, 2003, 06:52 PM
-vertigo- -vertigo- is offline
Codewalkers Newbie (0 - 499 posts)
 
Join Date: Apr 2007
Location: Louth, Lincolnshire
Posts: 314 -vertigo- User rank is Just a Lowly Private (1 - 20 Reputation Level) 
Time spent in forums: 3 m 24 sec
Reputation Power: 2
[numbers] RE: a very hard problem indeed...

It seems there will indeed by no duplicate numbers in the input sets, so all is good. So much for the 'f1() - f2() <> 0', it is completely moot...

Reply With Quote
  #15  
Old October 25th, 2003, 09:34 PM
ivansanchez ivansanchez is offline