Thursday, October 18, 2007

Word Numbers

If the integers from 1 to 999,999,999 are written as words, sorted alphabetically, and concatenated, what is the 51 billionth letter?

To be precise: if the integers from 1 to 999,999,999 are expressed in words (omitting spaces, 'and', and punctuation[1]), and sorted alphabetically so that the first six integers are:

eight
eighteen
eighteenmillion
eighteenmillioneight
eighteenmillioneighteen
eighteenmillioneighteenthousandand the last is
twothousandtwohundredtwo

... then reading top to bottom, left to right, the 28th letter completes the spelling of the integer "eighteenmillion".

The 51 billionth letter also completes the spelling of an integer. Which one, and what is the sum of all the integers to that point? [Bonus points for algorithms that will find the result for an arbitrary position, not just the 51st - Ben]

[1] For example, 911,610,034 is written "ninehundredelevenmillionsixhundredtenthousandthirtyfour"; 500,000,000 is written "fivehundredmillion"; 1,709 is written "onethousandsevenhundrednine".

The above was taken from the following website : http://www.itasoftware.com/careers/puzzles07.html

** Discussion of possible solutions, ideas, thought processes, etc is encouraged. If providing a solution, please offset the algorithm/idea in some way so that others may look around it. **

10 comments:

Benjamin P Lee said...

Writing up a program to do this is logically pretty simple assuming you don't worry about memory usage, etc and actually produce the list or some portion of it.

I found this last night before leaving work and bounced a few ideas around in my head on my way home.

I think I have come up with a way to make it relatively fast and with a minimal memory footprint but it will require a fair amount of prework.

I will post it when I know it works.

Josh Schramm said...

In the interest of making this interesting I dont have a solutiuon yet but I'm going to post a bit of my thought process.

The first thing i asked is, what are the real constraints of the problem. At first the problem seems incredably overwhelming but in reality it's much smaller than you think.

Between one and ninehundrednintyninemillionninehundrednintyninethousandninehundredninetynine there are exactly 30 distinct words.

one through twenty, then every ten digit place, then hundred, thousand, million.

My first instinct is that the counting doesnt matter at all, you're actually concerned with the possible combinations of those thirty values.


So what im trying to do now is come up with groupings. Certain numbers can only be used a certain number of times in a given number. Certain can be used multiple times, and some can only be in specific places.

Benjamin P Lee said...

Great minds think alike. Thats pretty much the road I was going down.

There are 30 "words" to be used, but they can be clasified as single digits (group A), two digits starting with 1 (B), tens >= 20 (C), "hundred", "thousand", "million".

Valid numbers as words must start with something from group A, B or C.

Just like you I figured you can build/define some sort of tree like structure to show all posibilities (I first tried a BNF grammer but couldn't perfect it prior to having to do more work) that way you can say for every group A starting number there are XXXX possibles under it, that way you can skip whole chunks of numbers and only need to walk the high levels of the tree until the next path would be more than your desired number,then go down a level and continue......

it seems like it would work, still trying to verify a few things though.

Benjamin P Lee said...

one problem with my logic is that the words-numbers are not grouped by first word.

e.g. the list starts with

eight
eighteen
.... rest of the eighteenxxxxxxx
.... rest of the eightxxxxxx

i.e. all of the eights are not be eachother. makes grouping them less helpful than I thought.

Josh Schramm said...

well something i figured out is they follow a consistant pattern. Each set of three has the pattern
(group A) hundred ((group C && group A) || (group B) seperator

where separator is million or thousand. SO a full number has the pattern.

**Ac1 = choose 1 from group A

(Ac1) hundred ((Cc1 && Ac1) || (Bc1) million (Ac1) hundred ((Cc1 && Ac1) || (Bc1) thousand
(Ac1) hundred ((Cc1 && Ac1) || (Bc1)

the only case being any given component can be null from left to right

Josh Schramm said...

bbah my parens are wrong in that last example but you get the point.

matt said...

i've got a "solution" but would require matlab to test it as there's a line i'm not sure would actually work...

i'd write it in .php but i don't know how to write a subroutine in php... oh well, i'll email you both the file i wrote, should be understandable enough since matlab is based on C

Benjamin P Lee said...

I am trying to find a copy of matlab via the internets but haven't yet, so I guess we can assume your solution worked (I will look at it more tonight, have been busy last few days).

I am surprised that betwene all of us, we weren't able to come up with a really clever solution. anyone else have any ideas??

Unknown said...

If you happen to still have a UD login (for some reason my email still works and I can probably log in to most systems still, plus they keep sending me info about applying for graduate school), you can just get Matlab off of the software site. I still have Matlab on my laptop, but I think the license ran out.

Benjamin P Lee said...

A) I think my was turned off ~1 year after I was done
B) I can't remember the software store URL
C) I wanted it for Mac
D) Its not a big deal, but thanks for the idea.

I say we agree that Kooz's solution worked, and push on. Hopefully, between the lot of us, we can actually solve future ones without any dispute.