# Latest content was relocated to https://bintanvictor.wordpress.com. This old blog will be shutdown soon.

## Sunday, June 21, 2015

### BBG algo IV - locate a pair producing a desired sum

Q: Given an unsorted array of positive integers, is it possible to find a pair of integers from that array that sum up to a given sum?

Constraints: This should be done in O(n) and in-place without any external storage like arrays, hash-maps, but you can use extra variables/pointers.

If this is not possible, can there be a proof given for the same?
-----Initial analysis----
I wish I were allowed to use a hash table of "wanted " values. (iterate once and build hashtable. For Each new value encountered, check if it is in the "wanted" list...)

I feel this is typical of west coast algorithm quiz.

I feel it's technically impossible, but proof?  I don't know the standard procedure to prove O(n) is impossible. Here's my feeble attempt:

Worst case -- the pair happens to be the 1st and last in the array. Without external storage, how do we remember the 1st element?  We can only use a small number of variables, even if the array size is 99999999. As we iterate the array, I guess there would be no clue  that 1st element is worth remembering. Obviously if we forget the 1st element, then when we see the last element we won't recognize they are the pair.
-----2nd analysis----
If we can find a O(n) sort then problem is solvable. Let's look at Radix sort, one of the most promising candidates.

Assumption 1: the integers all have a maximum "size" in terms of digits. Let's say 32-bit. then yes radix is O(n). Now, with any big-O analysis we impose no limit on the sample size. For example we could have 999888777666555444333 integers. Now, 32-bit gives about 4 billion distinct "pigeon-holes", so by the pigeon-hole principle most integers in our sample have to be duplicates! Sounds artificial and unreasonable.

Therefore, Assumption 1 is questionable. In fact, some programming languages impose no limit on integer size. One integer, be it 32 thousand bits or 32 billion bits, could use up as much memory as there is in the system. Therefore, Assumption 1 is actually superfluous.

Without Assumption 1, and if we allow our sample to be freely distributed, we must assume nothing about the maximum number of digits. I would simply use

Assumption 2: maximum number of digits is about log(n). In that case radix sort is O(n log(n)), not linear time:(