Friday, November 1, 2013

Digital Flights by R.F.I.

To the tune of "Paper planes" by M.I.A.
karaoke version

We get high like rockets, and fly like planes
We launch you at the web we got “Fuel” in our name
If you come around here, we make bids all day
Get conversion in a second if you wait

Sometimes we think sitting on chairs
Every campaign we get we’re clocking that game
Every client’s a winner now we're giving them fame
Insight Booster making our name

All I wanna do is Bid! Bid! Bid! Bid!
Make smarter money

MediaMath and also AdRoll
Criteo, Quantcast, DataXu, Turn
Running when we hit 'em
Lower CPA wi’our system

No one on the exchange has Regulus
Hit me on my Synergy Lift Booster
We perform and deliver and make you bucks
Already going hell just pumpin’ those ads

All I wanna do is Bid! Bid! Bid! Bid!
Make smarter money

World class datacracy
Yeah, we got more info than the K.G.B.
But just to get yo’ business
We already do!

All-ah All-ah All-ah All I deliver
None-ah-none I let down
All-ah All-ah All-ah All I deliver

None-ah-none I let down

Saturday, August 3, 2013

Two stories – Oakland First Fridays, Black Ravens and Pink Flamingoes

Only one of the following two stories is true.

Oakland First Fridays: Maya and Elsa were with me at work yesterday, August 2nd, and overheard me making plans to go to Oakland's First Friday – a street/gallery food/art/music/people-watching festival. Maya asked whether I was going to take them. I said, “No, you will be with your mom tonight.”
Elsa - “Are we with you next Friday?”
I - “Yes.”
Elsa - “Can we go with you to Oakland First Friday next week?”

Black Ravens and Pink Flamingoes:
Maya, Elsa and I were walking along the bay in Redwood Shores with Heather, a colleague. It was close to dusk and we saw a number of birds: a small number of red-winged blackbirds, crows, moor hens and cormorants, all mostly black; and a few killdeer, snowy egrets, a mourning dove nesting in a flowerpot at the entrance to Heather and Chris' home, scrub jays, great blue herons and clapper rails, all non-black. We also saw nine black ravens. I casually said, “Hmm, looks like all ravens are black.”, which seemed to go unnoticed at the time. Soon after, we saw another raven, which was also black. Heather said, “Oh, that raven is black, looks like you are right that all ravens are black.”

A short while later, Maya pointed out a pink flamingo.

Maya said, “Look Dad, a pink flamingo! Looks like you are even more right that all ravens are black!”

“What in the world do pink flamingoes have to do with black ravens?”

“Daa-ad! They do, I read it in my 'Great Philosophers' book and searched for it online! You said that 'All ravens are black.' That is logically equivalent to its contrapositive, 'All non-black objects are non-ravens.' When Heather saw an additional raven and it was black, she pointed out (and you did not disagree) that it was evidence in favor of your original proposition that 'All ravens are black.' The contrapositive I just mentioned is logically equivalent. So when we see a non-black object (it is pink) and it is not a raven (since it is a flamingo), you should agree that it is evidence in favor of 'All ravens are black.'!”

“Oh god! Can't you just read some normal book about tribes of cats or that girl Parsnip or something who shot her little sister with an arrow? I suppose you are right!”
“Dad, okay, it seems to make sense logically, but I don't know if it is true in some statistical sense, all that “SQL, SQL, standard error and p-value” stuff you keep talking about now.”

“Let's take a crack at it. Let's denote
Proposition 1: “All ravens are black.” For the sake of simplicity let's just say that we are talking about birds and that we are trying to see if (the set of individual birds that are) ravens have some special property that distinguishes them from all birds (as individuals, not species). Then, practically speaking, the contrapositive and logical equivalent of Prop. 1 is
Proposition 2: “All non-black birds are non-ravens.”
Prior to Heather's observation, the evidence we had looked like
All birds of all colors   Black   non-Black   All colors
Ravens                    9       0           9
non-Ravens                50      8           58
All birds                 59      8           67

Then she saw a black raven, and our evidence changed to
All birds of all colors   Black   non-Black   All colors
Ravens                    10      0           10
non-Ravens                50      8           58
All birds                 60      8           68
And we agreed that this helped verify Prop. 1, in the sense that it allowed us to be more confident that Prop. 1 is true.

Then you saw a pink flamingo, and our evidence became
All birds of all colors   Black   non-Black   All colors
Ravens                    10      0           10
non-Ravens                50      9           59
All birds                 60      9           69
And you are saying that this also allows us to be more confident that Prop. 1 is true. Let's forget about Prop. 2 for a while. So for verifying that “All ravens are black.”, a pink flamingo is worth as much as a black raven. I can't think with concrete numbers, so let's use some abcedra:

All birds of all colors   Black   non-Black   All colors
Ravens                    a       0           a
non-Ravens                b       c           b+c
All birds                 a+b     c           a+b+c

On the basis of this evidence, the probability of any bird being non-black is
p = c/(a+b+c).

How much evidence is there to reject the hypothesis that some ravens are non-black? The specific null hypothesis in this case is
H : “Some raven-birds are non-black.”,
since we are hypothesizing that raven-birds are just any other individual bird, some of which are non-black.

Now, if the above H were true, and we make 'a' observations of birds which happen to be ravens, the probability P of getting 0 non-black is the product of the probabilities that each observed bird in this set of ravens is black, i.e. P = (1-p)*(1-p)*... a times = (1-p)^a.
(Or you can use B[p,a](i) = C(a,i)*(1-p)^(a-i)*p^i for i = 0.)

This is the P-value, the probability of the consequence of the null hypothesis in this experiment, that we can use to reject the null hypothesis with 1-P degree of confidence. Recall that the smaller that P is, the more confidence we have in rejecting the Null Hypothesis, or “accepting the validity of proposition 1”.
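A minimal Python sketch of this P-value for the three evidence tables above. The function name p_value and the table labels are illustrative choices, and the formula is exactly the P = (1-p)^a written above.

```python
def p_value(a, b, c):
    """P = (1 - p)**a, where p = c / (a + b + c).

    a: black ravens, b: black non-ravens, c: non-black non-ravens.
    """
    p = c / (a + b + c)          # observed fraction of non-black birds
    return (1 - p) ** a          # chance that all 'a' ravens come up black


tables = {
    "before Heather's raven": (9, 50, 8),
    "after the black raven": (10, 50, 8),
    "after the pink flamingo": (10, 50, 9),
}

for label, (a, b, c) in tables.items():
    print(f"{label}: P = {p_value(a, b, c):.4f}")
```

Each extra observation lowers P a little, which is the sense in which both sightings "increase confidence" here.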

So now we can ask about the relative merits of observing a black raven or a pink flamingo. Which additional observation reduces the P-value more? Let's use calclueless, since I am not secretive enough to do discrete math. What that means is we want to compare the marginal change (partial derivative) in the P-value when we make an additional observation of a black raven a → a + 1 vs. when we see a pink flamingo c → c + 1.

Since ln(P) = a*ln(1-p),
(d/d a) ln(P) = ln(1-p) (treating p as fixed) and some work shows that the change in the P-value by observing a black raven, adding 1 to a:
(d/d a)P = P * ln(1-p),
which is less than 0, meaning that observing a black raven does reduce the P-value and increases the confidence in “All ravens are black.”!

(d/d c)p = (1-p)/(a + b+ c) ( > 0)
(d/d p)P = -a*P/(1 – p) (< 0) .

Simplifying, the change in the P-value by observing a pink flamingo, adding 1 to c:
(d/d c)P = -a*P/(a+b+c),
which is less than 0, meaning that observing a pink flamingo also reduces the P-value and increases the confidence in “All ravens are black.”!
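As a sanity check on the chain rule above, here is a small Python sketch that compares the closed form (d/d c)P = -a*P/(a+b+c) with a central finite difference. The function names and the step size h are assumptions made for illustration, and c is treated as a continuous variable, as in the calculus above.

```python
def p_value(a, b, c):
    """P = (1 - p)**a with p = c / (a + b + c)."""
    p = c / (a + b + c)
    return (1 - p) ** a


def dP_dc_formula(a, b, c):
    """Closed form from the text: (d/dc)P = -a*P/(a+b+c)."""
    return -a * p_value(a, b, c) / (a + b + c)


def dP_dc_numeric(a, b, c, h=1e-6):
    """Central finite difference in c, with a and b held fixed."""
    return (p_value(a, b, c + h) - p_value(a, b, c - h)) / (2 * h)


a, b, c = 10, 50, 8
print("formula:", dP_dc_formula(a, b, c))
print("numeric:", dP_dc_numeric(a, b, c))
```

The two numbers should agree to many decimal places, and both are negative.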

The question that now remains is whether pink flamingoes are more valuable evidence than black ravens, i.e. which change decreases the P-value more:
|(d/d c)P| ?> |(d/d a)P|, which is equivalent, since P > 0, to
|(d/d c)ln(P)| ?> |(d/d a)ln(P)|
working through algebra
a/(a+b+c) ?> -ln(1-p).

the condition we are looking for is

e^(a/(a+b+c)) ?> 1+ c/(a+b)

For a = 10, b = 50 and c = 8, it turns out that this is just marginally true!
A pink flamingo is just as valuable as a black raven in verifying that all ravens are black!
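The two sides of the condition can be evaluated directly. This is a minimal sketch; the helper names are illustrative, and it only plugs a = 10, b = 50, c = 8 into the inequality e^(a/(a+b+c)) > 1 + c/(a+b) stated above.

```python
import math


def condition_sides(a, b, c):
    """Return (lhs, rhs) of e^(a/(a+b+c)) ?> 1 + c/(a+b)."""
    lhs = math.exp(a / (a + b + c))
    rhs = 1 + c / (a + b)
    return lhs, rhs


def flamingo_beats_raven(a, b, c):
    """True when a pink flamingo lowers P more than a black raven."""
    lhs, rhs = condition_sides(a, b, c)
    return lhs > rhs


lhs, rhs = condition_sides(10, 50, 8)
print(f"lhs = {lhs:.4f}, rhs = {rhs:.4f}, flamingo wins: {lhs > rhs}")
```

The left side comes out only slightly larger than the right, which is what "just marginally true" means here.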
See the plot below

Since most birds are actually non-black, had we seen already a very large number of non-black non-ravens:

All birds of all colors   Black   non-Black   All colors
Ravens                    10      0           10
non-Ravens                50      90          140
All birds                 60      90          150
Then the incremental value of seeing a pink flamingo would have been much less than that of seeing a black raven, for two reasons. First, since we would have had a large proportion of non-black birds, the expected proportion of non-black ravens would have been correspondingly higher, making it all the more unlikely to see no non-black ravens amongst the additional ravens. Second, as the graph above shows, each additional non-black non-raven does very little to increase our confidence that all ravens are black when we've already seen a lot of them. For example, in the Rann of Kutch: “Another pink flamingo, ho hum!”
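A short Python sketch of how the comparison flips as non-black non-ravens pile up. It reuses the condition e^(a/(a+b+c)) ?> 1 + c/(a+b) from earlier; the function name flamingo_margin and the search range are illustrative assumptions.

```python
import math


def flamingo_margin(a, b, c):
    """Positive when a pink flamingo lowers P more than a black raven."""
    return math.exp(a / (a + b + c)) - (1 + c / (a + b))


a, b = 10, 50

# The table with 90 non-black non-ravens: the margin is clearly negative.
print("margin at c = 90:", flamingo_margin(a, b, 90))

# Smallest c (with a and b held fixed) at which the flamingo stops winning.
first_c = next(c for c in range(1, 1000) if flamingo_margin(a, b, c) <= 0)
print("flamingo stops winning at c =", first_c)
```

With a and b fixed, the left side of the condition shrinks and the right side grows as c increases, so once the margin turns negative it stays negative.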
So, Maya, does it now make statistical sense that your pink flamingo sighting was just as important as Heather's black raven sighting for verifying that all ravens are black?”

“Yes, Dad, I think I want to be a doctor when I grow up.”

The next day, we went out for lunch, and suddenly Elsa piped up, “Dad! More black ravens! I just saw an orange chicken!”

“Can they fly?”

Thursday, July 25, 2013

In case you survive a disaster ...

... prepare a statement, lest you get caught unawares and make some foolish but unfortunately common statement, as in the following case.
At the end of an interview with a survivor of the train wreck in Galicia, after the survivor had described the chaos and panic and described hearing victims and seeing a trapped friend, the interviewer asked her, "How do you feel at having survived?". The survivor would have been justified in decking the journalist, or if she really wanted to respond with banalities, she could have said, "Numb." or "Shocked." or "I don't know what to think." or "It was so recent, there is so much turmoil inside me, I don't know what I feel."

Instead, her response was the all too common, "Thanks be to God! A miracle!"

If you don't have problems with that statement, you have rocks in your head. Now, since she is from a good little culturally Catholic country, we can safely assume she meant "miracle" and not "Miracle". But what was the miracle? That 300+ survived? That 80 people died? That only 80 people died? That 80 people died and she was one of the 300-odd survivors? That her God made the right choice in deciding which people to off and which ones to save? That this was a sign that she is special, and that as a corollary, the victims were not? That the families of the survivors should take consolation that they were simply the necessary collateral damage from the miracle? After all, if everyone had survived or everyone had died, then there would have been no miracle.

Or perhaps the miracle is that her god tolerates such lack of humility and empathy with the victims.

Tuesday, July 2, 2013

Solutions: Shortest distance between two skew lines in 3D

See the previous post for the question.

U mode: there are only 4 given things, two (n1 and n2) are already vectors, each tangent to its line. From the other two things, construct the difference, d = C1C2, which is a vector joining the two lines. Since the tangent vectors may be parametrization dependent, which we don't want, perhaps they have to be normalized.
Then the shortest distance has to be proportional to |d·n1 X n2|. The simplest way to determine the proportionality constant would be to do it for some test cases: pairing in each case the x-axis (s, 0, 0) with one of: (0, s, 0), (s, 1, 0), (0, s, 1), (s, s, 1) etc. and for different choices of points along the lines.

BTW the conjectured formula fails for one of the above, even though the shortest distance is well-defined. From the computational geometry point of view, the formula is computationally fragile in that in certain special cases it will be very sensitive to machine precision rounding errors.

I mode: The shortest distance between the two lines has to be the length of a segment perpendicular to both. (Why?) Hence that line segment has to be parallel to S = n1 X n2, and the distance between the two given lines is the length of that line segment.
How do we find its length if we don't even know where it lies?
We could, in M-mode, find that line by doing all sorts of stuff requiring that it intersect L1 and L2, then finding its intersection points and finally the distance between them.
Or, in U-mode, we could notice that the projection onto the unit vector s = S/|S| -of a segment joining any two points one each on the two original lines- is exactly the length of the shortest segment. How so? The line joining any two points one each on the original lines is a linear combination of s, n1 and n2. So projecting it onto s, since s ⊥ n1 and s ⊥ n2, will yield exactly its component along s, which is d. I know this sounds as if I've been going around in circles, but what all the above means is that we can take any two arbitrary points on L1 and L2, say C1 and C2, and calculate
d = |(C1C2)·s|.

This yields the proportionality constant we were looking for in U mode, the reciprocal of the norm of S.
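A minimal Python sketch of the resulting formula, d = |(C1C2)·(n1 X n2)| / |n1 X n2|, tried on the test cases listed earlier. The function and helper names are illustrative. The fallback for parallel lines (where n1 X n2 vanishes) is an addition of this sketch, not part of the argument above, and the tolerance eps is an arbitrary choice.

```python
import math


def dot(u, v):
    return sum(x * y for x, y in zip(u, v))


def cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])


def norm(u):
    return math.sqrt(dot(u, u))


def line_distance(c1, n1, c2, n2, eps=1e-12):
    """Shortest distance between L1 = n1*s1 + C1 and L2 = n2*s2 + C2."""
    d = tuple(q - p for p, q in zip(c1, c2))   # vector C1C2
    S = cross(n1, n2)                          # perpendicular to both lines
    S_len = norm(S)
    if S_len <= eps * norm(n1) * norm(n2):
        # Parallel lines: the triple-product formula is 0/0 here, so
        # fall back to the distance from C2 to L1 (assumed fallback).
        return norm(cross(d, n1)) / norm(n1)
    return abs(dot(d, S)) / S_len


x_axis = ((0, 0, 0), (1, 0, 0))
cases = {
    "(0, s, 0)": ((0, 0, 0), (0, 1, 0)),
    "(s, 1, 0)": ((0, 1, 0), (1, 0, 0)),
    "(0, s, 1)": ((0, 0, 1), (0, 1, 0)),
    "(s, s, 1)": ((0, 0, 1), (1, 1, 0)),
}
for label, (c2, n2) in cases.items():
    print(label, line_distance(*x_axis, c2, n2))
```

The tangent vectors do not need to be normalized beforehand, because dividing by |n1 X n2| removes any dependence on their lengths.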

Another way of thinking about the solution in I mode is that the answer we are looking for is the distance between two parallel planes each of which contains one of the given lines.

Friday, June 28, 2013

Shortest distance between two skew lines in 3D

This is one of the problems from the 1980 JEE that I didn't get, and one of two that still bother me. It is easy enough to remember: the entire problem is stated above.

Assume that the two lines are given in parametric form
L1(s1) = n1s1 + C1 and
L2(s2) = n2s2 + C2
where s1, s2 in (-infinity, infinity), C1 and C2 are points on their respective lines and n1 and n2 are tangent vectors.

(If they are not in parametric form, or the parametrization is not linear (w.r.t. the coordinates), a linear parametrization can always be found. E.g., starting from two planes (each specified as a linear relation between the three Cartesian coordinates), the normal to each can be found. The tangent to the intersection line is the cross product or wedge product of these two normals. Then C can be found by requiring that the line belong to the two planes.)
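The planes-to-line step can be sketched in Python as follows. This assumes each plane is written as normal · x = offset, which is one convenient way to express the linear relation. The function name and the particular choice of C (the point of the line closest to the origin) are illustrative, since any point on the line would do.

```python
def dot(u, v):
    return sum(x * y for x, y in zip(u, v))


def cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])


def line_from_planes(na, da, nb, db):
    """Planes na·x = da and nb·x = db  ->  (C, n) with L(s) = n*s + C."""
    n = cross(na, nb)                 # tangent to the intersection line
    n_sq = dot(n, n)
    if n_sq == 0:
        raise ValueError("planes are parallel: no unique intersection line")
    # C is built from vectors perpendicular to n, so that
    # na·C = da and nb·C = db both hold.
    u = cross(nb, n)
    v = cross(n, na)
    C = tuple((da * ui + db * vi) / n_sq for ui, vi in zip(u, v))
    return C, n


# Planes z = 3 and y = 2 meet in a line parallel to the X-axis.
C, n = line_from_planes((0, 0, 1), 3, (0, 1, 0), 2)
print("C =", C, "n =", n)
```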

(What is a non-linear parametrization of a line? Consider the line given by:
L(s) = (1/s)i + (0,2,3). The closure of the set of points is the line parallel to the X-axis which intersects the y-z plane at y=2, z= 3, although in terms of the s parameter you never reach that point. This line can be parametrized as L(t) = t i + (0,2,3).)

M mode: Construct |L1L2| or its square and minimize by taking the derivative w.r.t. s1 and s2. What a lot of work!
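For what it is worth, the M-mode work reduces to a 2x2 linear system. Setting the derivatives of |L1(s1) - L2(s2)|^2 with respect to s1 and s2 to zero gives two linear equations in s1 and s2. The Python sketch below solves them; all names are illustrative, and it assumes the lines are not parallel.

```python
import math


def dot(u, v):
    return sum(x * y for x, y in zip(u, v))


def closest_parameters(c1, n1, c2, n2):
    """Return (s1, s2) minimizing |L1(s1) - L2(s2)|^2."""
    w = tuple(p - q for p, q in zip(c1, c2))      # C1 - C2
    a, b, c = dot(n1, n1), dot(n1, n2), dot(n2, n2)
    d, e = dot(n1, w), dot(n2, w)
    det = a * c - b * b                           # zero iff lines are parallel
    if det == 0:
        raise ValueError("lines are parallel: the minimizer is not unique")
    # Stationarity conditions: a*s1 - b*s2 = -d and b*s1 - c*s2 = -e.
    s1 = (b * e - c * d) / det
    s2 = (a * e - b * d) / det
    return s1, s2


def min_distance(c1, n1, c2, n2):
    s1, s2 = closest_parameters(c1, n1, c2, n2)
    p1 = tuple(ci + s1 * ni for ci, ni in zip(c1, n1))
    p2 = tuple(ci + s2 * ni for ci, ni in zip(c2, n2))
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(p1, p2)))


# X-axis (starting from the point (5, 0, 0)) against a line along y at z = 1.
print(closest_parameters((5, 0, 0), (1, 0, 0), (0, 3, 1), (0, 1, 0)))
print(min_distance((5, 0, 0), (1, 0, 0), (0, 3, 1), (0, 1, 0)))
```

Unlike a distance-only formula, this route also yields the two closest points themselves.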

Pretty answer next week!

Saturday, May 11, 2013

Richard Feynman's 95th Birthday

Thank you Antra for the reminder!

Here are a few picks from the internet:

This video captures beautifully my thoughts on art and the endless arguments I have with artists about nature, science and art.

But Bloomsbury celebrates Feynman

what is science in 63 seconds

Feynman wiki

Gleick's fabulous biography

Feynman's video, and revisiting my own post, brought up some memories:

I'd once asked a fellow-scientist who was also an artist why she painted, and she responded with I believe a quote, that painting is an excuse to gaze upon nature. Not quite as terse as "because I can" nor "because it is there", but in the same space. I knew a painter in Madrid - Javier Fernandez Lizan, and on my first visit to his studio, at some point, after the above conversation, I asked Javier why he painted. Javier responded by taking me to a massive 8'X10' abstract still life with human figures, in oil, and showed me a 2 or 3 square inch area, and said "to capture this blue".

Why did I do science? Why do I analyse data now?
"Because it is there!"
"Because I can!"
"Because it is an excuse to gaze upon nature!"
"Because I want to capture "this blue", which is in my head, and speaks to me, and I want to express it!"
"Because I am an instrument of "this blue"'s meme, and it expresses itself through me!"

So, I have never understood why it is that while most scientists find commonalities with artists (my thoughts on art notwithstanding!), most artists keep trying to distance themselves from scientists!

Monday, March 11, 2013

Tennis Ball Cannon

Made as part of a team-building contest at my company.

Some stills:
not bad for accuracy!
 The tennis ball is top center left, its shadow is just left of target! At 25 m distance, the precision was 3 m in length and 1 m in width. The pieces of bark and stones mark the "hits".

 Immediate post-launch: Elsa wanted to launch as many as possible simultaneously, we got up to 4!
Four in the air

Piston at full post-launch extension.

Looking down the mouth of a loaded cannon - see how much I trust Maya.
A 2-minute video compilation