Friday, November 24, 2017

Stright lines in photos

In this post I'll answer the following questions:
1: If someone takes a photo something with a straight line, perhaps the edge of a building, then will it still be a straight line in the photo?
2a: If an image contains a vertical line, then will the line be vertical in the photo?
2b: What about a horizontal line?

Consider a static image that is taken with a camera. We'll label the direction the camera is pointing as the \( x_3 \) axis. Suppose we render the image on a screen perpendicular to the \( x_3 \) asix, a distance \( \nu \) from the observer. The image of an object will the be rendered on the screen in precisely in the spot that is covering the original object.
With 3 dimensional coordinates, we'll specify the position of the observer to be at the origin. The screen is in the pane \(x_3 = \nu \)
The \(x_1 \) axis is parallel to the ground and the \( x_2 \) axis is perpendicular to both \( x_1 \) and \( x_3 \).
When the camera pointing in the horizontal plane, \( x_2 \) is vertical.
If the object is at \( \underline{x}^a = ( x_1^a , x_2^a, x_3^a ) \)
then the image should be somewhere along the line from the origin 0 to \( \underline{x}^a \)
In other words for some real number \( \alpha \) the location of the images is given by: \[ Image(\underline{x}^a) =\underline{x}^a \alpha = ( x_1^a , x_2^a , x_3^a ) \alpha \] The image is on the screen so \[ \nu = x_3^a \alpha \] and so \[ \alpha = \frac{\nu}{x_3^a} \] hence \[ Image(\underline{x}^a) = ( \frac{x_1^a}{x_3^a} , \frac{x_2^a}{x_3^a} , 1 ) \nu \] Now suppose we had a line segment running from \( \underline{x}^a \) to \( \underline{x}^b \)

Let \( \underline{x}^{\lambda} \) be a point on that line segment such that: \[ \underline{x}^{\lambda} = \underline{x}^a ( 1 - \lambda) + \underline{x}^b \lambda \] where \( 0 \le \lambda \le 1\)
So \[ Image(\underline{x}^{\lambda}) = \left( \frac{x^a_1 (1 - \lambda ) + x^b_1 \lambda }{x^a_3 (1 - \lambda ) + x^b_3 \lambda} , \frac{x^a_2 (1 - \lambda ) + x^b_2 \lambda }{x^a_3 (1 - \lambda ) + x^b_3 \lambda} , 1 \right) \nu \] If we let \[ \beta =\frac{ x^b_3 \lambda}{x^a_3 (1 - \lambda ) + x^b_3 \lambda} \] then after doing some algebra we find: \[ Image(\underline{x}^{\lambda}) = \left( \frac{x^a_1}{x^a_3} ( 1 - \beta) + \frac{x^b_1}{x^b_3} \beta , \frac{x^a_2}{x^a_3} ( 1 - \beta) + \frac{x^b_2}{x^b_3} \beta , 1 \right) \nu \] And as we vary \( \lambda \), \( \beta \) changes and the image traces out a line segment on the screen from \( ( \frac{x^a_1}{x^a_3} , \frac{x^a_2}{x^a_3}, 1 ) \nu \) to \( ( \frac{x^b_1}{x^b_3} , \frac{x^b_2}{x^b_3}, 1 ) \nu \)
So we have shown that in our model, the image of a straight line segment is a straight line segment.
It took some effort using algebra to get to that statement. It turns out we could have arrived at that statement using some geometry. The image of the line segment from \( \underline{x}^a\) to \( \underline{x}^b\) will be along the intersection of the plain \( x_3 = \nu \) and the plane containing \( \underline{x}^a\), \( \underline{x}^b\) and the origin. The intersection of two planes is a line and so the image of the straight line segment will be a straight line segment.

Again let us consider the image of a line segment from \( \underline{x}^a\) to \( \underline{x}^b\) but this time let's assume that both \( \underline{x}^a\) and \( \underline{x}^b\) are equidistant from the screen at \( x_3 = \nu \).
So \[ x^a_3 = x^b_3 \] In this case, the image will be rendered along the line segment from \( \left( \frac{x_1^a}{x^a_3}, \frac{x_2^a}{x^a_3} , 1 \right) \nu \)
to \( \left( \frac{x_1^b}{x^a_3}, \frac{x_2^b}{x^a_3} , 1 \right) \nu \)
Ignoring for a moment the \( x_3 \) axis, we find the slope of the line segment from \( \underline{x}^a\) to \( \underline{x}^b\), in the \( x_1 \) , \(x_2 \) plane is: \[ slope( \underline{x}^a, \underline{x}^b) = \frac{x^b_2 - x^a_2}{x^b_1 - x^a_1} \] which is the same as the slope of the line segment from \( Image( \underline{x}^a ) \) to \( Image( \underline{x}^b ) \).
So in this case the slope is preserved.
From a practical point of view, if a camera is pointing in the horizontal plane and it is used to take a photo of a building with some vertical lines, then those vertical lines will be preserved in the photo.

The image of a horizontal line segment which is parallel to the \( x_3 = \nu \) pane ( parallel to the screen ), will be rendered as a horizontal line in the photo.
So if you take a photo with the camera pointing perpendicularly at a wall, then horizontal lines on that wall will be horizontal lines in the photo.
If you are perpendicular to a flat straight road, then a line at the edge of a road will appear as horizontal, going across a photo.

On the other hand, if the photographer is very close to a building and points the camera upwards, then the real vertical axis will not be parallel to the \( x_2 \) axis and so a real-life vertical line, may not appear to be vertical in the photo. In fact the building may appear to lean back in the photo.

All bets are off when a fish-eye lens is used and then then image is rendered on a flat screen. The objects in the image will not be aligned with the real objects and you may observe straight lines to appear to become curved in the photo.

Saturday, April 1, 2017

Efficient machine learning

Suppose we have a set of data from which we wish to train a machine learning algorithm and the data keeps growing and growing. We'd like an efficient algorithm that will put more weight on the recent results and we would like it to be straight forward to keep up dated, even if we have hundreds of millions of rows of data. What can we do? Well, in this post I will suggest an algorithm that is very efficient at dealing with large data sets. It can learn from old data, but it doesn't need the old data to be stored.
Let's start with the basics. Suppose we have a set of T input vectors \(\underline{x}^t \in \mathbb{R}^n\) with \[1 \le t \le T\] and for each \(\underline{x}^t\) there is a resultant \(y^t \in \mathbb{R} \).
We want to find the optimal vector \(\underline{\theta}\) so that \( \underline{\theta } . \underline{x}^t \) is a good predictor of \( y^t \).
We choose the \(\underline{\theta}\) which minimises the cost function: \[ J( \underline{\theta} ) = \frac{1}{2T} \sum^{T}_{t=1} ( \underline{\theta} . \underline{x}^t -y^t)^2 \] We could expand the inner product \( \underline{\theta} . \underline{x}^t \), which would give us: \[ J( \underline{\theta} ) = \frac{1}{2T} \sum^{T}_{t=1} \big[ \sum_{j=1}^n ( \theta_j x^t_j ) -y^t \big]^2 \] When we have an optimal \( \underline{\theta} \), the partial derivatives of J will be zero: \[ 0 = \frac{\partial J}{ \partial \theta_i} = \frac{1}{T} \sum^{T}_{t=1} x_i^t \big[ \sum_{j=1}^n ( \theta_j x^t_j ) -y^t \big] \] We can change the order of the summation and do some rearranging to obtain: \[ 0 = \sum_{j=1}^n \big[ \frac{1}{T} \sum_{t=1}^T ( x^t_i x^t_j ) \theta_j \big] - \frac{1}{T} \sum_{t=1}^T ( y^t x_i^t) \hspace{24 mm} (eqn 1) \] Now, if we define the matrix \(\underline{A}\): \[ A_{ij} = \frac{1}{T} \sum_{t=1}^T ( x^t_i x^t_j ) \hspace{24 mm} (eqn 2) \] and we define vector \(\underline{B}\) \[ B_i = \frac{1}{T} \sum_{t=1}^T ( y^t x_i^t) \hspace{24 mm} (eqn 3) \] Plugging those into equation 1, we find: \[ 0 = \sum_{j=1}^n \big[ A_{ij} \theta_j \big] - B_i \] If we can invert the matrix \(\underline{A}\) then we can obtain \( \theta \) \[ \underline{\theta} = \underline{A}^{-1} \underline{B} \] If we look back at equation 2, we see that each vector \( \underline{x}^t\) effectively has the same weighting.
We could rewrite equations 2 and 3 as: \[ A_{ij} = \sum_{t=1}^T ( x^t_i x^t_j \omega_t) \hspace{24 mm} (eqn 4) \] and \[ B_i = \sum_{t=1}^T ( y^t x^t_i \omega_t) \hspace{24 mm} (eqn 5) \] where \[ \omega_t = \frac{1}{T} \hspace{24 mm} \forall t: 1 \le t \le T \] However, suppose we wanted more recent values of \( \underline{x}^t\) to have higher weighting. We could fix the ratio of consecutive weights: \[ \frac{\omega_{t+1}}{\omega_t} = 1 + \lambda \] with \( \lambda > 0\)
So we would have: \[ \omega_t = \frac{\omega_T} { (1+\lambda)^{T-t}} \] Suppose want to find the speed at which the weight drops down to half the weight of the most recently added entry, then we would seek \(\tau\) such that \[ \frac{\omega_T}{\omega_{T - \tau}} = 2 \] which implies: \[ (1 + \lambda)^{\tau} = 2 \] so: \[ \lambda = 2^{1/ \tau } -1 \hspace{24 mm} (eqn 6) \] So, if we want the most recent data to have double the weighting of a point 1,000 rows back, then we would set \(\tau = 1000 \) and use equation 6 to determine \( \lambda \).
The parameter \( \lambda \) determines how much bigger the weights of the recent data will have, when compared with older data. When choosing it, it may be helpful to first choose \( \tau\) ( which is something like a half-life ) and then use equation 6 to evaluate \( \lambda \)

If we wish to preserve the condition that the sum of the weights is 1, then we can do the geometric sum and after a little algebra we find that: \[ \omega_T = \frac{\lambda}{(1+\lambda)^T-1} \hspace{24 mm} (eqn 7) \]
For a given set of \(\underline{x}^t \) and \(y^t\) with \(1 \le t \le T \)
we can evaluate \(\underline{A} \) and \( \underline{B}\).
Since they were generated with T rows of data, we could label them \(\underline{A}^T \) and \( \underline{B}^T\).
Now suppose we have already evaluated \(\underline{A}^{(T-1)} \) and \( \underline{B}^{(T-1)}\)
with T-1 rows of data and we want to introduce one more,
then we find: \[ A^T_{ij} = \omega_T x^T_i x^T_j + ( 1 - \omega_T ) A^{(T-1)}_{ij} \hspace{24 mm} (eqn 8) \] and \[ B^T_i = \omega_T y^T x^T_i + ( 1 - \omega_T ) B^{(T-1)}_i \hspace{24 mm} (eqn 9) \] where \( \omega_T \) has been defined in equation 7.
So when we have evaluated \( \underline{A} \) and \( \underline{B} \) for a given set of data and then later want to include the contribution from a new \( \underline{x}^T \) and \( y^T \), we can amend the existing \( \underline{A} \) and \( \underline{B} \) using equations 8 and 9, without the need to retrieve all the past \( \underline{x}^t \) and \( y^t \).
As a result, when data ( \( \underline{x}^T \) and \( y^T \) ) comes in, we can use it to update \( \underline{A} \) and \( \underline{B} \) and then we can discard the \( \underline{x}^T \) and \( y^T \). They have made their contribution to \( \underline{A} \) and \( \underline{B} \) and we can continually update \( \underline{A} \) and \( \underline{B} \) without the need to look back at old values of \( \underline{x}^t \) and \( y^t \). When T is very large, say in the hundreds of millions, using this algorithm to continually update the machine learning results is rather efficient. Since we don't need to keep retrieving the old data. We just keep the \( \underline{A} \) and \( \underline{B} \) up to date.

Monday, February 27, 2017

Backward Propogation Algorithm in a Neural Network

This post was inspired by the Coursera course on Machine Learning. In particular by the lesson in week 5. In that lesson the backward propagation algorithm is presented but the students were asked to take the formulae on faith. A derivation was not given. To rectify that, here I present a derivation of the main equations.

Suppose we have an input vector \( \underline{x} \) and we wish to make a prediction for the resultant vector \(\underline{y}\). We start by defining our activation level zero to be: \[ \underline{a}^0 = \underline{x} \] We define \[ z^{k+1}_i = \sum_{j}\theta^{k+1}_{ij} a^k_j \hspace{24 mm} (eqn 1) \] and we use the sigmoid function S(z) to evaluate: \[ a^{k+1}_i= S(z^{k+1}_i) \hspace{24 mm} (eqn 2) \] When we have \(\underline{a}^0\), we can use equations 1 and 2 to evaluate \(\underline{a}^1\).
We then use equations 1 and 2 again to evaluate \(\underline{a}^2\) and so on up to \(\underline{a}^L\)

Our forecast for \(\underline{y}\) is activation level L: \(\underline{a}^L\)

Suppose we are given a training set of T input vectors \(\underline{x}^t\)
and a corresponding set of resultant vectors \(\underline{y}^t\) where \(1 \leq t \leq T \).
For each input vector \(\underline{x}^t\) we will have a corresponding activation level L: \(\underline{a}^{Lt}\)
We can measure the discrepancy between \(\underline{y}^t\) and \(\underline{a}^{Lt}\). We'll call this D. We could have: \[ D(\underline{y}^t , \underline{a}^{Lt} )= \parallel \underline{y}^t - \underline{a}^{Lt} \parallel^2 \] For the rest of this posting we won't refer again to the right hand side of that equation. So if the discrepancy function D were defined differently, then it wouldn't make any difference to the formulae below.

When we sum up all the discrepancies and we'll call the result the cost function: \[ J = \frac{1}{2T} \sum_t D(\underline{y}^t , \underline{a}^{Lt} ) \] We are interested in finding the \(\theta\)s which minimise this cost function and so gives us accurate predictions.
One approach would be to find the partial derivatives \[ \frac{\partial J}{\partial \theta^k_{ij}} \] for all i,j,k and then use a gradient descent method to minimise J. \[ \frac{\partial J}{\partial \theta^k_{ij}} = \frac{1}{2T} \sum_t \frac{\partial D(\underline{y}^t , \underline{a}^{Lt})}{\partial \theta^k_{ij}} \hspace{24 mm} (eqn 3) \] We are now going to focus on the term: \[ \frac{\partial D(\underline{y}^t , \underline{a}^{Lt})}{\partial \theta^k_{ij}} \] For simplicity we are going to drop the t superscript. So we will write: \[ \frac{\partial D(\underline{y} , \underline{a}^L)}{\partial \theta^k_{ij}} \] where the t is implied. Though in the end, when we want to evaluate the partial derivative of J, we will need to do the sum over t.

We define \[ \delta^k_i = \frac{\partial D}{\partial z^k_i} \hspace{24 mm} (eqn 4) \] and when we apply the chain rule to the right hand side we get \[ \delta^k_i = \sum_j \frac{\partial D}{\partial z^{k+1}_j} \frac {\partial z^{k+1}_j}{\partial z^k_i} \] we substitute in \( \delta^{k+1}_j\) and we find: \[ \delta^k_i = \sum_j \delta^{k+1}_j \frac {\partial z^{k+1}_j}{\partial z^k_i} \hspace{24 mm} (eqn 5) \] Now we want to evaluate the last term \[ \frac {\partial z^{k+1}_j}{\partial z^k_i} \] to do that we first combine equations 1 and 2 to obtain: \[ z^{k+1}_j = \sum_l \theta^k_{jl} S(z^k_l) \] and so \[ \frac {\partial z^{k+1}_j}{\partial z^k_i} = \sum_l \theta^k_{jl} S'(z^k_l) I_{li} \hspace{24 mm} (eqn 6) \] where \[ S'(z) = \frac {d S(z) }{dz} \] and \[ I_{lj} = \{ 1 \hspace{4 mm} when \hspace{4 mm} l = j, \hspace{4 mm}otherwise \hspace{4 mm}0 \} \] So, returning to equation 6, we find only one term in the sum survives, i.e. when l = j.
Hence eqn 6 becomes: \[ \frac {\partial z^{k+1}_j}{\partial z^k_i} = \theta^k_{ji} S'(z^k_i) \hspace{24 mm} (eqn 7) \] When we substitute that into equation 5 we find \[ \delta^k_i = \sum_j \delta^{k+1}_j \theta^k_{ji} S'(z^k_i) \hspace{24 mm} (eqn 8) \] It follows from equation 1 that: \[ \frac{\partial z^{k+1}_i}{\partial \theta^k_{jl}} = a^k_l I_{ij} \hspace{24 mm} (eqn 9) \] Using the chain rule we find: \[ \frac{\partial D}{\partial \theta^k_{ij}} = \sum_l \frac{\partial D}{\partial z^{k+1}_l} \frac{\partial z^{k+1}_l}{\partial \theta^k_{ij}} \] Substitute in equation 9 and we find only one element of the sum survives and we obtain: \[ \frac{\partial D}{\partial \theta^k_{ij}} = \delta^k_j a^{k+1}_i \hspace{24 mm} (eqn 10) \] So we can go forward (increasing k's) using equations 1 and 2 to evaluate \( a^k_i\)
and then using equation 8, go backwards ( decreasing k's) to evaluate \( \delta^k_j \).
We will then be able to work out all the partial derivatives of D with respect to \( \theta\).
Remember there is an implied superscript t in equation 10. And to work out the partial derivatives of J we will need to do the sum over all t's as shown in equation 3.
After that, finding the optimal \( \theta \) 's using gradient descent will be as easy as sliding down a hill.

Wednesday, January 25, 2017

Solitaire of sorts

Suppose you had a well shuffled standard deck of 52 cards. From the top of the pack you turned over each card one at a time. As you turned over the first card, you called out 'Ace', then 2 for the next, followed by 3, 4, 5, 6, 7, 8, 9, 10, Jack, Queen, King. After the 14th card you called out 'Ace' again followed by 2, 3, 4 and so on till you reached the end.
What would be the probability that none of the cards that were turned over had the value (rank) that you called out?

Before working it, lets consider a related, but simpler question. Suppose you had 3 cards containing values 1,2,3. After you shuffled them, what would be the probability that none of the 3 were in the same position as they were at the start? In mathematical speak you'd ask what is the probability of a derangement.
In this case we can look at all 3! ( i.e 6 ) permutations. And we see that 2 out of 6 are derangements. So the probability of a derangement is 1/3.
You might be tempted to say that for each card, the probability that it is not in its original position is 2/3 and so for the 3 cards the total probability of a drangement is that to the power of 3, i.e. 8/27.
However the flaw with that method is that the probabilities are not independent.

Returning now to that deck of 52 with 4 suits and 13 ranks. There is of course more than one way to work it out. One method would be to write a Monte Carlo simulation. I've done that and the code is below.
Here is a sample output after 100 million simulations:

Found 1623551 survivors out of 1.0E8
So the survival probability is estimated to be: 0.01623551
with a standard deviation of: 1.2638005465673727E-5
Time elapsed: 77.742 seconds


So the probability of surviving to the end without getting any cards right is approximately 1.62%

OK, OK, MC is useful, but not very satisfying. You might think that you could just write a program to work out all 52! permutations and then work out the answer. But alas 52! is a very big number. So if you want an answer before you die, then it might be a good idea to try a different approach.

Well there is another way... In fact I'm sure there any many other ways. I present here a method that I used. Suppose we have slots numbered 1 to 13 and for simplicity we'll number the cards 1 to 13. At the start we have 13 different ranks and each have 4 cards and there are 4 slots available for each of the ranks.
We could represent that as a table of Ranks:
Cards0  1  2  3  4  
Slots
000000
100000
200000
300000
4000013

Now suppose we pick one card, then we'll have 12 card ranks with 4 cards remaining
and we'll have 1 card rank that has 3 cards remaining, but still 4 slots.
So in our table of ranks we'll represent that as:

Cards0  1  2  3  4  
Slots
000000
100000
200000
300000
4000112

But we'll have to put the card in one of the available slots. Of the 52 slots 48 are allowed.
When we do that we'll have
 1 rank with 4 slots and 3 cards remaining
 1 rank with 3 slots and 4 cards remaining
12 ranks with 4 slots and 4 cards remaining

In our table we represent that as:

Cards0  1  2  3  4  
Slots
000000
100000
200000
300001
4000112

Using such a table with the probabilities of the transitions, we can write some code that recursively solves the problem.

This is what the code looks like:

import java.io.PrintWriter;

// check if the back slash char is causing problems

public class Calculator {
       
    public static void main ( String[] args){
       
        long startTime = System.currentTimeMillis();

        System.out.println("Starting...");
       
        Calculator calc = new Calculator();

        calc.calc(); // calc() is the main calculator, calcDebug() is for debugging
       
        long   endTime     = System.currentTimeMillis();
        double elapsedSec  = ((double) endTime - (double) startTime) * 0.001d;
       
        System.out.println("\nElapsed time: " + Double.toString(elapsedSec) + " sec.");       
        System.out.println("Finished.");
    }
    ///////////////////////////////////////////////////////////////////
    public void calc(){
        System.out.println("Doing calc.");

        int suits = 4;
        int ranks = 9; // Total number of cards will be: suits * ranks
       
        State s = new State(suits, ranks);
        double survivalProb = s.getSurvivalProb();
       
        String key = s.getKey();
        String resStr = "\nFor key: " + key + " found prob to be: "  + String.format("%1.14f", survivalProb);
        System.out.println(resStr);
       
        String pathToFile = "C:/temp/res_" + key + ".txt";
        writeToFile(pathToFile, resStr );

    }
    ///////////////////////////////////////////////////////////////
    public void calcDebug(){
        System.out.println("Doing calc2.");
        //           "0000000000111111111122222"
        //           "0123456789012345678901234"
        String key = "1001010000000000100000000";

        State s = new State(key, 0);
        double survivalProb = s.getSurvivalProb();
       
        String resStr = "For key: " + key + " found prob to be: "  + String.format("%1.14f", survivalProb);
        System.out.println(resStr);
       
        String pathToFile = "C:/temp/res_" + key + ".txt";
        writeToFile(pathToFile, resStr );
    }
    ///////////////////////////////////////////////////////////////
    public void writeToFile( String pathToFile, String str){
        try{
            PrintWriter writer = new PrintWriter(pathToFile, "UTF-8");
            writer.println(str);
            writer.close();
        } catch(Exception e){
            System.out.println("Caught exception: " + e.getMessage());
        }
    }
    /*  Prob(1,4)  = 0.375
     *  Prob(2,2)  = 0.166667
     *  Prob(4,4)  = 0.011869
     *  Prob(4,8)  = 0.014967   ( takes 11.6 secs )
     *  Prob(4,13) = 0.016232
     */
   
}

///////////////////////////////////////////////////////////////////////////////        ///////////////////////////////////////////////////////////////////////////////        ///////////////////////////////////////////////////////////////////////////////        


import java.util.Random;
import java.util.TreeMap;

// prob survival = sum ( numCardsLikeThis * numSafeLocations / TotalslotLocations
//                         * ProbSurvival( next))

// The key is a string 25 chars long,

public class State {
   private static final boolean            m_debug               = false; 
   private static final long               m_seed                = 1;
   private static final int                m_stepsBetweenCaching = 0; //

   private static int                      m_numSuits;
   private static TreeMap  m_map;
   private static Random                   m_rnd;

   private        String                   m_key;  
   private        double                   m_survivalProb;  
   private        int                      m_totalSlots;
   private        int                      m_depth;
  

   ///////////////////////////////////////////////////////
   State ( int suits, int ranks){
      
       m_depth    = 0;
       m_numSuits = suits;
       String      zeros = new String(new char[(suits+1) * (suits+1)]).replace("\0", "0");
       String      key   = adjustKey(suits, suits, zeros, ranks);
       initialize( key );    
      
       if ( m_debug) printKey  ( key );
   }
   ///////////////////////////////////////////////////////
   State( String key, int depth){
       m_depth = depth;
       initialize(key);
   }
   ///////////////////////////////////////////
   private void initialize(String key){
      
       if ( m_rnd == null)
            m_rnd =  new Random(m_seed); // formerly had: ThreadLocalRandom.current();

      
       if ( m_map == null)
            m_map =  new TreeMap();
      
       if( m_numSuits == 0){
           double sqrt = Math.sqrt((double) key.length());
           m_numSuits  = (int) Math.round(sqrt - 0.5f) -1;
           if ( m_debug)  {
              System.out.println(   "Have key of length: " + Integer.toString(key.length())
                                     + " and num suits: "     + Integer.toString(m_numSuits)   );
           }
       }  
      
       m_key          = key;
      
       m_survivalProb = -1;   // This indicates that it has not been calculated.      
       m_totalSlots   = calcTotalSlots();   
      
       if ( m_debug)  printKey(key);
   }
   ///////////////////////////////////////////////////////
   private int calcTotalSlots(){
       int sum = 0;
       for ( int slot = 0 ; slot <= m_numSuits; slot++){
           sum += getNumSlots(slot);
       }
       if ( m_debug)  System.out.println("Key: " + m_key + ", total number of slots found is: " + Integer.toString(sum));
       return sum;
   }
   ///////////////////////////////////////////////////////
   String getKey() { return m_key; }
   ///////////////////////////////////////////////////////
   double calcSurvivalProb(){
      
       if ( m_totalSlots == 0)
           return 1.0; // if we reach the end and there are no slots left, then we have survived.
      
       double prob = 0;
      
       for     ( int slot = 0; slot <= m_numSuits; slot++){
           for ( int card = 1; card <= m_numSuits; card++){
              
                  double probForCard = calcProbForCard(slot, card);
                  prob += probForCard;
                 
                  if ( m_depth < 3) {
                      System.out.println(  "Depth: "       + Integer.toString(m_depth)
                                         + ", slot: "      + Integer.toString(slot)
                                         + ", card: "      + Integer.toString(card)
                                         + ", num slots: " + Integer.toString(m_totalSlots)
                                         + ", prob: "      + Double.toString (probForCard)   );
                  }
           }
       }
      
       if ( m_debug) {
          printKey(m_key);
          System.out.println("Found prob to be: " + String.format("%1.14f",prob) + "\n");
       }      
       return prob;
   }
   ///////////////////////////////////////////////////////
   public double calcProbForCard(int slot, int card){
      
       if ( card == 0)
           return 0;

       int count = getCount(slot, card);
      
       if ( count == 0)
           return 0;
      
       int numCards = count * card;
      
       double probOfCardChoice = (double) numCards / (double) m_totalSlots;
      
       double probOfSlot = 0.0;
      
       for     ( int destSlot = 1; destSlot <= m_numSuits; destSlot++){
           for ( int destCard = 0; destCard <= m_numSuits; destCard++){
              
                  int destCount = getCount(destSlot, destCard);
                  if ( (slot == destSlot) && (card == destCard))
                      destCount--;  // a card cannot go into its own slot.
                 
                  int availableSlots = destSlot * destCount;

                  if ( availableSlots > 0){
                      double probOfContinuedSurvival = getProbOfContinuedSurvival(slot, card, destSlot, destCard);
                     
                      if ( probOfContinuedSurvival > 0) {          
                          probOfSlot += probOfContinuedSurvival * probOfCardChoice *(double) availableSlots /(double) m_totalSlots;
                          if ( probOfSlot > 1.0000001) { // We don't have >= 1.0 to allow a small rounding error
                              printKey(m_key);
                              System.out.println("ERROR: Prob is too big: " + Double.toString(probOfSlot));
                          }
                      }
                  }
           }
       }
       if ( m_debug) System.out.println("Found prob of slot to be: " + Double.toString(probOfSlot));
       return probOfSlot;
   }
   ///////////////////////////////////////////////////////
   double getProbOfContinuedSurvival(int sourceSlot, int sourceCard, int destSlot, int destCard){
      
          String keyForChosenCard   = adjustKeyForTakenCard (sourceSlot, sourceCard, m_key);
          String keyForContinuation = adjustKeyForCardInSlot(destSlot,   destCard,   keyForChosenCard);   
         
          // check if state obj is in map
          Double contProbDouble = m_map.get(keyForContinuation);
          double contProb;
         
          if ( contProbDouble == null ){
              State contState = new State(keyForContinuation, m_depth + 1);
              contProb        = contState.getSurvivalProb();
             
              if (    ( m_stepsBetweenCaching == 0 )
                   || (m_rnd.nextInt(m_stepsBetweenCaching) == 0 )) { // we'll only insert one in n key,value pairs to the map.
                 contProbDouble  = new Double( contProb);
                 m_map.put(keyForContinuation, contProbDouble); // we store it for later use
             
                   if ( m_map.size() % 10000 == 0)
                    System.out.println(   "Map elm: "+ Integer.toString(m_map.size())
                                       + ", key: "  + keyForContinuation
                                      + ", prob: " + Double.toString(contProbDouble));
             
                if ( m_debug)
                    System.out.println(  "Added new key to map: " + keyForContinuation
                                          + " that is item: " + Integer.toString(m_map.size()) );
              }
          } else {
              contProb = contProbDouble.doubleValue();
          }

          if ( m_debug) {
             System.out.println("Key: " + keyForContinuation + ", found cont prob: " + Double.toString(contProb));
             printKey(keyForContinuation);
          }
          return contProb;
   }
   ///////////////////////////////////////////////////////
   double getSurvivalProb() {
      
       if ( m_survivalProb == -1) // i.e. not yet set
            m_survivalProb = calcSurvivalProb();
             
       return m_survivalProb;
   }
   //////////////////////////////////////////////////////
   int getCount(int slot, int card){
             
       return ( getCount(slot, card, m_key));            
   }
   //////////////////////////////////////////////////////
   static int getCount(int slot, int card, String key){
      
       int    index = getIndex(slot, card);
       // System.out.println("Key: " + key + ", index: " + Integer.toString(index));
       String c     = key.substring(index, index + 1);
      
       return charToInt(c);
       // return ( Integer.parseInt(c, 16));            
   }
   //////////////////////////////////////////////////////
   public int getNumSlots(int slot){
      
       if ( slot == 0)
           return 0;
      
       int sum = 0;

       for ( int card = 0; card <= m_numSuits; card++)
           sum += slot * getCount(slot, card);
      
       return sum;      
   }
   //////////////////////////////////////////////////////
   int getTotalSlots(){
       return m_totalSlots;
   }
   //////////////////////////////////////////////////////
   public static void printKey(String key){
   
       if ( key == null){
           System.out.println("Key is null.");
           return;
       }

       System.out.println("key: " + key);

       String header = "   Card ";
      
       for( int j = 0; j <= m_numSuits; j++){
           header += Integer.toString(j) + " ";
       }
      
       System.out.println(header);
      
       for ( int slot = 0; slot <= m_numSuits; slot++){
           String str = "Slot " + Integer.toString(slot) + ": ";
          
           for ( int card = 0; card <= m_numSuits; card++){
               int index = getIndex(slot, card);
               str += key.substring(index, index+1) + " ";
           }
           System.out.println(str);
       }   
   }
   //////////////////////////////////////////////////////
   public static int getIndex(int slot, int card){
       int index = slot * (m_numSuits + 1) + card;
       /*
       System.out.println(    "For slot: "   + Integer.toString(slot)
                           + " and card: "   + Integer.toString(card)
                           + " have index: " + Integer.toString(index));
       */
       return (index);
   }
   //////////////////////////////////////////////////////
   public static String adjustKeyForTakenCard(int slot, int card, String key){
      
       if( card == 0)
           return null; // no cards to give

       String decrementedCurrent = adjustKey(slot, card   , key,                -1);
       return                      adjustKey(slot, card -1, decrementedCurrent, +1);
   }
   //////////////////////////////////////////////////////
   public static String adjustKeyForCardInSlot(int slot, int card, String key){
       if ( slot == 0)
           return null; // have no slot
      
       String adjKey  = adjustKey(slot    , card, key   , -1);
       adjKey         = adjustKey(slot - 1, card, adjKey, +1);          
             
       for( int i = m_numSuits; i > 1; i--){
          
           int countWithZeroCards = getCount(i,0, adjKey );
           if( countWithZeroCards > 0){
               adjKey = adjustKey(i,0, adjKey, -countWithZeroCards);
               adjKey = adjustKey(1,0, adjKey, countWithZeroCards * i);
           }

           int countWithZeroSuits = getCount(0,i, adjKey );
           if( countWithZeroSuits > 0){
               adjKey = adjustKey(0, i, adjKey, -countWithZeroSuits);
               adjKey = adjustKey(0, 1, adjKey,  countWithZeroSuits * i);
           }
       }
      
       int zeroZeroCount         = getCount(0,0, adjKey);
      
       return adjustKey( 0,0, adjKey, - zeroZeroCount);
   }
   //////////////////////////////////////////////////////
   public static String adjustKey(int slot, int card, String key, int adj){
      
       if ( key == null)
           return null;
      
       int count = getCount(slot, card, key);
      
       if ( 0 > count + adj){
           System.out.println("Cannot adjust below zero for slot" );
           return null;
       /*      
       } else if ( count + adj > 15){
           System.out.println("Cannot adjust above 1 char hex limit.");
           return null;       
        */  
       } else {
           int    index   = getIndex(slot, card);
          
           String start   = key.substring(0, index );
           String end     = key.substring(index + 1);
          
           String current = intToChar(count+ adj);
           return ( start + current + end);          
       }          
   }
   //////////////////////////////////////////////////////
   public static String intToChar(int i ){
       char c = (char)(i+48);
      
       return "" + c;
   }
   //////////////////////////////////////////////////////
   public static int charToInt(String s){
       char c = s.charAt(0);
       int  i = (int)c - 48;
      
       if ( m_debug) System.out.println("Char: " + s + " is deemed to be: " + Integer.toString(i));
      
       return (int) i;
   }
   /////////////////////////////////////////////////////
}
 

///////////////////////////////////////////////////////////////////////////////       





And here's the Java code for the Monte Carlo:

import java.util.Random;

public class RanksAndSuitsSurvival {

    private final long   m_simsPerBatch = 1_000_000L;
    private final int    m_numBatches   = 100;

    private final int    m_numSuits     = 4;   // should be  4
    private final int    m_numRanks     = 13;  // should be 13
   
    private       int    m_numCards;
    private       int[]  m_hand;
   
    private       Random m_rnd;
    private final int    m_rndSeed      = 1;
    ///////////////////////////////////////////////////////////////////////////////
    public static void main ( String[] args){
               
        System.out.println("Started...");
       
        RanksAndSuitsSurvival  rass = new RanksAndSuitsSurvival();
        rass.calc();
       
        System.out.println("\nFinished");   
    }
    ///////////////////////////////////////////////////////////////////////////////
    public RanksAndSuitsSurvival(){
       
        m_rnd      = new Random(m_rndSeed);
       
        m_numCards = m_numSuits * m_numRanks;
        m_hand     = new int[m_numCards];
       
        for ( int i = 0; i < m_numCards; i++)
            m_hand[i] = i % m_numRanks;
    }
    ///////////////////////////////////////////////////////////////////////////////
    public void calc(){
       
        long startTime = System.currentTimeMillis();
               
        long survivors = 0;
       
        // The only reason for the nested 'for' loops rather than a single 'for' loop
        // is because we wanted to print out the progress intermittently and we didn't
        // want to introduce another 'if' inside the main 'for' loop for reasons of speed.
        for ( int batch = 1; batch <= m_numBatches; batch++ ) {
       
            for ( long counter = 0; counter < m_simsPerBatch; counter ++){
       
                if (survived())  
                    survivors++;
            }
           
            System.out.println("Have now completed batch: " + Integer.toString(batch) + ", after "
                               + Double.toString((double)( System.currentTimeMillis() - startTime) * 0.001) + " seconds");
        }
   
        double numSims = (double) m_numBatches * (double) m_simsPerBatch;
        double prob    = (double) survivors              / numSims;
        double stdDev  = Math.sqrt( (1.0 - prob ) * prob / numSims);

        long   endTime = System.currentTimeMillis();
       
        printoutResults(survivors, numSims, prob, stdDev, endTime - startTime);
    }
    /////////////////////////////////////////////////////////////////////////////////
    public void printoutResults(long survivors, double numSims, double prob, double stdDev, long elapsedTimeMS){
   
        System.out.println("\nFound " + Long.toString(survivors) + " survivors out of " + Double.toString(numSims) );
        System.out.println("So the survival probability is estimated to be: "           + Double.toString(prob     ) );
        System.out.println("with a standard deviation of:                   "           + Double.toString(stdDev   ) );
           
        System.out.println("Time elapsed: " + Double.toString((double)( elapsedTimeMS) * 0.001d) + " seconds");
       
    }
    ///////////////////////////////////////////////////////////////////////////////
    public boolean survived(){
        
         shuffleHand();     // will shuffle m_hand
         return checkHand();
    }
    ///////////////////////////////////////////////////////////////////////////////   
    // Implementing Fisher–Yates shuffle
    public void shuffleHand(){         
        for (int i = m_hand.length - 1; i > 0; i--)  {
          int index     = m_rnd.nextInt(i + 1);
          // Simple swap
          int a         = m_hand[index];
          m_hand[index] = m_hand[i];
          m_hand[i]     = a;
        }
    }
    ////////////////////////////////////////////////////////////////////
    // Returns true when the hand 'survived'.
    public boolean checkHand(){      
                   
           for (int i = 0; i < m_hand.length; i++){
              
               if ( ((m_hand[i] - i) % m_numRanks) == 0 )
                      return false; // did not survive
           }
              
           return true;  // survived              
    }     
    ///////////////////////////////////////////////////////////////////////////////   
}


Sunday, May 29, 2016

Space Elevator Calculations

In this post we use the equations derived in the previous posts and allow the user to plug in some numbers.
In particular, when a cable thickness has been chosen to have the optimal thickness, we determine the mass of satellite required to achieve the given tension at the earth's surface.
We also work out the total mass of the cable.

Inputs
Cable Specific Strength M Yuri
Safety factor -
Tension at earth's surface Newtons
Satellite height above earth km

Outputs
Total Cable mass - kg
Satellite Mass - kg
Tension at Satellite - Newtons

Saturday, May 7, 2016

Lifting a cable above the equator

Suppose you were at the equator and decided to hold a cable at some height above the earth's surface. The cable dangles vertically down below, touching the earth, but it is not anchored there. As you go higher and higher, you would stay above the same point on the earth. I.e. the top of the cable would be geostationary as the earth spins.
If the cable were not made of a very strong material then as you tried to go higher, it would eventually break.
It turns out that if you have a cable of specific strength 48.5 M Yuri, then you'd be able to reach all the way up to a geo-stationary satellite (35,797km up).
If you reach that height, then you'll be able to go on out to about 143,937 km without requiring a stronger cable.
Something a tad surprising ( to me at least) happens 143,937 km up. You will no longer need to hold the top end. The tension at both the top and the bottom will be zero. The maximum tension will be at the height of a geostationary satellite orbit.
A cable that long, will be able to support itself. The gravitational attraction is exactly the required centrifical force to maintain a circular orbit.
If you then attempted to use a longer cable and go higher, the cable would lift off the ground. In other words the net force would be upwards. So if you want to go out further, then you'd need to anchor one end to the surface of the earth.

I've made lots of statement there, now lets start to justify some of that mathematically.
This post is a follow on from the previous one: Polar Escape Strength Threshold .
We'll use the same variables as defined there.
The cable is vertical and as the earth spins, the top of the cable remains over the same point on the equator. So the cable is geostationary and hence the net force acting on a small section of cable is the centrifical force: \[ \delta m\ \omega_s^2 r = \frac{G M_e \delta m}{r^2} - \delta T \] but we know: \[\delta m = \lambda \ \delta r\] and so: \[ \delta T = \delta r \left( \frac{G M_e \lambda}{r^2} - \lambda \omega_s^2 r\right) \] We can integrate that and use the fact that the tension in the cable at the earth's surface is zero. \(T(r_e) = 0 \).
We find:
\[ T(R) = G M_e \lambda \left( \frac{1}{r_e} - \frac{1}{R} \right) + \frac{\lambda \omega_s^2}{2} \left( r_e^2 - R^2 \right) \ \ \ \ \ \ \ \ \ (eqn 1) \] We define the specific strength Y to be the tension at breaking point divided by the mass per unit length: \[ Y = \frac{T(R_b)}{\lambda} = G M_e \left( \frac{1}{r_e} - \frac{1}{R_b} \right) + \frac{ \omega_s^2}{2} \left( r_e^2 - R_b^2 \right) \]
We now return to eqn 1 above. We can differentiate that with respect to R to find where is the maximum tension: \[ 0=\frac{\partial T}{\partial R}= \frac{G M_e \ \lambda}{R^2}- \lambda \ \omega_s^2 R \] We find that the maximum tension occurs at the geostationary orbit: \[ R_g = \left( \frac{G M_e}{\omega_s^2} \right)^{1/3} \] The threshold specific strength that would enable the cable to reach this max tension is: \[ Y_E = GM_e \left( \frac{1}{r_e} - \frac{1}{R_g}\right) + \frac{\omega_s^2}{2} \left( r_e^2 - R_g^2\right) \approx 48.5 M Yuri \] Returning to eqn 1, we see that \(T(R)=0\) when \(R=r_e\), in other words the tension is zero at the earth's surface. The tension reaches a maximum at the orbit of a geostationary satellite. It then decreases again. But we can find R such that \(T(R)=0\) \[ 0 = G M_e \left( \frac{1}{r_e} - \frac{1}{R_s} \right) + \frac{ \omega_s^2}{2} \left( r_e^2 - R_s^2 \right) \] That implies: \[ 0 = R_s^3 \ \frac{ \omega_s^2} {2} - R_s \left( \frac{ \omega_s^2 \ r_e^2 }{2} + \frac{G M_e }{r_e} \right) + G M_e \] That is cubic in \(R_s\). We solve that numerically and we find that the distance from the centre of the earth to the point on the cable where the tension is zero is: \[R_s \approx 1.5030 \times 10^8 m\] which 143,937 km from the surface of the earth.
A cable which is that long and is placed vertically above a point on the equator has zero tension at both ends. It neither drifts off into space nor falls to the earth. It just remains where it is.

Returning once again to my (current) favorite equation i.e. eqn 1, we see that it implies that for \(R>R_s\) the tension becomes negative. But we can't allow negative tension since we can't push with our cable. Without being held down the cable would rise up. So we would need an anchor at the surface of the earth. With that anchor we no long would have the conditon \(T(r_e) =0 \) . That was used in the derivation of eqn 1. And so eqn 1 would not be valid.
What we find is that as the cable gets longer beyond 143,937 km, the tension would keep increasing. An infintely long cable would require infinite tension at the anchor.

Polar Escape Strength Threshold

Suppose we had a cable with mass per unit length \(\lambda\).
So the mass of a small section of length \(l\) is: \[m \ = \lambda \ l \] If the cable is hanging vertically in a constant gravitational field \(g\), then the tension at the top will be the \[T = m \ g = \lambda \ l \ g \] We define the specific strength to be the tension at breaking point divided by the mass per unit length: \[Y=\frac{T_b }{\lambda} \] This is measured in units of: \[[Yuri] = \frac{[Newton] [ meter]} {[kilogram]^2} \] With constant \(g\), a cable with specific strength \(Y\) will be able to support up to its breaking length: \[l_b =\frac{T_b }{\lambda \ g} = \frac{Y}{g}\]
Now how about the case when the gravitational follows a one over r squared law. Where r is the distance from the centre of the earth. Lets start by considering a vertical cable that isn't moving: A small section from \(r\) to \((r + \delta r)\) has forces acting up and down. If it is not accelerating, then the forces are balanced: \[F_{up}=F_{down}\] \[T(r + \ \delta r )= T(r)+ \frac{G M_e \delta m }{r^2} \] But we know the mass: \[\delta m = \lambda \ \delta r\] So \[\delta T = \frac{G M_e \ \lambda \ \delta r }{r^2} \] We can integrate both sides of that for a cable that goes from the earth's surface \(r=r_e\) to \(r=R=r_e + h \) a height h above the earth.
We note that the end that touches the surface is not anchored. So the tension: \[T(r_e)=0\] So \[T(R)=G M_e \ \lambda \left( \frac{1}{r_e} - \frac{1}{R} \right)\] Suppose the cable breaks at \(R_b\), returning to the definition of the specific strength: \[Y=\frac{T(R_b)}{\lambda} \] Thus: \[Y=G M_e \ \left( \frac{1}{r_e} - \frac{1}{R_b} \right)\] rearranging we find: \[R_b = \left( \frac{1}{r_e} - \frac{Y}{G M_e}\right)^{-1}\]
So the breaking height above the earth's surface: \[h_b = R_b - r_e = \left( \frac{1}{r_e} - \frac{Y}{G M_e}\right)^{-1} - r_e\] Now when: \[\frac{1}{r_e} - \frac{Y}{G M_e} = 0 \] we can reach an infinite height! In other words the cable can completely escape the earth's gravitational field.
The polar escape strength threshold is: \[Y_p = \frac{G M_e}{r_e} \approx 62.6 M Yuri \] We use the word polar here since is most relevant for a cable above either the north or south pole. If we were to have a geostationary cable above the pole, then we could ignore the spin of the earth. On the other hand for a geostationary cable above the equator, the centrifical acceleration should not be ignored. I'll show the details of how to deal with that in my next post.