A bit of maths

Monday, March 4, 2024

Seeing Far Away Mountains

When you're on a hill on one island and you look out to sea, sometimes you can see mountains on other islands. On days when humidity is low, it is possible to see mountains that are rather far away. But how far can you see?
Let's look at the maths, assuming a spherical earth and ignoring the bending of light due to refraction or indeed general relativity.

Suppose you're at a height of \(h_a\) above sea level and you look out to sea.
We'll let \(d_a\) be distance from the observer to the horizon, where the sky appears to meet the sea. We'll measure this distance along the surface of the earth at sea-level. And we'll let \(a\) be the angle at the centre of the earth between the observer and the horizon.
We know that the arc-length is the radius multiplied by the angle in radians, so: \[d_a = r a\] where r is the radius of the earth.
We'll now let \(d_b\) be the distance from the horizon point to the object being observed (along the surface of the earth at sea level), so: \[d_b = r b\] And let d be the full distance from the observer to the object being observed (along the surface of the earth at sea level), so: \[d = d_a + d_b = r ( a + b )\] We also know that: \[ cos(a) = \frac{r}{r + h_a}\] \[ cos(b) = \frac{r}{r + h_b}\] Where \(h_b\) is the minimum height of the object in order for it to be visible to the observer.
Rearranging those equations, we find the distance to the horizon is: \[d_a = r \hspace{2mm} cos^{-1} \left( \frac{r}{r + h_a} \right) \] and \[h_b = r \left( \frac{1}{cos(b)} - 1 \right)\] where \[ b = \frac{d}{r} - cos^{-1} \left( \frac{r}{r+h_a} \right)\] So, given our current height \(h_a\) and the distance to the object \(d\), we can work out \(h_b\) the minimum height the object needs to be in order to be visible.
Let's try that out with some numbers.

Inputs
Observer height above sea level		meters
Distance to object		km

Results
Distance to horizon		km
Height required to be visible		meters

If you change one of the inputs above and hit enter, then the results will be updated.

If you're on a hill south of Dublin, is it possible to see the mountains of Wales? Well mostly the answer is no, because the humidity in the air obscures the view. Anyway, let's find out what our calculator tells us about the visibility on an exceptionally clear-skyed day.
For example, Killiney Hill is 153 meters above sea level and Snowdon in Wales is about 138 km away. Using the calculator above, we see that Snowdon would need to be at least 692 meters tall to be visible.
In fact it is 1,085 meters high and so we should be able to see the top few hundred meters of the mountain. That is indeed the case, as the photo below shows:

That's the view across the Irish sea of Snowdon (Yr Wyddfa) in Wales, from Killiney Hill in south Dublin, Ireland.
Holyhead (in Wales) is closer to Killiney (99km) than Snowdon is to Killiney (138km), but you won't be able to see Holyhead, because there aren't any hills in Holyhead high enough to be visible. The minimun height required would be 236 meters.
If you travel south from Killiney to County Wicklow and go up one of the higher mountains you might expect to be able to see more of the hills and mountains in Wales. But once again, on most days there will be too much humidity and indeed too many clouds to be able to see Wales, but on a good day, with an really clear sky, you'll be able to see lots.

Thursday, December 14, 2023

Transport in The Line City, Saudi Arabia

Suppose you were to build a city, but instead of it being clustered round a central district, it was to be built in along a 170 km line, then what would the consequences be? This actually isn't a hypothetical question. The Line is currently under construction in Saudi Arbia. It is 170 km long and just 0.2 km wide. That's just 34 km². So it has the same area as a square of 5.8 km x 5.8 km.
One of the most appealing attributes of cities, is that many things that you might want are close by, unlike in rural areas. But if a city is built as a long very stretched out rectangle that resembles a line, then conveniences, that might have been expected to be close, turn out to be far away. In fact in a 170 km x 0.2 km city everything is just over 18 times further way than would be the case in a city of the same area that was contined in a square measuring 5.8 km by 5.8 km.

In this blog post we'll explain why things are 18 times further away and delve into the transport options.

Let's start by asking the question: in a rectangular city, what's the average distance between points?

When we have a rectrangular city, of dimensions A x B, we can evaluate the average distance between points in the city by calculating the following quardruple integral: \[D(A,B) = \frac{1}{A^2 B^2}\int_{0}^{A} \int_{0}^{A} \int_{0}^{B} \int_{0}^{B} \sqrt{(x_2 - x_1)^2 + (y_2 - y_1)^2} \hspace{3mm} dy_1 \hspace{1mm} dy_2 \hspace{1mm}dx_1 \hspace{1mm}dx_2\] A pure mathematician looking at that integral might think that it would be easier to deal with the average squared distance between points rather than the average distance because when were calculating the average distance we have a square-root that we need to deal with.
An applied mathematician might reply that we can work out the quadruple integral numerically and so the square-root won't cause us any serious problems. If fact the code to work out that quadruple integral is written out below and indeed has been embedded into this page, so if you'd like to try it here please do:

Inputs
Width		km
Breadth		km
Steps in Calc

Result
Average Distance		km

If you change one of the inputs above and hit enter, then the result will be updated.
The "Steps in Calc" above is the number of steps that the numerical integral is divided into. A higher number of steps should give a more accurate answer, but doubling the number of steps will cause a slow down of a factor of \(2^4=16\) in the quadruple integral.

Using that calculator we find that in a square city of dimension 5.8km x 5.8km, the average distance between points is 3.02km.
But in a streatched out city with dimensions 170km by 0.2km, the average distance between points is 56.63km, in other words 18.75 times further apart, so instead of travelling 1km to go to a convenience store, you'll need to travel over 18km.

That average distance of 56.63km that we got for the line city is a little less than one third of the line's length (170km). In fact on a line segment of lenght L, the average distance between points is L/3.
That statement can be justified by evaluating the following integral formula for the average distance between points on a line segment: \[ H = \frac{1}{L^2} \left( \int_{0}^{L} \int_{0}^{L} | x_1 - x_2 | dx_1 dx_2 \right) = \frac{L}{3} \]
Now if the city is set up with one main transport artery in each direction for people, then what kind of flow of people might be expected?
Let's start with a slightly unrealistic assumption: when travelling between points, people in the city choose destinations that are far away just as likely as close destinations. That could be the case when transport is cheap, fast and comfortable.
Suppose, for every two people in the city, there is one journey during the rush-hour commute. In the real world some people will make more than one journey at rush hour, for example parents making two journeys, one accompanying their child to school and another go on to their final destination.
If people didn't have a perference for near-by locations, then half of the destinations chosen by people who live in the east of the city will be in the western side of the city.
With a proposed population of 9 million, half of them (4.5m) are participating in the morning communte, half of those (2.25m) live in the eastern size of the city and half of them (1.125m) will cross the mid point of the city. In other words over one million people will be travelling on the main artery heading west during one hour of rush hour. I do indeed accept that not all the assumptions leading to that conclusion are absolutely realistic. But it does give an indication of the sort of flow of people that might be expected.
1.125 million people in an hour is about 312 people per second.
On a regular road with cars, suppose they are keeping a safe distance apart, with a 2 second gap between cars and an average of two people per car, then with 1 lane we can achieve a flow of 1,800 cars per hour which are carrying 3,600 people. But we seek 1.125 million people per hour and so we would need about 313 lanes of road in each direction. That would not be possible in a city whose total width is 200 meters.

An alternative approach would be to use a commuter train, with say 10 cariages each containing 300 people squashed in. They arrive at the stations at 2 minute intervals. So we would have 3,000 passing every 120 seconds. In other words 90,000 per hour. That's pretty good, but not nearly good enough. We're targeting over one million per hour.
So what's required is a very different approach.
Something quite different to cars or commuter trains.

Supoose people were arranged in double decker pods of 10, with 5 people at each level. The large 10 person pod could be made up of mini pods that would give solo passenger privacy as they travel.
If the pods travelled at 340km/hr. Then the journey time from one end of the 170km city to the other would be just 30 minutes. If we want to enable a flow of 1.125 million people per hour then we'll need 112,500 pods per hour and at a speed of 340 km / hour,
that will enable 340 x 1000 / 112,500 = 3.02 meters per pod, so we could have pods say 2 meters long with a 1 meter gap between pods.
When on the main artery, all commuters will travel at 340 km / hr. When approaching their destination they'll take an off-ramp and then decelerate. Conversely, those who wish to join the artery, will first accelerate along an on-ramp and then join after they have reached 340 km /hr. On the main artery, since all will be travelling at the same speed, there will be no accelerating, decelerating or over-taking.

If we now stick with our previous assumptions that people don't have a preference for local destinations. Then the number of people who will want to pass a point at say x on the main artery travelling east will be some constant times the proportion of people who are starting at points west of x multipled by the proportion of destinations that are to the east of x.
So the flow of people in an easterly direction past point x during rush hour will be: \[ F(x) = \alpha \left( \frac{x}{L} \right) \frac{L - x }{L} = \frac{\alpha}{L^2} (L x - x^2)\]
That will be at a maximum when the derivative with respect to x is zero.
So we have maximum flow at:
\[ 0 = L - 2x \] \[ \implies x = L / 2 \]
So we see the maximum flow will be at the mid-point of the line, with zero flow at the two end points \(x = 0\) and \(x = L\) .
One way to reduce the flow of people would be replace the line with a ring. The flow would then be much more even along the arteries. Also if the artery in one direction was blocked, people would still be able to get to their destination by going in the other direction, going the long-way round.
Another advantage of a ring over a line would be that the average distance between points (travelling along the artery and not cutting across the ring) would be L/4, where L is the cicumfirance of the ring. For the line, we had an average distance of L/3. So the ring would indeed make an improvement. However it would be very difficult to extend the ring-city, unlike a line-city which could grow at both end-points.

Below is a javascript function that can be used to calculate the quadruple integral above, evaluating the average distance between points in a rectrangle.




//////////////////////////////////////////////////////////  

// A function written in Javascript to work out the average disance 

// between points inside a rectrangular city.

function calcAvgDist(cityWidth, cityBreadth, steps){

      const xStep   = cityWidth    / steps;

      const yStep   = cityBreadth / steps;



      const epsilon = 1e-8;

      const xLimit  = cityWidth   - epsilon;

      const yLimit  = cityBreadth - epsilon;


  
      let runningSum = 0;

      for(             let x1 = 0;   x1 < xLimit;   x1 += xStep){

          for(         let x2 = 0;   x2 < xLimit;   x2 += xStep){

              for(     let y1 = 0;   y1 < yLimit;   y1 += yStep){

                  for( let y2 = 0;   y2 < yLimit;   y2 += yStep){

              

                  let xDiff = x2 - x1;

                  let yDiff = y2 - y1;



                  runningSum += Math.sqrt((xDiff * xDiff) + (yDiff * yDiff) );

                  }

              }

          }  

      } 

      return (runningSum * Math.pow(steps, -4));

}

Tuesday, September 19, 2023

Connecting Dots

Suppose we're given a 5 by 5 grid squared grid of nodes. We're told the bottom left point is the starting point. The second from the right on the bottom row is the finishing point. We have to travel to every node from the start to the finish, visiting all 25 nodes exactly once. We can travel from each node to a neighbouring node either by going horizontally or vertically one step. We cannot go diagonally.

So what path can we choose? Here is an example of a path we might try:

In that case, although we have successfully travelled from the start to the finish. We have skipped one node.
We can try lots of different paths and each time we find that something goes wrong. We can't get from the start to the finish visiting every node exactly once.
In fact it turns out it can't be done. And here's the proof:

Suppose we lable each node on the grid by its position (1,1) to (5,5). We deem a node to be odd if the sum of its two indices is odd. It is deemed to be even if the sum of its indices is even.
The starting node (1,1) is even since 1+1=2 is even.
The finishing node (4,1) is odd since 4+1=5 is odd.

Every step we take in the path will toggle from odd to even or even to odd, since in each step one index is changed by 1.
If we find a path to get to get from the start to the finish, we'll need to visit 25 nodes, that's 24 steps. So the final node will be even since the starting node was even. However we know that the final node is odd. So we have a contradiction. Hence no such path exists.
So you can spend as long as you like searching for a path, but you won't find one.

Sunday, August 20, 2023

Using a Neural Network to solve a Rubiks Cube

In a standard 3x3x3 Rubik's cube, each of the 6 faces contain 9 squares. As a puzzle, it is deemed to be solved when each face just contains one colour, i.e. all the greens squares are on one face, all the blues on another etc.
Some people work out how to solve Rubik's cubes, but it is more common for people to memorize methodologies the others have published online and in books. Generally the algorithms that are relatively easy to remember involve many steps. A more advanced cuber will be able to find solutions with fewer steps. In speed cubing competitions, participants who can find a method of solving the cube with less steps will have an significant advantage.
In this article we'll describe how a neural network can be used to help find an efficient way of solving a Rubik's Cube. (Here's a link to the source code which uses Keras.)

Suppose we have a neural net which takes as input a vector X which represents the current state of a cube.
The output will be a vector Y with Y[i] = (probability that the cube can be solved in i steps) assuming a zero based indexing.
So if: \[\underline{Y}=(0,0,1,0,0, ..., 0)\] then we are 100% certain that the cube can be solved in 2 steps.
And if: \[\underline{Y}=(1,0,0, ..., 0)\] then it is a cube that can be solved in 0 steps, i.e. it is already solved.
We can create training data by starting with a solved cube and then randomly rotating the faces say n times. We then write a row in the training data file with the number n and the representation of the cube.
If you're interested in how this could be implemented in Java, then have a look at this code. On a regular laptop, it takes the code a few seconds to generate a training set with one million instances (rows).
The neural net that we built used a number of dense layers and then the last layer was softmax. It was implemented using Keras. The source code that we wrote can be found here.

How can the neural net be used to solve the cube?
Suppose the NN indicates that it can solve the cube in 8 steps. Then we can try 12 different rotations (each side of the six sides can be rotated either clockwise or anticlockwise). The neural net should then indicate that one of those options results in a cube that can be solved in 8-1=7 steps. And the other 11 can be solved in 8+1=9 steps. We choose the option that reduces the number of steps and then repeat until we have a solved cube.
In our first implementation the NN was very good at identifying if there were less than 6 steps to solve it. However if the solution involved more than 6 steps then it struggled.
One of the improvements we could make now is to use the NN to improve the quality of the training data.
When the training data is being generated, suppose one face is rotated consecutively 3 times clockwise. That could have been achieved with a single rotation anticlockwise. So the neural net may be able to find a faster way to solve the cube than is indicated in the data.
So we could apply the neural net to the input data and when shorter solutions are found, the data could be updated. The model can then be retrained with the improved data, which may enable the data to be improved even more.

Tuesday, June 20, 2023

Companies buying each other

What happens when two companies buy eachother's stock? Is it like when two snakes start eating eachother tail first?
In this blog post, we'll look into the mathematics of what is going on.
But let's start with a simple case of a company owning its own stock.
Suppose we have a stock price S, with a total of N shares issued, of which n are held by the company itself.
Let M be the value of the companies assets, excluding the shares that it holds in itself.
So we have the total valuation is the total number of shares multiplied by the share price, which is also equal to the valuation M plus the shares it holds in itself: \[S N = M + S n\] Rearranging, we have: \[M = S(N-n)\] So we can say that the main company's valuation M, is the share price times the adjusted number of shares: total number of shares less the number of shares that the company holds in itself.
So far so good.
On the other hand, if we were to let a company declare its own shares on its balance sheet and treat them as a regular asset, then we could have a company valuation of (N S) and the owners of shares in the company would have an odd recursive ownership. When they bought shares they would be buying fractional ownership of more shares that the company held and those shares would have a fractional ownership of more shares etc.
Indeed if the company held many of it own shares, we could end up with a valuation being a large multiple of the main assets M.
Suppose we have a company valuation: \[V=M + S n \] Using the equations above we have: \[S=\frac{M}{N-n}\] and so: \[V= M \left( 1 + \frac{n}{N-n} \right) =M \frac{N}{N-n}\] And in the limit as \(n \rightarrow N\) , the valuation of the company goes to infinity.
So, if we don't want to allow companies to give themselves infinite valuations, then we shouldn't let them declare the own stock on their balance sheet.

Suppose we have two companies and each buys stock in the other. Is this like a case of two snakes consuming eachother? If we look at the numbers, what happens to the company valuations and stock prices?

Let's label the companies 'a' and 'b'. We can say that the valuation of company 'a' is the number of outstanding shares times its share price. And we'll break that valuation into two parts, the main company and shares that the company holds in company 'b': \[N_a S_a = M_a + n_b S_b\] where company 'a' holds \(n_b\) shares in company 'b'.
We have a similar expression for company b which holds \(n_a\) shares in company 'a'. \[N_b S_b = M_b + n_a S_a\] If we add those two equations and rearrange a little, we find: \[M_a + M_b = S_a ( N_a - n_a) + S_b ( N_b - n_b)\] The equation suggests that the summed valuation of the main assets of the two companies is equal to the share price of each multiplied by the adjusted number of shares, which is total shares less shares owned by the other firm. Or in terms of corporate ownership, we can get full ownership of the two companies main assets \(M_a\) and \(M_b\) by buying the remaining share of each company that aren't already held by the other. That sounds reasonable.

Suppose now company 'a' decides to buy a few more share in company 'b', what impact will that have on the two share prices?
We let the investment that 'a' makes in 'b' be \(\delta I_b\). And suppose the price that it pays for each share is \(S_b\).
Company 'a' will need to get the cash for the shares from somewhere, so we need subtract \(\delta I_b\) from \(M_a\).
When the transaction is complete, we'll see what our equations suggest is the appropriate impact on the share price of companies 'a' and 'b'.
We'll let the adjusted share prices be \(S_a + \delta S_a\) and \(S_b + \delta S_b\). So we have a total valuation for company 'a': \[N_a ( S_a + \delta S_a) = M_a - \delta I_b + ( N_b + \delta N_b) ( S_b + \delta S_b)\] where the number of new shares in 'b' that 'a' purchased was: \[\delta N_b = \frac{\delta I_b}{S_b}\] After doing a little bit of algebra we find: \[N_a \delta S_a = N_b \delta S_b + \frac{\delta I_b}{S_b} \delta S_b \hspace{22 mm} eqn(1)\] In the case we are considering, company 'b' hasn't bought any new shares in company 'a', but we're interested to know if its valuation has been impacted, we have a valuation for company 'b': \[N_b ( S_b + \delta S_b) = M_b + n_a ( S_a + \delta S_a) \] Subtracting out the original valuation, we find: \[N_b \delta S_b = n_a \delta S_a \] And so: \[ \delta S_b = \frac{n_a}{N_b} \delta S_a \hspace{22 mm} eqn(2)\] We can substitute that into eqn(1), we find: \[N_a \delta S_a =(N_b + \frac{\delta I_b}{S_b} ) \frac{n_a}{N_b} \delta S_a \] So: \[ 0 = \delta S_a \left[ (N_b + \frac{\delta I_b}{S_b} ) \frac{n_a}{N_b} - N_a \right] \] And so we see that \[\delta S_a = 0\] and consequently, eqn(2) tells us: \[\delta S_b = 0\]
So, our model suggests that two firms buying shares in eachother doesn't cause price instability.

Wednesday, March 8, 2023

Latitude Longitude Distances

Suppose we have the latitude and longitude of two points on the earth's surface, then how can we work out the distances between them?
One approach would be to make the following assumptions:

The earth is a sphere.
We are interested in the distance along the earth's surface.
Altitude can be ignored.
The distance from the equator to either pole is 10,000 km.

To calculate the distance we could first find the cartesian coordinates of the two points. We set the origin to be the centre of the earth. For each point on the earth's surface: with latitude \( \phi \) and longitude \( \theta \): \[x = r cos( \phi ) cos ( \theta ) \] \[y = r cos( \phi ) sin ( \theta ) \] \[z = r sin(\phi )\]

If we have two points A and B, with coordinates \( (x_a, y_a, z_a) \) and \( (x_b, y_b, z_b) \)
then we can use a 3 dimensional version of the Pythagorean theorem to determine (d) the straight-line distance between the points using the equation: \[ d^2 = (x_a - x_b)^2 + (y_a - y_b)^2 + (z_a - z_b)^2 \] But that distance is not quite the answer that we are looking for. We seek the distance along the surface of the earth, not the straight-line distance which cuts through the earth.

We let the angle between the points be \( \lambda \), when we know that, we can multiply it by the radius to get the distance along the surface, which is the arc-length.
Looking at the diagram above we can see that: \[sin ( \lambda / 2) = \frac{d/2}{r} \] and so we can evaluate \( \lambda \) using: \[ \lambda = 2 arcsin \left( \frac{d}{2r} \right) \]
Now it is time to try out some numbers:

Inputs
	Latitude	Longitude
Point A
Point B

Result
Distance		km

If you change one of the inputs above and hit enter, then the result will be updated.
The latitude figures are degrees north of the equator. For degrees south, use a negative.
The longitude figures are the degrees west of the prime meridian. For degrees east use a negative number.

If you're interested in the javascript code used to do the calculation, then have a look at this in github.

An alternative approach is to use the Haversine formula.

Sunday, October 30, 2022

A Bit of Algebra

In this blog post we're going to look at finding the real positive x such that: \[x^{x^3} = \frac{1}{2^{1/6}} \hspace{10 mm} (Equation 1)\]
By all means, do have a go at solving it, before reading our solution below.

The first, slightly non-obvious step is to raise both sides of the equation to the power of 3: \[ \left( x^{x^3} \right)^3 = \left( \frac{1}{2^{1/6}} \right)^3 \] Rearranging that a bit, we have: \[ x^{3x^3} = \frac{1}{2^{3/6}} \] And so: \[ \left( x^3 \right) ^{x^3} = \left( \frac{1}{2} \right) ^ {1/2} \hspace{10 mm} (Equation 2) \] If \[ a^a = b^b \] then we have a solution: \[a=b\]
So, returning to Equation 2, we have: \[x^3=1/2 \] And hence: \[x=\frac{1}{2^{1/3}} \] So, we have a solution to equation 1. But are there any others?
In fact, it is possible have \[ a \neq b \] when \[ a^a = b^b \] For example: \[ \left( \frac{1}{4} \right)^{1/4} = \left( \frac{1}{2} \right)^{1/2} \] So, returning to equation 2, we can see that another solution for x is when: \[ x^3 = \frac{1}{4} \] And hence: \[ x = \frac{1}{2^{2/3}} \] So, our solutions for x are: \[ \frac{1}{2^{1/3}} \hspace{4 mm} and \hspace{4 mm} \frac{1}{2^{2/3}} \] But are there other solutions?
If we let \[ z=x^3 \] then, returning to equation 2 we have: \[ z^z= \frac{1} {\sqrt{2}} \hspace{10 mm} \] We can plot \(z^z \) for positive z:

We can see that \(z^z\) decreases in the range 0 to 1/e, after that it increases and keeps on increasing. So for positive z, we will have a maximum of two solutions to \( z^z = k\), where k is some constant. We have two solutions already, so we now know there aren't any more,
unless of course we want to investigate negative or complex solutions...