In this warped map of Poland, straight line distance (emphatically not distance measured along the segments) between points in the network is approximately proportional to their rail distance. I do not have a good name for this kind of map. All I know is: if it showed the travel time, we would call it a time-space map. By the way, here is the unwarped version.

Interestingly, the map is the best possible in the sense of minimizing the squared error of the approximation. The rest of this article explains how to produce it.

- Scrape the data about Polish railway lines from http://semaforek.kolej.org.pl and put it into an SQLite database. Here is a sample:
SELECT line, name, kind, metrage, another_line FROM Lines JOIN Names USING(name_id) JOIN Kinds USING(kind_id) WHERE line = 137 ORDER BY metrage LIMIT 21; 137 | Katowice | stacja węzłowa | 381 | 1 137 | Katowice | stacja węzłowa | 381 | 138 137 | Katowice | stacja węzłowa | 381 | 139 137 | Katowice | stacja węzłowa | 381 | 656 137 | Katowice | stacja węzłowa | 381 | 713 137 | Katowice Załęże | przystanek osobowy | 2528 | 137 | Katowice Towarowa KTC | stacja węzłowa | 2913 | 713 137 | Chorzów Batory | stacja węzłowa | 6166 | 131 137 | Chorzów Batory | stacja węzłowa | 6166 | 164 137 | Chorzów Batory | stacja węzłowa | 6166 | 713 137 | Świętochłowice | przystanek osobowy | 8413 | 137 | Ruda Chebzie | stacja węzłowa | 11740 | 187 137 | Ruda Chebzie | stacja węzłowa | 11740 | 189 137 | Ruda Śląska | przystanek osobowy | 14068 | 137 | Zabrze | stacja | 18926 | 137 | Gliwice | stacja węzłowa | 27100 | 141 137 | Gliwice | stacja węzłowa | 27100 | 147 137 | Gliwice | stacja węzłowa | 27100 | 168 137 | Gliwice | stacja węzłowa | 27100 | 200 137 | Gliwice | stacja węzłowa | 27100 | 671 137 | Gliwice | stacja węzłowa | 27100 | 711

- Construct a list of 1838 edges between all terminal and junction nodes of the network, labelled with their length in metres:
edges = [ (u’Katowice’, u’Katowice Towarowa KTC’, 2532), (u’Katowice Towarowa KTC’, u’Chorzów Batory’, 3253), (u’Chorzów Batory’, u’Ruda Chebzie’, 5574), (u’Ruda Chebzie’, u’Gliwice’, 15360), ... ]

- Convert the list of edges into an adjacency matrix, putting infinite distance between nonadjacent vertices. The matrix has 1258 rows and columns.
- Compute the shortest distances between all pairs of vertices with the Floyd–Warshall algorithm, obtaining the so-called proximity matrix
**D**. - Perform multidimensional scaling (MDS) of the proximity matrix
**D**, in its classical version invented by Torgerson in 1952:coords = sklearn.manifold.MDS( dissimilarity=’precomputed’).fit_transform(D)

Under the hood, it would perform four steps:

- set up the matrix of squared proximities
**D**^{(2)}

[if the*i*th point has coordinates*x*,_{i}*y*,_{i}*z*,…, then_{i}*d*_{ij}^{(2)}= (*x*−_{i}*x*)_{j}^{2}+ (*y*−_{i}*y*)_{j}^{2}+ (*z*−_{i}*z*)_{j}^{2}+⋯], - perform the “double centring”: subtract the row mean and the column mean from the elements of
**D**^{(2)}, add the total mean, and multiply by −1/2, obtaining matrix**B**

[*b*=_{ij}*x*+_{i}x_{j}*y*+_{i}y_{j}*z*+⋯],_{i}z_{j} - find the singular value decomposition of
**B**into the product of three matrices**USU**^{T}, - take the initial two singular vectors
**U**_{2}and two corresponding singular values**S**_{2}— the rows of matrix**X**=**U**_{2}**S**_{2}^{1/2}give the coordinates of the points

[the initial two coordinates computed by SVD follow the principal axes of**B**, the remaining coordinates are less important, so*b*≈_{ij}*x*+_{i}x_{j}*y*or in matrix form_{i}y_{j}**B**≈**XX**^{T}=**U**_{2}**S**_{2}**U**_{2}^{T}],

were it not using another algorithm, called SMACOF (Scaling by Majorizing a Complicated Function).

- set up the matrix of squared proximities
- The obtained map is randomly oriented. Rotate it to move Gdańsk Główny (G) straight north of Katowice (KO).
- If Lublin (Lb) happens to lie west of the Katowice–Gdańsk Główny meridian, flip the X axis.

On my machine, steps 2–7 take about 20 seconds. The code is available on Bitbucket.

Great work Marcin, I should do this with languages.

LikeLike

[…] node positions of a matrix like this. Marcin Ciura did this with the Polish rail network, see here Warping Maps with SVD. Marcin shared the code and with this method one can calculate distances in 2D and 3D for […]

LikeLike

[…] connection to squeeze the data into a 2D or 3D graphic. Marcin Ciura programmed and warped the Polish railway network in this […]

LikeLike