Thursday 28 February 2019

Python Scripting [3D]: Layer Fractional Segments for NZ Rail Maps 4

Continuing from our last post, we now want to find tile segments that need world files written for them and generate those files. 

The code to do this (following the steps outlined previously) is shown step by step below, with a consolidated sketch after the list:
  1. Strip the extension off the base file (92XJ7-92MKF.jgw) so that we get 92XJ7-92MKF:
    baseNameBase = os.path.splitext(baseName)[0]
  2. Split the filename at the hyphen so we get two pieces, baseColDescriptor and baseRowDescriptor respectively being 92XJ7 and 92MKF in this example:
    baseNameSplit = baseNameBase.split("-")
    baseColDescriptor = baseNameSplit[0]
    baseRowDescriptor = baseNameSplit[1]
  3. Set gridRowDescriptor to be a concatenation of baseRowDescriptor and gridY, e.g. 92MKFx4.1:
    gridRowDescriptor = baseRowDescriptor + gridY
  4. Set gridColDescriptor to be a concatenation of baseColDescriptor and gridX, e.g. 92XJ7x4.3:
    gridColDescriptor = baseColDescriptor + gridX
  5. Set gridFileName to be gridColDescriptor + "-" + gridRowDescriptor + ".jpg":
    gridFileName = gridColDescriptor + "-" + gridRowDescriptor + ".jpg"
  6. Search for a filename that ends with gridFileName (it will start with something else like K1977full-):
    rootFilesList = os.listdir(rootPath)
    for rootFile in rootFilesList:
           if rootFile.endswith(gridFileName):
  7. If a matching file exists, replace its extension with ".jgw" and write the six lines needed in the world file:
    rootNameBase = os.path.splitext(rootFile)[0]
    segmentName = rootNameBase + ".jgw"
    segmentFileName = rootPath + segmentName
    segmentFile = open(segmentFileName, "w+")
    segmentFile.write(str(pixelSize) + "\n")
    segmentFile.write(str(baseSkewX) + "\n")
    segmentFile.write(str(baseSkewY) + "\n")
    segmentFile.write("-" + str(pixelSize) + "\n")
    segmentFile.write(str(segmentX) + "\n")
    segmentFile.write(str(segmentY) + "\n")
    segmentFile.close()
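Putting steps 6 and 7 together, the search loop and the write-out nest like this (a consolidated sketch only, using the same names as above; pixelSize, baseSkewX, baseSkewY, segmentX and segmentY come from earlier parts of the script as described in parts 3B and 3C):

# assumes 'import os' at the top of the script
rootFilesList = os.listdir(rootPath)
for rootFile in rootFilesList:
    if rootFile.endswith(gridFileName):
        # derive the world file name from the matching jpg name
        rootNameBase = os.path.splitext(rootFile)[0]
        segmentName = rootNameBase + ".jgw"
        segmentFileName = rootPath + segmentName
        # write the six lines of the world file
        segmentFile = open(segmentFileName, "w+")
        segmentFile.write(str(pixelSize) + "\n")
        segmentFile.write(str(baseSkewX) + "\n")
        segmentFile.write(str(baseSkewY) + "\n")
        segmentFile.write("-" + str(pixelSize) + "\n")
        segmentFile.write(str(segmentX) + "\n")
        segmentFile.write(str(segmentY) + "\n")
        segmentFile.close()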
Testing so far indicates that the jgw files the script writes are correct. Only one layer has been tested, but its segment tile was presented in the right place relative to the base tile.
Looking at the workflow, it would be ideal to provide the .xml and .jpg.aux.xml files for each jpeg and have the script copy these automatically for every jgw file it writes, so that all four files needed for each segment are complete. I expect to write a piece of code for that and then present the complete script in the next part. So far the script is 72 lines in total.
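As a rough idea of how that copying step could look (this is only a sketch, not the code that will appear in the next part, and the template file names here are hypothetical placeholders):

import shutil

# hypothetical template files holding the metadata to copy for each segment
templateXml = rootPath + "template.xml"
templateAuxXml = rootPath + "template.jpg.aux.xml"

# rootNameBase is the segment jpg name without its extension, as in step 7
shutil.copy(templateXml, rootPath + rootNameBase + ".xml")
shutil.copy(templateAuxXml, rootPath + rootNameBase + ".jpg.aux.xml")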

Python Scripting [3C]: Layer Fractional Segments for NZ Rail Maps 3

Last time we had a look at how to read data from our files and store it in a list. This time we are going to get that data and perform the calculations we need on it.

The first thing is to convert the numbers, which were read in as strings, to floats. The numbers we need are in lines 5 and 6 of each of the three files, plus lines 2 and 3 of the base file.

baseSkewX = float(baseData[1])
baseSkewY = float(baseData[2])
baseX = float(baseData[4])
baseY = float(baseData[5])
rightX = float(rightData[4])
rightY = float(rightData[5])
downX = float(downData[4])
downY = float(downData[5])

Note that the two extra numbers from the base file, baseSkewX and baseSkewY, are the skew values, which we need to write out to every .jgw file we generate. The script assumes no calculations are needed with these.

The calculations need to generate, in this case, a total of 16 pairs of values: the original tile is now 16 segments in a 4x4 grid (the actual number of rows and columns is passed in as the counter parameter on the command line). The grid has 4 rows and 4 columns, with the columns numbered 0 to 3 and the rows numbered 0 to 3. For each segment, the X coordinate of the top-left corner is calculated by this formula, iterating through values of colNum from 0 to 3:

segmentX = (((rightX - baseX) / counter) * colNum) + baseX

Breaking that down:
rightX-baseX gives us the width of the base tile
divide by counter to give us the width of one segment
multiply by the column number to get the offset for a particular column
add to baseX so that we have the absolute coordinate, that is base plus offset.

Likewise segmentY is calculated the same way from rowNum (0 to 3):

segmentY = (((downY - baseY) / counter) * rowNum) + baseY
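As a quick sanity check of the formula with made-up numbers (these coordinates are hypothetical, not taken from the real jgw files):

# hypothetical values only
baseX = 1576800.0    # top left X of the base tile
rightX = 1578720.0   # top left X of the tile to the right
counter = 4
# (1578720.0 - 1576800.0) / 4 = 480.0 metres per segment
# column 2 starts at 1576800.0 + (480.0 * 2) = 1577760.0
segmentX = (((rightX - baseX) / counter) * 2) + baseX   # 1577760.0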

So a double loop is employed to calculate 16 pairs of values:

for colNum in range(4):
    for rowNum in range(4):
        segmentX = (((rightX - baseX) / counter) * colNum) + baseX
        segmentY = (((downY - baseY) / counter) * rowNum) + baseY
        gridX = "x4." + str(colNum + 1)
        gridY = "x4." + str(rowNum + 1)
        print(gridX + " " + gridY + ":" + str(segmentX) + "," + str(segmentY))

That prints out 16 lines of data. gridX and gridY are interesting as they are the segment descriptors which are appended to the column and row descriptors of the original file name (basename in the script). For example 92XJ7-92MKF becomes 16 segments whose names run from 92XJ7x4.1-92MKFx4.1 through to 92XJ7x4.4-92MKFx4.4.

The next step is to write out the .jgw files. Here we have a choice: just write 16 files with the 16 pairs of values in them, which would be really easy, or, with a few more lines of code, write only the files we actually need by building the filename pattern to search for and checking whether a matching file exists. In other words, the user has put the segment tile (jpg file) into the segments directory for us to look up, and we write out a corresponding jgw file.

The actual spec of the file's contents would be as follows (using our existing variable names):
  • Line 1: pixelSize
  • Line 2: baseSkewX
  • Line 3: baseSkewY
  • Line 4: -pixelSize (in other words the pixelSize string with a - in front of it)
  • Line 5: segmentX
  • Line 6: segmentY  

The steps needed (in English rather than Python):
  • strip the extension off the base file (92XJ7-92MKF.jgw) so that we get 92XJ7-92MKF
  • split the filename at the hyphen so we get two pieces, baseColDescriptor and baseRowDescriptor respectively being 92XJ7 and 92MKF in this example.
  • set gridRowDescriptor to be a concatenation of baseRowDescriptor and gridY, e.g. 92MKFx4.1
  • set gridColDescriptor to be a concatenation of baseColDescriptor and gridX, e.g. 92XJ7x4.3
  • set gridFileName to be gridColDescriptor + "-" + gridRowDescriptor + ".jpg"
  • search for a filename that ends with gridFileName (it will start with something else like K1977full-)
  • If a matching file exists, replace its extension with ".jgw" and write the 6 lines mentioned just above.
  • There may be more than one filename that ends with gridFileName so we need to go through all the filenames in the directory.
The code needed to do that will appear in the next part in this series.

It would be fair to say this script is rather more complex than the first one I did (the file copying one), but the benefits are great because of the number of manual steps and potential errors eliminated. It has definitely taxed my brain at times. I guess that just as we lose handwriting skills because computers do the writing for us, if we don't do many manual calculations with our brains we lose that skill as well.


Wednesday 27 February 2019

Python Scripting [3B]: Layer Fractional Segments for NZ Rail Maps 2

Yesterday we looked at how to collect command line parameters easily in a Python script. Today we need to turn the parameters into filenames and then read their contents to memory.

First thing is to get the arguments into variables we can refer to and manipulate. This is pretty simple, as the arguments are stored in the args object that argparse provides us with.

basename = args.base
rightname = args.right
downname = args.down
counter = args.counter
pixelsize = args.pixelsize

We are using a fixed directory for all the files so it's defined here:
rootpath = "/home/patrick/Sources/Segments/"

Then we check that the source files exist (a sketch of such a check appears after the filename lines below). I changed my filespec to assume the user has typed in the full file name; they only need to supply the file name, not the directory, because the directory is fixed and defined above.

basefilename = rootpath + basename
rightfilename = rootpath + rightname
downfilename = rootpath + downname
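The existence check itself isn't shown in the script below, but a minimal sketch of what it could look like, using os.path.isfile, is:

import os
import sys

# stop early with a clear message rather than failing later on open()
for fileName in (basefilename, rightfilename, downfilename):
    if not os.path.isfile(fileName):
        sys.exit("Input file not found: " + fileName)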

Then for each file we can read all the contents straight into an object using readlines():

basefile = open(basefilename,"r")
basedata = basefile.readlines()
basefile.close()

repeating the same pattern for the other two files. 
In this example basedata is now a list object containing the lines (6 in total) read from the jgw file. The spec of this file is as follows:
  • Line 1: A: x-component of the pixel width (x-scale)
  • Line 2: D: y-component of the pixel width (y-skew)
  • Line 3: B: x-component of the pixel height (x-skew)
  • Line 4: E: y-component of the pixel height (y-scale), typically negative
  • Line 5: C: x-coordinate of the center of the original image's upper left pixel transformed to the map
  • Line 6: F: y-coordinate of the center of the original image's upper left pixel transformed to the map
We write out our target jgw files in the same way. Lines 2 and 3 are copied straight from the base file. Lines 1 and 4 are the pixelsize parameter (which in line 4 has to have a - in front of it). Lines 5 and 6 are generated by our calculations in the script (in part 3 we will look at these calculations).

The result of the readlines() call is to put all the lines in the file into a type of object called a list. A list object is referenced like an array, so I can get the first line in basedata by referencing it like this:

x = basedata[0]

Zero is the number of the first item in the list, so we look at 6 items numbered 0 to 5 in this case.

So far the complete script looks like this:

import argparse

parser = argparse.ArgumentParser(prog='segments')
parser.add_argument('-b', '--base', required=True)
parser.add_argument('-r', '--right', required=True)
parser.add_argument('-d', '--down', required=True)
parser.add_argument('-c', '--counter', type=int, default=4)
parser.add_argument('-p', '--pixelsize', type=float, default=0.1)

args = parser.parse_args()
basename = args.base
rightname = args.right
downname = args.down
counter = args.counter
pixelsize = args.pixelsize

rootpath = "/home/patrick/Sources/Segments/"
basefilename = rootpath + basename
rightfilename = rootpath + rightname
downfilename = rootpath + downname

basefile = open(basefilename, "r")
basedata = basefile.readlines()
basefile.close()
rightfile = open(rightfilename, "r")
rightdata = rightfile.readlines()
rightfile.close()
downfile = open(downfilename, "r")
downdata = downfile.readlines()
downfile.close() 


Part 3 will look at the calculations we need to do in order to get the coordinates of each segment.


Python Scripting [3A]: Layer Fractional Segments for NZ Rail Maps 1

This is a technical description of the scripting project for dividing a large layer into segments for NZ Rail Maps. As discussed in some previous posts, I am taking, for example, a 0.4 metre resolution tile and scaling it to 0.1 metre resolution, making the image 16 times the size of the original. Then, because Qgis is difficult to use with tiles of this size, I am cutting it into 16 segments with a suffix added to describe where each fits in a row-column grid on the original layer. Making up these grids and images in Gimp is relatively simple; the hard part is working out the coordinates of where to position each new tile segment in Qgis, which I am doing by a calculation based on existing layer data.

The project is to carry out the following:
  1. The script is to read in the following arguments passed on the command line:
    1. The base layer name (the original layer that has been divided into segments)
    2. The layer name to the immediate right (or left) - first version will only use right
    3. The layer name immediately below (or above) - first version will only use below
    4. The divisor (how many segments across or down the base layer is divided into)
    5. The pixel size of each segment in metres.
  2. The script carries out the following steps:
    1. Using the base layer name find the .jgw file for the base layer.
    2. From the base layer read the 5th and 6th lines which are respectively the x and y coordinates of the top left corner of the base layer.
    3. Repeat these steps for the layer to the right and the layer to the bottom.
    4. Calculate the coordinates for each segment using the formulas worked out in the spreadsheet.
    5. Work out which layers need .jgw files produced for them.
    6. Create and write the .jgw files for these layers - 6 lines of text data in each file.
I am making this a top priority for NZ Rail Maps at the moment because I have about 10 new tiles to work out, each of which has multiple segments to generate, and doing this for each one individually is very tedious and error prone because of all the manual steps needed.

The first requirement is to work out how to collect command line arguments. Python provides a library called argparse which can be used to interpret the command line. By looking at that I would probably set up the command line arguments as follows:
  • -b | --base  : base layer name (excluding the extension that will be automatically added)
  • -r | --right : right most layer name
  • -d | --down : lower most layer name
  • -c | --counter : counter (divisor)
  • -p | --pixelsize : pixel size
argparse does all the dirty work for us, grabbing the arguments and letting us know if any are missing. These are generally mandatory, so if any of them is omitted the intention is that the script will complain. However, I decided to make only the first three mandatory and to give the other two default values, making them optional.

Our development editor is KDevelop which is part of KDE. We can write the script in the editor with the advantage of full syntax parsing, and run it in a bash shell using the python command.

This first step is about as far as I expect to get with this today. I have written the following code in KDevelop:

import argparse
parser = argparse.ArgumentParser(prog='segments')
parser.add_argument('-b', '--base', required=True)
parser.add_argument('-r', '--right', required=True)
parser.add_argument('-d', '--down', required=True)
parser.add_argument('-c', '--counter', type=int, default=4)
parser.add_argument('-p', '--pixelsize', type=float, default=0.1)
args = parser.parse_args()

The required=True on the first three add_argument calls makes sure those switches are mandatory.


Running that script (segments.py) and passing -h as the only parameter returns the help text that looks like this:
usage: segments [-h] -b BASE -r RIGHT -d DOWN [-c COUNTER] [-p PIXELSIZE]

optional arguments:
-h, --help                  show this help message and exit
-b BASE, --base BASE
-r RIGHT, --right RIGHT
-d DOWN, --down DOWN
-c COUNTER, --counter COUNTER
-p PIXELSIZE, --pixelsize PIXELSIZE

All of that help text is generated automatically by argparse. The name of the variable the argument gets stored in is also derived automatically (the default action) by add_argument from the option strings passed into it. In the form I have called add_argument above, it will use the long form switch (with -- in front of it) if one is given; if there is no -- switch it will use the short form switch with - in front of it. However, you can explicitly define the name by passing dest= as a parameter to add_argument.

If you fail to pass any of the mandatory parameters the script will stop with a message flagging the missing parameter, but only the first missing one will actually be flagged. Essentially it prints out the usage: line shown above, stating which arguments need to be provided. The default variable type is string, but as you can see two of the add_argument calls provide type= parameters, which define the variable type, as both of those variables are numeric (one integer and one real). For both of these I have omitted the required parameter and supplied a default value instead, so those two are optional.
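For example, a hypothetical invocation might look like this (the -r and -d file names are made up for illustration; -c and -p could be omitted to use their defaults):

python segments.py -b 92XJ7-92MKF.jgw -r 92XJ8-92MKF.jgw -d 92XJ7-92MKG.jgw -c 4 -p 0.1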

Anyway that is the first bit completed. Having collected arguments the next step is to use them. So part 2 is going to cover the steps of finding and reading the contents of the three files that are referred to in the first three arguments.

Tuesday 26 February 2019

Python Scripting: Planned Scripting Projects

I've written a few posts about different scripts I am planning to look at. At the moment the ones that are most important are:
  1. A script to do segmented layer fractions for NZ Rail Maps. I am currently doing this very slowly and laboriously with a spreadsheet, keep making mistakes, and have to paste things into files by hand; it works but is a real drudge, as the manual steps inherently invite mistakes. This one is a bit trickier than some of the others I am looking at as it needs multiple input parameters, either from popup dialogs or from the command line. The idea is that it will write out the segment jgw files itself, saving a whole lot of manual steps. To get a popup you need a third party library, and EasyGUI is the one I am looking at.
  2. A script to auto rename my photos off the cameras from IMG_XXXX to the format I use, which incorporates date/time and camera name. This script has to be able to read the EXIF tags, for which a third party library will be needed, and then rename the files directly.
  3. A script to auto sync video and music folder trees as previously mentioned. It needs to call ffmpeg, for which there is no native Python API, so it will be called command-line style; a native API is always best for interfacing, and there are some third party libraries that may be able to wrap it. A minimal sketch of the command-line approach follows this list.
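For the ffmpeg case, a minimal sketch of the command-line approach using Python's subprocess module (the paths and options here are placeholders, not the real sync logic):

import subprocess

# placeholder paths - the real script would walk the folder trees to find these
source = "/path/to/music/track.flac"
target = "/path/to/phone/track.mp3"

# run ffmpeg as an external command and check that it succeeded
result = subprocess.run(["ffmpeg", "-y", "-i", source, target])
if result.returncode != 0:
    print("ffmpeg failed for " + source)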
Scripting is not something I spend a lot of time doing, but I shall need to start working on all of these ideas fairly soon as the need for them is pressing. At the moment various manual steps are used to get these tasks done and a lot of it is very slow.

On the other hand, copyfiles.py and movefiles.py for copying raster layers for the maps have been a great success: much better than the system I used before, which was half script and half manual process and very time consuming. Doing my first ever Python scripting has been a great move forward.

Monday 25 February 2019

Firefox Developer exhausts system resources

One week after reinstalling mainpc I had a repeat of the situation that caused the reinstallation. Firefox Developer, which is the main browser I use, somehow got into a situation where it was using full CPU, RAM and swap for a period. I use the KDE system load monitor widget on the panel to watch what the system is doing on my computers that have KDE. The system slowed right down and I noticed all three bars on the widget were all the way up. After a period of the system being bogged down and almost impossible to use due to being maxed out on resources, Firefox Developer crashed (or at least the instance with the most tabs open did; the other instances stayed open) and the bar graphs all dropped to almost nothing.

Unlike the previous occasion the /tmp space was not all consumed so everything is still working properly. Possibly this is because I put /tmp onto its own partition and possibly also because the system had 100 GB of swap available after the reinstall. All the disks had the usual amount of free space available.

When Firefox was reopened there were no obvious issues and the resource use stayed low. However it is worth noting that the open tabs won't all be reloaded at startup, they are reloaded when they are clicked on. I did go through them later to reload and saw no issues with resource use.

It looks like there is some sort of scenario where Firefox can "run away" and crash after using up all the free resources. The whole thing is a bit bizarre, but it has happened twice in one week, so I will have to see what I can do to find out exactly what is going on, although that could be quite hard to achieve. My guess is that it leaks memory: it probably doesn't release or reuse memory used by a tab when the tab is closed, so it eventually uses up all the memory in the system, crashes and releases it all, and then when it reopens it no longer needs all the memory it had before.

The thing you see in KDE's system activity monitor if you open it up (press Ctrl and Esc together) is a few processes called firefox-bin, which are the individual browser windows, and a number of processes called Web, which don't mean a lot by name but are using significant amounts of memory and CPU. When I dig further in, Web turns out to be using Firefox library files, so it must have something to do with Firefox. The common factor in most cases is libxul.so, a library that is installed as part of Firefox Developer.

At any rate this seems to be yet another example of what enormous resource hogs web browsers all seem to be by default. All the tabbed ones seem to use ridiculous amounts of memory.

UPDATE: The process that crashes is called Web Content (as top shows it) and it may only crash a tab rather than the whole browser. I currently have a bug report lodged into the Bugzilla database to see if it can be addressed as a bug in the browser, as I have been able to replicate this with one particular web site which crashes Firefox Developer every time.

Friday 22 February 2019

NZ Rail Maps: Optimising Gimp and using 4x4x4 grid for mosaics [1]

As recent posts have outlined, I am using a new grid system for the NZ Rail Maps mosaics to standardise on 4800x7200 tiles for historical aerial images, because Qgis tends to run out of memory far more rapidly, and is much slower to update the screen, when larger sizes are used. I have standardised on 0.1 metre resolution aerial imagery when making use of official New Zealand Railways survey scans from Retrolens, in order to take advantage of the large scale (high resolution) of these survey images. However, this means a lot more work to get the best out of them by ensuring we use high resolution Linz imagery, because Gimp obviously uses a lot more resources working at the higher resolution.

This means that if I have 0.4 metre resolution Linz tiles they have to be increased to 400% of their original size to give me a tile that matches native 0.1 metre resolution tiles. The next step is then to make up new tile names using a 4x4 grid, so that the scaled-up tile gets divided into 16 tiles at 4800x7200 pixels. The earlier posts talked about how to number the grids; to begin with I was actually splitting the large tile into 16 smaller ones using the Guides and Guillotine tools and then positioning them all correctly. I have now got to the point where, instead of making 16 layers, I keep one layer and only split it at extraction (export) time. I instead have background grid tiles that are named and can be saved in a template, and when I export a tile it gets named according to the grid reference, which is visible in the layers list.

The last post on this blog, which was not tagged to NZ Rail Maps, talked about optimising Gimp to use memory and Linux swap space together by treating them as a single resource, in this case totalling 132 GB, so that I can set Gimp to use 66 GB as tile cache (half the total) before it resorts to using the Home drive for its own swap, while still leaving memory and system swap space for other applications. I have been monitoring the usage of these resources to see how well this new optimisation works with the largest mosaic ever. It ended up being a very large file because for one station it took four of the original tiles to cover the station yard, due to a skew alignment running roughly north-east to south-west centred on the intersection of the four tiles; that doesn't usually happen, and most yards need at most two tiles at that resolution, some only one. So I was left with unavoidably having to provide for four 0.4 metre resolution tiles in the layout, which once scaled up use a lot of disk and memory space to work with.

So I have tracked the resource usage while working on this very large file, and although I did eventually exceed the 66 GB cache setting for Gimp, that only happened right at the end of creating the image. As usual the main memory usage is the ten original tiles that have been scaled up to 400% in both directions; the grid base tiles and the Retrolens images only added relatively small amounts. The final version of the file saved to 14.8 GiB of disk space, and the system resource usage at that point was the entire Gimp Cache plus a small amount of Gimp Swap. On the system resource meter the overall usage for Debian is 32 GiB of memory and around 50 GiB of swap. Gimp is the main application running, and it also has 8 GiB allocated to the undo buffer. Since there is still swap available, setting Gimp's Cache to 100 GiB is possible for handling still larger images. The image size is 38400x57600 pixels (2.2 gigapixels). The system has been far easier to use than has ever been the case before with an image of that size, without the massive paging to the hard disk which is very slow and noisy; definitely the noise I hate the most from my computer.

Given that about two months ago I spent around $500 speeding up this system with a new board, CPU and an extra 16 GB of RAM, I have come to the conclusion that the most cost effective way to increase performance in future is to put in a larger SSD, at around $60 for 240 GB (about 200 GiB) or maybe $120 for 480 GB (about 400 GiB), which would double or quadruple the Linux swap partition size for quite a low price. Admittedly SSD is much slower than main RAM, but it is silent in operation and faster than the regular HDD. In the meantime I will set Gimp to use more of the Linux swap partition, which is currently 100 GB, but I will have to watch it carefully to make sure there is no risk of a system crash if other applications are running and using the swap. It will be interesting to see how that works out with different images and map drawing tasks this year.

Thursday 21 February 2019

How to optimise Gimp resource usage

Since I have one computer for the NZ Rail Maps project that is mostly used just for Gimp mosaic creation, and it has 32 GB of RAM and 100 GB of Linux swap on a partition of the SSD, I need to make sure it uses the available system resources efficiently. One of the nice little things in Gimp 2.10 is the Dashboard dock, which can show you how Gimp is using the resources in your system.

The computer has two disks that are relevant: The boot/install SSD,  which also now has separate root and tmp partitions, and a separate partition for Linux swap; and the home volume which is made up of two separate 2 TB disks in a software RAID-1 array (mdadm).

Gimp uses two different areas in particular, called Cache and Swap on the Dashboard. Cache is system memory, including Linux swap space. Swap has nothing to do with operating system swap; it is actually the directory shown as Swap folder at the top level of the Folders tree in Gimp's Preferences dialog box. So it's important to understand the terminology and what it means, because you can't point that Swap folder value at your operating system swap partition or file.

In the System Resources area of the Preferences dialog, the main setting of interest is Tile cache size, which on my system started out set to half of the physical RAM, about 16 GB. What happens is that as soon as that much memory is filled, Gimp starts saving data into its Swap folder, which is its own space wherever you have put it, not the system swap partition.

Now there are a lot of possible ways to optimise the way Gimp uses its resources. For example:
  • You could have an SSD partition specifically for Gimp's use (I previously had one called AppSwap), pointed at by the Swap folder setting mentioned above. I was able to make this work in earlier versions of Gimp. However, with Gimp 2.10 being a Flatpak install, the inherent sandboxing of that packaging environment means you can't use a different disk as a swap location. (It might be possible to use a link to redirect to a different disk, but I haven't tried this.) I created the AppSwap partition by chopping the swap partition in half on the same SSD where the swap partition is stored.
  • The other idea that came to mind was to remove the Linux swap partition and replace it with a Linux swap file stored on the SSD in the same partition that Gimp is set to use for its swap. This is just a tweak of the previous idea that makes sure the same partition is used flexibly for all swap, whether OS or application specific.
Since we use the Flatpak installed version of Gimp, I have had to go back to using Linux swap and then making sure Gimp prefers it. The way to do this is quite simply to set the Tile cache size value to, for example, 66 GB, which is half of the total of RAM and swap space on my system, or even higher if we were confident that much was available. I still have to do some experimenting to see how Gimp performs with other applications running, but the Dashboard shows that as soon as memory gets full the usage goes into the SSD swap and is reported as such by the OS. Right now I have an image loaded that is using 49 GiB of Cache and the Swap isn't being used at all. As I still have some material to add to this very big mosaic (12.6 GiB when saved to disk), easily the biggest I have worked on (the previous record was 11.5 GiB), it will be interesting to see how much bigger it can get before the system is too slow to be usable.

Funnily enough I can't remember what values the previous installation of Gimp was set to use, and I am only really familiar with Qgis's usage of swap. Funnily too, all this big mosaic work is partly due to trying to optimise resources so that Qgis won't run out of memory, by making smaller tiles, which in this case means a grid of 64 tiles each 4800x7200 pixels. But mostly the extra resource usage comes from combining several sets of tiles in one file; in this case, with four separate locations in the tiles, I may yet have to split some of them out if the system can't cope with the resource usage. So this is an interesting trial.

Monday 18 February 2019

The end of NZ Rail Maps

This post isn't in the NZ Rail Maps blog, nor is it tagged as NZ Rail Maps because it is not an official announcement for the project.

I do however expect NZ Rail Maps will last a few more years and then be mothballed. Exactly how many years I can't say. But after a break of about a week for various reasons there are definitely some speed wobbles to consider.

One is that simply my personal interests are changing. Rail heritage used to be a big thing for me but hasn't been for the best part of 20 years, and each year I get more and more distant from it. NZ Rail Maps has been my only interest in rail heritage for a long time. So my level of engagement with the rail enthusiast community is essentially declining.

The second consideration is that the Qgis software is getting more difficult to use. With the transition from Qgis 2 to 3 there seems to have been a significant drop-off in Qgis developer community activity. The bug tracker for version 3 has hundreds of bugs that are not even being looked at, let alone fixed. This all impacts on public perceptions of the software. Whilst I am not aware of viewpoints in the wider free software and GIS community, my own perception is that the engagement is just not what it was with earlier versions.

For me this means I am not even bothering to engage with the Qgis developer community and go to the trouble of reporting new bugs or providing information for existing bug reports any more. Today I wasted a whole lot of time trying to pin down a new bug and have just put in a workaround, although I may yet attempt to resolve it with a virtual machine running an old version of Qgis, or perhaps by running on mediapc if 2.18 is still available for Debian. The material I published a couple of posts back about putting grids together to keep tiles at 4800x7200 is just an example of the extra work I am having to do to deal with problems in the software that are not being addressed by the developers.
I have new interests that I knew would vie with NZ Rail Maps for my time; that is starting to happen this year and will gather momentum further out. So, having put in place the priorities for 2019, right now I can't put a time frame on what will happen beyond this year.

Friday 15 February 2019

General update 2019-02-15

Last week I had some issues with mainpc: it ran out of /tmp space, which of course is caused or worsened by running out of space on the install volume, and a number of user files in /home that I was working on became corrupted. I haven't looked at the bigger question of how we can prevent issues like this from bringing down the whole system, or how it actually happened. Instead I just chose to reinstall the system from scratch.

There was a delay in getting this done because issues with my internet connection created problems for the Debian installer connecting to the local mirrors to download install files, so mainpc was down for something like 5 days. With the other computers available I was able to keep working to an extent, the main constraint being that the network shares on mainpc were unavailable.

Key objectives of the reinstall include:
  • Recover the entire free space on the SSD for swap. Previously an attempt was made to allocate half of the swap space to a Gimp-specific temporary/cache area. This proved difficult to implement, so I decided to revert the full space (99.9 GB) back to swap, as Gimp does use this space as cache when needed.
  • Allocate /mnt to a separate partition on the SSD, as was done on mediapc, to ensure that a mount failure cannot result in files being written to the SSD instead of a mounted volume on another disk and filling up the install volume.
  • Allocate /tmp to a separate partition to ensure that filling up the rest of the root volume cannot cause a disk space crisis for /tmp, or vice versa.
With the system back up and running, and /home rejoined on the RAID-1 array so that my existing user profile is reused, I turned my attention to software reinstallation. The previous installation used the Debian packages for Thunderbird, which installed the Daily channel build and caused problems with extensions like Lightning being incompatible and therefore unavailable. This time I chose to download Thunderbird directly from their site and install it into my user profile, to save space on future reinstalls and to ensure it is the Release channel, which lets me use Lightning again.

mediapc has had a new read/write share created in Samba to give access to the Video Archive folder, which contains a volume of material not in the media share, so for convenience I can work on the material in that folder from mainpc. mediapc is also about to get various video editors installed, as mentioned in a previous post some time ago.

So questions still remain over managing disk space in some of the areas of a typical Linux install and maybe I can address these.

I had a look at rearranging screens because the arrangement was less than optimal for mediapc. Basically at eye level I have four screens: two for mainpc and one each for serverpc and mediapc. Three of those are vertical to save space and keep them as close to me as possible without having to turn around too much. The problem is that for video editing on mediapc you want a horizontal aspect ratio for best use of space. However, because mainpc and serverpc are used the most, the screen layout suits them best and is not really set up for anything else, so after trialling an alternative layout I decided to leave things as they are, apart from the two 3 metre DVI cables I purchased, which will now not be needed but still have to be paid for. At the end of the day the layout can really only be optimised for the two most used computers, so video editing will just have to be improvised, or I will remote into the computer doing the editing from mainpc.

Saturday 9 February 2019

NZ Rail Maps: Scaling up map mosaics & creating new tile grids

One of the issues that comes up with doing the map mosaics for NZ Rail Maps is that the resolution of Retrolens aerial imagery can often be very high when using the official NZR survey images at the scale they appear on Retrolens. A typical scale is 1:4300, which gives us a lot of detail on the ground. However, mosaicking that over the Linz imagery can be problematic when some of the latter comes in at a much smaller scale. This week I was working with 0.5 metre source imagery (pretty unusual for Linz aerials from the last five years), and downscaling the NZR material to match it would lose far too much detail.

So the obvious solution is to go the other way and upscale the Linz imagery, but the problem is you end up with a massive tile size that the current version of Qgis can't handle very well. As an example, the railway station at Tokoroa is covered by a single tile at 4800x7200 pixels, with each pixel measuring 0.5 metres on each side. This is called 92VJD-92MCL in the EPSG:3857 grid that Linz supply the imagery in. Scaling that up in Gimp to 0.1 metre (500%) gives a tile that is now 24kx36k pixels, 25 times the original area, and even the smallest export will be over 200 MB, which Qgis finds very hard to handle.

So the solution arrived at here was to split the scaled up tile into 25 tiles at the original dimensions. This has all been done by hand this first time, laboriously cutting out the grid sections and pasting them one at a time as individual layers into a new image, but apparently it should be possible to set up some guides in Gimp and use the guillotine tool to automatically slice the original into pieces (hopefully with each piece as a separate layer).

The next step is to come up with a new grid where a set of numbers gets added to the original column and row references of the original tile. 92VJD refers to a column name and 92MCL refers to a row name. So I cobbled together a spreadsheet and put the top left coordinates from various world files into a table in the sheet, so that I could work out the height and width of the original tile and then divide them by five to work out the top left coordinates to put into the new world files for the new grid. In accordance with the way I have done it in the past, 92VJD-92MCL will become 92VJDx5.1-92MCLx5.1, and this deciphers as: x5 means we have scaled the original tile up five times, and .1 means this tile is the first piece of the full tile in that direction. As we have a 5x5 grid of tiles, the column and row numbers range from x5.1 to x5.5. In other words, the part that is actually a row or column number is the part after the decimal point.

The spreadsheet then did the calculations to work out the top left coordinates to put in the copies of the world files produced for each new tile, and after a bit of trial and error it worked just fine. Qgis is now using a much more reasonable amount of memory because it only needs to load five tiles (each 1/5 of the original tile size) of around 15 MB each instead of the very large original of over 200 MB; the refresh is much faster and Qgis takes much longer to run out of memory. So that has been worthwhile, but it took ages to set up, so I need to work out ways to speed it up. The guillotine tool, if I can make it work, could speed up the slicing a lot, and a Python script or the like could work out the new coordinates and maybe even write the new world files directly, saving a lot of manual editing and copying of files.

I have a similar thing happening with the map tiles for Waipara, where I have simply added new tiles that were not in the originally supplied imagery grid, because they cover areas to the side of the supplied grid, so I again have to manually calculate the correct coordinates for the new tiles. But in this case the original images have not been scaled and I did not have to create a grid like I did in the example above, so it will not be anything like as much work.

UPDATE: For the time being I have created a Gimp template that creates a grid of up to 5x5 4800x7200 layer tiles. When a new mosaic project is created these tiles are in the background and enable the grid numbering system to be followed for exported tiles out of the full image when the full image is cropped to individual tiles. This does not directly address the purpose of scripting which is to auto generate World files for individual tiles that are created out of a full mosaic. The scripting task is still being considered at this stage, meanwhile the World Slice Calculations spreadsheet has been in use to calculate the coordinates of each tile.

Thursday 7 February 2019

How to auto start x11vnc on LXQt or LXDE (including Raspbian)

LXQt is the desktop environment of the current edition of Lubuntu. Previous editions of Lubuntu used LXDE as the desktop environment. My personal preference is to use LXQt on Debian instead of XFCE on Debian, because XFCE development is too slow (the same reason I switched most of my more powerful computers to KDE). I would only use LXQt on a computer like my Toshiba R500, which has only 2 GB of RAM; in the case of the Raspberry Pi, Raspbian ships with LXDE. I installed LXQt over the top of XFCE after installing Debian onto this laptop, and it works great.

LXQt and LXDE (and Raspbian) feature an easy autostart capability which can be configured from the GUI and can also be set up by placing a file called x11vnc.desktop into the ~/.config/autostart directory in the user's profile. If you use the GUI it creates the file with default settings. 

We want to use x11vnc to share the default desktop so that we can remote into it with a VNC client (I use Remmina) to remote control the laptop. This means I can use the laptop as my tethered desktop. The laptop sits on a docking station and can be quickly docked and undocked so the docking station can be left in place with all cables connected and all I need to do is bring out the laptop, dock it, turn it on and away we go.

The settings to be put into the desktop file (e.g. x11vnc.desktop) in the folder mentioned above are as follows. Note we have omitted -usepw as we are running it here without a password, on a home network where there is no other user. If you want to use a password there are additional steps when first setting it up to store the password, plus the extra switch to make x11vnc use it, which I have left out here (a brief note follows the file).

[Desktop Entry]
Encoding=UTF-8
Type=Application
Name=X11VNC
Comment=
Exec=x11vnc -forever -display :0
StartupNotify=false
Terminal=false
Hidden=false
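
If a password is wanted, the usual x11vnc approach (as I understand it; check the x11vnc documentation for your version) is to store a password once with x11vnc -storepasswd and then add -usepw to the Exec line, for example:

Exec=x11vnc -forever -usepw -display :0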