Saturday, 2 March 2019

Python Scripting [3F]: Layer Fractional Segments for NZ Rail Maps 6

Today's little bit of fun and games has been to change the script so that it can read in a list file, process a list of source layers, and produce the world and auxiliary files for them. Last night I discovered there were some additional aerial photos available for the OtiraNorth area. I opened up the existing xcf project in Gimp, which covered four existing areas with a total of 10 base tiles, added two more 0.4m base tiles rescaled to 0.1m, added the historical aerial photo, and as a result had six segments to export from the project. The result was a 17.3 GB Gimp file, which Gimp was able to handle OK with the extra swap space that the computer now has. If I can get a bigger swap disk for the computer then it should be able to handle very large files in future.

As there were three segments from each of the two base tiles, I could put the parameters for the two base tiles into a file, save that, and pass it to the script once the script had a few modifications. These turned out to be rather more complex than I expected due to the different data types involved.

Basically, when you get arguments off the command line from sys.argv, what you get is not a string containing the arguments but a list of them, and parse_args expects to be passed a list. If you are reading from a file with readlines, you get a list of strings, and when you loop through them each item is a single string, which isn't what parse_args expects. So you have to turn that string into a list, which typically you'd do by calling the string's split method.
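The difference between the two data types can be seen in a minimal sketch. The parser here is cut down to two options and the tile name TileA is made up for illustration, so this is not the real script, just the same idea in miniature:

```python
import argparse

# cut-down parser for illustration only; the real script defines more options
parser = argparse.ArgumentParser(prog='segments')
parser.add_argument('-b', '--base')
parser.add_argument('-c', '--counter', type=int, default=4)

# what sys.argv[1:] looks like: already a list, one item per argument
fromCommandLine = ['-b', 'TileA', '-c', '6']

# what one line from readlines() looks like: a single string ending in a newline
fromFile = "-b TileA -c 6\n"

argsA = parser.parse_args(fromCommandLine)
# the string has to be stripped and split into a list before parse_args sees it
argsB = parser.parse_args(fromFile.strip("\n").split(" "))

print(argsA.base, argsA.counter)   # TileA 6
print(argsA == argsB)              # True
```

Both calls end up with the same parsed values, but only because the line from the file was turned into a list first.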

So I have had to do some extra work to make sure parse_args is getting the list it expects to receive, because otherwise it doesn't work as expected.

Here is the first part of the script up to the point where it reads the world files, showing the extra code needed to handle the extra and different parsing:

# declarations
import argparse
import os
import shutil
import sys

rootPath = "/home/patrick/Sources/Segments/"

# set up command line argument parser
parser = argparse.ArgumentParser(prog='segments')
parser.add_argument('-l', '--listfile')
parser.add_argument('-b', '--base')
parser.add_argument('-r', '--right')
parser.add_argument('-d', '--down')
parser.add_argument('-c', '--counter', type=int, default=4)
parser.add_argument('-p', '--pixelsize', type=float, default=0.1)

# check first for list file and handle if found otherwise assume single line input
argList = sys.argv[1:]                              # drop the script name parameter
args = parser.parse_args(argList)
if args.listfile is None:                           # single line input from command line
    listData = [" ".join(argList)]
else:                                               # multi line input from file
    listDataFileName = rootPath + args.listfile
    with open(listDataFileName, "r") as listDataFile:   # file is closed automatically
        listData = listDataFile.readlines()

for listLine in listData:

    # parse arguments
    listLine = listLine.strip("\n")
    listItems = listLine.split(" ")
    args = parser.parse_args(listItems)
    # save the parameters
    baseName = args.base
    rightName = args.right
    downName = args.down
    counter = args.counter
    pixelSize = args.pixelsize

The major differences in this script therefore are:
  • import sys is needed in order to use sys.argv, which holds the parameters typed on the command line.
  • The parser.add_argument calls are different. -b, -d and -r are no longer mandatory, and there is a new one, -l, for the list file.
  • The next block parses the arguments to look for the list file parameter (-l). This time, instead of relying on the default behaviour of parse_args, which reads sys.argv itself, I have saved the arguments into a list. We use sys.argv[1:] in order to drop the script name, which is actually passed in as the first item; parse_args itself uses the same syntax to achieve the same thing.
  • If a list file name was passed in then we read the list file into a list of strings. Each string contains one set of parameters. Otherwise we create a list containing one set of parameters from the command line. This means first joining the command line parameter list into a single string, with spaces between each parameter, and then wrapping that string in a list, which in this case will only have one string in it.
  • We then enter a loop which loops through our list of parameter lines. We get each list item into a string.
    • The first thing is to strip the newline character from the end of the string (when we use readlines() to read multiple lines from a file, each line comes with its newline at the end).
    • Next is to split the string into a list whose items are the parameters themselves, using the split method with a space as the separator.
    • Then finally we can call parse_args with this list as input.
From there the rest of the script is the same as before.
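One thing to watch with splitting on a single space is that a doubled space or a blank line in the list file produces empty strings in the list, which parse_args will not know what to do with. A possible variation, which is not what the script above does, is to call split() with no argument and skip blank lines, or to use shlex.split from the standard library if a value ever needs to contain a space. The tile names and the quoted value below are made up for illustration:

```python
import shlex

lines = ["-b TileA  -r TileB\n", "\n", "-b TileC -d TileD\n"]

# split(" ") keeps an empty string where the doubled space was
naive = lines[0].strip("\n").split(" ")

# split() with no argument splits on any run of whitespace and drops the newline;
# the "if line.strip()" test skips lines that are blank
parameterLists = [line.split() for line in lines if line.strip()]

# shlex.split also honours quotes, so a value may contain a space
quoted = shlex.split('-b "Otira North" -c 6')

print(naive)            # ['-b', 'TileA', '', '-r', 'TileB']
print(parameterLists)   # [['-b', 'TileA', '-r', 'TileB'], ['-b', 'TileC', '-d', 'TileD']]
print(quoted)           # ['-b', 'Otira North', '-c', '6']
```

For list files that are typed carefully with single spaces, the simpler split(" ") gives the same result.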

So we have about 20 more lines of code to handle the differences, which include discovering if there is a list file specified, and reading it, and then handling the conversions needed between different data formats. 

I tested both types of input and both worked as expected.
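For anyone wanting to repeat that kind of check without touching real data, here is a rough sketch that wraps the parsing steps in a function and runs it both ways. The function names buildParser and readParameterSets, the file name tiles.txt and the tile names are all made up for this example, and it joins the path with os.path.join rather than adding strings together, so treat it as an illustration rather than the actual script:

```python
import argparse
import os
import tempfile

def buildParser():
    # same options as the script's parser
    parser = argparse.ArgumentParser(prog='segments')
    parser.add_argument('-l', '--listfile')
    parser.add_argument('-b', '--base')
    parser.add_argument('-r', '--right')
    parser.add_argument('-d', '--down')
    parser.add_argument('-c', '--counter', type=int, default=4)
    parser.add_argument('-p', '--pixelsize', type=float, default=0.1)
    return parser

def readParameterSets(argList, rootPath):
    """Return one parsed Namespace per set of parameters."""
    parser = buildParser()
    args = parser.parse_args(argList)
    if args.listfile is None:                       # single line input
        listData = [" ".join(argList)]
    else:                                           # multi line input from file
        with open(os.path.join(rootPath, args.listfile), "r") as listDataFile:
            listData = listDataFile.readlines()
    return [parser.parse_args(line.strip("\n").split(" ")) for line in listData]

# single line input, as if typed on the command line
single = readParameterSets(['-b', 'TileA', '-r', 'TileB', '-d', 'TileC'], "")

# list file input, using a throwaway directory in place of the real root path
with tempfile.TemporaryDirectory() as tempDir:
    with open(os.path.join(tempDir, "tiles.txt"), "w") as listFile:
        listFile.write("-b TileA -r TileB -d TileC\n")
        listFile.write("-b TileD -r TileE -d TileF -c 6\n")
    multi = readParameterSets(['-l', 'tiles.txt'], tempDir)

print(len(single), len(multi))           # 1 2
print(single[0] == multi[0])             # True
print(multi[1].base, multi[1].counter)   # TileD 6
```

The first line of the list file gives exactly the same parsed values as the equivalent command line, which is the behaviour the two code paths are meant to share.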